(Perl) Convert HTML to XML with Auto-Correction
Simple HTML to XML conversion. Demonstrates how the HTML is auto-corrected to create well-formed XML. In this example, the closing </p> tag is missing. Also, text is encapsulated in <text> nodes to make it easy for programs to identify and extract the text parts.
use chilkat();
# This example assumes the Chilkat API to have been previously unlocked.
# See Global Unlock Sample for sample code.
$htmlToXml = chilkat::CkHtmlToXml->new();
# Indicate the charset of the output XML we'll want.
$htmlToXml->put_XmlCharset("utf-8");
# Set the HTML:
$htmlToXml->put_Html("<html><body><p>This is a test <a href=\"http://www.chilkatsoft.com/\">Chilkat Software</a></body></html>");
# Get the XML:
print $htmlToXml->toXml() . "\r\n";
# This is the output:
# <?xml version="1.0" encoding="utf-8" ?>
#
# <root>
# <html>
# <body>
# <p>
# <text>This is a test </text>
# <a href="http://www.chilkatsoft.com/">
# <text>Chilkat Software</text>
# </a>
# </p>
# </body>
# </html>
# </root>
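
The <text> nodes in the output above are meant to make text extraction easy. As a minimal sketch of that idea, the following standalone Perl snippet pulls the text parts back out of the XML. It does not use Chilkat: the XML is hard-coded from the output shown above so the snippet runs on its own, and the variable names ($xml, @parts, $plain) are illustrative. It uses a regular expression for brevity, which assumes <text> nodes are never nested and contain no markup. It also does not decode XML entities, so a real XML parser would be the safer choice for arbitrary input.

```perl
use strict;
use warnings;

# Sample XML copied from the output shown above (hard-coded so this
# sketch runs without Chilkat installed).
my $xml = <<'END_XML';
<?xml version="1.0" encoding="utf-8" ?>

<root>
<html>
<body>
<p>
<text>This is a test </text>
<a href="http://www.chilkatsoft.com/">
<text>Chilkat Software</text>
</a>
</p>
</body>
</html>
</root>
END_XML

# Collect the content of every <text> node in document order.
# Assumption: <text> nodes are not nested and contain no markup.
my @parts = $xml =~ m{<text>(.*?)</text>}gs;

# Join the parts to recover the visible text of the page.
my $plain = join('', @parts);
print "$plain\n";   # prints: This is a test Chilkat Software
```

Because each run of text sits in its own <text> node, the parts come back in reading order and can be joined directly; the trailing space in the first node is preserved, so no separator is needed here.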