Drop/Undrop Text Formatting Tags

Demonstrates how the DropTextFormattingTags and UndropTextFormattingTags methods work in the Html-to-Xml API.

Perl
use chilkat();

# This example assumes the Chilkat API has already been unlocked.
# See Global Unlock Sample for sample code.

$html = "<html><body><p><b>Hello</b> World!<p>This is a test</body></html>";

# Convert the above to XML
$h2x = chilkat::CkHtmlToXml->new();

# By default, text formatting tags are dropped.
# The text formatting HTML tags are: b, font, i, u, br, center, em, strong, big, tt, s, small,
# strike, sub, and sup.
$h2x->put_Html($html);
print $h2x->toXml() . "\r\n";

# The resulting XML is:

# <?xml version="1.0" encoding="utf-8"?>
# <root>
#     <html>
#         <body>
#             <p>
#                 <text>Hello World!</text>
#             </p>
#             <p>
#                 <text>This is a test</text>
#             </p>
#         </body>
#     </html>
# </root>

# To preserve text formatting tags, switch the h2x instance to the mode in which they are not dropped:
$h2x->UndropTextFormattingTags();

# Convert again to see the difference:
print $h2x->toXml() . "\r\n";

# The resulting XML is:

# <?xml version="1.0" encoding="utf-8"?>
# <root>
#     <html>
#         <body>
#             <p>
#                 <b>
#                     <text>Hello</text>
#                 </b>
#                 <text> World!</text>
#             </p>
#             <p>
#                 <text>This is a test</text>
#             </p>
#         </body>
#     </html>
# </root>

# Call DropTextFormattingTags to put the h2x instance back in "drop" mode.
$h2x->DropTextFormattingTags();

# Convert again. The output returns to the form produced by the first conversion:
print $h2x->toXml() . "\r\n";

# The resulting XML is:

# <?xml version="1.0" encoding="utf-8"?>
# <root>
#     <html>
#         <body>
#             <p>
#                 <text>Hello World!</text>
#             </p>
#             <p>
#                 <text>This is a test</text>
#             </p>
#         </body>
#     </html>
# </root>
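
# The following is an optional sketch, not part of the original example. It captures
# the XML produced in each mode and compares the two strings, using only the methods
# already shown above. The variable names $xmlKept and $xmlDropped are illustrative.
$h2x->UndropTextFormattingTags();
$xmlKept = $h2x->toXml();

$h2x->DropTextFormattingTags();
$xmlDropped = $h2x->toXml();

# Report whether switching modes changed the generated XML.
print "Outputs differ between modes: " . ($xmlKept ne $xmlDropped ? "yes" : "no") . "\r\n";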