Unicode C++

Drop/Undrop Text Formatting Tags

Demonstrates how the DropTextFormattingTags and UndropTextFormattingTags methods of the Html-to-Xml API control whether text formatting tags are kept in the converted XML.

#include <stdio.h>
#include <wchar.h>
#include <CkHtmlToXmlW.h>

void ChilkatSample(void)
    {
    // This example assumes the Chilkat API has already been unlocked.
    // See the Global Unlock Sample for sample code.

    const wchar_t *html = L"<html><body><p><b>Hello</b> World!<p>This is a test</body></html>";

    // Convert the above to XML
    CkHtmlToXmlW h2x;

    // By default, text formatting tags are dropped.
    // The text formatting HTML tags are: b, font, i, u, br, center, em, strong,
    // big, tt, s, small, strike, sub, and sup.
    h2x.put_Html(html);
    // %ls is the portable wprintf conversion for a wide (wchar_t) string.
    wprintf(L"%ls\n", h2x.toXml());

    // The resulting XML is:

    // <?xml version="1.0" encoding="utf-8"?>
    // <root>
    //     <html>
    //         <body>
    //             <p>
    //                 <text>Hello World!</text>
    //             </p>
    //             <p>
    //                 <text>This is a test</text>
    //             </p>
    //         </body>
    //     </html>
    // </root>

    // To preserve text formatting tags, switch the h2x instance to the mode that keeps them:
    h2x.UndropTextFormattingTags();

    // Convert again to see the difference:
    wprintf(L"%ls\n", h2x.toXml());

    // The resulting XML is:

    // <?xml version="1.0" encoding="utf-8"?>
    // <root>
    //     <html>
    //         <body>
    //             <p>
    //                 <b>
    //                     <text>Hello</text>
    //                 </b>
    //                 <text> World!</text>
    //             </p>
    //             <p>
    //                 <text>This is a test</text>
    //             </p>
    //         </body>
    //     </html>
    // </root>

    // Call DropTextFormattingTags to put the h2x instance back in "drop" mode.
    h2x.DropTextFormattingTags();

    // Convert again. The output matches the first conversion:
    wprintf(L"%ls\n", h2x.toXml());

    // The resulting XML is:

    // <?xml version="1.0" encoding="utf-8"?>
    // <root>
    //     <html>
    //         <body>
    //             <p>
    //                 <text>Hello World!</text>
    //             </p>
    //             <p>
    //                 <text>This is a test</text>
    //             </p>
    //         </body>
    //     </html>
    // </root>
    }
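
The idea of "dropping" a text formatting tag (the tag is removed, its text is kept) can be illustrated without the Chilkat library. The block below is only a minimal sketch under stated assumptions: `kFormattingTags`, `tagName`, and `stripFormattingTags` are hypothetical names introduced here, not part of the Chilkat API, and the naive scan for `<` and `>` is not how Chilkat parses HTML. It copies the input string and omits any tag whose name is in the list given in the sample's comment.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Tag names listed in the sample above as "text formatting" tags.
static const std::set<std::string> kFormattingTags = {
    "b", "font", "i", "u", "br", "center", "em", "strong",
    "big", "tt", "s", "small", "strike", "sub", "sup"};

// Hypothetical helper: returns the lowercase element name from the text
// between '<' and '>', e.g. "b", "/b", "font color='red'" or "BR/".
static std::string tagName(const std::string &body) {
    std::size_t i = 0;
    if (i < body.size() && body[i] == '/') ++i;  // skip the slash of a closing tag
    std::string name;
    while (i < body.size() && std::isalnum(static_cast<unsigned char>(body[i]))) {
        name += static_cast<char>(std::tolower(static_cast<unsigned char>(body[i])));
        ++i;
    }
    return name;
}

// Hypothetical helper: copies html, omitting formatting tags but keeping
// the text they enclose. All other tags are copied unchanged.
std::string stripFormattingTags(const std::string &html) {
    std::string out;
    std::size_t pos = 0;
    while (pos < html.size()) {
        std::size_t open = html.find('<', pos);
        if (open == std::string::npos) {          // no more tags: copy the rest
            out.append(html, pos, std::string::npos);
            break;
        }
        out.append(html, pos, open - pos);        // text before the tag
        std::size_t close = html.find('>', open);
        if (close == std::string::npos) {         // unterminated tag: copy as-is
            out.append(html, open, std::string::npos);
            break;
        }
        assert(close > open);
        std::string body = html.substr(open + 1, close - open - 1);
        if (kFormattingTags.count(tagName(body)) == 0) {
            out.append(html, open, close - open + 1);  // keep non-formatting tags
        }
        pos = close + 1;
    }
    return out;
}
```

For instance, `stripFormattingTags("<p><b>Hello</b> World!<p>This is a test")` returns `<p>Hello World!<p>This is a test`, which mirrors how the two text runs in the first paragraph end up as a single `Hello World!` text node in "drop" mode.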