This page discusses the options available for working with HTML [en.wikipedia.org] documents in your Qt application. Please also read the general considerations outlined on the Handling Document Formats page.
Reading / Writing
Qt’s Scribe framework (see Handling Document Formats) has built-in support for loading from / saving to HTML (see QTextDocument::setHtml and toHtml as well as QTextDocumentWriter). Together with the format-independent API that QTextDocument provides for modifying documents (or creating them from scratch), this makes Scribe an adequate framework for processing or generating HTML documents.
However, it only supports a limited subset of static HTML 4 / CSS 2.1 – corresponding to the limited set of built-in document features which QTextDocument supports internally.
The WebKit-based web browser framework shipped with Qt provides the QWebPage and QWebFrame classes, which can be used to load an HTML document (or any web page) without actually rendering it, and to access or modify it through a DOM-like API. Saving back to HTML is possible using QWebFrame::toHtml.
Manual XML processing
If your application needs to parse or write HTML/XHTML documents which are valid XML, consider processing them using Qt’s XML handling classes (see Handling Document Formats).
Note that there are third-party tools/libraries available for automatically converting “normal” (and even broken) HTML documents into valid XHTML/XML which is suitable for processing:
|Name||Type||Platforms||License|
|HTML Tidy [tidy.sourceforge.net]||stand-alone tool||Win, Mac, Linux, …||MIT-like [permissive]|
|TidyLib [tidy.sourceforge.net]||C library||Win, Mac, Linux, …||MIT-like [permissive]|
|Chilkat [chilkatsoft.com]||C++ library||Win, Mac, Linux, …||proprietary|
Manual HTML processing
For specialized HTML parsers whose API is as low-level as that of Qt’s XML handling classes, refer to third-party C/C++ libraries, e.g.:
|libxml2 [xmlsoft.org]||C||yes||?||stream, SAX, DOM (non-validating?)||Win, Mac, Linux, …||MIT [permissive]|
|htmlcxx [htmlcxx.sourceforge.net]||C++||yes||yes||SAX, DOM, ? (non-validating)||Win, Linux, ?||LGPL [weak copyleft]|
|libhtml [libhtml.bsd.lv]||C||yes||yes||stream (strongly-validating)||Linux, ?||ISC [permissive]|
Rendering / Interactive Viewing
As already described above, Qt’s Scribe framework supports automatically importing HTML content into a QTextDocument.
Once in that form, you can…
- …render it onto any QPaintDevice using QTextDocument::drawContents.
- …show it to the user through a QTextEdit widget (either in read-only mode, or in editable mode which allows the user to actually edit the document interactively).
Again, the restriction to the limited subset of static HTML 4 / CSS 2.1 supported by QTextDocument applies.