Qt
Internal/Contributor docs for the Qt SDK. Note: These are NOT official API docs; those are found at https://doc.qt.io/.
xml-processing.qdoc
1// Copyright (C) 2020 The Qt Company Ltd.
2// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
3
4/*!
5 \group xml-tools
6 \title XML Classes
7
8 \brief Classes that support XML.
9
10 These classes are relevant to \l{XML Processing}{XML} users.
11
12 \generatelist{related}
13*/
14
15/*!
16 \page xml-processing.html
17 \title XML Processing
18 \ingroup explanations-dataprocessingandio
19
20 \brief An overview of the XML processing facilities in Qt.
21
22 Qt provides two general-purpose sets of APIs to read and write well-formed
23 XML: \l{XML Streaming}{stream based} and
24 \l{Working with the DOM Tree}{DOM based}.
25
26 Qt also provides specific support for some XML dialects. For instance, the
27 Qt SVG module provides the QSvgRenderer and QSvgGenerator classes to read
28 and write a subset of SVG, an XML-based file
29 format. Qt also provides helper functions that may be useful to
30 those working with XML and XHTML: see QString::toHtmlEscaped() (the
31 replacement for the obsolete Qt::escape()) and Qt::convertFromPlainText().
32
33 \section1 Topics:
34
35 \list
36 \li \l {Classes for XML Processing}
37 \li \l {An Introduction to Namespaces}
38 \li \l {XML Streaming}
39 \li \l {Working with the DOM Tree}
40 \endlist
41
42 \section1 Classes for XML Processing
43
44 These classes are relevant to XML users.
45
46 \annotatedlist xml-tools
47*/
48
49/*!
50 \page xml-namespaces.html
51 \title An Introduction to Namespaces
52 \target namespaces
53
54 \nextpage XML Streaming
55
56 Parts of the Qt XML module documentation assume that you are familiar
57 with XML namespaces. Here we present a brief introduction; skip to
58 \l{#namespacesConventions}{Qt XML documentation conventions}
59 if you already know this material.
60
61 Namespaces are a concept introduced into XML to allow a more modular
62 design. With their help, data processing software can easily resolve
63 naming conflicts in XML documents.
64
65 Consider the following example:
66
67 \snippet code/doc_src_qtxml.qdoc 6
68
69 Here we find three different uses of the name \e title. If you wish to
70 process this document you will encounter problems because each of the
71 \e titles should be displayed in a different manner -- even though
72 they have the same name.
73
74 The solution is to have some means of identifying the first
75 occurrence of \e title as the title of a book, that is, to use the \e
76 title element of a book namespace to distinguish it from, for example,
77 the chapter title:
78 \snippet code/doc_src_qtxml.qdoc 7
79
80 \e book in this case is a \e prefix denoting the namespace.
81
82 Before we can apply a namespace to element or attribute names we must
83 declare it.
84
85 Namespaces are URIs like \e http://www.example.com/fnord/book/. This
86 does not mean that data must be available at this address; the URI is
87 simply used to provide a unique name.
88
89 We declare namespaces in the same way as attributes; strictly speaking
90 they \e are attributes. To make for example \e
91 http://www.example.com/fnord/ the document's default XML namespace \e
92 xmlns we write
93
94 \snippet code/doc_src_qtxml.qdoc 8
95
96 To distinguish the \e http://www.example.com/fnord/book/ namespace from
97 the default, we must supply it with a prefix:
98
99 \snippet code/doc_src_qtxml.qdoc 9
100
101 A namespace that is declared like this can be applied to element and
102 attribute names by prepending the appropriate prefix and a ":"
103 delimiter. We have already seen this with the \e book:title element.
104
105 Element names without a prefix belong to the default namespace. This
106 rule does not apply to attributes: an attribute without a prefix does
107 not belong to any of the declared XML namespaces at all. Attributes
108 always belong to the "traditional" namespace of the element in which
109 they appear. A "traditional" namespace is not an XML namespace, it
110 simply means that all attribute names belonging to one element must be
111 different. Later we will see how to assign an XML namespace to an
112 attribute.
113
114 Because attributes without prefixes are not in any XML
115 namespace, there is no collision between the attribute \e title (that
116 belongs to the \e author element) and for example the \e title element
117 within a \e chapter.
118
119 Let's clarify this with an example:
120 \snippet code/doc_src_qtxml.qdoc 10
121
122 Within the \e document element we have two namespaces declared. The
123 default namespace \e http://www.example.com/fnord/ applies to the \e
124 book element, the \e chapter element, the appropriate \e title element
125 and of course to \e document itself.
126
127 The \e book:author and \e book:title elements belong to the namespace
128 with the URI \e http://www.example.com/fnord/book/.
129
130 The two \e book:author attributes \e title and \e name have no XML
131 namespace assigned. They are only members of the "traditional"
132 namespace of the element \e book:author, meaning that for example two
133 \e title attributes in \e book:author are forbidden.
134
135 In the above example we circumvent the last rule by adding a \e title
136 attribute from the \e http://www.example.com/fnord/ namespace to \e
137 book:author: the \e fnord:title comes from the namespace with the
138 prefix \e fnord that is declared in the \e book:author element.
139
140 Clearly the \e fnord namespace has the same namespace URI as the
141 default namespace. So why didn't we simply use the default namespace
142 we'd already declared? The answer is quite complex:
143 \list
144 \li attributes without a prefix don't belong to any XML namespace at
145 all, not even to the default namespace;
146 \li additionally, omitting the prefix would lead to a \e title-title clash;
147 \li writing it as \e xmlns:title would declare a new namespace with the
148 prefix \e title instead of applying the default \e xmlns namespace.
149 \endlist
150
151 With the Qt XML classes, elements and attributes can be accessed in two
152 ways: either by referring to their qualified names consisting of the
153 namespace prefix and the "real" name (or \e local name) or by the
154 combination of local name and namespace URI.
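
    As a minimal sketch of these two ways of naming an element, the following
    hypothetical helper (\c describeElements() is not part of Qt) walks a
    document with QXmlStreamReader and reports the qualified name, the local
    name and the namespace URI of every start element. It assumes that
    namespace processing is left at its default setting (enabled).

    \code
    #include <QString>
    #include <QStringList>
    #include <QXmlStreamReader>

    // Hypothetical helper: returns one "qualifiedName|localName|namespaceUri"
    // entry for each start element found in xml.
    QStringList describeElements(const QString &xml)
    {
        QStringList result;
        QXmlStreamReader reader(xml);
        while (!reader.atEnd()) {
            if (reader.readNext() == QXmlStreamReader::StartElement) {
                result << QStringLiteral("%1|%2|%3")
                              .arg(reader.qualifiedName().toString(),
                                   reader.name().toString(),
                                   reader.namespaceUri().toString());
            }
        }
        return result;
    }
    \endcode

    For a \c book:title element whose \e book prefix is bound to
    \e http://www.example.com/fnord/book/, the entry would read
    \c {book:title|title|http://www.example.com/fnord/book/}.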
155
156 More information on XML namespaces can be found at
157 \l http://www.w3.org/TR/REC-xml-names/.
158
159 \target namespacesConventions
160 \section1 Conventions Used in the Qt XML Documentation
161
162 The following terms are used to distinguish the parts of names within
163 the context of namespaces:
164 \list
165 \li The \e {qualified name}
166 is the name as it appears in the document. (In the above example \e
167 book:title is a qualified name.)
168 \li A \e {namespace prefix} in a qualified name
169 is the part to the left of the ":". (\e book is the namespace prefix in
170 \e book:title.)
171 \li The \e {local part} of a name (also referred to as the \e {local
172 name}) appears to the right of the ":". (Thus \e title is the
173 local part of \e book:title.)
174 \li The \e {namespace URI} ("Uniform Resource Identifier") is a unique
175 identifier for a namespace. It looks like a URL
176 (e.g. \e http://www.example.com/fnord/ ) but does not require
177 data to be accessible by the given protocol at the named address.
178 \endlist
179
180 Elements without a ":" (like \e chapter in the example) do not have a
181 namespace prefix. In this case the local part and the qualified name
182 are identical (i.e. \e chapter).
183
184 \sa {DOM Bookmarks Application}
185*/
186
187/*!
188 \page xml-streaming.html
189 \title XML Streaming
190
191 \previouspage An Introduction to Namespaces
192 \nextpage Working with the DOM Tree
193
194 Qt provides two classes for reading and writing XML through a simple streaming
195 API: QXmlStreamReader and QXmlStreamWriter. These classes are located in
196 \l{Qt Serialization}{Qt Serialization (part of QtCore)}.
197
198 A stream reader reports an XML document as a stream
199 of tokens. This differs from SAX: SAX applications provide handlers to
200 receive XML events from the parser, whereas with QXmlStreamReader the
201 application drives the loop, pulling tokens from the reader when they are needed.
202 This pulling approach makes it possible to build recursive descent parsers,
203 allowing XML parsing code to be split into different methods or classes.
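
    The pull approach can be sketched as a small recursive descent parser.
    The element names (\c library, \c book, \c title) and both functions are
    hypothetical and chosen only for illustration. Each function handles one
    element and delegates its children to another function.

    \code
    #include <QLatin1String>
    #include <QString>
    #include <QStringList>
    #include <QXmlStreamReader>

    // Reads one <book> element and returns the text of its <title> child.
    // The reader must be positioned on the <book> start tag.
    static QString readBook(QXmlStreamReader &reader)
    {
        QString title;
        while (reader.readNextStartElement()) {
            if (reader.name() == QLatin1String("title"))
                title = reader.readElementText();
            else
                reader.skipCurrentElement(); // ignore children we do not know
        }
        return title;
    }

    // Reads the <library> root element and returns the titles of its books.
    QStringList readLibrary(const QString &xml)
    {
        QStringList titles;
        QXmlStreamReader reader(xml);
        if (reader.readNextStartElement()
            && reader.name() == QLatin1String("library")) {
            while (reader.readNextStartElement()) {
                if (reader.name() == QLatin1String("book"))
                    titles << readBook(reader);
                else
                    reader.skipCurrentElement();
            }
        }
        return titles;
    }
    \endcode

    Because the application asks for the next token itself, \c readBook() can
    consume exactly one element and return, leaving the reader positioned for
    its caller.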
204
205 QXmlStreamReader is a well-formed XML 1.0 parser that excludes external
206 parsed entities. Hence, data provided by the stream reader adheres to the
207 W3C's criteria for well-formed XML, as long as no error occurs. Otherwise,
208 functions such as \l{QXmlStreamReader::atEnd()}{atEnd()},
209 \l{QXmlStreamReader::error()}{error()} and \l{QXmlStreamReader::hasError()}
210 {hasError()} can be used to check and view the errors.
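
    A minimal sketch of such an error check follows. The helper name
    \c firstXmlError() and the format of the returned message are assumptions
    made for this illustration.

    \code
    #include <QString>
    #include <QXmlStreamReader>

    // Hypothetical helper: returns an empty string if xml could be read
    // without errors, otherwise a description of the error that stopped
    // the reader.
    QString firstXmlError(const QString &xml)
    {
        QXmlStreamReader reader(xml);
        // atEnd() becomes true at the end of the document or after an error.
        while (!reader.atEnd())
            reader.readNext();

        if (reader.hasError()) {
            return QStringLiteral("line %1: %2")
                    .arg(reader.lineNumber())
                    .arg(reader.errorString());
        }
        return QString();
    }
    \endcode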
211
212 An example of an implementation that uses QXmlStreamReader would be the
213 \l{QXmlStream Bookmarks Example#xbelreader-class-definition}{XbelReader} in
214 \l{QXmlStream Bookmarks Example}, which wraps a QXmlStreamReader. Read the
215 \l{QXmlStream Bookmarks Example#xbelreader-class-implementation}{implementation}
216 to learn more about how to use the QXmlStreamReader class.
217
218 Paired with QXmlStreamReader is the QXmlStreamWriter class, which provides
219 an XML writer with a simple streaming API. QXmlStreamWriter operates on a
220 QIODevice and has specialized functions for all XML tokens or events you
221 want to write, such as \l{QXmlStreamWriter::writeDTD()}{writeDTD()},
222 \l{QXmlStreamWriter::writeCharacters()}{writeCharacters()},
223 \l{QXmlStreamWriter::writeComment()}{writeComment()} and so on.
224
225 To write an XML document with QXmlStreamWriter, you start a document with the
226 \l{QXmlStreamWriter::writeStartDocument()}{writeStartDocument()} function
227 and end it with \l{QXmlStreamWriter::writeEndDocument()}
228 {writeEndDocument()}, which implicitly closes all remaining open tags.
229 Element tags are opened with \l{QXmlStreamWriter::writeStartElement()}
230 {writeStartElement()} and followed by
231 \l{QXmlStreamWriter::writeAttribute()}{writeAttribute()} or
232 \l{QXmlStreamWriter::writeAttributes()}{writeAttributes()},
233 element content, and then \l{QXmlStreamWriter::writeEndElement()}
234 {writeEndElement()}. Also, \l{QXmlStreamWriter::writeEmptyElement()}
235 {writeEmptyElement()} can be used to write empty elements.
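
    The call sequence might look like the following sketch, which writes into
    a QString. The \c bookmark, \c title and \c separator element names and
    the \c writeBookmark() helper are hypothetical.

    \code
    #include <QString>
    #include <QXmlStreamWriter>

    // Hypothetical helper: serializes a single bookmark as an XML document.
    QString writeBookmark(const QString &href, const QString &title)
    {
        QString output;
        QXmlStreamWriter writer(&output);

        writer.writeStartDocument();
        writer.writeStartElement(QStringLiteral("bookmark"));  // <bookmark ...>
        writer.writeAttribute(QStringLiteral("href"), href);
        writer.writeTextElement(QStringLiteral("title"), title);
        writer.writeEmptyElement(QStringLiteral("separator")); // <separator/>
        writer.writeEndElement();                              // </bookmark>
        writer.writeEndDocument();

        return output;
    }
    \endcode

    Attributes have to be written directly after the start tag they belong
    to, before any element content.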
236
237 Element content comprises characters, entity references or nested elements.
238 Content can be written with \l{QXmlStreamWriter::writeCharacters()}
239 {writeCharacters()}, a function that also takes care of escaping all
240 forbidden characters and character sequences,
241 \l{QXmlStreamWriter::writeEntityReference()}{writeEntityReference()},
242 or subsequent calls to \l{QXmlStreamWriter::writeStartElement()}
243 {writeStartElement()}.
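
    The escaping can be observed with a sketch like the one below. The
    \c note element and the \c escapedNote() helper are hypothetical.

    \code
    #include <QString>
    #include <QXmlStreamWriter>

    // Hypothetical helper: wraps text in a <note> element and lets
    // writeCharacters() escape whatever is not allowed in character data.
    QString escapedNote(const QString &text)
    {
        QString output;
        QXmlStreamWriter writer(&output);
        writer.writeStartElement(QStringLiteral("note"));
        writer.writeCharacters(text);
        writer.writeEndElement();
        return output;
    }
    \endcode

    Passing \c {1 < 2 & 3} as the text produces element content in which
    \c < is written as \c {&lt;} and \c & as \c {&amp;}.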
244
245 The \l{QXmlStream Bookmarks Example#xbelwriter-class-definition}{XbelWriter}
246 class from \l{QXmlStream Bookmarks Example} wraps a QXmlStreamWriter. View
247 the \l{QXmlStream Bookmarks Example#xbelwriter-class-implementation}{implementation}
248 to see how to use the QXmlStreamWriter class.
249*/
250
251/*!
252 \page xml-dom.html
253 \title Working with the DOM Tree
254 \target dom
255
256 \previouspage XML Streaming
257
258 DOM Level 2 is a W3C Recommendation for XML interfaces that maps the
259 constituents of an XML document to a tree structure. The specification
260 of DOM Level 2 can be found at \l{http://www.w3.org/DOM/}.
261
262 \target domIntro
263 \section1 Introduction to DOM
264
265 DOM provides an interface to access and change the content and
266 structure of an XML file. It presents a hierarchical view of the document
267 (a tree view). Thus -- in contrast to the streaming API provided
268 by QXmlStreamReader -- an object
269 model of the document is resident in memory after parsing, which makes
270 manipulation easy.
271
272 All DOM nodes in the document tree are subclasses of \l QDomNode. The
273 document itself is represented as a \l QDomDocument object.
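
    As a minimal sketch of working with the in-memory tree, the hypothetical
    helper below parses a document, appends a new element to its root and
    serializes the result. The \c chapter element, its \c title attribute and
    the name \c addChapter() are assumptions made for this illustration. The
    QDom classes are part of the Qt XML module.

    \code
    #include <QDomDocument>
    #include <QDomElement>
    #include <QString>

    // Hypothetical helper: appends <chapter title="..."/> to the root element
    // of xml and returns the modified document, or an empty string if xml
    // could not be parsed.
    QString addChapter(const QString &xml, const QString &title)
    {
        QDomDocument doc;
        if (!doc.setContent(xml))
            return QString();

        QDomElement root = doc.documentElement();
        QDomElement chapter = doc.createElement(QStringLiteral("chapter"));
        chapter.setAttribute(QStringLiteral("title"), title);
        root.appendChild(chapter);

        return doc.toString(-1); // -1: do not add any whitespace
    }
    \endcode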
274
275 Here are the available node classes and their potential child classes:
276
277 \list
278 \li \l QDomDocument: Possible children are
279 \list
280 \li \l QDomElement (at most one)
281 \li \l QDomProcessingInstruction
282 \li \l QDomComment
283 \li \l QDomDocumentType
284 \endlist
285 \li \l QDomDocumentFragment: Possible children are
286 \list
287 \li \l QDomElement
288 \li \l QDomProcessingInstruction
289 \li \l QDomComment
290 \li \l QDomText
291 \li \l QDomCDATASection
292 \li \l QDomEntityReference
293 \endlist
294 \li \l QDomDocumentType: No children
295 \li \l QDomEntityReference: Possible children are
296 \list
297 \li \l QDomElement
298 \li \l QDomProcessingInstruction
299 \li \l QDomComment
300 \li \l QDomText
301 \li \l QDomCDATASection
302 \li \l QDomEntityReference
303 \endlist
304 \li \l QDomElement: Possible children are
305 \list
306 \li \l QDomElement
307 \li \l QDomText
308 \li \l QDomComment
309 \li \l QDomProcessingInstruction
310 \li \l QDomCDATASection
311 \li \l QDomEntityReference
312 \endlist
313 \li \l QDomAttr: Possible children are
314 \list
315 \li \l QDomText
316 \li \l QDomEntityReference
317 \endlist
318 \li \l QDomProcessingInstruction: No children
319 \li \l QDomComment: No children
320 \li \l QDomText: No children
321 \li \l QDomCDATASection: No children
322 \li \l QDomEntity: Possible children are
323 \list
324 \li \l QDomElement
325 \li \l QDomProcessingInstruction
326 \li \l QDomComment
327 \li \l QDomText
328 \li \l QDomCDATASection
329 \li \l QDomEntityReference
330 \endlist
331 \li \l QDomNotation: No children
332 \endlist
333
334 Two collection classes are provided:
335 \l QDomNodeList is a list of nodes,
336 and \l QDomNamedNodeMap is used to handle unordered sets of nodes
337 (often used for attributes).
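
    The following sketch uses both collection classes together. The
    \c bookmark element name and the \c bookmarkAttributes() helper are
    hypothetical. Since a QDomNamedNodeMap is unordered, the order of the
    attributes of a single element should not be relied upon.

    \code
    #include <QDomAttr>
    #include <QDomDocument>
    #include <QDomNamedNodeMap>
    #include <QDomNodeList>
    #include <QLatin1Char>
    #include <QStringList>

    // Hypothetical helper: returns "name=value" for every attribute of every
    // <bookmark> element in doc.
    QStringList bookmarkAttributes(const QDomDocument &doc)
    {
        QStringList result;
        const QDomNodeList bookmarks =
                doc.elementsByTagName(QStringLiteral("bookmark"));
        for (int i = 0; i < bookmarks.count(); ++i) {
            const QDomNamedNodeMap attributes = bookmarks.at(i).attributes();
            for (int j = 0; j < attributes.count(); ++j) {
                const QDomAttr attr = attributes.item(j).toAttr();
                result << attr.name() + QLatin1Char('=') + attr.value();
            }
        }
        return result;
    }
    \endcode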
338
339 The \l QDomImplementation class allows the user to query features of the
340 DOM implementation.
341
342 To get started please refer to the \l QDomDocument documentation.
343 You might also want to take a look at the \l{DOM Bookmarks Application},
344 which illustrates how to read and write an XML bookmark file (XBEL)
345 using DOM.
346*/