Qt
Internal/Contributor docs for the Qt SDK. <b>Note:</b> These are NOT official API docs; those are found <a href='https://doc.qt.io/'>here</a>.
scenegraph.qdoc
1// Copyright (C) 2019 The Qt Company Ltd.
2// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
3
4/*!
5\title Qt Quick Scene Graph
6\page qtquick-visualcanvas-scenegraph.html
7
8\section1 The Scene Graph in Qt Quick
9
10Qt Quick 2 makes use of a dedicated scene graph that is then traversed and
11rendered via a graphics API such as OpenGL ES, OpenGL, Vulkan, Metal, or Direct
123D. Using a scene graph for graphics rather than the traditional imperative
13painting systems (QPainter and similar), means the scene to be rendered can be
14retained between frames and the complete set of primitives to render is known
15before rendering starts. This opens up a number of optimizations, such as
16batch rendering to minimize state changes and discarding obscured primitives.
17
18For example, say a user interface contains a list of ten items
19where each item has a background color, an icon, and a text label. Using
20traditional drawing techniques, this would result in 30 draw calls and
21a similar number of state changes. A scene graph, on the other hand,
22could reorganize the primitives to render such that all backgrounds
23are drawn in one call, then all icons, then all the text, reducing the
24total number of draw calls to only 3. Batching and state change
25reduction like this can greatly improve performance on some hardware.
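
The draw call arithmetic above can be sketched with a small model that uses only
the C++ standard library. This is an illustration, not the scene graph's batching
code: the names Kind, Primitive, buildList(), immediateDrawCalls(), and
batchedDrawCalls() are hypothetical, and the model assumes that primitives of the
same kind share all render state and do not overlap each other.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Kind of primitive; each kind is assumed to need its own render state.
enum class Kind { Background, Icon, Text };

struct Primitive {
    Kind kind;
    int item; // index of the list item that owns the primitive
};

// Build the example scene: every item has a background, an icon and a text.
std::vector<Primitive> buildList(int itemCount)
{
    std::vector<Primitive> scene;
    for (int i = 0; i < itemCount; ++i) {
        scene.push_back({Kind::Background, i});
        scene.push_back({Kind::Icon, i});
        scene.push_back({Kind::Text, i});
    }
    return scene;
}

// Imperative painting: one draw call per primitive, in paint order.
std::size_t immediateDrawCalls(const std::vector<Primitive> &scene)
{
    return scene.size();
}

// Retained scene: primitives that share state are merged into one batch,
// and each batch is submitted with a single draw call.
std::size_t batchedDrawCalls(const std::vector<Primitive> &scene)
{
    std::map<Kind, std::vector<int>> batches;
    for (const Primitive &p : scene)
        batches[p.kind].push_back(p.item);
    return batches.size();
}
```

For a list built with buildList(10), the first function counts 30 draw calls and
the second counts 3. A real renderer also has to preserve the visual stacking
order of overlapping primitives, which this model ignores.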
26
27The scene graph is closely tied to Qt Quick 2.0 and cannot be used
28stand-alone. The scene graph is managed and rendered by the
29QQuickWindow class and custom Item types can add their graphical
30primitives into the scene graph through a call to
31QQuickItem::updatePaintNode().
32
33The scene graph is a graphical representation of the Item scene, an
34independent structure that contains enough information to render all
35the items. Once it has been set up, it can be manipulated and rendered
36independently of the state of the items. On many platforms, the scene
37graph will even be rendered on a dedicated render thread while the GUI
38thread is preparing the next frame's state.
39
40\note Much of the information listed on this page is specific to the built-in,
41default behavior of the Qt Quick Scene graph. When using an alternative scene
42graph adaptation, such as the \c software adaptation, not all concepts may
43apply. For more information about the different scene graph adaptations see
44\l{qtquick-visualcanvas-adaptations.html}{Scene Graph Adaptations}.
45
46
47\section1 Qt Quick Scene Graph Structure
48
49The scene graph is composed of a number of predefined node types, each
50serving a dedicated purpose. Although we refer to it as a scene graph,
51a more precise definition is node tree. The tree is built from
52QQuickItem types in the QML scene and internally the scene is then
53processed by a renderer which draws the scene. The nodes themselves do
54\b not contain any active drawing code or a virtual \c paint()
55function.
56
57Even though the node tree is mostly built internally by the existing
58Qt Quick QML types, it is possible for users to also add complete
59subtrees with their own content, including subtrees that represent 3D
60models.
61
62
63\section2 Nodes
64
65The most important node for users is the \l QSGGeometryNode. It is
66used to define custom graphics by defining its geometry and
67material. The geometry is defined using \l QSGGeometry and describes
68the shape or mesh of the graphical primitive. It can be a line, a
69rectangle, a polygon, many disconnected rectangles, or a complex 3D
70mesh. The material defines how the pixels in this shape are filled.
71
72A node can have any number of children and geometry nodes will be
73rendered so they appear in child-order with parents behind their
74children. \note This does not say anything about the actual rendering
75order in the renderer. Only the visual output is guaranteed.
76
77The available nodes are:
78\annotatedlist{qtquick-scenegraph-nodes}
79
80Custom nodes are added to the scene graph by subclassing QQuickItem,
81overriding QQuickItem::updatePaintNode(), and setting the
82\l {QQuickItem::ItemHasContents} flag.
83
84\warning It is crucial that native graphics (OpenGL, Vulkan, Metal, etc.)
85operations and interaction with the scene graph happen exclusively on the
86render thread, primarily during the updatePaintNode() call. The rule of thumb
87is to only use classes with the "QSG" prefix inside the
88QQuickItem::updatePaintNode() function.
89
90For more details, see the \l {Scene Graph - Custom Geometry}.
91
92\section3 Preprocessing
93
94Nodes have a virtual QSGNode::preprocess() function, which will be
95called before the scene graph is rendered. Node subclasses can set the
96flag \l QSGNode::UsePreprocess and override the QSGNode::preprocess()
97function to do final preparation of their node. For example, dividing a
98bezier curve into the correct level of detail for the current scale
99factor or updating a section of a texture.
100
101\section3 Node Ownership
102
103Ownership of the nodes is either handled explicitly by the creator, or
104passed to the scene graph by setting the flag \l QSGNode::OwnedByParent.
105Assigning ownership to the scene graph is often preferable as it
106simplifies cleanup when the scene graph lives outside the GUI thread.
107
108
109\section2 Materials
110
111The material describes how the interior of a geometry in a \l QSGGeometryNode
112is filled. It encapsulates graphics shaders for the vertex and fragment stages
113of the graphics pipeline and provides ample flexibility in what can be
114achieved, though most of the Qt Quick items themselves only use very basic
115materials, such as solid color and texture fills.
116
117For users who just want to apply custom shading to a QML Item type,
118it is possible to do this directly in QML using the \l ShaderEffect
119type.
120
121Below is a complete list of material classes:
122\annotatedlist{qtquick-scenegraph-materials}
123
124\section2 Convenience Nodes
125
126The scene graph API is low-level and focuses on performance rather than
127convenience. Writing custom geometries and materials from scratch, even the
128most basic ones, requires a non-trivial amount of code. For this reason, the
129API includes a few convenience classes to make the most common custom nodes
130readily available.
131
132\list
133\li \l QSGSimpleRectNode - a QSGGeometryNode subclass which defines a
134rectangular geometry with a solid color material.
135
136\li \l QSGSimpleTextureNode - a QSGGeometryNode subclass which defines
137a rectangular geometry with a texture material.
138\endlist
139
140
141
142\section1 Scene Graph and Rendering
143
144The rendering of the scene graph happens internally in the QQuickWindow class,
145and there is no public API to access it. There are, however, a few places in
146the rendering pipeline where the user can attach application code. This can be
147used to add custom scene graph content or to insert arbitrary rendering
148commands by directly calling the graphics API (OpenGL, Vulkan, Metal, etc.)
149that is in use by the scene graph. The integration points are defined by the
150render loop.
151
152For a detailed description of how the scene graph renderer works, see \l {Qt
153Quick Scene Graph Default Renderer}.
154
155There are two render loop variants available: \c basic, and \c threaded.
156\c basic is single-threaded, while \c threaded performs scene graph rendering on a
157dedicated thread. Qt attempts to choose a suitable loop based on the platform
158and possibly the graphics drivers in use. When this is not satisfactory, or for
159testing purposes, the environment variable \c QSG_RENDER_LOOP can be used to
160force the usage of a given loop. To verify which render loop is in use, enable
161the \c qt.scenegraph.general \l {QLoggingCategory}{logging category}.
162
163\section2 Threaded Render Loop ('threaded')
164\target threaded_render_loop
165
166On many configurations, the scene graph rendering will happen on a
167dedicated render thread. This is done to increase parallelism of
168multi-core processors and make better use of stall times such as
169waiting for a blocking swap buffer call. This offers significant
170performance improvements, but imposes certain restrictions on where
171and when interaction with the scene graph can happen.
172
173The following is a simple outline of how a frame gets rendered with the
174threaded render loop and OpenGL. The steps are the same with other graphics
175APIs as well, apart from the OpenGL context specifics.
176
177\image sg-renderloop-threaded.png
178
179\list 1
180
181\li A change occurs in the QML scene, causing \c QQuickItem::update()
182to be called. This can be the result of, for instance, an animation or
183user input. An event is posted to the render thread to initiate a new
184frame.
185
186\li The render thread prepares to draw a new frame and initiates a block on the
187GUI thread.
188
189\li While the render thread is preparing the new frame, the GUI thread
190calls QQuickItem::updatePolish() to do final touch-up of items before
191they are rendered.
192
193\li GUI thread is blocked.
194
195\li The QQuickWindow::beforeSynchronizing() signal is emitted.
196Applications can make direct connections (using Qt::DirectConnection)
197to this signal to do any preparation required before calls to
198QQuickItem::updatePaintNode().
199
200\li Synchronization of the QML state into the scene graph. This is
201done by calling the QQuickItem::updatePaintNode() function on all
202items that have changed since the previous frame. This is the only
203time the QML items and the nodes in the scene graph interact.
204
205\li GUI thread block is released.
206
207\li The scene graph is rendered:
208 \list 1
209
210 \li The QQuickWindow::beforeRendering() signal is emitted. Applications can
211 make direct connections (using Qt::DirectConnection) to this signal to use
212 custom graphics API calls which will then stack visually beneath the QML
213 scene.
214
215 \li Nodes that have set QSGNode::UsePreprocess will have their
216 QSGNode::preprocess() function invoked.
217
218 \li The renderer processes the nodes.
219
220 \li The renderer generates states and records draw calls for the graphics
221 API in use.
222
223 \li The QQuickWindow::afterRendering() signal is emitted. Applications can
224 make direct connections (using Qt::DirectConnection) to this signal to
225 issue custom graphics API calls which will then stack visually over the QML
226 scene.
227
228 \li The frame is now ready. The buffers are swapped (OpenGL), or a present
229 command is recorded and the command buffers are submitted to a graphics
230 queue (Vulkan, Metal). QQuickWindow::frameSwapped() is emitted.
231
232 \endlist
233
234\li While the render thread is rendering, the GUI is free to advance
235animations, process events, etc.
236
237\endlist
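
The blocking hand-off between the two threads in the outline above can be
pictured with a toy model built only on the C++ standard library. This is not
Qt's implementation: FrameSync, requestFrame(), renderLoop(), and runFrames()
are hypothetical names, and a single integer stands in for the entire item
state.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Toy model of the sync step between a GUI thread and a render thread.
class FrameSync
{
public:
    // GUI thread: publish new item state, then block until the render
    // thread has copied it.
    void requestFrame(int itemState)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_itemState = itemState;
        m_syncRequested = true;
        m_cv.notify_all();
        m_cv.wait(lock, [this] { return !m_syncRequested; });
    }

    // GUI thread: ask the render thread to exit.
    void quit()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_quit = true;
        m_cv.notify_all();
    }

    // Render thread: wait for a request, sync, unblock the GUI, render.
    void renderLoop()
    {
        for (;;) {
            int nodeState = 0;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_syncRequested || m_quit; });
                if (!m_syncRequested)
                    return;              // quit requested, nothing pending
                nodeState = m_itemState; // sync: copy item state into the node
                m_syncRequested = false;
                m_cv.notify_all();       // release the GUI thread
            }
            // "Render" outside the lock while the GUI thread keeps running.
            m_rendered.push_back(nodeState);
        }
    }

    const std::vector<int> &rendered() const { return m_rendered; }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_syncRequested = false;
    bool m_quit = false;
    int m_itemState = 0;
    std::vector<int> m_rendered; // touched only by the render thread until join()
};

// Drive a few frames from the calling ("GUI") thread.
std::vector<int> runFrames(int frameCount)
{
    FrameSync sync;
    std::thread renderThread([&sync] { sync.renderLoop(); });
    for (int frame = 1; frame <= frameCount; ++frame)
        sync.requestFrame(frame * 10);
    sync.quit();
    renderThread.join();
    return sync.rendered();
}
```

The GUI thread is blocked only while the state is copied. Rendering of frame N
then overlaps with the GUI thread preparing frame N+1, which is the source of
the parallelism described in this section.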
238
239The threaded renderer is currently used by default on Windows with
240Direct3D 11 and with OpenGL when using opengl32.dll, Linux excluding
241Mesa llvmpipe, \macos with Metal, mobile platforms, and Embedded Linux
242with EGLFS, and with Vulkan regardless of the platform. All this may
243change in future releases. It is always possible to force use of the
244threaded renderer by setting \c {QSG_RENDER_LOOP=threaded} in the
245environment.
246
247\section2 Non-threaded Render Loop ('basic')
248
249The non-threaded render loop is currently used by default on Windows with
250OpenGL when not using the system's standard opengl32.dll, \macos with OpenGL,
251WebAssembly, and Linux with some drivers. For the latter this is mostly a
252precautionary measure, as not all combinations of OpenGL drivers and windowing
253systems have been tested.
254
255On macOS with OpenGL, the threaded render loop is not supported when building
256with Xcode 10 (10.14 SDK) or later, since this opts in to layer-backed views on
257macOS 10.14. You can build with Xcode 9 (10.13 SDK) to opt out of
258layer-backing, in which case the threaded render loop is available and used by
259default. There is no such restriction with Metal.
260
261The threaded render loop is not supported on WebAssembly, since the web platform
262has limited support for using WebGL on other threads than the main thread, and
263limited support for blocking the main thread.
264
265Even when using the non-threaded render loop, you should write your code as if
266you are using the threaded renderer, as failing to do so will make the code
267non-portable.
268
269The following is a simplified illustration of the frame rendering sequence in
270the non-threaded renderer.
271
272\image sg-renderloop-singlethreaded.png
273
274
275\section2 Driving Animations
276
277\section3 What does \c{Advance Animations} refer to in the above diagrams?
278
279By default, a Qt Quick animation (such as a \l NumberAnimation) is driven by
280the default animation driver. This relies on basic system timers, such as
281QObject::startTimer(). The timer typically runs with an interval of 16
282milliseconds. While this will never be fully accurate and also depends on the
283accuracy of timers in the underlying platform, it has the benefit of being
284independent of the rendering. It provides uniform results regardless of the
285display refresh rate and if synchronization to the display's vertical sync is
286active or not. This is how animations work with the \c basic render loop.
287
288In order to provide more accurate results with less stutter on-screen,
289independent of the render loop design (be it single threaded or multiple
290threads) a render loop may decide to install its own custom animation driver,
291and take the operation of \c advancing it into its own hands, without relying
292on timers.
293
294This is what the \c threaded render loop implements. In fact, it installs not
295one, but two animation drivers: one on the gui thread (to drive regular
296animations, such as \l NumberAnimation), and one on the render thread (to drive
297render thread animations, i.e. the \l Animator types, such as \l
298OpacityAnimator or \l XAnimator). Both of these are advanced during the
299preparation of a frame, i.e. animations are now synchronized with rendering.
300This makes sense due to presentation being throttled to the display's vertical
301sync by the underlying graphics stack.
302
303Therefore, in the diagram for the \c threaded render loop above, there is an
304explicit \c{Advance animations} step on both threads. For the render thread,
305this is trivial: as the thread is being throttled to vsync, advancing
306animations (for \l Animator types) in each frame as if 16.67 milliseconds had
307elapsed gives more accurate results than relying on a system timer. (When
308throttled to the vsync timing, which is \c{1000/60} milliseconds with a 60 Hz
309refresh rate, it is fair to assume that it has been approximately that long
310since the same operation was done for the previous frame.)
311
312The same approach works for animations on the gui (main) thread too: due to the
313essential synchronization of data between the gui and render threads, the gui
314thread is effectively throttled to the same rate as the render thread, while
315still having the benefit of having less work to do, leaving more headroom for
316the application logic since much of the rendering preparations are now
317offloaded to the render thread.
318
319While the above examples used 60 frames per second, Qt Quick is prepared for
320other refresh rates as well: the rate is queried from the QScreen and the
321platform. For example, with a 144 Hz screen the interval is 6.94 ms. At the
322same time this is exactly what can cause trouble if vsync-based throttling is
323not functioning as expected, because if what the render loop thinks is
324happening is not matching reality, incorrect animation pacing will occur.
325
326\note Starting from Qt 6.5, the threaded render loop offers the possibility of
327opting in to another animation driver, based solely on the elapsed time
328(QElapsedTimer). To enable this, set the \c{QSG_USE_SIMPLE_ANIMATION_DRIVER}
329environment variable to a non-zero value. This has the benefits of not needing
330any of the infrastructure for falling back to a QTimer when there are multiple
331windows, not needing heuristics trying to determine if vsync-based throttling is
332missing or broken, being compatible with any kind of temporal drifts in vsync
333throttling, and not being tied to the primary screen's refresh rate, thus
334potentially working better in multi-screen setups. It also drives render
335thread animations (the \l Animator types) correctly even if vsync-based
336throttling is broken or disabled. On the other hand, animations may be
337perceived as less smooth with this approach. With compatibility in mind, it is
338offered as an opt-in feature at the moment.
339
340In summary, the \c threaded render loop is expected to provide smoother
341animations with less stutter as long as the following conditions are met:
342
343\list
344
345\li There is exactly one window (as in QQuickWindow) on-screen.
346
347\li VSync-based throttling works as expected with the underlying graphics and
348display stack.
349
350\endlist
351
352\section3 What if there is no or more than one window visible?
353
354When there is no renderable window, for example because our QQuickWindow is
355minimized (Windows) or fully obscured (macOS), we cannot present frames, thus
356cannot rely on the thread "working" in lockstep with the screen refresh rate.
357In this case, the \c threaded render loop automatically switches over to a
358system timer based approach to drive animations, i.e. temporarily switching
359over to the mechanism the \c basic loop would use.
360
361The same is true when there is more than one QQuickWindow instance on-screen.
362The model presented above for advancing animations on the gui thread, enabled
363by its synchronization with the render thread, is not satisfactory anymore, as
364there are now multiple sync points with multiple render threads, one per
365window. Here falling back to the system timer based approach becomes necessary
366as well, because how long and often the gui thread will block is now dependent
367on a number of factors, including the content in the windows (are they
368animating? how often are they updating?) and the graphics stack behavior (how
369exactly does it handle two or more threads presenting with wait-for-vsync?). As
370we cannot guarantee being throttled to the presentation rate of the window
371(which window would that be, to begin with?) in a stable, cross-platform
372manner, advancing animations cannot be based on the rendering.
373
374This switch of animation handling mechanisms is transparent to the
375applications.
376
377\section3 What if vsync-based throttling is dysfunctional, globally disabled, or the application disabled it itself?
378
379The \c threaded render loop relies on the graphics API implementation and/or
380the windowing system for throttling, for example, by requesting a swap interval
381of 1 in case of OpenGL (GLX, EGL, WGL), calling Present() with an interval of 1
382for Direct3D, or using the presentation mode \c FIFO with Vulkan.
383
384Some graphics drivers allow users to override this setting and turn it off,
385ignoring Qt's request. An example of this would be a system wide control panel
386of the graphics driver that allows overriding the application's settings with
387regard to vsync. It can also happen that a graphics stack is unable to provide
388proper vsync-based throttling, which can be the case in some virtual machines
389(mainly due to using a software rasterization based implementation of OpenGL or
390Vulkan).
391
392Without blocking in the swap/present operation (or some other graphics
393operation), such a render loop would advance animations too fast. This would be
394no issue with the \c basic render loop, because that always relies on system
395timers. With \c threaded, the behavior can vary based on the Qt version:
396
397\list
398
399\li If a system is known to be unable to provide vsync-based throttling, the
400only option before Qt 6.4 was to use the \c basic render loop, by manually
401setting \c {QSG_RENDER_LOOP=basic} in the environment before running the
402application.
403
404\li Starting with Qt 6.4, setting the \c{QSG_NO_VSYNC} environment
405variable to a non-zero value, or the window's QSurfaceFormat::swapInterval() to
406\c 0, can alleviate the problem as well: by explicitly requesting that
407vsync-based blocking be disabled, regardless of the request having any effect in practice,
408the \c threaded render loop can by extension recognize that relying on vsync to
409drive animations is futile, and it will fall back to using system timers, just
410as it would for more than one window.
411
412\li Even better, starting from Qt 6.4 the scene graph also attempts to recognize,
413using some simple heuristics, that the frames are being presented "too fast",
414and automatically switches over to system timers if deemed necessary. This means
415that in most cases there will be no need to do anything and applications will
416run animations as expected even when the default render loop is the \c threaded
417one. While this is transparent to applications, for troubleshooting and
418development purposes it is useful to know that this is logged with a \c{"Window
4190x7ffc8489c3d0 is determined to have broken vsync throttling ..."} message
420printed when \c{QSG_INFO} or \c{qt.scenegraph.general} is enabled. This method
421has the downside of activating only after a small set of frames, given that it
422first needs to collect data to evaluate, meaning that when opening a
423QQuickWindow the application may still show overly fast animations for a short
424period of time. Additionally, it may not capture all possible vsync-broken
425situations.
426
427\endlist
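
To illustrate why a detection of this kind needs a few frames before it can
react, here is a hypothetical heuristic written with the C++ standard library
only. It is not the algorithm Qt uses. The function name, the sample threshold,
and the tolerance factor are all assumptions made for this sketch.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical check: report broken throttling when the average time
// between presented frames is well below the expected vsync interval.
bool looksUnthrottled(const std::vector<double> &frameTimesMs,
                      double refreshRateHz,
                      std::size_t minSamples = 10,
                      double tolerance = 0.5)
{
    // Not enough data collected yet, so no decision can be made.
    if (frameTimesMs.size() < minSamples)
        return false;

    const double total = std::accumulate(frameTimesMs.begin(), frameTimesMs.end(), 0.0);
    const double average = total / static_cast<double>(frameTimesMs.size());
    const double expected = 1000.0 / refreshRateHz;
    return average < expected * tolerance;
}
```

Until minSamples frame times have been recorded the function cannot report a
problem, which mirrors the short period of overly fast animations mentioned
above.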
428
429Remember however, that by design none of this helps render thread animations
430(the \l Animator types). In the absence of vsync-based blocking,
431\l{Animator}{animators} will advance incorrectly by default, faster than
432expected, even when the workarounds are activated for regular
433\l{Animation}{animations}. If this becomes an issue, consider using the
434alternative animation driver by setting \c{QSG_USE_SIMPLE_ANIMATION_DRIVER}.
435
436\note Be aware that the rendering loop logic and event processing on the GUI
437(main) thread is not necessarily unthrottled even if waiting for vsync is
438disabled: both render loops schedule updates for windows via
439QWindow::requestUpdate(). This is backed by a 5 ms GUI thread timer on most
440platforms, in order to give time for event processing. On some platforms, e.g.
441macOS, it is using platform-specific APIs (such as CVDisplayLink) to get
442notified about the appropriate time to prepare a new frame, likely tied to the
443display's vsync in some form. This can be relevant in benchmarking and similar
444situations. For applications and tools attempting to perform low-level
445benchmarking it may be beneficial to set the \c{QT_QPA_UPDATE_IDLE_TIME}
446environment variable to \c 0 in order to potentially reduce idle time on the
447GUI thread. For normal application usage the defaults should, in most cases, be
448sufficient.
449
450\note When in doubt, enable the \c {qt.scenegraph.general} and \c
451{qt.scenegraph.time.renderloop} logging categories for troubleshooting, as
452these may reveal some clues as to why rendering and animations are not running
453at the expected pace.
454
455
456\section2 Custom control over rendering with QQuickRenderControl
457
458When using QQuickRenderControl, the responsibility for driving the
459rendering loop is transferred to the application. In this case no
460built-in render loop is used. Instead, it is up to the application to
461invoke the polish, synchronize and rendering steps at the appropriate
462time. It is possible to implement either a threaded or non-threaded
463behavior similar to the ones shown above.
464
465Additionally, applications may wish to implement and install their own
466QAnimationDriver in combination with QQuickRenderControl. This gives full
467control over driving Qt Quick animations, which can be particularly important
468for content that is not shown on screen, bearing no relation to the
469presentation rate simply because there is no presenting of the frame happening.
470This is optional; by default, animations advance based on the system timer.
471
472
473\section2 Extending the Scene Graph with QRhi-based and native 3D rendering
474
475The scene graph offers three methods for integrating application-provided
476graphics commands:
477
478\list
479
480\li Issuing either \l{QRhi}-based or OpenGL, Vulkan, Metal, Direct3D commands
481directly before or after the scene graph's own rendering. This in effect
482prepends or appends a set of draw calls into the main render pass. No additional
483render target is used.
484
485\li Rendering to a texture and creating a textured node in the scene graph. This
486involves an additional render pass and render target.
487
488\li Issuing draw calls inline with the scene graph's own rendering by
489instantiating a QSGRenderNode subclass in the scene graph. This is similar to
490the first approach but the custom draw calls are effectively injected into the
491scene graph's command stream.
492
493\endlist
494
495\section3 Underlay/overlay mode
496
497By connecting to the \l QQuickWindow::beforeRendering() and \l
498QQuickWindow::afterRendering() signals, applications can make \l QRhi or native
4993D API calls directly into the same context as the scene graph is rendering to.
500With APIs like Vulkan or Metal, applications can query native objects, such as
501the scene graph's command buffer, via QSGRendererInterface, and record commands
502to it as they see fit. As the signal names indicate, the user can then render
503content either under a Qt Quick scene or over it. The benefit of integrating in
504this manner is that no extra render targets are needed to perform the rendering,
505and a possibly expensive texturing step is eliminated. The downside is that the
506custom rendering can only be issued either at the beginning or at the end of Qt
507Quick's own rendering. Using QSGRenderNode instead of the QQuickWindow signals
508can lift that restriction somewhat, but in either case care must be taken when
509it comes to 3D content and depth buffer usage since relying on depth testing and
510rendering with depth write enabled can easily create situations where the custom
511content and the Qt Quick content's depth buffer usage conflict with each other.
512
513From Qt 6.6 the \l QRhi APIs are considered semi-public, i.e. offered to the
514applications and documented, albeit with a limited compatibility guarantee. This
515allows creating portable, cross-platform 2D/3D rendering code by using the same
516graphics and shader abstractions the scene graph itself uses.
517
518The \l {Scene Graph - RHI Under QML} example shows how to
519implement the underlay/overlay approach using \l QRhi.
520
521The \l {Scene Graph - OpenGL Under QML} example shows
522how to use these signals with OpenGL.
523
524The \l {Scene Graph - Direct3D 11 Under QML} example shows
525how to use these signals with Direct3D.
526
527The \l {Scene Graph - Metal Under QML} example shows
528how to use these signals with Metal.
529
530The \l {Scene Graph - Vulkan Under QML} example shows
531how to use these signals with Vulkan.
532
533Starting with Qt 6.0, direct usage of the underlying graphics API must be
534enclosed by a call to \l QQuickWindow::beginExternalCommands() and \l
535QQuickWindow::endExternalCommands(). This concept may be familiar from \l
536QPainter::beginNativePainting(), and serves a similar purpose: it allows the Qt
537Quick Scene Graph to recognize that any cached state and assumptions about the
538state within the currently recorded render pass, if there is one, are now
539invalid, because the application code may have altered it by working directly
540with the underlying graphics API. This is neither applicable nor necessary when
541using \l QRhi.
542
543When mixing custom OpenGL rendering with the scene graph, it is important that the
544application does not leave the OpenGL context in a state with buffers bound,
545attributes enabled, special values in the z-buffer or stencil-buffer or similar.
546Doing so can result in unpredictable behavior.
547
548The custom rendering code must be thread aware in the sense that it should not
549assume being executed on the GUI (main) thread of the application. When
550connecting to the \l QQuickWindow signals, the application should use
551Qt::DirectConnection and understand that the connected slots are invoked on the
552scene graph's dedicated render thread, if there is one.
553
554\section3 The texture-based approach
555
556The texture-based alternative is the most flexible approach when the application
557needs to have a "flattened", 2D image of some custom 3D rendering within the Qt
558Quick scene. This also allows using a dedicated depth/stencil buffer that is
559independent of the buffers used by the main render pass.
560
561When using OpenGL, the legacy convenience class QQuickFramebufferObject can be
562used to achieve this. QRhi-based custom renderers and graphics APIs other than
563OpenGL can also follow this approach, even though QQuickFramebufferObject does
564not currently support them. Creating and rendering to a texture directly with
565the underlying API, followed by wrapping and using this resource in a Qt Quick
566scene in a custom QQuickItem, is demonstrated in the following examples:
567
568\l {Scene Graph - RHI Texture Item} example.
569
570\l {Scene Graph - Vulkan Texture Import} example.
571
572\l {Scene Graph - Metal Texture Import} example.
573
574\section3 The inline approach
575
576Using \l QSGRenderNode the custom draw calls are injected not at the beginning
577or the end of the recording of the scene graph's render pass, but rather during
578the scene graph's rendering process. This is achieved by creating a custom \l
579QQuickItem backed by an instance of \l QSGRenderNode, a scene graph node that
580exists specifically to allow issuing graphics commands either via \l QRhi or a
581native 3D API such as OpenGL, Vulkan, Metal, or Direct3D.
582
583The \l {Scene Graph - Custom QSGRenderNode} example gives a demonstration of
584this approach.
585
586\section2 Custom Items using QPainter
587
588Qt Quick provides a QQuickItem subclass, QQuickPaintedItem, which allows
589users to render content using QPainter.
590
591\warning QQuickPaintedItem uses an indirect 2D surface to render
592its content, either using software rasterization or using an OpenGL
593framebuffer object (FBO), so the rendering is a two-step
594operation: first the surface is rasterized, then it is drawn. Using
595the scene graph API directly is always significantly faster.
596
597\section1 Logging Support
598
599The scene graph has support for a number of logging categories. These
600can be useful in tracking down both performance issues and bugs in
601addition to being helpful to Qt contributors.
602
603\list
604
605\li \c {qt.scenegraph.time.texture} - logs the time spent doing texture uploads
606
607\li \c {qt.scenegraph.time.compilation} - logs the time spent doing shader compilation
608
609\li \c {qt.scenegraph.time.renderer} - logs the time spent in the various steps of the renderer
610
611\li \c {qt.scenegraph.time.renderloop} - logs the time spent in the various
612steps of the render loop. With the \c threaded render loop this gives an
613insight into the time elapsed between the various frame preparation steps both
614on the GUI and the render thread. It can therefore also be a useful
615troubleshooting tool, for example, to confirm how vsync-based throttling and
616other low-level Qt enablers, such as QWindow::requestUpdate(), affect the
617rendering and presentation pipeline.
618
619\li \c {qt.scenegraph.time.glyph} - logs the time spent preparing distance field glyphs
620
621\li \c {qt.scenegraph.general} - logs general information about various parts of the scene graph and the graphics stack
622
623\li \c {qt.scenegraph.renderloop} - creates a detailed log of the various stages involved in rendering. This log mode is primarily useful for developers working on Qt.
624
625\endlist
626
627The legacy \c{QSG_INFO} environment variable is also available. Setting it to a
628non-zero value enables the \c{qt.scenegraph.general} category.
629
630\note When encountering graphics problems, or when in doubt which render loop
631or graphics API is in use, always start the application with at least
632\c{qt.scenegraph.general} and \c{qt.rhi.*} enabled, or \c{QSG_INFO=1} set. This
633will then print some essential information onto the debug output during
634initialization.
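
The categories can also be enabled from C++ instead of through the
environment. The following is a sketch only, assuming an application built
around QQuickView and a hypothetical \c{qrc:/main.qml} resource; it uses
QLoggingCategory::setFilterRules() before any window is created so that the
initialization messages are not missed:

\code
#include <QGuiApplication>
#include <QLoggingCategory>
#include <QQuickView>
#include <QUrl>

int main(int argc, char *argv[])
{
    // Enable the categories recommended in the note above. Rules are
    // separated by newlines.
    QLoggingCategory::setFilterRules(
        QStringLiteral("qt.scenegraph.general=true\nqt.rhi.*=true"));

    QGuiApplication app(argc, argv);

    QQuickView view;
    view.setSource(QUrl(QStringLiteral("qrc:/main.qml"))); // hypothetical path
    view.show();
    return app.exec();
}
\endcode

Setting the \c QT_LOGGING_RULES environment variable to
\c{qt.scenegraph.general=true;qt.rhi.*=true} should have a similar effect
without rebuilding the application.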
635
636\section1 Scene Graph Backend
637
638In addition to the public API, the scene graph has an adaptation layer
639which opens up the implementation to do hardware specific
640adaptations. This is an undocumented, internal and private plugin API,
641which lets hardware adaptation teams make the most of their hardware.
642It includes:
643
644\list
645
646\li Custom textures; specifically the implementation of
647QQuickWindow::createTextureFromImage and the internal representation
648of the texture used by \l Image and \l BorderImage types.
649
650\li Custom renderer; the adaptation layer lets the plugin decide how
651the scene graph is traversed and rendered, making it possible to
652optimize the rendering algorithm for a specific hardware or to make
653use of extensions which improve performance.
654
655\li Custom scene graph implementation of many of the default QML
656types, including its text and font rendering.
657
658\li Custom animation driver; allows the animation system to hook
659into the low-level display vertical refresh to get smooth rendering.
660
661\li Custom render loop; allows better control over how QML deals
662with multiple windows.
663
664\endlist
665
666*/
667
668/*!
669 \title Qt Quick Scene Graph Default Renderer
670 \page qtquick-visualcanvas-scenegraph-renderer.html
671
672 This document explains how the default scene graph renderer works internally,
673 so that one can write code that uses it in an optimal fashion, both
674 performance and feature-wise.
675
676 One does not need to understand the internals of the renderer to get
677 good performance. However, it might help when integrating with the
678 scene graph or to figure out why it is not possible to squeeze the
679 maximum efficiency out of the graphics chip.
680
681 \note Even in the case where every frame is unique and everything is
682 uploaded from scratch, the default renderer will perform well.
683
684 The Qt Quick items in a QML scene populate a tree of QSGNode
685 instances. Once created, this tree is a complete description of how
686 a certain frame should be rendered. It does not contain any
687 references back to the Qt Quick items at all and will on most
688 platforms be processed and rendered in a separate thread. The
689 renderer is a self contained part of the scene graph which traverses
690 the QSGNode tree and uses geometry defined in QSGGeometryNode and
691 shader state defined in QSGMaterial to update the graphics state and
692 generate draw calls.
693
694 If needed, the renderer can be completely replaced using the
695 internal scene graph back-end API. This is mostly interesting for
696 platform vendors who wish to take advantage of non-standard hardware
697 features. For the majority of use cases, the default renderer will be
698 sufficient.
699
700 The default renderer focuses on two primary strategies to optimize
701 the rendering: Batching of draw calls, and retention of geometry on
702 the GPU.
703
704 \section1 Batching
705
706 Whereas a traditional 2D API, such as QPainter, Cairo or Context2D, is
707 written to handle thousands of individual draw calls per frame, OpenGL and
708 other hardware accelerated APIs perform best when the number of draw calls is
709 very low and state changes are kept to a minimum.
710
711 \note While \c OpenGL is used as an example in the following sections, the
712 same concepts apply to other graphics APIs as well.
713
714 Consider the following use case:
715
716 \image visualcanvas_list.png
717
718 The simplest way of drawing this list is on a cell-by-cell basis. First,
719 the background is drawn. This is a rectangle of a specific color. In
720 OpenGL terms this means selecting a shader program to do solid color
721 fills, setting up the fill color, setting the transformation matrix
722 containing the x and y offsets and then using for instance
723 \c glDrawArrays to draw two triangles making up the rectangle. The icon
724 is drawn next. In OpenGL terms this means selecting a shader program
725 to draw textures, selecting the active texture to use, setting the
726 transformation matrix, enabling alpha-blending and then using for
727 instance \c glDrawArrays to draw the two triangles making up the
728 bounding rectangle of the icon. The text and separator line between
729 cells follow a similar pattern. And this process is repeated for
730 every cell in the list, so for a longer list, the overhead imposed
731 by OpenGL state changes and draw calls completely outweighs the
732 benefit that using a hardware accelerated API could provide.
733
734 When each primitive is large, this overhead is negligible, but in
735 the case of a typical UI, there are many small items which add up to
736 a considerable overhead.
737
738 The default scene graph renderer works within these
739 limitations and will try to merge individual primitives together
740 into batches while preserving the exact same visual result. The
741 result is fewer OpenGL state changes and a minimal amount of draw
742 calls, resulting in optimal performance.
743
744 \section2 Opaque Primitives
745
746 The renderer distinguishes between opaque primitives and primitives
747 which require alpha blending. By using OpenGL's Z-buffer and giving
748 each primitive a unique z position, the renderer can freely reorder
749 opaque primitives without any regard for their location on screen
750 and which other elements they overlap with. By looking at each
751 primitive's material state, the renderer will create opaque
752 batches. From the Qt Quick core item set, this includes Rectangle items
753 with opaque colors and fully opaque images, such as JPEGs or BMPs.
754
755 Another benefit of using opaque primitives is that opaque
756 primitives do not require \c GL_BLEND to be enabled, which can be
757 quite costly, especially on mobile and embedded GPUs.
758
759 Opaque primitives are rendered in a front-to-back manner with
760 \c glDepthMask and \c GL_DEPTH_TEST enabled. On GPUs that internally do
761 early-z checks, this means that the fragment shader does not need to
762 run for pixels or blocks of pixels that are obscured. Beware that
763 the renderer still needs to take these nodes into account and the
764 vertex shader is still run for every vertex in these primitives, so
765 if the application knows that something is fully obscured, the best
766 thing to do is to explicitly hide it using Item::visible or
767 Item::opacity.
768
769 \note The Item::z is used to control an Item's stacking order
770 relative to its siblings. It has no direct relation to the renderer and
771 OpenGL's Z-buffer.
772
773 \section2 Alpha Blended Primitives
774
775 Once opaque primitives have been drawn, the renderer will disable
776 \c glDepthMask, enable \c GL_BLEND and render all alpha blended primitives
777 in a back-to-front manner.
778
779 Batching of alpha blended primitives requires a bit more effort in
780 the renderer as elements that are overlapping need to be rendered in
781 the correct order for alpha blending to look correct. Relying on the
782 Z-buffer alone is not enough. The renderer does a pass over all
783 alpha blended primitives and will look at their bounding rect in
784 addition to their material state to figure out which elements can be
785 batched and which cannot.
786
787 \image visualcanvas_overlap.png
788
789 In the left-most case, the blue backgrounds can be drawn in one call
790 and the two text elements in another call, as the texts only overlap
791 a background which they are stacked in front of. In the right-most
792 case, the background of "Item 4" overlaps the text of "Item 3" so in
793 this case, each of the backgrounds and texts needs to be drawn using
794 separate calls.
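
 The two cases can be reproduced with a small stand-alone model. The sketch
 below is not the renderer's actual algorithm; it is a simplified illustration
 with hypothetical types (\c Rect, \c Primitive, \c Batch) in which a string
 stands in for the material state. A primitive joins the most recent batch with
 the same material, unless it overlaps something in a batch that would then be
 drawn after it:

 \code
 #include <cstddef>
 #include <string>
 #include <vector>

 // All types and functions here are hypothetical and only model the idea.
 struct Rect { float x, y, w, h; };
 struct Primitive { std::string material; Rect bounds; };
 struct Batch { std::string material; std::vector<Primitive> items; };

 bool intersects(const Rect &a, const Rect &b)
 {
     return a.x < b.x + b.w && b.x < a.x + a.w
         && a.y < b.y + b.h && b.y < a.y + a.h;
 }

 bool overlapsBatch(const Primitive &p, const Batch &batch)
 {
     for (const Primitive &q : batch.items) {
         if (intersects(p.bounds, q.bounds))
             return true;
     }
     return false;
 }

 // Input is sorted back to front; the returned batches are in draw order.
 std::vector<Batch> buildAlphaBatches(const std::vector<Primitive> &backToFront)
 {
     std::vector<Batch> batches;
     for (const Primitive &p : backToFront) {
         std::size_t target = batches.size(); // means "no batch found"
         // Walk from the newest batch towards the oldest one.
         for (std::size_t i = batches.size(); i-- > 0; ) {
             if (batches[i].material == p.material) {
                 target = i;
                 break;
             }
             // p must be blended on top of this batch, so it cannot be
             // moved into any earlier batch.
             if (overlapsBatch(p, batches[i]))
                 break;
         }
         if (target == batches.size())
             batches.push_back({p.material, {p}});
         else
             batches[target].items.push_back(p);
     }
     return batches;
 }
 \endcode

 With two items that do not overlap each other, the model produces two
 batches (backgrounds and texts). When the second background overlaps the
 first text, it produces four, one per primitive.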
795
796 Z-wise, the alpha primitives are interleaved with the opaque nodes
797 and may trigger early-z when available, but again, setting
798 Item::visible to false is always faster.
799
800 \section2 Mixing with 3D Primitives
801
802 The scene graph can support pseudo 3D and proper 3D primitives. For
803 instance, one can implement a "page curl" effect using a
804 ShaderEffect or implement a bumpmapped torus using QSGGeometry and a
805 custom material. While doing so, one needs to take into account that
806 the default renderer already makes use of the depth buffer.
807
808 The renderer modifies the vertex shader returned from
809 QSGMaterialShader::vertexShader() and compresses the z values of the
810 vertex after the model-view and projection matrices have been applied
811 and then adds a small translation on the z to position it at the
812 correct z position.
813
814 The compression assumes that the z values are in the range of 0 to
815 1.
816
817 \section2 Texture Atlas
818
819 The active texture is a unique OpenGL state, which means that
820 multiple primitives using different OpenGL textures cannot be
821 batched. The Qt Quick scene graph, for this reason, allows multiple
822 QSGTexture instances to be allocated as smaller sub-regions of a
823 larger texture; a texture atlas.
824
825 The biggest benefit of texture atlases is that multiple QSGTexture
826 instances now refer to the same OpenGL texture instance. This makes
827 it possible to batch textured draw calls as well, such as Image
828 items, BorderImage items, ShaderEffect items and also C++ types such
829 as QSGSimpleTextureNode and custom QSGGeometryNodes using textures.
830
831 \note Large textures do not go into the texture atlas.
832
833 Atlas based textures are created by passing
834 QQuickWindow::TextureCanUseAtlas to the
835 QQuickWindow::createTextureFromImage().
836
837 \note Atlas based textures do not have texture coordinates ranging
838 from 0 to 1. Use QSGTexture::normalizedTextureSubRect() to get the
839 atlas texture coordinates.
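
 As an illustrative sketch of the two notes above, assuming a hypothetical
 helper named \c createAtlasTexture() that is called on the render thread (for
 example from QQuickItem::updatePaintNode()) and a geometry created with
 QSGGeometry::defaultAttributes_TexturedPoint2D() and four vertices:

 \code
 #include <QImage>
 #include <QQuickWindow>
 #include <QRectF>
 #include <QSGGeometry>
 #include <QSGTexture>

 // createAtlasTexture() is a hypothetical helper, not a Qt API.
 QSGTexture *createAtlasTexture(QQuickWindow *window, const QImage &image,
                                QSGGeometry *geometry, const QRectF &itemRect)
 {
     // Allow, but do not force, placement in the atlas.
     QSGTexture *texture =
         window->createTextureFromImage(image, QQuickWindow::TextureCanUseAtlas);

     // For an atlas texture this is a sub-rectangle of (0, 0, 1, 1).
     const QRectF source = texture->normalizedTextureSubRect();
     QSGGeometry::updateTexturedRectGeometry(geometry, itemRect, source);

     return texture; // the caller owns the texture
 }
 \endcode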
840
841 The scene graph uses heuristics to figure out how large the atlas
842 should be and what the size threshold for being entered into the
843 atlas is. If different values are needed, it is possible to override
844 them using the environment variables \c {QSG_ATLAS_WIDTH=[width]},
845 \c {QSG_ATLAS_HEIGHT=[height]} and \c
846 {QSG_ATLAS_SIZE_LIMIT=[size]}. Changing these values will mostly be
847 interesting for platform vendors.
848
849 \section1 Batch Roots
850
851 In addition to merging compatible primitives into batches, the
852 default renderer also tries to minimize the amount of data that
853 needs to be sent to the GPU for every frame. The default renderer
854 identifies subtrees which belong together and tries to put these
855 into separate batches. Once batches are identified, they are merged,
856 uploaded and stored in GPU memory, using Vertex Buffer Objects.
857
858 \section2 Transform Nodes
859
860 Each Qt Quick Item inserts a QSGTransformNode into the scene graph
861 tree to manage its x, y, scale or rotation. Child items will be
862 populated under this transform node. The default renderer tracks
863 the state of transform nodes between frames and will look at
864 subtrees to decide if a transform node is a good candidate to become
865 a root for a set of batches. A transform node which changes between
866 frames and which has a fairly complex subtree can become a batch
867 root.
868
869 QSGGeometryNodes in the subtree of a batch root are pre-transformed
870 relative to the root on the CPU. They are then uploaded and retained
871 on the GPU. When the transform changes, the renderer only needs to
872 update the matrix of the root, not each individual item, making list
873 and grid scrolling very fast. For successive frames, as long as
874 nodes are not being added or removed, rendering the list is
875 effectively for free. When new content enters the subtree, the batch
876 that gets it is rebuilt, but this is still relatively fast. There are
877 usually several unchanging frames for every frame with added or
878 removed nodes when panning through a grid or list.
879
880 Another benefit of identifying transform nodes as batch roots is
881 that it allows the renderer to retain the parts of the tree that have
882 not changed. For instance, say a UI consists of a list and a button
883 row. When the list is being scrolled and delegates are being added
884 and removed, the rest of the UI, the button row, is unchanged and
885 can be drawn using the geometry already stored on the GPU.
886
887 The node and vertex threshold for a transform node to become a batch
888 root can be overridden using the environment variables \c
889 {QSG_RENDERER_BATCH_NODE_THRESHOLD=[count]} and \c
890 {QSG_RENDERER_BATCH_VERTEX_THRESHOLD=[count]}. Overriding these flags
891 will be mostly useful for platform vendors.
892
893 \note Beneath a batch root, one batch is created for each unique
894 set of material state and geometry type.
895
896 \section2 Clipping
897
898 When setting Item::clip to true, it will create a QSGClipNode with a
899 rectangle in its geometry. The default renderer will apply this clip
900 by using scissoring in OpenGL. If the item is rotated by a
901 non-90-degree angle, OpenGL's stencil buffer is used. Qt Quick
902 Item only supports setting a rectangle as clip through QML, but the
903 scene graph API and the default renderer can use any shape for
904 clipping.
905
906 When applying a clip to a subtree, that subtree needs to be rendered
907 with a unique OpenGL state. This means that when Item::clip is true,
908 batching of that item is limited to its children. When there are
909 many children, like a ListView or GridView, or complex children,
910 like a TextArea, this is fine. One should, however, use clip on
911 smaller items with caution as it prevents batching. This includes
912 button labels, text fields, list delegates and table cells.
913 Clipping a Flickable (or item view) can often be avoided by arranging
914 the UI so that opaque items cover areas around the Flickable, and
915 otherwise relying on the window edges to clip everything else.
916
917 Setting Item::clip to \c true also sets the \l QQuickItem::ItemIsViewport
918 flag; child items with the \l QQuickItem::ItemObservesViewport flag may
919 use the viewport for a rough pre-clipping step: e.g. \l Text omits
920 lines of text that are completely outside the viewport. Omitting scene
921 graph nodes or limiting the \l {QSGGeometry::vertexCount()}{vertices}
922 is an optimization, which can be achieved by setting the
923 \l {QQuickItem::flags()}{flags} in C++ rather than setting
924 \l Item::clip in QML.
925
926 When implementing QQuickItem::updatePaintNode() in a custom item,
927 if it can render a lot of details over a large geometric area,
928 you should think about whether it's efficient to limit the graphics
929 to the viewport; if so, you can set the \l {QQuickItem::}
930 {ItemObservesViewport} flag and read the currently exposed area from
931 QQuickItem::clipRect(). One consequence is that updatePaintNode() will be
932 called more often (typically once per frame whenever content is moving in
933 the viewport).
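
 A minimal sketch of such an item follows. \c StripeItem is a hypothetical
 class, and the example assumes a Qt version that provides the
 \c ItemObservesViewport flag. To keep the illustration short, it covers only
 the currently exposed area with a single rectangle node:

 \code
 #include <QQuickItem>
 #include <QSGSimpleRectNode>

 // StripeItem is a hypothetical name used only for this illustration.
 class StripeItem : public QQuickItem
 {
     Q_OBJECT
 public:
     explicit StripeItem(QQuickItem *parent = nullptr)
         : QQuickItem(parent)
     {
         setFlag(ItemHasContents);
         setFlag(ItemObservesViewport); // opt in to viewport-driven updates
     }

 protected:
     QSGNode *updatePaintNode(QSGNode *oldNode, UpdatePaintNodeData *) override
     {
         auto *node = static_cast<QSGSimpleRectNode *>(oldNode);
         if (!node)
             node = new QSGSimpleRectNode(QRectF(), Qt::darkGray);

         // Limit the geometry to the part of the item that is exposed.
         node->setRect(clipRect());
         return node;
     }
 };
 \endcode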
934
935 \section2 Vertex Buffers
936
937 Each batch uses a vertex buffer object (VBO) to store its data on
938 the GPU. This vertex buffer is retained between frames and updated
939 when the part of the scene graph that it represents changes.
940
941 By default, the renderer will upload data into the VBO using
942 \c GL_STATIC_DRAW. It is possible to select a different upload strategy
943 by setting the environment variable \c
944 {QSG_RENDERER_BUFFER_STRATEGY=[strategy]}. Valid values are \c
945 stream and \c dynamic. Changing this value is mostly useful for
946 platform vendors.
947
948 \section1 Antialiasing
949
950 The scene graph supports two types of antialiasing. By default, primitives
951 such as rectangles and images will be antialiased by adding more
952 vertices along the edge of the primitives so that the edges fade
953 to transparent. We call this method \e {vertex antialiasing}. If the
954 user requests a multisampled OpenGL context, by setting a QSurfaceFormat
955 with samples greater than \c 0 using QQuickWindow::setFormat(), the
956 scene graph will prefer multisample based antialiasing (MSAA).
957 The two techniques will affect how the rendering happens internally
958 and have different limitations.
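
 Requesting multisampling could look like the following sketch. The sample
 count of 4 and the \c{qrc:/main.qml} path are arbitrary example values, and
 the format is set before the window is shown:

 \code
 #include <QGuiApplication>
 #include <QQuickView>
 #include <QSurfaceFormat>
 #include <QUrl>

 int main(int argc, char *argv[])
 {
     QGuiApplication app(argc, argv);

     QQuickView view;
     QSurfaceFormat format = view.format();
     format.setSamples(4); // any value greater than 0 requests MSAA
     view.setFormat(format);

     view.setSource(QUrl(QStringLiteral("qrc:/main.qml"))); // hypothetical path
     view.show();
     return app.exec();
 }
 \endcode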
959
960 It is also possible to override the antialiasing method used by
961 setting the environment variable \c {QSG_ANTIALIASING_METHOD}
962 to either \c vertex or \c {msaa}.
963
964 Vertex antialiasing can produce seams between edges of adjacent
965 primitives, even when the two edges are mathematically the same.
966 Multisample antialiasing does not.
967
968
969 \section2 Vertex Antialiasing
970
971 Vertex antialiasing can be enabled and disabled on a per-item basis
972 using the Item::antialiasing property. It will work regardless of
973 what the underlying hardware supports and produces higher quality
974 antialiasing, both for normally rendered primitives and also for
975 primitives captured into framebuffer objects, for instance using
976 the ShaderEffectSource type.
977
978 The downside to using vertex antialiasing is that each primitive
979 with antialiasing enabled will have to be blended. In terms of
980 batching, this means that the renderer needs to do more work to
981 figure out if the primitive can be batched or not and due to overlaps
982 with other elements in the scene, it may also result in less batching,
983 which could impact performance.
984
985 On low-end hardware blending can also be quite expensive so for an
986 image or rounded rectangle that covers most of the screen, the amount
987 of blending needed for the interior of these primitives can result
988 in significant performance loss as the entire primitive must be blended.
989
990 \section2 Multisample Antialiasing
991
992 Multisample antialiasing is a hardware feature where the hardware
993 calculates a coverage value per pixel in the primitive. Some hardware
994 can multisample at a very low cost, while other hardware may
995 need both more memory and more GPU cycles to render a frame.
996
997 Using multisample antialiasing, many primitives, such as rounded
998 rectangles and image elements can be antialiased and still be
999 \e opaque in the scene graph. This means the renderer has an easier
1000 job when creating batches and can rely on early-z to avoid overdraw.
1001
1002 When multisample antialiasing is used, content rendered into
1003 framebuffer objects needs additional extensions to support multisampling
1004 of framebuffers, typically \c GL_EXT_framebuffer_multisample and
1005 \c GL_EXT_framebuffer_blit. Most desktop chips have these extensions
1006 present, but they are less common in embedded chips. When framebuffer
1007 multisampling is not available in the hardware, content rendered into
1008 framebuffer objects will not be antialiased, including the content of
1009 a ShaderEffectSource.
1010
1011
1012 \section1 Performance
1013
1014 As stated in the beginning, understanding the finer details of the
1015 renderer is not required to get good performance. It is written to
1016 optimize for common use cases and will perform quite well under
1017 almost any circumstances.
1018
1019 \list
1020
1021 \li Good performance comes from effective batching, with as little
1022 as possible of the geometry being uploaded again and again. By
1023 setting the environment variable \c {QSG_RENDERER_DEBUG=render}, the
1024 renderer will output statistics on how well the batching goes, how
1025 many batches are used, which batches are retained and which are opaque or
1026 not. When striving for optimal performance, uploads should happen
1027 only when really needed, batches should be fewer than 10 and at
1028 least 3-4 of them should be opaque.
1029
1030 \li The default renderer does not do any CPU-side viewport clipping
1031 nor occlusion detection. If something is not supposed to be visible,
1032 it should not be shown. Use \c {Item::visible: false} for items that
1033 should not be drawn. The primary reason for not adding such logic is
1034 that it adds additional cost which would also hurt applications that
1035 took care in behaving well.
1036
1037 \li Make sure the texture atlas is used. The Image and BorderImage
1038 items will use it unless the image is too large. For textures
1039 created in C++, pass QQuickWindow::TextureCanUseAtlas when
1040 calling QQuickWindow::createTextureFromImage().
1041 By setting the environment variable \c {QSG_ATLAS_OVERLAY} all atlas
1042 textures will be colorized so they are easily identifiable in the
1043 application.
1044
1045 \li Use opaque primitives where possible. Opaque primitives are
1046 faster to process in the renderer and faster to draw on the GPU. For
1047 instance, PNG files will often have an alpha channel, even though
1048 each pixel is fully opaque. JPG files are always opaque. When
1049 providing images to a QQuickImageProvider or creating images with
1050 QQuickWindow::createTextureFromImage(), let the image have
1051 QImage::Format_RGB32, when possible.
1052
1053 \li Be aware that overlapping compound items, like in the
1054 illustration above, cannot be batched.
1055
1056 \li Clipping breaks batching. Never use it on a per-item basis, inside
1057 table cells, item delegates or similar. Instead of clipping text,
1058 use eliding. Instead of clipping an image, create a
1059 QQuickImageProvider that returns a cropped image.
1060
1061 \li Batching only works for 16-bit indices. All built-in items use
1062 16-bit indices, but a custom geometry is free to also use 32-bit
1063 indices.
1064
1065 \li Some material flags prevent batching, the most limiting one
1066 being QSGMaterial::RequiresFullMatrix which prevents all batching.
1067
1068 \li Applications with a monochrome background should set it using
1069 QQuickWindow::setColor() rather than using a top-level Rectangle item.
1070 QQuickWindow::setColor() will be used in a call to \c glClear(),
1071 which is potentially faster.
1072
1073 \li Mipmapped Image items are not placed in the global atlas and will
1074 not be batched.
1075
1076 \li A bug in the OpenGL driver related to framebuffer object (FBO) readbacks
1077 may corrupt rendered glyphs. If you set the \c QML_USE_GLYPHCACHE_WORKAROUND
1078 environment variable, Qt keeps an additional copy of the glyph in RAM. This
1079 means that performance is slightly lower when drawing glyphs that have not
1080 been drawn before, as Qt accesses the extra copy via the CPU. It also means
1081 that the glyph cache will use twice as much memory. The quality is not
1082 affected by this.
1083
1084 \endlist
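
 As a sketch of the advice on opaque images in the list above, a hypothetical
 helper such as \c toOpaqueFormat() could be applied before handing an image to
 QQuickWindow::createTextureFromImage() or returning it from a
 QQuickImageProvider. It is only appropriate for images whose pixels are known
 to be fully opaque, since the conversion discards the alpha channel:

 \code
 #include <QImage>

 // toOpaqueFormat() is a hypothetical helper, not a Qt API.
 QImage toOpaqueFormat(const QImage &source)
 {
     // Images without an alpha channel are already treated as opaque.
     if (!source.hasAlphaChannel())
         return source;

     return source.convertToFormat(QImage::Format_RGB32);
 }
 \endcode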
1085
1086 If an application performs poorly, make sure that rendering is
1087 actually the bottleneck. Use a profiler! The environment variable \c
1088 {QSG_RENDER_TIMING=1} will output a number of timing
1089 parameters which can be useful in pinpointing where a problem lies.
1090
1091 \section1 Visualizing
1092
1093 To visualize the various aspects of the scene graph's default renderer, the
1094 \c QSG_VISUALIZE environment variable can be set to one of the values
1095 detailed in each section below. We provide examples of the output of
1096 some of the variables using the following QML code:
1097
1098 \code
1099 import QtQuick 2.2
1100
1101 Rectangle {
1102 width: 200
1103 height: 140
1104
1105 ListView {
1106 id: clippedList
1107 x: 20
1108 y: 20
1109 width: 70
1110 height: 100
1111 clip: true
1112 model: ["Item A", "Item B", "Item C", "Item D"]
1113
1114 delegate: Rectangle {
1115 color: "lightblue"
1116 width: parent.width
1117 height: 25
1118
1119 Text {
1120 text: modelData
1121 anchors.fill: parent
1122 horizontalAlignment: Text.AlignHCenter
1123 verticalAlignment: Text.AlignVCenter
1124 }
1125 }
1126 }
1127
1128 ListView {
1129 id: clippedDelegateList
1130 x: clippedList.x + clippedList.width + 20
1131 y: 20
1132 width: 70
1133 height: 100
1134 clip: true
1135 model: ["Item A", "Item B", "Item C", "Item D"]
1136
1137 delegate: Rectangle {
1138 color: "lightblue"
1139 width: parent.width
1140 height: 25
1141 clip: true
1142
1143 Text {
1144 text: modelData
1145 anchors.fill: parent
1146 horizontalAlignment: Text.AlignHCenter
1147 verticalAlignment: Text.AlignVCenter
1148 }
1149 }
1150 }
1151 }
1152 \endcode
1153
1154 For the ListView on the left, we set its \l {Item::clip}{clip} property to
1155 \c true. For the ListView on the right, we also set each delegate's
1156 \l {Item::clip}{clip} property to \c true to illustrate the effects of
1157 clipping on batching.
1158
1159 \image visualize-original.png "Original"
1160 Original
1161
1162 \note The visualized elements do not respect clipping, and rendering order is
1163 arbitrary.
1164
1165 \section2 Visualizing Batches
1166
1167 Setting \c QSG_VISUALIZE to \c batches visualizes batches in the renderer.
1168 Merged batches are drawn with a solid color and unmerged batches are drawn
1169 with a diagonal line pattern. Few unique colors means good batching.
1170 Unmerged batches are bad if they contain many individual nodes.
1171
1172 \image visualize-batches.png "batches"
1173 \c QSG_VISUALIZE=batches
1174
1175 \section2 Visualizing Clipping
1176
1177 Setting \c QSG_VISUALIZE to \c clip draws red areas on top of the scene
1178 to indicate clipping. As Qt Quick Items do not clip by default, no clipping
1179 is usually visualized.
1180
1181 \image visualize-clip.png
1182 \c QSG_VISUALIZE=clip
1183
1184 \section2 Visualizing Changes
1185
1186 Setting \c QSG_VISUALIZE to \c changes visualizes changes in the renderer.
1187 Changes in the scenegraph are visualized with a flashing overlay of a random
1188 color. Changes on a primitive are visualized with a solid color, while
1189 changes in an ancestor, such as matrix or opacity changes, are visualized
1190 with a pattern.
1191
1192 \section2 Visualizing Overdraw
1193
1194 Setting \c QSG_VISUALIZE to \c overdraw visualizes overdraw in the renderer.
1195 All items are visualized in 3D to highlight overdraw. This mode can also be used
1196 to detect geometry outside the viewport to some extent. Opaque items are
1197 rendered with a green tint, while translucent items are rendered with a red
1198 tint. The bounding box for the viewport is rendered in blue. Opaque content
1199 is easier for the scenegraph to process and is usually faster to render.
1200
1201 Note that the root rectangle in the code above is superfluous as the window
1202 is also white, so drawing the rectangle is a waste of resources in this case.
1203 Changing it to an Item can give a slight performance boost.
1204
1205 \image visualize-overdraw-1.png "overdraw-1"
1206 \image visualize-overdraw-2.png "overdraw-2"
1207 \c QSG_VISUALIZE=overdraw
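
 During development it can be convenient to select a visualization mode from
 the application itself. The following sketch assumes that the variable is
 read when the renderer is created, so it is set before any window exists. The
 \c{qrc:/main.qml} path is hypothetical:

 \code
 #include <QGuiApplication>
 #include <QQuickView>
 #include <QUrl>
 #include <QtGlobal>

 int main(int argc, char *argv[])
 {
     // One of: batches, clip, changes, overdraw
     qputenv("QSG_VISUALIZE", "batches");

     QGuiApplication app(argc, argv);

     QQuickView view;
     view.setSource(QUrl(QStringLiteral("qrc:/main.qml"))); // hypothetical path
     view.show();
     return app.exec();
 }
 \endcode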
1208
1209 \section1 Rendering via the Qt Rendering Hardware Interface
1210
1211 From Qt 6.0 onwards, the default adaptation always renders via a graphics
1212 abstraction layer, the Qt Rendering Hardware Interface (RHI), provided by the
1213 \l [QtGui]{Qt GUI} module. This means that, unlike Qt 5, no direct OpenGL calls are made
1214 by the scene graph. Rather, it records resource and draw commands by using the
1215 RHI APIs, which then translate the command stream into OpenGL, Vulkan, Metal,
1216 or Direct 3D calls. Shader handling is also unified by writing shader code
1217 once, compiling to \l{https://www.khronos.org/spir/}{SPIR-V}, and then
1218 translating to the language appropriate for the various graphics APIs.
1219
1220 To control the behavior, the following environment variables can be used:
1221
1222 \table 100%
1223 \header
1224 \li Environment Variable
1225 \li Possible Values
1226 \li Description
1227
1228 \row
1229 \li \c QSG_RHI_BACKEND
1230 \li \c vulkan, \c metal, \c opengl, \c d3d11, \c d3d12
1231 \li Requests the specific RHI backend. By default the targeted graphics API
1232 is chosen based on the platform, unless overridden by this variable or the
1233 equivalent C++ APIs. The defaults are currently Direct3D 11 for Windows,
1234 Metal for macOS, OpenGL elsewhere.
1235
1236 \row
1237 \li \c QSG_INFO
1238 \li \c 1
1239 \li Like with the OpenGL-based rendering path, setting this enables printing system
1240 information when initializing the Qt Quick scene graph. This can be very useful for
1241 troubleshooting.
1242
1243 \row
1244 \li \c QSG_RHI_DEBUG_LAYER
1245 \li \c 1
1246 \li Where applicable (Vulkan, Direct3D), enables the graphics API implementation's
1247 debug or validation layers, if available, either on the graphics device or the instance
1248 object. For Metal on \macos, set the environment variable
1249 \c{METAL_DEVICE_WRAPPER_TYPE=1} instead.
1250
1251 \row
1252 \li \c QSG_RHI_PREFER_SOFTWARE_RENDERER
1253 \li \c 1
1254 \li Requests choosing an adapter or physical device that uses software-based
1255 rasterization. Applicable only when the underlying API has support for
1256 enumerating adapters (for example, Direct3D or Vulkan), and is ignored
1257 otherwise.
1258
1259 \endtable
1260
1261 Applications wishing to always run with a single given graphics API can
1262 request this via C++ as well. For example, the following call made early in
1263 main(), before constructing any QQuickWindow, forces the use of Vulkan (and
1264 will fail otherwise):
1265
1266 \badcode
1267 QQuickWindow::setGraphicsApi(QSGRendererInterface::Vulkan);
1268 \endcode
1269
1270 See QSGRendererInterface::GraphicsApi. The enum values \c OpenGL, \c Vulkan,
1271 \c Metal, \c Direct3D11, \c Direct3D12 are equivalent in effect to running
1272 with \c QSG_RHI_BACKEND set to the equivalent string key.
1273
1274 All QRhi backends will choose the system default GPU adapter or physical
1275 device, unless overridden by \c{QSG_RHI_PREFER_SOFTWARE_RENDERER} or a
1276 backend-specific variable, such as \c{QT_D3D_ADAPTER_INDEX} or
1277 \c{QT_VK_PHYSICAL_DEVICE_INDEX}. No further adapter configurability is
1278 provided at this time.
1279
1280 Starting with Qt 6.5, some of the settings that were previously only exposed
1281 as environment variables are available as C++ APIs in
1282 QQuickGraphicsConfiguration. For example, setting \c QSG_RHI_DEBUG_LAYER and
1283 calling
1284 \l{QQuickGraphicsConfiguration::setDebugLayer()}{setDebugLayer(true)}
1285 are equivalent.
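
 A minimal sketch of the C++ route, assuming Qt 6.5 or newer and a
 hypothetical \c{qrc:/main.qml} resource. The configuration is applied before
 the window is shown, that is, before the scene graph is initialized:

 \code
 #include <QGuiApplication>
 #include <QQuickGraphicsConfiguration>
 #include <QQuickView>
 #include <QUrl>

 int main(int argc, char *argv[])
 {
     QGuiApplication app(argc, argv);

     QQuickView view;
     QQuickGraphicsConfiguration config = view.graphicsConfiguration();
     config.setDebugLayer(true); // same effect as QSG_RHI_DEBUG_LAYER=1
     view.setGraphicsConfiguration(config);

     view.setSource(QUrl(QStringLiteral("qrc:/main.qml"))); // hypothetical path
     view.show();
     return app.exec();
 }
 \endcode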
1286 */