Document Splitting in WebRender
===============================

The fundamental goal of document splitting is to fix a specific performance issue.
The architecture of WebRender is such that any time a content process sends a new
display list to the compositor, WebRender needs to do a full scene build that
includes both the content and chrome areas. The same happens when the chrome
process sends a new display list to the compositor. This means that animations
such as the tab loading spinner or an animated gif will cause much more WR
activity than necessary.

With document splitting, the WR scene is split into two (or more) documents
based on the visual location of the elements. Everything in the "chrome area"
of the window (including the tab bar, URL bar, navigation buttons, etc.) is
considered part of one document, and anything below that (including content,
devtools, sidebars, etc.) is in another document. Bug 1549976 introduces a
third "popover" document that encompasses elements that straddle both of these
visually, but that's a side-effect of the implementation rather than being
driven by the fundamental problem being addressed.

With the documents split like so, an animation in the UI (such as the tab loading
spinner) runs independently of the content (by virtue of being in a different
document) and vice-versa, which results in better user-perceived performance.

Document splitting is so called because inside the WR code and bindings, a
"document" is an independent pathway to the compositor that has its own scene
and associated state.

In most of the C++ code in gfx/layers and other parts of Firefox/Gecko, the term
"render root" is used instead. A "render root" is exactly equivalent to a WR
document; the two terms are used interchangeably. The naming is this way
because "document" has different pre-existing meanings in Gecko-land. In this
documentation those other meanings are irrelevant and so "document" always refers
to the same thing as "render root".

At various points there have been discussions about renaming things so that
everything is more consistent and less confusing, but that hasn't happened yet.

Fundamental data types
----------------------

The `wr::RenderRoot
<https://searchfox.org/mozilla-central/rev/da14c413ef663eb1ba246799e94a240f81c42488/gfx/webrender_bindings/WebRenderTypes.h#65>`_
data type is an enumeration of all the different documents that are expected
to be created. As of this writing, the enumeration contains two entries: Default
and Content. The Default document refers to the one that holds the "chrome" stuff
and the Content document refers to the one that holds the "content" area stuff.

If document splitting is disabled (gfx.webrender.split-render-roots=false) then
everything lives in the Default document and the Content document is always empty.

Additional data structures in the same file (e.g. wr::RenderRootArray<T>) facilitate
converting pre-document-splitting code into document-splitting-aware code, usually
by turning a single object into an array of objects, one per document.
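
To make the "one object per document" pattern concrete, here is a minimal sketch.
It is not the real wr::RenderRootArray<T>; the ``RenderRoot`` enum below is a
simplified stand-in, and ``PerRenderRoot`` and ``kRenderRootCount`` are
hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for wr::RenderRoot (assumption: two entries only).
enum class RenderRoot : std::uint8_t { Default = 0, Content = 1 };
constexpr std::size_t kRenderRootCount = 2;

// Hypothetical per-document container: holds one T per render root and is
// indexed by the enum rather than by a raw integer.
template <typename T>
class PerRenderRoot {
 public:
  T& operator[](RenderRoot aRoot) {
    return mItems[static_cast<std::size_t>(aRoot)];
  }
  const T& operator[](RenderRoot aRoot) const {
    return mItems[static_cast<std::size_t>(aRoot)];
  }

 private:
  std::array<T, kRenderRootCount> mItems{};  // value-initialized
};
```

Code that previously held a single ``int mItemCount`` could then hold a
``PerRenderRoot<int>`` and index it with the render root it is working on.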

The notion of having multiple documents has to be introduced at a fairly fundamental
level in order to be propagated through the entire rendering pipeline. It starts
in the front-end HTML/XUL code where certain elements are annotated as being the
transition point between documents. For example, `this code
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/browser/base/content/browser.xhtml#1304>`_
explicitly identifies an element and its descendants as being in the "content"
document instead of the "default" (or "chrome") document.

These attributes are `read during display list building
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/layout/xul/nsBoxFrame.cpp#1112>`_
to create the nsDisplayRenderRoot display item in the Gecko display list.

When the Gecko display list is processed to create a WebRender display list,
it actually ends up creating multiple WR display lists, one for each document. This
is necessary because the documents are handled independently inside WR, and so
each gets its own WebRenderAPI object and separate display list. The way the
implementation manages this is by `creating a "sub builder"
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/layout/painting/nsDisplayList.cpp#7043,7065>`_
for the render root that is being descended into, and using that instead of the
main WR display list builder as the display list is recursed into.
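
The builder switch described above can be sketched roughly as follows. This is
an illustration only: ``Item``, ``DisplayLists`` and ``BuildDisplayLists`` are
hypothetical names, and a plain list of strings stands in for each per-document
WR display list builder.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class RenderRoot { Default = 0, Content = 1 };

// Hypothetical stand-in for a Gecko display item.
struct Item {
  std::string name;
  bool startsRenderRoot;  // plays the role of an nsDisplayRenderRoot item
  RenderRoot newRoot;     // only meaningful when startsRenderRoot is true
  std::vector<Item> children;
};

// One output list per document; stands in for the per-document builders.
using DisplayLists = std::array<std::vector<std::string>, 2>;

// Recurse through the Gecko-side tree. When an item starts a new render
// root, it and its descendants are emitted into that document's list.
void BuildDisplayLists(const Item& aItem, RenderRoot aCurrent,
                       DisplayLists& aOut) {
  const RenderRoot target = aItem.startsRenderRoot ? aItem.newRoot : aCurrent;
  aOut[static_cast<std::size_t>(target)].push_back(aItem.name);
  for (const Item& child : aItem.children) {
    BuildDisplayLists(child, target, aOut);
  }
}
```

Starting the recursion with ``RenderRoot::Default`` mirrors the fact that
everything is in the chrome document until an annotated element is reached.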

Note also that clip chains and stacking contexts are per-document, so when
recursing past a nsDisplayRenderRoot item, the `ClipManager and StackingContextHelper
<https://searchfox.org/mozilla-central/rev/da14c413ef663eb1ba246799e94a240f81c42488/gfx/layers/wr/WebRenderCommandBuilder.h#236-237>`_
being used switches to one specific to the new document. For this to work there are
certain assumptions that must hold, which are described in the next section.
Other things that must now be managed on a per-document basis are generally
encapsulated into the RenderRootStateManager class, and the WebRenderLayerManager
holds an `array of these
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/gfx/layers/wr/WebRenderLayerManager.h#242>`_.

After the Gecko display list is converted to a set of WebRender display lists
(one per document), these are sent across IPC along with any associated resources
as part of the `WebRender transaction
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/gfx/layers/ipc/PWebRenderBridge.ipdl#50>`_.
Conceptually, the parent side simply demultiplexes the data for different documents,
and submits the data for each document to the corresponding WebRenderAPI instance.

Limitations/Assumptions
-----------------------

One of the fundamental issues with the document splitting implementation is that
we can have stuff in the UI process that's part of the "content" render root (e.g.
a sidebar that appears to the left of the content area). The expectation for
front-end authors would be that this would be affected by ancestor elements that
are also in the UI process. Consider this outline of a Gecko display list:

- Root display item R

  - ... stuff here (call it Q) ...

    - display item P

      - display item A
      - display item B (flagged as being in the content render root)
      - display item C

If item P was a filter, for example, that would normally apply to all of items
A, B, and C. This would mean either sharing the filter between the "chrome" render
root and the "content" render root, or duplicating it such that it existed in both
render roots. The sharing is not possible as it violates the independence of WR
documents. The duplication is technically possible, but could result in visual
glitches as the two documents would be processed and composited separately.

In order to avoid this problem, the design of document splitting explicitly assumes
that such a scenario will not happen. In particular, the only information that
gets carried across the render root boundary is the positioning offset. Any
filters, transforms that are not 2D axis-aligned, opacity, or mix blend mode
properties do NOT get carried across the render root boundary. Similarly, a
scrollframe may not contain content from multiple render roots, because that
would lead to a similar problem in APZ where it would have to update the scroll
position of scrollframes in multiple documents and they might get composited
at separate times, resulting in visual glitches.
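
As a rough illustration of the "only the offset crosses the boundary" rule, the
sketch below models inherited state as a small struct. ``InheritedState`` and
``CrossRenderRootBoundary`` are hypothetical names, and only two of the dropped
properties (opacity and filters) are modelled.

```cpp
#include <cassert>

// Hypothetical bundle of properties a display item inherits from ancestors.
struct InheritedState {
  float offsetX = 0.0f;
  float offsetY = 0.0f;
  float opacity = 1.0f;
  bool hasFilter = false;
};

// Crossing a render root boundary keeps the positioning offset and resets
// everything else to its default, since the other document cannot share it.
InheritedState CrossRenderRootBoundary(const InheritedState& aParent) {
  InheritedState result;  // defaults: full opacity, no filter
  result.offsetX = aParent.offsetX;
  result.offsetY = aParent.offsetY;
  return result;
}
```

A front-end change that puts a filter or opacity on an ancestor of a render root
boundary would therefore silently not affect the content on the other side.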

On the content side, all of the document splitting work happens in the UI process.
In other words, content processes don't generally know what document they are part
of, and don't ever split their display lists into multiple documents. Only the UI
process ever sends multiple display lists to the compositor side.

There are a number of APIs on PWebRenderBridge where a wr::RenderRoot is passed
across from the content side to the compositor side. And since PWebRenderBridge
is a unified protocol that is used by both the UI process and content processes
to communicate with the compositor, the content processes must provide *some*
value for the wr::RenderRoot. But since a content process isn't (or shouldn't be)
aware of what document it's in, it must always pass wr::RenderRoot::Default.

Compositor-side code in WebRenderBridgeParent is responsible for checking that
any wr::RenderRoot values provided from a content process are in fact wr::RenderRoot::Default.
If this is not the case, it is either a programmer error or a hijacked content
process, and appropriate handling should be used. In particular, the compositor
side code should *never* blindly use the wr::RenderRoot value provided over the IPC
channel as hijacked content processes could force the compositor into leaking
information or otherwise violate the security and integrity of the browser. Instead,
the compositor is responsible for determining where the content is attached in
the display list of the UI process, and determining the appropriate document for that
content process. This information is stored in the `WebRenderBridgeParent::mRenderRoot
<https://searchfox.org/mozilla-central/rev/8ed8474757695cdae047150a0eaf94a5f1c96dbe/gfx/layers/wr/WebRenderBridgeParent.h#495>`_
field.
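
The check described above might look something like the following sketch. It
is not the actual WebRenderBridgeParent code: ``Bridge`` and
``ResolveRenderRoot`` are hypothetical, and the real handling of a failed check
(for example, how the offending process is dealt with) is not modelled.

```cpp
#include <cassert>

enum class RenderRoot { Default = 0, Content = 1 };

// Hypothetical, heavily simplified stand-in for a WebRenderBridgeParent.
struct Bridge {
  bool isRoot;         // true only for the bridge serving the UI process
  RenderRoot ownRoot;  // determined by the compositor, never by content
};

// Returns false if the message should be rejected. On success, aResolved is
// the document to use; for content processes the IPC value is ignored.
bool ResolveRenderRoot(const Bridge& aBridge, RenderRoot aClaimed,
                       RenderRoot& aResolved) {
  if (aBridge.isRoot) {
    aResolved = aClaimed;  // the UI process is trusted to name its documents
    return true;
  }
  if (aClaimed != RenderRoot::Default) {
    return false;  // programmer error or hijacked content process
  }
  aResolved = aBridge.ownRoot;
  return true;
}
```

The important property is that a content process can never cause the compositor
to pick a document other than the one the compositor itself recorded for it.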

Implementation details
----------------------

This section describes various knots of complexity in the document splitting
implementation. That is, these pieces are thought to introduce higher-than-normal
levels of complexity into the feature, and should be handled with care.

When a display list transaction is sent from the content side to the compositor,
APZ is also notified of the update, so that it can internally update its own
data structures. One of these data structures is a tree representation of the
scrollable frames on the page. With document splitting, the scrollable frames
may now be split across multiple documents. APZ needs to record which document
each scrollable frame belongs to, so that when providing the async scroll offset
to WebRender, it can send the scroll offset for a given scrollable frame to the
correct WebRender document. As one might expect, this is stored in the `mRenderRoot
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/layers/apz/src/AsyncPanZoomController.h#916>`_
field in the AsyncPanZoomController (there is one instance of this per scrollable
frame).

Additionally, when new display list transactions and other messages are received
in WebRenderBridgeParent, APZ cannot process these updates right away. Doing so
would cause APZ to respond to user input based on the new display list, while
the WebRender internal state still corresponds to the old display list. To ensure
that APZ and WR's internal state remain in sync, APZ puts these update messages
into an `"updater queue"
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/layers/apz/src/APZUpdater.cpp#340>`_
which is processed synchronously with the WebRender scene swap. This ensures that
APZ's internal state is updated at the same time that WebRender swaps in the new
scene, and everything stays in sync. Conceptually this is relatively simple,
until we add document splitting to the mix.

Now instead of one scene swap, we have multiple scene swaps happening, one for
each of the documents. In other words, even though WebRenderBridgeParent gets a
single "display list transaction", the display lists for the different documents
modify WR's internal state at different times. Consequently, to keep APZ in sync,
we must apply a similar "splitting" to the APZ updater queue, so that messages
pertaining to a particular document are applied synchronously with that
document's scene swap.

(As a relevant aside: there are other messages that APZ receives over other IPC
channels (e.g. PAPZCTreeManager) that have ordering requirements with the
PWebRenderBridge messages, and so those also normally end up in the updater queue.
Consequently, these other messages are also now subjected to the splitting of
the updater queue.)

Again, conceptually this is relatively simple - we just need to keep a separate
queue for each document, and when an update message comes in, we decide which
document a given update message is associated with, and put the message into the
corresponding queue. The catch is that often these messages deal with a specific
element or scrollframe on the page, and so when the message is sent from the
UI process, we need to do a DOM or frame tree walk to determine which render root
that element is associated with. There are some `GetRenderRootForXXX
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/thebes/gfxUtils.h#317-322>`_
helpers in gfxUtils that assist with this task.
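
The tree walk can be pictured with the sketch below. It does not reproduce the
gfxUtils helpers; ``Element`` and ``RenderRootForElement`` are hypothetical, and
the assumption that the nearest annotated ancestor decides the render root is
taken from the description of the front-end annotations earlier in this document.

```cpp
#include <cassert>

enum class RenderRoot { Default = 0, Content = 1 };

// Hypothetical element: a parent pointer plus an optional render root
// annotation, standing in for a DOM element or frame.
struct Element {
  const Element* parent;
  bool hasRenderRootAttr;
  RenderRoot attrValue;  // only meaningful when hasRenderRootAttr is true
};

// Walk up the ancestor chain. The nearest annotation wins; if none is
// found, the element belongs to the Default (chrome) document.
RenderRoot RenderRootForElement(const Element* aElement) {
  for (const Element* e = aElement; e; e = e->parent) {
    if (e->hasRenderRootAttr) {
      return e->attrValue;
    }
  }
  return RenderRoot::Default;
}
```

The cost of this walk is proportional to the depth of the element, which is one
reason the result is worth computing once per message rather than repeatedly.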

The other catch is that an APZ message may be associated with multiple documents.
A concrete example is if a user on a touch device does a multitouch action with
different fingers landing on different documents, which would trigger a call into
`this code
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/layers/ipc/APZCTreeManagerParent.cpp#76>`_
with multiple targets, each potentially belonging to a different render root.
In this case, we need to ensure that the message only gets processed after
the corresponding scene swaps for all the related documents. This is currently
implemented by having each message in the queue associated with a set of documents
rather than a single document, and only processing the message once all the
documents have done their scene swap. In the example above, this is indicated by
building the set of render roots `here
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/layers/ipc/APZCTreeManagerParent.cpp#83>`_
and passing that to the updater queue when queueing the message. This interaction
is a source of some complexity and may have latent bugs.
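
The "set of documents per message" idea can be sketched as a queue whose entries
carry a bitmask of render roots they are still waiting on. This is a deliberately
simplified model with hypothetical names (``UpdaterQueue``, ``Bit``): it treats
every scene swap as satisfying all queued tasks for that document, whereas a real
implementation also has to match each task to the specific scene swap it belongs to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

enum class RenderRoot : std::uint8_t { Default = 0, Content = 1 };

// One bit per render root, so a set of documents fits in a small mask.
constexpr std::uint8_t Bit(RenderRoot aRoot) {
  return static_cast<std::uint8_t>(1u << static_cast<std::uint8_t>(aRoot));
}

// Hypothetical updater queue: each task waits on a set of documents.
class UpdaterQueue {
 private:
  struct Entry {
    std::uint8_t pending;  // documents that have not swapped scenes yet
    std::function<void()> task;
  };
  std::deque<Entry> mTasks;

 public:
  void Queue(std::uint8_t aRootMask, std::function<void()> aTask) {
    mTasks.push_back({aRootMask, std::move(aTask)});
  }

  // Called when aRoot swaps in a new scene. Clears that document from every
  // queued task and runs, in order, the tasks with nothing left to wait for.
  void OnSceneSwap(RenderRoot aRoot) {
    std::deque<Entry> stillWaiting;
    for (Entry& entry : mTasks) {
      entry.pending = static_cast<std::uint8_t>(entry.pending & ~Bit(aRoot));
      if (entry.pending == 0) {
        entry.task();
      } else {
        stillWaiting.push_back(std::move(entry));
      }
    }
    mTasks = std::move(stillWaiting);
  }

  std::size_t Size() const { return mTasks.size(); }
};
```

A multitouch message targeting both documents would be queued with
``Bit(RenderRoot::Default) | Bit(RenderRoot::Content)`` and would only run after
both documents have swapped.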

Bug 1547351 provided a new and tricky problem where a content process is rendering
stuff that needs to go into the "default" document because it's actually
out-of-process addon content that renders in the chrome area. Prior to this bug,
the WebRenderBridgeParent instances that corresponded to content processes
(hereafter referred to as "sub-WRBPs", in contrast to the "root WRBP" that
corresponds to the UI process) simply assumed they were in the "Content" document,
but this bug proved that this simplistic assumption does not always hold.

The solution chosen to this problem was to have the root WebRenderLayerManager
(that lives in the trusted UI process) annotate each out-of-process subpipeline
with the render root it belongs in, and send that information over to the
root WRBP as part of the display list transaction. The sub-WRBPs know their own
pipeline ids, and therefore can find their render root by querying the root WRBP.
The catch is that sub-WRBPs may receive display list transactions *before* the
root WRBP receives the display list update that contains the render root mapping
information. This happens in cases like during tab switch preload, where the
user mouses over a background tab, and we pre-render it (i.e. compute and send
the display list for that tab to the compositor) so that the tab switch is faster.
In this scenario, that display list/subpipeline is not actually rendered, is not
tied in to the display list of the UI process, and therefore doesn't get associated
with a render root.

When the sub-WRBP receives a transaction in a scenario like this, it cannot
actually process it (by sending it to WebRender) because it doesn't know which
WR document it is associated with. So it has to hold on to it in a "deferred update"
queue until some later point where it does find out which WR document it is
associated with, and at that point it can process the deferred update queue.
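
A minimal sketch of the deferred update queue, assuming a transaction can be
represented by a string and "processing" just records it. ``SubBridge`` and its
members are hypothetical and do not correspond to the real WebRenderBridgeParent
API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum class RenderRoot { Default = 0, Content = 1 };

// Hypothetical sub-WRBP that may not know its render root yet.
class SubBridge {
 public:
  void ReceiveTransaction(std::string aDisplayList) {
    if (!mHasRenderRoot) {
      mDeferred.push_back(std::move(aDisplayList));  // cannot route it yet
      return;
    }
    Process(aDisplayList);
  }

  // Called once the compositor learns where this pipeline is attached.
  // Flushes the deferred updates in the order they arrived.
  void SetRenderRoot(RenderRoot aRoot) {
    mRenderRoot = aRoot;
    mHasRenderRoot = true;
    for (const std::string& displayList : mDeferred) {
      Process(displayList);
    }
    mDeferred.clear();
  }

  // Called on teardown; updates that were never rendered are dropped.
  void Destroy() { mDeferred.clear(); }

  RenderRoot Root() const { return mRenderRoot; }

  std::vector<std::string> mProcessed;  // stands in for "sent to WebRender"
  std::vector<std::string> mDeferred;

 private:
  void Process(const std::string& aDisplayList) {
    mProcessed.push_back(aDisplayList);
  }
  bool mHasRenderRoot = false;
  RenderRoot mRenderRoot = RenderRoot::Default;
};
```

Both orderings go through ``ReceiveTransaction``; the only difference is whether
the work happens immediately or when ``SetRenderRoot`` is eventually called.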

Again, conceptually this is straightforward, but the implementation produces a
bunch of complexity because it needs to handle both orderings - the case where
the sub-WRBP knows its render root, and the case where it doesn't yet. And the
root WRBP, upon receiving a new transaction, would need to notify the sub-WRBPs
of their render roots and trigger processing of the deferred updates.

Further complicating matters is Fission, because with Fission there can be
pipelines nested to arbitrary depths. This results in a tree of sub-WRBPs, with
each WRBP knowing what its direct children are, and only the root WRBP knowing
which documents its immediate children are in. So there could be a chain of
sub-WRBPs with a "missing link" (i.e. one that doesn't yet know what its children
are, because it hasn't received a display list transaction yet) and upon filling
in that missing link, all the descendant WRBPs from that point suddenly also
know which WR document they are associated with and can process their deferred
updates.
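
The "missing link" behaviour can be sketched as a tree where a render root is
pushed down to whatever descendants are known at the time. ``BridgeNode``,
``PropagateRenderRoot`` and ``LinkChild`` are hypothetical names; in a fuller
model each node would also flush its deferred updates when it gains a render root.

```cpp
#include <cassert>
#include <vector>

enum class RenderRoot { Default = 0, Content = 1 };

// Hypothetical node in the tree of sub-WRBPs.
struct BridgeNode {
  bool hasRenderRoot = false;
  RenderRoot renderRoot = RenderRoot::Default;
  // Children only become known once this bridge receives a transaction.
  std::vector<BridgeNode*> knownChildren;
};

// Assign a render root and push it down to every descendant known so far.
void PropagateRenderRoot(BridgeNode& aNode, RenderRoot aRoot) {
  aNode.hasRenderRoot = true;
  aNode.renderRoot = aRoot;
  for (BridgeNode* child : aNode.knownChildren) {
    PropagateRenderRoot(*child, aRoot);
  }
}

// Called when aParent's transaction reveals aChild. If the parent already
// has a render root, this fills in a "missing link" for the whole subtree.
void LinkChild(BridgeNode& aParent, BridgeNode& aChild) {
  aParent.knownChildren.push_back(&aChild);
  if (aParent.hasRenderRoot) {
    PropagateRenderRoot(aChild, aParent.renderRoot);
  }
}
```

Note that links can be established in either order: a grandchild may be linked to
its parent long before that parent is linked into the rest of the tree.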

Managing all this deferred state, ensuring it is processed as soon as possible,
and clearing it out when the content side is torn down (which may happen without
it ever being rendered) is a source of complexity and may have latent bugs.

Transaction completion
~~~~~~~~~~~~~~~~~~~~~~

Transactions between the content and compositor side are throttled such that
the content side doesn't go nuts pushing over display lists to the compositor
when the compositor has a backlog of pending display lists. The way the throttling
works is that each transaction sent has a transaction id, and after the compositor
is done processing a transaction, it reports the completed transaction id back
to the content side. The content side can use this information to track how many
transactions are inflight at any given time and apply throttling as necessary.
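
The content-side bookkeeping can be sketched as follows. ``TransactionThrottle``
is a hypothetical class, and the in-flight limit is a constructor parameter here
purely for illustration; the actual limit and policy used by Gecko are not
described in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical content-side tracker for in-flight transactions.
class TransactionThrottle {
 public:
  explicit TransactionThrottle(std::size_t aMaxInFlight)
      : mMaxInFlight(aMaxInFlight) {}

  // True if another transaction may be sent without exceeding the limit.
  bool CanSend() const { return mInFlight.size() < mMaxInFlight; }

  // Records a newly sent transaction and returns the id assigned to it.
  std::uint64_t DidSend() {
    mInFlight.insert(++mLastId);
    return mLastId;
  }

  // Called when the compositor reports a completed transaction id.
  void DidComplete(std::uint64_t aId) { mInFlight.erase(aId); }

 private:
  std::size_t mMaxInFlight;
  std::uint64_t mLastId = 0;
  std::set<std::uint64_t> mInFlight;
};
```

The scheme only works if the completion message is sent when the compositor is
really done with the transaction, which is what document splitting complicates.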

With document splitting, a transaction sent from the content side gets split up
and sent to multiple WR documents, each of which operates independently of
the others. If we propagate the transaction id to each of those WR documents,
then the first document to complete its work would trigger the "transaction complete"
message back to the content, which would unthrottle the next transaction. In this
scenario, other documents may still be backlogged, so the unthrottling is
premature.

Instead, what we want is for all documents processing a particular transaction
id to finish their work and render before we send the completion message back
to content. In fact, there's a bunch of work that falls into the same category
as this completion message - stuff that should happen after all the WR documents
are done processing their pieces of the split transaction.

The way this is managed is via a conditional in `HandleFrame
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/webrender_bindings/RenderThread.cpp#988>`_.
This code is invoked once for each document as it advances to the rendering step,
and the code in `RenderThread::IncRenderingFrameCount
<https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/webrender_bindings/RenderThread.cpp#552-553>`_
acts as a barrier to ensure that the call chain only gets propagated once all
the documents have done their processing work.
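
The barrier idea can be illustrated with a per-transaction counter. This is not
how RenderThread implements it; ``CompletionBarrier`` is a hypothetical class
that only shows the intended behaviour, namely that completion fires exactly
once, when the last document finishes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical barrier: counts how many documents still owe a frame for
// each transaction id.
class CompletionBarrier {
 public:
  // Registers a transaction that was split across aDocumentCount documents.
  void Begin(std::uint64_t aTransactionId, int aDocumentCount) {
    mPending[aTransactionId] = aDocumentCount;
  }

  // Called once per document as it finishes. Returns true only for the last
  // document, which is when "transaction complete" may be sent to content.
  bool DocumentDone(std::uint64_t aTransactionId) {
    auto it = mPending.find(aTransactionId);
    if (it == mPending.end()) {
      return false;  // unknown or already completed transaction
    }
    if (--it->second > 0) {
      return false;  // other documents are still working
    }
    mPending.erase(it);
    return true;
  }

 private:
  std::map<std::uint64_t, int> mPending;
};
```

Any other work that must wait for the whole split transaction could be hung off
the same "last document done" signal.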

I'm listing this piece as a potential source of complexity for document splitting
because it seems like a fairly important piece but the relevant code is
"buried" away in a place where one might not easily stumble upon it. It's also not
clear to me that the implications of this problem and solution have been fully
explored. In particular, I assume that there are latent bugs here because other
pieces of code were assuming a certain behaviour from the pre-document-splitting
code that the post-document-splitting code may not satisfy exactly.