gfx/docs/AsyncPanZoom.rst

   1 .. _apz:
   2
   3 Asynchronous Panning and Zooming
   4 ================================
   5
   6 **This document is a work in progress. Some information may be missing
   7 or incomplete.**
   8
   9 .. image:: AsyncPanZoomArchitecture.png
  10
  11 Goals
  12 -----
  13
  14 We need to be able to provide a visual response to user input with
  15 minimal latency. In particular, on devices with touch input, content
  16 must track the finger exactly while panning, or the user experience is
  17 very poor. According to the UX team, 120ms is an acceptable latency
  18 between user input and response.
  19
  20 Context and surrounding architecture
  21 ------------------------------------
  22
  23 The fundamental problem we are trying to solve with the Asynchronous
  24 Panning and Zooming (APZ) code is that of responsiveness. By default,
  25 web browsers operate in a “game loop” that looks like this:
  26
  27 ::
  28
  29        while true:
  30            process input
  31            do computations
  32            repaint content
  33            display repainted content
  34
   35 In browsers the “do computations” step can be arbitrarily expensive
  36 because it can involve running event handlers in web content. Therefore,
  37 there can be an arbitrary delay between the input being received and the
  38 on-screen display getting updated.
  39
  40 Responsiveness is always good, and with touch-based interaction it is
  41 even more important than with mouse or keyboard input. In order to
  42 ensure responsiveness, we split the “game loop” model of the browser
  43 into a multithreaded variant which looks something like this:
  44
  45 ::
  46
  47        Thread 1 (compositor thread)
  48        while true:
  49            receive input
  50            send a copy of input to thread 2
  51            adjust rendered content based on input
  52            display adjusted rendered content
  53
  54        Thread 2 (main thread)
  55        while true:
  56            receive input from thread 1
  57            do computations
  58            rerender content
  59            update the copy of rendered content in thread 1
  60
  61 This multithreaded model is called off-main-thread compositing (OMTC),
  62 because the compositing (where the content is displayed on-screen)
  63 happens on a separate thread from the main thread. Note that this is a
   64 heavily simplified model, but in this model the “adjust rendered
   65 content based on input” step is the primary function of the APZ code.
  66
  67 A couple of notes on APZ's relationship to other browser architecture
  68 improvements:
  69
  70 1. Due to Electrolysis (e10s), Site Isolation (Fission), and GPU Process
  71    isolation, the above two threads often actually run in different
  72    processes. APZ is largely agnostic to this, as all communication
  73    between the two threads for APZ purposes happens using an IPC layer
  74    that abstracts over communication between threads vs. processes.
  75 2. With the WebRender graphics backend, part of the rendering pipeline is
  76    also offloaded from the main thread. In this architecture, the
  77    information sent from the main thread consists of a display list, and
  78    scrolling-related metadata referencing content in that display list.
  79    The metadata is kept in a queue until the display list undergoes an
  80    additional rendering step in the compositor (scene building). At this
  81    point, we are ready to tell APZ about the new content and have it
  82    start applying adjustments to it, as further rendering steps beyond
  83    scene building are done synchronously on each composite.
  84
  85 The compositor in theory can continuously composite previously rendered
  86 content (adjusted on each composite by APZ) to the screen while the
  87 main thread is busy doing other things and rendering new content.
  88
  89 The APZ code takes the input events that are coming in from the hardware
  90 and uses them to figure out what the user is trying to do (e.g. pan the
  91 page, zoom in). It then expresses this user intention in the form of
  92 translation and/or scale transformation matrices. These transformation
  93 matrices are applied to the rendered content at composite time, so that
  94 what the user sees on-screen reflects what they are trying to do as
  95 closely as possible.
  96
  97 Technical overview
  98 ------------------
  99
 100 As per the heavily simplified model described above, the fundamental
 101 purpose of the APZ code is to take input events and produce
 102 transformation matrices. This section attempts to break that down and
 103 identify the different problems that make this task non-trivial.
 104
 105 Checkerboarding
 106 ~~~~~~~~~~~~~~~
 107
 108 The area of page content for which a display list is built and sent to
 109 the compositor is called the “displayport”. The APZ code is responsible
 110 for determining how large the displayport should be. On the one hand, we
 111 want the displayport to be as large as possible. At the very least it
 112 needs to be larger than what is visible on-screen, because otherwise, as
 113 soon as the user pans, there will be some unpainted area of the page
 114 exposed. However, we cannot always set the displayport to be the entire
 115 page, because the page can be arbitrarily long and this would require an
 116 unbounded amount of memory to store. Therefore, a good displayport size
 117 is one that is larger than the visible area but not so large that it is a
 118 huge drain on memory. Because the displayport is usually smaller than the
 119 whole page, it is always possible for the user to scroll so fast that
 120 they end up in an area of the page outside the displayport. When this
 121 happens, they see unpainted content; this is referred to as
 122 “checkerboarding”, and we try to avoid it where possible.
 123
 124 There are many possible ways to determine what the displayport should be
 125 in order to balance the tradeoffs involved (i.e. having one that is too
 126 big is bad for memory usage, and having one that is too small results in
 127 excessive checkerboarding). Ideally, the displayport should cover
 128 exactly the area that we know the user will make visible. Although we
 129 cannot know this for sure, we can use heuristics based on current
 130 panning velocity and direction to ensure a reasonably-chosen displayport
 131 area. This calculation is done in the APZ code, and a new desired
 132 displayport is frequently sent to the main thread as the user is panning
 133 around.
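
As a rough illustration of such a heuristic, the sketch below grows the viewport by a fixed margin and adds extra room in the direction of travel, then clamps the result to the page. It is not the calculation APZ actually performs: the ``Rect`` type, the function name, and the ``base_margin`` and ``lookahead_s`` parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def compute_displayport(viewport, page, velocity,
                        base_margin=100.0, lookahead_s=0.5):
    """Illustrative heuristic only; names and defaults are assumptions."""
    vx, vy = velocity  # pixels per second; positive means right/down
    # Fixed margin on every side, plus extra room in the direction of travel.
    left = viewport.x - base_margin - max(-vx, 0.0) * lookahead_s
    right = viewport.x + viewport.width + base_margin + max(vx, 0.0) * lookahead_s
    top = viewport.y - base_margin - max(-vy, 0.0) * lookahead_s
    bottom = viewport.y + viewport.height + base_margin + max(vy, 0.0) * lookahead_s
    # Never request more than the page itself.
    left = max(left, page.x)
    top = max(top, page.y)
    right = min(right, page.x + page.width)
    bottom = min(bottom, page.y + page.height)
    return Rect(left, top, right - left, bottom - top)
```

A faster downward pan yields a displayport that extends further below the viewport, trading memory for a lower chance of checkerboarding.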
 134
 135 Multiple scrollable elements
 136 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 137
 138 Consider, for example, a scrollable page that contains an iframe which
 139 itself is scrollable. The iframe can be scrolled independently of the
 140 top-level page, and we would like both the page and the iframe to scroll
 141 responsively. This means that we want independent asynchronous panning
 142 for both the top-level page and the iframe. In addition to iframes,
 143 elements that have the overflow:scroll CSS property set are also
 144 scrollable. In the display list, scrollable elements are arranged in a
 145 tree structure, and in the APZ code we have a matching tree of
 146 AsyncPanZoomController (APZC) objects, one for each scrollable element.
 147 To manage this tree of APZC instances, we have a single APZCTreeManager
 148 object. Each APZC is relatively independent and handles the scrolling for
 149 its associated scrollable element, but there are some cases in which they
 150 need to interact; these cases are described in the sections below.
 151
 152 Hit detection
 153 ~~~~~~~~~~~~~
 154
 155 Consider again the case where we have a scrollable page that contains an
 156 iframe which itself is scrollable. As described above, we will have two
 157 APZC instances - one for the page and one for the iframe. When the user
 158 puts their finger down on the screen and moves it, we need to do some
 159 sort of hit detection in order to determine whether their finger is on
 160 the iframe or on the top-level page. Based on where their finger lands,
 161 the appropriate APZC instance needs to handle the input.
 162
 163 This hit detection is done by APZCTreeManager in collaboration with
 164 WebRender, which has more detailed information about the structure of
 165 the page content than is stored in APZ directly. See
 166 :ref:`this section <wr-hit-test-details>` for more details.
 167
 168 Also note that for some types of input (e.g. when the user puts two
 169 fingers down to do a pinch) we do not want the input to be “split”
 170 across two different APZC instances. In the case of a pinch, for
 171 example, we find a “common ancestor” APZC instance - one that is
 172 zoomable and contains all of the touch input points, and direct the
 173 input to that APZC instance.
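
The “common ancestor” lookup for a pinch can be sketched as follows. This is a simplified model under stated assumptions: the ``Apzc`` class with its ``parent`` and ``zoomable`` fields and the function names are hypothetical stand-ins, not the APZCTreeManager API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Apzc:
    name: str
    parent: Optional[Apzc] = None
    zoomable: bool = False


def ancestors(node):
    """Return the node followed by its ancestors, innermost first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def common_zoomable_ancestor(hits):
    """Innermost zoomable APZC containing every touch point's target."""
    if not hits:
        return None
    candidates = ancestors(hits[0])
    for other in hits[1:]:
        others = ancestors(other)
        candidates = [c for c in candidates if c in others]
    for candidate in candidates:
        if candidate.zoomable:
            return candidate
    return None
```

With one finger on a non-zoomable iframe and one on the zoomable top-level page, the lookup resolves to the page's APZC.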
 174
 175 Scroll Handoff
 176 ~~~~~~~~~~~~~~
 177
 178 Consider yet again the case where we have a scrollable page that
 179 contains an iframe which itself is scrollable. Say the user scrolls the
 180 iframe so that it reaches the bottom. If the user continues panning on
 181 the iframe, the expectation is that the top-level page will start
 182 scrolling. However, as discussed in the section on hit detection, the
 183 APZC instance for the iframe is separate from the APZC instance for the
 184 top-level page. Thus, we need the two APZC instances to communicate in
 185 some way such that input events on the iframe result in scrolling on the
 186 top-level page. This behaviour is referred to as “scroll handoff” (or
 187 “fling handoff” in the case where analogous behaviour results from the
 188 scrolling momentum of the page after the user has lifted their finger).
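
A minimal sketch of the handoff idea, assuming a one-dimensional scroll range and a simple parent pointer: each scroller consumes as much of the pan as its range allows and passes the remainder up. The ``Scroller`` class is hypothetical, and real handoff involves more state than is modelled here.

```python
class Scroller:
    """Hypothetical stand-in for an APZC with a vertical scroll range."""

    def __init__(self, name, max_offset, parent=None):
        self.name = name
        self.offset = 0.0
        self.max_offset = max_offset
        self.parent = parent

    def scroll_by(self, delta):
        """Consume what fits in our range and hand the rest to the parent.

        Returns the part of delta that no scroller could consume.
        """
        target = min(max(self.offset + delta, 0.0), self.max_offset)
        consumed = target - self.offset
        self.offset = target
        leftover = delta - consumed
        if leftover != 0.0 and self.parent is not None:
            return self.parent.scroll_by(leftover)
        return leftover
```

Panning an iframe that has 50 pixels of range left by 100 pixels scrolls the iframe to its end and moves the top-level page by the remaining 50 pixels.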
 189
 190 .. _input-event-untransformation:
 191
 192 Input event untransformation
 193 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 194
 195 The APZC architecture by definition results in two copies of a “scroll
 196 position” for each scrollable element. There is the original copy on the
 197 main thread that is accessible to web content and the layout and
 198 painting code. And there is a second copy on the compositor side, which
 199 is updated asynchronously based on user input, and corresponds to what
 200 the user visually sees on the screen. Although these two copies may
 201 diverge temporarily, they are reconciled periodically. In particular,
 202 they diverge while the APZ code is performing an async pan or zoom
 203 action on behalf of the user, and are reconciled when the APZ code
 204 requests a repaint from the main thread.
 205
 206 Because of the way input events are represented, this has some
 207 unfortunate consequences. Input event coordinates are represented
 208 relative to the device screen - so if the user touches at the same
 209 physical spot on the device, the same input events will be delivered
 210 regardless of the content scroll position. When the main thread receives
 211 a touch event, it combines that with the content scroll position in order
 212 to figure out what DOM element the user touched. However, because we now
 213 have two different scroll positions, this process may not work perfectly.
 214 A concrete example follows:
 215
 216 Consider a device with screen size 600 pixels tall. On this device, a
 217 user is viewing a document that is 1000 pixels tall, and that is
 218 scrolled down by 200 pixels. That is, the vertical section of the
 219 document from 200px to 800px is visible. Now, if the user touches a
 220 point 100px from the top of the physical display, the hardware will
 221 generate a touch event with y=100. This will get sent to the main
 222 thread, which will add the scroll position (200) and get a
 223 document-relative touch event with y=300. This new y-value will be used
 224 in hit detection to figure out what the user touched. If the document
  225 had an absolute-positioned div at y=300, then that would receive the
 226 touch event.
 227
 228 Now let us add some async scrolling to this example. Say that the user
 229 additionally scrolls the document by another 10 pixels asynchronously
 230 (i.e. only on the compositor thread), and then does the same touch
 231 event. The same input event is generated by the hardware, and as before,
 232 the document will deliver the touch event to the div at y=300. However,
 233 visually, the document is scrolled by an additional 10 pixels so this
 234 outcome is wrong. What needs to happen is that the APZ code needs to
 235 intercept the touch event and account for the 10 pixels of asynchronous
 236 scroll. Therefore, the input event with y=100 gets converted to y=110 in
 237 the APZ code before being passed on to the main thread. The main thread
 238 then adds the scroll position it knows about and determines that the
 239 user touched at a document-relative position of y=310.
 240
 241 Analogous input event transformations need to be done for horizontal
 242 scrolling and zooming.
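
The vertical example above can be written out as a small worked calculation. The two function names are hypothetical; only the arithmetic comes from the text, and only translation is modelled (zooming would also involve a scale factor).

```python
def untransform_y(screen_y, async_scroll_y):
    """APZ side: fold the not-yet-reconciled async scroll into the event."""
    return screen_y + async_scroll_y


def document_y(event_y, main_thread_scroll_y):
    """Main-thread side: add the scroll position that layout knows about."""
    return event_y + main_thread_scroll_y


# Without async scrolling, a touch at y=100 maps to y=300 in the document.
no_async = document_y(100, 200)

# With 10 pixels of async scroll, APZ first converts y=100 to y=110, and
# the main thread then arrives at the visually correct y=310.
with_async = document_y(untransform_y(100, 10), 200)
```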
 243
 244 Content independently adjusting scrolling
 245 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 246
 247 As described above, there are two copies of the scroll position in the
 248 APZ architecture - one on the main thread and one on the compositor
 249 thread. Usually for architectures like this, there is a single “source
 250 of truth” value and the other value is simply a copy. However, in this
 251 case that is not easily possible to do. The reason is that both of these
 252 values can be legitimately modified. On the compositor side, the input
 253 events the user is triggering modify the scroll position, which is then
 254 propagated to the main thread. However, on the main thread, web content
  255 might be running JavaScript code that programmatically sets the scroll
 256 position (via window.scrollTo, for example). Scroll changes driven from
 257 the main thread are just as legitimate and need to be propagated to the
 258 compositor thread, so that the visual display updates in response.
 259
 260 Because the cross-thread messaging is asynchronous, reconciling the two
 261 types of scroll changes is a tricky problem. Our design solves this
 262 using various flags and generation counters. The general heuristic we
 263 have is that content-driven scroll position changes (e.g. scrollTo from
 264 JS) are never lost. For instance, if the user is doing an async scroll
 265 with their finger and content does a scrollTo in the middle, then some
 266 of the async scroll would occur before the “jump” and the rest after the
 267 “jump”.
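
One way to picture the “never lost” heuristic is sketched below. How the real code uses its flags and generation counters is not described here, so this is a hypothetical arrangement: the class, its fields, and the rule that a stale generation is ignored are assumptions made for illustration.

```python
class CompositorScrollState:
    """Hypothetical compositor-side copy of one scroll position."""

    def __init__(self, offset=0.0):
        self.offset = offset
        self.last_content_generation = 0

    def apply_user_delta(self, delta):
        """Async scroll driven by user input on the compositor side."""
        self.offset += delta

    def apply_content_scroll(self, generation, offset):
        """Scroll update from the main thread (e.g. a scrollTo from JS).

        Returns False if this generation was already applied.
        """
        if generation <= self.last_content_generation:
            return False
        self.last_content_generation = generation
        self.offset = offset  # the "jump"; later user deltas build on it
        return True
```

If the user pans by 30 pixels, content then jumps to 1000, and the user pans another 20 pixels, the final offset is 1020: part of the async scroll happened before the jump and the rest after it.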
 268
 269 Content preventing default behaviour of input events
 270 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 271
 272 Another problem that we need to deal with is that web content is allowed
 273 to intercept touch events and prevent the “default behaviour” of
 274 scrolling. This ability is defined in web standards and is
  275 non-negotiable. Touch event listeners in web content are allowed to call
 276 preventDefault() on the touchstart or first touchmove event for a touch
 277 point; doing this is supposed to “consume” the event and prevent
 278 touch-based panning. As we saw in a previous section, the input event
 279 needs to be untransformed by the APZ code before it can be delivered to
 280 content. But, because of the preventDefault problem, we cannot fully
 281 process the touch event in the APZ code until content has had a chance
 282 to handle it.
 283
 284 To balance the needs of correctness (which calls for allowing web content
 285 to successfully prevent default handling of events if it wishes to) and
 286 responsiveness (which calls for avoiding blocking on web content
  287 JavaScript for a potentially-unbounded amount of time before reacting to
 288 an event), APZ gives web content a "deadline" to process the event and
 289 tell APZ whether preventDefault() was called on the event. The deadline
 290 is 400ms from the time APZ receives the event on desktop, and 600ms on
 291 mobile. If web content is able to process the event before this deadline,
 292 the decision to preventDefault() the event or not will be respected. If
 293 web content fails to process the event before the deadline, APZ assumes
 294 preventDefault() will not be called and goes ahead and processes the
 295 event.
 296
 297 To implement this, upon receiving a touch event, APZ immediately returns
 298 an untransformed version that can be dispatched to content. It also
 299 schedules the 400ms or 600ms timeout. There is an API that allows the
 300 main-thread event dispatching code to notify APZ as to whether or not the
 301 default action should be prevented. If the APZ content response timeout
 302 expires, or if the main-thread event dispatching code notifies the APZ of
 303 the preventDefault status, then the APZ continues with the processing of
 304 the events (which may involve discarding the events).
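
The race between the content response and the timeout can be sketched as a small state machine. The class and method names below are hypothetical and do not correspond to the APZ API; only the 400ms and 600ms values and the “first signal wins” behaviour come from the description above.

```python
from enum import Enum

# Timeouts quoted in the text above.
DESKTOP_TIMEOUT_MS = 400
MOBILE_TIMEOUT_MS = 600


class Outcome(Enum):
    PENDING = "pending"
    PROCESS = "process"
    DROP = "drop"


class QueuedInputBlock:
    """Hypothetical model of events waiting on a content response."""

    def __init__(self, is_mobile=False):
        self.timeout_ms = MOBILE_TIMEOUT_MS if is_mobile else DESKTOP_TIMEOUT_MS
        self.outcome = Outcome.PENDING

    def content_response(self, prevent_default):
        """Main thread reports whether preventDefault() was called."""
        if self.outcome is Outcome.PENDING:
            self.outcome = Outcome.DROP if prevent_default else Outcome.PROCESS
        return self.outcome

    def timeout_expired(self):
        """Deadline passed: assume preventDefault() will not be called."""
        if self.outcome is Outcome.PENDING:
            self.outcome = Outcome.PROCESS
        return self.outcome
```

Whichever signal arrives first decides the outcome; a content response that arrives after the timeout has no effect.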
 305
 306 To limit the responsiveness impact of this round-trip to content, APZ
 307 tries to identify cases where it can rule out preventDefault() as a
 308 possible outcome. To this end, the hit-testing information sent to the
 309 compositor includes information about which regions of the page are
 310 occupied by elements that have a touch event listener. If an event
 311 targets an area outside of these regions, preventDefault() can be ruled
 312 out, and the round-trip skipped.
 313
 314 Additionally, recent enhancements to web standards have given page
 315 authors new tools that can further limit the responsiveness impact of
 316 preventDefault():
 317
 318 1. Event listeners can be registered as "passive", which means they
 319    are not allowed to call preventDefault(). Authors can use this flag
 320    when writing listeners that only need to observe the events, not alter
 321    their behaviour via preventDefault(). The presence of passive event
 322    listeners does not cause APZ to perform the content round-trip.
 323 2. If page authors wish to disable certain types of touch interactions
 324    completely, they can use the ``touch-action`` CSS property from the
 325    pointer-events spec to do so declaratively, instead of registering
 326    event listeners that call preventDefault(). Touch-action flags are
 327    also included in the hit-test information sent to the compositor, and
 328    APZ uses this information to respect ``touch-action``. (Note that the
 329    touch-action information sent to the compositor is not always 100%
 330    accurate, and sometimes APZ needs to fall back on asking the main
 331    thread for touch-action information, which again involves a
 332    round-trip.)
 333
 334 Other event types
 335 ~~~~~~~~~~~~~~~~~
 336
 337 The above sections talk mostly about touch events, but over time APZ has
 338 been extended to handle a variety of other event types, such as trackpad
 339 and mousewheel scrolling, scrollbar thumb dragging, and keyboard
 340 scrolling in some cases. Much of the above applies to these other event
 341 types too (for example, wheel events can be prevent-defaulted as well).
 342
 343 Importantly, the "untransformation" described above needs to happen even
 344 for event types which are not handled in APZ, such as mouse click events,
 345 since async scrolling can still affect the correct targeting of such
 346 events.
 347
 348
 349 Technical details
 350 -----------------
 351
 352 This section describes various pieces of the APZ code, and goes into
 353 more specific detail on APIs and code than the previous sections. The
 354 primary purpose of this section is to help people who plan on making
 355 changes to the code, while also not going into so much detail that it
 356 needs to be updated with every patch.
 357
 358 Overall flow of input events
 359 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 360
 361 This section describes how input events flow through the APZ code.
 362
 363 Disclaimer: some details in this section are out of date (for example,
 364 it assumes the case where the main thread and compositor thread are
 365 in the same process, which is rarely the case these days, so in practice
 366 e.g. steps 6 and 8 involve IPC, not just "stack unwinding").
 367
 368 1.  Input events arrive from the hardware/widget code into the APZ via
 369     APZCTreeManager::ReceiveInputEvent. The thread that invokes this is
 370     called the "controller thread", and may or may not be the same as the
 371     Gecko main thread.
 372 2.  Conceptually the first thing that the APZCTreeManager does is to
 373     associate these events with “input blocks”. An input block is a set
 374     of events that share certain properties, and generally are intended
 375     to represent a single gesture. For example with touch events, all
 376     events following a touchstart up to but not including the next
 377     touchstart are in the same block. All of the events in a given block
 378     will go to the same APZC instance and will either all be processed
 379     or all be dropped.
 380 3.  Using the first event in the input block, the APZCTreeManager does a
 381     hit-test to see which APZC it hits. If no APZC is hit, the events are
 382     discarded and we jump to step 6. Otherwise, the input block is tagged
 383     with the hit APZC as a tentative target and put into a global APZ
  384     input queue. In addition to the target APZC, the result of the hit test
 385     also includes whether the input event landed on a "dispatch-to-content"
 386     region. These are regions of the page where there is something going
 387     on that requires dispatching the event to content and waiting for
  388     a response *before* processing the event in APZ; an example of this
 389     is a region containing an element with a non-passive event listener,
 390     as described above. (TODO: Add a section that talks about the other
 391     uses of the dispatch-to-content mechanism.)
 392 4.
 393
 394     i.  If the input events landed outside a dispatch-to-content region,
 395         any available events in the input block are processed. These may
 396         trigger behaviours like scrolling or tap gestures.
 397     ii. If the input events landed inside a dispatch-to-content region,
 398         the events are left in the queue and a timeout is initiated. If
 399         the timeout expires before step 9 is completed, the APZ assumes
 400         the input block was not cancelled and the tentative target is
 401         correct, and processes them as part of step 10.
 402
 403 5.  The call stack unwinds back to APZCTreeManager::ReceiveInputEvent,
 404     which does an in-place modification of the input event so that any
 405     async transforms are removed.
 406 6.  The call stack unwinds back to the widget code that called
 407     ReceiveInputEvent. This code now has the event in the coordinate
 408     space Gecko is expecting, and so can dispatch it to the Gecko main
 409     thread.
 410 7.  Gecko performs its own usual hit-testing and event dispatching for
 411     the event. As part of this, it records whether any touch listeners
 412     cancelled the input block by calling preventDefault(). It also
 413     activates inactive scrollframes that were hit by the input events.
 414 8.  The call stack unwinds back to the widget code, which sends two
 415     notifications to the APZ code on the controller thread. The first
 416     notification is via APZCTreeManager::ContentReceivedInputBlock, and
 417     informs the APZ whether the input block was cancelled. The second
 418     notification is via APZCTreeManager::SetTargetAPZC, and informs the
 419     APZ of the results of the Gecko hit-test during event dispatch. Note
 420     that Gecko may report that the input event did not hit any
 421     scrollable frame at all. The SetTargetAPZC notification happens only
 422     once per input block, while the ContentReceivedInputBlock
 423     notification may happen once per block, or multiple times per block,
 424     depending on the input type.
 425 9.
 426
 427     i.   If the events were processed as part of step 4(i), the
 428          notifications from step 8 are ignored and step 10 is skipped.
 429     ii.  If events were queued as part of step 4(ii), and steps 5-8
 430          complete before the timeout, the arrival of both notifications
 431          from step 8 will mark the input block ready for processing.
 432     iii. If events were queued as part of step 4(ii), but steps 5-8 take
 433          longer than the timeout, the notifications from step 8 will be
 434          ignored and step 10 will already have happened.
 435
 436 10. If events were queued as part of step 4(ii) they are now either
 437     processed (if the input block was not cancelled and Gecko detected a
 438     scrollframe under the input event, or if the timeout expired) or
 439     dropped (all other cases). Note that the APZC that processes the
 440     events may be different at this step than the tentative target from
 441     step 3, depending on the SetTargetAPZC notification. Processing the
 442     events may trigger behaviours like scrolling or tap gestures.
 443
 444 If the CSS touch-action property is enabled, the above steps are
 445 modified as follows:
 446
 447 * In step 4, the APZC also requires the allowed touch-action behaviours
 448   for the input event. This might have been determined as part of the
 449   hit-test in APZCTreeManager; if not, the events are queued.
 450 * In step 6, the widget code determines the content element at the point
  451   under the input event, and notifies the APZ code of the allowed
 452   touch-action behaviours. This notification is sent via a call to
 453   APZCTreeManager::SetAllowedTouchBehavior on the input thread.
 454 * In step 9(ii), the input block will only be marked ready for processing
 455   once all three notifications arrive.
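
The grouping rule from step 2 of the flow above, applied to touch events, can be sketched as below. Events are represented only by their type strings and the function name is hypothetical; the real input queue tracks considerably more state per block.

```python
def group_into_blocks(event_types):
    """Split a stream of touch event types into input blocks.

    A new block starts at every touchstart; all following events up to,
    but not including, the next touchstart belong to the same block.
    """
    blocks = []
    for event_type in event_types:
        if event_type == "touchstart" or not blocks:
            blocks.append([])
        blocks[-1].append(event_type)
    return blocks


stream = ["touchstart", "touchmove", "touchend", "touchstart", "touchmove"]
blocks = group_into_blocks(stream)
```

Because every event in a block shares the same fate, a decision reached for the first event (target APZC, cancelled or not) applies to all of the events grouped with it.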
 456
 457 Threading considerations
 458 ^^^^^^^^^^^^^^^^^^^^^^^^
 459
 460 The bulk of the input processing in the APZ code happens on what we call
 461 “the controller thread”. In practice the controller thread could be the
 462 Gecko main thread, the compositor thread, or some other thread. There are
 463 obvious downsides to using the Gecko main thread - that is,“asynchronous”
 464 panning and zooming is not really asynchronous as input events can only
 465 be processed while Gecko is idle. In an e10s environment, using the Gecko
 466 main thread of the chrome process is acceptable, because the code running
 467 in that process is more controllable and well-behaved than arbitrary web
 468 content. Using the compositor thread as the controller thread could work
 469 on some platforms, but may be inefficient on others. For example, on
 470 Android (Fennec) we receive input events from the system on a dedicated
 471 UI thread. We would have to redispatch the input events to the compositor
  472 thread if we wanted the input thread to be the same as the compositor
 473 thread. This introduces a potential for higher latency, particularly if
 474 the compositor does any blocking operations - blocking SwapBuffers
 475 operations, for example. As a result, the APZ code itself does not assume
 476 that the controller thread will be the same as the Gecko main thread or
 477 the compositor thread.
 478
 479 Active vs. inactive scrollframes
 480 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 481
 482 The number of scrollframes on a page is potentially unbounded. However,
 483 we do not want to create a separate displayport for each scrollframe
 484 right away, as this would require large amounts of memory. Therefore,
  485 scrollframes are designated as either “active” or “inactive”. Active
 486 scrollframes get a displayport, and an APZC on the compositor side.
 487 Inactive scrollframes do not get a displayport (a display list is only
 488 built for their viewport, i.e. what is currently visible) and do not get
 489 an APZC.
 490
 491 Consider a page with a scrollframe that is initially inactive. This
 492 scroll frame does not get an APZC, and therefore events targeting it will
 493 target the APZC for the nearest active scrollable ancestor (let's call it
 494 P; note, the rootmost scroll frame in a given process is always active).
 495 However, the presence of the inactive scroll frame is reflected by a
 496 dispatch-to-content region that prevents events over the frame from
 497 erroneously scrolling P.
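
The targeting rule just described can be sketched as a walk up the scroll frame tree. The ``ScrollFrame`` class and the function name are hypothetical; the sketch relies on the statement above that the rootmost scroll frame is always active.

```python
class ScrollFrame:
    """Hypothetical scroll frame with an active/inactive designation."""

    def __init__(self, name, parent=None, active=False):
        self.name = name
        self.parent = parent
        self.active = active


def tentative_target(frame):
    """Return (frame whose APZC is targeted, dispatch_to_content).

    Events over an inactive frame target the nearest active ancestor, but
    land in a dispatch-to-content region so that the ancestor is not
    scrolled before the main thread has had a chance to respond.
    """
    node = frame
    while node is not None and not node.active:
        node = node.parent
    return node, node is not frame
```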
 498
 499 When the user starts interacting with that content, the hit-test in the
 500 APZ code hits the dispatch-to-content region of P. The input block
 501 therefore has a tentative target of P when it goes into step 4(ii) in the
  502 flow above. When Gecko processes the input event, it must detect the
 503 inactive scrollframe and activate it, as part of step 7. Finally, the
 504 widget code sends the SetTargetAPZC notification in step 8 to notify the
 505 APZ that the input block should really apply to this new APZC. An issue
 506 here is that the transaction containing metadata for the newly active
 507 scroll frame must reach the compositor and APZ before the SetTargetAPZC
  508 notification. If this does not occur within the content response timeout, the APZ
 509 code will be unable to update the tentative target, and will continue to
 510 use P for that input block. Input blocks that start after the transaction
 511 will get correctly routed to the new scroll frame as there will now be an
 512 APZC instance for the active scrollframe.
 513
 514 This model implies that when the user initially attempts to scroll an
 515 inactive scrollframe, it may end up scrolling an ancestor scrollframe.
  516 Only after the round-trip to the Gecko main thread is complete is there an
 517 APZC for async scrolling to actually occur on the scrollframe itself. At
 518 that point the scrollframe will start receiving new input blocks and will
 519 scroll normally.
 520
 521 Note: with Fission (where inactive scroll frames would make it impossible
 522 to target the correct process in all situations; see
 523 :ref:`this section <fission-hit-testing>` for more details) and WebRender
 524 (which makes displayports more lightweight as the actual rendering is
 525 offloaded to the compositor and can be done on demand), inactive scroll
 526 frames are being phased out, and we are moving towards a model where all
 527 scroll frames with nonempty scroll ranges are active and get a
 528 displayport and an APZC. To conserve memory, displayports for scroll
 529 frames which have not been recently scrolled are kept to a "minimal" size
 530 equal to the viewport size.
 531
 532 WebRender Integration
 533 ~~~~~~~~~~~~~~~~~~~~~
 534
 535 This section describes how APZ interacts with the WebRender graphics
 536 backend.
 537
 538 Note that APZ predates WebRender, having initially been written to work
 539 with the earlier Layers graphics backend. The design of Layers has
 540 influenced APZ significantly, and this still shows in some places in the
 541 code. Now that the Layers backend has been removed, there may be
 542 opportunities to streamline the interaction between APZ and WebRender.
 543
 544
 545 HitTestingTree
 546 ^^^^^^^^^^^^^^
 547
 548 The APZCTreeManager keeps as part of its internal state a tree of
 549 HitTestingTreeNode instances. This is referred to as the HitTestingTree.
 550
 551 The main purpose of the HitTestingTree is to model the spatial
 552 relationships between content that's affected by async scrolling. Tree
 553 nodes fall roughly into the following categories:
 554
 555 * Nodes representing scrollable content in an active scroll frame. These
 556   nodes are associated with the scroll frame's APZC.
 557 * Nodes representing other content that may move in special ways in
 558   response to async scrolling, such as fixed content, sticky content, and
 559   scrollbars.
 560 * (Non-leaf) nodes which do not represent any content, just metadata
 561   (e.g. a transform) that applies to its descendant nodes.
 562
 563 An APZC may be associated with multiple nodes, if e.g. a scroll frame
 564 scrolls two pieces of content that are interleaved with non-scrolling
 565 content.
 566
 567 Arranging these nodes in a tree allows modelling relationships such as
 568 what content is scrolled by a given scroll frame, what the scroll handoff
 569 relationships are between APZCs, and what content is subject to what
 570 transforms.
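
As a rough illustration of the tree described above, the following Python sketch models nodes that optionally reference an APZC and optionally carry a transform (simplified here to a 2D translation). All class, field, and function names are hypothetical and do not mirror the C++ ``HitTestingTreeNode`` API; the sketch only shows how a tree makes it possible to find the transforms enclosing a node, and how one APZC can be associated with several nodes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical, simplified stand-in for a hit-testing tree node."""
    name: str
    apzc: Optional[str] = None                     # id of the associated APZC, if any
    translation: Tuple[float, float] = (0.0, 0.0)  # stand-in for a real transform
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def accumulated_translation(node: Node) -> Tuple[float, float]:
    """Sum the translations of a node and all of its ancestors."""
    x, y = 0.0, 0.0
    current: Optional[Node] = node
    while current is not None:
        x += current.translation[0]
        y += current.translation[1]
        current = current.parent
    return (x, y)


def nodes_for_apzc(root: Node, apzc: str) -> List[str]:
    """Collect, in pre-order, the names of all nodes associated with an APZC."""
    found = [root.name] if root.apzc == apzc else []
    for child in root.children:
        found.extend(nodes_for_apzc(child, apzc))
    return found


# A metadata-only node carrying a transform, below which one APZC scrolls two
# pieces of content that are interleaved with non-scrolling (fixed) content.
root = Node("root", apzc="root-apzc")
xform = root.add_child(Node("transform", translation=(10.0, 20.0)))
xform.add_child(Node("scrolled-1", apzc="inner-apzc"))
xform.add_child(Node("fixed"))
last = xform.add_child(
    Node("scrolled-2", apzc="inner-apzc", translation=(0.0, 5.0)))
```

Here ``accumulated_translation(last)`` combines the node's own translation with that of its metadata-only parent, and ``nodes_for_apzc(root, "inner-apzc")`` returns both scrolled nodes.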
 571
 572 An additional use of the HitTestingTree is to allow APZ to keep content
 573 processes up to date about enclosing transforms that they are subject to.
 574 See :ref:`this section <sending-transforms-to-content-processes>` for
 575 more details.
 576
 577 (In the past, with the Layers backend, the HitTestingTree was also used
 578 for compositor hit testing, hence the name. This is no longer the case,
 579 and there may be opportunities to simplify the tree as a result.)
 580
 581 The HitTestingTree is created from another tree data structure called
 582 WebRenderScrollData. The relevant types here are:
 583
 584 * WebRenderScrollData, which stores the entire tree.
 585 * WebRenderLayerScrollData, which represents a single "layer" of content,
 586   i.e. a group of display items that move together when scrolling (or
 587   metadata applying to a subtree of such layers). In the Layers backend,
 588   such content would be rendered into a single texture which could then
 589   be moved asynchronously at composite time. Since a layer of content can
 590   be scrolled by multiple (nested) scroll frames, a
 591   WebRenderLayerScrollData may contain scroll metadata for more than one
 592   scroll frame.
 593 * WebRenderScrollDataWrapper, which wraps WebRenderLayerScrollData
 594   but "expanded" in a way that each node only stores metadata for
 595   a single scroll frame. WebRenderScrollDataWrapper nodes have a
 596   1:1 correspondence with HitTestingTreeNodes.
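
The "expansion" from layers to wrapper nodes can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the Gecko implementation: the class names are hypothetical, scroll metadata is reduced to a string id, and the ids on a layer are assumed to be listed outermost first. A layer carrying several scroll ids becomes a chain of nodes with one id each, and the layer's children hang off the innermost node of that chain.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LayerScrollData:
    """Hypothetical stand-in for a layer, which may carry several scroll ids."""
    name: str
    scroll_ids: List[str] = field(default_factory=list)  # outermost first (assumed)
    children: List["LayerScrollData"] = field(default_factory=list)


@dataclass
class WrapperNode:
    """Expanded node: refers to one layer and at most one scroll id."""
    layer_name: str
    scroll_id: Optional[str]
    children: List["WrapperNode"] = field(default_factory=list)


def expand(layer: LayerScrollData) -> WrapperNode:
    """Expand one layer into a chain of nodes, one per scroll id."""
    # A layer without scroll metadata still yields exactly one node.
    ids: List[Optional[str]] = list(layer.scroll_ids) or [None]
    top = WrapperNode(layer.name, ids[0])
    bottom = top
    for scroll_id in ids[1:]:
        inner = WrapperNode(layer.name, scroll_id)
        bottom.children.append(inner)
        bottom = inner
    # The layer's own children attach below the innermost node of the chain.
    bottom.children.extend(expand(child) for child in layer.children)
    return top


def flatten(node: WrapperNode, depth: int = 0) -> List[Tuple[int, str, Optional[str]]]:
    """List (depth, layer name, scroll id) triples in pre-order."""
    rows = [(depth, node.layer_name, node.scroll_id)]
    for child in node.children:
        rows.extend(flatten(child, depth + 1))
    return rows


layer = LayerScrollData("content", ["outer", "inner"],
                        [LayerScrollData("overlay")])
rows = flatten(expand(layer))
```

In this example the single ``content`` layer with two scroll ids expands into two nested nodes, followed by one node for the ``overlay`` child layer.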
 597
 598 It's not clear whether the distinction between WebRenderLayerScrollData
 599 and WebRenderScrollDataWrapper is still useful in a WebRender-only world.
 600 The code could potentially be revised such that we directly build and
 601 store nodes of a single type with the behaviour of
 602 WebRenderScrollDataWrapper.
 603
 604 The WebRenderScrollData structure is built on the main thread, and
 605 then shipped over IPC to the compositor where it's used to construct
 606 the HitTestingTree.
 607
 608 WebRenderScrollData is built in WebRenderCommandBuilder, during the
 609 same traversal of the Gecko display list that is used to build the
 610 WebRender display list. As of this writing, the architecture for this is
 611 that, as we walk the Gecko display list, we query it to see if it
 612 contains any information that APZ might need to know (e.g. CSS
 613 transforms) via a call to ``nsDisplayItem::UpdateScrollData(nullptr,
 614 nullptr)``. If this call returns true, we create a
 615 WebRenderLayerScrollData instance for the item, and populate it with the
 616 necessary information in ``WebRenderLayerScrollData::Initialize``. We also
 617 create WebRenderLayerScrollData instances if we detect (via ASR changes)
 618 that we are now processing a Gecko display item that is in a different
 619 scrollframe than the previous item.
 620
 621 The main sources of complexity in this code come from:
 622
 623 1. Ensuring the ScrollMetadata instances end up on the proper
 624    WebRenderLayerScrollData instances (such that every path from a leaf
 625    WebRenderLayerScrollData node to the root has a consistent ordering of
 626    scrollframes without duplications).
 627 2. The deferred-transform optimization that is described in more detail
 628    at the declaration of ``StackingContextHelper::mDeferredTransformItem``.
 629
 630 .. _wr-hit-test-details:
 631
 632 Hit-testing
 633 ^^^^^^^^^^^
 634
 635 Since the HitTestingTree is not used for actual hit-testing purposes
 636 with the WebRender backend (see previous section), this section describes
 637 how hit-testing actually works with WebRender.
 638
 639 The Gecko display list contains display items
 640 (``nsDisplayCompositorHitTestInfo``) that store hit-testing state. These
 641 items implement the ``CreateWebRenderCommands`` method and generate a "hit-test
 642 item" into the WebRender display list. This is basically just a rectangle
 643 item in the WebRender display list that is a no-op for painting purposes,
 644 but contains information that should be returned by the hit-test (specifically
 645 the hit info flags and the scrollId of the enclosing scrollframe). The
 646 hit-test item gets clipped and transformed in the same way that all the other
 647 items in the WebRender display list do, via clip chains and enclosing
 648 reference frame/stacking context items.
 649
 650 When WebRender needs to do a hit-test, it goes through its display list,
 651 taking into account the current clips and transforms, adjusted for the
 652 most recent async scroll/zoom, and determines which hit-test item(s) are under
 653 the target point, and returns those items. APZ can then take the frontmost
 654 item from that list (or skip over it if it happens to be inside an OOP
 655 subdocument that's ``pointer-events:none``) and use that as the hit target.
 656 Note that the hit-test uses the last transform provided by the
 657 ``SampleForWebRender`` API (see next section) which generally reflects the
 658 last composite, and doesn't take into account further changes to the
 659 transforms that have occurred since then. In practice, we should be
 660 compositing frequently enough that this doesn't matter much.
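
The selection logic described above can be sketched in Python as follows. This is an illustrative model only: the item fields, the string used for the hit info flags, and the function names are hypothetical, and the rectangles are assumed to be already clipped and transformed (including the most recently sampled async scroll/zoom). It shows the two steps from the text: collect the hit-test items under the point, then take the frontmost one unless it has to be skipped.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height
Point = Tuple[float, float]


@dataclass
class HitTestItem:
    """Hypothetical model of a hit-test item: a rectangle plus its payload."""
    rect: Rect                     # assumed already clipped and transformed
    scroll_id: int                 # scrollId of the enclosing scrollframe
    flags: str                     # stand-in for the hit info flags
    pointer_events_none_oop: bool = False  # in a pointer-events:none OOP subdocument


def contains(rect: Rect, point: Point) -> bool:
    x, y, width, height = rect
    px, py = point
    return x <= px < x + width and y <= py < y + height


def hit_test(items_front_to_back: List[HitTestItem], point: Point) -> List[HitTestItem]:
    """Return every item under the point, frontmost first."""
    return [item for item in items_front_to_back if contains(item.rect, point)]


def pick_target(results: List[HitTestItem]) -> Optional[HitTestItem]:
    """Take the frontmost result, skipping pointer-events:none OOP subdocuments."""
    for item in results:
        if not item.pointer_events_none_oop:
            return item
    return None


items = [
    HitTestItem((0, 0, 50, 50), scroll_id=3, flags="visible",
                pointer_events_none_oop=True),
    HitTestItem((0, 0, 100, 100), scroll_id=2, flags="visible"),
    HitTestItem((0, 0, 500, 500), scroll_id=1, flags="visible"),
]
target = pick_target(hit_test(items, (10, 10)))
```

At the point ``(10, 10)`` all three items are hit, the frontmost one is skipped because of ``pointer-events:none``, and the item with ``scroll_id=2`` becomes the target.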
 661
 662 When debugging hit-test issues, it is often useful to apply the patches
 663 on bug 1656260, which introduce a guid on Gecko display items and propagate
 664 it all the way through to where APZ gets the hit-test result. This allows
 665 answering the question "which nsDisplayCompositorHitTestInfo was responsible
 666 for this hit-test result?" which is often a very good first step in
 667 solving the bug. From there, one can determine if there was some other
 668 display item in front that should have generated a
 669 nsDisplayCompositorHitTestInfo but didn't, or if the display item itself had
 670 incorrect information. The second patch on that bug further allows exposing
 671 hand-written debug info to the APZ code, so that the WR hit-testing
 672 mechanism itself can be more effectively debugged, in case there is a problem
 673 with the WR display items getting improperly transformed or clipped.
 674
 675 The information returned by WebRender to APZ in response to the hit test
 676 is enough for APZ to identify a HitTestingTreeNode as the target of the
 677 event. APZ can then take actions such as scrolling the target node's
 678 associated APZC, or other appropriate actions (e.g. initiating a scrollbar
 679 drag if a scrollbar thumb node was targeted by a mouse-down event).
 680
 681 Sampling
 682 ^^^^^^^^
 683
 684 The compositing step needs to read the latest async transforms from APZ
 685 in order to ensure scrollframes are rendered at the right position. The API for this is
 686 exposed via the ``APZSampler`` class. When WebRender is ready to do a composite,
 687 it invokes ``APZSampler::SampleForWebRender``. In here, APZ gathers all async
 688 transforms that WebRender needs to know about, including transforms to apply
 689 to scrolled content, fixed and sticky content, and scrollbar thumbs.
 690
 691 Along with sampling the APZ transforms, the compositor also triggers APZ
 692 animations to advance to the next timestep (usually the next vsync). This
 693 happens just before reading the APZ transforms.
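
The ordering described above (advance animations first, then read the transforms) can be pictured with a small Python sketch. It is a toy model under stated assumptions: the ``ScrollAnimation`` class, the constant-velocity behaviour, and the ``sample`` function are all hypothetical and are not the ``APZSampler`` API; the async transform is reduced to a single vertical translation per scroll frame.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ScrollAnimation:
    """Hypothetical constant-velocity scroll animation for one scroll frame."""
    offset: float      # current async scroll offset, in pixels
    velocity: float    # pixels per millisecond, assumed non-negative
    end_offset: float  # the animation stops once this offset is reached

    def advance(self, delta_ms: float) -> None:
        self.offset = min(self.offset + self.velocity * delta_ms, self.end_offset)


def sample(animations: Dict[int, ScrollAnimation], delta_ms: float) -> Dict[int, float]:
    """Advance every animation to the next timestep, then read the transforms."""
    for animation in animations.values():
        animation.advance(delta_ms)  # step 1: advance to the next timestep
    # Step 2: read back the translation to apply to each scroll frame's
    # content. Scrolling down by N pixels moves the content up by N pixels.
    return {scroll_id: -a.offset for scroll_id, a in animations.items()}


animations = {1: ScrollAnimation(offset=0.0, velocity=0.5, end_offset=20.0)}
first = sample(animations, 16.0)
second = sample(animations, 16.0)
third = sample(animations, 16.0)
```

Each call to ``sample`` stands for one composite; the third call shows the animation clamping at its end offset.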
 694
 695 Fission Integration
 696 ~~~~~~~~~~~~~~~~~~~
 697
 698 This section describes how APZ interacts with the Fission (Site Isolation)
 699 project.
 700
 701 Introduction
 702 ^^^^^^^^^^^^
 703
 704 Fission is an architectural change motivated by security considerations,
 705 where web content from each origin is isolated in its own process. Since
 706 a page can contain a mixture of content from different origins (for
 707 example, the top level page can be content from origin A, and it can
 708 contain an iframe with content from origin B), that means that rendering
 709 and interacting with a page can now involve coordination between APZ and
 710 multiple content processes.
 711
 712 .. _fission-hit-testing:
 713
 714 Content Process Selection for Input Events
 715 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 716
 717 Input events are initially received in the browser's parent process.
 718 With Fission, the browser needs to decide which of possibly several
 719 content processes an event is targeting.
 720
 721 Since process boundaries correspond to iframe (subdocument) boundaries,
 722 and every (HTML) document has a root scroll frame, process boundaries are
 723 therefore also scroll frame boundaries. Since APZ already needs a hit
 724 test mechanism to be able to determine which scroll frame an event
 725 targets, this hit test mechanism was a good fit to also use to determine
 726 which content process an event targets.
 727
 728 APZ's hit test was therefore expanded to serve this purpose as well. This
 729 mostly required only minor modifications, such as making sure that APZ
 730 knows about the root scroll frames of iframes even if they're not
 731 scrollable. Since APZ already needs to process all input events to
 732 potentially apply :ref:`untransformations <input-event-untransformation>`
 733 related to async scrolling, as part of this process it now also labels
 734 input events with information identifying which content process they
 735 target.
 736
 737 Hit Testing Accuracy
 738 ^^^^^^^^^^^^^^^^^^^^
 739
 740 Prior to Fission, APZ's hit test could afford to be somewhat inaccurate,
 741 as it could fall back on the dispatch-to-content mechanism to wait for
 742 a more accurate answer from the main thread if necessary, suffering a
 743 performance cost only (not a correctness cost).
 744
 745 With Fission, an inaccurate compositor hit test now implies a correctness
 746 cost, as there is no cross-process main-thread fallback mechanism.
 747 (Such a mechanism was considered, but judged to require too much
 748 complexity and IPC traffic to be worth it.)
 749
 750 Luckily, with WebRender the compositor has much more detailed information
 751 available to use for hit testing than it did with Layers. For example,
 752 the compositor can perform accurate hit testing even in the presence of
 753 irregular shapes such as rounded corners.
 754
 755 APZ leverages WebRender's more accurate hit testing with the aim of
 756 accurately selecting the target process (and target scroll frame) for
 757 an event in the general case.
 758
 759 One consequence of this is that the dispatch-to-content mechanism is now
 760 used less often than before (its primary remaining use is handling
 761 ``preventDefault()``).
 762
 763 .. _sending-transforms-to-content-processes:
 764
 765 Sending Transforms To Content Processes
 766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 767
 768 Content processes sometimes need to be able to convert between screen
 769 coordinates and their local coordinates. To do this, they need to know
 770 about any transforms that their containing iframe and its ancestors are
 771 subject to, including async transforms (particularly in cases where the
 772 async transforms persist for more than just a few frames).
 773
 774 APZ has information about these transforms in its HitTestingTree. With
 775 Fission, APZ periodically sends content processes information about these
 776 transforms so that they are kept relatively up to date.
 777
 778 Testing
 779 -------
 780
 781 APZ uses several test frameworks to verify that it behaves as
 782 expected.
 783
 784 Mochitest
 785 ~~~~~~~~~
 786
 787 The APZ-specific mochitests are useful when specific gestures or events need to be tested
 788 with specific content. The APZ mochitests are located in `gfx/layers/apz/test/mochitest`_.
 789 To run all of the APZ mochitests, run something like the following:
 790
 791 ::
 792
 793     ./mach mochitest ./gfx/layers/apz/test/mochitest
 794
 795 The APZ mochitests are often organized as subtests that run in a group. For example,
 796 the `test_group_hittest-2.html`_ group contains more than 20 subtests, such as
 797 `helper_hittest_overscroll.html`_. When working on a specific subtest, it is often
 798 helpful to use the ``apz.subtest`` preference to restrict the run to just the
 799 subtests you are working on. For example, the following would only run the
 800 `helper_hittest_overscroll.html`_ subtest of the `test_group_hittest-2.html`_ group.
 801
 802 ::
 803
 804     ./mach mochitest --setpref apz.subtest=helper_hittest_overscroll.html \
 805         ./gfx/layers/apz/test/mochitest/test_group_hittest-2.html
 806
 807 For more information on mochitest, see the `Mochitest Documentation`_.
 808
 809 .. _gfx/layers/apz/test/mochitest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest
 810 .. _test_group_hittest-2.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/test_group_hittest-2.html
 811 .. _helper_hittest_overscroll.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/helper_hittest_overscroll.html
 812 .. _Mochitest Documentation: /testing/mochitest-plain/index.html
 813
 814 GTest
 815 ~~~~~
 816
 817 The APZ-specific GTests can be found in `gfx/layers/apz/test/gtest/`_. To run
 818 these tests, run something like the following:
 819
 820 ::
 821
 822     ./mach gtest "APZ*"
 823
 824 For more information, see the `GTest Documentation`_.
 825
 826 .. _GTest Documentation: /gtest/index.html
 827 .. _gfx/layers/apz/test/gtest/: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/gtest/
 828
 829 Reftests
 830 ~~~~~~~~
 831
 832 The APZ reftests can be found in `layout/reftests/async-scrolling/`_ and
 833 `gfx/layers/apz/test/reftest`_. To run a large portion of the APZ
 834 reftests, run something like the following:
 835
 836 ::
 837
 838     ./mach reftest ./layout/reftests/async-scrolling/
 839
 840 Useful information about the reftests can be found in the `Reftest Documentation`_.
 841
 842 There is no defined process for choosing which directory the APZ reftests
 843 should be placed in, but in general reftests should exist where other
 844 similar tests do.
 845
 846 .. _layout/reftests/async-scrolling/: https://searchfox.org/mozilla-central/source/layout/reftests/async-scrolling/
 847 .. _gfx/layers/apz/test/reftest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/reftest/
 848 .. _Reftest Documentation: /layout/Reftest.html
 849
 850 Threading / Locking Overview
 851 ----------------------------
 852
 853 Threads
 854 ~~~~~~~
 855
 856 There are three threads relevant to APZ: the **controller thread**,
 857 the **updater thread**, and the **sampler thread**. This table lists
 858 which threads play these roles on each platform / configuration:
 859
 860 ===================== ============= ============== =============
 861 APZ Thread Name       Desktop       Desktop+GPU    Android
 862 ===================== ============= ============== =============
 863 **controller thread** UI main       GPU main       Java UI
 864 **updater thread**    SceneBuilder  SceneBuilder   SceneBuilder
 865 **sampler thread**    RenderBackend RenderBackend  RenderBackend
 866 ===================== ============= ============== =============
 867
 868 Locks
 869 ~~~~~
 870
 871 There are also a number of locks used in APZ code:
 872
 873 ======================= ==============================
 874 Lock type               How many instances
 875 ======================= ==============================
 876 APZ tree lock           one per APZCTreeManager
 877 APZC map lock           one per APZCTreeManager
 878 APZC instance lock      one per AsyncPanZoomController
 879 APZ test lock           one per APZCTreeManager
 880 Checkerboard event lock one per AsyncPanZoomController
 881 ======================= ==============================
 882
 883 Thread / Lock Ordering
 884 ~~~~~~~~~~~~~~~~~~~~~~
 885
 886 To avoid deadlocks, the threads and locks have a global **ordering**
 887 which must be respected.
 888
 889 Respecting the ordering means the following:
 890
 891 - Let "A < B" denote that A occurs earlier than B in the ordering
 892 - Thread T may only acquire lock L, if T < L
 893 - A thread may only acquire lock L2 while holding lock L1, if L1 < L2
 894 - A thread may only block on a response from another thread T while holding a lock L, if L < T
 895
 896 **The thread/lock ordering is as follows**:
 897
 898 1. UI main
 899 2. GPU main (only if GPU process enabled)
 900 3. Compositor thread
 901 4. SceneBuilder thread
 902 5. **APZ tree lock**
 903 6. RenderBackend thread
 904 7. **APZC map lock**
 905 8. **APZC instance lock**
 906 9. **APZ test lock**
 907 10. **Checkerboard event lock**
 908
 909 Example workflows
 910 ^^^^^^^^^^^^^^^^^
 911
 912 Here are some example APZ workflows. Observe how they all obey
 913 the global thread/lock ordering. Feel free to add others:
 914
 915 - **Input handling** (with GPU process): UI main -> GPU main -> APZ tree lock -> RenderBackend thread
 916 - **Sync messages** in ``PCompositorBridge.ipdl``: UI main thread -> Compositor thread
 917 - **GetAPZTestData**: Compositor thread -> SceneBuilder thread -> APZ test lock
 918 - **Scene swap**: SceneBuilder thread -> APZ tree lock -> RenderBackend thread
 919 - **Updating hit-testing tree**: SceneBuilder thread -> APZ tree lock -> APZC instance lock
 920 - **Updating APZC map**: SceneBuilder thread -> APZ tree lock -> APZC map lock
 921 - **Sampling and animation deferred tasks** [1]_: RenderBackend thread -> APZC map lock -> APZC instance lock
 922 - **Advancing animations**: RenderBackend thread -> APZC instance lock
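
The workflows above can be checked mechanically against the global ordering. The following Python sketch is a simplification that treats each workflow as a chain in which every step must occur strictly later in the ordering than the step before it. The names ``ORDERING``, ``WORKFLOWS``, and ``obeys_ordering`` are made up for this example and do not exist in the APZ code.

```python
from typing import Dict, List

# Global thread/lock ordering from this section, earliest first.
ORDERING = [
    "UI main",
    "GPU main",
    "Compositor thread",
    "SceneBuilder thread",
    "APZ tree lock",
    "RenderBackend thread",
    "APZC map lock",
    "APZC instance lock",
    "APZ test lock",
    "Checkerboard event lock",
]
RANK = {name: index for index, name in enumerate(ORDERING)}


def obeys_ordering(workflow: List[str]) -> bool:
    """True if each step occurs strictly later in the ordering than the previous one."""
    ranks = [RANK[step] for step in workflow]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


# The example workflows, written with the names used in ORDERING.
WORKFLOWS: Dict[str, List[str]] = {
    "Input handling": ["UI main", "GPU main", "APZ tree lock", "RenderBackend thread"],
    "Sync messages": ["UI main", "Compositor thread"],
    "GetAPZTestData": ["Compositor thread", "SceneBuilder thread", "APZ test lock"],
    "Scene swap": ["SceneBuilder thread", "APZ tree lock", "RenderBackend thread"],
    "Updating hit-testing tree": ["SceneBuilder thread", "APZ tree lock", "APZC instance lock"],
    "Updating APZC map": ["SceneBuilder thread", "APZ tree lock", "APZC map lock"],
    "Sampling": ["RenderBackend thread", "APZC map lock", "APZC instance lock"],
    "Advancing animations": ["RenderBackend thread", "APZC instance lock"],
}
results = {name: obeys_ordering(steps) for name, steps in WORKFLOWS.items()}

# Counterexample: taking the tree lock from the RenderBackend thread goes
# backwards in the ordering and is therefore rejected.
violation = obeys_ordering(["RenderBackend thread", "APZ tree lock"])
```

Every entry in ``results`` is ``True``, while ``violation`` is ``False`` because the APZ tree lock precedes the RenderBackend thread in the ordering.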
 923
 924 .. [1] It looks like there are two deferred tasks that actually need the tree lock,
 925    ``AsyncPanZoomController::HandleSmoothScrollOverscroll`` and
 926    ``AsyncPanZoomController::HandleFlingOverscroll``. We should be able to rewrite
 927    these to use the map lock instead of the tree lock.
 928    This will allow us to continue running the deferred tasks on the sampler
 929    thread rather than having to bounce them to another thread.