vcl/README.scheduler

   1 = Introduction =
   2
   3 The VCL scheduler handles LOs primary event queue. It is simple by design,
   4 currently just a single-linked list, processed in list-order by priority
   5 using round-robin for reoccurring tasks.
   6
   7 The scheduler has the following behaviour:
   8
   9 B.1. Tasks are scheduled just priority based
  10 B.2. Implicitly cooperative AKA non-preemptive
  11 B.3. It's not "fair" in any way (a consequence of B.2)
  12 B.4. Tasks are handled round-robin (per priority)
  13 B.5. Higher priorities have lower values
  14 B.6. A small set of priorities instead of an flexible value AKA int
  15
  16 There are some consequences due to this design.
  17
  18 C.1. Higher priority tasks starve lower priority tasks
  19      As long as a higher task is available, lower tasks are never run!
  20      See Anti-pattern.
  21
  22 C.2. Tasks should be split into sensible blocks
  23      If this can't really be done, process pending tasks by calling
  24      Application::Reschedule(). Or use a thread.
  25
  26 C.3. This is not an OS scheduler
  27      There is no real way to "fix" B.2. and B.3.
  28      If you need to do a preemptive task, use a thread!
  29      Otherwise make your task suspendable and check SalInstance::AnyInput
  30      or call Application::Reschedule regularly.
  31
  32
  33 = Driving the scheduler AKA the system timer =
  34
  35   1. There is just one system timer, which drives LO event loop
  36   2. The timer has to run in the main window thread
  37   3. The scheduler is run with the Solar mutex acquired
  38   4. The system timer is a single-shot timer
  39   5. The scheduler system event / message has a low system priority.
  40      All system events should have a higher priority.
  41
  42 Every time a task is started, the scheduler timer is adjusted. When the timer
  43 fires, it posts an event to the system message queue. If the next most
  44 important task is an Idle (AKA instant, 0ms timeout), the event is pushed to
  45 the back of the queue, so we don't starve system messages, otherwise to the
  46 front. This is especially important to get a correct SalInstance::AnyInput
  47 handling, as this is used to suspend long background Idle tasks.
  48
  49 Every time the scheduler is invoked it searches for the next task to process,
  50 restarts the timer with the timeout for the next event and then invokes the
  51 task. After invoking the task and if the task is still active, it is pushed
  52 to the end of the queue and the timeout is eventually adjusted.
  53
  54
  55 = Locking =
  56
  57 The locking is quite primitive: all interaction with internal Scheduler
  58 structures are locked. This includes the ImplSchedulerContext and the
  59 Task::mpSchedulerData, which is actually a part of the scheduler.
  60 Before invoking the task, we have to release the lock, so others can
  61 Start new Tasks.
  62
  63
  64 = Lifecycle / thread-safety of Scheduler-based objects =
  65
  66 A scheduler object it thread-safe in the way, that it can be associated to
  67 any thread and any thread is free to call any functions on it. The owner must
  68 guarantee that the Invoke() function can be called, while the Scheduler object
  69 exists / is not disposed.
  70
  71
  72 = Anti-pattern: Dependencies via (fine grained) priorities =
  73
  74 "Idle 1" should run before "Idle 2", therefore give "Idle 1" a higher priority
  75 then "Idle 2". This just works correct for low frequency idles, but otherwise
  76 always breaks!
  77
  78 If you have some longer work - even if it can be split by into schedulable,
  79 smaller blocks - you normally don't want to schedule it with a non-default
  80 priority, as it starves all lower priority tasks. Even if a block was processed
  81 in "Idle 1", it is scheduled with the same (higher) priority again. Changing
  82 the "Idle" to a "Timer" also won't work, as this breaks the dependency.
  83
  84 What is needed is task based dependency handling, so if "Task 1" is done, it
  85 has to start "Task 2" and if "Task 1" is started again, it has to stop
  86 "Task 2". This currently has to be done by the implementor, but this feature
  87 can be added to the scheduler reasonably.
  88
  89
  90 = Implementation details =
  91
  92 == General: event priority for DoYield ==
  93
  94 There are three types of events, with different priority:
  95
  96 1. LO user events
  97 2. System events
  98 3. LO Scheduler event
  99
 100 They should be processed according to the following code:
 101
 102 bool DoYield( bool bWait, bool bAllCurrent )
 103 {
 104     bool bWasEvent = ProcessUserEvents( bAllCurrent );
 105     if ( !bAllCurrent && bWasEvent )
 106         return true;
 107     bWasEvent = ProcessSystemEvents( bAllCurrent, &bWasSchedulerEvent ) || bWasEvent;
 108     if ( !bWasSchedulerEvent && IsSchedulerEvent() )
 109     {
 110         ProcessSchedulerEvent()
 111         bWasEvent = true;
 112     }
 113     if ( !bWasEvent && bWait )
 114     {
 115         WaitForSystemEvents();
 116         bWasEvent = true;
 117     }
 118     return bWasEvent;
 119 }
 120
 121 == General: main thread deferral ==
 122
 123 Currently for Mac and Windows, we run main thread deferrals by disabling the
 124 SolarMutex using a boolean. In the case of the redirect, this makes
 125 tryToAcquire and doAcquire return true or 1, while a release is ignored.
 126 Also the IsCurrentThread() mutex check function will act accordingly, so all
 127 the DBG_TESTSOLARMUTEX won't fail.
 128
 129 Since we just disable the locks when we start running the deferred code in the
 130 main thread, we won't let the main thread run into stuff, where it would
 131 normally wait for the SolarMutex.
 132
 133 Eventually this will move into the SolarMutex. KDE / Qt also does main
 134 thread redirects using Qt::BlockingQueuedConnection.
 135
 136 == General: processing all current events for DoYield ==
 137
 138 This is easily implemented for all non-priority queue based implementations.
 139 Windows and macOS both have a timestamp attached to their events / messages,
 140 so simply get the current time and just process anything < timestamp.
 141 For the KDE backend this is already the default behaviour - single event
 142 processing isn't even supported. The headless backend accomplishes this by
 143 just processing a copy of the list of current events.
 144
 145 Problematic in this regard is the Gtk+ backend. g_main_context_iteration
 146 dispatches "only those highest priority event sources". There is no real way
 147 to tell, when these became ready. I've added a workaround idea to the TODO
 148 list. FWIW: Qt runs just a single timer source in the glib main context,
 149 basically the same we're doing with the LO scheduler as a system event.
 150
 151 The gen X11 backend has some levels of redirection, but needs quite some work
 152 to get this fixed.
 153
 154 == General: non-main thread yield ==
 155
 156 Yielding from a non-main thread must not wait in the main thread, as this
 157 may block the main thread until some events happen.
 158
 159 Currently we wait on an extra conditional, which is cleared by the main event
 160 loop.
 161
 162 == General: invalidation of elapsed timer event messages ==
 163
 164 Since the system timer to run the scheduler is single-shot, there should never
 165 be more then one elapsed timer event in system event queue. When stopping or
 166 restarting the timer, we eventually have to remove the now invalid event from
 167 the queue.
 168
 169 But for the Windows and macOS backends this may fail as they have delayed
 170 posting of events, so a consecutive remove after a post will actually yield no
 171 remove. On Windows we even get unwanted processing of events outside of the
 172 main event loop, which may call the Scheduler, as timer management is handled
 173 in critical scheduler code.
 174
 175 To prevent these problems, we don't even try to remove these events, but
 176 invalidate them by versioning the timer events. Timer events with invalid
 177 versions are processed but simply don't run the scheduler.
 178
 179 == macOS implementation details ==
 180
 181 Generally the Scheduler is handled as expected, except on resize, which is
 182 handled with different runloop-modes in macOS. In case of a resize, the normal
 183 runloop is suspended in sendEvent, so we can't call the scheduler via posted
 184 main loop-events. Instead the scheduler uses the timer again.
 185
 186 Like the Windows backend, all Cocoa / GUI handling also has to be run in
 187 the main thread. We're emulating Windows out-of-order PeekMessage processing,
 188 via a YieldWakeupEvent and two conditionals. When in a RUNINMAIN call, all
 189 the DBG_TESTSOLARMUTEX calls are disabled, as we can't release the SolarMutex,
 190 but we can prevent running any other SolarMutex based code. Those wakeup
 191 events must be ignored to prevent busy-locks. For more info read the "General:
 192 main thread deferral" section.
 193
 194 We can neither rely on macOS dispatch_sync code block execution nor the
 195 message handling, as both can't be prioritized or filtered and the first
 196 does also not allow nested execution and is just processed in sequence.
 197
 198 There is also a workaround for a problem for pushing tasks to an empty queue,
 199 as [NSApp postEvent: ... atStart: NO] doesn't append the event, if the
 200 message queue is empty.
 201
 202 An additional problem is the filtering of events on Window close. This drops
 203 posted timer events, when a Window is closed resulting in a busy DoYield loop,
 204 so we have to re-post the event, after closing a window.
 205
 206 == Windows implementation details ==
 207
 208 Posted or sent event messages often trigger processing of WndProc in
 209 PeekMessage, GetMessage or DispatchMessage, independently from the message to
 210 fetch, remove or dispatch ("During this call, the system delivers pending,
 211 nonqueued messages..."). Additionally messages have an inherited priority
 212 based on the function used to generate them. Even if WM_TIMER messages should
 213 have the lowest priority, a manually posted WM_TIMER is processed with the
 214 priority of a PostMessage message.
 215
 216 So we're giving up on processing all our Scheduler events as a message in the
 217 system message loop. Instead we just indicate a 0ms timer message by setting
 218 the m_bDirectTimeout in the timer object. This timer is always processed, if
 219 the system message wasn't already our timer. As a result we can also skip the
 220 polling. All this is one more reason to drop the single message processing
 221 in favour of always processing all pending (system) events.
 222
 223 There is another special case, we have to handle: window updates during move
 224 and resize of windows. These system actions run in their own nested message
 225 loop. So we have to completely switch to timers, even for 0ms. But these
 226 posted events prevent any event processing, while we're busy. The only viable
 227 solution seems to be to switch to WM_TIMER based timers, as these generate
 228 messages with the lowest system priority (but they don't allow 0ms timeouts).
 229 So processing slows down during resize and move, but we gain working painting,
 230 even when busy.
 231
 232 An additional workaround is implemented for the delayed queuing of posted
 233 messages, where PeekMessage in WinSalTimer::Stop() won't be able remove the
 234 just posted timer callback message. See "General: invalidation of elapsed
 235 timer event messages" for the details.
 236
 237 To run the required GUI code in the main thread without unlocking the
 238 SolarMutex, we "disable" it. For more infos read the "General: main thread
 239 deferral" section.
 240
 241 == KDE implementation details ==
 242
 243 This implementation also works as intended. But there is a different Yield
 244 handling, because Qts QAbstractEventDispatcher::processEvents will always
 245 process all pending events.
 246
 247
 248 = TODOs and ideas =
 249
 250 == Task dependencies AKA children ==
 251
 252 Every task can have a list of children / a child.
 253
 254  * When a task is stopped, the children are started.
 255  * When a task is started, the children are stopped.
 256
 257 This should be easy to implement.
 258
 259 == Per priority time-sorted queues ==
 260
 261 This would result in O(1) scheduler. It was used in the Linux kernel for some
 262 time (search Ingo Molnar's O(1) scheduler). This can be a scheduling
 263 optimization, which would prevent walking longer event list. But probably the
 264 management overhead would be too large, as we have many one-shot events.
 265
 266 To find the next task the scheduler just walks the (constant) list of priority
 267 queues and schedules the first ready event of any queue.
 268
 269 The downside of this approach: Insert / Start / Reschedule(for "auto" tasks)
 270 now need O(log(n)) to find the position in the queue of the priority.
 271
 272 == Always process all (higher priority) pending events ==
 273
 274 Currently Application::Reschedule() processes a single event or "all" events,
 275 with "all" defined as "100 events" in most backends. This already is ignored
 276 by the KDE backend, as Qt defines its QAbstractEventDispatcher::processEvents
 277 processing all pending events (there are ways to skip event classes, but no
 278 easy way to process just a single event).
 279
 280 Since the Scheduler is always handled by the system message queue, there is
 281 really no more reasoning to stop after 100 events to prevent LO Scheduler
 282 starvation.
 283
 284 == Drop static inherited or composed Task objects ==
 285
 286 The sequence of destruction of static objects is not defined. So a static Task
 287 can not be guaranteed to happen before the Scheduler. When dynamic unloading
 288 is involved, this becomes an even worse problem. This way we could drop the
 289 mbStatic workaround from the Task class.
 290
 291 == Run the LO application in its own thread ==
 292
 293 This would probably get rid of most of the macOS and Windows implementation
 294 details / workarounds, but is quite probably a large amount of work.
 295
 296 Instead of LO running in the main process / thread, we run it in a 2nd thread
 297 and defer al GUI calls to the main thread. This way it'll hopefully not block
 298 and can process system events.
 299
 300 That's just a theory - it definitely needs more analysis before even attending
 301 an implementation.
 302
 303 == Re-evaluate the macOS ImplNSAppPostEvent ==
 304
 305 Probably a solution comparable to the Windows backends delayed PostMessage
 306 workaround using a validation timestamp is better then the current peek,
 307 remove, re-postEvent, which has to run in the main thread.
 308
 309 Originally I didn't evaluate, if the event is actually lost or just delayed.
 310
 311 == Drop nMaxEvents from Gtk+ based backends ==
 312
 313 gint last_priority = G_MAXINT;
 314 bool bWasEvent = false;
 315 do {
 316     gint max_priority;
 317     g_main_context_acquire( NULL );
 318     bool bHasPending = g_main_context_prepare( NULL, &max_priority );
 319     g_main_context_release( NULL );
 320     if ( bHasPending )
 321     {
 322         if ( last_priority > max_priority )
 323         {
 324             bHasPending = g_main_context_iteration( NULL, bWait );
 325             bWasEvent = bWasEvent || bHasPending;
 326         }
 327         else
 328             bHasPending = false;
 329     }
 330 }
 331 while ( bHasPending )
 332
 333 The idea is to use g_main_context_prepare and keep the max_priority as an
 334 indicator. We cannot prevent running newer lower events, but we can prevent
 335 running new higher events, which should be sufficient for most stuff.
 336
 337 This also touches user event processing, which currently runs as a high
 338 priority idle in the event loop.
 339
 340 == Drop nMaxEvents from gen (X11) backend ==
 341
 342 A few layers of indirection make this code hard to follow. The SalXLib::Yield
 343 and SalX11Display::Yield architecture makes it impossible to process just the
 344 current events. This really needs a refactoring and rearchitecture step, which
 345 will also affect the Gtk+ and KDE backend for the user event handling.