doc/executor.md

   1 # Executor
   2
   3 ## Overview
   4
   5 The executor API allows to submit _runnables_ to be executed from background
   6 threads.
   7
   8 The runnable instances are allocated and freed by the caller. It is the
   9 responsibility of the caller to guarantee that the runnable is valid until
  10 the task is canceled or complete.
  11
  12 The caller is expected to (but not forced to) embed the runnable into its
  13 custom task structure:
  14
  15 ```c
  16 struct my_task
  17 {
  18     /* custom data */
  19     int num;
  20     char *str;
  21
  22     /* embedded runnable */
  23     struct vlc_runnable runnable;
  24 };
  25
  26 void Run(void *userdata)
  27 {
  28     // ...
  29 }
  30 ```
  31
  32 and submit it as follow:
  33
  34 ```c
  35     struct my_task *task = ...;
  36
  37     task->runnable.run = Run;
  38     task->runnable.userdata = task; /* userdata passed to Run() */
  39
  40     vlc_executor_Submit(executor, &task->runnable);
  41 ```
  42
  43 Since the task is allocated by the caller, vlc_executor_Submit() may not fail
  44 (it returns void).
  45
  46 The cancelation of a submitted runnable may be requested. It succeeds only if
  47 the runnable is not started yet.
  48
  49 ```c
  50 VLC_API void
  51 vlc_executor_Submit(vlc_executor_t *executor, struct vlc_runnable *runnable);
  52
  53 VLC_API bool
  54 vlc_executor_Cancel(vlc_executor_t *executor, struct vlc_runnable *runnable);
  55 ```
  56
  57
  58 ## Design discussions
  59
  60 The design of this API is the result of many discussions, especially on the
  61 mailing-list:
  62  - <https://mailman.videolan.org/pipermail/vlc-devel/2020-August/136696.html>
  63  - <https://mailman.videolan.org/pipermail/vlc-devel/2020-September/136944.html>
  64  - <https://mailman.videolan.org/pipermail/vlc-devel/2020-September/137188.html>
  65
  66 This documentation aims to explain the rationale for each decision.
  67
  68
  69 ### Runnable allocation
  70
  71 The `struct vlc_runnable` must be provided by the client, and the executor uses
  72 only this instance to represent its task (it does not allocate per-task private
  73 data).
  74
  75 Firstly, note that this allows `vlc_executor_Submit()` to return `void`, so no
  76 error handling is necessary. It's minor, but convenient.
  77
  78 An attractive alternative would be to return a new allocated runnable on
  79 submission, which could be used for cancelation:
  80
  81 ```c
  82 struct vlc_runnable *
  83 vlc_executor_Submit(vlc_executor_t *, void (*run)(void *), void *userdata);
  84
  85 void vlc_executor_Cancel(vlc_executor_t *, struct vlc_runnable *);
  86 ```
  87
  88 But this is inherently racy: a naive implementation could return a pointer to a
  89 `vlc_runnable` already freed. Indeed, `vlc_executor_Submit()` must post the
  90 runnable to some pending queue, and in theory it may be executed by a background
  91 thread even before the function returns.
  92
  93 For example, if the executor frees the runnable as soon as its execution is
  94 complete, this simple code is incorrect:
  95
  96 ```c
  97 struct my_task
  98 {
  99     char *some_custom_data;
 100     struct vlc_runnable *runnable; /* keep a pointer to cancel it */
 101 };
 102
 103 void Run(void *userdata)
 104 {
 105     struct my_task *task = userdata;
 106     /* (task->runnable may be uninitialized here) */
 107 }
 108 ```
 109
 110 ```c
 111     struct my_task *task = ...;
 112     task->runnable = vlc_executor_Submit(executor, Run, task);
 113     /* task->runnable may be already freed here */
 114     vlc_executor_Cancel(executor, task->runnable); /* boom, use-after-free */
 115 ```
 116
 117 To avoid the use-after-free, the runnable ownership must be shared. This makes
 118 the API more complex:
 119
 120 ```c
 121 struct vlc_runnable_t *runnable = vlc_executor_Submit(executor, Run, task);
 122 vlc_runnable_Release(runnable); /* don't need the runnable anymore */
 123 ```
 124
 125 However, this would still be insufficient: the task resources must typically be
 126 released at the end of the `run()` callback, but at this point
 127 `vlc_executor_Submit()` may not have returned yet, so the runnable instance
 128 would not be known by the user.
 129
 130 Since in practice, the client needs its own task structure (to store the
 131 parameters, state and result of the execution), so it must allocate it and pass
 132 it to the executor (as userdata) anyway. Therefore, it's simpler if the task
 133 already embed the runnable.
 134
 135 Note that it is not mandatory to embed the runnable into the client task
 136 structure (it's just convenient). The runnable could be anywhere, for example on
 137 the stack (provided that it lives long enough).
 138
 139 This design also has disadvantages. In particular, it forces the runnable queue
 140 to be implemented as an intrusive list. As a consequence, the public `struct
 141 vlc_runnable` must include an executor-private field (`struct vlc_list node`),
 142 and the same runnable may not be queued twice (it is not possible to put the
 143 same item twice in the same intrusive list).
 144
 145 Therefore, a client must always explicitly create a new runnable for each
 146 execution. In practice, it should not be a problem though, since it must often
 147 create a new custom task for the execution state anyway.
 148
 149 In addition, further extensions of the executor API are constrained by the fact
 150 that it could not allocate and store per-task data (other than the intrusive
 151 list node).
 152
 153
 154 ### Cancelation
 155
 156 It is important to be able to cancel and interrupt a running task.
 157
 158 However, in C, the interruption mechanism is very specific to the concrete task
 159 implementation, so it could not be provided by the executor: it has to be
 160 provided by the user. And since the user is also at the origin of the
 161 cancelation request (which could lead to the interruption), the executor just
 162 let the user handle interruption manually.
 163
 164 Since it does not handle interruption, it could not provide a generic timeout
 165 mechanism either (see below). The user has to handle timeout manually.
 166
 167 It just provides a way to cancel a queued task not started yet:
 168
 169 ```c
 170 VLC_API bool
 171 vlc_executor_Cancel(vlc_executor_t *executor, struct vlc_runnable *runnable);
 172 ```
 173
 174 The runnable instance is created and destroyed by the user, so there is no
 175 possible use-after-free race condition.
 176
 177 Since the runnable is queued via an intrusive list, it can be removed in O(1).
 178
 179 The tricky part is that it must indicate to the user if the runnable has
 180 actually been dequeued or not (i.e. if it has already been taken by a thread to
 181 be run). Indeed, the user must be able to know if the `run()` callback will be
 182 executed or not, in order to release the task resources correctly in all cases,
 183 without race conditions. This adds some complexity for the user.
 184
 185 For the implementation, the problem is that even if we can remove an item in an
 186 intrusive list in O(1), there is a priori no way to know immediately if the item
 187 was in a specific list or not. To circumvent this problem, when an item is
 188 dequeued, the executor resets the list nodes to `NULL`. This allows to store 1
 189 bit of information indicating whether the item has been dequeued or not.
 190
 191 Note that the use-after-free race condition on cancel could alternatively be
 192 solved by passing an optional user-provided `void *id`:
 193
 194 ```c
 195 struct vlc_runnable
 196 {
 197     void (*run)(void *userdata)
 198     void *userdata;
 199     struct vlc_list node;
 200
 201     /* add a user-provided id (may be NULL) used for cancelation */
 202     void *id;
 203 }
 204
 205 void vlc_executor_Submit(vlc_executor_t *executor, struct vlc_runnable *);
 206
 207 void vlc_executor_Cancel(vlc_executor_t *executor, void *id);
 208 ```
 209
 210 This would solve the problem because the user owns the `id`, so it is guaranteed
 211 to exist on cancel. This is also practical, because the user could just pass a
 212 pointer to its custom task structure.
 213
 214 An additional advantage is that a higher-level API (like the preparser) could
 215 directly benefit from the cancelation API, without tracking its submitted tasks
 216 to find the matching runnable.
 217
 218 But it would only make sense if the user was not already forced to track its
 219 submitted tasks, to be able to cancel them on deletion. Indeed, since the
 220 executor must report for each task if it has been dequeued or not, it is not
 221 possible to provide a function to cancel all tasks at once. Therefore, the user
 222 has to keep track of submitted tasks, which is inconvenient.
 223
 224
 225 ### Timeout
 226
 227 A timeout implementation is often very specific to the concrete task
 228 implementation (for example, calling `vlc_cond_timedwait()` instead of
 229 `vlc_cond_wait()`), so it makes sense to let the user implement it.
 230
 231 However, we could see the timeout as a cancelation where the deadline is known
 232 in advance. Since the user must already implement the interruption on its own,
 233 it could have been used to provide a timeout mechanism "for free" (for the
 234 user).
 235
 236 We decided against it, for several reasons:
 237  - in theory, the user may want to do something different on timeout and
 238    cancelation (it may for example consider that one is an error and the other
 239    is not);
 240  - this would require more callbacks and would complexify the executor API;
 241  - if it's necessary, the user could just use a separate component (timer) to
 242    trigger the interruption.
 243
 244
 245 ## Conclusion
 246
 247 The resulting executor is a "minimal" API, simple and general: it just handles
 248 execution.
 249
 250 As a drawback, it does not help for interruption, cancelation and timeout, so
 251 the user has more work to do on its own. For example, it must always track its
 252 submitted tasks, and implement boilerplate to correctly handle cancelation and
 253 interruption on deletion.
 254
 255 But I can't think of an executor API design in C with all the desirable
 256 properties. Any design I considered sucked one way or another. :/