======================================================================

General ideas for improvements
------------------------------

* Listen on specific interfaces or protocols.

* Performance - measure and improve it. Chart it over various buffer
  sizes and threads, as that should make it easier to identify

* For parallel plugins, only create threads on demand from parallel
  client requests, rather than pre-creating all threads at connection
  time, up to the thread pool size limit. Of course, once created, a
  thread is reused where possible until the connection closes.

* A new threading model, SERIALIZE_RETIREMENT, which lets the client
  queue up multiple requests and processes them in parallel in the
  plugin, but where the responses sent back to the client are in the
  same order as the client's request rather than the plugin's
  completion. This is stricter than fully parallel, but looser than
  SERIALIZE_REQUESTS (in particular, a client can get
  non-deterministic behavior if batched requests touch the same area

* Async callbacks. The current parallel support requires one thread
  per pending message; a solution with fewer threads would split
  low-level code between request and response, where the callback has
  to inform nbdkit when the response is ready:
  https://www.redhat.com/archives/libguestfs/2018-January/msg00149.html

* More NBD protocol features. The currently missing features are
  structured replies for sparse reads, and online resize.

* Test that zero-length read/write/extents requests behave sanely
  (NBD protocol says they are unspecified).

* If a client negotiates structured replies, and issues a read/extents
  call that exceeds EOF (qemu 3.1 is one such client, when nbdkit
  serves non-sector-aligned images), return the valid answer for the
  subset of the request in range and then NBD_REPLY_TYPE_ERROR_OFFSET
  for the tail, rather than erroring the entire request.

* Audit the code base to get rid of strerror() usage (the function is
  not thread-safe); however, using strerror_r() can be tricky as it
  has a different signature in glibc than in POSIX.

* Teach nbdkit_error() to have smart newline appending (for existing
  inconsistent clients), while fixing internal uses to omit the
  newline. Commit ef4f72ef has some ideas on smart newlines, but that
  should probably be factored into a utility function.

* Add a mode of operation where nbdkit is handed a pre-opened fd to be
  used immediately in transmission phase (skipping handshake). There
  are already third-party clients of the kernel's /dev/nbdX which rely
  on their own protocol instead of NBD handshake, before calling
  ioctl(NBD_SET_SOCK); this mode would let the third-party client
  continue to keep their non-standard handshake while utilizing nbdkit
  to prototype new behaviors in serving the kernel.

* "nbdkit.so": nbdkit as a loadable shared library. The aim of nbdkit
  is to make it reusable from other programs (see nbdkit-captive(1)).
  If it were a loadable shared library it would be even more reusable.
  API would allow you to create an nbdkit instance, configure it (same
  as the current command line), start it serving on a socket, etc.
  However perhaps the current ability to work well as a subprocess is
  good enough? Also allowing multiple instances of nbdkit to be
  loaded in the same process is probably impossible.

* Examine other fuzzers: https://gitlab.com/akihe/radamsa

* common/utils/vector.h could be extended and used in other places:
  - there are some more possible places in the server (anywhere using
  - add more iterators, map function, etc, as required.

* password=- to mean read a password interactively from /dev/tty (not
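
For the strerror() audit item in the list above, one possible shape
for a thread-safe wrapper is sketched here. safe_strerror is a
hypothetical name, not an existing nbdkit function, and the
preprocessor test assumes that only glibc with _GNU_SOURCE defined
provides the char *-returning variant; other C libraries would need
checking.

```c
/* Ask for the POSIX declaration of strerror_r under strict -std=
   modes; harmless if the build already defines a feature macro. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <errno.h>   /* error constants such as ENOENT for callers */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of nbdkit): thread-safe strerror.
   Always returns a NUL-terminated string, which either lives in buf
   or (GNU variant only) is a static immutable message. */
static const char *
safe_strerror (int errnum, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
#if defined (__GLIBC__) && defined (_GNU_SOURCE)
  /* GNU variant: returns char *, which may or may not be buf. */
  return strerror_r (errnum, buf, len);
#else
  /* XSI/POSIX variant: returns 0 on success, non-zero on failure. */
  if (strerror_r (errnum, buf, len) != 0)
    snprintf (buf, len, "Unknown error %d", errnum);
  return buf;
#endif
}
```

A caller would pass a stack buffer, for example
safe_strerror (errno, buf, sizeof buf), and print the returned pointer
rather than buf itself.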
Suggestions for plugins
-----------------------

Note: qemu supports other formats such as libnfs, iscsi, gluster and
ceph/rbd, and while similar plugins could be written for nbdkit there
is no compelling reason unless the result is better than qemu-nbd.
For the majority of users it would be better if they were directed to
qemu-nbd for these use cases.

* libblkio https://gitlab.com/libblkio/libblkio is a library for high
  performance block device I/O. It presently (2021) only creates an
  abstraction over io_uring for Linux devices, but in future it could
  support vhost-user. This would allow those devices to be exposed as
  NBD through nbdkit (although the only known exporter of vhost-user,
  qemu, can already expose NBD directly). There is some mismatch
  between the libblkio API and what nbdkit plugins expect -- in
  particular we may need to use bounce buffers.

https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg02971.html
is a partial solution but it needs cleaning up.

nbdkit-floppy-plugin:

* Add boot sector support. In theory this is easy (eg. using
  SYSLINUX), but the practical reality of making a fully bootable
  floppy is rather more complex.

* Add multiple dir merging.

nbdkit-linuxdisk-plugin:

* Add multiple dir merging (in e2fsprogs mke2fs).

* Map export name to file parameter, allowing clients to access
  multiple disks from a single VM.

* For testing VDDK, try to come up with delay filter + sparse plugin
  settings which behave closely (in terms of performance and API
  latency) to the real thing. This would allow us to tune some
  performance tools without needing VMware all the time.

nbdkit-torrent-plugin:

* There are lots of settings we could map into parameters:
  https://www.libtorrent.org/reference-Settings.html#settings_pack

* The C++ could be a lot more natural. At the moment it's a kind of
  “C with C++ extensions”.

nbdkit-ondemand-plugin:

* Implement more callbacks, eg. .zero

* Allow client to select size up to a limit, eg. by sending export
  names like ‘export:4G’.
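
The export-name size idea for the ondemand plugin could be parsed
roughly as follows. This is only a sketch: parse_export_size, the
"export:" prefix handling and the K/M/G/T suffix set are assumptions
made for illustration, and a real plugin would more likely reuse
nbdkit_parse_size for the numeric part.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "export:<N>[K|M|G|T]" into bytes.
   Returns 0 and stores the size in *size on success, or -1 if the
   name is malformed, overflows, or exceeds the configured limit. */
static int
parse_export_size (const char *exportname, uint64_t limit, uint64_t *size)
{
  static const char prefix[] = "export:";
  const char *p;
  char *end;
  unsigned long long n;
  unsigned shift = 0;

  assert (exportname != NULL && size != NULL);
  if (strncmp (exportname, prefix, sizeof prefix - 1) != 0)
    return -1;
  p = exportname + sizeof prefix - 1;
  if (*p < '0' || *p > '9')     /* reject empty, signs, whitespace */
    return -1;

  errno = 0;
  n = strtoull (p, &end, 10);
  if (errno != 0)
    return -1;

  switch (*end) {
  case '\0': break;
  case 'K': shift = 10; end++; break;
  case 'M': shift = 20; end++; break;
  case 'G': shift = 30; end++; break;
  case 'T': shift = 40; end++; break;
  default: return -1;
  }
  if (*end != '\0')             /* trailing junk after the suffix */
    return -1;
  if (n > (UINT64_MAX >> shift)) /* would overflow when scaled */
    return -1;

  n <<= shift;
  if (n > limit)
    return -1;
  *size = n;
  return 0;
}
```

With a limit of 8G, "export:4G" would be accepted and "export:16G"
rejected, leaving the plugin free to fall back to its default size.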
* Allow expressions to evaluate to numbers, offsets, etc so that this

    nbdkit data '(0x55 0xAA)*n' n=1000

* Allow inclusion of files where the file is not binary but is written

* Like $VAR but the variable is either binary or base64.

* Look at using skopeo instead of podman pull
  (https://github.com/containers/skopeo)
Suggestions for language plugins
--------------------------------

* Get the __docstring__ from the module and print it in --help output.
  This requires changes to the core API so that config_help is a
  function rather than a variable (see V3 suggestions below).
Suggestions for filters
-----------------------

* Add shared filter. Take advantage of filter context APIs to open a
  single context into the backend shared among multiple client
  connections. This may even allow a filter to offer a more parallel
  threading model than the underlying plugin.

* CBT filter to track dirty blocks. See these links for inspiration:
  https://www.cloudandheat.com/block-level-data-tracking-using-davice-mappers-dm-era/
  https://github.com/qemu/qemu/blob/master/docs/interop/bitmaps.rst

* masking plugin features for testing clients (see 'nozero' and 'fua'
  filters for examples)

* "bandwidth quota" filter which would close a connection after it
  exceeded a certain amount of bandwidth up or down.

* "forward-only" filter. This would turn random access requests from
  the client into serial requests in the plugin, meaning that the
  plugin could be written to assume that requests only happen from
  beginning to end. This would be useful for plugins that have to
  deal with non-seekable compressed data. Note the filter would have
  to work by caching already-read data in a temporary file.

* nbdkit-cache-filter should handle ENOSPC errors automatically by
  reclaiming blocks from the cache

* nbdkit-cache-filter could use a background thread for reclaiming.

* zstd filter was requested as a way to do what we currently do with
  xz but saving many hours on compression (at the cost of hundreds of
  https://github.com/facebook/zstd/issues/395#issuecomment-535875379

* nbdkit-exitlast-filter could probably use a configurable timeout so
  that there is a grace period in case another connection comes along.

* nbdkit-pause-filter would probably be more useful if the filter
  could inject a flush after pausing. However this requires that
  filter background threads have access to the plugin (see above).

* This filter should also support LUKSv2 (and so should qemu).

* There are some missing features: ESSIV, more ciphers.

* Implement trim and zero if possible.

nbdkit-readahead-filter:

* The filter should open a new connection to the plugin per background
  thread so it is able to work with plugins that use the
  serialize_requests thread model (like curl). At the moment it makes
  requests on the same connection, so it requires plugins to use the
  parallel thread model.

* It should combine (or avoid) overlapping cache requests.
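
Combining overlapping cache requests, as suggested for the readahead
filter, might amount to sorting the pending ranges and coalescing
neighbours. The sketch below is an illustration only: struct
cache_range and merge_ranges are hypothetical names, not the filter's
actual data structures, and it assumes offset + count never overflows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of one pending cache request. */
struct cache_range {
  uint64_t offset;
  uint64_t count;
};

static int
compare_ranges (const void *a, const void *b)
{
  const struct cache_range *ra = a, *rb = b;
  return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

/* Sort the ranges in place, merge any that overlap or touch, and
   return the number of ranges left at the front of the array. */
static size_t
merge_ranges (struct cache_range *r, size_t n)
{
  size_t i, out = 0;

  assert (r != NULL || n == 0);
  if (n == 0)
    return 0;
  qsort (r, n, sizeof r[0], compare_ranges);

  for (i = 1; i < n; i++) {
    uint64_t end = r[out].offset + r[out].count;
    if (r[i].offset <= end) {
      /* Overlapping or adjacent: extend the current range if needed. */
      uint64_t iend = r[i].offset + r[i].count;
      if (iend > end)
        r[out].count = iend - r[out].offset;
    }
    else
      r[++out] = r[i];          /* gap: start a new merged range */
  }
  return out + 1;
}
```

The background thread would then issue one cache request per merged
range instead of one per original request.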
* allow other kinds of traffic shaping such as VBR

* limit traffic per client (ie. per IP address)

* split large requests to avoid long, lumpy sleeps when request size
  is much larger than rate limit
* allow user to specify which errors cause a retry and which ones are
  passed through; for example there's probably no point retrying on

* there are all kinds of extra complications possible here,
  eg. specifying a pattern of retrying and reopening:
  retry-method=RRRORRRRRORRRRR meaning to retry the data command 3
  times, reopen, retry 5 times, etc.
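
A retry-method pattern of this kind could be walked one character at a
time. The sketch below assumes 'R' means retry the data command and
'O' means reopen, as in the example above; the enum and the function
name are hypothetical, and running off the end of the pattern is
treated here as giving up, which is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>

enum retry_action {
  ACTION_RETRY,      /* 'R': retry the data command */
  ACTION_REOPEN,     /* 'O': reopen the plugin connection */
  ACTION_GIVE_UP,    /* pattern exhausted */
  ACTION_INVALID,    /* unknown character in the pattern */
};

/* Return the action at index *pos in the pattern and advance *pos.
   *pos must start at 0 and only ever be advanced by this function,
   so that it never points past the terminating NUL. */
static enum retry_action
next_retry_action (const char *pattern, size_t *pos)
{
  assert (pattern != NULL && pos != NULL);
  switch (pattern[*pos]) {
  case '\0':
    return ACTION_GIVE_UP;
  case 'R':
    (*pos)++;
    return ACTION_RETRY;
  case 'O':
    (*pos)++;
    return ACTION_REOPEN;
  default:
    return ACTION_INVALID;
  }
}
```

Validating the whole string at config time, by walking it until
ACTION_GIVE_UP or ACTION_INVALID, would let bad patterns be rejected
before any client connects.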
* permit hostnames and hostname wildcards to be used in the

* the allow and deny lists should be updatable while nbdkit is
  running, for example by storing them in a database file

nbdkit-extentlist-filter:

* read the extents generated by qemu-img map, allowing extents to be
  ported from a qemu block device

* make non-read-only access safe by updating the extent list when the
  filter sees writes and trims

nbdkit-exportname-filter:

* find a way to call the plugin's .list_exports during .open, so that
  we can enforce exportname-strict=true without command line redundancy

* add a mode for passing in a file containing exportnames in the same
  manner accepted by the sh/eval plugins, rather than one name (and no
  description) per config parameter
Things like filtering by IP address can be done using external
wrappers (TCP wrappers, systemd), or nbdkit-ip-filter.

However it might be nice to have a configurable filter for preventing
valid but not sensible requests. The server already filters invalid
requests. This would be like seccomp, and could be implemented using
an eBPF-based parser. Unfortunately actual eBPF is difficult to use
for userspace processes. The "standard" isn't solidly defined - the
Linux kernel implementation is the standard - and Linux has by far the
best implementation, particularly around bytecode verification and
JITting. There is a userspace VM (ubpf) but it has very limited
capabilities compared to Linux.
Filters allow certain types of composition, but others would not be
possible, for example RAIDing over multiple nbd sources. Because the
plugin API limits us to loading a single plugin to the server, the
best way to do this (and the most robust) is to compose multiple
nbdkit processes. Perhaps libnbd will prove useful for this purpose.
* Figure out how to get 'make distcheck' working. VPATH builds are
  working, but various pkg-config results that try to stick
  bash-completion and ocaml add-ons into their system-wide home do
  not play nicely with --prefix builds for a non-root user.

* Right now, 'make check' builds keys with an expiration of 1 year
  only if they don't exist, and we leave the keys around except under
  'make distclean'. This leads to testsuite failures when
  (re-)building in an incremental tree first started more than a year
  ago. Better would be having make unconditionally run the generator
  scripts, but tweak the scripts themselves to be a no-op unless the
  keys don't exist or have expired.
Currently many features are missing, including:

* Daemonization. This is not really applicable for Windows where you
  would instead want to run nbdkit as a service using something like
  SRVANY. You must use the -f option or one of the other options that

* These options are all unimplemented:
  --exit-with-parent, --group, --run, --selinux-label, --single, --swap,

* Many other plugins and filters.

* errno_is_preserved should use GetLastError and/or WSAGetLastError
  but currently does neither so errors from plugins are probably wrong

* Most tests are skipped because of the missing features above.
From time to time we may update the plugin protocol. This section
collects ideas for things which might be fixed in the next version of

Note that we keep the old protocol(s) around so that source
compatibility is retained. Plugins must opt in to the new protocol
using ‘#define NBDKIT_API_VERSION <version>’.

* All methods taking a ‘count’ field should be uint64_t (instead of
  uint32_t). Although the NBD protocol does not support 64 bit
  lengths, it might do in future.

* v2 .can_zero is tri-state for filters, but bool for plugins; in v3,
  even plugins should get a say on whether to emulate

* v2 .can_extents is bool, and cannot affect our response to
  NBD_OPT_SET_META_CONTEXT (that is, we are blindly emulating extent
  support regardless of the export name). In v3, .can_extents should
  be tri-state, with an explicit way to disable that context on a

* pread could be changed to allow it to support Structured Replies
  (SRs). This could mean allowing it to return partial data, holes,
  zeroes, etc. For a client that negotiates SR coupled with a plugin
  that supports .extents, the v2 protocol would allow us to at least
  synthesize NBD_REPLY_TYPE_OFFSET_HOLE for less network traffic, even
  though the plugin will still have to fully populate the .pread
  buffer; the v3 protocol should make sparse reads more direct.

* Parameters should be systematized so that they aren't just (key,
  value) strings. nbdkit should know the possible keys for the plugin
  and filters, and the type of the values, and both check and parse

* Modify open() API so it takes an export name and tls parameter.

* Modify get_ready() API to pass final selected thread model. Filters
  already get this information, but to date, we have not yet seen a
  reason to add nbdkit_get_thread_model for use by v2 plugins.

* Change config_help from a variable to a function.

* Consider extra parameters to .block_size(). First so that we can
  separately control maximum data request (eg. pread) vs maximum
  virtual request (eg. zero). Note that the NBD protocol does not yet
  support this distinction. We may also need a 'bool strict'
  parameter to specify whether the client requested block size
  information or not. qemu-nbd can distinguish these two cases.

* Renumber thread models. Redefine existing thread models like this:

    model                     existing value    new value
    SERIALIZE_CONNECTIONS     0                 1000
    SERIALIZE_ALL_REQUESTS    1                 2000
    SERIALIZE_REQUESTS        2                 3000

  This allows new thread models to be inserted before and between the
  existing ones. In particular SERIALIZE_RETIREMENT has been
  suggested above (inserted between SERIALIZE_REQUESTS and PARALLEL).
  We could also imagine a thread model more like the one VDDK really
  wants which calls open and close from the main thread.

  For backwards compatibility the old numbers 0-3 can be transparently
  mapped to the new values.
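
The transparent mapping of old thread model numbers could be as small
as the sketch below. The V3_ names and map_thread_model are
hypothetical. The values for the three SERIALIZE_ models come from the
table above; the value used for PARALLEL is an assumption that simply
continues the pattern, since it is not listed there.

```c
#include <assert.h>

/* Proposed new-style values (hypothetical identifiers). */
enum {
  V3_SERIALIZE_CONNECTIONS  = 1000,
  V3_SERIALIZE_ALL_REQUESTS = 2000,
  V3_SERIALIZE_REQUESTS     = 3000,
  V3_PARALLEL               = 4000,   /* assumed, not from the table */
};

/* Map an old-style thread model (0-3) to its new value.  Anything
   outside 0-3 is taken to be a new-style value already and is passed
   through unchanged. */
static int
map_thread_model (int model)
{
  static const int new_values[] = {
    V3_SERIALIZE_CONNECTIONS,
    V3_SERIALIZE_ALL_REQUESTS,
    V3_SERIALIZE_REQUESTS,
    V3_PARALLEL,
  };

  assert (sizeof new_values / sizeof new_values[0] == 4);
  if (model >= 0 && model <= 3)
    return new_values[model];
  return model;
}
```

Because the old numbers are all below the smallest new value, the two
ranges cannot collide, which is what makes the mapping safe to apply
unconditionally.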