Documentation/TODO

* Flesh out the child ops some more
  - move away from 'syscall counts' towards 'iteration count'
  - Add more things that a real program would do.
    - add all the ops that tools like fsx do.
    - do file ops on a bunch of trinity test files
    - open->read->close
    - open->mmap->access mem->close
    - sysctl flipper.
    - pick random elevator alg for all queues
    - fork-and-dirty mappings
  - Ability to mark some ops as 'NEEDS_ROOT'.
  - Move the drop privs code from main to just before we start a new child.

* maps.c improvements:
  - Sometimes generate overlapping addresses/lengths when we have ARG_ADDRESS/ARG_ADDRESS2 pairs
  - make sure ARG_ADDRESS only uses addresses from this list, and audit all other mmap/malloc uses
    in sanitise routines.
  - munge lengths when handing them out.
  - more access patterns
  - mmap files
    (we do this already, but don't track it properly)
  - get_map_fragment()
    - mprotect parts of a map
      will need to somehow track what pages are RO/RW etc
  - keep track of holes when munmap'd
    split maps in two ?
    (store original len, and current len)
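
The hole tracking idea could look roughly like the sketch below, which splits
one tracked range into the pieces that survive an munmap. The names map_range
and punch_hole() are hypothetical and are not taken from maps.c.

```c
/* Hypothetical description of one tracked mapping. */
struct map_range {
	unsigned long start;
	unsigned long len;
};

/*
 * Remove [hole_start, hole_start + hole_len) from m.
 * Writes the 0, 1 or 2 surviving pieces into out[] and returns how many.
 */
static int punch_hole(struct map_range m, unsigned long hole_start,
		      unsigned long hole_len, struct map_range out[2])
{
	unsigned long m_end = m.start + m.len;
	unsigned long h_end = hole_start + hole_len;
	int n = 0;

	/* No overlap at all: the map survives untouched. */
	if (hole_len == 0 || h_end <= m.start || hole_start >= m_end) {
		out[0] = m;
		return 1;
	}
	/* Piece to the left of the hole. */
	if (hole_start > m.start) {
		out[n].start = m.start;
		out[n].len = hole_start - m.start;
		n++;
	}
	/* Piece to the right of the hole. */
	if (h_end < m_end) {
		out[n].start = h_end;
		out[n].len = m_end - h_end;
		n++;
	}
	return n;
}
```

Storing the original len alongside the current len, as suggested above, would
be an alternative when only the tail of a map is ever trimmed.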

* munge_process() on child startup
  - replace fork() with random clone()
  - run children in different namespaces, personalities.
  - unshare

* ioctl improvements
  - needs filename globbing for some ioctls
  - Sanitise routines for more ioctls
    - ext4
  - Maybe just make the ioctls be NEEDS_ROOT child ops

* Some debugging enhancements.
  - dump_child_state(childno), to debug a specific child. (Dump all of a child's shm arrays)
  - Make -D use a separate debug log file

* taint postmortem.
  - If we have logging disabled, we have no real clue what happened if we taint and exit.
    Store a ringbuffer of the last few syscalls, and dump it before we exit.
  - Compare timestamp that taint was noticed at, ignore all later.
  - Do taint watching in the child loop too.
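
A minimal sketch of the ringbuffer idea follows. RING_SIZE, struct
syscall_record and the ring_*() helpers are hypothetical names chosen for
illustration, not existing trinity code.

```c
#include <stdio.h>
#include <string.h>

#define RING_SIZE 8	/* hypothetical: how many recent syscalls to keep */

struct syscall_record {
	unsigned int nr;	/* syscall number */
	unsigned long args[6];	/* raw argument values */
};

struct syscall_ring {
	struct syscall_record rec[RING_SIZE];
	unsigned long head;	/* total number of records ever stored */
};

/* Store one record, overwriting the oldest once the ring is full. */
static void ring_store(struct syscall_ring *ring, unsigned int nr,
		       const unsigned long args[6])
{
	struct syscall_record *r = &ring->rec[ring->head % RING_SIZE];

	r->nr = nr;
	memcpy(r->args, args, sizeof(r->args));
	ring->head++;
}

/* Number of valid records currently held. */
static unsigned long ring_count(const struct syscall_ring *ring)
{
	return ring->head < RING_SIZE ? ring->head : RING_SIZE;
}

/* Fetch the i'th oldest surviving record (0 = oldest), or NULL. */
static const struct syscall_record *ring_get(const struct syscall_ring *ring,
					     unsigned long i)
{
	unsigned long count = ring_count(ring);

	if (i >= count)
		return NULL;
	return &ring->rec[(ring->head - count + i) % RING_SIZE];
}

/* Dump oldest to newest; meant to be called just before exiting on taint. */
static void ring_dump(const struct syscall_ring *ring, FILE *out)
{
	unsigned long i, j;

	for (i = 0; i < ring_count(ring); i++) {
		const struct syscall_record *r = ring_get(ring, i);

		fprintf(out, "syscall %u(", r->nr);
		for (j = 0; j < 6; j++)
			fprintf(out, "%s0x%lx", j ? ", " : "", r->args[j]);
		fprintf(out, ")\n");
	}
}
```

One ring per child would avoid locking, at the cost of having to dump every
child's ring when taint is noticed.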

* --dry-run mode.
  - need to work around segvs when we do things like mmap->post and register null maps.

* Rewrite the fd code.
  - kill off NR_FILE_FDS
  - open some files in the child too after forking.
    - this requires a child-local fd mapping table.
      Maybe we can then reduce the size of the shared shm->file_fds
  - When requesting an fd, occasionally generate a new one.
  - Could we do the nftw walks in parallel ?
    That would speed up startup a lot, though we would need to pass the list back up to the main thread somehow.
  - support for multiple victim file parameters
  - When picking a random path, instead of treating the pool of paths as one thing,
    treat it as multiple (/dev, /sys, /proc). And then do a 1-in-3 chance
    of getting one of those. Right now, because there are 5-6 digits' worth of /proc & /sys
    entries, they dominate over the /dev entries.
  - more fd 'types' (fanotify_init)
  - Add a parameter to specify only certain fd types.  --fds=sockets,files
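
The per-pool path selection described above might be sketched as follows:
choose a pool first, then an entry inside it. struct path_pool and
pick_random_path() are hypothetical names, and rand() stands in for whatever
random source the fuzzer really uses.

```c
#include <stdlib.h>

/* Hypothetical: one pool of paths per top-level directory. */
struct path_pool {
	const char *prefix;	/* e.g. "/dev" */
	const char **paths;
	size_t count;
};

/*
 * Pick a pool uniformly first, then an entry inside it, so that a small
 * pool such as /dev gets the same share as the huge /proc and /sys pools.
 * Empty pools are skipped. Returns NULL if every pool is empty.
 */
static const char *pick_random_path(const struct path_pool *pools,
				    size_t nr_pools)
{
	size_t nonempty = 0, i, target;

	for (i = 0; i < nr_pools; i++)
		if (pools[i].count > 0)
			nonempty++;
	if (nonempty == 0)
		return NULL;

	target = (size_t)rand() % nonempty;
	for (i = 0; i < nr_pools; i++) {
		if (pools[i].count == 0)
			continue;
		if (target-- == 0)
			return pools[i].paths[(size_t)rand() % pools[i].count];
	}
	return NULL;	/* not reached */
}
```

With three non-empty pools this gives the 1-in-3 split mentioned above,
regardless of how many entries each pool holds.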

* Change regeneration code.
  - Instead of every n syscalls, make it happen after 15 minutes (but with a minimum of n syscalls)
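
The combined time and count trigger could be expressed as a small predicate.
This is only a sketch: struct regen_state and should_regenerate() are
hypothetical names, and the 15 minute interval is taken from the item above.

```c
#include <stdbool.h>
#include <time.h>

#define REGEN_INTERVAL_SECS (15 * 60)

/* Hypothetical bookkeeping, reset whenever a regeneration happens. */
struct regen_state {
	time_t last_regen;
	unsigned long syscalls_since_regen;
};

/* Regenerate only when both the time and the syscall thresholds are met. */
static bool should_regenerate(const struct regen_state *st, time_t now,
			      unsigned long min_syscalls)
{
	if (st->syscalls_since_regen < min_syscalls)
		return false;
	return difftime(now, st->last_regen) >= REGEN_INTERVAL_SECS;
}
```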

* Pretty-print improvements.
  - decode fd number -> filename in output
  - decode addresses when printing them out to print 'page_rand+4' instead of a hex address.
  - ->decode_argN functions to print decoded flags etc.
  - decode maps.

* Watchdog improvements
  - in main loop, check watchdog is still alive
  - RT watchdog task ?

* filename related issues.
  - filename cache.
    Similar to the socketcache. Create on startup, then on loading, invalidate entries
    that aren't present (/proc/pid etc).
    This should improve reproducibility in some cases. Right now, when a syscall
    says "open the 5231st filename in the list", it differs across runs because we're
    rebuilding the list, and the system state means stuff moves around.
  - Add a way to add a filter to filelist.
    i.e., something like --filter-files=*[0-9]* to ignore all files with numbers in them.
    Maybe also make this a config file parser.
  - Dump filelist to a logfile. (Perhaps this ties in with the idea above to cache the filelist?)
  - blacklist filenames for victim path & /proc/self/exe
    - make sure we don't call unlink() or rmdir() on them
    - also need to watch /proc/$$/exe, look up using shm->pids.
  - file list struct extensions
    - use count
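
The filelist filter could be built on POSIX fnmatch(), as in the sketch below.
filter_filelist() is a hypothetical helper, and --filter-files is the proposed
option from the item above, not an existing flag.

```c
#include <fnmatch.h>
#include <stddef.h>

/*
 * Drop every entry that matches the shell-style pattern, compacting the
 * array in place. Returns the new number of entries.
 * Flags are 0, so '*' is allowed to match '/' as well.
 */
static size_t filter_filelist(const char **files, size_t count,
			      const char *pattern)
{
	size_t in, out = 0;

	for (in = 0; in < count; in++)
		if (fnmatch(pattern, files[in], 0) != 0)
			files[out++] = files[in];
	return out;
}
```

With the pattern *[0-9]* this drops /proc/1234/stat but keeps /dev/null.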

* Networking improvements.
  - Rewrite socket generation.
    Organise into (sorted) per-protocol buckets of linked lists.
    - Search buckets for dupes before adding.
  - for syscalls that take a fd and a sockaddr, look up the triplet and match.
  - Flesh out sockaddr/socket gen for all remaining protocols.
  - setsockopt on network sockets when we regenerate
    Disabled right now, because it causes some socket types to hang.
  - specify an ip of a victim machine (Maybe also config file)
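
One possible shape for the bucketed socket list is sketched below. It assumes
"per-protocol" means one bucket per protocol family, with each bucket kept
sorted by (type, protocol). NR_FAMILIES, struct sock_entry and add_socket()
are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_FAMILIES 64	/* hypothetical upper bound on family numbers */

struct sock_entry {
	struct sock_entry *next;
	int type;
	int protocol;
	int fd;
};

/* One sorted singly linked list per protocol family. */
static struct sock_entry *buckets[NR_FAMILIES];

/*
 * Insert a socket into its family bucket, keeping the list sorted by
 * (type, protocol). Returns false for a duplicate triplet, an out of
 * range family, or allocation failure.
 */
static bool add_socket(int family, int type, int protocol, int fd)
{
	struct sock_entry **pp, *e;

	if (family < 0 || family >= NR_FAMILIES)
		return false;

	for (pp = &buckets[family]; *pp != NULL; pp = &(*pp)->next) {
		const struct sock_entry *cur = *pp;

		if (cur->type == type && cur->protocol == protocol)
			return false;	/* dupe: triplet already present */
		if (cur->type > type ||
		    (cur->type == type && cur->protocol > protocol))
			break;		/* found the sorted insertion point */
	}

	e = malloc(sizeof(*e));
	if (e == NULL)
		return false;
	e->type = type;
	e->protocol = protocol;
	e->fd = fd;
	e->next = *pp;
	*pp = e;
	return true;
}
```

Because the walk that finds the insertion point also visits every smaller
entry, the dupe search costs nothing extra.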

* Improve the ->post routine to walk a list of objects that we allocated during a
  syscall's ->sanitise.
  - On return from the syscall, we don't call the destructor immediately.
    We pick a small random number, and do N other syscalls before we do the destruction.
    This requires us to create a list of work to be done later, along with a pointer
    to any allocated data.
  - some ancillary data needs to be destroyed immediately after the syscall
    (it's pointless keeping around mangled pathnames for eg).
    For this, we just destroy it in ->post
  - Right now ->sanitise routines have to pick either a map, or malloc itself and
    do the right thing to free it in ->post. By tagging what the allocation type was in
    generic-sanitise, we can do multiple types.
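
The delayed destruction list might be sketched like this. Every name here
(enum alloc_type, struct deferred_free, defer_destroy(), run_deferred()) is
hypothetical, and the caller is assumed to pick the small random delay.

```c
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical tag recording how an object was allocated. */
enum alloc_type { ALLOC_MALLOC, ALLOC_MAP };

struct deferred_free {
	struct deferred_free *next;
	void *ptr;
	size_t len;		/* only used for ALLOC_MAP */
	enum alloc_type type;
	unsigned int countdown;	/* syscalls left before destruction */
};

static void destroy_now(void *ptr, size_t len, enum alloc_type type)
{
	if (type == ALLOC_MAP)
		munmap(ptr, len);
	else
		free(ptr);
}

/* Queue an object to be destroyed after 'delay' further syscalls. */
static void defer_destroy(struct deferred_free **list, void *ptr, size_t len,
			  enum alloc_type type, unsigned int delay)
{
	struct deferred_free *d = malloc(sizeof(*d));

	if (d == NULL) {
		/* Out of memory: fall back to immediate destruction. */
		destroy_now(ptr, len, type);
		return;
	}
	d->ptr = ptr;
	d->len = len;
	d->type = type;
	d->countdown = delay;
	d->next = *list;
	*list = d;
}

/* Call once after every syscall. Returns how many objects were destroyed. */
static unsigned int run_deferred(struct deferred_free **list)
{
	struct deferred_free **pp = list;
	unsigned int destroyed = 0;

	while (*pp != NULL) {
		struct deferred_free *d = *pp;

		if (d->countdown > 0)
			d->countdown--;
		if (d->countdown > 0) {
			pp = &d->next;	/* not due yet, keep it */
			continue;
		}
		*pp = d->next;		/* unlink, then destroy */
		destroy_now(d->ptr, d->len, d->type);
		free(d);
		destroyed++;
	}
	return destroyed;
}
```

Ancillary data that must go away immediately would bypass the list and be
freed directly in ->post, as noted above.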

* Perform some checks on return from syscall
  - check padding between struct members is zeroed.
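
A padding check could be written generically against a table of member
offsets, as in this sketch. struct member_span and padding_is_zeroed() are
hypothetical; the table would have to be written per struct, for example with
offsetof() and sizeof().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one struct member: where it is, how big. */
struct member_span {
	size_t offset;
	size_t size;
};

/*
 * Return true if every byte of obj that is not covered by a member is 0.
 * members[] must be sorted by offset.
 */
static bool padding_is_zeroed(const void *obj, size_t obj_size,
			      const struct member_span *members,
			      size_t nr_members)
{
	const unsigned char *bytes = obj;
	size_t pos = 0, i;

	for (i = 0; i <= nr_members; i++) {
		/* The gap ends at the next member, or at the end of obj. */
		size_t gap_end = (i < nr_members) ? members[i].offset : obj_size;

		for (; pos < gap_end; pos++)
			if (bytes[pos] != 0)
				return false;
		if (i < nr_members)
			pos = members[i].offset + members[i].size;
	}
	return true;
}
```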

* Output errno distribution on exit
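
A minimal sketch of errno accounting follows: count each errno as syscalls
fail, then print the non-zero counters on exit. MAX_ERRNO, record_errno() and
dump_errno_distribution() are hypothetical names.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_ERRNO 4096	/* hypothetical bound; larger values are ignored */

static unsigned long errno_counts[MAX_ERRNO];

/* Call with the errno of every failed syscall. */
static void record_errno(int err)
{
	if (err > 0 && err < MAX_ERRNO)
		errno_counts[err]++;
}

/* Print one line per errno seen. Returns how many distinct errnos that was. */
static unsigned int dump_errno_distribution(FILE *out)
{
	unsigned int distinct = 0;
	int err;

	for (err = 1; err < MAX_ERRNO; err++) {
		if (errno_counts[err] == 0)
			continue;
		fprintf(out, "%-6d %-40s %lu\n", err, strerror(err),
			errno_counts[err]);
		distinct++;
	}
	return distinct;
}
```

With multiple children the counters would need to live in shared memory or be
summed per child before printing.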

* allow for multiple -G's (after there is more than 'vm')

* audit which syscalls never succeed, and write sanitise routines for them

* if a read() blocks and we get a kill from the watchdog, blacklist (close?) that fd/filename.

* Some of the syscalls marked AVOID are done so for good reason.
  - Revisit fuzzing ptrace.
    - It's disabled currently because of situations like:
      child a traces child b
      child a segfaults
      child b never proceeds, and doesn't get untraced.

* Further syscall annotation improvements
  - Finish annotating syscall return types

* structured logging.
  - To begin with, in parallel with existing text based logging.
  - Basic premise is that we store records of each syscall in a manner that would
    allow easier replay of logs.
    - For eg, if a param is an fd, we store the type (socket/file/etc..)
      as well as a pathname/socket triplet/whatever to create it.
  - Eventually, kill off the text based logging, and replace it with
    ./trinity --parselog=mylog.bin
  - Done correctly, this should allow automated bisecting of replays.
    - Different replay strategies:
      - replay log in reverse order
      - brute force replay using 1 call at a time from beginning of log + last syscall.
        (possibly unnecessary if the above strategies are good enough)
  - Once bisected, have a tool that can parse the binary log, and generate C.
  - Would need a separate binary logfile for each child.
    (Locking on a shared file would slow things down and effectively single-thread
    things, unless the children pass records to a separate logger thread, which
    has its own problems, like potentially losing the last N syscalls if we crash.)
    - To begin with, just allow replay/bisect using one child process.
      Synchronising threads across different runs may be complicated.

* Misc cleanups
  - syscall.c:syscall32(): hide the assembly in a macro in arch-*.h and only do it
    #ifdef HAVE_SYSCALL_32
  - Move arch specific syscalls into syscalls/arch/
  - Move addresses in get_interesting_value() to a function in per-arch headers.

* watch dmesg buffer for interesting kernel messages and halt if necessary. Lockdep for eg.
  - Pause on oops.
    Sometimes we might want to read trinity state when we trigger a bad event.

* Blocked child improvements.
  - if we find a blocking fd, check if it's a socket, and shutdown() it.
    (tricky: we need to do the shutdown in the main process, and then tell other children)

* make -p take an arg for seconds