bugs/issue-62b9a845f0df4d6dc0895d10c0bba2bc5054cfd8.yaml

--- !ditz.rubyforge.org,2008-03-06/issue
title: Parallelize 3D octree traversal
desc: 3D octree traversal should be efficiently parallelized.
type: :task
component: libale
release:
reporter: David Hilvert <dhilvert@auricle.dyndns.org>
status: :unstarted
disposition:
creation_time: 2009-01-12 06:47:27.669390 Z
references: []

id: 62b9a845f0df4d6dc0895d10c0bba2bc5054cfd8
log_events:
- - 2009-01-12 06:50:35.672248 Z
  - David Hilvert <dhilvert@auricle.dyndns.org>
  - created
  - |-
    One possible approach to parallelization would be to process no more than one
    ray at a time in a given OpenCL work group, and to divide the ray selected for
    processing into segments (e.g., according to octree divisions) that are
    handled in parallel within the work group.

    Alternatively, spatially nearby rays could perhaps be processed concurrently
    within a work group, with the effect that accesses to octree elements above a
    certain size could be synchronized (and, of course, accesses below that size
    would be unsynchronized).
- - 2009-01-12 06:55:14.412179 Z
  - David Hilvert <dhilvert@auricle.dyndns.org>
  - commented
  - |-
    If spatial division of a ray were desired for parallelization within a work
    group, the natural division would probably follow a sort order of octree
    subspaces (e.g., by distance from the viewpoint) computed in advance (e.g., on
    the host processor).  (Sorting of this kind is already performed in ALE.)
- - 2009-01-12 10:17:17.224350 Z
  - David Hilvert <dhilvert@auricle.dyndns.org>
  - commented
  - |-
    If spatially nearby rays are processed concurrently within a work group, it
    might be possible to synchronize all octree accesses across rays (i.e.,
    according to an established sort order of octree elements, e.g., by distance
    from the viewpoint), so that each ray accesses the same sequence of octree
    elements.  Such an approach might incur some inefficiency from unnecessary
    accesses, but might still be more efficient overall due to synchronization.
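
    The synchronized variant could be sketched roughly as follows, in plain C
    rather than OpenCL.  This is only an illustration under assumptions: the
    types cell_t and ray_t and the functions sort_cells and traverse_lockstep
    are hypothetical and are not taken from ALE; octree elements are modelled
    as axis-aligned boxes, and the inner loop over rays stands in for the work
    items of one work group.

    ```c
    #include <assert.h>
    #include <math.h>
    #include <stdlib.h>

    /* Hypothetical types for illustration only (not ALE's data structures). */
    typedef struct { float lo[3], hi[3]; float dist; } cell_t; /* octree box */
    typedef struct { float org[3], dir[3]; int hits; } ray_t;

    static int cmp_cell(const void *a, const void *b) {
        float da = ((const cell_t *)a)->dist, db = ((const cell_t *)b)->dist;
        return (da > db) - (da < db);
    }

    /* Host side: order cells front to back by squared distance of the cell
     * centre from the viewpoint. */
    void sort_cells(cell_t *cells, size_t n, const float eye[3]) {
        for (size_t i = 0; i < n; i++) {
            float d2 = 0.0f;
            for (int k = 0; k < 3; k++) {
                float c = 0.5f * (cells[i].lo[k] + cells[i].hi[k]) - eye[k];
                d2 += c * c;
            }
            cells[i].dist = d2;
        }
        qsort(cells, n, sizeof *cells, cmp_cell);
    }

    /* Slab test: does the ray (parameter t >= 0) pass through the box? */
    static int ray_hits_cell(const ray_t *r, const cell_t *c) {
        float t0 = 0.0f, t1 = INFINITY;
        for (int k = 0; k < 3; k++) {
            if (r->dir[k] == 0.0f) {
                if (r->org[k] < c->lo[k] || r->org[k] > c->hi[k]) return 0;
                continue;
            }
            float a = (c->lo[k] - r->org[k]) / r->dir[k];
            float b = (c->hi[k] - r->org[k]) / r->dir[k];
            if (a > b) { float t = a; a = b; b = t; }
            if (a > t0) t0 = a;
            if (b < t1) t1 = b;
            if (t0 > t1) return 0;
        }
        return 1;
    }

    /* Every ray visits the same sorted sequence of cells.  Returns the number
     * of unnecessary accesses (cells visited by a ray that misses them). */
    size_t traverse_lockstep(const cell_t *cells, size_t n_cells,
                             ray_t *rays, size_t n_rays) {
        size_t wasted = 0;
        for (size_t i = 0; i < n_cells; i++) {    /* shared cell sequence */
            for (size_t j = 0; j < n_rays; j++) { /* stands in for work items */
                if (ray_hits_cell(&rays[j], &cells[i])) rays[j].hits++;
                else wasted++;
            }
        }
        return wasted;
    }
    ```

    Because the outer loop is identical for every ray, an OpenCL version might
    load each cell once per work group (e.g., into local memory) instead of
    once per ray; the returned count gives a rough measure of the unnecessary
    accesses traded for that synchronization.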