--- !ditz.rubyforge.org,2008-03-06/issue
title: Parallelize 3D octree traversal
desc: 3D octree traversal should be efficiently parallelized.
reporter: David Hilvert <dhilvert@auricle.dyndns.org>
creation_time: 2009-01-12 06:47:27.669390 Z
id: 62b9a845f0df4d6dc0895d10c0bba2bc5054cfd8
- - 2009-01-12 06:50:35.672248 Z
- David Hilvert <dhilvert@auricle.dyndns.org>
One possible approach to parallelization would be to process only one ray at a
time in a given OpenCL work group, and to divide that ray into segments (e.g.,
according to octree divisions) that are processed in parallel within the work
group.
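
As a rough illustration of dividing a ray according to octree divisions, the
sketch below clips a ray to an axis-aligned octree node and cuts it at the
node's three mid-planes, so that each resulting segment lies within a single
octant and could be handed to a separate work item. The function name, the
tuple-based representation, and the choice of Python are assumptions made for
illustration only; none of this is taken from ALE.

```python
def split_ray_at_midplanes(origin, direction, box_min, box_max):
    """Return (t_start, t_end) ray segments inside the box, cut at its mid-planes.

    Hypothetical helper: points on the ray are origin + t * direction, t >= 0,
    and direction is assumed to be nonzero.
    """
    t_enter, t_exit = 0.0, float("inf")
    axes = list(zip(origin, direction, box_min, box_max))

    # Slab test: clip the ray to the node's bounding box.
    for o, d, lo, hi in axes:
        if d == 0.0:
            if o < lo or o > hi:
                return []  # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter = max(t_enter, t0)
        t_exit = min(t_exit, t1)
    if t_enter >= t_exit:
        return []  # the ray misses the node

    # Cut at each mid-plane crossed strictly inside the clipped interval.
    cuts = {t_enter, t_exit}
    for o, d, lo, hi in axes:
        if d != 0.0:
            t_mid = ((lo + hi) / 2.0 - o) / d
            if t_enter < t_mid < t_exit:
                cuts.add(t_mid)

    ordered = sorted(cuts)
    return list(zip(ordered[:-1], ordered[1:]))
```

Applying the same function recursively to the child node that contains each
segment would refine the division further; whether that granularity is
worthwhile would depend on the work-group size.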
Alternatively, spatially nearby rays could perhaps be processed concurrently
within a work group, with the effect that accesses to octree nodes above a
certain size could be synchronized (and, of course, below a certain size would be
- - 2009-01-12 06:55:14.412179 Z
- David Hilvert <dhilvert@auricle.dyndns.org>
If a ray is to be divided spatially for parallelization within a work group,
the natural division would probably follow an ordering of octree subspaces
(e.g., by distance from the viewpoint) that has already been computed (e.g.,
on the host processor). (Sorting of this kind is already performed in ALE.)
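
A minimal host-side sketch of this idea follows: octree subspaces are ordered
by distance from the viewpoint, and the ordered list is cut into contiguous
chunks, one per work item. It assumes that each subspace is represented by its
center point and that Euclidean distance to the center is the sort key; both
are illustrative assumptions, as is the function name, and the sort that ALE
actually performs may differ.

```python
import math


def sort_and_partition(centers, viewpoint, num_work_items):
    """Order subspaces by distance from the viewpoint, then split into chunks.

    Hypothetical helper: returns one list of subspace indices per work item,
    nearest subspaces first. Trailing chunks may be shorter or empty.
    """
    # Host-side sort: nearest subspace first.
    order = sorted(
        range(len(centers)),
        key=lambda i: math.dist(centers[i], viewpoint),
    )

    # Contiguous chunks preserve the sort order within and across work items.
    chunk = math.ceil(len(order) / num_work_items) if order else 0
    return [order[k * chunk:(k + 1) * chunk] for k in range(num_work_items)]
```

Because the chunks are contiguous in the sorted order, per-chunk results can be
combined front to back without re-sorting.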
- - 2009-01-12 10:17:17.224350 Z
- David Hilvert <dhilvert@auricle.dyndns.org>
If spatially nearby rays are processed concurrently within a work group, note
that it might be possible to synchronize all octree accesses across rays
(i.e., according to an established sort order of octree elements, e.g., by
distance from the viewpoint), so that processing for each ray accesses the same
sequence of octree elements. Such an approach might incur some inefficiency
from unnecessary accesses, but might still be more efficient overall due to