1 Title: Computing Bandwidth Adjustments
2 Filename: 161-computing-bandwidth-adjustments.txt
11 There is high variance in the performance of the Tor network. Despite
12 our efforts to balance load evenly across the Tor nodes, some nodes are
13 significantly slower and more overloaded than others.
15 Proposal 160 describes how we can augment the directory authorities to
16 vote on measured bandwidths for routers. This proposal describes what
17 goes into the measuring process.
20 2. Measurement Selection
22 The general idea is to determine a load factor representing the ratio
23 of the capacity of measured nodes to the rest of the network. This load
24 factor could be computed from three potentially relevant statistics:
25 circuit failure rates, circuit extend times, or stream capacity.
27 Circuit failure rates and circuit extend times appear to be
28 non-linearly proportional to node load. We've observed that the same
29 nodes when scanned at US nighttime hours (when load is presumably
30 lower) exhibit almost no circuit failure, and significantly faster
31 extend times than when scanned during the day.
33 Stream capacity, however, is much more uniform, even during US
34 nighttime hours. Moreover, it is a more intuitive representation of
35 node capacity, and also less dependent upon distance and latency
36 if amortized over large stream fetches.
39 3. Average Stream Bandwidth Calculation
41 The average stream bandwidths are obtained by dividing the network into
42 slices of 50 nodes each, grouped according to advertised node bandwidth.
44 Two hop circuits are built using nodes from the same slice, and a large
45 file is downloaded via these circuits. The file sizes are set based
46 on node percentile rank as follows:
54 These sizes are based on measurements performed during test scans.
56 This process is repeated until each node has been chosen to participate
57 in at least 5 circuits.
62 The ratios are calculated by dividing each measured value by the
68 After the base ratios are calculated, a second pass is performed
69 to remove any streams with nodes of ratios less than X=0.5 from
70 the results of other nodes. In addition, all outlying streams
71 with capacity of one standard deviation below a node's average
74 The final ratio result will be greater of the unfiltered ratio
75 and the filtered ratio.
78 6. Pseudocode for Ratio Calculation Algorithm
80 Here is the complete pseudocode for the ratio algorithm:
82 Slices = {S | S is 50 nodes of similar consensus capacity}
84 while exists node N in S with circ_chosen(N) < 7:
85 fetch_slice_file(build_2hop_circuit(N, (exit in S)))
87 BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
88 Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
89 Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
91 Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
92 BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
94 Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
95 Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
98 Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices)
99 Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
101 ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
104 7. Security implications
106 The ratio filtering will deal with cases of sabotage by dropping
107 both very slow outliers in stream average calculations, as well
108 as dropping streams that used very slow nodes from the calculation
111 This scheme will not address nodes that try to game the system by
112 providing better service to scanners. The scanners can be detected
113 at the entry by IP address, and at the exit by the destination fetch
116 Measures can be taken to obfuscate and separate the scanners' source
117 IP address from the directory authority IP address. For instance,
118 scans can happen offsite and the results can be rsynced into the
119 authorities. The destination server IP can also change.
121 Neither of these methods are foolproof, but such nodes can already
122 lie about their bandwidth to attract more traffic, so this solution
123 does not set us back any in that regard.
128 Because each slice takes as long as 6 hours to complete, we will want
129 to parallelize as much as possible. This will be done by concurrently
130 running multiple scanners from each authority to deal with different
131 segments of the network. Each scanner piece will continually loop
132 over a portion of the network, outputting files of the form:
134 node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
135 filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
137 The most recent file from each scanner will be periodically gathered
138 by another script that uses them to produce network-wide averages
139 and calculate ratios as per the algorithm in section 6. Because nodes
140 may shift in capacity, they may appear in more than one slice and/or
141 appear more than once in the file set. The most recently measured
142 line will be chosen in this case.
145 9. Integration with Proposal 160
147 The final results will be produced for the voting mechanism
148 described in Proposal 160 by multiplying the derived ratio by
149 the average published consensus bandwidth during the course of the
150 scan, and taking the weighted average with the previous consensus
153 Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
155 The Alpha parameter is a smoothing parameter intended to prevent
156 rapid oscillation between loaded and unloaded conditions. It is
157 currently fixed at 0.333.
159 The Round() step consists of rounding to the 3 most significant figures
160 in base10, and then rounding that result to the nearest 1000, with
161 a minimum value of 1000.
163 This will produce a new bandwidth value that will be output into a
164 file consisting of lines of the form:
166 node_id=<idhex> SP bw=<Bw_new> NL
168 The first line of the file will contain a timestamp in UNIX time()
169 seconds. This will be used by the authority to decide if the
170 measured values are too old to use.
172 This file can be either copied or rsynced into a directory readable
173 by the directory authority.