3 Tor directory protocol for 0.1.1.x series
5 0. Scope and preliminaries
7 This document should eventually be merged to replace and supplement the
8 existing notes on directories in tor-spec.txt.
10 This is not a finalized version; what we actually wind up implementing
11 may be different from the system described here.
15 There are several problems with the way Tor handles directory information
16 in version 0.1.0.x and earlier. Here are the problems we try to fix with
17 this new design, already partially implemented in 0.1.1.x:
18 1. Directories are very large and use up a lot of bandwidth: clients
19 download descriptors for all routers several times an hour.
20 2. Every directory authority is a trust bottleneck: if a single
21 directory authority lies, it can make clients believe for a time an
22 arbitrarily distorted view of the Tor network.
23 3. Our current "verified server" system is kind of nonsensical.
24 4. Getting more directory authorities adds more points of failure and
25 worsens possible partitioning attacks.
27 There are two problems that remain unaddressed by this design.
28 5. Requiring every client to know about every router won't scale.
29 6. Requiring every directory cache to know every router won't scale.
33 There is a small set (say, around 10) of semi-trusted directory
34 authorities. A default list of authorities is shipped with the Tor
35 software. Users can change this list, but are encouraged not to do so, in
36 order to avoid partitioning attacks.
38 Routers periodically upload signed "descriptors" to the directory
39 authorities describing their keys, capabilities, and other information.
40 Routers may act as directory mirrors (also called "caches"), to reduce
41 load on the directory authorities. They announce this in their descriptors.
44 Each directory authority periodically generates and signs a compact
45 "network status" document that lists that authority's view of the current
46 descriptors and status for known routers, but which does not include the
47 descriptors themselves.
49 Directory mirrors download, cache, and re-serve network-status documents to clients.
52 Clients, directory mirrors, and directory authorities all use
53 network-status documents to find out when their list of routers is
54 out-of-date. If it is, they download any missing router descriptors.
55 Clients download missing descriptors from mirrors; mirrors and authorities
56 download from authorities. Descriptors are downloaded by the hash of the
57 descriptor, not by the server's identity key: this prevents servers from
58 attacking clients by giving them descriptors nobody else uses.
60 All directory information is uploaded and downloaded with HTTP.
62 Coordination among directory authorities is done client-side: clients
63 run a vote-like algorithm over the network-status documents they
64 have, and base their decisions on the result.
66 1.1. What's different from 0.1.0.x?
68 Clients used to download a signed concatenated set of router descriptors
69 (called a "directory") from directory mirrors, regardless of which
70 descriptors had changed.
72 Between downloading directories, clients would download "network-status"
73 documents that would list which servers were supposed to be running.
75 Clients would always believe the most recently published network-status
76 document they were served.
78 Routers used to upload fresh descriptors all the time, whether their keys
79 and other information had changed or not.
83 The router descriptor format is unchanged from tor-spec.txt.
85 ORs SHOULD generate a new router descriptor whenever any of the
86 following events have occurred:
88 - A period of time (18 hrs by default) has passed since the last
89 time a descriptor was generated.
91 - A descriptor field other than bandwidth or uptime has changed.
93 - Bandwidth has changed by more than +/- 50% from the last time a
94 descriptor was generated, and at least a given interval of time
95 (20 mins by default) has passed since then.
97 - Its uptime has been reset (by restarting).
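
The regeneration rules above can be sketched as a single predicate. This is only an illustration, not the Tor implementation: the DescriptorState record and the should_regenerate() function are hypothetical names, and times are assumed to be seconds since some epoch.

```python
from dataclasses import dataclass

REGENERATE_INTERVAL = 18 * 60 * 60   # "18 hrs by default"
BANDWIDTH_MIN_INTERVAL = 20 * 60     # "20 mins by default"


@dataclass
class DescriptorState:
    """Hypothetical record of what the last generated descriptor contained."""
    generated_at: float
    bandwidth: int
    other_fields: tuple  # every field other than bandwidth and uptime


def should_regenerate(last, now, bandwidth, other_fields, restarted):
    """Return True if any of the four listed events has occurred."""
    elapsed = now - last.generated_at
    if elapsed >= REGENERATE_INTERVAL:
        return True
    if other_fields != last.other_fields:
        return True
    if restarted:  # uptime has been reset
        return True
    if last.bandwidth > 0:
        change = abs(bandwidth - last.bandwidth) / last.bandwidth
    else:
        change = float("inf") if bandwidth else 0.0
    # Bandwidth changes count only once the minimum interval has passed.
    return change > 0.5 and elapsed >= BANDWIDTH_MIN_INTERVAL
```

The bandwidth test is deliberately the last one, since it is the only rule that combines a threshold with a waiting period.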
99 After generating a descriptor, ORs upload it to every directory
100 authority they know, by posting it to the URL
102 http://<hostname>/tor/
104 3. Network status format
106 Directory authorities generate, sign, and compress network-status
107 documents. Directory servers SHOULD generate a fresh network-status
108 document when the contents of such a document would be different from the
109 last one generated, and some time (at least one second, possibly longer)
110 has passed since the last one was generated.
112 The network status document contains a preamble, a set of router status
113 entries, and a signature, in that order.
115 We use the same meta-format as used for directories and router descriptors
116 in "tor-spec.txt". Implementations MAY insert blank lines
117 for clarity between sections; these blank lines are ignored.
118 Implementations MUST NOT depend on blank lines in any particular location.
120 As used here, "whitespace" is a sequence of 1 or more tab or space characters.
123 The preamble contains:
125 "network-status-version" -- A document format version. For this
126 specification, the version is "2".
127 "dir-source" -- The authority's hostname, current IP address, and
128 directory port, all separated by whitespace.
129 "fingerprint" -- A base16-encoded hash of the signing key's
130 fingerprint, with no additional spaces added.
131 "contact" -- An arbitrary string describing how to contact the
132 directory server's administrator. Administrators should include at
133 least an email address and a PGP fingerprint.
134 "dir-signing-key" -- The directory server's public signing key.
135 "client-versions" -- A comma-separated list of recommended client versions.
137 "server-versions" -- A comma-separated list of recommended server versions.
139 "published" -- The publication time for this network-status object.
140 "dir-options" -- A set of flags, in any order, separated by whitespace:
141 "Names" if this directory authority performs name bindings.
142 "Versions" if this directory authority recommends software versions.
144 The dir-options entry is optional. The "-versions" entries are required if
145 the "Versions" flag is present. The other entries are required and must
146 appear exactly once. The "network-status-version" entry must appear first;
147 the others may appear in any order. Implementations MUST ignore
148 additional arguments to the items above, and MUST ignore unrecognized flags.
151 For each router, the router entry contains: (This format is designed for conciseness.)
154 "r" -- followed by the following elements, in order, separated by whitespace:
157 - A hash of its identity key, encoded in base64, with trailing = signs removed.
159 - A hash of its most recent descriptor, encoded in base64, with
160 trailing = signs removed. (The hash is calculated as for
161 computing the signature of a descriptor.)
162 - The publication time of its most recent descriptor, in the form
163 YYYY-MM-DD HH:MM:SS, in GMT.
166 - A directory port (or "0" for none)
167 "s" -- A series of whitespace-separated status flags, in any order:
168 "Authority" if the router is a directory authority.
169 "Exit" if the router is useful for building general-purpose exit circuits.
171 "Fast" if the router is suitable for high-bandwidth circuits.
172 "Guard" if the router is suitable for use as an entry guard.
173 (Currently, this means 'fast' and 'stable'.)
174 "Named" if the router's identity-nickname mapping is canonical,
175 and this authority binds names.
176 "Stable" if the router is suitable for long-lived circuits.
177 "Running" if the router is currently usable.
178 "Valid" if the router has been 'validated'.
179 "V2Dir" if the router implements this protocol.
181 The "r" entry for each router must appear first and is required. The
182 "s" entry is optional. Unrecognized flags and extra elements on the
183 "r" line must be ignored.
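
Two details of the router entry format are easy to get wrong: the base64 hashes have their trailing = signs removed, and unrecognized flags must be ignored. The sketch below shows one way to handle both. The function names are hypothetical, and it deliberately does not parse a complete "r" line.

```python
import base64

# Flags defined for the "s" line in this document.
KNOWN_FLAGS = {"Authority", "Exit", "Fast", "Guard", "Named",
               "Stable", "Running", "Valid", "V2Dir"}


def decode_unpadded_base64(text):
    """Decode base64 whose trailing '=' padding has been removed."""
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)


def parse_status_flags(line):
    """Return the recognized flags on an "s" line, dropping unknown ones."""
    keyword, *flags = line.split()
    if keyword != "s":
        raise ValueError("not an 's' line")
    return {flag for flag in flags if flag in KNOWN_FLAGS}
```

For example, parse_status_flags("s Fast Running Bogus") keeps only Fast and Running.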
185 The signature section contains:
187 "directory-signature" -- A signature of the rest of the document using
188 the directory authority's signing key.
190 We compress the network status list with zlib before transmitting it.
192 3.1. Establishing server status
194 [[XXXXX Describe how authorities actually decide Fast, Named, Stable,
197 For each OR, a directory server remembers whether the OR was running and
198 functional the last time the server tried to connect to it, and possibly other
199 liveness information.
201 Directory server administrators may label some servers or IPs as
202 blacklisted, and elect not to include them in their network-status lists.
204 Thus, the network-status list includes all non-blacklisted,
205 non-expired, non-superseded descriptors for ORs that the directory has
206 observed at least once to be running.
208 Directory server administrators may decide to support name binding. If
209 they do, then they must maintain a file of nickname-to-identity-key
210 mappings, and try to keep this file consistent with other directory
211 servers. If they don't, they act as clients, and report bindings made by
212 other directory servers (name X is bound to identity Y if at least one
213 binding directory lists it, and no directory binds X to some other Y'.)
217 4. Directory server operation
219 All directory authorities and directory mirrors ("directory servers")
220 implement this section, except as noted.
222 4.1. Accepting uploads (authorities only)
224 When a router posts a signed descriptor to a directory authority, the
225 authority first checks whether it is well-formed and correctly
226 self-signed. If it is, the authority next verifies that the nickname in
227 question is not already assigned to a router with a different public key.
228 Finally, the authority MAY check that the router is not blacklisted
229 because of its key, IP, or another reason.
231 If the descriptor passes these tests, and the authority does not already
232 have a descriptor for a router with this public key, it accepts the
233 descriptor and remembers it.
235 If the authority _does_ have a descriptor with the same public key, the
236 newly uploaded descriptor is remembered if its publication time is more
237 recent than the most recent old descriptor for that router, and either:
238 - There are non-cosmetic differences between the old descriptor and the new one.
240 - Enough time has passed between the descriptors' publication times.
241 (Currently, 12 hours.)
243 Differences between router descriptors are "non-cosmetic" if they would be
244 sufficient to force an upload as described in section 2 above.
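
A minimal sketch of the replacement rule above, assuming publication times are given in seconds and that the caller has already decided whether the differences are non-cosmetic. The function name is hypothetical.

```python
MIN_REPLACE_INTERVAL = 12 * 60 * 60  # "Currently, 12 hours."


def should_remember_upload(old_published, new_published, non_cosmetic_change):
    """Decide whether an upload replaces the descriptor already held."""
    if new_published <= old_published:
        return False  # not more recent than what the authority has
    if non_cosmetic_change:
        return True
    return new_published - old_published >= MIN_REPLACE_INTERVAL
```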
246 Note that the "cosmetic difference" test only applies to uploaded
247 descriptors, not to descriptors that the authority downloads from other authorities.
250 4.2. Downloading network-status documents
252 All directory servers (authorities and mirrors) try to keep a fresh
253 set of network-status documents from every authority. To do so,
254 every 5 minutes, each authority asks every other authority for its
255 most recent network-status document. Every 15 minutes, each mirror
256 picks a random authority and asks it for the most recent network-status
257 documents for all the authorities the authority knows about (including
258 the chosen authority itself).
260 Authorities and mirrors remember and serve the most recent
261 network-status document they have from each authority. Other
262 network-status documents don't need to be stored. If the most recent
263 network-status document is over 10 days old, it is discarded anyway.
264 Mirrors SHOULD store and serve network-status documents from authorities
265 they don't recognize, but SHOULD NOT use such documents for any other purpose.
268 4.3. Downloading and storing router descriptors
270 Periodically (currently, every 10 seconds), directory servers check
271 whether there are any specific descriptors (as identified by descriptor
272 hash in a network-status document) that they do not have and that they
273 are not currently trying to download.
275 If so, the directory server launches requests to the authorities for these
276 descriptors, such that each authority is only asked for descriptors listed
277 in its most recent network-status. When more than one authority lists the
278 descriptor, we choose which to ask at random.
280 If one of these downloads fails, we do not try to download that descriptor
281 from the authority that failed to serve it again unless we receive a newer
282 network-status from that authority that lists the same descriptor.
284 Directory servers must potentially cache multiple descriptors for each
285 router. Servers must not discard any descriptor listed by any current
286 network-status document from any authority. If there is enough space to
287 store additional descriptors [XXXXXX then how do we pick.]
289 Authorities SHOULD NOT download descriptors for routers that they would
290 immediately reject for reasons listed in 3.1.
294 "Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
296 The authoritative network-status published by a host should be available at:
297 http://<hostname>/tor/status/authority.z
299 The network-status published by a host with fingerprint
300 <F> should be available at:
301 http://<hostname>/tor/status/fp/<F>.z
303 The network-status documents published by hosts with fingerprints
304 <F1>,<F2>,<F3> should be available at:
305 http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z
307 The most recent network-status documents from all known authorities,
308 concatenated, should be available at:
309 http://<hostname>/tor/status/all.z
311 The most recent descriptor for a server whose identity key has a
312 fingerprint of <F> should be available at:
313 http://<hostname>/tor/server/fp/<F>.z
315 The most recent descriptors for servers with identity fingerprints
316 <F1>,<F2>,<F3> should be available at:
317 http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
319 (NOTE: Implementations SHOULD NOT download descriptors by identity key
320 fingerprint. This allows a corrupted server (in collusion with a cache) to
321 provide a unique descriptor to a client, and thereby partition that client
322 from the rest of the network.)
324 The server descriptor with (descriptor) digest <D> (in hex) should be available at:
326 http://<hostname>/tor/server/d/<D>.z
328 The most recent descriptors with digests <D1>,<D2>,<D3> should be available at:
330 http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
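
As an illustration, a client could build the by-digest URL like this. The function name is hypothetical. Upper-casing the hex digits extends to digests the advice this document gives for fingerprints, which is an assumption.

```python
def descriptor_digest_url(hostname, digests, compressed=True):
    """Build the URL for fetching descriptors by hex descriptor digest."""
    if not digests:
        raise ValueError("at least one digest is required")
    joined = "+".join(digest.upper() for digest in digests)
    suffix = ".z" if compressed else ""
    return f"http://{hostname}/tor/server/d/{joined}{suffix}"
```

Passing compressed=False yields the non-compressed form, that is, the same URL without the final ".z".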
332 The most recent descriptor for this server should be at:
333 http://<hostname>/tor/server/authority.z
334 [Nothing in the Tor protocol uses this resource yet, but it is useful
335 for debugging purposes. Also, the official Tor implementations
336 (starting at 0.1.1.x) use this resource to test whether a server's
337 own DirPort is reachable.]
339 A concatenated set of the most recent descriptors for all known servers
340 should be available at:
341 http://<hostname>/tor/server/all.z
343 For debugging, directories SHOULD expose non-compressed objects at URLs like
344 the above, but without the final ".z".
346 Clients MUST handle compressed concatenated information in two forms:
347 - A concatenated list of zlib-compressed objects.
348 - A zlib-compressed concatenated list of objects.
349 Directory servers MAY generate either format: the former requires less
350 CPU, but the latter requires less bandwidth.
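
One way a client might accept both forms with a single routine is to decompress one zlib stream at a time and continue with whatever bytes are left over. This is a sketch using Python's standard zlib module; the function name is hypothetical.

```python
import zlib


def decompress_concatenated(data):
    """Decompress one or more back-to-back zlib streams into one byte string."""
    pieces = []
    while data:
        stream = zlib.decompressobj()
        pieces.append(stream.decompress(data))
        pieces.append(stream.flush())
        if not stream.eof:
            raise ValueError("truncated zlib stream")
        # Bytes after the end of this stream belong to the next object.
        data = stream.unused_data
    return b"".join(pieces)
```

A single zlib-compressed concatenation is just the special case where the loop runs once.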
352 Clients SHOULD use upper case letters (A-F) when base16-encoding
353 fingerprints. Servers MUST accept both upper and lower case fingerprints.
356 5. Client operation: downloading information
358 Every Tor that is not a directory server (that is, clients and ORs that do
359 not have a DirPort set) implements this section.
361 5.1. Downloading network-status documents
363 Each client maintains an ordered list of directory authorities.
364 Insofar as possible, clients SHOULD all use the same ordered list.
366 Clients check whether they have enough recently published network-status
367 documents (currently, this means that they must have a network-status
368 published within the last 48 hours for over half of the authorities).
369 If they do not, they download enough network-status documents so that this is the case.
372 Also, if the most recently published network-status document is over 30
373 minutes old, the client downloads a network-status document.
375 When choosing which documents to download, clients treat their list of
376 directory authorities as a circular ring, and begin with the authority
377 appearing immediately after the authority for their most recently
378 published network-status document.
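
The ring traversal can be sketched as follows. The function name is hypothetical, and starting at the head of the list when the client has no network-status yet is an assumption that the text above does not cover.

```python
def authorities_to_try(authorities, most_recent_source, count):
    """Pick `count` authorities, starting just after `most_recent_source`."""
    if not authorities:
        return []
    try:
        start = authorities.index(most_recent_source) + 1
    except ValueError:
        start = 0  # assumption: nothing downloaded yet, start at the head
    size = len(authorities)
    return [authorities[(start + i) % size] for i in range(min(count, size))]
```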
380 If enough mirrors (currently 4) claim not to have a given network status,
381 we stop trying to download that authority's network-status, until we
382 download a new network-status that makes us believe that the authority in question is running again.
385 Network-status documents published over 10 hours in the past are
388 5.2. Downloading router descriptors
390 Clients try to have the best descriptor for each router. A descriptor is "best" if:
392 * it is the most recently published descriptor listed for that router by
393 at least two network-status documents.
394 * OR, no descriptor for that router is listed by two or more
395 network-status documents, and it is the most recently published
396 descriptor listed by any network-status document.
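
A sketch of this rule, assuming the client has collected, for one router, the (digest, publication time) pair from every network-status document that lists it. The function name and the input shape are hypothetical.

```python
from collections import Counter


def best_descriptor(listings):
    """Return the digest of the "best" descriptor for one router, or None.

    `listings` holds one (digest, published) pair per network-status
    document that lists the router.
    """
    if not listings:
        return None
    counts = Counter(listings)
    # Prefer descriptors listed by at least two network-status documents.
    confirmed = [entry for entry, n in counts.items() if n >= 2]
    candidates = confirmed or list(counts)
    digest, _published = max(candidates, key=lambda entry: entry[1])
    return digest
```

Note that a newer descriptor listed by only one document loses to an older one listed by two.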
398 Periodically (currently every 10 seconds) clients check whether there are
399 any "downloadable" descriptors. A descriptor is downloadable if:
400 - It is the "best" descriptor for some router.
401 - The descriptor was published at least 5 minutes (???) in the past.
402 [This prevents clients from trying to fetch descriptors that the
403 mirrors have not yet retrieved and cached.]
404 - The client does not currently have it.
405 - The client is not currently trying to download it.
407 If at least 1/16 of known routers have downloadable descriptors, or if
408 enough time (currently 10 minutes) has passed since the last time the
409 client tried to download descriptors, it launches requests for all
410 downloadable descriptors, as described in 5.3 below.
412 When a descriptor download fails, the client notes it, and does not
413 consider the descriptor downloadable again until a certain amount of time
414 has passed. (Currently 0 seconds for the first failure, 60 seconds for the
415 second, 5 minutes for the third, 10 minutes for the fourth, and 1 day
416 thereafter.) Periodically (currently once an hour) clients reset the failure count.
419 No descriptors are downloaded until the client has downloaded more than
420 half of the network-status documents.
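
The launch condition and the failure back-off described in this section can be sketched with the constants given in the text. The function names are hypothetical, and times are in seconds.

```python
RETRY_DELAYS = (0, 60, 5 * 60, 10 * 60)  # first through fourth failure
LATER_DELAY = 24 * 60 * 60               # "1 day thereafter"
MAX_WAIT = 10 * 60                       # "currently 10 minutes"


def retry_delay(failures):
    """Seconds to wait after the given number of failures for a descriptor."""
    if failures <= 0:
        return 0
    if failures <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failures - 1]
    return LATER_DELAY


def should_launch(downloadable, known_routers, seconds_since_last_attempt):
    """True if at least 1/16 of routers are downloadable or enough time passed."""
    if downloadable == 0:
        return False  # nothing to request
    if downloadable * 16 >= known_routers:
        return True
    return seconds_since_last_attempt >= MAX_WAIT
```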
422 5.3. Managing downloads
424 When a client has no live network-status documents, it downloads
425 network-status documents from a randomly chosen authority. In all other
426 cases, the client downloads from mirrors randomly chosen from among those
427 believed to be V2 directory servers. (This information comes from the
428 network-status documents; see 6 below.)
430 When downloading multiple router descriptors, the client chooses multiple mirrors so that:
432 - At least 3 different mirrors are used, except when this would result
433 in more than one request for under 4 descriptors.
434 - No more than 128 descriptors are requested from a single mirror.
435 - Otherwise, as few mirrors as possible are used.
436 After choosing mirrors, the client divides the descriptors among them
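
The constraints above leave some room for interpretation. The sketch below takes one reading: use three mirrors whenever each request would still carry at least 4 descriptors, never exceed 128 per mirror, and otherwise use as few mirrors as possible. The function names and the round-robin split are assumptions made for illustration.

```python
import math

MAX_PER_MIRROR = 128
MIN_MIRRORS = 3
MIN_PER_REQUEST = 4


def mirrors_needed(descriptor_count):
    """Number of mirrors to use under the reading described above."""
    if descriptor_count <= 0:
        return 0
    required = math.ceil(descriptor_count / MAX_PER_MIRROR)
    # Spread over up to 3 mirrors, but keep at least 4 descriptors each.
    spread = min(MIN_MIRRORS, descriptor_count // MIN_PER_REQUEST)
    return max(1, required, spread)


def split_among(descriptors, mirrors):
    """Divide descriptors round-robin among the chosen mirrors."""
    step = len(mirrors)
    return {mirror: descriptors[i::step] for i, mirror in enumerate(mirrors)}
```

Under this reading, 10 descriptors go to 2 mirrors (5 each) rather than 3, since a three-way split would produce requests for only 3 descriptors.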
439 After receiving any response, the client MUST discard any network-status
440 documents and descriptors that it did not request.
442 6. Using directory information
444 Everyone besides directory authorities uses the approaches in this section
445 to decide which servers to use and what their keys are likely to be.
446 (Directory authorities just believe their own opinions, as in 3.1 above.)
448 6.1. Choosing routers for circuits.
450 Tor implementations only pay attention to "live" network-status documents.
451 A network status is "live" if it is the most recently downloaded network
452 status document for a given directory server, and the server is a
453 directory server trusted by the client, and the network-status document is
454 no more than 2 days old.
456 For time-sensitive information, Tor implementations focus on "recent"
457 network-status documents. A network status is "recent" if it is live, and
458 if it was published in the last 60 minutes. If there are fewer
459 than 3 such documents, the most recently published 3 are "recent." If
460 there are fewer than 3 in all, all are "recent."
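
A sketch of the "recent" rule, assuming the caller passes the publication times (in seconds) of the network-status documents that are already known to be live. The function name is hypothetical.

```python
RECENT_MAX_AGE = 60 * 60  # "published in the last 60 minutes"
MIN_RECENT = 3


def recent_statuses(live_published, now):
    """Return the publication times of the "recent" live network statuses."""
    newest_first = sorted(live_published, reverse=True)
    recent = [t for t in newest_first if now - t <= RECENT_MAX_AGE]
    if len(recent) < MIN_RECENT:
        # Fall back to the 3 most recently published, or all if fewer exist.
        recent = newest_first[:MIN_RECENT]
    return recent
```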
462 Circuits SHOULD NOT be built until the client has enough directory
463 information: at least two live network-status documents, and descriptors
464 for at least 1/4 of the servers believed to be running.
466 A server is "listed" if it is included by more than half of the live
467 network status documents. Clients SHOULD NOT use unlisted servers.
469 A server is "valid" if it is listed as valid by more than half of the live
470 network-status documents. Clients SHOULD NOT use non-valid servers unless
471 specifically configured to do so.
473 A server is "running" if it is listed as running by more than half of the
474 recent network-status documents. Clients SHOULD NOT try to use non-running servers.
477 A server is believed to be a directory mirror if it is listed as a V2
478 directory by more than half of the recent network-status documents.
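
Each of the definitions above is a strict-majority vote over a set of network-status documents. A sketch, assuming every document has been reduced to a mapping from router identity to its set of flags (a hypothetical representation):

```python
def majority_has_flag(statuses, identity, flag=None):
    """True if more than half of `statuses` list `identity` with `flag`.

    With flag=None this tests whether the router is "listed" at all.
    """
    if flag is None:
        votes = sum(1 for status in statuses if identity in status)
    else:
        votes = sum(1 for status in statuses
                    if flag in status.get(identity, ()))
    return votes * 2 > len(statuses)
```

The caller chooses which documents to pass: the live ones for "listed" and "valid", and the recent ones for "running" and for the V2 directory test.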
482 In order to provide human-memorable names for individual server
483 identities, some directory servers bind names to IDs. Clients handle names in two ways:
486 When a client encounters a name it has not mapped before:
488 If all the live "Naming" network-status documents the client has
489 claim that the name binds to some identity ID, and the client has at
490 least three live network-status documents, the client maps the name to ID.
493 If a client encounters a name it has mapped before:
495 It uses the last-mapped identity value, unless all of the "Naming"
496 network status documents that list the name bind it to some other identity.
499 When a user tries to refer to a router with a name that does not have a
500 mapping under the above rules, the implementation SHOULD warn the user.
501 After giving the warning, the implementation MAY use a router that at
502 least one Naming authority maps the name to, so long as no other naming
503 authority maps that name to a different router.
505 6.2. Software versions
507 An implementation of Tor SHOULD warn when it has live network-statuses from
508 more than half of the authorities, and it is running a software version
509 not listed on more than half of the live "Versioning" network-status documents.
514 - Are the magic numbers above sane?
516 - Client-knowledge partitioning is worrisome. Most versions of this
517 don't seem to be worse than the Danezis-Murdoch tracing attack, since
518 an attacker can't do more than deduce probable exits from entries (or
519 vice versa). But what about when the client connects to A and B but in
520 a different order? How bad can it be partitioned based on its