$Id: /tor/branches/eventdns/doc/dir-spec.txt 9469 2006-11-01T23:56:30.179423Z nickm $

Voting on the Tor Directory System

0. Scope and preliminaries

This document describes a consensus voting scheme for Tor directories.
Once it's accepted, it should be merged with dir-spec.txt. Some
preliminaries for authority and caching support should be done during
the 0.1.2.x series; the main deployment should come during the 0.1.3.x
series.

0.1. Goals and motivation: voting.

The current directory system relies on clients downloading separate
network status statements from the caches, one signed by each directory
authority. Clients download a new statement every 30 minutes or so,
choosing to replace the oldest statement they currently have.

This creates a partitioning problem: different clients have different
"most recent" networkstatus sources, and different versions of each
(since authorities change their statements often).

It also creates a scaling problem: most of the downloaded networkstatus
documents are probably quite similar, and the redundancy grows as we add
more authorities.

So if we have clients only download a single multiply-signed consensus
network status statement, we can:
    - Reduce client partitioning
    - Reduce client-side and cache-side storage
    - Simplify client-side voting code (by moving voting away from the
      client)

We should try to do this without:
    - Assuming that client-side or cache-side clocks are more correct
    - Assuming that authority clocks are perfectly correct.
    - Degrading badly if a few authorities die or are offline for a bit.

We do not have to perform well if:
    - No clique of more than half the authorities can agree about who

Instead of publishing a network status whenever something changes,
each authority publishes a fresh network status only once per
"period" (say, 60 minutes). Authorities either upload this network
status (or "vote") to every other authority, or download every other
authority's "vote" (see 3.1 below for discussion on push vs pull).

After an authority has (or has become convinced that it won't be able to
get) every other authority's vote, it deterministically computes a
consensus networkstatus, and signs it. Authorities download (or are
sent; see 3.1) one another's signatures, and form a multiply-signed
consensus. This multiply-signed consensus is what caches cache and what
clients download.

If an authority is down, authorities vote based on what they *can*
download/get uploaded.

If an authority is "a little" down and only some authorities can reach
it, authorities try to get its info from other authorities.

If an authority computes the vote wrong, its signature isn't included on
the consensus.

Clients use a consensus if it is "trusted": signed by more than half the
authorities they recognize. If clients can't find any such consensus,
they use the most recent trusted consensus they have. If they don't
have any trusted consensus, they warn the user and refuse to operate
(and if DirServers is not the default, beg the user to adapt the list
of authorities).
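
The "trusted" rule above can be sketched in Python. This is only an
illustration: it assumes every signature has already been checked
cryptographically and reduced to the signer's nickname, and the function
name is_trusted and the nicknames in the usage note are hypothetical.

```python
from typing import Iterable, Set


def is_trusted(signers: Iterable[str], recognized: Set[str]) -> bool:
    """Return True if more than half of the recognized authorities signed.

    Assumption: `signers` holds only nicknames whose signatures were
    already verified; signature checking itself is out of scope here.
    """
    # Signatures from authorities the client does not recognize count
    # for nothing.
    valid = set(signers) & recognized
    # Strict majority: exactly half is not enough.
    return 2 * len(valid) > len(recognized)
```

With five recognized authorities, is_trusted(["auth1", "auth2", "auth3"],
recognized) holds, while two signatures out of four do not suffice.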

2.1. Vote specifications

Votes in v2.1 are just like v2 network status documents. We add these
fields to the preamble:

    "vote-status" -- the word "vote".

    "valid-until" -- the time when this authority expects to publish its
    next vote.

    "known-flags" -- a space-separated list of flags that will sometimes
    be included on "s" lines later in the vote.

    "dir-source" -- as before, except the "hostname" part MUST be the
    authority's nickname, which MUST be unique among authorities, and
    MUST match the nickname in the "directory-signature" entry.

Authorities SHOULD cache their most recently generated votes so they
can persist them across restarts. Authorities SHOULD NOT generate
another document until valid-until has passed.

Router entries in the vote MUST be sorted in ascending order by router
identity digest. The flags in "s" lines MUST appear in alphabetical
order.

Votes SHOULD be synchronized to half-hour publication intervals (one
hour? XXX say more; be more precise.)

XXXX some way to request older networkstatus docs?

2.2. Consensus directory specifications

Consensuses are like v2.1 votes, except for the following fields:

    "vote-status" -- the word "consensus".

    "published" is the latest of all the published times on the votes.

    "valid-until" is the earliest of all the valid-until times on the
    votes.

    "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
    are included for each authority that contributed to the vote.

    "vote-digest" for each authority that contributed to the vote,
    calculated as for the digest in the signature on the vote. [XXX
    re-English this sentence]

    "client-versions" and "server-versions" are sorted in ascending
    order based on version-spec.txt.

    "dir-options" and "known-flags" are not included.
    [XXX really? why not list the ones that are used in the consensus?
    For example, right now BadExit is in use, but no servers would be
    labelled BadExit, and it's still worth knowing that it was considered
    by the authorities. -RD]
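
As a small illustration of the "published" and "valid-until" rules above,
here is a Python sketch. The function name consensus_times and the
representation of each vote as a (published, valid_until) pair of datetime
values are assumptions made for this example only.

```python
from datetime import datetime
from typing import List, Tuple


def consensus_times(
    votes: List[Tuple[datetime, datetime]]
) -> Tuple[datetime, datetime]:
    """Derive the consensus timestamps from (published, valid_until) pairs.

    Raises ValueError if no votes are given.
    """
    # "published" is the latest published time among the votes.
    published = max(p for p, _ in votes)
    # "valid-until" is the earliest valid-until time among the votes.
    valid_until = min(v for _, v in votes)
    return published, valid_until
```

Taking the earliest valid-until means the consensus never outlives any of
the votes it was built from.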

The fields MUST occur in the following order:
    "network-status-version"

For each authority, sorted in ascending order of nickname,
case-insensitively:
    "dir-source", "fingerprint", "contact", "dir-signing-key",

The signatures at the end of the document appear as multiple instances
of directory-signature, sorted in ascending order by nickname,
case-insensitively.

A router entry should be included in the result if it is included by more
than half of the authorities (total authorities, not just those whose votes
we have). A router entry has a flag set if it is included by more than
half of the authorities who care about that flag. [XXXX this creates an
incentive for attackers to DoS authorities whose votes they don't like.
Can we remember what flags people set the last time we saw them? -NM]
[Which 'we' are we talking here? The end-users never learn which
authority sets which flags. So you're thinking the authorities
should record the last vote they saw from each authority and if it's
within a week or so, count all the flags that it advertised as 'no'
votes? Plausible. -RD]
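
The inclusion and flag rules above might be sketched as follows in Python.
This is one reading of the text, not a reference implementation: it assumes
that an authority "cares about" a flag exactly when the flag appears in its
"known-flags" line, and it models each vote as a hypothetical
(known_flags, routers) pair. Output is sorted by identity digest, with
flags in alphabetical order, as section 2.1 requires for votes.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory model of a vote: the authority's known-flags set,
# plus a map from router identity digest to the flags it assigned.
Vote = Tuple[Set[str], Dict[str, Set[str]]]


def consensus_entries(
    votes: List[Vote], total_authorities: int
) -> List[Tuple[str, List[str]]]:
    # Count how many of the received votes list each router.
    listed: Dict[str, int] = {}
    for _known, routers in votes:
        for digest in routers:
            listed[digest] = listed.get(digest, 0) + 1

    # Every flag that at least one voting authority knows about.
    all_flags: Set[str] = set()
    for known, _routers in votes:
        all_flags |= known

    entries = []
    for digest in sorted(listed):
        # More than half of ALL authorities, not just those whose votes
        # we have.
        if 2 * listed[digest] <= total_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            # Only authorities that know the flag get a say on it.
            carers = [r for known, r in votes if flag in known]
            yes = sum(1 for r in carers if flag in r.get(digest, set()))
            if 2 * yes > len(carers):
                flags.append(flag)
        entries.append((digest, flags))
    return entries
```

Note that a missing vote counts against a router's inclusion but simply
drops out of the flag tally, which is the incentive problem raised in the
bracketed discussion above.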

The signature hash covers from the "network-status-version" line through
the characters "directory-signature" in the first "directory-signature"
line.

Consensus directories SHOULD be rejected if they are not signed by more
than half of the known authorities.

2.2.1. Detached signatures

Assuming full connectivity, every authority should compute and sign the
same consensus directory in each period. Therefore, it isn't necessary to
download the consensus computed by each authority; instead, the authorities
only push/fetch each other's signatures. A "detached signature" document
contains a single "consensus-digest" entry and one or more
directory-signature entries. [XXXX specify more.]

2.3. URLs and timelines

2.3.1. URLs and timeline used for agreement

An authority SHOULD publish its vote immediately at the start of each voting
period. It does this by making it available at
    http://<hostname>/tor/status-vote/current/authority.z
and sending it in an HTTP POST request to each other authority at the URL
    http://<hostname>/tor/post/vote

If, N minutes after the voting period has begun, an authority does not have
a current statement from another authority, the first authority retrieves
the other's statement.

Once an authority has a vote from another authority, it makes it available
at
    http://<hostname>/tor/status-vote/current/<fp>.z
where <fp> is the fingerprint of the other authority's identity key.

The consensus network status, along with as many signatures as the server
currently knows, should be available at
    http://<hostname>/tor/status-vote/current/consensus.z
All of the detached signatures it knows for consensus status should be
available at
    http://<hostname>/tor/status-vote/current/consensus-signatures.z

Once an authority has computed and signed a consensus network status, it
should send its detached signature to each other authority in an HTTP POST
request to
    http://<hostname>/tor/post/consensus-signature
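
For reference, the URLs used during agreement can be collected in one
place. The Python sketch below only assembles strings from the paths given
above; the function name agreement_urls and the dictionary keys are
hypothetical labels, not part of the protocol.

```python
from typing import Dict


def agreement_urls(hostname: str, fp: str) -> Dict[str, str]:
    """Build the agreement-phase URLs for one authority.

    `fp` is the identity-key fingerprint of another authority whose vote
    this authority is mirroring.
    """
    base = "http://%s/tor" % hostname
    current = base + "/status-vote/current"
    return {
        # Documents fetched with HTTP GET.
        "own_vote": current + "/authority.z",
        "other_vote": current + "/%s.z" % fp,
        "consensus": current + "/consensus.z",
        "consensus_signatures": current + "/consensus-signatures.z",
        # Targets for HTTP POST.
        "post_vote": base + "/post/vote",
        "post_signature": base + "/post/consensus-signature",
    }
```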

[XXXX Store votes to disk.]

2.3.2. Serving a consensus directory

Once the authority is done getting signatures on the consensus directory,
it should serve it from:
    http://<hostname>/tor/status/consensus.z

Caches SHOULD download consensus directories from an authority and serve
them from the same URL.

2.3.3. Timeline and synchronization

2.4. Distributing routerdescs between authorities

Consensus will be more meaningful if authorities take steps to make sure
that they all have the same set of descriptors _before_ the voting
starts. This is safe, since all descriptors are self-certified and
timestamped: it's always okay to replace a signed descriptor with a more
recent one signed by the same identity.
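
The replacement rule above is simple enough to sketch in Python. The
Descriptor class and the prefer function are hypothetical names for this
example, and the sketch assumes each descriptor's signature has already
been verified against its identity key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Descriptor:
    identity: str   # digest of the identity key that signed it
    published: int  # publication time, seconds since the epoch


def prefer(current: Optional[Descriptor], candidate: Descriptor) -> Descriptor:
    """Return the descriptor an authority should keep for a router."""
    if current is None:
        return candidate
    if candidate.identity != current.identity:
        # Descriptors from different identities describe different
        # routers; one must never replace the other.
        raise ValueError("descriptors are signed by different identities")
    # Keep the strictly newer one; ties keep what we already have.
    return candidate if candidate.published > current.published else current
```

Because the rule depends only on the two descriptors themselves,
authorities that have seen the same set of descriptors end up holding the
same one for each router, whatever order they arrived in.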

In the long run, we might want some kind of sophisticated process here.
For now, since authorities already download one another's networkstatus
documents and use them to determine what descriptors to download from one
another, we can rely on this existing mechanism to keep authorities up to
date.

[We should do a thorough read-through of dir-spec again to make sure
that the authorities converge on which descriptor to "prefer" for
each router. Right now the decision happens at the client, which is
no longer the right place for it. -RD]

3. Questions and concerns

The URLs above define a push mechanism for publishing votes and consensus
signatures via HTTP POST requests, and a pull mechanism for downloading
these documents via HTTP GET requests. As specified, every authority will
post to every other. The "download if no copy has been received" mechanism
exists only as a fallback.

The "opt" keyword in Tor's directory formats was originally intended to
mean, "it is okay to ignore this entry if you don't understand it"; the
default behavior has been "discard a routerdesc if it contains entries you
don't recognize."

But so far, every new flag we have added has been marked 'opt'. It would
probably make sense to change the default behavior to "ignore unrecognized
fields", and add the statement that clients SHOULD ignore fields they don't
recognize. As a meta-principle, we should say that clients and servers
MUST NOT have to understand new fields in order to use directory documents.

Of course, this will make it impossible to say, "The format has changed a
lot; discard this quietly if you don't understand it." We could do that by
adding a version field.
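
To make the proposed default concrete, here is a Python sketch of a
tolerant parser. It is not Tor's parser: the function name parse_entries,
the simplified "keyword arguments" line format, and the keyword set passed
in are all assumptions for illustration.

```python
from typing import List, Set, Tuple


def parse_entries(text: str, known: Set[str]) -> List[Tuple[str, List[str]]]:
    """Parse "keyword args..." lines, skipping keywords we don't know."""
    entries = []
    for line in text.splitlines():
        words = line.split()
        # Accept a leading "opt" for compatibility, but attach no
        # meaning to it.
        if words and words[0] == "opt":
            words = words[1:]
        if not words:
            continue
        keyword, args = words[0], words[1:]
        if keyword not in known:
            # Proposed default: ignore unrecognized fields rather than
            # discarding the whole document.
            continue
        entries.append((keyword, args))
    return entries
```

Under this behavior a field is handled the same way with or without
"opt", which is why generating the keyword can eventually stop.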

3.3. Multilevel keys.

Replacing a directory authority's identity key in the event of a compromise
would be tremendously annoying. We'd need to tell every client to switch
their configuration, or update to a new version with an updated list. So
long as some weren't upgraded, they'd be at risk from whoever had
compromised the key.

With this in mind, it's a shame that our current protocol forces us to
store identity keys unencrypted in RAM. We need some kind of signing key
stored unencrypted, since we need to generate new descriptors/directories
and rotate link and onion keys regularly. (And since, of course, we can't
ask server operators to be on hand to enter a passphrase every time we
want to rotate keys or sign a descriptor.)

The obvious solution seems to be to have a signing-only key that lives
indefinitely (months or longer) and signs descriptors and link keys, and a
separate identity key that's used to sign the signing key. Tor servers
could run in one of several modes:
    1. Identity key stored encrypted. You need to pick a passphrase when
       you enable this mode, and re-enter this passphrase every time you
       rotate the signing key.
    1'. Identity key stored separately. You save your identity key to a
       floppy, and use the floppy when you need to rotate the signing key.
    2. All keys stored unencrypted. In this case, we might not want to even
       *have* a separate signing key. (We'll need to support no-separate-
       signing-key mode anyway to keep old servers working.)
    3. All keys stored encrypted. You need to enter a passphrase to start
       the server.
(Of course, we might not want to implement all of these.)

Case 1 is probably most usable and secure, if we assume that people don't
forget their passphrases or lose their floppies. We could mitigate this
risk a bit by encouraging people to PGP-encrypt their passphrases to
themselves, or keep a cleartext copy of their secret key secret-split into
a few pieces, or something like that.
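
One simple way to "secret-split" a key or passphrase is XOR splitting,
sketched below in Python. This is just one possible scheme, chosen here
for brevity: every piece is needed to recover the secret, and any smaller
set of pieces reveals nothing about it. The names split_secret and
join_secret are hypothetical.

```python
import secrets
from typing import List


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, pieces: int) -> List[bytes]:
    """Split `secret` into `pieces` shares; all are needed to rejoin it."""
    if pieces < 1:
        raise ValueError("need at least one piece")
    # All but the last share are uniformly random.
    shares = [secrets.token_bytes(len(secret)) for _ in range(pieces - 1)]
    # The last share is the secret XORed with every random share.
    last = secret
    for share in shares:
        last = _xor(last, share)
    return shares + [last]


def join_secret(shares: List[bytes]) -> bytes:
    """Recover the secret by XORing all shares together."""
    secret = bytes(len(shares[0]))  # all-zero starting value
    for share in shares:
        secret = _xor(secret, share)
    return secret
```

A threshold scheme (any k of n pieces suffice) would tolerate a lost piece,
at the cost of more machinery; the XOR version trades that tolerance for
simplicity.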

Migration presents another difficulty, especially with the authorities. If
we use the current set of identity keys as the new identity keys, we're in
the position of having sensitive keys that have been stored on
media-of-dubious-encryption up to now. Also, we need to keep old clients
(who will expect descriptors to be signed by the identity keys they know
and love, and who will not understand signing keys) happy.

I'd enumerate designs here, but I'm hoping that somebody will come up with
a better one, so I'll try not to prejudice them with more ideas yet.

Oh, and of course, we'll want to make sure that the keys are

3.4. Long and short descriptors

Some of the costliest fields in the current directory protocol are ones
that no client actually uses. In particular, the "read-history" and
"write-history" fields are used only by the authorities for monitoring the
status of the network. If we took them out, the size of a compressed list
of all the routers would fall by about 60%. (No other disposable field
would save more than 2%.)

One possible solution here is that routers should generate and upload a
short-form and long-form descriptor. Only the short-form descriptor should
ever be used by anybody for routing. The long-form descriptor should be
used only for analytics and other tools. (If we allowed people to route with
long descriptors, we'd have to ensure that they stayed in sync with the
short ones somehow.) We can ensure that the short descriptors are used by
only recommending those in the network statuses.
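
A router producing both forms could derive the short form by leaving out
the history fields before signing. The Python sketch below illustrates the
idea on a list of unsigned descriptor lines; the function name short_form
and the line-oriented input are assumptions, and each form would still
need its own signature.

```python
from typing import Iterable, List

# Fields named in the text as used only by the authorities.
LONG_ONLY = ("read-history", "write-history")


def short_form(lines: Iterable[str]) -> List[str]:
    """Return the descriptor lines with long-form-only fields removed."""
    kept = []
    for line in lines:
        words = line.split()
        # The fields may carry the "opt" prefix in current descriptors.
        if words and words[0] == "opt":
            words = words[1:]
        if words and words[0] in LONG_ONLY:
            continue
        kept.append(line)
    return kept
```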

Another possible solution would be to drop these fields from descriptors,
and have them uploaded as a part of a separate "bandwidth report" to the
authorities. This could help prevent the mistake of using long descriptors
in the place of short ones.

Gzip would be easier to work with than zlib; bzip2 would result in smaller
data lengths. [Concretely, we're looking at about 10-15% space savings at
the expense of 3-5x longer compression time for using bzip2.] Doing
on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
Pre-compressing status documents in multiple formats would force us to use
more memory to hold them.
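
The size side of this trade-off is easy to measure on a real status
document. The Python sketch below uses the standard zlib, gzip, and bz2
modules; the function name compressed_sizes is hypothetical, and the
figures quoted above are the document's own, not results of this code.

```python
import bz2
import gzip
import zlib
from typing import Dict


def compressed_sizes(data: bytes) -> Dict[str, int]:
    """Return the compressed length of `data` under each format."""
    return {
        # zlib and gzip wrap a deflate stream in different framing.
        "zlib": len(zlib.compress(data, 9)),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, 9)),
    }
```

Calling compressed_sizes(open(path, "rb").read()) on a cached network
status gives the three sizes side by side; timing each call would show the
cost in compression time.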

For directory voting:
    * It would be cool if caches could get ready to download consensus
      status docs, verify enough signatures, and serve them now. That way
      once stuff works all we need to do is upgrade the authorities. Caches
      don't need to verify the correctness of the format so long as it's
      signed (or maybe multisigned?). We need to make sure that caches back
      off very quickly from downloading consensus docs until they're
      actually implemented.

For dropping the "opt" requirement:
    * Stopped requiring it as of 0.1.2.5-alpha. Stop generating it once
      earlier formats are obsolete.

For long/short descriptors:

    * Authorities should accept both, now, and silently drop short
    * Routers should upload both once authorities accept them.
    * There should be a "long descriptor" URL and the current "normal" URL.
      Authorities should serve long descriptors from both URLs.
    * Once tools that want long descriptors support fetching them from the
      "long descriptor" URL:
        * Have authorities remember short descriptors, and serve them from the