\documentclass[times,10pt,twocolumn]{article}
\usepackage{latex8}
\usepackage{times}
\usepackage{url}
\usepackage{graphics}
\usepackage{amsmath}
\usepackage{epsfig}

\pagestyle{plain}

\renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}

\newcommand{\workingnote}[1]{}        % The version that hides the note.
%\newcommand{\workingnote}[1]{(**#1)} % The version that makes the note visible.

% If an URL ends up with '%'s in it, that's because the line *in the .bib/.tex
% file* is too long, so break it there (it doesn't matter if the next line is
% indented with spaces). -DH

%\newif\ifpdf
%\ifx\pdfoutput\undefined
%   \pdffalse
%\else
%   \pdfoutput=1
%   \pdftrue
%\fi

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
\setlength{\itemsep}{0mm}
\setlength{\parsep}{0mm}
%  \setlength{\labelsep}{0mm}
%  \setlength{\labelwidth}{0mm}
%  \setlength{\topsep}{0mm}
}}{\end{list}}

% Cut down on whitespace above and below figures displayed at head/foot of
% page.
\setlength{\textfloatsep}{3mm}
% Cut down on whitespace above and below figures displayed in middle of page
\setlength{\intextsep}{3mm}

\begin{document}

%% Use dvipdfm instead. --DH
%\ifpdf
%  \pdfcompresslevel=9
%  \pdfpagewidth=\the\paperwidth
%  \pdfpageheight=\the\paperheight
%\fi
\title{Tor: The Second-Generation Onion Router} %\\DRAFT VERSION}
% Putting the 'Private' back in 'Virtual Private Network'

\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
Paul Syverson \\ Naval Research Lab \\ syverson@itd.nrl.navy.mil}

\maketitle
\thispagestyle{empty}

\begin{abstract}
We present Tor, a circuit-based low-latency anonymous communication
service. This second-generation Onion Routing system addresses limitations
in the original design. Tor adds perfect forward secrecy, congestion
control, directory servers, integrity checking, configurable exit policies,
and a practical design for rendezvous points. Tor works on the real-world
Internet, requires no special privileges or kernel modifications, requires
little synchronization or coordination between nodes, and provides a
reasonable tradeoff between anonymity, usability, and efficiency.
We briefly describe our experiences with an international network of
more than a dozen hosts. % that has been running for several months.
We close with a list of open problems in anonymous communication.
\end{abstract}

%\begin{center}
%\textbf{Keywords:} anonymity, peer-to-peer, remailer, nymserver, reply block
%\end{center}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Overview}
\label{sec:intro}

Onion Routing is a distributed overlay network designed to anonymize
TCP-based applications like web browsing, secure shell,
and instant messaging. Clients choose a path through the network and
build a \emph{circuit}, in which each node (or ``onion router'' or ``OR'')
in the path knows its predecessor and successor, but no other nodes in
the circuit. Traffic flows down the circuit in fixed-size
\emph{cells}, which are unwrapped by a symmetric key at each node
(like the layers of an onion) and relayed downstream. The
Onion Routing project published several design and analysis
papers \cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion
Routing network was deployed briefly, the only long-running
public implementation was a fragile
proof-of-concept that ran on a single machine. Even this simple deployment
processed connections from over sixty thousand distinct IP addresses from
all over the world at a rate of about fifty thousand per day.
But many critical design and deployment issues were never
resolved, and the design has not been updated in several years. Here
we describe Tor, a protocol for asynchronous, loosely federated onion
routers that provides the following improvements over the old Onion
Routing design:

\textbf{Perfect forward secrecy:} Onion Routing
was originally vulnerable to a single hostile node recording traffic and
later compromising successive nodes in the circuit and forcing them
to decrypt it. Rather than using a single multiply encrypted data
structure (an \emph{onion}) to lay each circuit,
Tor now uses an incremental or \emph{telescoping} path-building design,
where the initiator negotiates session keys with each successive hop in
the circuit. Once these keys are deleted, subsequently compromised nodes
cannot decrypt old traffic. As a side benefit, onion replay detection
is no longer necessary, and the process of building circuits is more
reliable, since the initiator knows when a hop fails and can then try
extending to a new node.

\textbf{Separation of ``protocol cleaning'' from anonymity:}
Onion Routing originally required a separate ``application
proxy'' for each supported application protocol---most of which were
never written, so many applications were never supported. Tor uses the
standard and near-ubiquitous SOCKS~\cite{socks4} proxy interface, allowing
us to support most TCP-based programs without modification. Tor now
relies on the filtering features of privacy-enhancing
application-level proxies such as Privoxy~\cite{privoxy}, without trying
to duplicate those features itself.

\textbf{No mixing, padding, or traffic shaping (yet):} Onion
Routing originally called for batching and reordering cells as they arrived,
assumed padding between ORs, and in
later designs added padding between onion proxies (users) and
ORs~\cite{or-ih96,or-jsac98}. Tradeoffs between padding protection
and cost were discussed, and \emph{traffic shaping} algorithms were
theorized~\cite{or-pet00} to provide good security without expensive
padding, but no concrete padding scheme was suggested.
Recent research~\cite{econymics}
and deployment experience~\cite{freedom21-security} suggest that this
level of resource use is not practical or economical; and even full
link padding is still vulnerable~\cite{defensive-dropping}. Thus,
until we have a proven and convenient design for traffic shaping or
low-latency mixing that improves anonymity against a realistic
adversary, we leave these strategies out.

\textbf{Many TCP streams can share one circuit:} Onion Routing originally
built a separate circuit for each
application-level request, but this required
multiple public key operations for every request, and also presented
a threat to anonymity from building so many circuits; see
Section~\ref{sec:maintaining-anonymity}. Tor multiplexes multiple TCP
streams along each circuit to improve efficiency and anonymity.

\textbf{Leaky-pipe circuit topology:} Through in-band signaling
within the circuit, Tor initiators can direct traffic to nodes partway
down the circuit. This novel approach
allows traffic to exit the circuit from the middle---possibly
frustrating traffic shape and volume attacks based on observing the end
of the circuit. (It also allows for long-range padding if
future research shows this to be worthwhile.)

\textbf{Congestion control:} Earlier anonymity designs do not
address traffic bottlenecks. Unfortunately, typical approaches to
load balancing and flow control in overlay networks involve inter-node
control communication and global views of traffic. Tor's decentralized
congestion control uses end-to-end acks to maintain anonymity
while allowing nodes at the edges of the network to detect congestion
or flooding and send less data until the congestion subsides.

\textbf{Directory servers:} The earlier Onion Routing design
planned to flood state information through the network---an approach
that can be unreliable and complex. % open to partitioning attacks.
Tor takes a simplified view toward distributing this
information. Certain more trusted nodes act as \emph{directory
servers}: they provide signed directories describing known
routers and their current state. Users periodically download them
via HTTP.

\textbf{Variable exit policies:} Tor provides a consistent mechanism
for each node to advertise a policy describing the hosts
and ports to which it will connect. These exit policies are critical
in a volunteer-based distributed infrastructure, because each operator
is comfortable with allowing different types of traffic to exit the Tor
network from his node.

\textbf{End-to-end integrity checking:} The original Onion Routing
design did no integrity checking on data. Any node on the
circuit could change the contents of data cells as they passed by---for
example, to alter a connection request so it would connect
to a different webserver, or to `tag' encrypted traffic and look for
corresponding corrupted traffic at the network edges~\cite{minion-design}.
Tor hampers these attacks by verifying data integrity before it leaves
the network.

%\textbf{Improved robustness to failed nodes:} A failed node
%in the old design meant that circuit building failed, but thanks to
%Tor's step-by-step circuit building, users notice failed nodes
%while building circuits and route around them. Additionally, liveness
%information from directories allows users to avoid unreliable nodes in
%the first place.
%% Can't really claim this, now that we've found so many variants of
%% attack on partial-circuit-building. -RD

\textbf{Rendezvous points and hidden services:}
Tor provides an integrated mechanism for responder anonymity via
location-protected servers. Previous Onion Routing designs included
long-lived ``reply onions'' that could be used to build circuits
to a hidden server, but these reply onions did not provide forward
security, and became useless if any node in the path went down
or rotated its keys. In Tor, clients negotiate {\it rendezvous points}
to connect with hidden servers; reply onions are no longer required.

Unlike Freedom~\cite{freedom2-arch}, Tor does not require OS kernel
patches or network stack support. This prevents us from anonymizing
non-TCP protocols, but has greatly helped our portability and
deployability.

%Unlike Freedom~\cite{freedom2-arch}, Tor only anonymizes
%TCP-based protocols---not requiring patches (or built-in support) in an
%operating system's network stack has been valuable to Tor's
%portability and deployability.

We have implemented all of the above features except rendezvous
points. Our source code is
available under a free license, and Tor
%, as far as we know, is unencumbered by patents.
is not covered by the patent that affected distribution and use of
earlier versions of Onion Routing.
We have deployed a wide-area alpha network
to test the design, to get more experience with usability
and users, and to provide a research platform for experimentation.
As of this writing, the network stands at eighteen nodes in thirteen
distinct administrative domains on two continents.

We review previous work in Section~\ref{sec:related-work}, describe
our goals and assumptions in Section~\ref{sec:assumptions},
and then address the above list of improvements in
Sections~\ref{sec:design} and~\ref{sec:other-design}. We summarize
in Section~\ref{sec:attacks} how our design stands up to
known attacks, and talk about our early deployment experiences in
Section~\ref{sec:in-the-wild}. We conclude with a list of open problems in
Section~\ref{sec:maintaining-anonymity} and future work for the Onion
Routing project in Section~\ref{sec:conclusion}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Related work}
\label{sec:related-work}

Modern anonymity systems date to Chaum's {\bf Mix-Net}
design~\cite{chaum-mix}. Chaum
proposed hiding the correspondence between sender and recipient by
wrapping messages in layers of public-key cryptography, and relaying them
through a path composed of ``mixes.'' Each mix in turn
decrypts, delays, and re-orders messages, before relaying them
onward.
%toward their destinations.

Subsequent relay-based anonymity designs have diverged in two
main directions. Systems like {\bf Babel}~\cite{babel},
{\bf Mixmaster}~\cite{mixmaster-spec},
and {\bf Mixminion}~\cite{minion-design} have tried
to maximize anonymity at the cost of introducing comparatively large and
variable latencies. Because of this decision, these \emph{high-latency}
networks resist strong global adversaries,
but introduce too much lag for interactive tasks like web browsing,
Internet chat, or SSH connections.

Tor belongs to the second category: \emph{low-latency} designs that
try to anonymize interactive network traffic. These systems handle
a variety of bidirectional protocols. They also provide more convenient
mail delivery than the high-latency anonymous email
networks, because the remote mail server provides explicit and timely
delivery confirmation. But because these designs typically
involve many packets that must be delivered quickly, it is
difficult for them to prevent an attacker who can eavesdrop both ends of the
communication from correlating the timing and volume
of traffic entering the anonymity network with traffic leaving it \cite{SS03}.
These
protocols are similarly vulnerable to an active adversary who introduces
timing patterns into traffic entering the network and looks
for correlated patterns among exiting traffic.
Although some work has been done to frustrate
these attacks, %\footnote{
%  The most common approach is to pad and limit communication to a constant
%  rate, or to limit
%  the variation in traffic shape. Doing so can have prohibitive bandwidth
%  costs and/or performance limitations.
% Point in the footnote is covered above, yes? -PS
most designs protect primarily against traffic analysis rather than traffic
confirmation (see Section~\ref{subsec:threat-model}).

The simplest low-latency designs are single-hop proxies such as the
{\bf Anonymizer}~\cite{anonymizer}: a single trusted server strips the
data's origin before relaying it. These designs are easy to
analyze, but users must trust the anonymizing proxy.
Concentrating the traffic to this single point increases the anonymity set
(the people a given user is hiding among), but it is vulnerable if the
adversary can observe all traffic going into and out of the proxy.

More complex are distributed-trust, circuit-based anonymizing systems.
In these designs, a user establishes one or more medium-term bidirectional
end-to-end circuits, and tunnels data in fixed-size cells.
Establishing circuits is computationally expensive and typically
requires public-key
cryptography, whereas relaying cells is comparatively inexpensive and
typically requires only symmetric encryption.
Because a circuit crosses several servers, and each server only knows
the adjacent servers in the circuit, no single server can link a
user to her communication partners.

The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared
routes known as \emph{cascades}. As with a single-hop proxy, this
approach aggregates users into larger anonymity sets, but again an
attacker only needs to observe both ends of the cascade to bridge all
the system's traffic. The Java Anon Proxy's design
calls for padding between end users and the head of the
cascade~\cite{web-mix}. However, it is not demonstrated whether the current
implementation's padding policy improves anonymity.

{\bf PipeNet}~\cite{back01, pipenet}, another low-latency design proposed
around the same time as Onion Routing, gave
stronger anonymity but allowed a single user to shut
down the network by not sending. Systems like {\bf ISDN
mixes}~\cite{isdn-mixes} were designed for other environments with
different assumptions.
%XXX please can we fix this sentence to something less demeaning

In P2P designs like {\bf Tarzan}~\cite{tarzan:ccs02} and
{\bf MorphMix}~\cite{morphmix:fc04}, all participants both generate
traffic and relay traffic for others. These systems aim to conceal
whether a given peer originated a request
or just relayed it from another peer. While Tarzan and MorphMix use
layered encryption as above, {\bf Crowds}~\cite{crowds-tissec} simply assumes
an adversary who cannot observe the initiator: it uses no public-key
encryption, so any node on a circuit can read the circuit's traffic.

{\bf Hordes}~\cite{hordes-jcs} is based on Crowds but also uses multicast
responses to hide the initiator. {\bf Herbivore}~\cite{herbivore} and
$\mbox{\bf P}^{\mathbf 5}$~\cite{p5} go even further, requiring broadcast.
These systems are designed primarily for communication between peers,
although Herbivore users can make external connections by
requesting a peer to serve as a proxy.

Systems like {\bf Freedom} and the original Onion Routing build circuits
all at once, using a layered ``onion'' of public-key encrypted messages,
each layer of which provides session keys and the address of the
next server in the circuit. Tor as described herein, Tarzan, MorphMix,
{\bf Cebolla}~\cite{cebolla}, and Rennhard's {\bf Anonymity Network}~\cite{anonnet}
build circuits
in stages, extending them one hop at a time.
Section~\ref{subsubsec:constructing-a-circuit} describes how this
approach enables perfect forward secrecy.

Circuit-based anonymity designs must choose which protocol layer
to anonymize. They may choose to intercept IP packets directly, and
relay them whole (stripping the source address) along the
circuit~\cite{freedom2-arch,tarzan:ccs02}. Alternatively, like
Tor, they may accept TCP streams and relay the data in those streams
along the circuit, ignoring the breakdown of that data into TCP
segments~\cite{morphmix:fc04,anonnet}. Finally, they may accept
application-level protocols (such as HTTP) and relay the application
requests themselves along the circuit.
Making this protocol-layer decision requires a compromise between flexibility
and anonymity. For example, a system that understands HTTP, such as Crowds,
can strip
identifying information from those requests, can take advantage of caching
to limit the number of requests that leave the network, and can batch
or encode those requests to minimize the number of connections.
On the other hand, an IP-level anonymizer can handle nearly any protocol,
even ones unforeseen by its designers (though these systems require
kernel-level modifications to some operating systems, and so are more
complex and less portable). TCP-level anonymity networks like Tor present
a middle approach: they are fairly application neutral (so long as the
application supports, or can be tunneled across, TCP), but by treating
application connections as data streams rather than raw TCP packets,
they avoid the well-known inefficiencies of tunneling TCP over
TCP~\cite{tcp-over-tcp-is-bad}.

Distributed-trust anonymizing systems need to prevent attackers from
adding too many servers and thus compromising user paths.
Tor relies on a small set of well-known directory servers, run by
independent parties, to decide which nodes can
join. Tarzan and MorphMix allow unknown users to run servers, and use
a limited resource (like IP addresses) to prevent an attacker from
controlling too much of the network. Crowds suggests requiring
written, notarized requests from potential crowd members.

Anonymous communication is essential for censorship-resistant
systems like Eternity~\cite{eternity}, Free~Haven~\cite{freehaven-berk},
Publius~\cite{publius}, and Tangler~\cite{tangler}. Tor's rendezvous
points enable connections between mutually anonymous entities; they
are a building block for location-hidden servers, which are needed by
Eternity and Free~Haven.

% didn't include rewebbers. No clear place to put them, so I'll leave
% them out for now. -RD
\Section{Design goals and assumptions}
\label{sec:assumptions}

\noindent{\large\bf Goals}\\
Like other low-latency anonymity designs, Tor seeks to frustrate
attackers from linking communication partners, or from linking
multiple communications to or from a single user. Within this
main goal, however, several considerations have directed
Tor's evolution.

\textbf{Deployability:} The design must be deployed and used in the
real world. Thus it
must not be expensive to run (for example, by requiring more bandwidth
than volunteers are willing to provide); must not place a heavy
liability burden on operators (for example, by allowing attackers to
implicate onion routers in illegal activities); and must not be
difficult or expensive to implement (for example, by requiring kernel
patches, or separate proxies for every protocol). We also cannot
require non-anonymous parties (such as websites)
to run our software. (Our rendezvous point design does not meet
this goal for non-anonymous users talking to hidden servers,
however; see Section~\ref{subsec:rendezvous}.)

\textbf{Usability:} A hard-to-use system has fewer users---and because
anonymity systems hide users among users, a system with fewer users
provides less anonymity. Usability is thus not only a convenience:
it is a security requirement~\cite{econymics,back01}. Tor should
therefore not
require modifying applications; should not introduce prohibitive delays;
and should require users to make as few configuration decisions
as possible. Finally, Tor should be easily implemented on all common
platforms; we cannot require users to change their operating system
to be anonymous. (The current Tor implementation runs on Windows and
assorted Unix clones including Linux, FreeBSD, and MacOS X.)

\textbf{Flexibility:} The protocol must be flexible and well-specified,
so Tor can serve as a test-bed for future research.
Many of the open problems in low-latency anonymity
networks, such as generating dummy traffic or preventing Sybil
attacks~\cite{sybil}, may be solvable independently from the issues
solved by
Tor. Hopefully future systems will not need to reinvent Tor's design.
%(But note that while a flexible design benefits researchers,
%there is a danger that differing choices of extensions will make users
%distinguishable. Experiments should be run on a separate network.)

\textbf{Simple design:} The protocol's design and security
parameters must be well-understood. Additional features impose implementation
and complexity costs; adding unproven techniques to the design threatens
deployability, readability, and ease of security analysis. Tor aims to
deploy a simple and stable system that integrates the best accepted
approaches to protecting anonymity.\\

\noindent{\large\bf Non-goals}\label{subsec:non-goals}\\
In favoring simple, deployable designs, we have explicitly deferred
several possible goals, either because they are solved elsewhere, or because
they are not yet solved.

\textbf{Not peer-to-peer:} Tarzan and MorphMix aim to scale to completely
decentralized peer-to-peer environments with thousands of short-lived
servers, many of which may be controlled by an adversary. This approach
is appealing, but still has many open
problems~\cite{tarzan:ccs02,morphmix:fc04}.

\textbf{Not secure against end-to-end attacks:} Tor does not claim
to provide a definitive solution to end-to-end timing or intersection
attacks. Some approaches, such as having users run their own onion routers,
may help;
see Section~\ref{sec:maintaining-anonymity} for more discussion.

\textbf{No protocol normalization:} Tor does not provide \emph{protocol
normalization} like Privoxy or the Anonymizer. If senders want anonymity from
responders while using complex and variable
protocols like HTTP, Tor must be layered with a filtering proxy such
as Privoxy to hide differences between clients, and expunge protocol
features that leak identity.
Note that by this separation Tor can also provide services that
are anonymous to the network yet authenticated to the responder, like
SSH. Similarly, Tor does not integrate
tunneling for non-stream-based protocols like UDP; this must be
provided by an external service if appropriate.

\textbf{Not steganographic:} Tor does not try to conceal who is connected
to the network.
\SubSection{Threat Model}
\label{subsec:threat-model}

A global passive adversary is the most commonly assumed threat when
analyzing theoretical anonymity designs. But like all practical
low-latency systems, Tor does not protect against such a strong
adversary. Instead, we assume an adversary who can observe some fraction
of network traffic; who can generate, modify, delete, or delay
traffic; who can operate onion routers of his own; and who can
compromise some fraction of the onion routers.

In low-latency anonymity systems that use layered encryption, the
adversary's typical goal is to observe both the initiator and the
responder. By observing both ends, passive attackers can confirm a
suspicion that Alice is
talking to Bob if the timing and volume patterns of the traffic on the
connection are distinct enough; active attackers can induce timing
signatures on the traffic to force distinct patterns. Rather
than focusing on these \emph{traffic confirmation} attacks,
we aim to prevent \emph{traffic
analysis} attacks, where the adversary uses traffic patterns to learn
which points in the network he should attack.

Our adversary might try to link an initiator Alice with her
communication partners, or try to build a profile of Alice's
behavior. He might mount passive attacks by observing the network edges
and correlating traffic entering and leaving the network---by
relationships in packet timing, volume, or externally visible
user-selected
options. The adversary can also mount active attacks by compromising
routers or keys; by replaying traffic; by selectively denying service
to trustworthy routers to move users to
compromised routers, or denying service to users to see if traffic
elsewhere in the
network stops; or by introducing patterns into traffic that can later be
detected. The adversary might subvert the directory servers to give users
differing views of network state. Additionally, he can try to decrease
the network's reliability by attacking nodes or by performing antisocial
activities from reliable nodes and trying to get them taken down---making
the network unreliable flushes users to other less anonymous
systems, where they may be easier to attack. We summarize
in Section~\ref{sec:attacks} how well the Tor design defends against
each of these attacks.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\Section{The Tor Design}
\label{sec:design}

The Tor network is an overlay network; each onion router (OR)
runs as a normal
user-level process without any special privileges.
Each onion router maintains a TLS~\cite{TLS}
connection to every other onion router.
%(We discuss alternatives to this clique-topology assumption in
%Section~\ref{sec:maintaining-anonymity}.)
% A subset of the ORs also act as
%directory servers, tracking which routers are in the network;
%see Section~\ref{subsec:dirservers} for directory server details.
Each user
runs local software called an onion proxy (OP) to fetch directories,
establish circuits across the network,
and handle connections from user applications. These onion proxies accept
TCP streams and multiplex them across the circuits. The onion
router on the other side
of the circuit connects to the destinations of
the TCP streams and relays data.

Each onion router maintains a long-term identity key and a short-term
onion key. The identity
key is used to sign TLS certificates, to sign the OR's \emph{router
descriptor} (a summary of its keys, address, bandwidth, exit policy,
and so on), and (by directory servers) to sign directories. %Changing
%the identity key of a router is considered equivalent to creating a
%new router.
The onion key is used to decrypt requests
from users to set up a circuit and negotiate ephemeral keys.
The TLS protocol also establishes a short-term link key when communicating
between ORs. Short-term keys are rotated periodically and
independently, to limit the impact of key compromise.

Section~\ref{subsec:cells} presents the fixed-size
\emph{cells} that are the unit of communication in Tor. We describe
in Section~\ref{subsec:circuits} how circuits are
built, extended, truncated, and destroyed. Section~\ref{subsec:tcp}
describes how TCP streams are routed through the network. We address
integrity checking in Section~\ref{subsec:integrity-checking},
and resource limiting in Section~\ref{subsec:rate-limit}.
Finally,
Section~\ref{subsec:congestion} talks about congestion control and
fairness issues.
\SubSection{Cells}
\label{subsec:cells}

Onion routers communicate with one another, and with users' OPs, via
TLS connections with ephemeral keys. Using TLS conceals the data on
the connection with perfect forward secrecy, and prevents an attacker
from modifying data on the wire or impersonating an OR.

Traffic passes along these connections in fixed-size cells. Each cell
is 512 bytes, %(but see Section~\ref{sec:conclusion} for a discussion of
%allowing large cells and small cells on the same network),
and consists of a header and a payload. The header includes a circuit
identifier (circID) that specifies which circuit the cell refers to
(many circuits can be multiplexed over the single TLS connection), and
a command to describe what to do with the cell's payload. (Circuit
identifiers are connection-specific: each single circuit has a different
circID on each OP/OR or OR/OR connection it traverses.)
Based on their command, cells are either \emph{control} cells, which are
always interpreted by the node that receives them, or \emph{relay} cells,
which carry end-to-end stream data. The control cell commands are:
\emph{padding} (currently used for keepalive, but also usable for link
padding); \emph{create} or \emph{created} (used to set up a new circuit);
and \emph{destroy} (to tear down a circuit).

Relay cells have an additional header (the relay header) after the
cell header, containing a streamID (stream identifier: many streams can
be multiplexed over a circuit); an end-to-end checksum for integrity
checking; the length of the relay payload; and a relay command.
The entire contents of the relay header and the relay cell payload
are encrypted or decrypted together as the relay cell moves along the
circuit, using the 128-bit AES cipher in counter mode to generate a
cipher stream. The relay commands are: \emph{relay
data} (for data flowing down the stream), \emph{relay begin} (to open a
stream), \emph{relay end} (to close a stream cleanly), \emph{relay
teardown} (to close a broken stream), \emph{relay connected}
(to notify the OP that a relay begin has succeeded), \emph{relay
extend} and \emph{relay extended} (to extend the circuit by a hop,
and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
(to tear down only part of the circuit, and to acknowledge), \emph{relay
sendme} (used for congestion control), and \emph{relay drop} (used to
implement long-range dummies).
We give a visual overview of cell structure plus the details of relay
cell structure, and then describe each of these cell types and commands
in more detail below.

\begin{figure}[h]
\unitlength=1cm
\centering
\begin{picture}(8.0,1.5)
\put(4,.5){\makebox(0,0)[c]{\epsfig{file=cell-struct,width=7cm}}}
\end{picture}
\end{figure}
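
For concreteness, the sketch below packs a relay cell in Python. Only
the 512-byte cell size comes from the text above; the field widths (a
2-byte circID, a 6-byte streamID matching the 48 bits mentioned later,
a 4-byte checksum) and the command values are illustrative assumptions,
not Tor's wire format.

\begin{verbatim}
import struct

CELL_LEN = 512      # fixed cell size, from the text above
CMD_RELAY = 3       # command value: an assumption

def pack_relay_cell(circ_id, stream_id, digest4, relay_cmd, data):
    """Pack one relay cell; field widths are illustrative."""
    assert len(stream_id) == 6 and len(digest4) == 4
    relay_hdr = struct.pack(">6s4sHB", stream_id, digest4,
                            len(data), relay_cmd)
    body = (relay_hdr + data).ljust(CELL_LEN - 3, b"\x00")
    return struct.pack(">HB", circ_id, CMD_RELAY) + body

cell = pack_relay_cell(7, b"\x00" * 6, b"\x00" * 4, 2, b"GET / ...")
assert len(cell) == CELL_LEN
\end{verbatim}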
\SubSection{Circuits and streams}
\label{subsec:circuits}

Onion Routing originally built one circuit for each
TCP stream. Because building a circuit can take several tenths of a
second (due to public-key cryptography and network latency),
this design imposed high costs on applications like web browsing that
open many TCP streams.

In Tor, each circuit can be shared by many TCP streams. To avoid
delays, users construct circuits preemptively. To limit linkability
among their streams, users' OPs build a new circuit
periodically if the previous one has been used,
and expire old used circuits that no longer have any open streams.
OPs consider making a new circuit once a minute: thus
even heavy users spend negligible time
building circuits, but a limited number of requests can be linked
to each other through a given exit node. Also, because circuits are built
in the background, OPs can recover from failed circuit creation
without delaying streams and thereby harming user experience.\\

\begin{figure}[h]
\centering
\mbox{\epsfig{figure=interaction,width=8.75cm}}
\caption{Alice builds a two-hop circuit and begins fetching a web page.}
\label{fig:interaction}
\end{figure}

\noindent{\large\bf Constructing a circuit}\label{subsubsec:constructing-a-circuit}\\
%\subsubsection{Constructing a circuit}
A user's OP constructs circuits incrementally, negotiating a
symmetric key with each OR on the circuit, one hop at a time. To begin
creating a new circuit, the OP (call her Alice) sends a
\emph{create} cell to the first node in her chosen path (call him Bob).
(She chooses a new
circID $C_{AB}$ not currently used on the connection from her to Bob.)
The \emph{create} cell's
payload contains the first half of the Diffie-Hellman handshake
($g^x$), encrypted to Bob's onion key. Bob
responds with a \emph{created} cell containing the second half of the
DH handshake, along with a hash of the negotiated key $K=g^{xy}$.

Once the circuit has been established, Alice and Bob can send one
another relay cells encrypted with the negotiated
key.\footnote{Actually, the negotiated key is used to derive two
symmetric keys: one for each direction.} More detail is given in
the next section.
To extend the circuit further, Alice sends a \emph{relay extend} cell
to Bob, specifying the address of the next OR (call her Carol), and
an encrypted $g^{x_2}$ for her. Bob copies the half-handshake into a
\emph{create} cell, and passes it to Carol to extend the circuit.
(Bob chooses a new circID $C_{BC}$ not currently used on the connection
between him and Carol. Alice never needs to know this circID; only Bob
associates $C_{AB}$ on his connection with Alice to $C_{BC}$ on
his connection with Carol.)
When Carol responds with a \emph{created} cell, Bob wraps the payload
into a \emph{relay extended} cell and passes it back to Alice. Now
the circuit is extended to Carol, and Alice and Carol share a common key
$K_2 = g^{x_2 y_2}$.

To extend the circuit to a third node or beyond, Alice
proceeds as above, always telling the last node in the circuit to
extend one hop further.

This circuit-level handshake protocol achieves unilateral entity
authentication (Alice knows she's handshaking with the OR, but
the OR doesn't care who is opening the circuit---Alice uses no public key
and is trying to remain anonymous) and unilateral key authentication
(Alice and the OR agree on a key, and Alice knows only the OR learns
it). It also achieves forward
secrecy and key freshness. More formally, the protocol is as follows
(where $E_{PK_{Bob}}(\cdot)$ is encryption with Bob's public key,
$H$ is a secure hash function, and $|$ is concatenation):
\begin{equation*}
\begin{aligned}
\mathrm{Alice} \rightarrow \mathrm{Bob}&: E_{PK_{Bob}}(g^x) \\
\mathrm{Bob} \rightarrow \mathrm{Alice}&: g^y, H(K | \mathrm{``handshake"}) \\
\end{aligned}
\end{equation*}

\noindent In the second step, Bob proves that it was he who received $g^x$,
and who chose $y$. We use PK encryption in the first step
(rather than, say, using the first two steps of STS, which has a
signature in the second step) because a single cell is too small to
hold both a public key and a signature. Preliminary analysis with the
NRL protocol analyzer~\cite{meadows96} shows this protocol to be
secure (including perfect forward secrecy) under the
traditional Dolev-Yao model.\\
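
The following minimal Python sketch walks through this handshake. The
64-bit prime group is a toy stand-in, and the public-key encryption of
$g^x$ to Bob's onion key is elided; neither reflects Tor's actual
parameters.

\begin{verbatim}
import hashlib
from secrets import randbelow

# Toy parameters for illustration only; real Tor uses far larger
# Diffie-Hellman parameters and encrypts g^x to Bob's onion key.
P = 2**64 - 59   # a prime
G = 2

def H(b):
    return hashlib.sha1(b).digest()

x = randbelow(P - 2) + 1          # Alice's secret
alice_half = pow(G, x, P)         # g^x, sent inside the create cell

y = randbelow(P - 2) + 1          # Bob's secret
bob_half = pow(G, y, P)           # g^y, returned in the created cell
auth = H(pow(alice_half, y, P).to_bytes(8, "big") + b"handshake")

# Alice derives K = g^{xy} and checks Bob's hash of it.
K = pow(bob_half, x, P)
assert H(K.to_bytes(8, "big") + b"handshake") == auth
\end{verbatim}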
\noindent{\large\bf Relay cells}\\
%\subsubsection{Relay cells}
Once Alice has established the circuit (so she shares keys with each
OR on the circuit), she can send relay cells. Recall that every relay
cell has a streamID that indicates to which
stream the cell belongs. %This streamID allows a relay cell to be
%addressed to any OR on the circuit.
Upon receiving a relay
cell, an OR looks up the corresponding circuit, and decrypts the relay
header and payload with the session key for that circuit.
If the cell is headed away from Alice the OR then checks
whether the decrypted streamID is recognized---either because it
corresponds to an open stream at this OR for the given circuit, or because
it is the control streamID (zero). If the OR recognizes the
streamID, it accepts the relay cell and processes it as described
below. Otherwise,
the OR looks up the circID and OR for the
next step in the circuit, replaces the circID as appropriate, and
sends the decrypted relay cell to the next OR. (If the OR at the end
of the circuit receives an unrecognized relay cell, an error has
occurred, and the cell is discarded.)

OPs treat incoming relay cells similarly: they iteratively unwrap the
relay header and payload with the session keys shared with each
OR on the circuit, from the closest to farthest. % (Because we use a
%stream cipher, encryption operations may be inverted in any order.)
If at any stage the OP recognizes the streamID, the cell must have
originated at the OR whose encryption has just been removed.

To construct a relay cell addressed to a given OR, Alice iteratively
encrypts the cell payload (that is, the relay header and payload) with
the symmetric key of each hop up to that OR. Because the streamID is
encrypted to a different value at each step, only at the targeted OR
will it have a meaningful value.\footnote{
% Should we just say that 2^56 is itself negligible?
% Assuming 4-hop circuits with 10 streams per hop, there are 33
% possible bad streamIDs before the last circuit. This still
% gives an error only once every 2 million terabytes (approx).
With 48 bits of streamID per cell, the probability of an accidental
collision is far lower than the chance of hardware failure.}
This \emph{leaky pipe} circuit topology
allows Alice's streams to exit at different ORs on a single circuit.
Alice may choose different exit points because of their exit policies,
or to keep the ORs from knowing that two streams
originate from the same person.
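
This layering can be sketched as follows; here a hash-based XOR
keystream stands in for Tor's AES counter-mode cipher stream, so the
example runs with no external libraries.

\begin{verbatim}
import hashlib

def keystream(key, n):
    # Hash-based stand-in for AES in counter mode (illustration only).
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def add_layer(payload, key):
    # XOR with the keystream both adds and strips a layer.
    ks = keystream(key, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))

hop_keys = [b"key-hop1", b"key-hop2", b"key-hop3"]
cell = b"relay payload"
for k in reversed(hop_keys):      # Alice encrypts inner layers first
    cell = add_layer(cell, k)
for k in hop_keys:                # each OR strips one layer in turn
    cell = add_layer(cell, k)
assert cell == b"relay payload"
\end{verbatim}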
When an OR later replies to Alice with a relay cell, it
encrypts the cell's relay header and payload with the single key it
shares with Alice, and sends the cell back toward Alice along the
circuit. Subsequent ORs add further layers of encryption as they
relay the cell back to Alice.

To tear down a circuit, Alice sends a \emph{destroy} control
cell. Each OR in the circuit receives the \emph{destroy} cell, closes
all streams on that circuit, and passes a new \emph{destroy} cell
forward. But just as circuits are built incrementally, they can also
be torn down incrementally: Alice can send a \emph{relay
truncate} cell to a single OR on a circuit. That OR then sends a
\emph{destroy} cell forward, and acknowledges with a
\emph{relay truncated} cell. Alice can then extend the circuit to
different nodes, without signaling to the intermediate nodes (or
a limited observer) that she has changed her circuit.
Similarly, if a node on the circuit goes down, the adjacent
node can send a \emph{relay truncated} cell back to Alice. Thus the
``break a node and see which circuits go down''
attack~\cite{freedom21-security} is weakened.
\SubSection{Opening and closing streams}
\label{subsec:tcp}

When Alice's application wants a TCP connection to a given
address and port, it asks the OP (via SOCKS) to make the
connection. The OP chooses the newest open circuit (or creates one if
none is available), and chooses a suitable OR on that circuit to be the
exit node (usually the last node, but maybe others due to exit policy
conflicts; see Section~\ref{subsec:exitpolicies}.) The OP then opens
the stream by sending a \emph{relay begin} cell to the exit node,
using a streamID of zero (so the OR will recognize it), containing as
its relay payload a new random streamID, the destination
address, and the destination port. Once the
exit node connects to the remote host, it responds
with a \emph{relay connected} cell. Upon receipt, the OP sends a
SOCKS reply to notify the application of its success. The OP
now accepts data from the application's TCP stream, packaging it into
\emph{relay data} cells and sending those cells along the circuit to
the chosen OR.
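
As a rough sketch, the OP-side payload for a \emph{relay begin} might
be assembled as below; the streamID width and the address encoding are
illustrative assumptions, not Tor's wire format.

\begin{verbatim}
import os

def begin_payload(addr, port):
    # Illustrative encoding: a fresh random streamID plus an
    # "address:port" string.  Tor's actual wire format differs.
    stream_id = os.urandom(6)
    return stream_id, stream_id + f"{addr}:{port}".encode() + b"\x00"

sid, payload = begin_payload("example.com", 80)
# The OP sends this payload in a relay begin cell with streamID zero;
# the exit node answers with relay connected once the TCP connect
# succeeds.
\end{verbatim}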
There's a catch to using SOCKS, however---some applications pass the
alphanumeric hostname to the Tor client, while others resolve it into
an IP address first and then pass the IP address to the Tor client. If
the application does DNS resolution first, Alice thereby reveals her
destination to the remote DNS server, rather than sending the hostname
through the Tor network to be resolved at the far end. Common applications
like Mozilla and SSH have this flaw.

With Mozilla, the flaw is easy to address: the filtering HTTP
proxy called Privoxy gives a hostname to the Tor client, so Alice's
computer never does DNS resolution.
But a portable general solution, such as is needed for
SSH, is
an open problem. Modifying or replacing the local nameserver
can be invasive, brittle, and unportable. Forcing the resolver
library to do resolution via TCP rather than UDP is
hard, and also has portability problems. We could also provide a
tool similar to \emph{dig} to perform a private lookup through the
Tor network. Currently, we encourage the use of
privacy-aware proxies like Privoxy wherever possible.

Closing a Tor stream is analogous to closing a TCP stream: it uses a
two-step handshake for normal operation, or a one-step handshake for
errors. If the stream closes abnormally, the adjacent node simply sends a
\emph{relay teardown} cell. If the stream closes normally, the node sends
a \emph{relay end} cell down the circuit. When the other side has sent
back its own \emph{relay end} cell, the stream can be torn down. Because
all relay cells use layered encryption, only the destination OR knows
that a given relay cell is a request to close a stream. This two-step
handshake allows Tor to support TCP-based applications that use half-closed
connections.
% such as broken HTTP clients that close their side of the
%stream after writing but are still willing to read.
\SubSection{Integrity checking on streams}
\label{subsec:integrity-checking}

Because the old Onion Routing design used a stream cipher without integrity
checking, traffic was
vulnerable to a malleability attack: though the attacker could not
decrypt cells, any changes to encrypted data
would create corresponding changes to the data leaving the network.
This weakness allowed an adversary to change a padding cell to a destroy
cell; change the destination address in a \emph{relay begin} cell to the
adversary's webserver; or change an FTP command from
{\tt dir} to {\tt rm~*}. Any adversary who could guess the encrypted
content could introduce such corruption in a stream. (Even an external
adversary could do this, because the link encryption similarly used a
stream cipher.)

Tor uses TLS on its links---its integrity checking
protects data from modification by external adversaries.
Addressing the insider malleability attack, however, is
more complex.

We could do integrity checking of the relay cells at each hop, either
by including hashes or by using an authenticating cipher mode like
EAX~\cite{eax}, but there are some problems. First, these approaches
impose a message-expansion overhead at each hop, and so we would have to
either leak the path length or waste bytes by padding to a maximum
path length. Second, these solutions can only verify traffic coming
from Alice: ORs would not be able to produce suitable hashes for
the intermediate hops, since the ORs on a circuit do not know the
other ORs' session keys. Third, we have already accepted that our design
is vulnerable to end-to-end timing attacks; tagging attacks performed
within the circuit provide no additional information to the attacker.

Thus, we check integrity only at the edges of each stream. When Alice
negotiates a key with a new hop, they each initialize a SHA-1
digest with a derivative of that key,
thus beginning with randomness that only the two of them know. From
then on they each incrementally add to the SHA-1 digest the contents of
all relay cells they create, and include with each relay cell the
first four bytes of the current digest. Each also keeps a SHA-1
digest of data received, to verify that the received hashes are correct.
To be sure of removing or modifying a cell, the attacker must be able
to deduce the current digest state (which depends on all
traffic between Alice and Bob, starting with their negotiated key).
Attacks on SHA-1 where the adversary can incrementally add to a hash
to produce a new valid hash don't work, because all hashes are
end-to-end encrypted across the circuit. The computational overhead
of computing the digests is minimal compared to the AES
encryption performed at each hop of the circuit. We use only four
bytes per cell to minimize overhead; the chance that an adversary will
correctly guess a valid hash
%, plus the payload of the current cell,
is acceptably low, given that Alice or Bob tear down the circuit if they
receive a bad hash.
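
A minimal sketch of this running digest follows; the seed string is a
placeholder for the actual key-derived value.

\begin{verbatim}
import hashlib

class StreamDigest:
    """Running SHA-1 digest, seeded with key-derived randomness."""
    def __init__(self, seed):
        self.d = hashlib.sha1(seed)  # only the two endpoints know seed

    def tag(self, cell):
        self.d.update(cell)          # digest covers all cells so far
        return self.d.digest()[:4]   # four bytes travel in the cell

alice = StreamDigest(b"derived-from-negotiated-key")  # placeholder
bob = StreamDigest(b"derived-from-negotiated-key")
for cell in (b"cell one", b"cell two"):
    assert bob.tag(cell) == alice.tag(cell)  # receiver verifies each tag
\end{verbatim}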
\SubSection{Rate limiting and fairness}
\label{subsec:rate-limit}

Volunteers are more willing to run services that can limit
their bandwidth usage. To accommodate them, Tor servers use a
token bucket approach~\cite{tannenbaum96} to
enforce a long-term average rate of incoming bytes, while still
permitting short-term bursts above the allowed bandwidth.
% Current bucket sizes are set to ten seconds' worth of traffic.

%Further, we want to avoid starving any Tor streams. Entire circuits
%could starve if we read greedily from connections and one connection
%uses all the remaining bandwidth. We solve this by dividing the number
%of tokens in the bucket by the number of connections that want to read,
%and reading at most that number of bytes from each connection. We iterate
%this procedure until the number of tokens in the bucket is under some
%threshold (currently 10KB), at which point we greedily read from connections.

Because the Tor protocol generates roughly the same number of outgoing
bytes as incoming bytes, it is sufficient in practice to limit only
incoming bytes.
With TCP streams, however, the correspondence is not one-to-one:
relaying a single incoming byte can require an entire 512-byte cell.
(We can't just wait for more bytes, because the local application may
be waiting for a reply.) Therefore, we treat this case as if the entire
cell size had been read, regardless of the fullness of the cell.
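
A token bucket along these lines might look as follows; the rate and
burst parameters are examples, and the final lines show the whole-cell
accounting just described.

\begin{verbatim}
import time

class TokenBucket:
    """Long-term average rate with short-term bursts."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst  # bytes/sec, bucket size
        self.tokens, self.last = burst, time.monotonic()

    def consume(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + self.rate * (now - self.last))
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False        # caller waits before reading more

bucket = TokenBucket(rate=50_000, burst=500_000)  # example values
if bucket.consume(512):     # account a whole cell, even for one byte
    pass                    # read from the connection
\end{verbatim}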
Further, inspired by Rennhard et al.'s design in~\cite{anonnet}, a
circuit's edges can heuristically distinguish interactive streams from bulk
streams by comparing the frequency with which they supply cells. We can
provide good latency for interactive streams by giving them preferential
service, while still giving good overall throughput to the bulk
streams. Such preferential treatment presents a possible end-to-end
attack, but an adversary observing both
ends of the stream can already learn this information through timing
attacks.
\SubSection{Congestion control}
\label{subsec:congestion}

Even with bandwidth rate limiting, we still need to worry about
congestion, either accidental or intentional. If enough users choose the
same OR-to-OR connection for their circuits, that connection can become
saturated. For example, an attacker could send a large file
through the Tor network to a webserver he runs, and then
refuse to read any of the bytes at the webserver end of the
circuit. Without some congestion control mechanism, these bottlenecks
can propagate back through the entire network. We don't need to
reimplement full TCP windows (with sequence numbers,
the ability to drop cells when we're full and retransmit later, and so
on),
because TCP already guarantees in-order delivery of each
cell.
%But we need to investigate further the effects of the current
%parameters on throughput and latency, while also keeping privacy in mind;
%see Section~\ref{sec:maintaining-anonymity} for more discussion.
We describe our response below.

\textbf{Circuit-level throttling:}
To control a circuit's bandwidth usage, each OR keeps track of two
windows. The \emph{packaging window} tracks how many relay data cells the OR is
allowed to package (from incoming TCP streams) for transmission back to the OP,
and the \emph{delivery window} tracks how many relay data cells it is willing
to deliver to TCP streams outside the network. Each window is initialized
(say, to 1000 data cells). When a data cell is packaged or delivered,
the appropriate window is decremented. When an OR has received enough
data cells (currently 100), it sends a \emph{relay sendme} cell towards the OP,
with streamID zero. When an OR receives a \emph{relay sendme} cell with
streamID zero, it increments its packaging window. Either of these cells
increments the corresponding window by 100. If the packaging window
reaches 0, the OR stops reading from TCP connections for all streams
on the corresponding circuit, and sends no more relay data cells until
receiving a \emph{relay sendme} cell.

The OP behaves identically, except that it must track a packaging window
and a delivery window for every OR in the circuit. If a packaging window
reaches 0, it stops reading from streams destined for that OR.
\textbf{Stream-level throttling}:
The stream-level congestion control mechanism is similar to the
circuit-level mechanism. ORs and OPs use \emph{relay sendme} cells
to implement end-to-end flow control for individual streams across
circuits. Each stream begins with a packaging window (currently 500 cells),
and increments the window by a fixed value (50) upon receiving a \emph{relay
sendme} cell. Rather than always returning a \emph{relay sendme} cell as soon
as enough cells have arrived, the stream-level congestion control also
has to check whether data has been successfully flushed onto the TCP
stream; it sends the \emph{relay sendme} cell only when the number of bytes pending
to be flushed is under some threshold (currently 10 cells' worth).
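
The bookkeeping for both levels can be sketched with a single window
class, using the constants given above.

\begin{verbatim}
class Window:
    """Packaging/delivery window with sendme-based increments."""
    def __init__(self, start, increment):
        self.value, self.increment = start, increment

    def package(self):
        # Decrement per relay data cell; at zero, stop reading from
        # the corresponding TCP connections until a sendme arrives.
        if self.value == 0:
            return False
        self.value -= 1
        return True

    def on_sendme(self):
        self.value += self.increment  # a sendme reopens the window

circuit_window = Window(1000, 100)    # values from the text above
stream_window = Window(500, 50)
\end{verbatim}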
%% Maybe omit this next paragraph. -NM
%Currently, non-data relay cells do not affect the windows. Thus we
%avoid potential deadlock issues, for example, arising because a stream
%can't send a \emph{relay sendme} cell when its packaging window is empty.

These arbitrarily chosen parameters seem to give tolerable throughput
and delay; see Section~\ref{sec:in-the-wild}.
\SubSection{Rendezvous Points and hidden services}
\label{subsec:rendezvous}

Rendezvous points are a building block for \emph{location-hidden
services} (also known as \emph{responder anonymity}) in the Tor
network. Location-hidden services allow Bob to offer a TCP
service, such as a webserver, without revealing his IP address.
This type of anonymity protects against distributed DoS attacks:
attackers are forced to attack the onion routing network
because they do not know Bob's IP address.

Our design for location-hidden servers has the following goals.
\textbf{Access-controlled:} Bob needs a way to filter incoming requests,
so an attacker cannot flood Bob simply by making many connections to him.
\textbf{Robust:} Bob should be able to maintain a long-term pseudonymous
identity even in the presence of router failure. Bob's service must
not be tied to a single OR, and Bob must be able to tie his service
to new ORs. \textbf{Smear-resistant:}
A social attacker who offers an illegal or disreputable location-hidden
service should not be able to ``frame'' a rendezvous router by
making observers believe the router created that service.
\textbf{Application-transparent:} Although we require users
to run special software to access location-hidden servers, we must not
require them to modify their applications.
We provide location-hiding for Bob by allowing him to advertise
several onion routers (his \emph{introduction points}) as contact
points. He may do this on any robust, efficient
key-value lookup system with authenticated updates, such as a
distributed hash table (DHT) like CFS~\cite{cfs:sosp01}.\footnote{
Rather than rely on an external infrastructure, the Onion Routing network
can run the DHT itself. At first, we can run a simple lookup
system on the
directory servers.} Alice, the client, chooses an OR as her
\emph{rendezvous point}. She connects to one of Bob's introduction
points, informs him of her rendezvous point, and then waits for him
to connect to the rendezvous point. This extra level of indirection
helps Bob's introduction points avoid problems associated with serving
unpopular files directly (for example, if Bob serves
material that the introduction point's community finds objectionable,
or if Bob's service tends to get attacked by network vandals).
The extra level of indirection also allows Bob to respond to some requests
and ignore others.

In Appendix~\ref{sec:rendezvous-specifics} we provide a more detailed
description of the rendezvous protocol, integration issues, attacks,
and related rendezvous work.

\Section{Other design decisions}
\label{sec:other-design}

\SubSection{Resource management and denial-of-service}
\label{subsec:dos}

Providing Tor as a public service creates many opportunities for
denial-of-service attacks against the network. While
flow control and rate limiting (discussed in
Section~\ref{subsec:congestion}) prevent users from consuming more
bandwidth than routers are willing to provide, opportunities remain for
users to consume more network resources than their fair share, or to
render the network unusable for others.

First of all, there are several CPU-consuming denial-of-service
attacks wherein an attacker can force an OR to perform expensive
cryptographic operations. For example, an attacker who sends a
\emph{create} cell full of junk bytes can force an OR to perform an RSA
decrypt. Similarly, an attacker can
fake the start of a TLS handshake, forcing the OR to carry out its
(comparatively expensive) half of the handshake at no real computational
cost to the attacker.

We have not yet implemented any defenses for these attacks, but several
approaches are possible. First, ORs can
require clients to solve a puzzle~\cite{puzzles-tls} while beginning new
TLS handshakes or accepting \emph{create} cells. So long as these
tokens are easy to verify and computationally expensive to produce, this
approach limits the attack multiplier. Additionally, ORs can limit
the rate at which they accept \emph{create} cells and TLS connections,
so that the computational work of processing them does not drown out the
symmetric cryptography operations that keep cells flowing. This rate
limiting could, however, allow an attacker to slow down other users when
they build new circuits.
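
As an illustration of the puzzle approach, the following Python sketch
gives one plausible hashcash-style construction: the OR issues a random
challenge, the client searches for a nonce whose hash has enough
leading zero bits, and the OR verifies with a single hash. This is a
sketch of the general idea in~\cite{puzzles-tls}, not a protocol Tor
implements, and the difficulty value is arbitrary.

\begin{verbatim}
import hashlib, os

DIFFICULTY = 20  # required leading zero bits (illustrative)

def make_challenge():
    return os.urandom(16)           # OR side: cheap to issue

def check(challenge, nonce):
    digest = hashlib.sha256(challenge +
                            nonce.to_bytes(8, "big")).digest()
    value = int.from_bytes(digest, "big")
    return value >> (256 - DIFFICULTY) == 0   # OR side: one hash

def solve(challenge):
    nonce = 0                       # client side: ~2^DIFFICULTY hashes
    while not check(challenge, nonce):
        nonce += 1
    return nonce
\end{verbatim}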

% What about link-to-link rate limiting?

Adversaries can also attack the Tor network's hosts and network
links. Disrupting a single circuit or link breaks all streams passing
along that part of the circuit. Users similarly lose service
when a router crashes or its operator restarts it. The current
Tor design treats such attacks as intermittent network failures, and
depends on users and applications to respond or recover as appropriate. A
future design could use an end-to-end TCP-like acknowledgment protocol,
so that no streams are lost unless the entry or exit point itself is
disrupted. This solution would require more buffering at the network
edges, however, and the performance and anonymity implications of this
extra complexity still require investigation.

\SubSection{Exit policies and abuse}
\label{subsec:exitpolicies}

% originally, we planned to put the "users only know the hostname,
% not the IP, but exit policies are by IP" problem here too. Not
% worth putting in the submission, but worth thinking about putting
% in sometime somehow. -RD

Exit abuse is a serious barrier to wide-scale Tor deployment. Anonymity
presents would-be vandals and abusers with an opportunity to hide
the origins of their activities. Attackers can harm the Tor network by
implicating exit servers for their abuse. Also, applications that commonly
use IP-based authentication (such as institutional mail or webservers)
can be fooled by the fact that anonymous connections appear to originate
at the exit OR.

We stress that Tor does not enable any new class of abuse. Spammers
and other attackers already have access to thousands of misconfigured
systems worldwide, and the Tor network is far from the easiest way
to launch attacks.
%Indeed, because of its limited
%anonymity, Tor is probably not a good way to commit crimes.
But because the
onion routers can be mistaken for the originators of the abuse,
and the volunteers who run them may not want to deal with the hassle of
explaining anonymity networks to irate administrators, we must block or limit
the abuse that travels through the Tor network.

To mitigate abuse issues, each onion router's \emph{exit policy}
describes to which external addresses and ports the router will
connect. On one end of the spectrum are \emph{open exit}
nodes that will connect anywhere. On the other end are \emph{middleman}
nodes that only relay traffic to other Tor nodes, and \emph{private exit}
nodes that only connect to a local host or network. Using a private
exit (if one exists) is a more secure way for a client to connect to a
given host or network---an external adversary cannot eavesdrop on traffic
between the private exit and the final destination, and so is less sure of
Alice's destination and activities. Most onion routers in the current
network function as \emph{restricted exits} that permit connections to
the world at large, but prevent access to certain abuse-prone addresses
and services such as SMTP.
Additionally, in some cases the OR can authenticate clients to
prevent exit abuse without harming anonymity~\cite{or-discex00}.
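
As a sketch of how such policies can be evaluated, the fragment below
models an exit policy as an ordered, first-match list of accept/reject
rules; the rule format is ours for illustration and is not Tor's
configuration syntax.

\begin{verbatim}
import ipaddress

POLICY = [  # (action, network, ports); ports=None means any port
    ("reject", "0.0.0.0/0", {25}),   # block SMTP everywhere
    ("reject", "10.0.0.0/8", None),  # never exit to private space
    ("accept", "0.0.0.0/0", None),   # otherwise, restricted exit
]

def exit_allowed(addr, port):
    for action, network, ports in POLICY:
        if (ipaddress.ip_address(addr) in
                ipaddress.ip_network(network)
                and (ports is None or port in ports)):
            return action == "accept"
    return False  # default-deny if no rule matches

# exit_allowed("192.0.2.7", 80) -> True  (web)
# exit_allowed("192.0.2.7", 25) -> False (SMTP rejected)
\end{verbatim}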

%The abuse issues on closed (e.g. military) networks are different
%from the abuse on open networks like the Internet. While these IP-based
%access controls are still commonplace on the Internet, on closed networks,
%nearly all participants will be honest, and end-to-end authentication
%can be assumed for important traffic.

Many administrators use port restrictions to support only a
limited set of services, such as HTTP, SSH, or AIM.
This is not a complete solution, of course, since abuse opportunities for
these protocols are still well known.

We have not yet encountered any abuse in the deployed network, but if
we do, we should consider using proxies to clean traffic for certain
protocols as it leaves the network. For example, much abusive HTTP
behavior (such as exploiting buffer overflows or well-known script
vulnerabilities) can be detected in a straightforward manner.
Similarly, one could run automatic spam filtering software (such as
SpamAssassin) on email exiting the OR network.

ORs may also rewrite exiting traffic to append
headers or other information indicating that the traffic has passed
through an anonymity service. This approach is commonly used
by email-only anonymity systems. ORs can also
run on servers with hostnames like {\tt anonymous} to further
alert abuse targets to the nature of the anonymous traffic.

A mixture of open and restricted exit nodes allows the most
flexibility for volunteers running servers. But while having many
middleman nodes provides a large and robust network,
having only a few exit nodes reduces the number of points
an adversary needs to monitor for traffic analysis, and places a
greater burden on the exit nodes. This tension can be seen in the
Java Anon Proxy
cascade model, wherein only one node in each cascade needs to handle
abuse complaints---but an adversary only needs to observe the entry
and exit of a cascade to perform traffic analysis on all that
cascade's users. The hydra model (many entries, few exits) presents a
different compromise: only a few exit nodes are needed, but an
adversary needs to work harder to watch all the clients; see
Section~\ref{sec:conclusion}.

Finally, we note that exit abuse must not be dismissed as a peripheral
issue: when a system's public image suffers, it can reduce the number
and diversity of that system's users, and thereby reduce the anonymity
of the system itself. Like usability, public perception is a
security parameter. Sadly, preventing abuse of open exit nodes is an
unsolved problem, and will probably remain an arms race for the
foreseeable future. The abuse problems faced by Princeton's CoDeeN
project~\cite{darkside} give us a glimpse of likely issues.

\SubSection{Directory Servers}
\label{subsec:dirservers}

First-generation Onion Routing designs~\cite{freedom2-arch,or-jsac98} used
in-band network status updates: each router flooded a signed statement
to its neighbors, which propagated it onward. But anonymizing networks
have different security goals than typical link-state routing protocols.
For example, delays (accidental or intentional)
that can cause different parts of the network to have different views
of link-state and topology are not only inconvenient: they give
attackers an opportunity to exploit differences in client knowledge.
We also worry about attacks to deceive a
client about the router membership list, topology, or current network
state. Such \emph{partitioning attacks} on client knowledge help an
adversary to efficiently deploy resources
against a target~\cite{minion-design}.

Tor uses a small group of redundant, well-known onion routers to
track changes in network topology and node state, including keys and
exit policies. Each such \emph{directory server} acts as an HTTP
server, so clients can fetch current network state
and router lists, and so other ORs can upload
state information. Onion routers periodically publish signed
statements of their state to each directory server. The directory servers
combine this information with their own views of network liveness,
and generate a signed description (a \emph{directory}) of the entire
network state. Client software is
pre-loaded with a list of the directory servers and their keys,
to bootstrap each client's view of the network.
% XXX this means that clients will be forced to upgrade as the
% XXX dirservers change or get compromised. argue that this is ok.

When a directory server receives a signed statement for an OR, it
checks whether the OR's identity key is recognized. Directory
servers do not automatically advertise unrecognized ORs. (If they did,
an adversary could take over the network by creating many
servers~\cite{sybil}.) Instead, new nodes must be approved by the
directory server administrator before they are included. Mechanisms for
automated node approval are an area of active research, and are discussed
more in Section~\ref{sec:maintaining-anonymity}.

Of course, a variety of attacks remain. An adversary who controls
a directory server can track clients by providing them different
information---perhaps by listing only nodes under its control, or by
informing only certain clients about a given node. Even an external
adversary can exploit differences in client knowledge: clients who use
a node listed on one directory server but not the others are vulnerable.

Thus these directory servers must be synchronized and redundant, so
that they can agree on a common directory. Clients should only trust
this directory if it is signed by a threshold of the directory
servers.

The directory servers in Tor are modeled after those in
Mixminion~\cite{minion-design}, but our situation is easier. First,
we make the
simplifying assumption that all participants agree on the set of
directory servers. Second, while Mixminion needs to predict node
behavior, Tor only needs a threshold consensus of the current
state of the network.

Tor directory servers build a consensus directory through a simple
four-round broadcast protocol. In round one, each server dates and
signs its current opinion, and broadcasts it to the other directory
servers; then in round two, each server rebroadcasts all the signed
opinions it has received. At this point all directory servers check
to see whether any server has signed multiple opinions in the same
period. Such a server is either broken or cheating, so the protocol
stops and notifies the administrators, who either remove the cheater
or wait for the broken server to be fixed. If there are no
discrepancies, each directory server then locally runs an algorithm
(described below) on the set of opinions, resulting in a uniform shared
directory. In round three, servers sign this directory and broadcast
it; and finally in round four the servers rebroadcast the directory and
all the signatures. If any directory server drops out of the network,
its signature is not included on the final directory.

The rebroadcast steps ensure that a directory server is heard by
either all of the other servers or none of them, even when some links
are down (assuming that any two directory servers can talk directly or
via a third). Broadcasts are feasible because there are relatively few
directory servers (currently 3, but we expect as many as 9 as the network
scales). Computing the shared directory locally is a straightforward
threshold voting process: we include an OR if a majority of directory
servers believe it to be good.
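
This local computation is simple enough to state as code. In the
Python sketch below (our names, not Tor's implementation), each
server's opinion is the set of ORs it believes to be good, and the
shared directory keeps those listed by a strict majority.

\begin{verbatim}
def compute_directory(opinions):
    """opinions: one set of approved ORs per directory server."""
    threshold = len(opinions) // 2 + 1       # strict majority
    counts = {}
    for opinion in opinions:
        for router in opinion:
            counts[router] = counts.get(router, 0) + 1
    return {r for r, n in counts.items() if n >= threshold}

# With 3 servers, an OR listed by any 2 of them is included.
\end{verbatim}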

To avoid attacks where a router connects to all the directory servers
but refuses to relay traffic from other routers, the directory servers
must build circuits and use them to anonymously test router
reliability~\cite{mix-acc}. Unfortunately, this defense is not yet
designed or implemented.

Using directory servers is simpler and more flexible than flooding.
Flooding is expensive, and complicates the analysis when we
start experimenting with non-clique network topologies. Signed
directories can be cached by other onion routers,
so directory servers are not a performance
bottleneck when we have many users, and do not aid traffic analysis by
forcing clients to periodically announce their existence to any
central point.

\Section{Attacks and Defenses}
\label{sec:attacks}

Below we summarize a variety of attacks, and discuss how well our
design withstands them.\\

\noindent{\large\bf Passive attacks}\\
\emph{Observing user traffic patterns.} Observing a user's connection
will not reveal her destination or data, but it will
reveal traffic patterns (both sent and received). Profiling via user
connection patterns requires further processing, because multiple
application streams may be operating simultaneously or in series over
a single circuit.

\emph{Observing user content.} While content at the user end is encrypted,
connections to responders may not be (indeed, the responding website
itself may be hostile). While filtering content is not a primary goal
of Onion Routing, Tor can directly use Privoxy and related
filtering services to anonymize application data streams.

\emph{Option distinguishability.} We allow clients to choose
configuration options. For example, clients concerned about request
linkability should rotate circuits more often than those concerned
about traceability. Allowing choice may attract users with different
%There is economic incentive to attract users by
%allowing this choice;
needs; but clients who are
in the minority may lose more anonymity by appearing distinct than they
gain by optimizing their behavior~\cite{econymics}.

\emph{End-to-end timing correlation.} Tor only minimally hides
such correlations. An attacker watching patterns of
traffic at the initiator and the responder will be
able to confirm the correspondence with high probability. The
greatest protection currently available against such confirmation is to hide
the connection between the onion proxy and the first Tor node,
by running the OP on the Tor node or behind a firewall. This approach
requires an observer to separate traffic originating at the onion
router from traffic passing through it: a global observer can do this,
but it might be beyond a limited observer's capabilities.

\emph{End-to-end size correlation.} Simple packet counting
will also be effective in confirming
endpoints of a stream. However, even without padding, we have some
limited protection: the leaky pipe topology means different numbers
of packets may enter one end of a circuit than exit at the other.

\emph{Website fingerprinting.} All the effective passive
attacks above are traffic confirmation attacks,
which puts them outside our design goals. There is also
a passive traffic analysis attack that is potentially effective.
Rather than searching exit connections for timing and volume
correlations, the adversary may build up a database of
``fingerprints'' containing file sizes and access patterns for
targeted websites. He can later confirm a user's connection to a given
site simply by consulting the database. This attack has
been shown to be effective against SafeWeb~\cite{hintz-pet02}.
It may be less effective against Tor, since
streams are multiplexed within the same circuit, and
fingerprinting will be limited to
the granularity of cells (currently 512 bytes). Additional
defenses could include
larger cell sizes, padding schemes to group websites
into large sets, and link
padding or long-range dummies.\footnote{Note that this fingerprinting
attack should not be confused with the much more complicated latency
attacks of~\cite{back01}, which require a fingerprint of the latencies
of all circuits through the network, combined with those from the
network edges to the target user and the responder website.}\\

\noindent{\large\bf Active attacks}\\
\emph{Compromise keys.} An attacker who learns the TLS session key can
see control cells and encrypted relay cells on every circuit on that
connection; learning a circuit
session key lets him unwrap one layer of the encryption. An attacker
who learns an OR's TLS private key can impersonate that OR for the TLS
key's lifetime, but he must
also learn the onion key to decrypt \emph{create} cells (and because of
perfect forward secrecy, he cannot hijack already established circuits
without also compromising their session keys). Periodic key rotation
limits the window of opportunity for these attacks. On the other hand,
an attacker who learns a node's identity key can replace that node
indefinitely by sending new forged descriptors to the directory servers.

\emph{Iterated compromise.} A roving adversary who can
compromise ORs (by system intrusion, legal coercion, or extralegal
coercion) could march down the circuit compromising the
nodes until he reaches the end. Unless the adversary can complete
this attack within the lifetime of the circuit, however, the ORs
will have discarded the necessary information before the attack can
be completed. (Thanks to the perfect forward secrecy of session
keys, the attacker cannot force nodes to decrypt recorded
traffic once the circuits have been closed.) Additionally, building
circuits that cross jurisdictions can make legal coercion
harder---this phenomenon is commonly called ``jurisdictional
arbitrage.'' The Java Anon Proxy project recently experienced the
need for this approach, when
a German court forced them to add a backdoor to
all of their nodes~\cite{jap-backdoor}.

\emph{Run a recipient.} An adversary running a webserver
trivially learns the timing patterns of users connecting to it, and
can introduce arbitrary patterns in its responses.
End-to-end attacks become easier: if the adversary can induce
users to connect to his webserver (perhaps by advertising
content targeted to those users), he now holds one end of their
connection. There is also a danger that application
protocols and associated programs can be induced to reveal information
about the initiator. Tor depends on Privoxy and similar protocol cleaners
to solve this latter problem.

\emph{Run an onion proxy.} It is expected that end users will
nearly always run their own local onion proxy. However, in some
settings, it may be necessary for the proxy to run
remotely---typically, in institutions that want
to monitor the activity of those connecting to the proxy.
Compromising an onion proxy compromises all future connections
through it.

\emph{DoS non-observed nodes.} An observer who can only watch some
of the Tor network can increase the value of this traffic
by attacking non-observed nodes to shut them down, reduce
their reliability, or persuade users that they are not trustworthy.
The best defense here is robustness.

\emph{Run a hostile OR.} In addition to being a local observer,
an isolated hostile node can create circuits through itself, or alter
traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
node must be immediately adjacent to both endpoints to compromise the
anonymity of a circuit. If an adversary can
run multiple ORs, and can persuade the directory servers
that those ORs are trustworthy and independent, then occasionally
some user will choose one of those ORs for the start and another
as the end of a circuit. If an adversary
controls $m>1$ out of $N$ nodes, he can correlate at most
$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an
adversary
could possibly attract a disproportionately large amount of traffic
by running an OR with a permissive exit policy, or by
degrading the reliability of other routers.
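For concreteness, an adversary who runs $m=3$ of $N=30$ otherwise
comparable nodes should expect to hold both ends of roughly
$\left(\frac{3}{30}\right)^2 = 1\%$ of circuits, setting aside the
exit-policy and reliability caveats just mentioned.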

\emph{Introduce timing into messages.} This is simply a stronger
version of the passive timing attacks discussed above.

\emph{Tagging attacks.} A hostile node could ``tag'' a
cell by altering it. If the
stream were, for example, an unencrypted request to a Web site,
the garbled content coming out at the appropriate time would confirm
the association. However, integrity checks on cells prevent
this attack.

\emph{Replace contents of unauthenticated protocols.} When
relaying an unauthenticated protocol like HTTP, a hostile exit node
can impersonate the target server. Clients
should prefer protocols with end-to-end authentication.

\emph{Replay attacks.} Some anonymity protocols are vulnerable
to replay attacks. Tor is not; replaying one side of a handshake
will result in a different negotiated session key, and so the rest
of the recorded session cannot be used.

\emph{Smear attacks.} An attacker could use the Tor network for
socially disapproved acts, to bring the
network into disrepute and get its operators to shut it down.
Exit policies reduce the possibilities for abuse, but
ultimately the network requires volunteers who can tolerate
some political heat.

\emph{Distribute hostile code.} An attacker could trick users
into running subverted Tor software that did not, in fact, anonymize
their connections---or worse, could trick ORs into running weakened
software that provided users with less anonymity. We address this
problem (but do not solve it completely) by signing all Tor releases
with an official public key, and including an entry in the directory
that lists which versions are currently believed to be secure. To
prevent an attacker from subverting the official release itself
(through threats, bribery, or insider attacks), we provide all
releases in source code form, encourage source audits, and
frequently warn our users never to trust any software (even from
us) that comes without source.\\

\noindent{\large\bf Directory attacks}\\
\emph{Destroy directory servers.} If a few directory
servers disappear, the others still decide on a valid
directory. So long as any directory servers remain in operation,
they will still broadcast their views of the network and generate a
consensus directory. (If more than half are destroyed, this
directory will not, however, have enough signatures for clients to
use it automatically; human intervention will be necessary for
clients to decide whether to trust the resulting directory.)

\emph{Subvert a directory server.} By taking over a directory server,
an attacker can partially influence the final directory. Since ORs
are included or excluded by majority vote, the corrupt directory server
can at worst cast a tie-breaking vote to decide whether to include
marginal ORs. It remains to be seen how often such marginal cases
occur in practice.

\emph{Subvert a majority of directory servers.} An adversary who controls
more than half the directory servers can include as many compromised
ORs in the final directory as he wishes. We must ensure that directory
server operators are independent and attack-resistant.

\emph{Encourage directory server dissent.} The directory
agreement protocol assumes that directory server operators agree on
the set of directory servers. An adversary who can persuade some
of the directory server operators to distrust one another could
split the quorum into mutually hostile camps, thus partitioning
users based on which directory they use. Tor does not address
this attack.

\emph{Trick the directory servers into listing a hostile OR.}
Our threat model explicitly assumes directory server operators will
be able to filter out most hostile ORs.
% If this is not true, an
% attacker can flood the directory with compromised servers.

\emph{Convince the directories that a malfunctioning OR is
working.} In the current Tor implementation, directory servers
assume that an OR is running correctly if they can start a TLS
connection to it. A hostile OR could easily subvert this test by
accepting TLS connections from ORs but ignoring all cells. Directory
servers must actively test ORs by building circuits and streams as
appropriate. The tradeoffs of a similar approach are discussed
in~\cite{mix-acc}.\\

\Section{Early experiences: Tor in the Wild}
\label{sec:in-the-wild}

As of mid-January 2004, the Tor network consists of 18 nodes
(16 in the US, 2 in Europe), and more are joining each week as the code
matures.\footnote{For comparison, the current remailer network
has about 30 reliable nodes. We have not asked PlanetLab to provide
Tor nodes, since their AUP would not allow exit nodes (see
also~\cite{darkside}) and because we aim to build a long-term community of
node operators and developers.} Each node has at least a 768Kb/768Kb
connection, and
many have 10Mb. The number of users varies (and of course, it is hard to
tell for sure), but we sometimes have several hundred users---administrators at
several companies have begun sending their entire departments' web
traffic through Tor, to block other divisions of
their company from reading their traffic. Tor users have reported using
the network for web browsing, FTP, IRC, AIM, Kazaa, and SSH.

Each Tor node currently processes roughly 800,000 relay
cells (a bit under half a gigabyte) per week. On average, about 80\%
of each 500-byte payload is full for cells going back to the client,
whereas about 40\% is full for cells coming from the client. (The difference
arises because most of the network's traffic is web browsing.) Interactive
traffic like SSH brings down the average a lot---once we have more
experience, and assuming we can resolve the anonymity issues, we may
partition traffic into two relay cell sizes: one to handle
bulk traffic and one for interactive traffic.
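(To check the volume figure above: $800{,}000$ cells per week at 512
bytes per cell is about 410 megabytes, a bit under half a gigabyte.)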

Thanks in part to our restrictive default exit policy (we
reject SMTP requests) and our low profile, we have had no abuse
issues since the network was deployed in October
2003. Our slow growth rate gives us time to add features,
resolve bugs, and get a feel for what users actually want from an
anonymity system. Even though having more users would bolster our
anonymity sets, we are not eager to attract the Kazaa or warez
communities---we feel that we must build a reputation for privacy, human
rights, research, and other socially laudable activities.

As for performance, profiling shows that Tor spends almost
all its CPU time in AES, which is fast. Current latency is attributable
to two factors. First, network latency is critical: we are
intentionally bouncing traffic around the world several times. Second,
our end-to-end congestion control algorithm focuses on protecting
volunteer servers from accidental DoS rather than on optimizing
performance. % Right now the first $500 \times 500\mbox{B}=250\mbox{KB}$
%of the stream arrives
%quickly, and after that throughput depends on the rate that \emph{relay
%sendme} acknowledgments arrive.
To quantify these effects, we did some informal tests using a network of 4
nodes on the same machine (a heavily loaded 1GHz Athlon). We downloaded a 60
megabyte file from {\tt debian.org} every 30 minutes for 54 hours (108 sample
points). It arrived in about 300 seconds on average, compared to 210s for a
direct download. We ran a similar test on the production Tor network,
fetching the front page of {\tt cnn.com} (55 kilobytes): %every 10 minutes for
%26 hours (156 sample points):
while a direct
download consistently took about 0.3s, the performance through Tor was highly
variable. Some downloads were as fast as 0.6s, with a median at 2.7s, and
80\% finishing within 5.7s. It seems that as the network expands, the chance
of building a slow circuit (one that includes a slow or heavily loaded node
or link) is increasing. On the other hand, so long as our users remain
satisfied with this increased latency, we can address performance
incrementally as we proceed with development.\footnote{For example, we have
just begun pushing a pipelining patch to the production network that seems
to decrease latency for medium-to-large files; we will present revised
benchmarks as they become available.}

%With the current network's topology and load, users can typically get 1-2
%megabits sustained transfer rate, which is good enough for now.
Indeed, the Tor
design aims foremost to provide a security research platform; performance
only needs to be sufficient to retain users~\cite{econymics,back01}.
We can tweak the congestion control
parameters to provide faster throughput at the cost of
larger buffers at each node; adding the heuristics mentioned in
Section~\ref{subsec:rate-limit} to favor low-volume
streams may also help. More research remains to find the
right balance.
% We should say _HOW MUCH_ latency there is in these cases. -NM

%performs badly on lossy networks. may need airhook or something else as
%transport alternative?

Although Tor's clique topology and full-visibility directories present
scaling problems, we still expect the network to support a few hundred
nodes and maybe 10,000 users before we are forced to become
more distributed. With luck, the experience we gain running the current
topology will help us choose among alternatives when the time comes.

\Section{Open Questions in Low-latency Anonymity}
\label{sec:maintaining-anonymity}

In addition to the non-goals in
Section~\ref{subsec:non-goals}, many other questions must be solved
before we can be confident of Tor's security.

Many of these open issues are questions of balance. For example,
how often should users rotate to fresh circuits? Frequent rotation
is inefficient, expensive, and may lead to intersection attacks and
predecessor attacks~\cite{wright03}, but infrequent rotation makes the
user's traffic linkable. Besides opening fresh circuits, clients can
also exit from the middle of the circuit,
or truncate and re-extend the circuit. More analysis is
needed to determine the proper tradeoff.

%% Duplicated by 'Better directory distribution' in section 9.

%A similar question surrounds timing of directory operations: how often
%should directories be updated? Clients that update infrequently receive
%an inaccurate picture of the network, but frequent updates can overload
%the directory servers. More generally, we must find more
%decentralized yet practical ways to distribute up-to-date snapshots of
%network status without introducing new attacks.

How should we choose path lengths? If Alice only ever uses two hops,
then both ORs can be certain that by colluding they will learn about
Alice and Bob. In our current approach, Alice always chooses at least
three nodes unrelated to herself and her destination.
%% This point is subtle, but not IMO necessary. Anybody who thinks
%% about it will see that it's implied by the above sentence; anybody
%% who doesn't think about it is safe in his ignorance.

%Thus normally she chooses
%three nodes, but if she is running an OR and her destination is on an OR,
%she uses five.
Should Alice choose a random path length (say, drawn from a geometric
distribution) to foil an attacker who
uses timing to learn that he is the fifth hop and thus concludes that
both Alice and the responder are on ORs?
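
As a sketch of what such a choice might look like (the distribution
parameter here is purely illustrative):

\begin{verbatim}
import random

def sample_path_length(p=0.5, minimum=3):
    # Shifted geometric distribution: at least `minimum' hops,
    # with each extra hop added with probability 1-p, so that no
    # relay can infer its position from the path length alone.
    length = minimum
    while random.random() >= p:
        length += 1
    return length
\end{verbatim}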

Throughout this paper, we have assumed that end-to-end traffic
confirmation will immediately and automatically defeat a low-latency
anonymity system. Even high-latency anonymity systems can be
vulnerable to end-to-end traffic confirmation, if the traffic volumes
are high enough, and if users' habits are sufficiently
distinct~\cite{statistical-disclosure,limits-open}. Can anything be
done to
make low-latency systems resist these attacks as well as high-latency
systems? Tor already makes some effort to conceal the starts and ends of
streams by wrapping long-range control commands in identical-looking
relay cells. Link padding could frustrate passive observers who count
packets; long-range padding could work against observers who own the
first hop in a circuit. But more research remains to find an efficient
and practical approach. Volunteers prefer not to run constant-bandwidth
padding; but no convincing traffic shaping approach has been
specified. Recent work on long-range padding~\cite{defensive-dropping}
shows promise. One could also try to reduce correlation in packet timing
by batching and re-ordering packets, but it is unclear whether this could
improve anonymity without introducing so much latency as to render the
network unusable.

A cascade topology may better defend against traffic confirmation by
aggregating users, and making padding and
mixing more affordable. Does the hydra topology (many input nodes,
few output nodes) work better against some adversaries? Are we going
to get a hydra anyway because most nodes will be middleman nodes?

Common wisdom suggests that Alice should run her own OR for best
anonymity, because traffic coming from her node could plausibly have
come from elsewhere. How much mixing does this approach need? Is it
immediately beneficial because of real-world adversaries that cannot
observe Alice's router, but can run routers of their own?

To scale to many users, and to prevent an attacker from observing the
whole network, it may be necessary
to support far more servers than Tor currently anticipates.
This introduces several issues. First, if approval by a central set
of directory servers is no longer feasible, what mechanism should be used
to prevent adversaries from signing up many colluding servers? Second,
if clients can no longer have a complete picture of the network,
how can they perform discovery while preventing attackers from
manipulating or exploiting gaps in their knowledge? Third, if there
are too many servers for every server to constantly communicate with
every other, which non-clique topology should the network use?
(Restricted-route topologies promise comparable anonymity with better
scalability~\cite{danezis-pets03}, but whatever topology we choose, we
need some way to keep attackers from manipulating their position within
it~\cite{casc-rep}.) Fourth, if no central authority is tracking
server reliability, how do we stop unreliable servers from making
the network unusable? Fifth, do clients receive so much anonymity
from running their own ORs that we should expect them all to do
so~\cite{econymics}, or do we need another incentive structure to
motivate them? Tarzan and MorphMix present possible solutions.

% advogato, captcha

When a Tor node goes down, all its circuits (and thus streams) must break.
Will users abandon the system because of this brittleness? How well
does the method in Section~\ref{subsec:dos} allow streams to survive
node failure? If affected users rebuild circuits immediately, how much
anonymity is lost? It seems the problem is even worse in a peer-to-peer
environment---such systems do not yet provide an incentive for peers to
stay connected when they are done retrieving content, so we would expect
a higher churn rate.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\Section{Future Directions}
\label{sec:conclusion}

Tor brings together many innovations into a unified deployable system. The
next immediate steps include:

\emph{Scalability:} Tor's emphasis on deployability and design simplicity
has led us to adopt a clique topology, semi-centralized
directories, and a full-network-visibility model for client
knowledge. These properties will not scale past a few hundred servers.
Section~\ref{sec:maintaining-anonymity} describes some promising
approaches, but more deployment experience will be helpful in learning
the relative importance of these bottlenecks.

\emph{Bandwidth classes:} This paper assumes that all ORs have
good bandwidth and latency. We should instead adopt the MorphMix model,
where nodes advertise their bandwidth level (DSL, T1, T3), and
Alice avoids bottlenecks by choosing nodes that match or
exceed her bandwidth. In this way DSL users can usefully join the Tor
network.

\emph{Incentives:} Volunteers who run nodes are rewarded with publicity
and possibly better anonymity~\cite{econymics}. More nodes means increased
scalability, and more users can mean more anonymity. We need to continue
examining the incentive structures for participating in Tor. Further,
we need to explore more approaches to limiting abuse, and understand
why most people do not bother using privacy systems.

\emph{Cover traffic:} Currently Tor omits cover traffic---its costs
in performance and bandwidth are clear but its security benefits are
not well understood. We must pursue more research on link-level cover
traffic and long-range cover traffic to determine whether some simple padding
method offers provable protection against our chosen adversary.

%%\emph{Offer two relay cell sizes:} Traffic on the Internet tends to be
%%large for bulk transfers and small for interactive traffic. One cell
%%size cannot be optimal for both types of traffic.
% This should go in the spec and todo, but not the paper yet. -RD

\emph{Caching at exit nodes:} Perhaps each exit node should run a
caching web proxy~\cite{shsm03}, to improve anonymity for cached pages
(Alice's request never
leaves the Tor network), to improve speed, and to reduce bandwidth cost.
On the other hand, forward security is weakened because caches
constitute a record of retrieved files. We must find the right
balance between usability and security.

\emph{Better directory distribution:}
Clients currently download a description of
the entire network every 15 minutes. As the state grows larger
and clients more numerous, we may need a solution in which
clients receive incremental updates to directory state.
More generally, we must find more
scalable yet practical ways to distribute up-to-date snapshots of
network status without introducing new attacks.

\emph{Implement location-hidden services:} The design in
Appendix~\ref{sec:rendezvous-specifics} has not yet been implemented.
While doing
so we are likely to encounter additional issues that must be resolved,
both in terms of usability and anonymity.

\emph{Further specification review:} Our public
byte-level specification~\cite{tor-spec} needs
external review. We hope that as Tor
is deployed, more people will examine its
specification.

\emph{Multisystem interoperability:} We are currently working with the
designer of MorphMix to unify the specification and implementation of
the common elements of our two systems. So far, this seems
to be relatively straightforward. Interoperability will allow testing
and direct comparison of the two designs for trust and scalability.

\emph{Wider-scale deployment:} The original goal of Tor was to
gain experience in deploying an anonymizing overlay network, and
learn from having actual users. We are now at a point in design
and development where we can start deploying a wider network. Once
we have many actual users, we will doubtlessly be better
able to evaluate some of our design decisions, including our
robustness/latency tradeoffs, our performance tradeoffs (including
cell size), our abuse-prevention mechanisms, and
our overall usability.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% commented out for anonymous submission
\section*{Acknowledgments}
We thank Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis,
John Bashinski, and Zack Brown
for editing and comments;
Matej Pfajfar, Andrei Serjantov, and Marc Rennhard for design discussions;
Bram Cohen for congestion control discussions;
Adam Back for suggesting telescoping circuits; and
Cathy Meadows for formal analysis of the \emph{extend} protocol.
This work has been supported by ONR and DARPA.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\bibliographystyle{latex8}
\bibliography{tor-design}

\newpage
\appendix

\Section{Rendezvous points and hidden services}
\label{sec:rendezvous-specifics}

In this appendix we provide specifics about the rendezvous points
of Section~\ref{subsec:rendezvous}. % We also describe integration
%issues and related work.
%, and how well the rendezvous point design
%withstands various attacks.
% (Not any more; it's currently commented out. -NM)

%\SubSection{Rendezvous protocol overview}

The following steps are
%We give an overview of the steps of a rendezvous. These are
performed on behalf of Alice and Bob by their local OPs;
application integration is described more fully below. A code
sketch of these steps follows the list.

\begin{tightlist}
\item Bob chooses some introduction points, and advertises them on
the DHT. He can add more later.
\item Bob builds a circuit to each of his introduction points,
and waits for requests.
\item Alice learns about Bob's service out of band (perhaps Bob told her,
or she found it on a website). She retrieves the details of Bob's
service from the DHT.
\item Alice chooses an OR as the rendezvous point (RP) for this
transaction. She builds a circuit to the RP, and gives it a
rendezvous cookie to recognize Bob.
\item Alice opens an anonymous stream to one of Bob's introduction
points, and gives it a message (encrypted to Bob's public key)
telling it about herself, her RP and rendezvous cookie, and the
start of a DH handshake. The introduction point sends the message
to Bob.
\item If Bob wants to talk to Alice, he builds a circuit to Alice's
RP and sends the rendezvous cookie, the second half of the DH
handshake, and a hash of the session key they now share. By the same
argument as in Section~\ref{subsubsec:constructing-a-circuit}, Alice
knows she shares the key only with Bob.
\item The RP connects Alice's circuit to Bob's. Note that the RP
cannot recognize Alice, Bob, or the data they transmit.
\item Alice sends a \emph{relay begin} cell along the circuit. It
arrives at Bob's OP, which connects to Bob's webserver.
\item An anonymous stream has been established, and Alice and Bob
communicate as normal.
\end{tightlist}
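
The following runnable Python toy restates these steps; the ``DHT''
and the rendezvous point are plain dictionaries, circuits and
cryptography are elided, and every name is ours for illustration
rather than part of any Tor interface.

\begin{verbatim}
import hashlib, os

dht = {}            # service ID -> advertised introduction points
waiting_at_rp = {}  # rendezvous cookie -> Alice's half

def bob_advertises(public_key, intro_points):     # steps 1-2
    service_id = hashlib.sha256(public_key).hexdigest()
    dht[service_id] = intro_points

def alice_connects(service_id, bob_decides):      # steps 3-8
    intro_points = dht[service_id]                # DHT lookup
    cookie = os.urandom(16)
    waiting_at_rp[cookie] = "alice circuit"       # Alice waits at RP
    # ...cookie and DH start travel to Bob via intro_points[0]...
    if bob_decides(cookie):                       # Bob may ignore her
        alice_half = waiting_at_rp.pop(cookie)    # RP splices circuits
        return (alice_half, "bob circuit")
    return None

key = b"bob introduction key"
bob_advertises(key, ["OR1", "OR2"])
sid = hashlib.sha256(key).hexdigest()
print(alice_connects(sid, bob_decides=lambda c: True))
\end{verbatim}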

When establishing an introduction point, Bob provides the onion router
with a public ``introduction'' key. The hash of this public key
identifies a unique service, and (since Bob signs his
messages) prevents anybody else from usurping Bob's introduction point
in the future. Bob uses the same public key to establish the other
introduction points for his service, and periodically refreshes his
entry in the DHT.

The message that Alice gives
the introduction point includes a hash of Bob's public key % to identify
%the service, along with
and an optional initial authorization token (the
introduction point can do prescreening, for example to block replays). Her
message to Bob may include an end-to-end authorization token so Bob
can choose whether to respond.
The authorization tokens can be used to provide selective access:
important users can get uninterrupted access.
%important users get tokens to ensure uninterrupted access. %to the
%service.
In normal circumstances, Bob's service might simply be offered
directly from mirrors, while Bob gives out tokens to high-priority users. If
the mirrors are knocked down,
%by distributed DoS attacks or even
%physical attack,
those users can switch to accessing Bob's service via
the Tor rendezvous system.

Bob's introduction points are themselves subject to DoS---he must
open many introduction points or risk such an attack.
He can provide selected users with a current list or future schedule of
unadvertised introduction points;
this is most practical
if there is a stable and large group of introduction points
available. Bob could also give secret public keys
for consulting the DHT\@. All of these approaches
limit exposure even when
some selected users collude in the DoS\@.

\SubSection{Integration with user applications}

Bob configures his onion proxy to know the local IP address and port of his
service, a strategy for authorizing clients, and a public key. Bob
publishes the public key, an expiration time, and
the current introduction points for his service into the DHT, indexed
by the hash of the public key. Bob's webserver is unmodified,
and does not even know that it is hidden behind the Tor network.

Alice's applications also work unchanged---her client interface
remains a SOCKS proxy. We encode all of the necessary information
into the fully qualified domain name Alice uses when establishing her
connection. Location-hidden services use a virtual top level domain
called {\tt .onion}: thus hostnames take the form {\tt x.y.onion} where
{\tt x} is the authorization cookie and {\tt y} encodes the hash of
the public key. Alice's onion proxy
examines addresses; if they are destined for a hidden server, it decodes
the key and starts the rendezvous as described above.
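
A sketch of that dispatch logic follows; the handling of the {\tt x}
and {\tt y} labels is schematic, since their exact encoding is not
specified here.

\begin{verbatim}
def parse_destination(hostname):
    labels = hostname.lower().split(".")
    if labels[-1] == "onion":
        if len(labels) == 3:          # x.y.onion
            cookie, key_hash = labels[0], labels[1]
        else:                         # y.onion: no cookie supplied
            cookie, key_hash = None, labels[0]
        return ("rendezvous", key_hash, cookie)
    return ("exit", hostname, None)   # ordinary traffic exits Tor

# parse_destination("x.y.onion") -> ("rendezvous", "y", "x")
\end{verbatim}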

\SubSection{Previous rendezvous work}

Rendezvous points in low-latency anonymity systems were first
described for use in ISDN telephony~\cite{jerichow-jsac98,isdn-mixes}.
Later low-latency designs used rendezvous points for hiding the location
of mobile phones and low-power location
trackers~\cite{federrath-ih96,reed-protocols97}. Rendezvous for
low-latency
Internet connections was suggested in early Onion Routing
work~\cite{or-ih96}, but the first published design was by Ian
Goldberg~\cite{ian-thesis}. His design differs from
ours in three ways. First, Goldberg suggests that Alice should manually
hunt down a current location of the service via Gnutella; our approach
makes lookup transparent to the user, as well as faster and more robust.
Second, in Tor the client and server negotiate session keys via
Diffie-Hellman, so plaintext is not exposed even at the rendezvous
point. Third, our design minimizes the exposure from running the
service, to encourage volunteers to offer introduction and rendezvous
services. Tor's introduction points do not output any bytes to the
clients; the rendezvous points do not know the client or the server,
and cannot read the data being transmitted. The indirection scheme is
also designed to include authentication/authorization---if Alice does not
include the right cookie with her request for service, Bob need not even
acknowledge his existence.

%\SubSection{Attacks against rendezvous points}

%We describe here attacks against rendezvous points and how well
%the system protects against them.

%\emph{Make many introduction requests.} An attacker could
%try to deny Bob service by flooding his introduction points with
%requests. Because the introduction points can block requests that
%lack authorization tokens, however, Bob can restrict the volume of
%requests he receives, or require a certain amount of computation for
%every request he receives.

%\emph{Attack an introduction point.} An attacker could
%disrupt a location-hidden service by disabling its introduction
%points. But because a service's identity is attached to its public
%key, the service can simply re-advertise
%itself at a different introduction point. Advertisements can also be
%done secretly so that only high-priority clients know the address of
%Bob's introduction points or so that different clients know of different
%introduction points. This forces the attacker to disable all possible
%introduction points.

%\emph{Compromise an introduction point.} An attacker who controls
%Bob's introduction point can flood Bob with
%introduction requests, or prevent valid introduction requests from
%reaching him. Bob can notice a flood, and close the circuit. To notice
%blocking of valid requests, however, he should periodically test the
%introduction point by sending rendezvous requests and making
%sure he receives them.

%\emph{Compromise a rendezvous point.} A rendezvous
%point is no more sensitive than any other OR on
%a circuit, since all data passing through the rendezvous is encrypted
%with a session key shared by Alice and Bob.

\end{document}

% Style guide:
% U.S. spelling
% avoid contractions (it's, can't, etc.)
% prefer ``for example'' or ``such as'' to e.g.
% prefer ``that is'' to i.e.
% 'mix', 'mixes' (as noun)
% 'mix-net'
% 'mix', 'mixing' (as verb)
% 'middleman' [Not with a hyphen; the hyphen has been optional
% since Middle English.]
% 'nymserver'
% 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
% 'Onion Routing design', 'onion router' [note capitalization]
% 'SOCKS'
% Try not to use \cite as a noun.
% 'Authorizating' sounds great, but it isn't a word.
% 'First, second, third', not 'Firstly, secondly, thirdly'.
% 'circuit', not 'channel'
% Typography: no space on either side of an em dash---ever.
% Hyphens are for multi-part words; en dashs imply movement or
% opposition (The Alice--Bob connection); and em dashes are
% for punctuation---like that.
% A relay cell; a control cell; a \emph{create} cell; a
% \emph{relay truncated} cell. Never ``a \emph{relay truncated}.''

% 'Substitute ``Damn'' every time you're inclined to write ``very;'' your
% editor will delete it and the writing will be just as it should be.'
% -- Mark Twain