\documentclass[twocolumn]{article}
\usepackage{usenix}

%\documentclass[times,10pt,twocolumn]{article}
%\usepackage{latex8}
%\usepackage{times}
\usepackage{url}
\usepackage{graphics}
\usepackage{amsmath}
\usepackage{epsfig}

\pagestyle{empty}

\renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}

\newcommand{\workingnote}[1]{} % The version that hides the note.
%\newcommand{\workingnote}[1]{(**#1)} % The version that makes the note visible.

% If an URL ends up with '%'s in it, that's because the line *in the .bib/.tex
% file* is too long, so break it there (it doesn't matter if the next line is
% indented with spaces). -DH

%\newif\ifpdf
%\ifx\pdfoutput\undefined
% \pdffalse
%\else
% \pdfoutput=1
% \pdftrue
%\fi

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
\setlength{\itemsep}{0mm}
\setlength{\parsep}{0mm}
% \setlength{\labelsep}{0mm}
% \setlength{\labelwidth}{0mm}
% \setlength{\topsep}{0mm}
}}{\end{list}}

% Cut down on whitespace above and below figures displayed at head/foot of
% page.
\setlength{\textfloatsep}{3mm}
% Cut down on whitespace above and below figures displayed in middle of page
\setlength{\intextsep}{3mm}

\begin{document}

%% Use dvipdfm instead. --DH
%\ifpdf
% \pdfcompresslevel=9
% \pdfpagewidth=\the\paperwidth
% \pdfpageheight=\the\paperheight
%\fi

\title{Tor: The Second-Generation Onion Router} %\\DRAFT VERSION}
% Putting the 'Private' back in 'Virtual Private Network'

\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
Paul Syverson \\ Naval Research Lab \\ syverson@itd.nrl.navy.mil}

\maketitle
\thispagestyle{empty}

\begin{abstract}
We present Tor, a circuit-based low-latency anonymous communication
service. This second-generation Onion Routing system addresses limitations
in the original design by adding perfect forward secrecy, congestion
control, directory servers, integrity checking, configurable exit policies,
and a practical design for location-hidden services via rendezvous
points. Tor works on the real-world
Internet, requires no special privileges or kernel modifications, requires
little synchronization or coordination between nodes, and provides a
reasonable tradeoff between anonymity, usability, and efficiency.
We briefly describe our experiences with an international network of
more than 30 nodes. % that has been running for several months.
We close with a list of open problems in anonymous communication.
\end{abstract}

%\begin{center}
%\textbf{Keywords:} anonymity, peer-to-peer, remailer, nymserver, reply block
%\end{center}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Overview}
\label{sec:intro}

Onion Routing is a distributed overlay network designed to anonymize
TCP-based applications like web browsing, secure shell,
and instant messaging. Clients choose a path through the network and
build a \emph{circuit}, in which each node (or ``onion router'' or ``OR'')
in the path knows its predecessor and successor, but no other nodes in
the circuit. Traffic flows down the circuit in fixed-size
\emph{cells}, which are unwrapped by a symmetric key at each node
(like the layers of an onion) and relayed downstream. The
Onion Routing project published several design and analysis
papers \cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion
Routing network was deployed briefly, the only long-running
public implementation was a fragile
proof-of-concept that ran on a single machine. Even this simple deployment
processed connections from over sixty thousand distinct IP addresses from
all over the world at a rate of about fifty thousand per day.
But many critical design and deployment issues were never
resolved, and the design has not been updated in years. Here
we describe Tor, a protocol for asynchronous, loosely federated onion
routers that provides the following improvements over the old Onion
Routing design:

\textbf{Perfect forward secrecy:} In the original Onion Routing design,
a single hostile node could record traffic and
later compromise successive nodes in the circuit and force them
to decrypt it. Rather than using a single multiply encrypted data
structure (an \emph{onion}) to lay each circuit,
Tor now uses an incremental or \emph{telescoping} path-building design,
where the initiator negotiates session keys with each successive hop in
the circuit. Once these keys are deleted, subsequently compromised nodes
cannot decrypt old traffic. As a side benefit, onion replay detection
is no longer necessary, and the process of building circuits is more
reliable, since the initiator knows when a hop fails and can then try
extending to a new node.

\textbf{Separation of ``protocol cleaning'' from anonymity:}
Onion Routing originally required a separate ``application
proxy'' for each supported application protocol---most of which were
never written, so many applications were never supported. Tor uses the
standard and near-ubiquitous SOCKS~\cite{socks4} proxy interface, allowing
us to support most TCP-based programs without modification. Tor now
relies on the filtering features of privacy-enhancing
application-level proxies such as Privoxy~\cite{privoxy}, without trying
to duplicate those features itself.

\textbf{No mixing, padding, or traffic shaping (yet):} Onion
Routing originally called for batching and reordering cells as they arrived,
assumed padding between ORs, and in
later designs added padding between onion proxies (users) and
ORs~\cite{or-ih96,or-jsac98}. Tradeoffs between padding protection
and cost were discussed, and \emph{traffic shaping} algorithms were
theorized~\cite{or-pet00} to provide good security without expensive
padding, but no concrete padding scheme was suggested.
Recent research~\cite{econymics}
and deployment experience~\cite{freedom21-security} suggest that this
level of resource use is not practical or economical; and even full
link padding is still vulnerable~\cite{defensive-dropping}. Thus,
until we have a proven and convenient design for traffic shaping or
low-latency mixing that improves anonymity against a realistic
adversary, we leave these strategies out.

\textbf{Many TCP streams can share one circuit:} Onion Routing originally
built a separate circuit for each
application-level request, but this required
multiple public key operations for every request, and also presented
a threat to anonymity from building so many circuits; see
Section~\ref{sec:maintaining-anonymity}. Tor multiplexes multiple TCP
streams along each circuit to improve efficiency and anonymity.

\textbf{Leaky-pipe circuit topology:} Through in-band signaling
within the circuit, Tor initiators can direct traffic to nodes partway
down the circuit. This novel approach
allows traffic to exit the circuit from the middle---possibly
frustrating traffic shape and volume attacks based on observing the end
of the circuit. (It also allows for long-range padding if
future research shows this to be worthwhile.)

\textbf{Congestion control:} Earlier anonymity designs do not
address traffic bottlenecks. Unfortunately, typical approaches to
load balancing and flow control in overlay networks involve inter-node
control communication and global views of traffic. Tor's decentralized
congestion control uses end-to-end acks to maintain anonymity
while allowing nodes at the edges of the network to detect congestion
or flooding and send less data until the congestion subsides.

\textbf{Directory servers:} The earlier Onion Routing design
planned to flood state information through the network---an approach
that can be unreliable and complex. % open to partitioning attacks.
Tor takes a simplified view toward distributing this
information. Certain more trusted nodes act as \emph{directory
servers}: they provide signed directories describing known
routers and their current state. Users periodically download them
via HTTP.

\textbf{Variable exit policies:} Tor provides a consistent mechanism
for each node to advertise a policy describing the hosts
and ports to which it will connect. These exit policies are critical
in a volunteer-based distributed infrastructure, because each operator
is comfortable with allowing different types of traffic to exit
from his node.

\textbf{End-to-end integrity checking:} The original Onion Routing
design did no integrity checking on data. Any node on the
circuit could change the contents of data cells as they passed by---for
example, to alter a connection request so it would connect
to a different webserver, or to `tag' encrypted traffic and look for
corresponding corrupted traffic at the network edges~\cite{minion-design}.
Tor hampers these attacks by verifying data integrity before it leaves
the network.

%\textbf{Improved robustness to failed nodes:} A failed node
%in the old design meant that circuit building failed, but thanks to
%Tor's step-by-step circuit building, users notice failed nodes
%while building circuits and route around them. Additionally, liveness
%information from directories allows users to avoid unreliable nodes in
%the first place.
%% Can't really claim this, now that we've found so many variants of
%% attack on partial-circuit-building. -RD

\textbf{Rendezvous points and hidden services:}
Tor provides an integrated mechanism for responder anonymity via
location-protected servers. Previous Onion Routing designs included
long-lived ``reply onions'' that could be used to build circuits
to a hidden server, but these reply onions did not provide forward
security, and became useless if any node in the path went down
or rotated its keys. In Tor, clients negotiate {\it rendezvous points}
to connect with hidden servers; reply onions are no longer required.

Unlike Freedom~\cite{freedom2-arch}, Tor does not require OS kernel
patches or network stack support. This prevents us from anonymizing
non-TCP protocols, but has greatly helped our portability and
deployability.

%Unlike Freedom~\cite{freedom2-arch}, Tor only anonymizes
%TCP-based protocols---not requiring patches (or built-in support) in an
%operating system's network stack has been valuable to Tor's
%portability and deployability.

We have implemented all of the above features, including rendezvous
points. Our source code is
available under a free license, and Tor
%, as far as we know, is unencumbered by patents.
is not covered by the patent that affected distribution and use of
earlier versions of Onion Routing.
We have deployed a wide-area alpha network
to test the design, to get more experience with usability
and users, and to provide a research platform for experimentation.
As of this writing, the network stands at 32 nodes %in thirteen
%distinct administrative domains
spread over two continents.

We review previous work in Section~\ref{sec:related-work}, describe
our goals and assumptions in Section~\ref{sec:assumptions},
and then address the above list of improvements in
Sections~\ref{sec:design},~\ref{sec:rendezvous}, and~\ref{sec:other-design}.
We summarize
in Section~\ref{sec:attacks} how our design stands up to
known attacks, and talk about our early deployment experiences in
Section~\ref{sec:in-the-wild}. We conclude with a list of open problems in
Section~\ref{sec:maintaining-anonymity} and future work for the Onion
Routing project in Section~\ref{sec:conclusion}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Related work}
\label{sec:related-work}

Modern anonymity systems date to Chaum's {\bf Mix-Net}
design~\cite{chaum-mix}. Chaum
proposed hiding the correspondence between sender and recipient by
wrapping messages in layers of public-key cryptography, and relaying them
through a path composed of ``mixes.'' Each mix in turn
decrypts, delays, and re-orders messages before relaying them
onward.
%toward their destinations.

Subsequent relay-based anonymity designs have diverged in two
main directions. Systems like {\bf Babel}~\cite{babel},
{\bf Mixmaster}~\cite{mixmaster-spec},
and {\bf Mixminion}~\cite{minion-design} have tried
to maximize anonymity at the cost of introducing comparatively large and
variable latencies. Because of this decision, these \emph{high-latency}
networks resist strong global adversaries,
but introduce too much lag for interactive tasks like web browsing,
Internet chat, or SSH connections.

Tor belongs to the second category: \emph{low-latency} designs that
try to anonymize interactive network traffic. These systems handle
a variety of bidirectional protocols. They also provide more convenient
mail delivery than the high-latency anonymous email
networks, because the remote mail server provides explicit and timely
delivery confirmation. But because these designs typically
involve many packets that must be delivered quickly, it is
difficult for them to prevent an attacker who can eavesdrop both ends of the
communication from correlating the timing and volume
of traffic entering the anonymity network with traffic leaving it~\cite{SS03}.
These
protocols are similarly vulnerable to an active adversary who introduces
timing patterns into traffic entering the network and looks
for correlated patterns among exiting traffic.
Although some work has been done to frustrate these attacks, most designs
protect primarily against traffic analysis rather than traffic
confirmation (see Section~\ref{subsec:threat-model}).

The simplest low-latency designs are single-hop proxies such as the
{\bf Anonymizer}~\cite{anonymizer}: a single trusted server strips the
data's origin before relaying it. These designs are easy to
analyze, but users must trust the anonymizing proxy.
Concentrating the traffic to this single point increases the anonymity set
(the people a given user is hiding among), but it is vulnerable if the
adversary can observe all traffic entering and leaving the proxy.

More complex are distributed-trust, circuit-based anonymizing systems.
In these designs, a user establishes one or more medium-term bidirectional
end-to-end circuits, and tunnels data in fixed-size cells.
Establishing circuits is computationally expensive and typically
requires public-key
cryptography, whereas relaying cells is comparatively inexpensive and
typically requires only symmetric encryption.
Because a circuit crosses several servers, and each server only knows
the adjacent servers in the circuit, no single server can link a
user to her communication partners.

The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared
routes known as \emph{cascades}. As with a single-hop proxy, this
approach aggregates users into larger anonymity sets, but again an
attacker only needs to observe both ends of the cascade to bridge all
the system's traffic. The Java Anon Proxy's design
calls for padding between end users and the head of the
cascade~\cite{web-mix}. However, it has not been demonstrated whether the
current implementation's padding policy improves anonymity.

{\bf PipeNet}~\cite{back01, pipenet}, another low-latency design proposed
around the same time as Onion Routing, gave
stronger anonymity but allowed a single user to shut
down the network by not sending. Systems like {\bf ISDN
mixes}~\cite{isdn-mixes} were designed for other environments with
different assumptions.
%XXX please can we fix this sentence to something less demeaning

In P2P designs like {\bf Tarzan}~\cite{tarzan:ccs02} and
{\bf MorphMix}~\cite{morphmix:fc04}, all participants both generate
traffic and relay traffic for others. These systems aim to conceal
whether a given peer originated a request
or just relayed it from another peer. While Tarzan and MorphMix use
layered encryption as above, {\bf Crowds}~\cite{crowds-tissec} simply assumes
an adversary who cannot observe the initiator: it uses no public-key
encryption, so any node on a circuit can read users' traffic.

{\bf Hordes}~\cite{hordes-jcs} is based on Crowds but also uses multicast
responses to hide the initiator. {\bf Herbivore}~\cite{herbivore} and
$\mbox{\bf P}^{\mathbf 5}$~\cite{p5} go even further, requiring broadcast.
These systems are designed primarily for communication among peers,
although Herbivore users can make external connections by
requesting a peer to serve as a proxy.

Systems like {\bf Freedom} and the original Onion Routing build circuits
all at once, using a layered ``onion'' of public-key encrypted messages,
each layer of which provides session keys and the address of the
next server in the circuit. Tor as described herein, Tarzan, MorphMix,
{\bf Cebolla}~\cite{cebolla}, and Rennhard's {\bf Anonymity Network}~\cite{anonnet}
build circuits
in stages, extending them one hop at a time.
Section~\ref{subsubsec:constructing-a-circuit} describes how this
approach enables perfect forward secrecy.

Circuit-based designs must choose which protocol layer
to anonymize. They may intercept IP packets directly, and
relay them whole (stripping the source address) along the
circuit~\cite{freedom2-arch,tarzan:ccs02}. Like
Tor, they may accept TCP streams and relay the data in those streams,
ignoring the breakdown of that data into TCP
segments~\cite{morphmix:fc04,anonnet}. Finally, like Crowds, they may accept
application-level protocols such as HTTP and relay the application
requests themselves.
Making this protocol-layer decision requires a compromise between flexibility
and anonymity. For example, a system that understands HTTP
can strip
identifying information from requests, can take advantage of caching
to limit the number of requests that leave the network, and can batch
or encode requests to minimize the number of connections.
On the other hand, an IP-level anonymizer can handle nearly any protocol,
even ones unforeseen by its designers (though these systems require
kernel-level modifications to some operating systems, and so are more
complex and less portable). TCP-level anonymity networks like Tor present
a middle approach: they are application neutral (so long as the
application supports, or can be tunneled across, TCP), but by treating
application connections as data streams rather than raw TCP packets,
they avoid the inefficiencies of tunneling TCP over
TCP.

Distributed-trust anonymizing systems need to prevent attackers from
adding too many servers and thus compromising user paths.
Tor relies on a small set of well-known directory servers, run by
independent parties, to decide which nodes can
join. Tarzan and MorphMix allow unknown users to run servers, and use
a limited resource (like IP addresses) to prevent an attacker from
controlling too much of the network. Crowds suggests requiring
written, notarized requests from potential crowd members.

Anonymous communication is essential for censorship-resistant
systems like Eternity~\cite{eternity}, Free~Haven~\cite{freehaven-berk},
Publius~\cite{publius}, and Tangler~\cite{tangler}. Tor's rendezvous
points enable connections between mutually anonymous entities; they
are a building block for location-hidden servers, which are needed by
Eternity and Free~Haven.

% didn't include rewebbers. No clear place to put them, so I'll leave
% them out for now. -RD

\section{Design goals and assumptions}
\label{sec:assumptions}

\noindent{\large\bf Goals}\\
Like other low-latency anonymity designs, Tor seeks to frustrate
attackers from linking communication partners, or from linking
multiple communications to or from a single user. Within this
main goal, however, several considerations have directed
Tor's evolution.

\textbf{Deployability:} The design must be deployed and used in the
real world. Thus it
must not be expensive to run (for example, by requiring more bandwidth
than volunteers are willing to provide); must not place a heavy
liability burden on operators (for example, by allowing attackers to
implicate onion routers in illegal activities); and must not be
difficult or expensive to implement (for example, by requiring kernel
patches, or separate proxies for every protocol). We also cannot
require non-anonymous parties (such as websites)
to run our software. (Our rendezvous point design does not meet
this goal for non-anonymous users talking to hidden servers,
however; see Section~\ref{sec:rendezvous}.)

\textbf{Usability:} A hard-to-use system has fewer users---and because
anonymity systems hide users among users, a system with fewer users
provides less anonymity. Usability is thus not only a convenience:
it is a security requirement~\cite{econymics,back01}. Tor should
therefore not
require modifying familiar applications; should not introduce prohibitive
delays;
and should require as few configuration decisions
as possible. Finally, Tor should be easily implementable on all common
platforms; we cannot require users to change their operating system
to be anonymous. (Tor currently runs on Win32, Linux,
Solaris, BSD-style Unix, MacOS X, and probably others.)

\textbf{Flexibility:} The protocol must be flexible and well-specified,
so Tor can serve as a test-bed for future research.
Many of the open problems in low-latency anonymity
networks, such as generating dummy traffic or preventing Sybil
attacks~\cite{sybil}, may be solvable independently from the issues
solved by
Tor. Hopefully future systems will not need to reinvent Tor's design.
%(But note that while a flexible design benefits researchers,
%there is a danger that differing choices of extensions will make users
%distinguishable. Experiments should be run on a separate network.)

\textbf{Simple design:} The protocol's design and security
parameters must be well-understood. Additional features impose implementation
and complexity costs; adding unproven techniques to the design threatens
deployability, readability, and ease of security analysis. Tor aims to
deploy a simple and stable system that integrates the best accepted
approaches to protecting anonymity.\\

\noindent{\large\bf Non-goals}\label{subsec:non-goals}\\
In favoring simple, deployable designs, we have explicitly deferred
several possible goals, either because they are solved elsewhere, or because
they are not yet solved.

\textbf{Not peer-to-peer:} Tarzan and MorphMix aim to scale to completely
decentralized peer-to-peer environments with thousands of short-lived
servers, many of which may be controlled by an adversary. This approach
is appealing, but still has many open
problems~\cite{tarzan:ccs02,morphmix:fc04}.

\textbf{Not secure against end-to-end attacks:} Tor does not claim
to completely solve end-to-end timing or intersection
attacks. Some approaches, such as having users run their own onion routers,
may help;
see Section~\ref{sec:maintaining-anonymity} for more discussion.

\textbf{No protocol normalization:} Tor does not provide \emph{protocol
normalization} like Privoxy or the Anonymizer. If senders want anonymity from
responders while using complex and variable
protocols like HTTP, Tor must be layered with a filtering proxy such
as Privoxy to hide differences between clients, and expunge protocol
features that leak identity.
Note that by this separation Tor can also provide services that
are anonymous to the network yet authenticated to the responder, like
SSH. Similarly, Tor does not integrate
tunneling for non-stream-based protocols like UDP; this must be
provided by an external service if appropriate.

\textbf{Not steganographic:} Tor does not try to conceal who is connected
to the network.

\subsection{Threat Model}
\label{subsec:threat-model}

A global passive adversary is the most commonly assumed threat when
analyzing theoretical anonymity designs. But like all practical
low-latency systems, Tor does not protect against such a strong
adversary. Instead, we assume an adversary who can observe some fraction
of network traffic; who can generate, modify, delete, or delay
traffic; who can operate onion routers of his own; and who can
compromise some fraction of the onion routers.

In low-latency anonymity systems that use layered encryption, the
adversary's typical goal is to observe both the initiator and the
responder. By observing both ends, passive attackers can confirm a
suspicion that Alice is
talking to Bob if the timing and volume patterns of the traffic on the
connection are distinct enough; active attackers can induce timing
signatures on the traffic to force distinct patterns. Rather
than focusing on these \emph{traffic confirmation} attacks,
we aim to prevent \emph{traffic
analysis} attacks, where the adversary uses traffic patterns to learn
which points in the network he should attack.

Our adversary might try to link an initiator Alice with her
communication partners, or try to build a profile of Alice's
behavior. He might mount passive attacks by observing the network edges
and correlating traffic entering and leaving the network---by
relationships in packet timing, volume, or externally visible
user-selected
options. The adversary can also mount active attacks by compromising
routers or keys; by replaying traffic; by selectively denying service
to trustworthy routers to move users to
compromised routers, or denying service to users to see if traffic
elsewhere in the
network stops; or by introducing patterns into traffic that can later be
detected. The adversary might subvert the directory servers to give users
differing views of network state. Additionally, he can try to decrease
the network's reliability by attacking nodes or by performing antisocial
activities from reliable nodes and trying to get them taken down---making
the network unreliable flushes users to other less anonymous
systems, where they may be easier to attack. We summarize
in Section~\ref{sec:attacks} how well the Tor design defends against
each of these attacks.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The Tor Design}
\label{sec:design}

The Tor network is an overlay network; each onion router (OR)
runs as a normal
user-level process without any special privileges.
Each onion router maintains a TLS~\cite{TLS}
connection to every other onion router.
%(We discuss alternatives to this clique-topology assumption in
%Section~\ref{sec:maintaining-anonymity}.)
% A subset of the ORs also act as
%directory servers, tracking which routers are in the network;
%see Section~\ref{subsec:dirservers} for directory server details.
Each user
runs local software called an onion proxy (OP) to fetch directories,
establish circuits across the network,
and handle connections from user applications. These onion proxies accept
TCP streams and multiplex them across the circuits. The onion
router on the other side
of the circuit connects to the requested destinations
and relays data.

Each onion router maintains a long-term identity key and a short-term
onion key. The identity
key is used to sign TLS certificates, to sign the OR's \emph{router
descriptor} (a summary of its keys, address, bandwidth, exit policy,
and so on), and (by directory servers) to sign directories. %Changing
%the identity key of a router is considered equivalent to creating a
%new router.
The onion key is used to decrypt requests
from users to set up a circuit and negotiate ephemeral keys.
The TLS protocol also establishes a short-term link key when communicating
between ORs. Short-term keys are rotated periodically and
independently, to limit the impact of key compromise.

Section~\ref{subsec:cells} presents the fixed-size
\emph{cells} that are the unit of communication in Tor. We describe
in Section~\ref{subsec:circuits} how circuits are
built, extended, truncated, and destroyed. Section~\ref{subsec:tcp}
describes how TCP streams are routed through the network. We address
integrity checking in Section~\ref{subsec:integrity-checking},
and resource limiting in Section~\ref{subsec:rate-limit}.
Finally,
Section~\ref{subsec:congestion} talks about congestion control and
fairness issues.

\subsection{Cells}
\label{subsec:cells}

Onion routers communicate with one another, and with users' OPs, via
TLS connections with ephemeral keys. Using TLS conceals the data on
the connection with perfect forward secrecy, and prevents an attacker
from modifying data on the wire or impersonating an OR.

Traffic passes along these connections in fixed-size cells. Each cell
is 512 bytes, %(but see Section~\ref{sec:conclusion} for a discussion of
%allowing large cells and small cells on the same network),
and consists of a header and a payload. The header includes a circuit
identifier (circID) that specifies which circuit the cell refers to
(many circuits can be multiplexed over the single TLS connection), and
a command to describe what to do with the cell's payload. (Circuit
identifiers are connection-specific: each circuit has a different
circID on each OP/OR or OR/OR connection it traverses.)
Based on their command, cells are either \emph{control} cells, which are
always interpreted by the node that receives them, or \emph{relay} cells,
which carry end-to-end stream data. The control cell commands are:
\emph{padding} (currently used for keepalive, but also usable for link
padding); \emph{create} or \emph{created} (used to set up a new circuit);
and \emph{destroy} (to tear down a circuit).

Relay cells have an additional header (the relay header) at the front
of the payload, containing a streamID (stream identifier: many streams can
be multiplexed over a circuit); an end-to-end checksum for integrity
checking; the length of the relay payload; and a relay command.
The entire contents of the relay header and the relay cell payload
are encrypted or decrypted together as the relay cell moves along the
circuit, using the 128-bit AES cipher in counter mode to generate a
cipher stream. The relay commands are: \emph{relay
data} (for data flowing down the stream), \emph{relay begin} (to open a
stream), \emph{relay end} (to close a stream cleanly), \emph{relay
teardown} (to close a broken stream), \emph{relay connected}
(to notify the OP that a relay begin has succeeded), \emph{relay
extend} and \emph{relay extended} (to extend the circuit by a hop,
and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
(to tear down only part of the circuit, and to acknowledge), \emph{relay
sendme} (used for congestion control), and \emph{relay drop} (used to
implement long-range dummies).
We give a visual overview of cell structure plus the details of relay
cell structure, and then describe each of these cell types and commands
in more detail below.

%\begin{figure}[h]
%\unitlength=1cm
%\centering
%\begin{picture}(8.0,1.5)
%\put(4,.5){\makebox(0,0)[c]{\epsfig{file=cell-struct,width=7cm}}}
%\end{picture}
%\end{figure}

\begin{figure}[h]
\centering
\mbox{\epsfig{figure=cell-struct,width=7cm}}
\end{figure}
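
To make the layout concrete, the following Python sketch packs and
unpacks cells as described above. It is illustrative only: the
512-byte total is fixed by the design, but the field widths and
constant names here are our assumptions, not the Tor wire format.
{\small
\begin{verbatim}
# Illustrative cell layout; field widths and names
# are assumptions of this sketch.
import struct

CELL_LEN = 512
HEADER_FMT = ">HB"   # circID (2 bytes), command (1)
HEADER_LEN = struct.calcsize(HEADER_FMT)
PAYLOAD_LEN = CELL_LEN - HEADER_LEN

(CMD_PADDING, CMD_CREATE, CMD_CREATED,
 CMD_DESTROY, CMD_RELAY) = range(5)

def pack_cell(circ_id, command, payload=b""):
    """Build one fixed-size cell, zero-padded."""
    assert len(payload) <= PAYLOAD_LEN
    return (struct.pack(HEADER_FMT, circ_id, command)
            + payload.ljust(PAYLOAD_LEN, b"\0"))

def unpack_cell(cell):
    """Recover (circID, command, payload)."""
    assert len(cell) == CELL_LEN
    circ_id, command = struct.unpack(
        HEADER_FMT, cell[:HEADER_LEN])
    return circ_id, command, cell[HEADER_LEN:]
\end{verbatim}
}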

\subsection{Circuits and streams}
\label{subsec:circuits}

Onion Routing originally built one circuit for each
TCP stream. Because building a circuit can take several tenths of a
second (due to public-key cryptography and network latency),
this design imposed high costs on applications like web browsing that
open many TCP streams.

In Tor, each circuit can be shared by many TCP streams. To avoid
delays, users construct circuits preemptively. To limit linkability
among their streams, users' OPs build a new circuit
periodically if the previous ones have been used,
and expire old used circuits that no longer have any open streams.
OPs consider rotating to a new circuit once a minute: thus
even heavy users spend negligible time
building circuits, but a limited number of requests can be linked
to each other through a given exit node. Also, because circuits are built
in the background, OPs can recover from failed circuit creation
without harming user experience.\\

\begin{figure}[h]
\centering
\mbox{\epsfig{figure=interaction,width=8.75cm}}
\caption{Alice builds a two-hop circuit and begins fetching a web page.}
\label{fig:interaction}
\end{figure}

\noindent{\large\bf Constructing a circuit}\label{subsubsec:constructing-a-circuit}\\
%\subsubsection{Constructing a circuit}
A user's OP constructs circuits incrementally, negotiating a
symmetric key with each OR on the circuit, one hop at a time. To begin
creating a new circuit, the OP (call her Alice) sends a
\emph{create} cell to the first node in her chosen path (call him Bob).
(She chooses a new
circID $C_{AB}$ not currently used on the connection from her to Bob.)
The \emph{create} cell's
payload contains the first half of the Diffie-Hellman handshake
($g^x$), encrypted to the onion key of Bob. Bob
responds with a \emph{created} cell containing $g^y$
along with a hash of the negotiated key $K=g^{xy}$.

Once the circuit has been established, Alice and Bob can send one
another relay cells encrypted with the negotiated
key.\footnote{Actually, the negotiated key is used to derive two
symmetric keys: one for each direction.} More detail is given in
the next section.

To extend the circuit further, Alice sends a \emph{relay extend} cell
to Bob, specifying the address of the next OR (call her Carol), and
an encrypted $g^{x_2}$ for her. Bob copies the half-handshake into a
\emph{create} cell, and passes it to Carol to extend the circuit.
(Bob chooses a new circID $C_{BC}$ not currently used on the connection
between him and Carol. Alice never needs to know this circID; only Bob
associates $C_{AB}$ on his connection with Alice to $C_{BC}$ on
his connection with Carol.)
When Carol responds with a \emph{created} cell, Bob wraps the payload
into a \emph{relay extended} cell and passes it back to Alice. Now
the circuit is extended to Carol, and Alice and Carol share a common key
$K_2 = g^{x_2 y_2}$.

To extend the circuit to a third node or beyond, Alice
proceeds as above, always telling the last node in the circuit to
extend one hop further.

This circuit-level handshake protocol achieves unilateral entity
authentication (Alice knows she's handshaking with the OR, but
the OR doesn't care who is opening the circuit---Alice uses no public key
and remains anonymous) and unilateral key authentication
(Alice and the OR agree on a key, and Alice knows only the OR learns
it). It also achieves forward
secrecy and key freshness. More formally, the protocol is as follows
(where $E_{PK_{Bob}}(\cdot)$ is encryption with Bob's public key,
$H$ is a secure hash function, and $|$ is concatenation):
\begin{equation*}
\begin{aligned}
\mathrm{Alice} \rightarrow \mathrm{Bob}&: E_{PK_{Bob}}(g^x) \\
\mathrm{Bob} \rightarrow \mathrm{Alice}&: g^y, H(K | \mathrm{``handshake"}) \\
\end{aligned}
\end{equation*}

\noindent In the second step, Bob proves that it was he who received $g^x$,
and who chose $y$. We use PK encryption in the first step
(rather than, say, using the first two steps of STS, which has a
signature in the second step) because a single cell is too small to
hold both a public key and a signature. Preliminary analysis with the
NRL protocol analyzer~\cite{meadows96} shows this protocol to be
secure (including perfect forward secrecy) under the
traditional Dolev-Yao model.\\
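
For intuition, here is a toy Python rendering of this handshake. It
is a sketch, not an implementation: the group parameters below are
insecure stand-ins chosen for brevity (Tor's real Diffie-Hellman
parameters differ), and the encryption of $g^x$ to Bob's onion key
is elided.
{\small
\begin{verbatim}
# Toy handshake: models only the message flow and
# the H(K | "handshake") check; E_PK_Bob() elided.
import hashlib, secrets

P = 2**255 - 19   # illustrative prime; not Tor's
G = 2

def proof(K):
    """H(K | "handshake") from the protocol above."""
    kb = K.to_bytes((K.bit_length() + 7) // 8, "big")
    return hashlib.sha1(kb + b"handshake").digest()

# Alice -> Bob: g^x, encrypted to Bob's onion key
x = secrets.randbelow(P - 2) + 1
g_x = pow(G, x, P)

# Bob -> Alice: g^y, H(K | "handshake")
y = secrets.randbelow(P - 2) + 1
g_y = pow(G, y, P)
h = proof(pow(g_x, y, P))

# Alice checks the hash, confirming that Bob
# received g^x and chose y.
assert proof(pow(g_y, x, P)) == h
\end{verbatim}
}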

\noindent{\large\bf Relay cells}\\
%\subsubsection{Relay cells}

Once Alice has established the circuit (so she shares keys with each
OR on the circuit), she can send relay cells.
%Recall that every relay cell has a streamID that indicates to which
%stream the cell belongs. %This streamID allows a relay cell to be
%addressed to any OR on the circuit.
Upon receiving a relay
cell, an OR looks up the corresponding circuit, and decrypts the relay
header and payload with the session key for that circuit.
If the cell is headed away from Alice, the OR then checks whether the
decrypted cell has a valid digest (as an optimization, the first
two bytes of the integrity check are zero, so in most cases we can avoid
computing the hash).
%is recognized---either because it
%corresponds to an open stream at this OR for the given circuit, or because
%it is the control streamID (zero).
If valid, it accepts the relay cell and processes it as described
below. Otherwise,
the OR looks up the circID and OR for the
next step in the circuit, replaces the circID as appropriate, and
sends the decrypted relay cell to the next OR. (If the OR at the end
of the circuit receives an unrecognized relay cell, an error has
occurred, and the circuit is torn down.)

OPs treat incoming relay cells similarly: they iteratively unwrap the
relay header and payload with the session keys shared with each
OR on the circuit, from the closest to farthest.
If at any stage the digest is valid, the cell must have
originated at the OR whose encryption has just been removed.

To construct a relay cell addressed to a given OR, Alice assigns the
digest, and then iteratively
encrypts the cell payload (that is, the relay header and payload) with
the symmetric key of each hop up to that OR. Because the digest is
encrypted to a different value at each step, only at the targeted OR
will it have a meaningful value.\footnote{
% Should we just say that 2^56 is itself negligible?
% Assuming 4-hop circuits with 10 streams per hop, there are 33
% possible bad streamIDs before the last circuit. This still
% gives an error only once every 2 million terabytes (approx).
With 48 bits of digest per cell, the probability of an accidental
collision is far lower than the chance of hardware failure.}
This \emph{leaky pipe} circuit topology
allows Alice's streams to exit at different ORs on a single circuit.
Alice may choose different exit points because of their exit policies,
or to keep the ORs from knowing that two streams
originate from the same person.

When an OR later replies to Alice with a relay cell, it
encrypts the cell's relay header and payload with the single key it
shares with Alice, and sends the cell back toward Alice along the
circuit. Subsequent ORs add further layers of encryption as they
relay the cell back to Alice.
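
The layering can be sketched in Python as follows. To keep the sketch
dependency-free, a SHA-1-based XOR keystream stands in for the 128-bit
AES counter-mode cipher, and the digest check is omitted; keys and
payloads are illustrative.
{\small
\begin{verbatim}
# Leaky-pipe layering sketch (toy cipher, no digest).
import hashlib

def stream(key, cell_idx, n=509):
    """Toy per-cell keystream, standing in for AES-CTR."""
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha1(
            key + bytes([cell_idx, i])).digest()
        i += 1
    return out[:n]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def alice_wrap(keys, payload, idx):
    # The outermost layer belongs to the first hop,
    # so Alice applies layers in reverse path order.
    for k in reversed(keys):
        payload = xor(payload, stream(k, idx))
    return payload

def or_unwrap(key, payload, idx):
    # Each OR strips exactly one layer, then relays.
    return xor(payload, stream(key, idx))

keys = [b"hop1", b"hop2", b"hop3"]  # from handshakes
cell = alice_wrap(
    keys, b"relay data ...".ljust(509, b"\0"), 0)
for k in keys:        # each OR, in path order
    cell = or_unwrap(k, cell, 0)
assert cell.startswith(b"relay data")
\end{verbatim}
}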

To tear down a circuit, Alice sends a \emph{destroy} control
cell. Each OR in the circuit receives the \emph{destroy} cell, closes
all streams on that circuit, and passes a new \emph{destroy} cell
forward. But just as circuits are built incrementally, they can also
be torn down incrementally: Alice can send a \emph{relay
truncate} cell to a single OR on a circuit. That OR then sends a
\emph{destroy} cell forward, and acknowledges with a
\emph{relay truncated} cell. Alice can then extend the circuit to
different nodes, without signaling to the intermediate nodes (or
a limited observer) that she has changed her circuit.
Similarly, if a node on the circuit goes down, the adjacent
node can send a \emph{relay truncated} cell back to Alice. Thus the
``break a node and see which circuits go down''
attack~\cite{freedom21-security} is weakened.

\subsection{Opening and closing streams}
\label{subsec:tcp}

When Alice's application wants a TCP connection to a given
address and port, it asks the OP (via SOCKS) to make the
connection. The OP chooses the newest open circuit (or creates one if
needed), and chooses a suitable OR on that circuit to be the
exit node (usually the last node, but maybe others due to exit policy
conflicts; see Section~\ref{subsec:exitpolicies}.) The OP then opens
the stream by sending a \emph{relay begin} cell to the exit node,
using a new random streamID. Once the
exit node connects to the remote host, it responds
with a \emph{relay connected} cell. Upon receipt, the OP sends a
SOCKS reply to notify the application of its success. The OP
now accepts data from the application's TCP stream, packaging it into
\emph{relay data} cells and sending those cells along the circuit to
the chosen OR.

There's a catch to using SOCKS, however---some applications pass the
alphanumeric hostname to the Tor client, while others resolve it into
an IP address first and then pass the IP address to the Tor client. If
the application does DNS resolution first, Alice thereby reveals her
destination to the remote DNS server, rather than sending the hostname
through the Tor network to be resolved at the far end. Common applications
like Mozilla and SSH have this flaw.

With Mozilla, the flaw is easy to address: the filtering HTTP
proxy called Privoxy gives a hostname to the Tor client, so Alice's
computer never does DNS resolution.
But a portable general solution, such as is needed for
SSH, is
an open problem. Modifying or replacing the local nameserver
can be invasive, brittle, and unportable. Forcing the resolver
library to prefer TCP rather than UDP is hard, and also has
portability problems. Dynamically intercepting system calls to the
resolver library seems a promising direction. We could also provide
a tool similar to \emph{dig} to perform a private lookup through the
Tor network. Currently, we encourage the use of privacy-aware proxies
like Privoxy wherever possible.
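
As a concrete illustration, the sketch below issues a SOCKS4a
request, which carries the hostname itself so that resolution happens
at the exit node. It assumes an OP listening on the conventional
local SOCKS port 9050, and omits most error handling.
{\small
\begin{verbatim}
# Hand the hostname (not an IP) to the OP, so DNS
# resolution happens at the far end of the circuit.
import socket, struct

def socks4a_connect(host, port,
                    proxy=("127.0.0.1", 9050)):
    s = socket.create_connection(proxy)
    # VN=4, CD=1 (CONNECT); destination IP 0.0.0.1
    # signals "hostname follows" per SOCKS4a.
    s.sendall(struct.pack(">BBH", 4, 1, port)
              + b"\x00\x00\x00\x01"
              + b"\x00"              # empty userid
              + host.encode() + b"\x00")
    vn, cd, _, _ = struct.unpack(">BBHI", s.recv(8))
    if cd != 90:                     # 90 = granted
        raise OSError("SOCKS refused: %d" % cd)
    return s    # a normal TCP stream, now via Tor
\end{verbatim}
}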

Closing a Tor stream is analogous to closing a TCP stream: it uses a
two-step handshake for normal operation, or a one-step handshake for
errors. If the stream closes abnormally, the adjacent node simply sends a
\emph{relay teardown} cell. If the stream closes normally, the node sends
a \emph{relay end} cell down the circuit, and the other side responds with
its own \emph{relay end} cell. Because
all relay cells use layered encryption, only the destination OR knows
that a given relay cell is a request to close a stream. This two-step
handshake allows Tor to support TCP-based applications that use half-closed
connections.
% such as broken HTTP clients that close their side of the
%stream after writing but are still willing to read.

\subsection{Integrity checking on streams}
\label{subsec:integrity-checking}

Because the old Onion Routing design used a stream cipher without integrity
checking, traffic was
vulnerable to a malleability attack: though the attacker could not
decrypt cells, any changes to encrypted data
would create corresponding changes to the data leaving the network.
This weakness allowed an adversary who could guess the encrypted content
to change a padding cell to a destroy
cell; change the destination address in a \emph{relay begin} cell to the
adversary's webserver; or change an FTP command from
{\tt dir} to {\tt rm~*}. (Even an external
adversary could do this, because the link encryption similarly used a
stream cipher.)

Because Tor uses TLS on its links, external adversaries cannot modify
data. Addressing the insider malleability attack, however, is
more complex.

We could do integrity checking of the relay cells at each hop, either
by including hashes or by using an authenticating cipher mode like
EAX~\cite{eax}, but there are some problems. First, these approaches
impose a message-expansion overhead at each hop, and so we would have to
either leak the path length or waste bytes by padding to a maximum
path length. Second, these solutions can only verify traffic coming
from Alice: ORs would not be able to produce suitable hashes for
the intermediate hops, since the ORs on a circuit do not know the
other ORs' session keys. Third, we have already accepted that our design
is vulnerable to end-to-end timing attacks; so tagging attacks performed
within the circuit provide no additional information to the attacker.

Thus, we check integrity only at the edges of each stream. (Remember that
in our leaky-pipe circuit topology, a stream's edge could be any hop
in the circuit.) When Alice
negotiates a key with a new hop, they each initialize a SHA-1
digest with a derivative of that key,
thus beginning with randomness that only the two of them know.
Then they each incrementally add to the SHA-1 digest the contents of
all relay cells they create, and include with each relay cell the
first four bytes of the current digest. Each also keeps a SHA-1
digest of data received, to verify that the received hashes are correct.

To be sure of removing or modifying a cell, the attacker must be able
to deduce the current digest state (which depends on all
traffic between Alice and Bob, starting with their negotiated key).
Attacks on SHA-1 where the adversary can incrementally add to a hash
to produce a new valid hash don't work, because all hashes are
end-to-end encrypted across the circuit. The computational overhead
of computing the digests is minimal compared to doing the AES
encryption performed at each hop of the circuit. We use only four
bytes per cell to minimize overhead; the chance that an adversary will
correctly guess a valid hash
%, plus the payload the current cell,
is
acceptably low, given that the OP or OR tear down the circuit if they
receive a bad hash.
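
A minimal sketch of this running-digest scheme follows, assuming both
ends derive the same seed from the negotiated key (the real key
derivation and cell formatting differ).
{\small
\begin{verbatim}
# Both edges seed a running SHA-1 with shared key
# material and tag each cell with 4 digest bytes.
import hashlib

class EdgeDigest:
    def __init__(self, seed):
        self.h = hashlib.sha1(seed)

    def tag(self, cell_body):
        """Sender: absorb the cell, emit its tag."""
        self.h.update(cell_body)
        return self.h.digest()[:4]

    def verify(self, cell_body, tag):
        """Receiver: recompute and compare; on a
        mismatch the circuit is torn down."""
        self.h.update(cell_body)
        return self.h.digest()[:4] == tag

alice = EdgeDigest(b"derived-from-session-key")
bob   = EdgeDigest(b"derived-from-session-key")
msg = b"relay data ..."
assert bob.verify(msg, alice.tag(msg))
\end{verbatim}
}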

\subsection{Rate limiting and fairness}
\label{subsec:rate-limit}

Volunteers are more willing to run services that can limit
their bandwidth usage. To accommodate them, Tor servers use a
token bucket approach~\cite{tannenbaum96} to
enforce a long-term average rate of incoming bytes, while still
permitting short-term bursts above the allowed bandwidth.
% Current bucket sizes are set to ten seconds' worth of traffic.
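
A minimal token-bucket sketch (rate and burst parameters invented for
illustration):
{\small
\begin{verbatim}
# Long-term average rate, short-term bursts allowed.
import time

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate    # bytes per second
        self.burst = burst  # bucket capacity, bytes
        self.tokens = burst
        self.last = time.monotonic()

    def consume(self, nbytes):
        """Spend tokens to read nbytes; if False,
        the caller should wait and retry."""
        now = time.monotonic()
        self.tokens = min(
            self.burst,
            self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

bucket = TokenBucket(rate=100000, burst=1000000)
\end{verbatim}
}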

%Further, we want to avoid starving any Tor streams. Entire circuits
%could starve if we read greedily from connections and one connection
%uses all the remaining bandwidth. We solve this by dividing the number
%of tokens in the bucket by the number of connections that want to read,
%and reading at most that number of bytes from each connection. We iterate
%this procedure until the number of tokens in the bucket is under some
%threshold (currently 10KB), at which point we greedily read from connections.

Because the Tor protocol outputs about the same number of bytes as it
takes in, it is sufficient in practice to limit only incoming bytes.
With TCP streams, however, the correspondence is not one-to-one:
relaying a single incoming byte can require an entire 512-byte cell.
(We can't just wait for more bytes, because the local application may
be awaiting a reply.) Therefore, we treat this case as if the entire
cell size had been read, regardless of the cell's fullness.

Further, inspired by Rennhard et al.'s design in~\cite{anonnet}, a
circuit's edges can heuristically distinguish interactive streams from bulk
streams by comparing the frequency with which they supply cells. We can
provide good latency for interactive streams by giving them preferential
service, while still giving good overall throughput to the bulk
streams. Such preferential treatment presents a possible end-to-end
attack, but an adversary observing both
ends of the stream can already learn this information through timing
attacks.

\subsection{Congestion control}
\label{subsec:congestion}

Even with bandwidth rate limiting, we still need to worry about
congestion, either accidental or intentional. If enough users choose the
same OR-to-OR connection for their circuits, that connection can become
saturated. For example, an attacker could send a large file
through the Tor network to a webserver he runs, and then
refuse to read any of the bytes at the webserver end of the
circuit. Without some congestion control mechanism, these bottlenecks
can propagate back through the entire network. We don't need to
reimplement full TCP windows (with sequence numbers,
the ability to drop cells when we're full and retransmit later, and so
on),
because TCP already guarantees in-order delivery of each
cell.
%But we need to investigate further the effects of the current
%parameters on throughput and latency, while also keeping privacy in mind;
%see Section~\ref{sec:maintaining-anonymity} for more discussion.
We describe our response below.

\textbf{Circuit-level throttling:}
To control a circuit's bandwidth usage, each OR keeps track of two
windows. The \emph{packaging window} tracks how many relay data cells the OR is
allowed to package (from incoming TCP streams) for transmission back to the OP,
and the \emph{delivery window} tracks how many relay data cells it is willing
to deliver to TCP streams outside the network. Each window is initialized
(say, to 1000 data cells). When a data cell is packaged or delivered,
the appropriate window is decremented. When an OR has received enough
data cells (currently 100), it sends a \emph{relay sendme} cell towards the OP,
with streamID zero. When an OR receives a \emph{relay sendme} cell with
streamID zero, it increments its packaging window. Either of these cells
increments the corresponding window by 100. If the packaging window
reaches 0, the OR stops reading from TCP connections for all streams
on the corresponding circuit, and sends no more relay data cells until
receiving a \emph{relay sendme} cell.

The OP behaves identically, except that it must track a packaging window
and a delivery window for every OR in the circuit. If a packaging window
reaches 0, it stops reading from streams destined for that OR.

\textbf{Stream-level throttling}:
The stream-level congestion control mechanism is similar to the
circuit-level mechanism. ORs and OPs use \emph{relay sendme} cells
to implement end-to-end flow control for individual streams across
circuits. Each stream begins with a packaging window (currently 500 cells),
and increments the window by a fixed value (50) upon receiving a \emph{relay
sendme} cell. Rather than always returning a \emph{relay sendme} cell as soon
as enough cells have arrived, the stream-level congestion control also
has to check whether data has been successfully flushed onto the TCP
stream; it sends the \emph{relay sendme} cell only when the number of bytes pending
to be flushed is under some threshold (currently 10 cells' worth).
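
The window bookkeeping above can be sketched as follows, using the
circuit-level constants from the text (1000-cell initial windows,
100-cell \emph{sendme} increments); the stream-level variant is the
same with 500 and 50. The structure is ours, not Tor's actual code.
{\small
\begin{verbatim}
WINDOW_START = 1000   # initial window, in cells
SENDME_INC   = 100    # increment per relay sendme

class CircuitEdge:
    def __init__(self):
        self.package_window = WINDOW_START
        self.deliver_window = WINDOW_START
        self.delivered = 0  # cells since last sendme

    def can_package(self):
        # At 0, stop reading from local TCP streams.
        return self.package_window > 0

    def package_cell(self):
        assert self.can_package()
        self.package_window -= 1

    def deliver_cell(self):
        """Deliver one data cell; True means it is
        time to send a streamID-zero relay sendme."""
        self.deliver_window -= 1
        self.delivered += 1
        if self.delivered == SENDME_INC:
            self.delivered = 0
            self.deliver_window += SENDME_INC
            return True
        return False

    def handle_sendme(self):
        # A sendme from the far end reopens the pipe.
        self.package_window += SENDME_INC
\end{verbatim}
}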
995 %% Maybe omit this next paragraph. -NM
996 %Currently, non-data relay cells do not affect the windows. Thus we
997 %avoid potential deadlock issues, for example, arising because a stream
998 %can't send a \emph{relay sendme} cell when its packaging window is empty.
1000 These arbitrarily chosen parameters seem to give tolerable throughput
1001 and delay; see Section~\ref{sec:in-the-wild}.
1003 \section{Rendezvous Points and hidden services}
1004 \label{sec:rendezvous}
1006 Rendezvous points are a building block for \emph{location-hidden
1007 services} (also known as \emph{responder anonymity}) in the Tor
1008 network. Location-hidden services allow Bob to offer a TCP
1009 service, such as a webserver, without revealing his IP address.
1010 This type of anonymity protects against distributed DoS attacks:
1011 attackers are forced to attack the onion routing network
1012 because they do not know Bob's IP address.
1014 Our design for location-hidden servers has the following goals.
1015 \textbf{Access-control:} Bob needs a way to filter incoming requests,
1016 so an attacker cannot flood Bob simply by making many connections to him.
1017 \textbf{Robustness:} Bob should be able to maintain a long-term pseudonymous
1018 identity even in the presence of router failure. Bob's service must
1019 not be tied to a single OR, and Bob must be able to migrate his service
1020 across ORs. \textbf{Smear-resistance:}
1021 A social attacker
1022 should not be able to ``frame'' a rendezvous router by
1023 offering an illegal or disreputable location-hidden service and
1024 making observers believe the router created that service.
1025 \textbf{Application-transparency:} Although we require users
1026 to run special software to access location-hidden servers, we must not
1027 require them to modify their applications.
1029 We provide location-hiding for Bob by allowing him to advertise
1030 several onion routers (his \emph{introduction points}) as contact
1031 points. He may do this on any robust efficient
1032 key-value lookup system with authenticated updates, such as a
1033 distributed hash table (DHT) like CFS~\cite{cfs:sosp01}.\footnote{
1034 Rather than rely on an external infrastructure, the Onion Routing network
1035 can run the lookup service itself. Our current implementation provides a
1036 simple lookup system on the
1037 directory servers.} Alice, the client, chooses an OR as her
1038 \emph{rendezvous point}. She connects to one of Bob's introduction
1039 points, informs him of her rendezvous point, and then waits for him
1040 to connect to the rendezvous point. This extra level of indirection
1041 helps Bob's introduction points avoid problems associated with serving
1042 unpopular files directly (for example, if Bob serves
1043 material that the introduction point's community finds objectionable,
1044 or if Bob's service tends to get attacked by network vandals).
1045 The extra level of indirection also allows Bob to respond to some requests
1046 and ignore others.
1048 \subsection{Rendezvous points in Tor}
1050 The following steps are
1051 %We give an overview of the steps of a rendezvous. These are
1052 performed on behalf of Alice and Bob by their local OPs;
1053 application integration is described more fully below.
1055 \begin{tightlist}
1056 \item Bob generates a long-term public key pair to identify his service.
1057 \item Bob chooses some introduction points, and advertises them on
1058 the lookup service, signing the advertisement with his public key. He
1059 can add more later.
1060 \item Bob builds a circuit to each of his introduction points, and tells
1061 them to wait for requests.
1062 \item Alice learns about Bob's service out of band (perhaps Bob told her,
1063 or she found it on a website). She retrieves the details of Bob's
1064 service from the lookup service. If Alice wants to access Bob's
1065 service anonymously, she must connect to the lookup service via Tor.
1066 \item Alice chooses an OR as the rendezvous point (RP) for her connection to
1067 Bob's service. She builds a circuit to the RP, and gives it a
1068 randomly chosen ``rendezvous cookie'' to recognize Bob.
1069 \item Alice opens an anonymous stream to one of Bob's introduction
1070 points, and gives it a message (encrypted with Bob's public key)
1071 telling it about herself,
1072 her RP and rendezvous cookie, and the
1073 start of a DH
1074 handshake. The introduction point sends the message to Bob.
1075 \item If Bob wants to talk to Alice, he builds a circuit to Alice's
1076 RP and sends the rendezvous cookie, the second half of the DH
1077 handshake, and a hash of the session
1078 key they now share. By the same argument as in
1079 Section~\ref{subsubsec:constructing-a-circuit}, Alice knows she
1080 shares the key only with Bob.
\item The RP connects Alice's circuit to Bob's. Note that the RP cannot
recognize Alice, Bob, or the data they transmit.
1083 \item Alice sends a \emph{relay begin} cell along the circuit. It
1084 arrives at Bob's OP, which connects to Bob's
1085 webserver.
1086 \item An anonymous stream has been established, and Alice and Bob
1087 communicate as normal.
1088 \end{tightlist}
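For concreteness, the following Python sketch walks through the
handshake above. It is illustrative only: the Diffie-Hellman
parameters are toy values far too small to be secure, the message
formats are invented, and the encryption of Alice's message to Bob's
public key is omitted; the byte-level protocol is defined
in~\cite{tor-spec}.
\begin{verbatim}
import hashlib, os

# Toy DH parameters: illustration only,
# far too small to be secure.
P, G = 0xFFFFFFFB, 5

def dh_keypair():
    x = int.from_bytes(os.urandom(4), 'big') % P
    return x, pow(G, x, P)

# Alice picks an RP and a rendezvous cookie,
# and sends her half of the handshake via one
# of Bob's introduction points (encryption to
# Bob's public key omitted here).
cookie = os.urandom(20)
a_priv, a_pub = dh_keypair()
intro_msg = (cookie, a_pub)

# Bob, if he chooses to respond, completes the
# handshake through his own circuit to the RP.
b_priv, b_pub = dh_keypair()
key = hashlib.sha1(
    str(pow(a_pub, b_priv, P)).encode()).digest()
reply = (cookie, b_pub,
         hashlib.sha1(key).digest())

# The RP matches cookies and splices the two
# circuits; Alice checks that she shares the
# key only with Bob.
r_cookie, r_pub, key_hash = reply
assert r_cookie == cookie
a_key = hashlib.sha1(
    str(pow(r_pub, a_priv, P)).encode()).digest()
assert hashlib.sha1(a_key).digest() == key_hash
\end{verbatim}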
1090 When establishing an introduction point, Bob provides the onion router
1091 with the public key identifying his service. Bob signs his
1092 messages, so others cannot usurp his introduction point
1093 in the future. He uses the same public key to establish the other
1094 introduction points for his service, and periodically refreshes his
1095 entry in the lookup service.
1097 The message that Alice gives
1098 the introduction point includes a hash of Bob's public key % to identify
1099 %the service, along with
1100 and an optional initial authorization token (the
1101 introduction point can do prescreening, for example to block replays). Her
1102 message to Bob may include an end-to-end authorization token so Bob
1103 can choose whether to respond.
1104 The authorization tokens can be used to provide selective access:
1105 important users can get uninterrupted access.
1106 %important users get tokens to ensure uninterrupted access. %to the
1107 %service.
1108 During normal situations, Bob's service might simply be offered
1109 directly from mirrors, while Bob gives out tokens to high-priority users. If
1110 the mirrors are knocked down,
1111 %by distributed DoS attacks or even
1112 %physical attack,
1113 those users can switch to accessing Bob's service via
1114 the Tor rendezvous system.
1116 Bob's introduction points are themselves subject to DoS---he must
1117 open many introduction points or risk such an attack.
1118 He can provide selected users with a current list or future schedule of
1119 unadvertised introduction points;
1120 this is most practical
1121 if there is a stable and large group of introduction points
1122 available. Bob could also give secret public keys
1123 for consulting the lookup service. All of these approaches
1124 limit exposure even when
1125 some selected users collude in the DoS\@.
1127 \subsection{Integration with user applications}
1129 Bob configures his onion proxy to know the local IP address and port of his
1130 service, a strategy for authorizing clients, and his public key. The onion
1131 proxy anonymously publishes a signed statement of Bob's
1132 public key, an expiration time, and
1133 the current introduction points for his service onto the lookup service,
1134 indexed
1135 by the hash of his public key. Bob's webserver is unmodified,
and does not even know that it is hidden behind the Tor network.
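In outline, the published statement might be assembled as in the
following sketch; the field names and JSON encoding are stand-ins of
our own, not Tor's actual descriptor format, and \texttt{sign}
abbreviates an RSA signature over the body.
\begin{verbatim}
import hashlib, json, time

# Sketch of the service descriptor Bob's OP
# publishes, indexed by the hash of his key.
def make_descriptor(pubkey_pem, intro_points,
                    sign, ttl=3600):
    body = json.dumps({
        'public-key': pubkey_pem,
        'intro-points': intro_points,
        'expires': int(time.time()) + ttl,
    })
    index = hashlib.sha1(
        pubkey_pem.encode()).hexdigest()
    return index, body, sign(body)
\end{verbatim}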
1138 Alice's applications also work unchanged---her client interface
1139 remains a SOCKS proxy. We encode all of the necessary information
1140 into the fully qualified domain name (FQDN) Alice uses when establishing her
1141 connection. Location-hidden services use a virtual top level domain
1142 called {\tt .onion}: thus hostnames take the form {\tt x.y.onion} where
1143 {\tt x} is the authorization cookie and {\tt y} encodes the hash of
1144 the public key. Alice's onion proxy
examines addresses; if they are destined for a hidden server, it decodes
1146 the key and starts the rendezvous as described above.
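Concretely, the proxy's check is a simple dispatch on the FQDN, as in
this sketch; the return values and the handling of a missing cookie
are simplifications of our own.
\begin{verbatim}
# Sketch: classify a SOCKS request by FQDN.
def classify(fqdn):
    labels = fqdn.lower().rstrip('.').split('.')
    if labels[-1] == 'onion':
        if len(labels) == 3:    # x.y.onion
            cookie, keyhash = labels[0], labels[1]
        elif len(labels) == 2:  # y.onion
            cookie, keyhash = None, labels[0]
        else:
            raise ValueError('bad .onion name')
        return ('rendezvous', keyhash, cookie)
    return ('exit', fqdn, None)

assert classify('cookie.keyhash.onion')[0] \
       == 'rendezvous'
assert classify('www.example.com')[0] == 'exit'
\end{verbatim}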
1148 \subsection{Previous rendezvous work}
1149 %XXXX Should this get integrated into the earlier related work section? -NM
1151 Rendezvous points in low-latency anonymity systems were first
1152 described for use in ISDN telephony~\cite{jerichow-jsac98,isdn-mixes}.
1153 Later low-latency designs used rendezvous points for hiding location
1154 of mobile phones and low-power location
1155 trackers~\cite{federrath-ih96,reed-protocols97}. Rendezvous for
1156 anonymizing low-latency
1157 Internet connections was suggested in early Onion Routing
1158 work~\cite{or-ih96}, but the first published design was by Ian
1159 Goldberg~\cite{ian-thesis}. His design differs from
1160 ours in three ways. First, Goldberg suggests that Alice should manually
1161 hunt down a current location of the service via Gnutella; our approach
1162 makes lookup transparent to the user, as well as faster and more robust.
1163 Second, in Tor the client and server negotiate session keys
1164 with Diffie-Hellman, so plaintext is not exposed even at the rendezvous
1165 point. Third,
1166 our design minimizes the exposure from running the
1167 service, to encourage volunteers to offer introduction and rendezvous
1168 services. Tor's introduction points do not output any bytes to the
clients; the rendezvous points do not know the client or the server,
and cannot read the data being transmitted. The indirection scheme is
also designed to include authentication/authorization---if Alice does not
include the right cookie with her request for service, Bob need not even
acknowledge that his service exists.
1175 \section{Other design decisions}
1176 \label{sec:other-design}
1178 \subsection{Denial of service}
1179 \label{subsec:dos}
1181 Providing Tor as a public service creates many opportunities for
1182 denial-of-service attacks against the network. While
1183 flow control and rate limiting (discussed in
1184 Section~\ref{subsec:congestion}) prevent users from consuming more
1185 bandwidth than routers are willing to provide, opportunities remain for
1186 users to
1187 consume more network resources than their fair share, or to render the
1188 network unusable for others.
1190 First of all, there are several CPU-consuming denial-of-service
1191 attacks wherein an attacker can force an OR to perform expensive
1192 cryptographic operations. For example, an attacker can
1193 %\emph{create} cell full of junk bytes can force an OR to perform an RSA
1194 %decrypt.
1195 %Similarly, an attacker can
1196 fake the start of a TLS handshake, forcing the OR to carry out its
1197 (comparatively expensive) half of the handshake at no real computational
1198 cost to the attacker.
1200 We have not yet implemented any defenses for these attacks, but several
1201 approaches are possible. First, ORs can
1202 require clients to solve a puzzle~\cite{puzzles-tls} while beginning new
1203 TLS handshakes or accepting \emph{create} cells. So long as these
1204 tokens are easy to verify and computationally expensive to produce, this
1205 approach limits the attack multiplier. Additionally, ORs can limit
1206 the rate at which they accept \emph{create} cells and TLS connections,
1207 so that
1208 the computational work of processing them does not drown out the
1209 symmetric cryptography operations that keep cells
1210 flowing. This rate limiting could, however, allow an attacker
1211 to slow down other users when they build new circuits.
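A hashcash-style sketch of such a puzzle appears below; the SHA-1
construction and the difficulty parameter are an illustration of our
own rather than the scheme of~\cite{puzzles-tls}.
\begin{verbatim}
import hashlib, os

# Find a nonce such that SHA-1(challenge||nonce)
# has k leading zero bits: cheap to verify, but
# about 2^k hashes to produce.
def solve(challenge, k):
    n = 0
    while not verify(challenge, n, k):
        n += 1
    return n

def verify(challenge, n, k):
    h = hashlib.sha1(
        challenge + n.to_bytes(8, 'big')).digest()
    return int.from_bytes(h, 'big') >> (160 - k) == 0

c = os.urandom(16)
assert verify(c, solve(c, 16), 16)
\end{verbatim}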
1213 % What about link-to-link rate limiting?
1215 Adversaries can also attack the Tor network's hosts and network
1216 links. Disrupting a single circuit or link breaks all streams passing
1217 along that part of the circuit. Users similarly lose service
1218 when a router crashes or its operator restarts it. The current
1219 Tor design treats such attacks as intermittent network failures, and
1220 depends on users and applications to respond or recover as appropriate. A
1221 future design could use an end-to-end TCP-like acknowledgment protocol,
1222 so no streams are lost unless the entry or exit point is
1223 disrupted. This solution would require more buffering at the network
1224 edges, however, and the performance and anonymity implications from this
1225 extra complexity still require investigation.
1227 \subsection{Exit policies and abuse}
1228 \label{subsec:exitpolicies}
1230 % originally, we planned to put the "users only know the hostname,
1231 % not the IP, but exit policies are by IP" problem here too. Not
1232 % worth putting in the submission, but worth thinking about putting
1233 % in sometime somehow. -RD
1235 Exit abuse is a serious barrier to wide-scale Tor deployment. Anonymity
1236 presents would-be vandals and abusers with an opportunity to hide
1237 the origins of their activities. Attackers can harm the Tor network by
1238 implicating exit servers for their abuse. Also, applications that commonly
1239 use IP-based authentication (such as institutional mail or webservers)
1240 can be fooled by the fact that anonymous connections appear to originate
1241 at the exit OR.
1243 We stress that Tor does not enable any new class of abuse. Spammers
1244 and other attackers already have access to thousands of misconfigured
1245 systems worldwide, and the Tor network is far from the easiest way
1246 to launch attacks.
1247 %Indeed, because of its limited
1248 %anonymity, Tor is probably not a good way to commit crimes.
1249 But because the
1250 onion routers can be mistaken for the originators of the abuse,
1251 and the volunteers who run them may not want to deal with the hassle of
1252 explaining anonymity networks to irate administrators, we must block or limit
1253 abuse through the Tor network.
1255 To mitigate abuse issues, each onion router's \emph{exit policy}
1256 describes to which external addresses and ports the router will
1257 connect. On one end of the spectrum are \emph{open exit}
1258 nodes that will connect anywhere. On the other end are \emph{middleman}
1259 nodes that only relay traffic to other Tor nodes, and \emph{private exit}
1260 nodes that only connect to a local host or network. A private
1261 exit can allow a client to connect to a given host or
1262 network more securely---an external adversary cannot eavesdrop traffic
1263 between the private exit and the final destination, and so is less sure of
1264 Alice's destination and activities. Most onion routers in the current
1265 network function as
1266 \emph{restricted exits} that permit connections to the world at large,
1267 but prevent access to certain abuse-prone addresses and services such
1268 as SMTP.
1269 The OR might also be able to authenticate clients to
1270 prevent exit abuse without harming anonymity~\cite{or-discex00}.
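A first-match evaluator for such policies might look like the sketch
below; the accept/reject line syntax is loosely modeled on Tor's
configuration language rather than quoted from it.
\begin{verbatim}
import ipaddress

# First-match exit policy evaluation.
def allows(policy, addr, port):
    for line in policy:
        verb, pat = line.split()
        host, p = pat.rsplit(':', 1)
        host_ok = (host == '*' or
                   ipaddress.ip_address(addr) in
                   ipaddress.ip_network(host))
        port_ok = (p == '*' or port == int(p))
        if host_ok and port_ok:
            return verb == 'accept'
    return True   # no line matched

restricted = ['reject *:25',          # no SMTP
              'reject 10.0.0.0/8:*',  # no private nets
              'accept *:*']
assert not allows(restricted, '1.2.3.4', 25)
assert allows(restricted, '1.2.3.4', 80)
\end{verbatim}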
1272 %The abuse issues on closed (e.g. military) networks are different
1273 %from the abuse on open networks like the Internet. While these IP-based
1274 %access controls are still commonplace on the Internet, on closed networks,
1275 %nearly all participants will be honest, and end-to-end authentication
1276 %can be assumed for important traffic.
1278 Many administrators use port restrictions to support only a
1279 limited set of services, such as HTTP, SSH, or AIM.
1280 This is not a complete solution, of course, since abuse opportunities for these
1281 protocols are still well known.
1283 We have not yet encountered any abuse in the deployed network, but if
1284 we do we should consider using proxies to clean traffic for certain
1285 protocols as it leaves the network. For example, much abusive HTTP
1286 behavior (such as exploiting buffer overflows or well-known script
1287 vulnerabilities) can be detected in a straightforward manner.
1288 Similarly, one could run automatic spam filtering software (such as
1289 SpamAssassin) on email exiting the OR network.
1291 ORs may also rewrite exiting traffic to append
1292 headers or other information indicating that the traffic has passed
1293 through an anonymity service. This approach is commonly used
1294 by email-only anonymity systems. ORs can also
1295 run on servers with hostnames like {\tt anonymous} to further
1296 alert abuse targets to the nature of the anonymous traffic.
1298 A mixture of open and restricted exit nodes allows the most
1299 flexibility for volunteers running servers. But while having many
1300 middleman nodes provides a large and robust network,
1301 having only a few exit nodes reduces the number of points
1302 an adversary needs to monitor for traffic analysis, and places a
1303 greater burden on the exit nodes. This tension can be seen in the
1304 Java Anon Proxy
1305 cascade model, wherein only one node in each cascade needs to handle
1306 abuse complaints---but an adversary only needs to observe the entry
1307 and exit of a cascade to perform traffic analysis on all that
1308 cascade's users. The hydra model (many entries, few exits) presents a
1309 different compromise: only a few exit nodes are needed, but an
1310 adversary needs to work harder to watch all the clients; see
1311 Section~\ref{sec:conclusion}.
1313 Finally, we note that exit abuse must not be dismissed as a peripheral
1314 issue: when a system's public image suffers, it can reduce the number
1315 and diversity of that system's users, and thereby reduce the anonymity
1316 of the system itself. Like usability, public perception is a
1317 security parameter. Sadly, preventing abuse of open exit nodes is an
1318 unsolved problem, and will probably remain an arms race for the
1319 foreseeable future. The abuse problems faced by Princeton's CoDeeN
1320 project~\cite{darkside} give us a glimpse of likely issues.
\subsection{Directory servers}
1323 \label{subsec:dirservers}
1325 First-generation Onion Routing designs~\cite{freedom2-arch,or-jsac98} used
1326 in-band network status updates: each router flooded a signed statement
1327 to its neighbors, which propagated it onward. But anonymizing networks
1328 have different security goals than typical link-state routing protocols.
1329 For example, delays (accidental or intentional)
1330 that can cause different parts of the network to have different views
1331 of link-state and topology are not only inconvenient: they give
1332 attackers an opportunity to exploit differences in client knowledge.
1333 We also worry about attacks to deceive a
1334 client about the router membership list, topology, or current network
1335 state. Such \emph{partitioning attacks} on client knowledge help an
1336 adversary to efficiently deploy resources
1337 against a target~\cite{minion-design}.
1339 Tor uses a small group of redundant, well-known onion routers to
1340 track changes in network topology and node state, including keys and
1341 exit policies. Each such \emph{directory server} acts as an HTTP
1342 server, so clients can fetch current network state
1343 and router lists, and so other ORs can upload
1344 state information. Onion routers periodically publish signed
1345 statements of their state to each directory server. The directory servers
1346 combine this information with their own views of network liveness,
1347 and generate a signed description (a \emph{directory}) of the entire
1348 network state. Client software is
1349 pre-loaded with a list of the directory servers and their keys,
1350 to bootstrap each client's view of the network.
1351 % XXX this means that clients will be forced to upgrade as the
1352 % XXX dirservers change or get compromised. argue that this is ok.
1354 When a directory server receives a signed statement for an OR, it
1355 checks whether the OR's identity key is recognized. Directory
1356 servers do not advertise unrecognized ORs---if they did,
1357 an adversary could take over the network by creating many
1358 servers~\cite{sybil}. Instead, new nodes must be approved by the
1359 directory
1360 server administrator before they are included. Mechanisms for automated
1361 node approval are an area of active research, and are discussed more
1362 in Section~\ref{sec:maintaining-anonymity}.
1364 Of course, a variety of attacks remain. An adversary who controls
1365 a directory server can track clients by providing them different
1366 information---perhaps by listing only nodes under its control, or by
1367 informing only certain clients about a given node. Even an external
1368 adversary can exploit differences in client knowledge: clients who use
1369 a node listed on one directory server but not the others are vulnerable.
1371 Thus these directory servers must be synchronized and redundant, so
1372 that they can agree on a common directory. Clients should only trust
1373 this directory if it is signed by a threshold of the directory
1374 servers.
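In outline, the client's trust decision reduces to counting valid
signatures from its pre-loaded keys, as in this sketch;
\texttt{check\_sig} stands in for real signature verification, and
the majority rule is one possible choice of threshold.
\begin{verbatim}
# Accept a directory only if more than half of
# the pre-loaded directory server keys signed it.
def trusted(directory, sigs, keys, check_sig):
    valid = sum(1 for k in keys if k in sigs
                and check_sig(k, directory,
                              sigs[k]))
    return valid > len(keys) // 2

keys = ['dir1', 'dir2', 'dir3']
sigs = {'dir1': 'ok', 'dir2': 'ok'}
fake = lambda k, d, s: s == 'ok'
assert trusted('netstate', sigs, keys, fake)
\end{verbatim}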
1376 The directory servers in Tor are modeled after those in
1377 Mixminion~\cite{minion-design}, but our situation is easier. First,
1378 we make the
1379 simplifying assumption that all participants agree on the set of
1380 directory servers. Second, while Mixminion needs to predict node
1381 behavior, Tor only needs a threshold consensus of the current
1382 state of the network. Third, we assume that we can fall back to the
1383 human administrators to discover and resolve problems when a consensus
1384 directory cannot be reached. Since there are relatively few directory
1385 servers (currently 3, but we expect as many as 9 as the network scales),
1386 we can afford operations like broadcast to simplify the consensus-building
1387 protocol.
1389 To avoid attacks where a router connects to all the directory servers
1390 but refuses to relay traffic from other routers, the directory servers
1391 must also build circuits and use them to anonymously test router
1392 reliability~\cite{mix-acc}. Unfortunately, this defense is not yet
1393 designed or
1394 implemented.
1396 Using directory servers is simpler and more flexible than flooding.
1397 Flooding is expensive, and complicates the analysis when we
1398 start experimenting with non-clique network topologies. Signed
1399 directories can be cached by other
1400 onion routers,
1401 so directory servers are not a performance
1402 bottleneck when we have many users, and do not aid traffic analysis by
1403 forcing clients to announce their existence to any
1404 central point.
1406 \section{Attacks and Defenses}
1407 \label{sec:attacks}
1409 Below we summarize a variety of attacks, and discuss how well our
1410 design withstands them.\\
1412 \noindent{\large\bf Passive attacks}\\
1413 \emph{Observing user traffic patterns.} Observing a user's connection
1414 will not reveal her destination or data, but it will
1415 reveal traffic patterns (both sent and received). Profiling via user
1416 connection patterns requires further processing, because multiple
1417 application streams may be operating simultaneously or in series over
1418 a single circuit.
1420 \emph{Observing user content.} While content at the user end is encrypted,
1421 connections to responders may not be (indeed, the responding website
1422 itself may be hostile). While filtering content is not a primary goal
1423 of Onion Routing, Tor can directly use Privoxy and related
1424 filtering services to anonymize application data streams.
1426 \emph{Option distinguishability.} We allow clients to choose
1427 configuration options. For example, clients concerned about request
1428 linkability should rotate circuits more often than those concerned
1429 about traceability. Allowing choice may attract users with different
1430 %There is economic incentive to attract users by
1431 %allowing this choice;
1432 needs; but clients who are
1433 in the minority may lose more anonymity by appearing distinct than they
1434 gain by optimizing their behavior~\cite{econymics}.
1436 \emph{End-to-end timing correlation.} Tor only minimally hides
1437 such correlations. An attacker watching patterns of
1438 traffic at the initiator and the responder will be
1439 able to confirm the correspondence with high probability. The
1440 greatest protection currently available against such confirmation is to hide
1441 the connection between the onion proxy and the first Tor node,
1442 by running the OP on the Tor node or behind a firewall. This approach
1443 requires an observer to separate traffic originating at the onion
1444 router from traffic passing through it: a global observer can do this,
1445 but it might be beyond a limited observer's capabilities.
1447 \emph{End-to-end size correlation.} Simple packet counting
1448 will also be effective in confirming
1449 endpoints of a stream. However, even without padding, we may have some
1450 limited protection: the leaky pipe topology means different numbers
1451 of packets may enter one end of a circuit than exit at the other.
1453 \emph{Website fingerprinting.} All the effective passive
1454 attacks above are traffic confirmation attacks,
1455 which puts them outside our design goals. There is also
1456 a passive traffic analysis attack that is potentially effective.
1457 Rather than searching exit connections for timing and volume
1458 correlations, the adversary may build up a database of
1459 ``fingerprints'' containing file sizes and access patterns for
1460 targeted websites. He can later confirm a user's connection to a given
1461 site simply by consulting the database. This attack has
1462 been shown to be effective against SafeWeb~\cite{hintz-pet02}.
1463 It may be less effective against Tor, since
1464 streams are multiplexed within the same circuit, and
1465 fingerprinting will be limited to
1466 the granularity of cells (currently 512 bytes). Additional
1467 defenses could include
1468 larger cell sizes, padding schemes to group websites
1469 into large sets, and link
1470 padding or long-range dummies.\footnote{Note that this fingerprinting
1471 attack should not be confused with the much more complicated latency
1472 attacks of~\cite{back01}, which require a fingerprint of the latencies
1473 of all circuits through the network, combined with those from the
1474 network edges to the target user and the responder website.}\\
1476 \noindent{\large\bf Active attacks}\\
1477 \emph{Compromise keys.} An attacker who learns the TLS session key can
1478 see control cells and encrypted relay cells on every circuit on that
1479 connection; learning a circuit
1480 session key lets him unwrap one layer of the encryption. An attacker
1481 who learns an OR's TLS private key can impersonate that OR for the TLS
1482 key's lifetime, but he must
1483 also learn the onion key to decrypt \emph{create} cells (and because of
1484 perfect forward secrecy, he cannot hijack already established circuits
1485 without also compromising their session keys). Periodic key rotation
1486 limits the window of opportunity for these attacks. On the other hand,
1487 an attacker who learns a node's identity key can replace that node
1488 indefinitely by sending new forged descriptors to the directory servers.
1490 \emph{Iterated compromise.} A roving adversary who can
1491 compromise ORs (by system intrusion, legal coercion, or extralegal
1492 coercion) could march down the circuit compromising the
1493 nodes until he reaches the end. Unless the adversary can complete
1494 this attack within the lifetime of the circuit, however, the ORs
1495 will have discarded the necessary information before the attack can
1496 be completed. (Thanks to the perfect forward secrecy of session
1497 keys, the attacker cannot force nodes to decrypt recorded
1498 traffic once the circuits have been closed.) Additionally, building
1499 circuits that cross jurisdictions can make legal coercion
1500 harder---this phenomenon is commonly called ``jurisdictional
1501 arbitrage.'' The Java Anon Proxy project recently experienced the
1502 need for this approach, when
1503 a German court forced them to add a backdoor to
1504 their nodes~\cite{jap-backdoor}.
1506 \emph{Run a recipient.} An adversary running a webserver
1507 trivially learns the timing patterns of users connecting to it, and
1508 can introduce arbitrary patterns in its responses.
1509 End-to-end attacks become easier: if the adversary can induce
1510 users to connect to his webserver (perhaps by advertising
1511 content targeted to those users), he now holds one end of their
1512 connection. There is also a danger that application
1513 protocols and associated programs can be induced to reveal information
1514 about the initiator. Tor depends on Privoxy and similar protocol cleaners
1515 to solve this latter problem.
1517 \emph{Run an onion proxy.} It is expected that end users will
1518 nearly always run their own local onion proxy. However, in some
1519 settings, it may be necessary for the proxy to run
1520 remotely---typically, in institutions that want
1521 to monitor the activity of those connecting to the proxy.
1522 Compromising an onion proxy compromises all future connections
1523 through it.
1525 \emph{DoS non-observed nodes.} An observer who can only watch some
of the Tor network can increase the value of the traffic he observes
1527 by attacking non-observed nodes to shut them down, reduce
1528 their reliability, or persuade users that they are not trustworthy.
1529 The best defense here is robustness.
1531 \emph{Run a hostile OR.} In addition to being a local observer,
1532 an isolated hostile node can create circuits through itself, or alter
1533 traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
1534 node must be immediately adjacent to both endpoints to compromise the
1535 anonymity of a circuit. If an adversary can
1536 run multiple ORs, and can persuade the directory servers
1537 that those ORs are trustworthy and independent, then occasionally
1538 some user will choose one of those ORs for the start and another
1539 as the end of a circuit. If an adversary
1540 controls $m>1$ of $N$ nodes, he can correlate at most
1541 $\left(\frac{m}{N}\right)^2$ of the traffic---although an
1542 adversary
1543 could still attract a disproportionately large amount of traffic
1544 by running an OR with a permissive exit policy, or by
1545 degrading the reliability of other routers.
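For example, an adversary who runs 10 of 100 comparable nodes should
expect to hold both ends of roughly
$\left(\frac{10}{100}\right)^2=1\%$ of circuits, and more if his
nodes attract a disproportionate share of traffic in these ways.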
\emph{Introduce timing into messages.} This is simply a stronger
version of the passive timing attacks discussed above.
1550 \emph{Tagging attacks.} A hostile node could ``tag'' a
1551 cell by altering it. If the
1552 stream were, for example, an unencrypted request to a Web site,
1553 the garbled content coming out at the appropriate time would confirm
1554 the association. However, integrity checks on cells prevent
1555 this attack.
1557 \emph{Replace contents of unauthenticated protocols.} When
1558 relaying an unauthenticated protocol like HTTP, a hostile exit node
1559 can impersonate the target server. Clients
1560 should prefer protocols with end-to-end authentication.
1562 \emph{Replay attacks.} Some anonymity protocols are vulnerable
1563 to replay attacks. Tor is not; replaying one side of a handshake
1564 will result in a different negotiated session key, and so the rest
of the recorded session cannot be used.
1567 \emph{Smear attacks.} An attacker could use the Tor network for
1568 socially disapproved acts, to bring the
1569 network into disrepute and get its operators to shut it down.
1570 Exit policies reduce the possibilities for abuse, but
1571 ultimately the network requires volunteers who can tolerate
1572 some political heat.
1574 \emph{Distribute hostile code.} An attacker could trick users
1575 into running subverted Tor software that did not, in fact, anonymize
1576 their connections---or worse, could trick ORs into running weakened
1577 software that provided users with less anonymity. We address this
1578 problem (but do not solve it completely) by signing all Tor releases
1579 with an official public key, and including an entry in the directory
1580 that lists which versions are currently believed to be secure. To
1581 prevent an attacker from subverting the official release itself
1582 (through threats, bribery, or insider attacks), we provide all
1583 releases in source code form, encourage source audits, and
1584 frequently warn our users never to trust any software (even from
1585 us) that comes without source.\\
1587 \noindent{\large\bf Directory attacks}\\
1588 \emph{Destroy directory servers.} If a few directory
1589 servers disappear, the others still decide on a valid
1590 directory. So long as any directory servers remain in operation,
1591 they will still broadcast their views of the network and generate a
1592 consensus directory. (If more than half are destroyed, this
1593 directory will not, however, have enough signatures for clients to
1594 use it automatically; human intervention will be necessary for
1595 clients to decide whether to trust the resulting directory.)
1597 \emph{Subvert a directory server.} By taking over a directory server,
1598 an attacker can partially influence the final directory. Since ORs
1599 are included or excluded by majority vote, the corrupt directory can
1600 at worst cast a tie-breaking vote to decide whether to include
1601 marginal ORs. It remains to be seen how often such marginal cases
1602 occur in practice.
1604 \emph{Subvert a majority of directory servers.} An adversary who controls
1605 more than half the directory servers can include as many compromised
1606 ORs in the final directory as he wishes. We must ensure that directory
1607 server operators are independent and attack-resistant.
1609 \emph{Encourage directory server dissent.} The directory
1610 agreement protocol assumes that directory server operators agree on
1611 the set of directory servers. An adversary who can persuade some
1612 of the directory server operators to distrust one another could
1613 split the quorum into mutually hostile camps, thus partitioning
1614 users based on which directory they use. Tor does not address
1615 this attack.
1617 \emph{Trick the directory servers into listing a hostile OR.}
1618 Our threat model explicitly assumes directory server operators will
1619 be able to filter out most hostile ORs.
1620 % If this is not true, an
1621 % attacker can flood the directory with compromised servers.
1623 \emph{Convince the directories that a malfunctioning OR is
1624 working.} In the current Tor implementation, directory servers
1625 assume that an OR is running correctly if they can start a TLS
1626 connection to it. A hostile OR could easily subvert this test by
1627 accepting TLS connections from ORs but ignoring all cells. Directory
1628 servers must actively test ORs by building circuits and streams as
1629 appropriate. The tradeoffs of a similar approach are discussed
1630 in~\cite{mix-acc}.\\
1632 \noindent{\large\bf Attacks against rendezvous points}\\
1633 \emph{Make many introduction requests.} An attacker could
1634 try to deny Bob service by flooding his introduction points with
1635 requests. Because the introduction points can block requests that
1636 lack authorization tokens, however, Bob can restrict the volume of
1637 requests he receives, or require a certain amount of computation for
every request.
1640 \emph{Attack an introduction point.} An attacker could
1641 disrupt a location-hidden service by disabling its introduction
1642 points. But because a service's identity is attached to its public
1643 key, the service can simply re-advertise
1644 itself at a different introduction point. Advertisements can also be
1645 done secretly so that only high-priority clients know the address of
Bob's introduction points, or so that different clients know of different
1647 introduction points. This forces the attacker to disable all possible
1648 introduction points.
1650 \emph{Compromise an introduction point.} An attacker who controls
1651 Bob's introduction point can flood Bob with
1652 introduction requests, or prevent valid introduction requests from
1653 reaching him. Bob can notice a flood, and close the circuit. To notice
1654 blocking of valid requests, however, he should periodically test the
1655 introduction point by sending rendezvous requests and making
1656 sure he receives them.
1658 \emph{Compromise a rendezvous point.} A rendezvous
1659 point is no more sensitive than any other OR on
1660 a circuit, since all data passing through the rendezvous is encrypted
1661 with a session key shared by Alice and Bob.
1663 \section{Early experiences: Tor in the Wild}
1664 \label{sec:in-the-wild}
1666 As of mid-May 2004, the Tor network consists of 32 nodes
1667 (24 in the US, 8 in Europe), and more are joining each week as the code
1668 matures. (For comparison, the current remailer network
1669 has about 40 nodes.) % We haven't asked PlanetLab to provide
1670 %Tor nodes, since their AUP wouldn't allow exit nodes (see
1671 %also~\cite{darkside}) and because we aim to build a long-term community of
1672 %node operators and developers.}
Each node has at least a 768Kb/s connection in each direction, and
many have 10Mb/s. The number of users varies (and of course, it is hard to
1675 tell for sure), but we sometimes have several hundred users---administrators at
1676 several companies have begun sending their entire departments' web
1677 traffic through Tor, to block other divisions of
1678 their company from reading their traffic. Tor users have reported using
1679 the network for web browsing, FTP, IRC, AIM, Kazaa, SSH, and
1680 recipient-anonymous email via rendezvous points. One user has anonymously
1681 set up a Wiki as a hidden service, where other users anonymously publish
1682 the addresses of their hidden services.
1684 Each Tor node currently processes roughly 800,000 relay
1685 cells (a bit under half a gigabyte) per week. On average, about 80\%
1686 of each 498-byte payload is full for cells going back to the client,
1687 whereas about 40\% is full for cells coming from the client. (The difference
1688 arises because most of the network's traffic is web browsing.) Interactive
1689 traffic like SSH brings down the average a lot---once we have more
1690 experience, and assuming we can resolve the anonymity issues, we may
1691 partition traffic into two relay cell sizes: one to handle
1692 bulk traffic and one for interactive traffic.
Thanks in part to our restrictive default exit policy (we
1695 reject SMTP requests) and our low profile, we have had no abuse
1696 issues since the network was deployed in October
1697 2003. Our slow growth rate gives us time to add features,
1698 resolve bugs, and get a feel for what users actually want from an
1699 anonymity system. Even though having more users would bolster our
1700 anonymity sets, we are not eager to attract the Kazaa or warez
1701 communities---we feel that we must build a reputation for privacy, human
1702 rights, research, and other socially laudable activities.
1704 As for performance, profiling shows that Tor spends almost
1705 all its CPU time in AES, which is fast. Current latency is attributable
1706 to two factors. First, network latency is critical: we are
1707 intentionally bouncing traffic around the world several times. Second,
1708 our end-to-end congestion control algorithm focuses on protecting
1709 volunteer servers from accidental DoS rather than on optimizing
1710 performance. % Right now the first $500 \times 500\mbox{B}=250\mbox{KB}$
1711 %of the stream arrives
1712 %quickly, and after that throughput depends on the rate that \emph{relay
1713 %sendme} acknowledgments arrive.
1714 To quantify these effects, we did some informal tests using a network of 4
1715 nodes on the same machine (a heavily loaded 1GHz Athlon). We downloaded a 60
1716 megabyte file from {\tt debian.org} every 30 minutes for 54 hours (108 sample
1717 points). It arrived in about 300 seconds on average, compared to 210s for a
1718 direct download. We ran a similar test on the production Tor network,
1719 fetching the front page of {\tt cnn.com} (55 kilobytes):
1720 % every 20 seconds for 8952 data points
1721 while a direct
1722 download consistently took about 0.3s, the performance through Tor varied.
Some downloads were as fast as 0.4s; the median was 2.8s, and
90\% finished within 5.3s. It seems that as the network expands, the chance
of building a slow circuit (one that includes a slow or heavily loaded node
or link) increases. On the other hand, so long as our users remain satisfied
with this increased latency, we can address performance incrementally as we
1728 proceed with development. %\footnote{For example, we have just begun pushing
1729 %a pipelining patch to the production network that seems to decrease
1730 %latency for medium-to-large files; we will present revised benchmarks
1731 %as they become available.}
1733 %With the current network's topology and load, users can typically get 1-2
1734 %megabits sustained transfer rate, which is good enough for now.
1735 %Indeed, the Tor
1736 %design aims foremost to provide a security research platform; performance
1737 %only needs to be sufficient to retain users~\cite{econymics,back01}.
1738 %We can tweak the congestion control
1739 %parameters to provide faster throughput at the cost of
1740 %larger buffers at each node; adding the heuristics mentioned in
1741 %Section~\ref{subsec:rate-limit} to favor low-volume
1742 %streams may also help. More research remains to find the
1743 %right balance.
1744 % We should say _HOW MUCH_ latency there is in these cases. -NM
1746 %performs badly on lossy networks. may need airhook or something else as
1747 %transport alternative?
1749 Although Tor's clique topology and full-visibility directories present
1750 scaling problems, we still expect the network to support a few hundred
nodes and maybe 10,000 users before we are forced to become
1752 more distributed. With luck, the experience we gain running the current
1753 topology will help us choose among alternatives when the time comes.
1755 \section{Open Questions in Low-latency Anonymity}
1756 \label{sec:maintaining-anonymity}
1758 In addition to the non-goals in
Section~\ref{subsec:non-goals}, many questions must be answered
1760 before we can be confident of Tor's security.
1762 Many of these open issues are questions of balance. For example,
1763 how often should users rotate to fresh circuits? Frequent rotation
1764 is inefficient, expensive, and may lead to intersection attacks and
1765 predecessor attacks~\cite{wright03}, but infrequent rotation makes the
1766 user's traffic linkable. Besides opening fresh circuits, clients can
1767 also exit from the middle of the circuit,
1768 or truncate and re-extend the circuit. More analysis is
1769 needed to determine the proper tradeoff.
1771 %% Duplicated by 'Better directory distribution' in section 9.
1773 %A similar question surrounds timing of directory operations: how often
1774 %should directories be updated? Clients that update infrequently receive
1775 %an inaccurate picture of the network, but frequent updates can overload
1776 %the directory servers. More generally, we must find more
1777 %decentralized yet practical ways to distribute up-to-date snapshots of
1778 %network status without introducing new attacks.
1780 How should we choose path lengths? If Alice always uses two hops,
1781 then both ORs can be certain that by colluding they will learn about
1782 Alice and Bob. In our current approach, Alice always chooses at least
1783 three nodes unrelated to herself and her destination.
1784 %% This point is subtle, but not IMO necessary. Anybody who thinks
1785 %% about it will see that it's implied by the above sentence; anybody
1786 %% who doesn't think about it is safe in his ignorance.
1788 %Thus normally she chooses
1789 %three nodes, but if she is running an OR and her destination is on an OR,
1790 %she uses five.
Should Alice choose a random path length (for example, from a geometric
1792 distribution) to foil an attacker who
1793 uses timing to learn that he is the fifth hop and thus concludes that
1794 both Alice and the responder are running ORs?
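One such strategy, sketched below, samples the hops beyond the
three-node minimum from a geometric distribution; the parameter
$p=0.8$ is an arbitrary illustrative choice.
\begin{verbatim}
import random

# Sample 3 + Geometric(p) extra hops, so an
# attacker who times his position cannot
# conclude that he is the final hop.
def path_length(p=0.8, minimum=3):
    extra = 0
    while random.random() > p:
        extra += 1
    return minimum + extra
\end{verbatim}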
1796 Throughout this paper, we have assumed that end-to-end traffic
1797 confirmation will immediately and automatically defeat a low-latency
1798 anonymity system. Even high-latency anonymity systems can be
1799 vulnerable to end-to-end traffic confirmation, if the traffic volumes
1800 are high enough, and if users' habits are sufficiently
1801 distinct~\cite{statistical-disclosure,limits-open}. Can anything be
1802 done to
1803 make low-latency systems resist these attacks as well as high-latency
1804 systems? Tor already makes some effort to conceal the starts and ends of
1805 streams by wrapping long-range control commands in identical-looking
1806 relay cells. Link padding could frustrate passive observers who count
1807 packets; long-range padding could work against observers who own the
1808 first hop in a circuit. But more research remains to find an efficient
1809 and practical approach. Volunteers prefer not to run constant-bandwidth
1810 padding; but no convincing traffic shaping approach has been
1811 specified. Recent work on long-range padding~\cite{defensive-dropping}
1812 shows promise. One could also try to reduce correlation in packet timing
1813 by batching and re-ordering packets, but it is unclear whether this could
1814 improve anonymity without introducing so much latency as to render the
1815 network unusable.
1817 A cascade topology may better defend against traffic confirmation by
aggregating users and making padding and
1819 mixing more affordable. Does the hydra topology (many input nodes,
1820 few output nodes) work better against some adversaries? Are we going
1821 to get a hydra anyway because most nodes will be middleman nodes?
1823 Common wisdom suggests that Alice should run her own OR for best
1824 anonymity, because traffic coming from her node could plausibly have
1825 come from elsewhere. How much mixing does this approach need? Is it
immediately beneficial because of real-world adversaries that cannot
observe Alice's router, but can run routers of their own?
1829 To scale to many users, and to prevent an attacker from observing the
1830 whole network, it may be necessary
1831 to support far more servers than Tor currently anticipates.
1832 This introduces several issues. First, if approval by a central set
1833 of directory servers is no longer feasible, what mechanism should be used
1834 to prevent adversaries from signing up many colluding servers? Second,
1835 if clients can no longer have a complete picture of the network,
1836 how can they perform discovery while preventing attackers from
1837 manipulating or exploiting gaps in their knowledge? Third, if there
1838 are too many servers for every server to constantly communicate with
1839 every other, which non-clique topology should the network use?
1840 (Restricted-route topologies promise comparable anonymity with better
1841 scalability~\cite{danezis:pet2003}, but whatever topology we choose, we
1842 need some way to keep attackers from manipulating their position within
1843 it~\cite{casc-rep}.) Fourth, if no central authority is tracking
1844 server reliability, how do we stop unreliable servers from making
1845 the network unusable? Fifth, do clients receive so much anonymity
1846 from running their own ORs that we should expect them all to do
1847 so~\cite{econymics}, or do we need another incentive structure to
1848 motivate them? Tarzan and MorphMix present possible solutions.
1850 % advogato, captcha
1852 When a Tor node goes down, all its circuits (and thus streams) must break.
1853 Will users abandon the system because of this brittleness? How well
1854 does the method in Section~\ref{subsec:dos} allow streams to survive
1855 node failure? If affected users rebuild circuits immediately, how much
1856 anonymity is lost? It seems the problem is even worse in a peer-to-peer
environment---such systems do not yet provide an incentive for peers to
stay connected when they are done retrieving content, so we would expect
1859 a higher churn rate.
1861 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1863 \section{Future Directions}
1864 \label{sec:conclusion}
1866 Tor brings together many innovations into a unified deployable system. The
1867 next immediate steps include:
1869 \emph{Scalability:} Tor's emphasis on deployability and design simplicity
1870 has led us to adopt a clique topology, semi-centralized
1871 directories, and a full-network-visibility model for client
1872 knowledge. These properties will not scale past a few hundred servers.
1873 Section~\ref{sec:maintaining-anonymity} describes some promising
1874 approaches, but more deployment experience will be helpful in learning
1875 the relative importance of these bottlenecks.
1877 \emph{Bandwidth classes:} This paper assumes that all ORs have
1878 good bandwidth and latency. We should instead adopt the MorphMix model,
1879 where nodes advertise their bandwidth level (DSL, T1, T3), and
1880 Alice avoids bottlenecks by choosing nodes that match or
1881 exceed her bandwidth. In this way DSL users can usefully join the Tor
1882 network.
1884 \emph{Incentives:} Volunteers who run nodes are rewarded with publicity
1885 and possibly better anonymity~\cite{econymics}. More nodes means increased
1886 scalability, and more users can mean more anonymity. We need to continue
1887 examining the incentive structures for participating in Tor. Further,
1888 we need to explore more approaches to limiting abuse, and understand
1889 why most people don't bother using privacy systems.
1891 \emph{Cover traffic:} Currently Tor omits cover traffic---its costs
1892 in performance and bandwidth are clear but its security benefits are
1893 not well understood. We must pursue more research on link-level cover
1894 traffic and long-range cover traffic to determine whether some simple padding
1895 method offers provable protection against our chosen adversary.
1897 %%\emph{Offer two relay cell sizes:} Traffic on the Internet tends to be
1898 %%large for bulk transfers and small for interactive traffic. One cell
1899 %%size cannot be optimal for both types of traffic.
1900 % This should go in the spec and todo, but not the paper yet. -RD
1902 \emph{Caching at exit nodes:} Perhaps each exit node should run a
1903 caching web proxy~\cite{shsm03}, to improve anonymity for cached pages
1904 (Alice's request never
1905 leaves the Tor network), to improve speed, and to reduce bandwidth cost.
1906 On the other hand, forward security is weakened because caches
1907 constitute a record of retrieved files. We must find the right
1908 balance between usability and security.
1910 \emph{Better directory distribution:}
1911 Clients currently download a description of
1912 the entire network every 15 minutes. As the state grows larger
1913 and clients more numerous, we may need a solution in which
1914 clients receive incremental updates to directory state.
1915 More generally, we must find more
1916 scalable yet practical ways to distribute up-to-date snapshots of
1917 network status without introducing new attacks.
1919 \emph{Further specification review:} Our public
1920 byte-level specification~\cite{tor-spec} needs
1921 external review. We hope that as Tor
1922 is deployed, more people will examine its
1923 specification.
1925 \emph{Multisystem interoperability:} We are currently working with the
1926 designer of MorphMix to unify the specification and implementation of
1927 the common elements of our two systems. So far, this seems
1928 to be relatively straightforward. Interoperability will allow testing
1929 and direct comparison of the two designs for trust and scalability.
1931 \emph{Wider-scale deployment:} The original goal of Tor was to
1932 gain experience in deploying an anonymizing overlay network, and
1933 learn from having actual users. We are now at a point in design
1934 and development where we can start deploying a wider network. Once
1935 we have many actual users, we will doubtlessly be better
1936 able to evaluate some of our design decisions, including our
1937 robustness/latency tradeoffs, our performance tradeoffs (including
1938 cell size), our abuse-prevention mechanisms, and
1939 our overall usability.
1941 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1943 %% commented out for anonymous submission
1944 \section*{Acknowledgments}
1945 We thank Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis,
1946 John Bashinski, and Zack Brown
1947 for editing and comments;
Matej Pfajfar, Andrei Serjantov, and Marc Rennhard for design discussions;
1949 Bram Cohen for congestion control discussions;
1950 Adam Back for suggesting telescoping circuits; and
1951 Cathy Meadows for formal analysis of the \emph{extend} protocol.
1952 This work has been supported by ONR and DARPA.
1954 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1956 \bibliographystyle{latex8}
1957 \bibliography{tor-design}
1959 \end{document}
1961 % Style guide:
1962 % U.S. spelling
1963 % avoid contractions (it's, can't, etc.)
1964 % prefer ``for example'' or ``such as'' to e.g.
1965 % prefer ``that is'' to i.e.
1966 % 'mix', 'mixes' (as noun)
1967 % 'mix-net'
1968 % 'mix', 'mixing' (as verb)
1969 % 'middleman' [Not with a hyphen; the hyphen has been optional
1970 % since Middle English.]
1971 % 'nymserver'
1972 % 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
1973 % 'Onion Routing design', 'onion router' [note capitalization]
1974 % 'SOCKS'
1975 % Try not to use \cite as a noun.
1976 % 'Authorizating' sounds great, but it isn't a word.
1977 % 'First, second, third', not 'Firstly, secondly, thirdly'.
1978 % 'circuit', not 'channel'
1979 % Typography: no space on either side of an em dash---ever.
1980 % Hyphens are for multi-part words; en dashs imply movement or
1981 % opposition (The Alice--Bob connection); and em dashes are
1982 % for punctuation---like that.
1983 % A relay cell; a control cell; a \emph{create} cell; a
1984 % \emph{relay truncated} cell. Never ``a \emph{relay truncated}.''
1986 % 'Substitute ``Damn'' every time you're inclined to write ``very;'' your
1987 % editor will delete it and the writing will be just as it should be.'
1988 % -- Mark Twain