\documentclass[times,10pt,twocolumn]{article}
\usepackage{latex8}
\usepackage{times}
\usepackage{url}
\usepackage{graphics}
\usepackage{amsmath}

%\pagestyle{empty}

\renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}

\newcommand{\workingnote}[1]{}        % The version that hides the note.
%\newcommand{\workingnote}[1]{(**#1)} % The version that makes the note visible.

% If an URL ends up with '%'s in it, that's because the line *in the .bib/.tex
% file* is too long, so break it there (it doesn't matter if the next line is
% indented with spaces). -DH

%\newif\ifpdf
%\ifx\pdfoutput\undefined
%  \pdffalse
%\else
%  \pdfoutput=1
%  \pdftrue
%\fi

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
  \setlength{\parsep}{0mm}
  %  \setlength{\labelsep}{0mm}
  %  \setlength{\labelwidth}{0mm}
  %  \setlength{\topsep}{0mm}
}}{\end{list}}

\begin{document}

%% Use dvipdfm instead. --DH
%\ifpdf
%  \pdfcompresslevel=9
%  \pdfpagewidth=\the\paperwidth
%  \pdfpageheight=\the\paperheight
%\fi

\title{Tor: The Second-Generation Onion Router}
% Putting the 'Private' back in 'Virtual Private Network'

\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
Paul Syverson \\ Naval Research Lab \\ syverson@itd.nrl.navy.mil}

\maketitle
\thispagestyle{empty}

\begin{abstract}
We present Tor, a circuit-based low-latency anonymous communication
service. This second-generation Onion Routing system addresses limitations
in the original design. Tor adds perfect forward secrecy, congestion
control, directory servers, integrity checking, variable exit policies,
and a practical design for rendezvous points. Tor works on the real-world
Internet, requires no special privileges or kernel modifications, requires
little synchronization or coordination between nodes, and provides a
reasonable tradeoff between anonymity, usability, and efficiency. We
close with a list of open problems in anonymous communication.
\end{abstract}

%\begin{center}
%\textbf{Keywords:} anonymity, peer-to-peer, remailer, nymserver, reply block
%\end{center}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\Section{Overview}
\label{sec:intro}

Onion Routing is a distributed overlay network designed to anonymize
low-latency TCP-based applications such as web browsing, secure shell,
and instant messaging. Clients choose a path through the network and
build a \emph{circuit}, in which each node (or ``onion router'' or ``OR'')
in the path knows its predecessor and successor, but no other nodes in
the circuit. Traffic flowing down the circuit is sent in fixed-size
\emph{cells}, which are unwrapped by a symmetric key at each node
(like the layers of an onion) and relayed downstream. The
Onion Routing project published several design and analysis papers
\cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion
Routing network was deployed briefly, the only long-running and
publicly accessible implementation was a fragile
proof-of-concept that ran on a single machine. Even this simple deployment
processed connections from over sixty thousand distinct IP addresses from
all over the world at a rate of about fifty thousand per day.
But many critical design and deployment issues were never
resolved, and the design has not been updated in several years. Here
we describe Tor, a protocol for asynchronous, loosely federated onion
routers that provides the following improvements over the old Onion
Routing design:

\textbf{Perfect forward secrecy:} Onion Routing
was originally vulnerable to a single hostile node recording traffic and
later compromising successive nodes in the circuit and forcing them
to decrypt it. Rather than using a single multiply encrypted data
structure (an \emph{onion}) to lay each circuit,
Tor now uses an incremental or \emph{telescoping} path-building design,
where the initiator negotiates session keys with each successive hop in
the circuit. Once these keys are deleted, subsequently compromised nodes
cannot decrypt old traffic. As a side benefit, onion replay detection
is no longer necessary, and the process of building circuits is more
reliable, since the initiator knows when a hop fails and can then try
extending to a new node.

\textbf{Separation of ``protocol cleaning'' from anonymity:}
Onion Routing originally required a separate ``application
proxy'' for each supported application protocol---most of which were
never written, so many applications were never supported. Tor uses the
standard and near-ubiquitous SOCKS \cite{socks4} proxy interface, allowing
us to support most TCP-based programs without modification. Tor now
relies on the filtering features of privacy-enhancing
application-level proxies such as Privoxy \cite{privoxy}, without trying
to duplicate those features itself.

\textbf{No mixing, padding, or traffic shaping yet:} Onion
Routing originally called for batching and reordering cells as they arrived,
assumed padding between ORs, and in
later designs added padding between onion proxies (users) and ORs
\cite{or-ih96,or-jsac98}. Tradeoffs between padding protection
and cost were discussed, and \emph{traffic shaping} algorithms were
theorized \cite{or-pet00} to provide good security without expensive
padding, but no concrete padding scheme was suggested.
Recent research \cite{econymics}
and deployment experience \cite{freedom21-security} suggest that this
level of resource use is not practical or economical; and even full
link padding is still vulnerable \cite{defensive-dropping}. Thus,
until we have a proven and convenient design for traffic shaping or
low-latency mixing that improves anonymity against a realistic
adversary, we leave these strategies out.

\textbf{Many TCP streams can share one circuit:} Onion Routing originally
built a separate circuit for each
application-level request, but this required
multiple public key operations for every request, and also presented
a threat to anonymity from building so many circuits; see
Section~\ref{sec:maintaining-anonymity}. Tor multiplexes multiple TCP
streams along each circuit to improve efficiency and anonymity.

\textbf{Leaky-pipe circuit topology:} Through in-band signaling
within the circuit, Tor initiators can direct traffic to nodes partway
down the circuit. This novel approach
allows traffic to exit the circuit from the middle---possibly
frustrating traffic shape and volume attacks based on observing the end
of the circuit. (It also allows for long-range padding if
future research shows this to be worthwhile.)

\textbf{Congestion control:} Earlier anonymity designs do not
address traffic bottlenecks. Unfortunately, typical approaches to
load balancing and flow control in overlay networks involve inter-node
control communication and global views of traffic. Tor's decentralized
congestion control uses end-to-end acks to maintain anonymity
while allowing nodes at the edges of the network to detect congestion
or flooding and send less data until the congestion subsides.

\textbf{Directory servers:} The earlier Onion Routing design
planned to flood link-state information through the network---an approach
that can be unreliable and open to partitioning attacks.
Tor takes a simplified view toward distributing such
information. Certain more trusted nodes act as \emph{directory
servers}: they provide signed directories that describe known
routers and their availability. Users periodically download the
directories via HTTP.

\textbf{Variable exit policies:} Tor provides a consistent mechanism
for each node to advertise a policy describing the hosts
and ports to which it will connect. These exit policies are critical
in a volunteer-based distributed infrastructure, because each operator
is comfortable with allowing different types of traffic to exit the Tor
network from his node.

\textbf{End-to-end integrity checking:} The original Onion Routing
design did no integrity checking on data. Any node on the
circuit could change the contents of data cells as they passed by---for
example, to alter a connection request so it would connect
to a different webserver, or to `tag' encrypted traffic and look for
corresponding corrupted traffic at the network edges \cite{minion-design}.
Tor hampers these attacks by checking data integrity before it leaves
the network.

\textbf{Improved robustness to failed nodes:} A failed node
in the old design meant that circuit building failed, but thanks to
Tor's step-by-step circuit building, users notice failed nodes
while building circuits and route around them. Additionally, liveness
information from directories allows users to avoid unreliable nodes in
the first place.

\textbf{Rendezvous points and hidden services:}
Tor provides an integrated mechanism for responder anonymity via
location-protected servers. Previous Onion Routing designs included
long-lived ``reply onions'' that could be used to build circuits
to a hidden server, but these reply onions did not provide forward
security, and became useless if any node in the path went down
or rotated its keys. In Tor, clients negotiate {\it rendezvous points}
to connect with hidden servers; reply onions are no longer required.

Unlike Freedom \cite{freedom2-arch}, Tor only tries to anonymize
TCP streams. Not requiring patches (or built-in support) in an
operating system's network stack has been valuable to Tor's
portability and deployability.

We have implemented most of the above features. Our source code is
available under a free license, and, as far as we know, is unencumbered by
patents. We have recently begun deploying a wide-area alpha network
to test the design in practice, to get more experience with usability
and users, and to provide a research platform for experimentation.

We review previous work in Section~\ref{sec:related-work}, describe
our goals and assumptions in Section~\ref{sec:assumptions},
and then address the above list of improvements in
Sections~\ref{sec:design}-\ref{sec:rendezvous}. We summarize
in Section~\ref{sec:attacks} how our design stands up to
known attacks, and conclude with a list of open problems in
Section~\ref{sec:maintaining-anonymity} and future work for the Onion
Routing project in Section~\ref{sec:conclusion}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\Section{Related work}
\label{sec:related-work}

Modern anonymity systems date to Chaum's {\bf Mix-Net} design
\cite{chaum-mix}. Chaum
proposed hiding the correspondence between sender and recipient by
wrapping messages in layers of public-key cryptography, and relaying them
through a path composed of ``mixes.'' Each mix in turn
decrypts, delays, and re-orders messages, before relaying them toward
their destinations.

Subsequent relay-based anonymity designs have diverged in two
main directions. Some have tried to maximize anonymity at
the cost of introducing comparatively large and variable latencies,
including {\bf Babel} \cite{babel}, {\bf Mixmaster}
\cite{mixmaster-spec}, and
{\bf Mixminion} \cite{minion-design}. Because of this
decision, these \emph{high-latency} networks resist strong global
adversaries,
but introduce too much lag for interactive tasks like web browsing,
internet chat, or SSH connections.

Tor belongs to the second category: \emph{low-latency} designs that
try to anonymize interactive network traffic. These systems handle
a variety of bidirectional protocols.
They also provide more convenient
mail delivery than the high-latency anonymous email
networks, because the remote mail server provides explicit and timely
delivery confirmation.
But because these designs typically
involve many packets that must be delivered quickly, it is
difficult for them to prevent an attacker who can eavesdrop both ends of the
communication from correlating the timing and volume
of traffic entering the anonymity network with traffic leaving it. These
protocols are also vulnerable against active attacks in which an
adversary introduces timing patterns into traffic entering the network and
looks
for correlated patterns among exiting traffic.
Although some work has been done to frustrate
these attacks, %\footnote{
%  The most common approach is to pad and limit communication to a constant
%  rate, or to limit
%  the variation in traffic shape. Doing so can have prohibitive bandwidth
%  costs and/or performance limitations.

% Point in the footnote is covered above, yes? -PS
most designs protect primarily against traffic analysis rather than traffic
confirmation (cf.\ Section~\ref{subsec:threat-model}).

The simplest low-latency designs are single-hop proxies such as the
{\bf Anonymizer} \cite{anonymizer}, wherein a single trusted server strips the
data's origin before relaying it. These designs are easy to
analyze, but users must trust the anonymizing proxy.
Concentrating the traffic to a single point increases the anonymity set
(the people a given user is hiding among), but it is vulnerable if the
adversary can observe all traffic going into and out of the proxy.

More complex are distributed-trust, circuit-based anonymizing systems.
In these designs, a user establishes one or more medium-term bidirectional
end-to-end circuits, and tunnels data in fixed-size cells.
Establishing circuits is computationally expensive and typically
requires public-key
cryptography, whereas relaying cells is comparatively inexpensive and
typically requires only symmetric encryption.
Because a circuit crosses several servers, and each server only knows
the adjacent servers in the circuit, no single server can link a
user to her communication partners.

The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared
routes known as \emph{cascades}. As with a single-hop proxy, this
approach aggregates users into larger anonymity sets, but again an
attacker only needs to observe both ends of the cascade to bridge all
the system's traffic. The Java Anon Proxy's design
calls for padding between end users and the head of the cascade
\cite{web-mix}. However, it is not demonstrated whether the current
implementation's padding policy improves anonymity.

{\bf PipeNet} \cite{back01, pipenet}, another low-latency design proposed at
about the same time as Onion Routing, provided
stronger anonymity at the cost of allowing a single user to shut
down the network simply by not sending. Systems like {\bf ISDN mixes}
\cite{isdn-mixes} were designed for other environments with
different assumptions.

In P2P designs like {\bf Tarzan} \cite{tarzan:ccs02} and {\bf MorphMix}
\cite{morphmix:fc04}, all participants both generate traffic and relay
traffic for others. These systems aim to conceal
whether a given peer originated a request
or just relayed it from another peer. While Tarzan and MorphMix use
layered encryption as above, {\bf Crowds} \cite{crowds-tissec} simply assumes
an adversary who cannot observe the initiator: it uses no public-key
encryption, so any node on a circuit can read that circuit's traffic.

{\bf Hordes} \cite{hordes-jcs} is based on Crowds but also uses multicast
responses to hide the initiator. {\bf Herbivore} \cite{herbivore} and
$\mbox{\bf P}^{\mathbf 5}$ \cite{p5} go even further, requiring broadcast.
These systems are designed primarily for communication between peers,
although Herbivore users can make external connections by
requesting a peer to serve as a proxy.

Systems like {\bf Freedom} and the original Onion Routing build the circuit
all at once, using a layered ``onion'' of public-key encrypted messages,
each layer of which provides a set of session keys and the address of the
next server in the circuit. Tor as described herein, Tarzan, MorphMix,
{\bf Cebolla} \cite{cebolla}, and Rennhard's {\bf Anonymity Network} \cite{anonnet}
build the circuit
in stages, extending it one hop at a time.
Section~\ref{subsubsec:constructing-a-circuit} describes how this
approach makes perfect forward secrecy feasible.

Circuit-based anonymity designs must choose which protocol layer
to anonymize. They may choose to intercept IP packets directly, and
relay them whole (stripping the source address) along the circuit
\cite{freedom2-arch,tarzan:ccs02}. Alternatively, like
Tor, they may accept TCP streams and relay the data in those streams
along the circuit, ignoring the breakdown of that data into TCP segments
\cite{morphmix:fc04,anonnet}. Finally, they may accept application-level
protocols (such as HTTP) and relay the application requests themselves
along the circuit.
Making this protocol-layer decision requires a compromise between flexibility
and anonymity. For example, a system that understands HTTP, such as Crowds,
can strip
identifying information from those requests, can take advantage of caching
to limit the number of requests that leave the network, and can batch
or encode those requests in order to minimize the number of connections.
On the other hand, an IP-level anonymizer can handle nearly any protocol,
even ones unforeseen by its designers (though these systems require
kernel-level modifications to some operating systems, and so are more
complex and less portable). TCP-level anonymity networks like Tor present
a middle approach: they are fairly application neutral (so long as the
application supports, or can be tunneled across, TCP), but by treating
application connections as data streams rather than raw TCP packets,
they avoid the well-known inefficiencies of tunneling TCP over TCP
\cite{tcp-over-tcp-is-bad}.

Distributed-trust anonymizing systems need to prevent attackers from
adding too many servers and thus compromising user paths.
Tor relies on a small set of well-known directory servers, run by
independent parties, to decide which nodes can
join. Tarzan and MorphMix allow unknown users to run servers, and use
a limited resource (like IP addresses) to prevent an attacker from
controlling too much of the network. Crowds suggests requiring
written, notarized requests from potential crowd members.

Anonymous communication is essential for censorship-resistant
systems like Eternity \cite{eternity}, Free~Haven \cite{freehaven-berk},
Publius \cite{publius}, and Tangler \cite{tangler}. Tor's rendezvous
points enable connections between mutually anonymous entities; they
are a building block for location-hidden servers, which are needed by
Eternity and Free~Haven.

% didn't include rewebbers. No clear place to put them, so I'll leave
% them out for now. -RD

\Section{Design goals and assumptions}
\label{sec:assumptions}

\noindent{\large\bf Goals}\\
Like other low-latency anonymity designs, Tor seeks to frustrate
attackers from linking communication partners, or from linking
multiple communications to or from a single user. Within this
main goal, however, several considerations have directed
Tor's evolution.

\textbf{Deployability:} The design must be deployed and used in the
real world. Thus it
must not be expensive to run (for example, by requiring more bandwidth
than volunteers are willing to provide); must not place a heavy
liability burden on operators (for example, by allowing attackers to
implicate onion routers in illegal activities); and must not be
difficult or expensive to implement (for example, by requiring kernel
patches, or separate proxies for every protocol). We also cannot
require non-anonymous parties (such as websites)
to run our software. (Our rendezvous point design does not meet
this goal for non-anonymous users talking to hidden servers,
however; see Section~\ref{sec:rendezvous}.)

\textbf{Usability:} A hard-to-use system has fewer users---and because
anonymity systems hide users among users, a system with fewer users
provides less anonymity. Usability is thus not only a convenience:
it is a security requirement \cite{econymics,back01}. Tor should
therefore not
require modifying applications; should not introduce prohibitive delays;
and should require users to make as few configuration decisions
as possible. Finally, Tor should be easily implemented on all common
platforms; we cannot require users to change their operating system in order
to be anonymous. (The current Tor implementation runs on Windows and
assorted Unix clones including Linux, FreeBSD, and MacOS X.)

\textbf{Flexibility:} The protocol must be flexible and well-specified,
so Tor can serve as a test-bed for future research.
Many of the open problems in low-latency anonymity
networks, such as generating dummy traffic or preventing Sybil attacks
\cite{sybil}, may be solvable independently from the issues solved by
Tor. Hopefully future systems will not need to reinvent Tor's design.
(But note that while a flexible design benefits researchers,
there is a danger that differing choices of extensions will make users
distinguishable. Experiments should be run on a separate network.)

\textbf{Simple design:} The protocol's design and security
parameters must be well-understood. Additional features impose implementation
and complexity costs; adding unproven techniques to the design threatens
deployability, readability, and ease of security analysis. Tor aims to
deploy a simple and stable system that integrates the best accepted
approaches to protecting anonymity.\\

\noindent{\large\bf Non-goals}\label{subsec:non-goals}\\
In favoring simple, deployable designs, we have explicitly deferred
several possible goals, either because they are solved elsewhere, or because
they are not yet solved.

\textbf{Not peer-to-peer:} Tarzan and MorphMix aim to scale to completely
decentralized peer-to-peer environments with thousands of short-lived
servers, many of which may be controlled by an adversary. This approach
is appealing, but still has many open problems
\cite{tarzan:ccs02,morphmix:fc04}.

\textbf{Not secure against end-to-end attacks:} Tor does not claim
to provide a definitive solution to end-to-end timing or intersection
attacks. Some approaches, such as running an onion router, may help;
see Section~\ref{sec:maintaining-anonymity} for more discussion.

\textbf{No protocol normalization:} Tor does not provide \emph{protocol
normalization} like Privoxy or the Anonymizer. If anonymization from
the responder is desired for complex and variable
protocols like HTTP, Tor must be layered with a filtering proxy such
as Privoxy to hide differences between clients, and expunge protocol
features that leak identity.
Note that by this separation Tor can also provide services that
are anonymous to the network yet authenticated to the responder, like
SSH. Similarly, Tor does not currently integrate
tunneling for non-stream-based protocols like UDP; this too must be
provided by an external service.

\textbf{Not steganographic:} Tor does not try to conceal who is connected
to the network.

\SubSection{Threat Model}
\label{subsec:threat-model}

A global passive adversary is the most commonly assumed threat when
analyzing theoretical anonymity designs. But like all practical
low-latency systems, Tor does not protect against such a strong
adversary. Instead, we assume an adversary who can observe some fraction
of network traffic; who can generate, modify, delete, or delay
traffic; who can operate onion routers of its own; and who can
compromise some fraction of the onion routers.

In low-latency anonymity systems that use layered encryption, the
adversary's typical goal is to observe both the initiator and the
responder. By observing both ends, passive attackers can confirm a
suspicion that Alice is
talking to Bob if the timing and volume patterns of the traffic on the
connection are distinct enough; active attackers can induce timing
signatures on the traffic to force distinct patterns. Rather
than focusing on these \emph{traffic confirmation} attacks,
we aim to prevent \emph{traffic
analysis} attacks, where the adversary uses traffic patterns to learn
which points in the network he should attack.

Our adversary might try to link an initiator Alice with her
communication partners, or try to build a profile of Alice's
behavior. He might mount passive attacks by observing the network edges
and correlating traffic entering and leaving the network---by
relationships in packet timing, volume, or externally visible
user-selected
options. The adversary can also mount active attacks by compromising
routers or keys; by replaying traffic; by selectively denying service
to trustworthy routers to move users to
compromised routers, or denying service to users to see if traffic
elsewhere in the
network stops; or by introducing patterns into traffic that can later be
detected. The adversary might subvert the directory servers to give users
differing views of network state. Additionally, he can try to decrease
the network's reliability by attacking nodes or by performing antisocial
activities from reliable servers and trying to get them taken down;
making the network unreliable flushes users to other less anonymous
systems, where they may be easier to attack.
We summarize
in Section~\ref{sec:attacks} how well the Tor design defends against
each of these attacks.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\Section{The Tor Design}
\label{sec:design}

The Tor network is an overlay network; each onion router (OR)
runs as a normal
user-level process without any special privileges.
Each onion router maintains a long-term TLS \cite{TLS}
connection to every other onion router.
%(We discuss alternatives to this clique-topology assumption in
%Section~\ref{sec:maintaining-anonymity}.)
% A subset of the ORs also act as
%directory servers, tracking which routers are in the network;
%see Section~\ref{subsec:dirservers} for directory server details.
Each user
runs local software called an onion proxy (OP) to fetch directories,
establish circuits across the network,
and handle connections from user applications. These onion proxies accept
TCP streams and multiplex them across the circuits. The onion
router on the other side
of the circuit connects to the destinations of
the TCP streams and relays data.

Each onion router uses three public keys: a long-term identity key, a
short-term onion key, and a short-term link key. The identity
key is used to sign TLS certificates, to sign the OR's \emph{router
descriptor} (a summary of its keys, address, bandwidth, exit policy,
and so on), and (by directory servers) to sign directories. Changing
the identity key of a router is considered equivalent to creating a
new router. The onion key is used to decrypt requests
from users to set up a circuit and negotiate ephemeral keys. Finally,
link keys are used by the TLS protocol when communicating between
onion routers. Each short-term key is rotated periodically and
independently, to limit the impact of key compromise.

Section~\ref{subsec:cells} presents the fixed-size
\emph{cells} that are the unit of communication in Tor. We describe
in Section~\ref{subsec:circuits} how circuits are
built, extended, truncated, and destroyed. Section~\ref{subsec:tcp}
describes how TCP streams are routed through the network. We address
integrity checking in Section~\ref{subsec:integrity-checking},
and resource limiting in Section~\ref{subsec:rate-limit}.
Finally,
Section~\ref{subsec:congestion} talks about congestion control and
fairness issues.

\SubSection{Cells}
\label{subsec:cells}

Onion routers communicate with one another, and with users' OPs, via
TLS connections with ephemeral keys. Using TLS conceals the data on
the connection with perfect forward secrecy, and prevents an attacker
from modifying data on the wire or impersonating an OR.

Traffic passes along these connections in fixed-size cells. Each cell
is 256 bytes (but see Section~\ref{sec:conclusion} for a discussion of
allowing large cells and small cells on the same network), and
consists of a header and a payload. The header includes a circuit
identifier (circID) that specifies which circuit the cell refers to
(many circuits can be multiplexed over the single TLS connection), and
a command to describe what to do with the cell's payload. (Circuit
identifiers are connection-specific: each single circuit has a different
circID on each OP/OR or OR/OR connection it traverses.)
Based on their command, cells are either \emph{control} cells, which are
always interpreted by the node that receives them, or \emph{relay} cells,
which carry end-to-end stream data. The control cell commands are:
\emph{padding} (currently used for keepalive, but also usable for link
padding); \emph{create} or \emph{created} (used to set up a new circuit);
and \emph{destroy} (to tear down a circuit).

Relay cells have an additional header (the relay header) after the
cell header, containing a stream identifier (many streams can
be multiplexed over a circuit); an end-to-end checksum for integrity
checking; the length of the relay payload; and a relay command.
The entire contents of the relay header and the relay cell payload
are encrypted or decrypted together as the relay cell moves along the
circuit, using the 128-bit AES cipher in counter mode to generate a
cipher stream.

The relay commands are: \emph{relay
data} (for data flowing down the stream), \emph{relay begin} (to open a
stream), \emph{relay end} (to close a stream cleanly), \emph{relay
teardown} (to close a broken stream), \emph{relay connected}
(to notify the OP that a relay begin has succeeded), \emph{relay
extend} and \emph{relay extended} (to extend the circuit by a hop,
and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
(to tear down only part of the circuit, and to acknowledge), \emph{relay
sendme} (used for congestion control), and \emph{relay drop} (used to
implement long-range dummies).
We describe each of these cell types and commands in more detail below.
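
To make the cell layout concrete, the following Python sketch packs a
relay cell. This is a minimal illustration, not Tor's actual wire
format: the field widths, field order, and command codes are our own
assumptions; only the 256-byte cell size and the 56-bit streamID are
fixed by the text above.

\begin{verbatim}
import struct

CELL_LEN = 256        # fixed cell size in this design
HEADER_FMT = ">HB"    # circID (2 bytes), command (1) -- assumed widths
RELAY_FMT = ">B7s4sH" # relay cmd, 7-byte streamID, 4-byte digest, length
CMD_RELAY = 3         # hypothetical command code

def pack_relay_cell(circ_id, relay_cmd, stream_id, digest, data):
    """Build one fixed-size relay cell, zero-padding the payload."""
    header = struct.pack(HEADER_FMT, circ_id, CMD_RELAY)
    relay_header = struct.pack(RELAY_FMT, relay_cmd, stream_id,
                               digest, len(data))
    body = relay_header + data
    assert len(header) + len(body) <= CELL_LEN
    return header + body.ljust(CELL_LEN - len(header), b"\x00")
\end{verbatim}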

\SubSection{Circuits and streams}
\label{subsec:circuits}

Onion Routing originally built one circuit for each
TCP stream. Because building a circuit can take several tenths of a
second (due to public-key cryptography and network latency),
this design imposed high costs on applications like web browsing that
open many TCP streams.

In Tor, each circuit can be shared by many TCP streams. To avoid
delays, users construct circuits preemptively. To limit linkability
among their streams, users' OPs build a new circuit
periodically if the previous one has been used,
and expire old used circuits that no longer have any open streams.
OPs consider making a new circuit once a minute: thus
even heavy users spend negligible time
building circuits, but a limited number of requests can be linked
to each other through a given exit node. Also, because circuits are built
in the background, OPs can recover from failed circuit creation
without delaying streams and thereby harming user experience.\\

\noindent{\large\bf Constructing a circuit}\label{subsubsec:constructing-a-circuit}\\
%\subsubsection{Constructing a circuit}
A user's OP constructs circuits incrementally, negotiating a
symmetric key with each OR on the circuit, one hop at a time. To begin
creating a new circuit, the OP (call her Alice) sends a
\emph{create} cell to the first node in her chosen path (call him Bob).
(She chooses a new
circID $C_{AB}$ not currently used on the connection from her to Bob.)
The \emph{create} cell's
payload contains the first half of the Diffie-Hellman handshake
($g^x$), encrypted to Bob's onion key. Bob
responds with a \emph{created} cell containing the second half of the
DH handshake, along with a hash of the negotiated key $K=g^{xy}$.

Once the circuit has been established, Alice and Bob can send one
another relay cells encrypted with the negotiated
key.\footnote{Actually, the negotiated key is used to derive two
symmetric keys: one for each direction.} More detail is given in
the next section.

To extend the circuit further, Alice sends a \emph{relay extend} cell
to Bob, specifying the address of the next OR (call her Carol), and
an encrypted $g^{x_2}$ for her. Bob copies the half-handshake into a
\emph{create} cell, and passes it to Carol to extend the circuit.
(Bob chooses a new circID $C_{BC}$ not currently used on the connection
between him and Carol. Alice never needs to know this circID; only Bob
associates $C_{AB}$ on his connection with Alice to $C_{BC}$ on
his connection with Carol.)
When Carol responds with a \emph{created} cell, Bob wraps the payload
into a \emph{relay extended} cell and passes it back to Alice. Now
the circuit is extended to Carol, and Alice and Carol share a common key
$K_2 = g^{x_2 y_2}$.

To extend the circuit to a third node or beyond, Alice
proceeds as above, always telling the last node in the circuit to
extend one hop further.

This circuit-level handshake protocol achieves unilateral entity
authentication (Alice knows she's handshaking with the OR, but
the OR doesn't care who is opening the circuit---Alice uses no public key
and is trying to remain anonymous) and unilateral key authentication
(Alice and the OR agree on a key, and Alice knows only the OR learns
it). It also achieves forward
secrecy and key freshness. More formally, the protocol is as follows
(where $E_{PK_{Bob}}(\cdot)$ is encryption with Bob's public key,
$H$ is a secure hash function, and $|$ is concatenation):
\begin{equation*}
\begin{aligned}
\mathrm{Alice} \rightarrow \mathrm{Bob}&: E_{PK_{Bob}}(g^x) \\
\mathrm{Bob} \rightarrow \mathrm{Alice}&: g^y, H(K | \mathrm{``handshake"}) \\
\end{aligned}
\end{equation*}

\noindent In the second step, Bob proves that it was he who received $g^x$,
and who chose $y$. We use PK encryption in the first step
(rather than, say, using the first two steps of STS, which has a
signature in the second step) because a single cell is too small to
hold both a public key and a signature. Preliminary analysis with the
NRL protocol analyzer \cite{meadows96} shows this protocol to be
secure (including perfect forward secrecy) under the
traditional Dolev-Yao model.\\
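
The arithmetic of this handshake is easy to check in code. The sketch
below uses Python's built-in modular exponentiation with a toy group
($p = 2^{127}-1$, far too small for real use) and SHA-1; the
public-key wrapping $E_{PK_{Bob}}(\cdot)$ is elided.

\begin{verbatim}
import hashlib, secrets

p = 2**127 - 1  # toy prime modulus: a stand-in, not Tor's real group
g = 3

def dh_half():
    x = secrets.randbelow(p - 2) + 1    # private exponent
    return x, pow(g, x, p)              # (x, g^x mod p)

x, gx = dh_half()                       # Alice -> Bob: E_PK_Bob(g^x)
y, gy = dh_half()
K = pow(gy, x, p)                       # Alice's view of K = g^(xy)
assert K == pow(gx, y, p)               # matches Bob's view

# Bob -> Alice: g^y together with H(K | "handshake")
confirm = hashlib.sha1(K.to_bytes(16, "big") + b"handshake").digest()
\end{verbatim}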

\noindent{\large\bf Relay cells}\\
%\subsubsection{Relay cells}

Once Alice has established the circuit (so she shares keys with each
OR on the circuit), she can send relay cells. Recall that every relay
cell has a streamID that indicates to which
stream the cell belongs. This streamID allows a relay cell to be
addressed to any OR on the circuit. Upon receiving a relay
cell, an OR looks up the corresponding circuit, and decrypts the relay
header and payload with the session key for that circuit.
If the cell is headed downstream (away from Alice) the OR then checks
whether the decrypted streamID is recognized---either because it
corresponds to an open stream at this OR for the given circuit, or because
it is the control streamID (zero). If the OR recognizes the
streamID, it accepts the relay cell and processes it as described
below. Otherwise,
the OR looks up the circID and OR for the
next step in the circuit, replaces the circID as appropriate, and
sends the decrypted relay cell to the next OR. (If the OR at the end
of the circuit receives an unrecognized relay cell, an error has
occurred, and the cell is discarded.)
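
This forwarding decision can be summarized in a few lines. The sketch
below is illustrative only: the table layout and helper names (the
\texttt{circuits} map, \texttt{decrypt}) are our own assumptions, not
the implementation's data structures.

\begin{verbatim}
def handle_relay_cell(circuits, conn_id, circ_id, cell, decrypt):
    """Downstream relay-cell processing at one OR (sketch)."""
    circ = circuits[(conn_id, circ_id)]      # per-circuit state
    payload = decrypt(circ["key"], cell)     # peel this OR's layer
    stream_id = payload[:7]                  # 56-bit streamID field
    if stream_id == b"\x00" * 7 or stream_id in circ["streams"]:
        return ("process", payload)          # recognized: ours
    if circ["next"] is not None:             # swap circID, forward
        next_conn, next_circ_id = circ["next"]
        return ("forward", next_conn, next_circ_id, payload)
    return ("discard", None)   # unrecognized at circuit end: error
\end{verbatim}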

OPs treat incoming relay cells similarly: they iteratively unwrap the
relay header and payload with the session keys shared with each
OR on the circuit, from the closest to farthest. (Because we use a
stream cipher, encryption operations may be inverted in any order.)
If at any stage the OP recognizes the streamID, the cell must have
originated at the OR whose encryption has just been removed.

To construct a relay cell addressed to a given OR, Alice iteratively
encrypts the cell payload (that is, the relay header and payload) with
the symmetric key of each hop up to that OR. Because the streamID is
encrypted to a different value at each step, only at the targeted OR
will it have a meaningful value.\footnote{
%  Should we just say that 2^56 is itself negligible?
%  Assuming 4-hop circuits with 10 streams per hop, there are 33
%  possible bad streamIDs before the last circuit. This still
%  gives an error only once every 2 million terabytes (approx).
With 56 bits of streamID per cell, the probability of an accidental
collision is far lower than the chance of hardware failure.}
This \emph{leaky pipe} circuit topology
allows Alice's streams to exit at different ORs on a single circuit.
Alice may choose different exit points because of their exit policies,
or to keep the ORs from knowing that two streams
originate from the same person.

When an OR later replies to Alice with a relay cell, it
encrypts the cell's relay header and payload with the single key it
shares with Alice, and sends the cell back toward Alice along the
circuit. Subsequent ORs add further layers of encryption as they
relay the cell back to Alice.
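
The layering itself can be demonstrated with any stream cipher. The
sketch below substitutes a toy SHA-1 counter-mode keystream for Tor's
AES-CTR (and restarts the keystream per call, whereas the real protocol
keeps a running cipher state per hop), so it shows only the onion
structure, not real cryptography.

\begin{verbatim}
import hashlib

def keystream(key, length):
    """Toy counter-mode keystream; a stand-in for AES-CTR."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha1(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(data, ks):
    return bytes(a ^ b for a, b in zip(data, ks))

def onion_encrypt(keys, payload):
    """Alice applies every hop's layer (order is irrelevant for XOR)."""
    for key in keys:
        payload = xor(payload, keystream(key, len(payload)))
    return payload

# Each OR peels exactly one layer with its own key:
keys = [b"entry-key", b"middle-key", b"exit-key"]
cell = onion_encrypt(keys, b"relay data ...")
for k in keys:
    cell = xor(cell, keystream(k, len(cell)))
assert cell == b"relay data ..."
\end{verbatim}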

To tear down a whole circuit, Alice sends a \emph{destroy} control
cell. Each OR in the circuit receives the \emph{destroy} cell, closes
all open streams on that circuit, and passes a new \emph{destroy} cell
forward. But just as circuits are built incrementally, they can also
be torn down incrementally: Alice can send a \emph{relay
truncate} cell to a single OR on the circuit. That OR then sends a
\emph{destroy} cell forward, and acknowledges with a
\emph{relay truncated} cell. Alice can then extend the circuit to
different nodes, all without signaling to the intermediate nodes (or
a limited observer) that she has changed her circuit.
Similarly, if a node on the circuit goes down, the adjacent
node can send a \emph{relay truncated} cell back to Alice. Thus the
``break a node and see which circuits go down'' attack
\cite{freedom21-security} is weakened.

\SubSection{Opening and closing streams}
\label{subsec:tcp}

When Alice's application wants a TCP connection to a given
address and port, it asks the OP (via SOCKS) to make the
connection. The OP chooses the newest open circuit (or creates one if
none is available), and chooses a suitable OR on that circuit to be the
exit node (usually the last node, but maybe others due to exit policy
conflicts; see Section~\ref{subsec:exitpolicies}.) The OP then opens
the stream by sending a \emph{relay begin} cell to the exit node,
using a streamID of zero (so the OR will recognize it), containing as
its relay payload a new randomly generated streamID, the destination
address, and the destination port. Once the
exit node completes the connection to the remote host, it responds
with a \emph{relay connected} cell. Upon receipt, the OP sends a
SOCKS reply to notify the application of its success. The OP
now accepts data from the application's TCP stream, packaging it into
\emph{relay data} cells and sending those cells along the circuit to
the chosen OR.
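
The OP's side of this exchange fits in a few lines. The command codes
and the \texttt{circuit} object's \texttt{send\_relay} and
\texttt{recv\_relay} methods below are hypothetical stand-ins for the
real protocol machinery; the structure of the begin cell follows the
text above.

\begin{verbatim}
import os

RELAY_BEGIN, RELAY_CONNECTED = 1, 4  # illustrative command codes
ZERO_STREAM = b"\x00" * 7            # the recognized streamID zero

def open_stream(circuit, host, port):
    """OP side of stream setup (sketch)."""
    new_sid = os.urandom(7)          # fresh random streamID
    payload = new_sid + ("%s:%d" % (host, port)).encode()
    # The begin cell travels on streamID zero so the exit node
    # recognizes it; the new streamID rides in the payload.
    circuit.send_relay(RELAY_BEGIN, ZERO_STREAM, payload)
    cmd, sid, _ = circuit.recv_relay()
    if cmd != RELAY_CONNECTED:
        raise ConnectionError("exit could not reach %s:%d"
                              % (host, port))
    return new_sid                   # relay data cells may now flow
\end{verbatim}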

There's a catch to using SOCKS, however---some applications pass the
alphanumeric hostname to the proxy, while others resolve it into an IP
address first and then pass the IP address to the proxy. If the
application does DNS resolution first, Alice will thereby
reveal her destination to the DNS server. Common applications
like Mozilla and SSH have this flaw.

In the case of Mozilla, the flaw is easy to address: the filtering HTTP
proxy called Privoxy does the SOCKS call safely, and Mozilla talks to
Privoxy safely. But a portable general solution, such as is needed for
SSH, is
an open problem. Modifying or replacing the local nameserver
can be invasive, brittle, and not portable. Forcing the resolver
library to do resolution via TCP rather than UDP is
hard, and also has portability problems. We could also provide a
tool similar to \emph{dig} to perform a private lookup through the
Tor network. Our current answer is to encourage the use of
privacy-aware proxies like Privoxy wherever possible.

Closing a Tor stream is analogous to closing a TCP stream: it uses a
two-step handshake for normal operation, or a one-step handshake for
errors. If the stream closes abnormally, the adjacent node simply sends a
\emph{relay teardown} cell. If the stream closes normally, the node sends
a \emph{relay end} cell down the circuit. When the other side has sent
back its own \emph{relay end} cell, the stream can be torn down. Because
all relay cells use layered encryption, only the destination OR knows
that a given relay cell is a request to close a stream. This two-step
handshake allows Tor to support TCP-based applications that use half-closed
connections.
% such as broken HTTP clients that close their side of the
%stream after writing but are still willing to read.

\SubSection{Integrity checking on streams}
\label{subsec:integrity-checking}

Because the old Onion Routing design used a stream cipher, traffic was
vulnerable to a malleability attack: though the attacker could not
decrypt cells, any changes to encrypted data
would create corresponding changes to the data leaving the network.
(Even an external adversary could do this, despite link encryption, by
inverting bits on the wire.)

This weakness allowed an adversary to change a padding cell to a destroy
cell; change the destination address in a \emph{relay begin} cell to the
adversary's webserver; or change an FTP command from
{\tt dir} to {\tt rm~*}. Any OR or external adversary
along the circuit could introduce such corruption in a stream, if it
knew or could guess the encrypted content.

Tor prevents external adversaries from mounting this attack by
using TLS on its links, which provides integrity checking.
Addressing the insider malleability attack, however, is
more complex.

We could do integrity checking of the relay cells at each hop, either
by including hashes or by using an authenticating cipher mode like
EAX \cite{eax}, but there are some problems. First, these approaches
impose a message-expansion overhead at each hop, and so we would have to
either leak the path length or waste bytes by padding to a maximum
path length. Second, these solutions can only verify traffic coming
from Alice: ORs would not be able to include suitable hashes for
the intermediate hops, since the ORs on a circuit do not know the
other ORs' session keys. Third, we have already accepted that our design
is vulnerable to end-to-end timing attacks; tagging attacks performed
within the circuit provide no additional information to the attacker.

Thus, we check integrity only at the edges of each stream. When Alice
negotiates a key with a new hop, they each initialize a SHA-1
digest with a derivative of that key,
thus beginning with randomness that only the two of them know. From
then on they each incrementally add to the SHA-1 digest the contents of
all relay cells they create, and include with each relay cell the
first four bytes of the current digest. Each also keeps a SHA-1
digest of data received, to verify that the received hashes are correct.

To be sure of removing or modifying a cell, the attacker must be able
to deduce the current digest state (which depends on all
traffic between Alice and Bob, starting with their negotiated key).
Attacks on SHA-1 where the adversary can incrementally add to a hash
to produce a new valid hash don't work, because all hashes are
end-to-end encrypted across the circuit. The computational overhead
of computing the digests is minimal compared to doing the AES
encryption performed at each hop of the circuit. We use only four
bytes per cell to minimize overhead; the chance that an adversary will
correctly guess a valid hash
%, plus the payload the current cell,
is acceptably low, given that Alice or Bob tear down the circuit if they
receive a bad hash.
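
A running edge-to-edge digest of this kind is simple to maintain; the
following sketch shows the bookkeeping, glossing over details such as
zeroing the digest field of a cell before hashing the cell that
carries it.

\begin{verbatim}
import hashlib

class StreamDigest:
    """Running SHA-1 over all relay cells in one direction (sketch)."""
    def __init__(self, key_material):
        # Seeded with a derivative of the negotiated key, so the
        # digest starts from randomness only the endpoints know.
        self._h = hashlib.sha1(key_material)

    def tag(self, cell_payload):
        """Sender: fold the cell in; return the 4-byte tag to embed."""
        self._h.update(cell_payload)
        return self._h.digest()[:4]

    def check(self, cell_payload, received_tag):
        """Receiver: recompute; on mismatch, tear down the circuit."""
        self._h.update(cell_payload)
        return self._h.digest()[:4] == received_tag
\end{verbatim}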

\SubSection{Rate limiting and fairness}
\label{subsec:rate-limit}

Volunteers are generally more willing to run services that can limit
their own bandwidth usage. To accommodate them, Tor servers use a
token bucket approach \cite{tannenbaum96} to
enforce a long-term average rate of incoming bytes, while still
permitting short-term bursts above the allowed bandwidth. Current bucket
sizes are set to ten seconds' worth of traffic.
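
A token bucket of this shape takes only a few lines; this sketch uses
illustrative names and refills from the wall clock, where a real
server would refill on its event loop.

\begin{verbatim}
import time

class TokenBucket:
    """Cap the long-term average rate while allowing short bursts."""
    def __init__(self, rate):            # rate in bytes per second
        self.rate = rate
        self.capacity = 10 * rate        # ten seconds' worth of traffic
        self.tokens = self.capacity
        self.last = time.monotonic()

    def consume(self, nbytes):
        """Try to spend tokens for nbytes read; False means wait."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False
\end{verbatim}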

%Further, we want to avoid starving any Tor streams. Entire circuits
%could starve if we read greedily from connections and one connection
%uses all the remaining bandwidth. We solve this by dividing the number
%of tokens in the bucket by the number of connections that want to read,
%and reading at most that number of bytes from each connection. We iterate
%this procedure until the number of tokens in the bucket is under some
%threshold (currently 10KB), at which point we greedily read from connections.

Because the Tor protocol generates roughly the same number of outgoing
bytes as incoming bytes, it is sufficient in practice to limit only
incoming bytes.
With TCP streams, however, the correspondence is not one-to-one:
relaying a single incoming byte can require an entire 256-byte cell.
(We can't just wait for more bytes, because the local application may
be waiting for a reply.) Therefore, we treat this case as if the entire
cell size had been read, regardless of the fullness of the cell.

Further, inspired by Rennhard et al.'s design in \cite{anonnet}, a
circuit's edges heuristically distinguish interactive streams from bulk
streams by comparing the frequency with which they supply cells. We can
provide good latency for interactive streams by giving them preferential
service, while still giving good overall throughput to the bulk
streams. Such preferential treatment presents a possible end-to-end
attack, but an adversary observing both
ends of the stream can already learn this information through timing
attacks.

\SubSection{Congestion control}
\label{subsec:congestion}

Even with bandwidth rate limiting, we still need to worry about
congestion, either accidental or intentional. If enough users choose the
same OR-to-OR connection for their circuits, that connection can become
saturated. For example, an attacker could send a large file
through the Tor network to a webserver he runs, and then
refuse to read any of the bytes at the webserver end of the
circuit. Without some congestion control mechanism, these bottlenecks
can propagate back through the entire network. We don't need to
reimplement full TCP windows (with sequence numbers,
the ability to drop cells when we're full and retransmit later, and so
on),
because TCP already guarantees in-order delivery of each
cell.
%But we need to investigate further the effects of the current
%parameters on throughput and latency, while also keeping privacy in mind;
%see Section~\ref{sec:maintaining-anonymity} for more discussion.
We describe our response below.

\textbf{Circuit-level throttling:}
To control a circuit's bandwidth usage, each OR keeps track of two
windows. The \emph{packaging window} tracks how many relay data cells the OR is
allowed to package (from incoming TCP streams) for transmission back to the OP,
and the \emph{delivery window} tracks how many relay data cells it is willing
to deliver to TCP streams outside the network. Each window is initialized
(say, to 1000 data cells). When a data cell is packaged or delivered,
the appropriate window is decremented. When an OR has received enough
data cells (currently 100), it sends a \emph{relay sendme} cell towards the OP,
with streamID zero. When an OR receives a \emph{relay sendme} cell with
streamID zero, it increments its packaging window. Either of these cells
increments the corresponding window by 100. If the packaging window
reaches 0, the OR stops reading from TCP connections for all streams
on the corresponding circuit, and sends no more relay data cells until
receiving a \emph{relay sendme} cell.

The OP behaves identically, except that it must track a packaging window
and a delivery window for every OR in the circuit. If a packaging window
reaches 0, it stops reading from streams destined for that OR.

\textbf{Stream-level throttling}:
The stream-level congestion control mechanism is similar to the
circuit-level mechanism. ORs and OPs use \emph{relay sendme} cells
to implement end-to-end flow control for individual streams across
circuits. Each stream begins with a packaging window (currently 500 cells),
and increments the window by a fixed value (50) upon receiving a \emph{relay
sendme} cell. Rather than always returning a \emph{relay sendme} cell as soon
as enough cells have arrived, the stream-level congestion control also
has to check whether data has been successfully flushed onto the TCP
stream; it sends the \emph{relay sendme} cell only when the number of bytes pending
to be flushed is under some threshold (currently 10 cells' worth).

Currently, non-data relay cells do not affect the windows. Thus we
avoid potential deadlock issues, for example, arising because a stream
can't send a \emph{relay sendme} cell when its packaging window is empty.

These arbitrarily chosen parameters
%are probably not optimal; more
%research remains to find which parameters
seem to give tolerable throughput and delay; more research remains.
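
The window bookkeeping above reduces to a pair of counters per circuit
(and, analogously, per stream). This sketch uses the constants from
the text; the class and method names are illustrative.

\begin{verbatim}
CIRC_WINDOW_START = 1000  # initial packaging/delivery windows (cells)
CIRC_SENDME_BATCH = 100   # cells acknowledged per circuit sendme
                          # (stream level: 500 and 50)

class CircuitWindow:
    def __init__(self):
        self.package = CIRC_WINDOW_START
        self.deliver = CIRC_WINDOW_START
        self.since_sendme = 0

    def can_package(self):
        """False means: stop reading TCP data for this circuit."""
        return self.package > 0

    def cell_packaged(self):
        self.package -= 1

    def cell_delivered(self):
        """Count a delivered cell; True means a sendme is now due."""
        self.deliver -= 1
        self.since_sendme += 1
        if self.since_sendme >= CIRC_SENDME_BATCH:
            self.since_sendme = 0
            self.deliver += CIRC_SENDME_BATCH
            return True   # send relay sendme with streamID zero
        return False

    def sendme_received(self):
        self.package += CIRC_SENDME_BATCH
\end{verbatim}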

\Section{Other design decisions}

\SubSection{Resource management and denial-of-service}
\label{subsec:dos}

Providing Tor as a public service creates many opportunities for
denial-of-service attacks against the network. While
flow control and rate limiting (discussed in
Section~\ref{subsec:congestion}) prevent users from consuming more
bandwidth than routers are willing to provide, opportunities remain for
users to
consume more network resources than their fair share, or to render the
network unusable for others.

First of all, there are several CPU-consuming denial-of-service
attacks wherein an attacker can force an OR to perform expensive
cryptographic operations. For example, an attacker who sends a
\emph{create} cell full of junk bytes can force an OR to perform an RSA
decrypt. Similarly, an attacker can
fake the start of a TLS handshake, forcing the OR to carry out its
(comparatively expensive) half of the handshake at no real computational
cost to the attacker.

Several approaches exist to address these attacks. First, ORs may
require clients to solve a puzzle \cite{puzzles-tls} while beginning new
TLS handshakes or accepting \emph{create} cells. So long as these
tokens are easy to verify and computationally expensive to produce, this
approach limits the attack multiplier. Additionally, ORs may limit
the rate at which they accept create cells and TLS connections, so that
the computational work of processing them does not drown out the (comparatively
inexpensive) work of symmetric cryptography needed to keep cells
flowing. This rate limiting could, however, allow an attacker
to slow down other users when they build new circuits.

% What about link-to-link rate limiting?

Adversaries can also attack the Tor network's hosts and network
links. Disrupting a single circuit or link breaks all streams passing
along that part of the circuit. Indeed, this same loss of service
occurs when a router crashes or its operator restarts it. The current
Tor design treats such attacks as intermittent network failures, and
depends on users and applications to respond or recover as appropriate. A
future design could use an end-to-end TCP-like acknowledgment protocol,
so that no streams are lost unless the entry or exit point itself is
disrupted. This solution would require more buffering at the network
edges, however, and the performance and anonymity implications from this
extra complexity still require investigation.
1006 \SubSection{Exit policies and abuse}
1007 \label{subsec:exitpolicies}
1009 % originally, we planned to put the "users only know the hostname,
1010 % not the IP, but exit policies are by IP" problem here too. Not
1011 % worth putting in the submission, but worth thinking about putting
1012 % in sometime somehow. -RD
1014 Exit abuse is a serious barrier to wide-scale Tor deployment. Anonymity
1015 presents would-be vandals and abusers with an opportunity to hide
1016 the origins of their activities. Attackers can harm the Tor network by
1017 implicating exit servers for their abuse. Also, applications that commonly
1018 use IP-based authentication (such as institutional mail or webservers)
1019 can be fooled by the fact that anonymous connections appear to originate
1020 at the exit OR.
1022 We stress that Tor does not enable any new class of abuse. Spammers
1023 and other attackers already have access to thousands of misconfigured
1024 systems worldwide, and the Tor network is far from the easiest way
1025 to launch antisocial or illegal attacks.
1026 %Indeed, because of its limited
1027 %anonymity, Tor is probably not a good way to commit crimes.
1028 But because the
1029 onion routers can easily be mistaken for the originators of the abuse,
1030 and the volunteers who run them may not want to deal with the hassle of
1031 repeatedly explaining anonymity networks, we must block or limit
1032 the abuse that travels through the Tor network.
1034 To mitigate abuse issues, in Tor, each onion router's \emph{exit policy}
1035 describes to which external addresses and ports the router will
1036 connect. On one end of the spectrum are \emph{open exit}
1037 nodes that will connect anywhere. On the other end are \emph{middleman}
1038 nodes that only relay traffic to other Tor nodes, and \emph{private exit}
1039 nodes that only connect to a local host or network. Using a private
1040 exit (if one exists) is a more secure way for a client to connect to a
1041 given host or network---an external adversary cannot eavesdrop traffic
1042 between the private exit and the final destination, and so is less sure of
1043 Alice's destination and activities. Most onion routers will function as
1044 \emph{restricted exits} that permit connections to the world at large,
1045 but prevent access to certain abuse-prone addresses and services.
1046 Additionally, in some cases the OR can authenticate clients to
1047 prevent exit abuse without harming anonymity \cite{or-discex00}.
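To make exit policies concrete, the sketch below evaluates a first-match
rule list; the rule format is simplified for illustration (a real policy
would also need address masks and port ranges):

\begin{verbatim}
# Simplified exit policy: ordered (action, address, port) rules;
# the first matching rule wins, and "*" is a wildcard.
POLICY = [
    ("reject", "*", 25),   # refuse SMTP, a common abuse vector
    ("accept", "*", "*"),  # otherwise behave as an open exit
]

def exit_allowed(addr: str, port: int) -> bool:
    for action, rule_addr, rule_port in POLICY:
        if rule_addr in ("*", addr) and rule_port in ("*", port):
            return action == "accept"
    return False  # no rule matched: refuse by default

assert exit_allowed("192.0.2.1", 80) and not exit_allowed("192.0.2.1", 25)
\end{verbatim}

In this model, a middleman node is simply a node whose first rule rejects
all external addresses.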
1049 %The abuse issues on closed (e.g. military) networks are different
1050 %from the abuse on open networks like the Internet. While these IP-based
1051 %access controls are still commonplace on the Internet, on closed networks,
1052 %nearly all participants will be honest, and end-to-end authentication
1053 %can be assumed for important traffic.
1055 Many administrators will use port restrictions to support only a
1056 limited set of services, such as HTTP, SSH, or AIM.
1057 This is not a complete solution, of course, since abuse opportunities for these
1058 protocols are still well known.
1060 A further solution may be to use proxies to clean traffic for certain
1061 protocols as it leaves the network. For example, much abusive HTTP
1062 behavior (such as exploiting buffer overflows or well-known script
1063 vulnerabilities) can be detected in a straightforward manner.
1064 Similarly, one could run automatic spam filtering software (such as
1065 SpamAssassin) on email exiting the OR network.
1067 ORs may also rewrite exiting traffic to append
1068 headers or other information indicating that the traffic has passed
1069 through an anonymity service. This approach is commonly used
1070 by email-only anonymity systems. ORs can also
1071 run on servers with hostnames like {\tt anonymous} to further
1072 alert abuse targets to the nature of the anonymous traffic.
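For instance, an exit could rewrite outgoing HTTP requests as sketched
below; the header name is hypothetical, not one that Tor emits:

\begin{verbatim}
# Sketch: insert a (hypothetical) warning header after the HTTP
# request line, so abuse targets can tell the request came through
# an anonymity service.
def tag_http_request(raw: bytes) -> bytes:
    request_line, sep, rest = raw.partition(b"\r\n")
    note = b"X-Anonymized: exited-from-anonymity-network\r\n"
    return request_line + sep + note + rest
\end{verbatim}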
1074 A mixture of open and restricted exit nodes allows the most
1075 flexibility for volunteers running servers. But while having many
1076 middleman nodes provides a large and robust network,
1077 having only a few exit nodes reduces the number of points
1078 an adversary needs to monitor for traffic analysis, and places a
1079 greater burden on the exit nodes. This tension can be seen in the
1080 Java Anon Proxy
1081 cascade model, wherein only one node in each cascade needs to handle
1082 abuse complaints---but an adversary only needs to observe the entry
1083 and exit of a cascade to perform traffic analysis on all that
1084 cascade's users. The hydra model (many entries, few exits) presents a
1085 different compromise: only a few exit nodes are needed, but an
1086 adversary needs to work harder to watch all the clients; see
1087 Section~\ref{sec:conclusion}.
1089 Finally, we note that exit abuse must not be dismissed as a peripheral
1090 issue: when a system's public image suffers, it can reduce the number
1091 and diversity of that system's users, and thereby reduce the anonymity
1092 of the system itself. Like usability, public perception is a
1093 security parameter. Sadly, preventing abuse of open exit nodes is an
1094 unsolved problem, and will probably remain an arms race for the
foreseeable future. The abuse problems faced by Princeton's CoDeeN
1096 project \cite{darkside} give us a glimpse of likely issues.
1098 \SubSection{Directory Servers}
1099 \label{subsec:dirservers}
1101 First-generation Onion Routing designs \cite{freedom2-arch,or-jsac98} used
1102 in-band network status updates: each router flooded a signed statement
1103 to its neighbors, which propagated it onward. But anonymizing networks
1104 have different security goals than typical link-state routing protocols.
1105 For example, delays (accidental or intentional)
1106 that can cause different parts of the network to have different views
1107 of link-state and topology are not only inconvenient: they give
1108 attackers an opportunity to exploit differences in client knowledge.
1109 We also worry about attacks to deceive a
1110 client about the router membership list, topology, or current network
1111 state. Such \emph{partitioning attacks} on client knowledge help an
1112 adversary to efficiently deploy resources
1113 against a target \cite{minion-design}.
1116 Tor uses a small group of redundant, well-known onion routers to
1117 track changes in network topology and node state, including keys and
1118 exit policies. Each such \emph{directory server} acts as an HTTP
1119 server, so participants can fetch current network state and router
1120 lists, and so other ORs can upload
1121 state information. Onion routers periodically publish signed
1122 statements of their state to each directory server. The directory servers
1123 combine this state information with their own views of network liveness,
1124 and generate a signed description (a \emph{directory}) of the entire
1125 network state. Client software is
1126 pre-loaded with a list of the directory servers and their keys,
1127 to bootstrap each client's view of the network.
1129 When a directory server receives a signed statement for an OR, it
1130 checks whether the OR's identity key is recognized. Directory
1131 servers do not automatically advertise unrecognized ORs. (If they did,
1132 an adversary could take over the network by creating many servers
1133 \cite{sybil}.) Instead, new nodes must be approved by the directory
1134 server administrator before they are included. Mechanisms for automated
1135 node approval are an area of active research, and are discussed more
1136 in Section~\ref{sec:maintaining-anonymity}.
1138 Of course, a variety of attacks remain. An adversary who controls
1139 a directory server can track clients by providing them different
1140 information---perhaps by listing only nodes under its control, or by
1141 informing only certain clients about a given node. Even an external
1142 adversary can exploit differences in client knowledge: clients who use
1143 a node listed on one directory server but not the others are vulnerable.
1145 Thus these directory servers must be synchronized and redundant, so
1146 that they can agree on a common directory. Clients should only trust
1147 this directory if it is signed by a threshold of the directory
1148 servers.
1150 The directory servers in Tor are modeled after those in Mixminion
1151 \cite{minion-design}, but our situation is easier. First, we make the
1152 simplifying assumption that all participants agree on the set of
1153 directory servers. Second, while Mixminion needs to predict node
1154 behavior, Tor only needs a threshold consensus of the current
1155 state of the network.
1157 Tor directory servers build a consensus directory through a simple
1158 four-round broadcast protocol. In round one, each server dates and
1159 signs its current opinion, and broadcasts it to the other directory
1160 servers; then in round two, each server rebroadcasts all the signed
1161 opinions it has received. At this point all directory servers check
1162 to see whether any server has signed multiple opinions in the same
1163 period. Such a server is either broken or cheating, so the protocol
1164 stops and notifies the administrators, who either remove the cheater
1165 or wait for the broken server to be fixed. If there are no
discrepancies, each directory server then locally runs an algorithm
(described below) on the set of opinions, producing a uniform shared
directory. In
1169 round three servers sign this directory and broadcast it; and finally
1170 in round four the servers rebroadcast the directory and all the
1171 signatures. If any directory server drops out of the network, its
1172 signature is not included on the final directory.
1174 The rebroadcast steps ensure that a directory server is heard by
1175 either all of the other servers or none of them, even when some links
1176 are down (assuming that any two directory servers can talk directly or
1177 via a third). Broadcasts are feasible because there are relatively few
1178 directory servers (currently 3, but we expect as many as 9 as the network
1179 scales). Computing the shared directory locally is a straightforward
1180 threshold voting process: we include an OR if a majority of directory
1181 servers believe it to be good.
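A sketch of this local computation, together with the client-side
threshold check described above; modeling each opinion as a set of router
identities is a simplification for illustration:

\begin{verbatim}
# Each server's opinion is modeled as the set of router identities
# it believes to be good; the shared directory keeps every router
# listed by a strict majority of the directory servers.
def compute_directory(opinions: list[set[str]]) -> set[str]:
    quorum = len(opinions) // 2 + 1
    candidates = set().union(*opinions)
    return {r for r in candidates
            if sum(r in op for op in opinions) >= quorum}

# Clients accept a directory only if it carries enough signatures.
def client_accepts(valid_signatures: int, num_servers: int) -> bool:
    return valid_signatures > num_servers // 2

assert compute_directory([{"A", "B"}, {"A", "C"}, {"A"}]) == {"A"}
\end{verbatim}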
1183 To avoid attacks where a router connects to all the directory servers
1184 but refuses to relay traffic from other routers, the directory servers
1185 must build circuits and use them to anonymously test router reliability
1186 \cite{mix-acc}.
1188 Using directory servers is simpler and more flexible than flooding.
1189 Flooding is expensive, and complicates the analysis when we
1190 start experimenting with non-clique network topologies. Signed
1191 directories are less expensive, because they can be cached by other
1192 onion routers.
1193 Thus directory servers are not a performance
1194 bottleneck when we have many users, and do not aid traffic analysis by
1195 forcing clients to periodically announce their existence to any
1196 central point.
1198 \Section{Rendezvous points and hidden services}
1199 \label{sec:rendezvous}
1201 Rendezvous points are a building block for \emph{location-hidden
1202 services} (also known as \emph{responder anonymity}) in the Tor
1203 network. Location-hidden services allow Bob to offer a TCP
1204 service, such as a webserver, without revealing its IP address.
1205 This type of anonymity protects against distributed DoS attacks:
1206 attackers are forced to attack the onion routing network as a whole
1207 rather than just Bob's IP address.
1209 Our design for location-hidden servers has the following goals.
1210 \textbf{Access-controlled:} Bob needs a way to filter incoming requests,
1211 so an attacker cannot flood Bob simply by making many connections to him.
1212 \textbf{Robust:} Bob should be able to maintain a long-term pseudonymous
1213 identity even in the presence of router failure. Bob's service must
1214 not be tied to a single OR, and Bob must be able to tie his service
1215 to new ORs. \textbf{Smear-resistant:}
1216 A social attacker who offers an illegal or disreputable location-hidden
1217 service should not be able to ``frame'' a rendezvous router by
1218 making observers believe the router created that service.
1219 %slander-resistant? defamation-resistant?
1220 \textbf{Application-transparent:} Although we require users
1221 to run special software to access location-hidden servers, we must not
1222 require them to modify their applications.
1224 We provide location-hiding for Bob by allowing him to advertise
1225 several onion routers (his \emph{introduction points}) as contact
1226 points. He may do this on any robust efficient
1227 key-value lookup system with authenticated updates, such as a
distributed hash table (DHT) like CFS \cite{cfs:sosp01}.\footnote{
1229 Rather than rely on an external infrastructure, the Onion Routing network
1230 can run the DHT itself. At first, we can run a simple lookup
1231 system on the
1232 directory servers.} Alice, the client, chooses an OR as her
1233 \emph{rendezvous point}. She connects to one of Bob's introduction
1234 points, informs him of her rendezvous point, and then waits for him
1235 to connect to the rendezvous point. This extra level of indirection
1236 helps Bob's introduction points avoid problems associated with serving
1237 unpopular files directly (for example, if Bob serves
1238 material that the introduction point's community finds objectionable,
1239 or if Bob's service tends to get attacked by network vandals).
1240 The extra level of indirection also allows Bob to respond to some requests
1241 and ignore others.
1243 We give an overview of the steps of a rendezvous. These are
1244 performed on behalf of Alice and Bob by their local OPs;
1245 application integration is described more fully below.
1247 \begin{tightlist}
1248 \item Bob chooses some introduction points, and advertises them on
1249 the DHT. He can add more later.
1250 \item Bob builds a circuit to each of his introduction points,
1251 and waits for requests.
1252 \item Alice learns about Bob's service out of band (perhaps Bob told her,
1253 or she found it on a website). She retrieves the details of Bob's
1254 service from the DHT.
1255 \item Alice chooses an OR to be the rendezvous point (RP) for this
1256 transaction. She builds a circuit to RP, and gives it a
1257 rendezvous cookie that it will use to recognize Bob.
1258 \item Alice opens an anonymous stream to one of Bob's introduction
1259 points, and gives it a message (encrypted to Bob's public key)
1260 which tells him
1261 about herself, her chosen RP and the rendezvous cookie, and the
1262 first half of a DH
1263 handshake. The introduction point sends the message to Bob.
1264 \item If Bob wants to talk to Alice, he builds a circuit to Alice's
1265 RP and sends the rendezvous cookie, the second half of the DH
1266 handshake, and a hash of the session
1267 key they now share. By the same argument as in
1268 Section~\ref{subsubsec:constructing-a-circuit}, Alice knows she
1269 shares the key only with Bob.
\item The RP connects Alice's circuit to Bob's (see the sketch after
this list). Note that RP can't recognize Alice, Bob, or the data they
transmit.
1272 \item Alice now sends a \emph{relay begin} cell along the circuit. It
1273 arrives at Bob's onion proxy. Bob's onion proxy connects to Bob's
1274 webserver.
1275 \item An anonymous stream has been established, and Alice and Bob
1276 communicate as normal.
1277 \end{tightlist}
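The rendezvous point's splicing step admits a simple sketch; the
bookkeeping below illustrates the idea and is not Tor's actual data
structure. The rendezvous point pairs the two circuits that present the
same cookie, and never learns who is on either side, since the payload
remains encrypted under Alice and Bob's shared session key:

\begin{verbatim}
# Illustrative cookie matching at the rendezvous point.
class Circuit:
    def __init__(self) -> None:
        self.peer = None  # circuit whose relay cells we copy to

waiting: dict[bytes, Circuit] = {}

def on_rendezvous_cell(cookie: bytes, circuit: Circuit) -> None:
    partner = waiting.pop(cookie, None)
    if partner is None:
        waiting[cookie] = circuit   # first side (Alice) waits
    else:
        # second side (Bob) presents the same cookie: splice circuits
        circuit.peer, partner.peer = partner, circuit
\end{verbatim}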
1279 When establishing an introduction point, Bob provides the onion router
1280 with a public ``introduction'' key. The hash of this public key
1281 identifies a unique service, and (since Bob is required to sign his
1282 messages) prevents anybody else from usurping Bob's introduction point
1283 in the future. Bob uses the same public key when establishing the other
1284 introduction points for that service. Bob periodically refreshes his
1285 entry in the DHT.
1287 The message that Alice gives
1288 the introduction point includes a hash of Bob's public key to identify
1289 the service, along with an optional initial authorization token (the
1290 introduction point can do prescreening, for example to block replays). Her
1291 message to Bob may include an end-to-end authorization token so Bob
1292 can choose whether to respond.
1293 The authorization tokens can be used to provide selective access:
1294 important users get tokens to ensure uninterrupted access to the
1295 service. During normal situations, Bob's service might simply be offered
1296 directly from mirrors, while Bob gives out tokens to high-priority users. If
1297 the mirrors are knocked down,
1298 %by distributed DoS attacks or even
1299 %physical attack,
1300 those users can switch to accessing Bob's service via
1301 the Tor rendezvous system.
Since Bob's introduction points might themselves be subject to DoS, he
might have to choose between keeping many introduction connections open
and risking such an attack. In this case,
1306 he can provide selected users
1307 with a current list and/or future schedule of introduction points that
1308 are not advertised in the DHT\@. This is most likely to be practical
1309 if there is a relatively stable and large group of introduction points
1310 available. Alternatively, Bob could give secret public keys
1311 to selected users for consulting the DHT\@. All of these approaches
1312 have the advantage of limiting exposure even when
1313 some of the selected high-priority users collude in the DoS\@.
1315 \SubSection{Integration with user applications}
1317 Bob configures his onion proxy to know the local IP address and port of his
1318 service, a strategy for authorizing clients, and a public key. Bob
1319 publishes the public key, an expiration time (``not valid after''), and
1320 the current introduction points for his service into the DHT, indexed
1321 by the hash of the public key. Bob's webserver is unmodified,
1322 and doesn't even know that it's hidden behind the Tor network.
1324 Alice's applications also work unchanged---her client interface
1325 remains a SOCKS proxy. We encode all of the necessary information
1326 into the fully qualified domain name Alice uses when establishing her
1327 connection. Location-hidden services use a virtual top level domain
1328 called {\tt .onion}: thus hostnames take the form {\tt x.y.onion} where
1329 {\tt x} is the authorization cookie, and {\tt y} encodes the hash of
1330 the public key. Alice's onion proxy
1331 examines addresses; if they're destined for a hidden server, it decodes
1332 the key and starts the rendezvous as described above.
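A sketch of that address handling follows; since the exact encodings of
the cookie and key hash are left unspecified above, the sketch only
splits the labels:

\begin{verbatim}
# Sketch: recognize and split a hidden-service hostname x.y.onion.
def parse_onion_address(hostname: str) -> tuple[str, str]:
    labels = hostname.lower().split(".")
    if len(labels) != 3 or labels[2] != "onion":
        raise ValueError("not a hidden-service address")
    cookie, key_hash = labels[0], labels[1]  # encodings unspecified
    return cookie, key_hash

assert parse_onion_address("c00kie.keyhash.onion") == ("c00kie", "keyhash")
\end{verbatim}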
\SubSection{Previous rendezvous work}
1336 Rendezvous points in low-latency anonymity systems were first
1337 described for use in ISDN telephony \cite{isdn-mixes,jerichow-jsac98}.
1338 Later low-latency designs used rendezvous points for hiding location
1339 of mobile phones and low-power location trackers
1340 \cite{federrath-ih96,reed-protocols97}. Rendezvous for low-latency
1341 Internet connections was suggested in early Onion Routing work
1342 \cite{or-ih96}; however, the first published design of rendezvous
1343 points for low-latency Internet connections was by Ian Goldberg
1344 \cite{ian-thesis}. His design differs from
1345 ours in three ways. First, Goldberg suggests that Alice should manually
1346 hunt down a current location of the service via Gnutella; our approach
1347 makes lookup transparent to the user, as well as faster and more robust.
1348 Second, in Tor the client and server negotiate session keys
1349 via Diffie-Hellman, so plaintext is not exposed even at the rendezvous point. Third,
1350 our design tries to minimize the exposure associated with running the
1351 service, to encourage volunteers to offer introduction and rendezvous
1352 point services. Tor's introduction points do not output any bytes to the
1353 clients, and the rendezvous points don't know the client or the server,
1354 and can't read the data being transmitted. The indirection scheme is
1355 also designed to include authentication/authorization---if Alice doesn't
1356 include the right cookie with her request for service, Bob need not even
1357 acknowledge his existence.
1359 \Section{Attacks and Defenses}
1360 \label{sec:attacks}
1362 Below we summarize a variety of attacks, and discuss how well our
1363 design withstands them.\\
1365 \noindent{\large\bf Passive attacks}\\
1366 \emph{Observing user traffic patterns.} Observing a user's connection
1367 will not reveal her destination or data, but it will
1368 reveal traffic patterns (both sent and received). Profiling via user
1369 connection patterns requires further processing, because multiple
1370 application streams may be operating simultaneously or in series over
1371 a single circuit.
1373 \emph{Observing user content.} While content at the user end is encrypted,
1374 connections to responders may not be (indeed, the responding website
1375 itself may be hostile). While filtering content is not a primary goal
1376 of Onion Routing, Tor can directly use Privoxy and related
1377 filtering services to anonymize application data streams.
1379 \emph{Option distinguishability.} We allow clients to choose local
1380 configuration options. For example, clients concerned about request
1381 linkability should rotate circuits more often than those concerned
1382 about traceability. There is economic incentive to attract users by
1383 allowing this choice; but at the same time, a set of clients who are
1384 in the minority may lose more anonymity by appearing distinct than they
1385 gain by optimizing their behavior \cite{econymics}.
1387 \emph{End-to-end timing correlation.} Tor only minimally hides
1388 such correlations. An attacker watching patterns of
1389 traffic at the initiator and the responder will be
1390 able to confirm the correspondence with high probability. The
1391 greatest protection currently available against such confirmation is to hide
1392 the connection between the onion proxy and the first Tor node,
1393 by running the OP on the Tor node or behind a firewall. This approach
1394 requires an observer to separate traffic originating at the onion
1395 router from traffic passing through it: a global observer can do this,
1396 but it might be beyond a limited observer's capabilities.
1398 \emph{End-to-end size correlation.} Simple packet counting
1399 will also be effective in confirming
1400 endpoints of a stream. However, even without padding, we have some
1401 limited protection: the leaky pipe topology means different numbers
1402 of packets may enter one end of a circuit than exit at the other.
1404 \emph{Website fingerprinting.} All the effective passive
1405 attacks above are traffic confirmation attacks,
1406 which puts them outside our design goals. There is also
1407 a passive traffic analysis attack that is potentially effective.
1408 Rather than searching exit connections for timing and volume
1409 correlations, the adversary may build up a database of
1410 ``fingerprints'' containing file sizes and access patterns for
1411 targeted websites. He can later confirm a user's connection to a given
1412 site simply by consulting the database. This attack has
1413 been shown to be effective against SafeWeb \cite{hintz-pet02}.
1414 It may be less effective against Tor, since
1415 streams are multiplexed within the same circuit, and
1416 fingerprinting will be limited to
1417 the granularity of cells (currently 256 bytes). Additional
1418 defenses could include
1419 larger cell sizes, padding schemes to group websites
1420 into large sets, and link
1421 padding or long-range dummies.\footnote{Note that this fingerprinting
1422 attack should not be confused with the much more complicated latency
1423 attacks of \cite{back01}, which require a fingerprint of the latencies
1424 of all circuits through the network, combined with those from the
1425 network edges to the target user and the responder website.}\\
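A minimal sketch of the fingerprinting idea; the database contents and
matching rule here are invented for illustration:

\begin{verbatim}
# Hypothetical fingerprint database: per site, the multiset of
# per-object transfer sizes (in cells) an observer has recorded.
from collections import Counter

fingerprints = {
    "site-a": Counter({1: 12, 3: 2}),  # twelve 1-cell objects, etc.
    "site-b": Counter({1: 40, 5: 1}),
}

def best_match(observed: Counter) -> str:
    # Guess the site whose recorded multiset overlaps most with
    # the observed pattern.
    return max(fingerprints,
               key=lambda s: sum((fingerprints[s] & observed).values()))
\end{verbatim}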
1427 \noindent{\large\bf Active attacks}\\
1428 \emph{Compromise keys.} An attacker who learns the TLS session key can
1429 see control cells and encrypted relay cells on every circuit on that
1430 connection; learning a circuit
1431 session key lets him unwrap one layer of the encryption. An attacker
1432 who learns an OR's TLS private key can impersonate that OR for the TLS
1433 key's lifetime, but he must
1434 also learn the onion key to decrypt \emph{create} cells (and because of
1435 perfect forward secrecy, he cannot hijack already established circuits
1436 without also compromising their session keys). Periodic key rotation
1437 limits the window of opportunity for these attacks. On the other hand,
1438 an attacker who learns a node's identity key can replace that node
1439 indefinitely by sending new forged descriptors to the directory servers.
1441 \emph{Iterated compromise.} A roving adversary who can
1442 compromise ORs (by system intrusion, legal coercion, or extralegal
1443 coercion) could march down the circuit compromising the
1444 nodes until he reaches the end. Unless the adversary can complete
1445 this attack within the lifetime of the circuit, however, the ORs
1446 will have discarded the necessary information before the attack can
1447 be completed. (Thanks to the perfect forward secrecy of session
1448 keys, the attacker cannot force nodes to decrypt recorded
1449 traffic once the circuits have been closed.) Additionally, building
1450 circuits that cross jurisdictions can make legal coercion
1451 harder---this phenomenon is commonly called ``jurisdictional
1452 arbitrage.'' The Java Anon Proxy project recently experienced the
1453 need for this approach, when
1454 a German court forced them to add a backdoor to
1455 all of their nodes \cite{jap-backdoor}.
1457 \emph{Run a recipient.} An adversary running a webserver
1458 trivially learns the timing patterns of users connecting to it, and
1459 can introduce arbitrary patterns in its responses.
End-to-end attacks become easier: if the adversary can induce
users to connect to his webserver (perhaps by advertising
content targeted to those users), he now holds one end of their
1463 connection. There is also a danger that application
1464 protocols and associated programs can be induced to reveal information
1465 about the initiator. Tor depends on Privoxy and similar protocol cleaners
1466 to solve this latter problem.
1468 \emph{Run an onion proxy.} It is expected that end users will
1469 nearly always run their own local onion proxy. However, in some
1470 settings, it may be necessary for the proxy to run
1471 remotely---typically, in institutions that want
1472 to monitor the activity of those connecting to the proxy.
1473 Compromising an onion proxy compromises all future connections
1474 through it.
1476 \emph{DoS non-observed nodes.} An observer who can only watch some
of the Tor network can increase the value of the traffic he observes
by attacking non-observed nodes to shut them down, reduce
1479 their reliability, or persuade users that they are not trustworthy.
1480 The best defense here is robustness.
1482 \emph{Run a hostile OR.} In addition to being a local observer,
1483 an isolated hostile node can create circuits through itself, or alter
1484 traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
1485 node must be immediately adjacent to both endpoints to compromise the
1486 anonymity of a circuit. If an adversary can
1487 run multiple ORs, and can persuade the directory servers
1488 that those ORs are trustworthy and independent, then occasionally
1489 some user will choose one of those ORs for the start and another
1490 as the end of a circuit. If an adversary
controls $m>1$ out of $N$ nodes, he should be able to correlate at most
$\left(\frac{m}{N}\right)^2$ of the traffic in this way (for example,
controlling 10 of 1,000 nodes lets him confirm only about $0.01\%$ of
circuits)---although an adversary
1494 could possibly attract a disproportionately large amount of traffic
1495 by running an OR with an unusually permissive exit policy, or by
1496 degrading the reliability of other routers.
1498 \emph{Introduce timing into messages.} This is simply a stronger
version of the passive timing attacks discussed above.
1501 \emph{Tagging attacks.} A hostile node could ``tag'' a
1502 cell by altering it. If the
1503 stream were, for example, an unencrypted request to a Web site,
1504 the garbled content coming out at the appropriate time would confirm
1505 the association. However, integrity checks on cells prevent
1506 this attack.
1508 \emph{Replace contents of unauthenticated protocols.} When
1509 relaying an unauthenticated protocol like HTTP, a hostile exit node
1510 can impersonate the target server. Clients
1511 should prefer protocols with end-to-end authentication.
1513 \emph{Replay attacks.} Some anonymity protocols are vulnerable
1514 to replay attacks. Tor is not; replaying one side of a handshake
1515 will result in a different negotiated session key, and so the rest
1516 of the recorded session can't be used.
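To see why, consider the Diffie-Hellman structure of the handshake (a
simplified view that omits key derivation details). The responder
contributes fresh randomness each time, so a replayed first message
yields an unrelated key:
\begin{align*}
K  &= g^{xy}  && \text{original session, responder secret $y$}\\
K' &= g^{xy'} && \text{replayed $g^x$, fresh responder secret $y'$}
\end{align*}
Since $y' \neq y$ except with negligible probability, $K' \neq K$, and
cells recorded under $K$ cannot be reused in the replayed session.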
1518 \emph{Smear attacks.} An attacker could use the Tor network for
1519 socially disapproved acts, to bring the
1520 network into disrepute and get its operators to shut it down.
1521 Exit policies reduce the possibilities for abuse, but
1522 ultimately the network will require volunteers who can tolerate
1523 some political heat.
1525 \emph{Distribute hostile code.} An attacker could trick users
1526 into running subverted Tor software that did not, in fact, anonymize
1527 their connections---or worse, could trick ORs into running weakened
1528 software that provided users with less anonymity. We address this
1529 problem (but do not solve it completely) by signing all Tor releases
1530 with an official public key, and including an entry in the directory
1531 that lists which versions are currently believed to be secure. To
1532 prevent an attacker from subverting the official release itself
1533 (through threats, bribery, or insider attacks), we provide all
1534 releases in source code form, encourage source audits, and
1535 frequently warn our users never to trust any software (even from
1536 us) that comes without source.\\
1538 \noindent{\large\bf Directory attacks}\\
1539 \emph{Destroy directory servers.} If a few directory
1540 servers disappear, the others still decide on a valid
1541 directory. So long as any directory servers remain in operation,
1542 they will still broadcast their views of the network and generate a
1543 consensus directory. (If more than half are destroyed, this
1544 directory will not, however, have enough signatures for clients to
1545 use it automatically; human intervention will be necessary for
1546 clients to decide whether to trust the resulting directory.)
1548 \emph{Subvert a directory server.} By taking over a directory server,
1549 an attacker can partially influence the final directory. Since ORs
1550 are included or excluded by majority vote, the corrupt directory can
1551 at worst cast a tie-breaking vote to decide whether to include
1552 marginal ORs. It remains to be seen how often such marginal cases
1553 occur in practice.
1555 \emph{Subvert a majority of directory servers.} An adversary who controls
1556 more than half the directory servers can include as many compromised
1557 ORs in the final directory as he wishes. We must ensure that directory
1558 server operators are independent and attack-resistant.
1560 \emph{Encourage directory server dissent.} The directory
1561 agreement protocol assumes that directory server operators agree on
1562 the set of directory servers. An adversary who can persuade some
1563 of the directory server operators to distrust one another could
1564 split the quorum into mutually hostile camps, thus partitioning
1565 users based on which directory they use. Tor does not address
1566 this attack.
1568 \emph{Trick the directory servers into listing a hostile OR.}
1569 Our threat model explicitly assumes directory server operators will
1570 be able to filter out most hostile ORs.
1571 % If this is not true, an
1572 % attacker can flood the directory with compromised servers.
1574 \emph{Convince the directories that a malfunctioning OR is
1575 working.} In the current Tor implementation, directory servers
1576 assume that an OR is running correctly if they can start a TLS
1577 connection to it. A hostile OR could easily subvert this test by
1578 accepting TLS connections from ORs but ignoring all cells. Directory
1579 servers must actively test ORs by building circuits and streams as
1580 appropriate. The tradeoffs of a similar approach are discussed in
1581 \cite{mix-acc}.\\
1583 \noindent{\large\bf Attacks against rendezvous points}\\
1584 \emph{Make many introduction requests.} An attacker could
1585 try to deny Bob service by flooding his introduction points with
1586 requests. Because the introduction points can block requests that
1587 lack authorization tokens, however, Bob can restrict the volume of
1588 requests he receives, or require a certain amount of computation for
1589 every request he receives.
1591 \emph{Attack an introduction point.} An attacker could
1592 disrupt a location-hidden service by disabling its introduction
1593 points. But because a service's identity is attached to its public
1594 key, not its introduction point, the service can simply re-advertise
1595 itself at a different introduction point. Advertisements can also be
1596 done secretly so that only high-priority clients know the address of
1597 Bob's introduction points, forcing the attacker to disable all possible
1598 introduction points.
1600 \emph{Compromise an introduction point.} An attacker who controls
1601 Bob's introduction point can flood Bob with
1602 introduction requests, or prevent valid introduction requests from
1603 reaching him. Bob can notice a flood, and close the circuit. To notice
1604 blocking of valid requests, however, he should periodically test the
1605 introduction point by sending rendezvous requests and making
1606 sure he receives them.
1608 \emph{Compromise a rendezvous point.} A rendezvous
1609 point is no more sensitive than any other OR on
1610 a circuit, since all data passing through the rendezvous is encrypted
1611 with a session key shared by Alice and Bob.
1613 \Section{Open Questions in Low-latency Anonymity}
1614 \label{sec:maintaining-anonymity}
1616 In addition to the non-goals in
1617 Section~\ref{subsec:non-goals}, many other questions must be solved
1618 before we can be confident of Tor's security.
1620 Many of these open issues are questions of balance. For example,
1621 how often should users rotate to fresh circuits? Frequent rotation
1622 is inefficient, expensive, and may lead to intersection attacks and
1623 predecessor attacks \cite{wright03}, but infrequent rotation makes the
1624 user's traffic linkable. Besides opening fresh circuits, clients can
1625 also exit from the middle of the circuit,
1626 or truncate and re-extend the circuit. More analysis is
1627 needed to determine the proper tradeoff.
1629 %% Duplicated by 'Better directory distribution' in section 9.
1631 %A similar question surrounds timing of directory operations: how often
1632 %should directories be updated? Clients that update infrequently receive
1633 %an inaccurate picture of the network, but frequent updates can overload
1634 %the directory servers. More generally, we must find more
1635 %decentralized yet practical ways to distribute up-to-date snapshots of
1636 %network status without introducing new attacks.
1638 How should we choose path lengths? If Alice only ever uses two hops,
1639 then both ORs can be certain that by colluding they will learn about
1640 Alice and Bob. In our current approach, Alice always chooses at least
1641 three nodes unrelated to herself and her destination.
1642 %% This point is subtle, but not IMO necessary. Anybody who thinks
1643 %% about it will see that it's implied by the above sentence; anybody
1644 %% who doesn't think about it is safe in his ignorance.
1646 %Thus normally she chooses
1647 %three nodes, but if she is running an OR and her destination is on an OR,
1648 %she uses five.
Should Alice choose a nondeterministic path length (say, adding extra
hops drawn from a geometric distribution) to foil an attacker who
1651 uses timing to learn that he is the fifth hop and thus concludes that
1652 both Alice and the responder are on ORs?
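A minimal sketch of such a choice; the minimum of three hops matches the
approach above, while the continuation probability is an arbitrary
assumption:

\begin{verbatim}
# Sketch: nondeterministic path length with a geometric tail.
import random

def choose_path_length(p_extend: float = 0.5, minimum: int = 3) -> int:
    length = minimum
    while random.random() < p_extend:  # extend again w.p. p_extend
        length += 1
    return length
\end{verbatim}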
1654 Throughout this paper, we have assumed that end-to-end traffic
1655 confirmation will immediately and automatically defeat a low-latency
1656 anonymity system. Even high-latency anonymity systems can be
1657 vulnerable to end-to-end traffic confirmation, if the traffic volumes
1658 are high enough, and if users' habits are sufficiently distinct
1659 \cite{limits-open,statistical-disclosure}. Can anything be done to
1660 make low-latency systems resist these attacks as well as high-latency
1661 systems? Tor already makes some effort to conceal the starts and ends of
1662 streams by wrapping long-range control commands in identical-looking
1663 relay cells. Link padding could frustrate passive observers who count
1664 packets; long-range padding could work against observers who own the
1665 first hop in a circuit. But more research remains to find an efficient
1666 and practical approach. Volunteers prefer not to run constant-bandwidth
1667 padding; but no convincing traffic shaping approach has been
1668 specified. Recent work on long-range padding \cite{defensive-dropping}
1669 shows promise. One could also try to reduce correlation in packet timing
1670 by batching and re-ordering packets, but it is unclear whether this could
1671 improve anonymity without introducing so much latency as to render the
1672 network unusable.
1674 A cascade topology may better defend against traffic confirmation by
1675 aggregating users, and making padding and
1676 mixing more affordable. Does the hydra topology (many input nodes,
1677 few output nodes) work better against some adversaries? Are we going
1678 to get a hydra anyway because most nodes will be middleman nodes?
1680 Common wisdom suggests that Alice should run her own OR for best
1681 anonymity, because traffic coming from her node could plausibly have
1682 come from elsewhere. How much mixing does this approach need? Is it
1683 immediately beneficial because of real-world adversaries that can't
1684 observe Alice's router, but can run routers of their own?
1686 To scale to many users, and to prevent an attacker from observing the
1687 whole network at once, it may be necessary
1688 to support far more servers than Tor currently anticipates.
1689 This introduces several issues. First, if approval by a centralized set
1690 of directory servers is no longer feasible, what mechanism should be used
1691 to prevent adversaries from signing up many colluding servers? Second,
1692 if clients can no longer have a complete picture of the network at all
1693 times, how can they perform discovery while preventing attackers from
1694 manipulating or exploiting gaps in their knowledge? Third, if there
1695 are too many servers for every server to constantly communicate with
1696 every other, what kind of non-clique topology should the network use?
1697 (Restricted-route topologies promise comparable anonymity with better
1698 scalability \cite{danezis-pets03}, but whatever topology we choose, we
1699 need some way to keep attackers from manipulating their position within
1700 it \cite{casc-rep}.) Fourth, since no centralized authority is tracking
1701 server reliability, how do we prevent unreliable servers from rendering
1702 the network unusable? Fifth, do clients receive so much anonymity benefit
1703 from running their own servers that we should expect them all to do so
1704 \cite{econymics}, or do we need to find another incentive structure to
1705 motivate them? Tarzan and MorphMix present possible solutions.
1707 % advogato, captcha
1709 When a Tor node goes down, all its circuits (and thus streams) must break.
1710 Will users abandon the system because of this brittleness? How well
1711 does the method in Section~\ref{subsec:dos} allow streams to survive
1712 node failure? If affected users rebuild circuits immediately, how much
1713 anonymity is lost? It seems the problem is even worse in a peer-to-peer
1714 environment---such systems don't yet provide an incentive for peers to
1715 stay connected when they're done retrieving content, so we would expect
1716 a higher churn rate.
1718 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1720 \Section{Future Directions}
1721 \label{sec:conclusion}
1723 Tor brings together many innovations into a unified deployable system. The
1724 next immediate steps include:
1726 \emph{Scalability:} Tor's emphasis on deployability and design simplicity
1727 has led us to adopt a clique topology, semi-centralized
1728 directories, and a full-network-visibility model for client
1729 knowledge. These properties will not scale past a few hundred servers.
1730 Section~\ref{sec:maintaining-anonymity} describes some promising
1731 approaches, but more deployment experience will be helpful in learning
1732 the relative importance of these bottlenecks.
1734 \emph{Bandwidth classes:} This paper assumes that all ORs have
good bandwidth and latency. We should instead adopt the MorphMix model,
1736 where nodes advertise their bandwidth level (DSL, T1, T3), and
1737 Alice avoids bottlenecks by choosing nodes that match or
1738 exceed her bandwidth. In this way DSL users can usefully join the Tor
1739 network.
1741 \emph{Incentives:} Volunteers who run nodes are rewarded with publicity
1742 and possibly better anonymity \cite{econymics}. More nodes means increased
1743 scalability, and more users can mean more anonymity. We need to continue
1744 examining the incentive structures for participating in Tor.
1746 \emph{Cover traffic:} Currently Tor omits cover traffic---its costs
1747 in performance and bandwidth are clear but its security benefits are
1748 not well understood. We must pursue more research on link-level cover
1749 traffic and long-range cover traffic to determine whether some simple padding
1750 method offers provable protection against our chosen adversary.
1752 %%\emph{Offer two relay cell sizes:} Traffic on the Internet tends to be
1753 %%large for bulk transfers and small for interactive traffic. One cell
1754 %%size cannot be optimal for both types of traffic.
1755 % This should go in the spec and todo, but not the paper yet. -RD
1757 \emph{Caching at exit nodes:} Perhaps each exit node should run a
1758 caching web proxy, to improve anonymity for cached pages (Alice's request never
1759 leaves the Tor network), to improve speed, and to reduce bandwidth cost.
1760 On the other hand, forward security is weakened because caches
1761 constitute a record of retrieved files. We must find the right
1762 balance between usability and security.
1764 \emph{Better directory distribution:}
1765 Clients currently download a description of
1766 the entire network every 15 minutes. As the state grows larger
1767 and clients more numerous, we may need a solution in which
1768 clients receive incremental updates to directory state.
1769 More generally, we must find more
1770 scalable yet practical ways to distribute up-to-date snapshots of
1771 network status without introducing new attacks.
1773 \emph{Implement location-hidden services:} The design in
1774 Section~\ref{sec:rendezvous} has not yet been implemented. While doing
1775 so we are likely to encounter additional issues that must be resolved,
1776 both in terms of usability and anonymity.
\emph{Further specification review:} Although we have a public
byte-level specification for the Tor protocols, it needs
1780 extensive external review. We hope that as Tor
1781 is more widely deployed, more people will examine its
1782 specification.
1784 \emph{Multisystem interoperability:} We are currently working with the
1785 designer of MorphMix to unify the specification and implementation of
1786 the common elements of our two systems. So far, this seems
1787 to be relatively straightforward. Interoperability will allow testing
1788 and direct comparison of the two designs for trust and scalability.
1790 \emph{Wider-scale deployment:} The original goal of Tor was to
1791 gain experience in deploying an anonymizing overlay network, and
1792 learn from having actual users. We are now at a point in design
1793 and development where we can start deploying a wider network. Once
1794 we have many actual users, we will doubtlessly be better
1795 able to evaluate some of our design decisions, including our
1796 robustness/latency tradeoffs, our performance tradeoffs (including
1797 cell size), our abuse-prevention mechanisms, and
1798 our overall usability.
1800 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1802 %% commented out for anonymous submission
1803 \section*{Acknowledgments}
1804 Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis,
1805 John Bashinski, Zack Brown:
1806 for editing and comments.
1807 Matej Pfajfar, Andrei Serjantov, Marc Rennhard: for design discussions.
1808 Bram Cohen for congestion control discussions.
1809 Adam Back for suggesting telescoping circuits.
1810 Cathy Meadows for formal analysis of the extend protocol.
This work was supported by ONR and DARPA.
1813 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1815 \bibliographystyle{latex8}
1816 \bibliography{tor-design}
1818 \end{document}
1820 % Style guide:
1821 % U.S. spelling
1822 % avoid contractions (it's, can't, etc.)
1823 % prefer ``for example'' or ``such as'' to e.g.
1824 % prefer ``that is'' to i.e.
1825 % 'mix', 'mixes' (as noun)
1826 % 'mix-net'
1827 % 'mix', 'mixing' (as verb)
1828 % 'middleman' [Not with a hyphen; the hyphen has been optional
1829 % since Middle English.]
1830 % 'nymserver'
1831 % 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
1832 % 'Onion Routing design', 'onion router' [note capitalization]
1833 % 'SOCKS'
1834 % Try not to use \cite as a noun.
1835 % 'Authorizating' sounds great, but it isn't a word.
1836 % 'First, second, third', not 'Firstly, secondly, thirdly'.
1837 % 'circuit', not 'channel'
1838 % Typography: no space on either side of an em dash---ever.
1839 % Hyphens are for multi-part words; en dashs imply movement or
1840 % opposition (The Alice--Bob connection); and em dashes are
1841 % for punctuation---like that.
1842 % A relay cell; a control cell; a \emph{create} cell; a
1843 % \emph{relay truncated} cell. Never ``a \emph{relay truncated}.''
1845 % 'Substitute ``Damn'' every time you're inclined to write ``very;'' your
1846 % editor will delete it and the writing will be just as it should be.'
1847 % -- Mark Twain