doc/spec/proposals/115-two-hop-paths.txt

   1 Filename: 115-two-hop-paths.txt
   2 Title: Two Hop Paths
   3 Version: $Revision$
   4 Last-Modified: $Date$
   5 Author: Mike Perry
   6 Created:
   7 Status: Open
   8 Supersedes: 112
   9
  10
  11 Overview:
  12
  13   The idea is that users should be able to choose if they would like
  14   to have either two or three hop paths through the tor network.
  15
  16   Let us be clear: the users who would choose this option should be
  17   those that are concerned with IP obfuscation only: ie they would not be
  18   targets of a resource-intensive multi-node attack. It is sometimes said
  19   that these users should find some other network to use other than Tor.
  20   This is a foolish suggestion: more users improves security of everyone,
  21   and the current small userbase size is a critical hindrance to
  22   anonymity, as is discussed below and in [1].
  23
  24   This value should be modifiable from the controller, and should be
  25   available from Vidalia.
  26
  27
  28 Motivation:
  29
  30   The Tor network is slow and overloaded. Increasingly often I hear
  31   stories about friends and friends of friends who are behind firewalls,
  32   annoying censorware, or under surveillance that interferes with their
  33   productivity and Internet usage, or chills their speech. These people
  34   know about Tor, but they choose to put up with the censorship because
  35   Tor is too slow to be usable for them. In fact, to download a fresh,
  36   complete copy of levine-timing.pdf for the Theoretical Argument
  37   section of this proposal over Tor took me 3 tries.
  38
  39   Furthermore, the biggest current problem with Tor's anonymity for
  40   those who really need it is not someone attacking the network to
  41   discover who they are. It's instead the extreme danger that so few
  42   people use Tor because it's so slow, that those who do use it have
  43   essentially no confusion set.
  44
  45   The recent case where the professor and the rogue Tor user were the
  46   only Tor users on campus, and thus suspected in an incident involving
  47   Tor and that University underscores this point: "That was why the police
  48   had come to see me. They told me that only two people on our campus were
  49   using Tor: me and someone they suspected of engaging in an online scam.
  50   The detectives wanted to know whether the other user was a former
  51   student of mine, and why I was using Tor"[1].
  52
  53   Not only does Tor provide no anonymity if you use it to be anonymous
  54   but are obviously from a certain institution, location or circumstance,
  55   it is also dangerous to use Tor for risk of being accused of having
  56   something significant enough to hide to be willing to put up with
  57   the horrible performance as opposed to using some weaker alternative.
  58
  59   There are many ways to improve the speed problem, and of course we
  60   should and will implement as many as we can. Johannes's GSoC project
  61   and my reputation system are longer term, higher-effort things that
  62   will still provide benefit independent of this proposal.
  63
  64   However, reducing the path length to 2 for those who do not need the
  65   extra anonymity 3 hops provide not only improves their Tor experience
  66   but also reduces their load on the Tor network by 33%, and should
  67   increase adoption of Tor by a good deal. That's not just Win-Win, it's
  68   Win-Win-Win.
  69
  70
  71 Who will enable this option?
  72
  73   This is the crux of the proposal. Admittedly, there is some anonymity
  74   loss and some degree of decreased investment required on the part of
  75   the adversary to attack 2 hop users versus 3 hop users, even if it is
  76   minimal and limited mostly to up-front costs and false positives.
  77
  78   The key questions are:
  79
  80   1. Are these users in a class such that their risk is significantly
  81      less than the amount of this anonymity loss?
  82
  83   2. Are these users able to identify themselves?
  84
  85   Many many users of Tor are not at risk for an adversary capturing c/n
  86   nodes of the network just to see what they do. These users use Tor to
  87   circumvent aggressive content filters, or simply to keep their IP out of
  88   marketing and search engine databases. Most content filters have no
  89   interest in running Tor nodes to catch violators, and marketers
  90   certainly would never consider such a thing, both on a cost basis and a
  91   legal one.
  92
  93   In a sense, this represents an alternate threat model against these
  94   users who are not at risk for Tor's normal threat model.
  95
  96   It should be evident to these users that they fall into this class. All
  97   that should be needed is a radio button
  98
  99    * "I use Tor for local content filter circumvention and/or IP obfuscation,
 100       not anonymity. Speed is more important to me than high anonymity.
 101       No one will make considerable efforts to determine my real IP."
 102    * "I use Tor for anonymity and/or national-level, legally enforced
 103       censorship. It is possible effort will be taken to identify
 104       me, including but not limited to network surveillance. I need more
 105       protection."
 106
 107   and then some explanation in the help for exactly what this means, and
 108   the risks involved with eliminating the adversary's need for timing
 109   attacks with respect to false positives. Ultimately, the decision is a
 110   simple one that can be made without this information, however. The user
 111   does not need Paul Syverson to instruct them on the deep magic of Onion
 112   Routing to make this decision. They just need to know why they use Tor.
 113   If they use it just to stay out of marketing databases and/or bypass a
 114   local content filter, two hops is plenty. This is likely the vast
 115   majority of Tor users, and many non-users we would like to bring on
 116   board.
 117
 118   So, having established this class of users, let us now go on to
 119   examine theoretical and practical risks we place them at, and determine
 120   if these risks violate the users needs, or introduce additional risk
 121   to node operators who may be subject to requests from law enforcement
 122   to track users who need 3 hops, but use 2 because they enjoy the
 123   thrill of russian roulette.
 124
 125
 126 Theoretical Argument:
 127
 128   It has long been established that timing attacks against mixed
 129   and onion networks are extremely effective, and that regardless
 130   of path length, if the adversary has compromised your first and
 131   last hop of your path, you can assume they have compromised your
 132   identity for that connection.
 133
 134   In fact, it was demonstrated that for all but the slowest, lossiest
 135   networks, error rates for false positives and false negatives were
 136   very near zero[2]. Only for constant streams of traffic over slow and
 137   (more importantly) extremely lossy network links did the error rate
 138   hit 20%. For loss rates typical to the Internet, even the error rate
 139   for slow nodes with constant traffic streams was 13%.
 140
 141   When you take into account that most Tor streams are not constant,
 142   but probably much more like their "HomeIP" dataset, which consists
 143   mostly of web traffic that exists over finite intervals at specific
 144   times, error rates drop to fractions of 1%, even for the "worst"
 145   network nodes.
 146
 147   Therefore, the user has little benefit from the extra hop, assuming
 148   the adversary does timing correlation on their nodes. Since timing
 149   correlation is simply an implementation issue and is most likely
 150   a single up-front cost (and one that is like quite a bit cheaper
 151   than the cost of the machines purchased to host the nodes to mount
 152   an attack), the real protection is the low probability of getting
 153   both the first and last hop of a client's stream.
 154
 155
 156 Practical Issues:
 157
 158   Theoretical issues aside, there are several practical issues with the
 159   implementation of Tor that need to be addressed to ensure that
 160   identity information is not leaked by the implementation.
 161
 162   Exit policy issues:
 163
 164   If a client chooses an exit with a very restrictive exit policy
 165   (such as an IP or IP range), the first hop then knows a good deal
 166   about the destination. For this reason, clients should not select
 167   exits that match their destination IP with anything other than "*".
 168
 169   Partitioning:
 170
 171   Partitioning attacks form another concern. Since Tor uses telescoping
 172   to build circuits, it is possible to tell a user is constructing only
 173   two hop paths at the entry node and on the local network. An external
 174   adversary can potentially differentiate 2 and 3 hop users, and decide
 175   that all IP addresses connecting to Tor and using 3 hops have something
 176   to hide, and should be scrutinized more closely or outright apprehended.
 177
 178   One solution to this is to use the "leaky-circuit" method of attaching
 179   streams: The user always creates 3-hop circuits, but if the option
 180   is enabled, they always exit from their 2nd hop. The ideal solution
 181   would be to create a RELAY_SHISHKABOB cell which contains onion
 182   skins for every host along the path, but this requires protocol
 183   changes at the nodes to support.
 184
 185   Guard nodes:
 186
 187   Since guard nodes can rotate due to client relocation, network
 188   failure, node upgrades and other issues, if you amortize the risk a
 189   mobile, dialup, or otherwise intermittently connected user is exposed to
 190   over any reasonable duration of Tor usage (on the order of a year), it
 191   is the same with or without guard nodes. Assuming an adversary has
 192   c%/n% of network bandwidth, and guards rotate on average with period R,
 193   statistically speaking, it's merely a question of if the user wishes
 194   their risk to be concentrated with probability c/n over an expected
 195   period of R*c, and probability 0 over an expected period of R*(n-c),
 196   versus a continuous risk of (c/n)^2. So statistically speaking, guards
 197   only create a time-tradeoff of risk over the long run for normal Tor
 198   usage. Rotating guards do not reduce risk for normal client usage long
 199   term.[3]
 200
 201   On other other hand, assuming a more stable method of guard selection
 202   and preservation is devised, or a more stable client side network than
 203   my own is typical (which rotates guards frequently due to network issues
 204   and moving about), guard nodes provide a tradeoff in the form of c/n% of
 205   the users being "sacrificial users" who are exposed to high risk O(c/n)
 206   of identification, while the rest of the network is exposed to zero
 207   risk.
 208
 209   The nature of Tor makes it likely an adversary will take a "shock and
 210   awe" approach to suppressing Tor by rounding up a few users whose
 211   browsing activity has been observed to be made into examples, in an
 212   attempt to prove that Tor is not perfect.
 213
 214   Since this "shock and awe" attack can be applied with or without guard
 215   nodes, stable guard nodes do offer a measure of accountability of sorts.
 216   If a user was using a small set of guard nodes and knows them well, and
 217   then is suddenly apprehended as a result of Tor usage, having a fixed
 218   set of entry points to suspect is a lot better than suspecting the whole
 219   network. Conversely, it can also give non-apprehended users comfort
 220   that they are likely to remain safe indefinitely with their set of (now
 221   presumably trusted) guards. This is probably the most beneficial
 222   property of reliable guards: they deter the adversary from mounting
 223   "shock and awe" attacks because the surviving users will not
 224   intimidated, but instead made more confident. Of course, guards need to
 225   be made much more stable and users need to be encouraged to know their
 226   guards for this property to really take effect.
 227
 228   This beneficial property of client vigilance also carries over to an
 229   active adversary, except in this case instead of relying on the user
 230   to remember their guard nodes and somehow communicate them after
 231   apprehension, the code can alert them to the presence of an active
 232   adversary before they are apprehended. But only if they use guard nodes.
 233
 234   So lets consider the active adversary: Two hop paths allow malicious
 235   guards to get considerably more benefit from failing circuits if they do
 236   not extend to their colluding peers for the exit hop. Since guards can
 237   detect the number of hops in a path via either timing or by statistical
 238   analysis of the exit policy of the 2nd hop, they can perform this attack
 239   predominantly against 2 hop users.
 240
 241   This can be addressed by completely abandoning an entry guard after a
 242   certain ratio of extend or general circuit failures with respect to
 243   non-failed circuits. The proper value for this ratio can be determined
 244   experimentally with TorFlow. There is the possibility that the local
 245   network can abuse this feature to cause certain guards to be dropped,
 246   but they can do that anyways with the current Tor by just making guards
 247   they don't like unreachable. With this mechanism, Tor will complain
 248   loudly if any guard failure rate exceeds the expected in any failure
 249   case, local or remote.
 250
 251   Eliminating guards entirely would actually not address this issue due
 252   to the time-tradeoff nature of risk. In fact, it would just make it
 253   worse. Without guard nodes, it becomes much more difficult for clients
 254   to become alerted to Tor entry points that are failing circuits to make
 255   sure that they only devote bandwidth to carry traffic for streams which
 256   they observe both ends. Yet the rogue entry points are still able to
 257   significantly increase their success rates by failing circuits.
 258
 259   For this reason, guard nodes should remain enabled for 2 hop users,
 260   at least until an IP-independent, undetectable guard scanner can
 261   be created. TorFlow can scan for failing guards, but after a while,
 262   its unique behavior gives away the fact that its IP is a scanner and
 263   it can be given selective service.
 264
 265   Consideration of risks for node operators:
 266
 267   There is a serious risk for two hop users in the form of guard
 268   profiling. If an adversary running an exit node notices that a
 269   particular site is always visited from a fixed previous hop, it is
 270   likely that this is a two hop user using a certain guard, which could be
 271   monitored to determine their identity. Thus, for the protection of both
 272   2 hop users and node operators, 2 hop users should limit their guard
 273   duration to a sufficient number of days to verify reliability of a node,
 274   but not much more. This duration can be determined experimentally by
 275   TorFlow.
 276
 277   Considering a Tor client builds on average 144 circuits/day (10
 278   minutes per circuit), if the adversary owns c/n% of exits on the
 279   network, they can expect to see 144*c/n circuits from this user, or
 280   about 14 minutes of usage per day per percentage of network penetration.
 281   Since it will take several occurrences of user-linkable exit content
 282   from the same predecessor hop for the adversary to have any confidence
 283   this is a 2 hop user, it is very unlikely that any sort of demands made
 284   upon the predecessor node would guaranteed to be effective (ie it
 285   actually was a guard), let alone be executed in time to apprehend the
 286   user before they rotated guards.
 287
 288   The reverse risk also warrants consideration. If a malicious guard has
 289   orders to surveil Mike Perry, it can determine Mike Perry is using two
 290   hops by observing his tendency to choose a 2nd hop with a viable exit
 291   policy. This can be done relatively quickly, unfortunately, and
 292   indicates Mike Perry should spend some of his time building real 3 hop
 293   circuits through the same guards, to require them to at least wait for
 294   him to actually use Tor to determine his style of operation, rather than
 295   collect this information from his passive building patterns.
 296
 297   However, to actively determine where Mike Perry is going, the guard
 298   will need to require logging ahead of time at multiple exit nodes that
 299   he may use over the course of the few days while he is at that guard,
 300   and correlate the usage times of the exit node with Mike Perry's
 301   activity at that guard for the few days he uses it. At this point, the
 302   adversary is mounting a scale and method of attack (widespread logging,
 303   timing attacks) that works pretty much just as effectively against 3
 304   hops, so exit node operators are exposed to no additional danger than
 305   they otherwise normally are.
 306
 307
 308 Why not fix Pathlen=2?:
 309
 310   The main reason I am not advocating that we always use 2 hops is that
 311   in some situations, timing correlation evidence by itself may not be
 312   considered as solid and convincing as an actual, uninterrupted, fully
 313   traced path. Are these timing attacks as effective on a real network as
 314   they are in simulation? Maybe the circuit multiplexing of Tor can serve
 315   to frustrate them to a degree? Would an extralegal adversary or
 316   authoritarian government even care? In the face of these situation
 317   dependent unknowns, it should be up to the user to decide if this is
 318   a concern for them or not.
 319
 320   It should probably also be noted that even a false positive
 321   rate of 1% for a 200k concurrent-user network could mean that for a
 322   given node, a given stream could be confused with something like 10
 323   users, assuming ~200 nodes carry most of the traffic (ie 1000 users
 324   each). Though of course to really know for sure, someone needs to do
 325   an attack on a real network, unfortunately.
 326
 327   Additionally, at some point cover traffic schemes may be implemented to
 328   frustrate timing attacks on the first hop. It is possible some expert
 329   users may do this ad-hoc already, and may wish to continue using 3 hops
 330   for this reason.
 331
 332
 333 Implementation:
 334
 335   new_route_len() can be modified directly with a check of the
 336   Pathlen option. However, circuit construction logic should be
 337   altered so that both 2 hop and 3 hop users build the same types of
 338   circuits, and the option should ultimately govern circuit selection,
 339   not construction. This improves coverage against guard nodes being
 340   able to passively profile users who aren't even using Tor.
 341   PathlenCoinWeight, anyone? :)
 342
 343   The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
 344   needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
 345   ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
 346   circuit_is_acceptable.
 347
 348   The leaky exit is trickier still.. handle_control_attachstream
 349   does allow paths to exit at a given hop. Presumably something similar
 350   can be done in connection_ap_handshake_process_socks, and elsewhere?
 351   Circuit construction would also have to be performed such that the
 352   2nd hop's exit policy is what is considered, not the 3rd's.
 353
 354   The entry_guard_t structure could have num_circ_failed and
 355   num_circ_succeeded members such that if it exceeds F% circuit
 356   extend failure rate to a second hop, it is removed from the entry list.
 357
 358   F should be sufficiently high to avoid churn from normal Tor circuit
 359   failure as determined by TorFlow scans.
 360
 361   The Vidalia option should be presented as a radio button.
 362
 363
 364 Migration:
 365
 366   Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
 367   circuit ability, and 2-3 hop circuit selection logic governed by
 368   Pathlen.
 369
 370   Phase 2: Experiment to determine the proper ratio of circuit
 371   failures used to expire garbage or malicious guards via TorFlow
 372   (pending Bug #440 backport+adoption).
 373
 374   Phase 3: Implement guard expiration code to kick off failure-prone
 375   guards and warn the user. Cap 2 hop guard duration to a proper number
 376   of days determined sufficient to establish guard reliability (to be
 377   determined by TorFlow).
 378
 379   Phase 4: Make radiobutton in Vidalia, along with help entry
 380   that explains in layman's terms the risks involved.
 381
 382   Phase 5: Allow user to specify path length by HTTP URL suffix.
 383
 384
 385 [1] http://p2pnet.net/story/11279
 386 [2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
 387 [3] Proof available upon request ;)