\documentstyle[12pt,twoside]{article}
\def\TITLE{Tunnels over IP}
\begin{center}
\Large\bf Tunnels over IP in Linux-2.2
\end{center}

\begin{center}
{ \large Alexey~N.~Kuznetsov } \\
\em Institute for Nuclear Research, Moscow \\
\verb|kuznet@ms2.inr.ac.ru|
\end{center}
\section{Instead of introduction: micro-FAQ.}

Q: In linux-2.0.36 I used:
\begin{verbatim}
    ifconfig tunl1 10.0.0.1 pointopoint 193.233.7.65
\end{verbatim}
to create a tunnel. It does not work in 2.2.0!

A: You are right, it does not work. The command above is now split into two commands.
\begin{verbatim}
    ip tunnel add MY-TUNNEL mode ipip remote 193.233.7.65
\end{verbatim}
will create a tunnel device named \verb|MY-TUNNEL|. Now you may configure it with:
\begin{verbatim}
    ifconfig MY-TUNNEL 10.0.0.1
\end{verbatim}
Certainly, if you prefer the name \verb|tunl1| to \verb|MY-TUNNEL|,
you may use it instead.

Q: In linux-2.0.36 I used:
\begin{verbatim}
    ifconfig tunl0 10.0.0.1
    route add -net 10.0.0.0 gw 193.233.7.65 dev tunl0
\end{verbatim}
to tunnel net 10.0.0.0 via router 193.233.7.65. It does not
work in 2.2.0! Moreover, \verb|route| prints a funny error like
``network unreachable'', and afterwards I found a strange direct route
to 10.0.0.0 via \verb|tunl0| in the routing table.

A: Yes, in 2.2 the rule that a {\em normal} gateway must reside on a directly
connected network has no exceptions. You may tell the kernel that
this particular route is {\em abnormal}:
\begin{verbatim}
    ifconfig tunl0 10.0.0.1 netmask 255.255.255.255
    ip route add 10.0.0.0/8 via 193.233.7.65 dev tunl0 onlink
\end{verbatim}
Note the keyword \verb|onlink|: it is the magic key that orders the kernel
not to check the gateway address for consistency.
Probably, after this explanation you have already guessed another method:
\begin{verbatim}
    ifconfig tunl0 10.0.0.1 netmask 255.255.255.255
    route add -host 193.233.7.65 dev tunl0
    route add -net 10.0.0.0 netmask 255.0.0.0 gw 193.233.7.65
    route del -host 193.233.7.65 dev tunl0
\end{verbatim}
Well, if you like such tricks, nobody can prohibit you from using them.
Just remember that between \verb|route add| and \verb|route del| the host 193.233.7.65 is
routed via \verb|tunl0| and is therefore unreachable.

Q: In 2.0.36 I used to load the \verb|tunnel| device module and the \verb|ipip| module.
I cannot find any \verb|tunnel| in 2.2!

A: Linux-2.2 has a single module \verb|ipip| for both directions of tunneling
and for all IPIP tunnel devices.

Q: \verb|traceroute| does not work over a tunnel! Well, stop... It works,
only it skips some number of hops.

A: Yes. By default the tunnel driver copies the \verb|ttl| value from
the inner packet to the outer one. It means that the path traversed by tunneled
packets to the other endpoint is not hidden. If you dislike this, or if you
are going to use some routing protocol expecting that packets
with ttl 1 will reach the peering host (e.g.\ RIP, OSPF or EBGP),
and you are not afraid of
tunnel loops, you may append the option \verb|ttl 64| when creating the tunnel
with \verb|ip tunnel add|.

Q: ... Well, here the list of things that 2.0 was able to do ends.

\paragraph{Summary of differences between 2.2 and 2.0.}

\begin{itemize}

\item {\bf In 2.0} you could compile the tunnel device into the kernel
and get a set of 4 devices \verb|tunl0| ... \verb|tunl3| or,
alternatively, compile it as a module and load a new module
for each new tunnel. Also, the module \verb|ipip| was necessary
to receive tunneled packets.

{\bf 2.2} has {\em one\/} module \verb|ipip|. Loading it gives you the base
tunnel device \verb|tunl0|, and other tunnels may be created with the command
\verb|ip tunnel add|. These new devices may have arbitrary names.

\item {\bf In 2.0} you set the remote tunnel endpoint address with
the command \verb|ifconfig| ... \verb|pointopoint A|.

{\bf In 2.2} this command has the same semantics on all
interfaces: it sets not the tunnel endpoint
but the address of the peering host, which is directly reachable
via this tunnel
rather than via the Internet. The actual tunnel endpoint address \verb|A|
should be set with \verb|ip tunnel add ... remote A|.

\item {\bf In 2.0} you create tunnel routes with the command:
\begin{verbatim}
    route add -net 10.0.0.0 gw A dev tunl0
\end{verbatim}

{\bf 2.2} interprets this command in the same way for all device
kinds, and the gateway is required to be directly reachable via this tunnel
rather than via the Internet. You still may use \verb|ip route add ... onlink|
to override this behaviour.

\end{itemize}

\section{Tunnel setup: basics}

The standard Linux-2.2 kernel supports three flavors of tunnels,
listed in the following table:
\begin{center}
\begin{tabular}{l|l|l}
\hline
\vrule depth 0.8ex width 0pt\relax
Mode & Description & Base device \\
\hline
ipip & IP over IP & tunl0 \\
sit & IPv6 over IP & sit0 \\
gre & ANY over GRE over IP & gre0 \\
\hline
\end{tabular}
\end{center}

\noindent All kinds of tunnels are created with one command:
\begin{verbatim}
    ip tunnel add <NAME> mode <MODE> [ local <S> ] [ remote <D> ]
\end{verbatim}

This command creates a new tunnel device with the name \verb|<NAME>|.
\verb|<NAME>| is an arbitrary string; in particular,
it may even be \verb|eth0|. The remaining parameters set
various tunnel characteristics.

\begin{itemize}

\item
\verb|mode <MODE>| sets the tunnel mode. Three modes are currently available:
\verb|ipip|, \verb|sit| and \verb|gre|.

\item
\verb|remote <D>| sets the remote endpoint of the tunnel to IP
address \verb|<D>|.

\item
\verb|local <S>| sets a fixed local address for tunneled
packets. It must be an address on another interface of this host.

\end{itemize}

\let\thefootnote\oldthefootnote

Both \verb|remote| and \verb|local| may be omitted. In this case we
say that they are zero or wildcard. Two tunnels of one mode cannot
have the same \verb|remote| and \verb|local|. In particular, it means
that a base device or fallback tunnel cannot be replicated.\footnote{
This restriction is relaxed for keyed GRE tunnels.}

Tunnels are divided into two classes: {\bf pointopoint} tunnels, which
have some non-wildcard \verb|remote| address and deliver all the packets
to this destination, and {\bf NBMA} (i.e.\ Non-Broadcast Multi-Access) tunnels,
which have no \verb|remote|. In particular, base devices (e.g.\ \verb|tunl0|)
are NBMA, because they have neither \verb|remote| nor
\verb|local| addresses.
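
The two classes, and the rule that two tunnels of one mode cannot share
the same endpoints, can be sketched in a few lines of Python. This is only
an illustration and not kernel code: the \verb|Tunnel| record and the helpers
\verb|tunnel_class| and \verb|conflicts| are hypothetical names, and
\verb|None| stands for a wildcard address.

\begin{verbatim}
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tunnel:
    """Hypothetical record of a tunnel device; None means wildcard."""
    name: str
    mode: str                      # "ipip", "sit" or "gre"
    remote: Optional[str] = None
    local: Optional[str] = None


def tunnel_class(t: Tunnel) -> str:
    """A tunnel with a non-wildcard remote is pointopoint, otherwise NBMA."""
    return "pointopoint" if t.remote is not None else "NBMA"


def conflicts(a: Tunnel, b: Tunnel) -> bool:
    """Two tunnels of one mode may not share the same remote and local.

    The relaxation for keyed GRE tunnels is not modelled here.
    """
    return a.mode == b.mode and a.remote == b.remote and a.local == b.local


base = Tunnel("tunl0", "ipip")
my_tunnel = Tunnel("MY-TUNNEL", "ipip", remote="193.233.7.65")
print(tunnel_class(base), tunnel_class(my_tunnel))
```
\end{verbatim}

Under these assumptions a second \verb|ipip| tunnel with both addresses
omitted would conflict with \verb|tunl0|, which is why the base device cannot
be replicated.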

After a tunnel device is created, you should configure it as you would
any other device. Certainly, the configuration of tunnels has
some peculiarities, because tunnels work over the existing Internet
routing infrastructure and simultaneously create new virtual links,
which change this infrastructure. The danger that a careless
tunnel setup will result in tunnel loops,
a collapse of routing, or a network flooded with an exponentially
growing number of tunneled fragments is very real.

Protocol setup on pointopoint tunnels does not differ from the configuration
of other devices. You should set a protocol address with \verb|ifconfig|
and add routes with the \verb|route| utility.

NBMA tunnels are different. To route something via an NBMA tunnel
you have to tell the driver where it should deliver packets.
The only way to do this is to create special routes with a gateway
address pointing to the desired endpoint. E.g.\
\begin{verbatim}
    ip route add 10.0.0.0/24 via <A> dev tunl0 onlink
\end{verbatim}
It is important to use the option \verb|onlink|; otherwise the
kernel will refuse the request to create a route via a gateway not directly
reachable over device \verb|tunl0|. With IPv6 the situation is much simpler:
when you start device \verb|sit0|, it automatically configures itself
with all IPv4 addresses mapped to IPv6 space, so that the whole IPv4
Internet is {\em really reachable} via \verb|sit0|! Excellent, the command
\begin{verbatim}
    ip route add 3FFE::/16 via ::193.233.7.65 dev sit0
\end{verbatim}
will route \verb|3FFE::/16| via \verb|sit0|, sending all the packets
destined to this prefix to 193.233.7.65.
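
The relation between the IPv6 gateway \verb|::193.233.7.65| and the IPv4
endpoint 193.233.7.65 can be illustrated with Python's standard
\verb|ipaddress| module. This is a sketch only; the helper name
\verb|sit_endpoint| is hypothetical, and it assumes that the endpoint is
simply the low 32 bits of a gateway whose upper 96 bits are zero.

\begin{verbatim}
```python
import ipaddress


def sit_endpoint(gateway: str) -> ipaddress.IPv4Address:
    """Return the IPv4 address embedded in the low 32 bits of an
    IPv4-compatible IPv6 gateway such as ::193.233.7.65."""
    gw = ipaddress.IPv6Address(gateway)
    if int(gw) >> 32 != 0:
        raise ValueError(f"{gateway} is not an IPv4-compatible address")
    return ipaddress.IPv4Address(int(gw) & 0xFFFFFFFF)


print(sit_endpoint("::193.233.7.65"))
```
\end{verbatim}

A gateway such as \verb|3ffe::1| carries no embedded IPv4 address, so the
sketch rejects it.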

\section{Tunnel setup: options}

The command \verb|ip tunnel add| has several additional options.

\begin{itemize}

\item \verb|ttl N| --- set fixed TTL \verb|N| on tunneled packets.
\verb|N| is a number in the range 1--255. 0 is a special value,
meaning that packets inherit the TTL value.
Default value is: \verb|inherit|.

\item \verb|tos T| --- set fixed TOS \verb|T| on tunneled packets.
Default value is: \verb|inherit|.

\item \verb|dev DEV| --- bind the tunnel to device \verb|DEV|, so that
tunneled packets will be routed only via this device and will
not be able to escape to another device when the route to the endpoint changes.

\item \verb|nopmtudisc| --- disable Path MTU Discovery on this tunnel.
It is enabled by default. Note that a fixed ttl is incompatible
with this option: tunnels with a fixed ttl always perform pmtu discovery.

\end{itemize}
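
The interplay of \verb|ttl| and \verb|nopmtudisc| described in the list above
can be written down as a small Python sketch. The function names
\verb|outer_ttl| and \verb|pmtu_discovery| are hypothetical; the code only
restates the rules from the list and does not model how any particular
version of \verb|ip| reacts when both options are given.

\begin{verbatim}
```python
def outer_ttl(tunnel_ttl: int, inner_ttl: int) -> int:
    """TTL of the outer header: 0 means inherit from the inner packet."""
    if not 0 <= tunnel_ttl <= 255:
        raise ValueError("ttl must be 0 (inherit) or 1..255")
    return inner_ttl if tunnel_ttl == 0 else tunnel_ttl


def pmtu_discovery(tunnel_ttl: int, nopmtudisc: bool) -> bool:
    """PMTU discovery is on by default; a fixed ttl always keeps it on."""
    return tunnel_ttl != 0 or not nopmtudisc


print(outer_ttl(0, 17), outer_ttl(64, 1))
```
\end{verbatim}

With \verb|ttl 64| a routing protocol packet sent with ttl 1 leaves the host
with an outer TTL of 64, which is the case mentioned in the micro-FAQ.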

\verb|ipip| and \verb|sit| tunnels have no more options. \verb|gre|
tunnels are more complicated:

\begin{itemize}

\item \verb|key K| --- use keyed GRE with key \verb|K|. \verb|K| is
either a number or an IP address-like dotted quad.

\item \verb|csum| --- checksum tunneled packets.

\item \verb|seq| --- serialize packets.

I think this option does not
work. At least, I did not test it, did not debug it and
do not even understand how it is supposed to work and for what
purpose Cisco planned to use it.

\end{itemize}

Actually, these GRE options can be set separately for the input and
output directions by prefixing the corresponding keywords with the letter
\verb|i| or \verb|o|. E.g.\ \verb|icsum| orders the kernel to accept only
packets with a correct checksum, and \verb|ocsum| means that
our host will calculate and send the checksum.

The command \verb|ip tunnel add| is not the only operation
that can be performed on tunnels. Certainly, you may get a short help page with:
\begin{verbatim}
    ip tunnel help
\end{verbatim}
Besides that, you may view the list of installed tunnels with the command:
\begin{verbatim}
    ip tunnel ls
\end{verbatim}
Also you may look at statistics:
\begin{verbatim}
    ip -s tunnel ls Cisco
\end{verbatim}
where \verb|Cisco| is the name of a tunnel device. The command
\begin{verbatim}
    ip tunnel del Cisco
\end{verbatim}
destroys tunnel \verb|Cisco|. And, finally,
\begin{verbatim}
    ip tunnel change Cisco mode sit local ME remote HE ttl 32
\end{verbatim}
changes its parameters.

\section{Differences between 2.2 and 2.0 tunnels revisited.}

Now we can discuss more subtle differences between tunneling in 2.0
and 2.2.

\begin{itemize}

\item In 2.0 all tunneled packets were received promiscuously
as soon as you loaded the module \verb|ipip|. 2.2 tries to select the best
tunnel device, and the packet looks as if it was received on that device. E.g.\ if the host
receives an \verb|ipip| packet from host \verb|D| destined to our
local address \verb|S|, the kernel searches for matching tunnels
in this order:

\begin{tabular}{ll}
1 & \verb|remote| is \verb|D| and \verb|local| is \verb|S| \\
2 & \verb|remote| is \verb|D| and \verb|local| is wildcard \\
3 & \verb|remote| is wildcard and \verb|local| is \verb|S| \\
4 & the base device \verb|tunl0|
\end{tabular}

If a tunnel exists but is not in the \verb|UP| state, it is ignored.
Note that if \verb|tunl0| is \verb|UP|, it receives all the IPIP packets
not claimed by more specific tunnels.
Be careful: it means that without carefully installed firewall rules
anyone on the Internet may inject into your network any packets with
source addresses indistinguishable from local ones. It is not a bad idea
to design tunnels in a way that enforces maximal route symmetry
and to enable the reverse path filter (\verb|rp_filter| sysctl option) on
tunnel devices.

\item In 2.2 you can monitor and debug tunnels with \verb|tcpdump|.
E.g.\ \verb|tcpdump -i Cisco -nvv| will dump the packets
that the kernel sends via tunnel \verb|Cisco| and the packets received on it,
as seen from the kernel's viewpoint.

\end{itemize}
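
The search order for the receiving tunnel can be modelled in Python as
follows. This is a sketch under stated assumptions, not the kernel
algorithm itself: \verb|Tunnel| and \verb|lookup| are hypothetical names,
\verb|None| stands for a wildcard address, and the base device is
represented as a tunnel with both addresses wildcard.

\begin{verbatim}
```python
from typing import NamedTuple, Optional, Sequence


class Tunnel(NamedTuple):
    name: str
    remote: Optional[str]   # None means wildcard
    local: Optional[str]    # None means wildcard
    up: bool = True


def lookup(tunnels: Sequence[Tunnel], src: str, dst: str) -> Optional[Tunnel]:
    """Pick the receiving tunnel for a packet from src (D) to dst (S)."""
    patterns = [
        (src, dst),    # remote is D and local is S
        (src, None),   # remote is D and local is wildcard
        (None, dst),   # remote is wildcard and local is S
        (None, None),  # fallback: base device such as tunl0
    ]
    for remote, local in patterns:
        for t in tunnels:
            # Tunnels that are not UP are ignored.
            if t.up and t.remote == remote and t.local == local:
                return t
    return None


tunnels = [
    Tunnel("tunl0", None, None),
    Tunnel("Cisco", "193.233.7.65", None),
    Tunnel("Down", "10.10.13.2", "10.10.14.1", up=False),
]
print(lookup(tunnels, "193.233.7.65", "10.0.0.1").name)
```
\end{verbatim}

In this example a packet from 10.10.13.2 to 10.10.14.1 falls through to
\verb|tunl0|, because the more specific tunnel is not \verb|UP|. This is
exactly the situation in which firewall rules on the base device matter.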

\section{Linux and Cisco IOS tunnels.}

Among other tunnels, Cisco IOS supports IPIP and GRE.
Essentially, the Cisco setup is a subset of the options available for Linux.
Let us consider the simplest example:
\begin{verbatim}
 tunnel source 10.10.14.1
 tunnel destination 10.10.13.2
\end{verbatim}

This command set translates to:
\begin{verbatim}
    ip tunnel add Tunnel0 \
        mode <MODE> \
        local 10.10.14.1 \
        remote 10.10.13.2
\end{verbatim}
where \verb|<MODE>| is the tunnel mode configured on the Cisco side
(\verb|ipip| or \verb|gre|).

Any questions? No questions.

\section{Interaction of IPIP tunnels and DVMRP.}

DVMRP exploits IPIP tunnels to route multicasts via the Internet.
\verb|mrouted| automatically creates
the IPIP tunnels listed in its configuration file.
From the kernel and user viewpoints there is no difference between
tunnels created in this way and tunnels created by \verb|ip tunnel|.
I.e.\ if \verb|mrouted| created some tunnel, it may be used to
route unicast packets, provided appropriate routes are added.
And vice versa, if the administrator has already created a tunnel,
it will be reused by \verb|mrouted| if it requests a DVMRP
tunnel with the same local and remote addresses.

Do not be surprised if your manually configured tunnel is
destroyed when \verb|mrouted| exits.

\section{Broadcast GRE ``tunnels''.}

It is possible to set \verb|remote| for a GRE tunnel to a multicast
address. Such a tunnel becomes a {\bf broadcast} tunnel (though the word
tunnel is not quite appropriate in this case; it is rather a virtual network).
\begin{verbatim}
    ip tunnel add Universe mode gre local 193.233.7.65 \
        remote 224.66.66.66 ttl 16
    ip addr add 10.0.0.1/16 dev Universe
    ip link set Universe up
\end{verbatim}
This tunnel is a true broadcast network, and broadcast packets are
sent to multicast group 224.66.66.66. By default such a tunnel starts
to resolve both IP and IPv6 addresses via ARP/NDISC, so that
if multicast routing is supported in the surrounding network, all GRE nodes
will find one another automatically and will form a virtual Ethernet-like
broadcast network. If multicast routing does not work, it is an unpleasant
but not fatal flaw: the tunnel becomes an NBMA rather than a broadcast network.
You may disable dynamic ARPing with:
\begin{verbatim}
    echo 0 > /proc/sys/net/ipv4/neigh/Universe/mcast_solicit
\end{verbatim}
and add the required information to the ARP tables manually:
\begin{verbatim}
    ip neigh add 10.0.0.2 lladdr 128.6.190.2 dev Universe nud permanent
\end{verbatim}
In this case packets sent to 10.0.0.2 will be encapsulated in GRE
and sent to 128.6.190.2. It is possible to facilitate address resolution
using methods typical for other NBMA networks, e.g.\ to start a user
level \verb|arpd| daemon, which will maintain a database of hosts attached
to the GRE virtual network, or to ask a
dedicated ARP or NHRP server for the information.
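
How the outer destination is chosen on such a device can be pictured with
a short Python sketch. It is a deliberately simplified model: the names
\verb|neighbours| and \verb|outer_destination| are hypothetical, the table
holds only the permanent entry from the example above, and dynamic
resolution via ARP is not modelled.

\begin{verbatim}
```python
from typing import Optional

MCAST_GROUP = "224.66.66.66"   # remote of the tunnel Universe

# Static neighbour entries, as added with "ip neigh add ... nud permanent":
# protocol address on the virtual network -> lladdr (an IPv4 endpoint).
neighbours = {"10.0.0.2": "128.6.190.2"}


def outer_destination(next_hop: str, is_broadcast: bool = False) -> Optional[str]:
    """Return the outer IP destination, or None if the neighbour is unknown."""
    if is_broadcast:
        return MCAST_GROUP          # broadcasts go to the multicast group
    return neighbours.get(next_hop)


print(outer_destination("10.0.0.2"))
```
\end{verbatim}

An \verb|arpd| daemon or an NHRP server would, in this picture, be just
another way of filling the \verb|neighbours| table.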

Actually, such a setup is the most natural one for tunneling.
It is really flexible, scalable and easily manageable, so
it is strongly recommended for GRE tunnels instead of the ugly
hack with NBMA mode and the \verb|onlink| modifier. Unfortunately,
for historical reasons broadcast mode is not supported by IPIP tunnels,
but this will probably change in the future.

\section{Traffic control issues.}

Tunnels are devices, hence all the power of Linux traffic control
applies to them. The simplest (and the most useful in practice)
example is limiting tunnel bandwidth. The following command:
\begin{verbatim}
    tc qdisc add dev tunl0 root tbf \
        rate 128Kbit burst 4K limit 10K
\end{verbatim}
will limit tunneled traffic to 128Kbit with a maximal burst size of 4K
and queue no more than 10K.
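
For readers unfamiliar with token bucket filters, the meaning of
\verb|rate| and \verb|burst| can be illustrated with a simplified Python
model. This is not the kernel \verb|tbf| implementation: the class
\verb|TokenBucket| is a hypothetical name, the \verb|limit| parameter
(queueing) is not modelled, and the unit conversions in the example are
assumptions about how \verb|tc| interprets \verb|128Kbit| and \verb|4K|.

\begin{verbatim}
```python
class TokenBucket:
    """Simplified token bucket; excess packets are refused, not queued."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: int):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)   # the bucket starts full
        self.stamp = 0.0                   # time of the last update, seconds

    def allow(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes may be sent at time `now`."""
        elapsed = now - self.stamp
        # Refill at the configured rate, but never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.stamp = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


# Assumed conversions: 128Kbit -> 128000 bit/s = 16000 bytes/s, 4K -> 4096 bytes.
bucket = TokenBucket(rate_bytes_per_s=16000, burst_bytes=4096)
```
\end{verbatim}

In this model a full burst of 4096 bytes passes at once, after which
further packets are refused until enough tokens have accumulated again at
the configured rate.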

However, you should remember that tunnels are {\em virtual} devices
implemented in software, and true queue management is impossible for them
simply because they have no queues. Instead, it is better to create classes
on real physical interfaces and to map tunneled packets to them.
In the general case of dynamic routing you should create such classes
on all outgoing interfaces or, alternatively,
use the option \verb|dev DEV| to bind the tunnel to a fixed physical device.
In the latter case packets will be routed only via the specified device
and you need to set up the corresponding classes only on it.
You have to pay for this convenience, though:
if routing changes, your tunnel will fail.

Suppose that CBQ class \verb|1:ABC| has been created on device \verb|eth0|
specially for tunnel \verb|Cisco| with endpoints \verb|S| and \verb|D|.
Now you can select IPIP packets with addresses \verb|S| and \verb|D|
with some classifier and map them to class \verb|1:ABC|. E.g.\
this is easy to do with the \verb|rsvp| classifier:
\begin{verbatim}
    tc filter add dev eth0 pref 100 proto ip rsvp \
        session D ipproto ipip filter S \
        classid 1:ABC
\end{verbatim}

If you want a more detailed classification of the sub-flows
transmitted via the tunnel, you can build a CBQ subtree
rooted at \verb|1:ABC| and attach to the subroot a set of rules that parse
IPIP packets more deeply.