.\" This man page is Copyright (C) 1999 Andi Kleen <ak@muc.de>.
.\" %%%LICENSE_START(VERBATIM_ONE_PARA)
.\" Permission is granted to distribute possibly modified copies
.\" of this page provided the header is included verbatim,
.\" and in case of nontrivial modification author and date
.\" of the modification is added to the header.
.\" $Id: packet.7,v 1.13 2000/08/14 08:03:45 ak Exp $
.TH packet 7 (date) "Linux man-pages (unreleased)"
packet \- packet interface on device level
.B #include <sys/socket.h>
.B #include <linux/if_packet.h>
.B #include <net/ethernet.h> /* the L2 protocols */
.BI "packet_socket = socket(AF_PACKET, int " socket_type ", int "protocol );
Packet sockets are used to receive or send raw packets at the device driver
They allow the user to implement protocol modules in user space
on top of the physical layer.
for raw packets including the link-level header or
for cooked packets with the link-level header removed.
The link-level header information is available in a common format in a
is the IEEE 802.3 protocol number in network byte order.
include file for a list of allowed protocols.
.BR htons(ETH_P_ALL) ,
then all protocols are received.
All incoming packets of that protocol type will be passed to the packet
socket before they are passed to the protocols implemented in the kernel.
no packets are received.
can optionally be called with a nonzero
to start receiving packets for the protocols specified.
In order to create a packet socket, a process must have the
capability in the user namespace that governs its network namespace.
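.PP
As a rough illustration of the creation step described above,
the following C sketch wraps the socket(2) call.
The helper name open_packet_socket is hypothetical and not part of any API;
it assumes the protocol is given in host byte order and converts it with htons(3).

```c
#include <arpa/inet.h>      /* htons() */
#include <net/ethernet.h>   /* ETH_P_ALL and other L2 protocol numbers */
#include <stdint.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: open a packet socket of the given socket type
 * for one protocol (host byte order, e.g. ETH_P_ALL).
 * Returns the new descriptor, or -1 with errno set by socket(2).
 */
int open_packet_socket(int socket_type, uint16_t protocol)
{
    /* The protocol argument must be in network byte order. */
    return socket(AF_PACKET, socket_type, htons(protocol));
}
```

A caller lacking the required capability should expect the call to fail,
so the return value needs to be checked before the descriptor is used.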
packets are passed to and from the device driver without any changes in
When receiving a packet, the address is still parsed and
When transmitting a packet, the user-supplied buffer
should contain the physical-layer header.
queued unmodified to the network driver of the interface defined by the
Some device drivers always add other headers.
is similar to but not compatible with the obsolete
.B AF_INET/SOCK_PACKET
operates on a slightly higher level.
The physical header is removed before the packet is passed to the user.
Packets sent through a
packet socket get a suitable physical-layer header based on the
destination address before they are queued.
By default, all packets of the specified protocol type
are passed to a packet socket.
To get packets only from a specific interface use
specifying an address in a
to bind the packet socket to an interface.
Fields used for binding are
operation is not supported on packet sockets.
the real length of the packet on the wire is always returned,
even when it is longer than the buffer.
structure is a device-independent physical-layer address.
.in +4n
.EX
struct sockaddr_ll {
    unsigned short sll_family;   /* Always AF_PACKET */
    unsigned short sll_protocol; /* Physical\-layer protocol */
    int            sll_ifindex;  /* Interface number */
    unsigned short sll_hatype;   /* ARP hardware type */
    unsigned char  sll_pkttype;  /* Packet type */
    unsigned char  sll_halen;    /* Length of address */
    unsigned char  sll_addr[8];  /* Physical\-layer address */
};
.EE
.in
The fields of this structure are as follows:
is the standard Ethernet protocol type in network byte order as defined
.I <linux/if_ether.h>
It defaults to the socket's protocol.
is the interface index of the interface
0 matches any interface (only permitted for binding).
is an ARP type as defined in the
contains the packet type.
for a packet addressed to the local host,
for a physical-layer broadcast packet,
for a packet sent to a physical-layer multicast address,
for a packet to some other host that has been caught by a device driver
in promiscuous mode, and
for a packet originating from the local host that is looped back to a packet
These types make sense only for receiving.
contain the physical-layer (e.g., IEEE 802.3) address and its length.
The exact interpretation depends on the device.
When you send packets, it is enough to specify
The other fields should be 0.
are set on received packets for your information.
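.PP
The following C sketch shows one way a destination address for sending
might be filled in.
The helper name make_send_addr is hypothetical, and the set of fields it
fills (family, protocol, interface index, address, and address length)
is an assumption about what a sender needs;
everything else is zeroed, as the text above requires.

```c
#include <arpa/inet.h>        /* htons() */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>       /* AF_PACKET */

/*
 * Hypothetical helper: build a link-layer destination address.
 * protocol is given in host byte order; hwaddr points to halen bytes.
 */
struct sockaddr_ll make_send_addr(int ifindex, uint16_t protocol,
                                  const unsigned char *hwaddr,
                                  unsigned char halen)
{
    struct sockaddr_ll sll;

    memset(&sll, 0, sizeof(sll));        /* unused fields stay 0 */
    sll.sll_family = AF_PACKET;
    sll.sll_protocol = htons(protocol);  /* network byte order */
    sll.sll_ifindex = ifindex;
    if (halen > sizeof(sll.sll_addr))    /* sll_addr holds at most 8 bytes */
        halen = sizeof(sll.sll_addr);
    sll.sll_halen = halen;
    memcpy(sll.sll_addr, hwaddr, halen);
    return sll;
}
```

The result can be passed, cast to a generic socket address pointer,
as the destination of a send call on the packet socket.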
Packet socket options are configured by calling
.B PACKET_ADD_MEMBERSHIP
.B PACKET_DROP_MEMBERSHIP
Packet sockets can be used to configure physical-layer multicasting
and promiscuous mode.
.B PACKET_ADD_MEMBERSHIP
.B PACKET_DROP_MEMBERSHIP
structure as argument:
.in +4n
.EX
struct packet_mreq {
    int            mr_ifindex;    /* interface index */
    unsigned short mr_type;       /* action */
    unsigned short mr_alen;       /* address length */
    unsigned char  mr_address[8]; /* physical\-layer address */
};
.EE
.in
contains the interface index for the interface whose status
field specifies which action to perform.
enables receiving all packets on a shared medium (often known as
.B PACKET_MR_MULTICAST
binds the socket to the physical-layer multicast group specified in
.B PACKET_MR_ALLMULTI
sets the socket up to receive all multicast packets arriving at
In addition, the traditional ioctls
can be used for the same purpose.
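.PP
A minimal sketch of the membership options, assuming the action wanted
is promiscuous mode on a single interface.
The helper name set_promiscuous is hypothetical;
the option names and the request structure come from <linux/if_packet.h>.

```c
#include <linux/if_packet.h>  /* struct packet_mreq, PACKET_MR_PROMISC */
#include <string.h>
#include <sys/socket.h>       /* setsockopt(), SOL_PACKET */

/*
 * Hypothetical helper: enable (on != 0) or disable (on == 0)
 * promiscuous mode for this socket on the given interface.
 * Returns 0 on success, or -1 with errno set by setsockopt(2).
 */
int set_promiscuous(int fd, int ifindex, int on)
{
    struct packet_mreq mreq;

    memset(&mreq, 0, sizeof(mreq));
    mreq.mr_ifindex = ifindex;
    mreq.mr_type = PACKET_MR_PROMISC;  /* no address is needed here */

    return setsockopt(fd, SOL_PACKET,
                      on ? PACKET_ADD_MEMBERSHIP : PACKET_DROP_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

The same pattern should apply to the multicast actions,
with the address fields of the request filled in as well.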
.BR PACKET_AUXDATA " (since Linux 2.6.21)"
.\" commit 8dc4194474159660d7f37c495e3fc3f10d0db8cc
If this binary option is enabled, the packet socket passes a metadata
structure along with each packet in the
The structure can be read with
.in +4n
.EX
struct tpacket_auxdata {
    __u32 tp_status;
    __u32 tp_len;        /* packet length */
    __u32 tp_snaplen;    /* captured length */
    __u16 tp_mac;
    __u16 tp_net;
    __u16 tp_vlan_tci;
    __u16 tp_vlan_tpid;  /* Since Linux 3.14; earlier, these
                            were unused padding bytes */
};
.EE
.in
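.PP
Reading the metadata could look roughly like the following C sketch,
which walks the control messages of an already received message.
The helper name find_auxdata is hypothetical;
the sketch assumes the structure arrives as a control message of
level SOL_PACKET and type PACKET_AUXDATA.

```c
#include <linux/if_packet.h>  /* struct tpacket_auxdata, PACKET_AUXDATA */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>       /* struct msghdr, CMSG_* macros, SOL_PACKET */

/*
 * Hypothetical helper: scan the control data of a received message.
 * Returns 1 and copies the metadata into *out if a PACKET_AUXDATA
 * control message is present, and 0 otherwise.
 */
int find_auxdata(struct msghdr *msg, struct tpacket_auxdata *out)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_PACKET &&
            c->cmsg_type == PACKET_AUXDATA &&
            c->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            /* Copy out: the payload may not be suitably aligned. */
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 1;
        }
    }
    return 0;
}
```

The caller is expected to have enabled the option and to have supplied
a control buffer large enough for the structure when receiving.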
.\" commit a0cdfcf39362410d5ea983f4daf67b38de129408 added tp_vlan_tpid
.BR PACKET_FANOUT " (since Linux 3.1)"
.\" commit dc99f600698dcac69b8f56dda9a8a00d645c5ffc
To scale processing across threads, packet sockets can form a fanout
In this mode, each matching packet is enqueued onto only one
A socket joins a fanout group by calling
Each network namespace can have up to 65536 independent groups.
A socket selects a group by encoding the ID in the low-order 16 bits of
the integer option value.
The first packet socket to join a group implicitly creates it.
To successfully join an existing group, subsequent packet sockets
must have the same protocol, device settings, fanout mode, and
Packet sockets can leave a fanout group only by closing the socket.
The group is deleted when the last socket is closed.
Fanout supports multiple algorithms to spread traffic between sockets,
.BR PACKET_FANOUT_HASH ,
sends packets from the same flow to the same socket to maintain
For each packet, it chooses a socket by taking the packet flow hash
modulo the number of sockets in the group, where a flow hash is a hash
over network-layer address and optional transport-layer port fields.
The load-balance mode
implements a round-robin algorithm.
selects the socket based on the CPU that the packet arrived on.
.B PACKET_FANOUT_ROLLOVER
processes all data on a single socket, moving to the next when one
selects the socket using a pseudo-random number generator.
.\" commit 2d36097d26b5991d71a2cf4a20c1a158f0f1bfcd
(available since Linux 3.14)
selects the socket using the recorded queue_mapping of the received skb.
Fanout modes can take additional options.
IP fragmentation causes packets from the same flow to have different
.BR PACKET_FANOUT_FLAG_DEFRAG ,
if set, causes packets to be defragmented before fanout is applied, to
preserve order even in this case.
Fanout mode and options are communicated in the high-order 16 bits of the
integer option value.
.B PACKET_FANOUT_FLAG_ROLLOVER
enables the rollover mechanism as a backup strategy: if the
original fanout algorithm selects a backlogged socket, the packet
rolls over to the next available one.
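.PP
The encoding of the fanout option value can be sketched in C as follows.
The helper names fanout_arg and join_fanout are hypothetical;
the sketch assumes the group ID occupies the low-order 16 bits and the
mode, ORed with any flags, occupies the high-order 16 bits.

```c
#include <linux/if_packet.h>  /* PACKET_FANOUT and PACKET_FANOUT_* values */
#include <stdint.h>
#include <sys/socket.h>       /* setsockopt(), SOL_PACKET */

/* Pack the group ID (low 16 bits) and mode plus flags (high 16 bits). */
uint32_t fanout_arg(uint16_t group_id, uint16_t mode, uint16_t flags)
{
    return (uint32_t)group_id | ((uint32_t)(mode | flags) << 16);
}

/*
 * Hypothetical helper: join the fanout group, creating it if this is
 * the first socket to use the ID.
 * Returns 0 on success, or -1 with errno set by setsockopt(2).
 */
int join_fanout(int fd, uint16_t group_id, uint16_t mode, uint16_t flags)
{
    uint32_t arg = fanout_arg(group_id, mode, flags);

    return setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg));
}
```

Each worker thread would typically open its own packet socket and join
the same group with identical mode and flags.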
.BR PACKET_LOSS " (with " PACKET_TX_RING )
When a malformed packet is encountered on a transmit ring,
the default is to reset its
.B TP_STATUS_WRONG_FORMAT
and abort the transmission immediately.
The malformed packet blocks itself and subsequently enqueued packets from
The format error must be fixed, the associated
.BR TP_STATUS_SEND_REQUEST ,
and the transmission process restarted via
is set, any malformed packet will be skipped, its
.BR TP_STATUS_AVAILABLE ,
and the transmission process continued.
.BR PACKET_RESERVE " (with " PACKET_RX_RING )
By default, a packet receive ring writes packets immediately following the
metadata structure and alignment padding.
This integer option reserves additional headroom.
Create a memory-mapped ring buffer for asynchronous packet reception.
The packet socket reserves a contiguous region of application address
space, lays it out into an array of packet slots and copies packets
into subsequent slots.
Each packet is preceded by a metadata structure similar to
.IR tpacket_auxdata .
The protocol fields encode the offset to the data
from the start of the metadata header.
stores the offset to the network layer.
If the packet socket is of type
then that field stores the offset to the link-layer frame.
Packet socket and application communicate the head and tail of the ring
The packet socket owns all slots with
.BR TP_STATUS_KERNEL .
After filling a slot, it changes the status of the slot to transfer
ownership to the application.
During normal operation, the new
value has at least the
bit set to signal that a received packet has been stored.
When the application has finished processing a packet, it transfers
ownership of the slot back to the socket by setting
.BR TP_STATUS_KERNEL .
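.PP
The ownership handshake described above can be sketched in C as follows.
This is an illustration only: it assumes the first ring variant, whose
per-slot header is struct tpacket_hdr, and it uses the GCC/Clang
__atomic builtins for ordering, which is a choice made for this sketch.
The helper names slot_frame and release_slot are hypothetical.

```c
#include <linux/if_packet.h>  /* struct tpacket_hdr, TP_STATUS_* */
#include <stddef.h>

/*
 * If the slot has been handed to the application, return a pointer to
 * the stored packet data; otherwise return NULL because the packet
 * socket still owns the slot.
 */
unsigned char *slot_frame(struct tpacket_hdr *hdr)
{
    unsigned long status = __atomic_load_n(&hdr->tp_status, __ATOMIC_ACQUIRE);

    if (!(status & TP_STATUS_USER))
        return NULL;
    /* Offsets are relative to the start of the metadata header. */
    return (unsigned char *)hdr + hdr->tp_mac;
}

/* Hand the slot back to the packet socket once processing is done. */
void release_slot(struct tpacket_hdr *hdr)
{
    __atomic_store_n(&hdr->tp_status, TP_STATUS_KERNEL, __ATOMIC_RELEASE);
}
```

A receive loop would call slot_frame on the current slot, wait with
poll(2) when it returns NULL, and call release_slot before advancing.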
Packet sockets implement multiple variants of the packet ring.
The implementation details are described in
.I Documentation/networking/packet_mmap.rst
in the Linux kernel source tree.
Retrieve packet socket statistics in the form of a structure
.in +4n
.EX
struct tpacket_stats {
    unsigned int tp_packets;  /* Total packet count */
    unsigned int tp_drops;    /* Dropped packet count */
};
.EE
.in
Receiving statistics resets the internal counters.
The statistics structure differs when using a ring of variant
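.PP
Retrieving the counters might look like the following C sketch,
which assumes the two-field struct tpacket_stats layout shown above
rather than the layout used by other ring variants.
The helper name read_packet_stats is hypothetical.

```c
#include <linux/if_packet.h>  /* struct tpacket_stats, PACKET_STATISTICS */
#include <sys/socket.h>       /* getsockopt(), SOL_PACKET */

/*
 * Hypothetical helper: fetch the counters into *st.
 * Note that a successful read also resets the kernel-side counters.
 * Returns 0 on success, or -1 with errno set by getsockopt(2).
 */
int read_packet_stats(int fd, struct tpacket_stats *st)
{
    socklen_t len = sizeof(*st);

    return getsockopt(fd, SOL_PACKET, PACKET_STATISTICS, st, &len);
}
```

Because reading resets the counters, a monitoring loop has to accumulate
the returned values itself if it wants running totals.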
.BR PACKET_TIMESTAMP " (with " PACKET_RX_RING "; since Linux 2.6.36)"
.\" commit 614f60fa9d73a9e8fdff3df83381907fea7c5649
The packet receive ring always stores a timestamp in the metadata header.
By default, this is a software-generated timestamp taken when the
packet is copied into the ring.
This integer option selects the type of timestamp.
Besides the default, it supports the two hardware formats described in
.I Documentation/networking/timestamping.rst
in the Linux kernel source tree.
.BR PACKET_TX_RING " (since Linux 2.6.31)"
.\" commit 69e3c75f4d541a6eb151b3ef91f34033cb3ad6e1
Create a memory-mapped ring buffer for packet transmission.
This option is similar to
and takes the same arguments.
The application writes packets into slots with
.B TP_STATUS_AVAILABLE
and schedules them for transmission by changing
.BR TP_STATUS_SEND_REQUEST .
When packets are ready to be transmitted, the application calls
or a variant thereof.
fields of this call are ignored.
If an address is passed using
then that overrides the socket default.
On successful transmission, the socket resets
.BR TP_STATUS_AVAILABLE .
It immediately aborts the transmission on error unless
.BR PACKET_VERSION " (with " PACKET_RX_RING "; since Linux 2.6.27)"
.\" commit bbd6ef87c544d88c30e4b762b1b61ef267a7d279
creates a packet receive ring of variant
To create another variant, configure the desired variant by setting this
integer option before creating the ring.
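.PP
The ordering requirement (select the variant first, then create the ring)
can be sketched in C as follows.
The helper names make_ring_req and create_rx_ring are hypothetical,
and the sketch assumes a variant that takes struct tpacket_req,
with the frame count derived from the block geometry.

```c
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_VERSION */
#include <string.h>
#include <sys/socket.h>       /* setsockopt(), SOL_PACKET */

/*
 * Build a ring request whose frame count is consistent with the block
 * geometry: frames per block times number of blocks.
 * frame_size must be nonzero and no larger than block_size.
 */
struct tpacket_req make_ring_req(unsigned int block_size,
                                 unsigned int block_nr,
                                 unsigned int frame_size)
{
    struct tpacket_req req;

    memset(&req, 0, sizeof(req));
    req.tp_block_size = block_size;
    req.tp_block_nr = block_nr;
    req.tp_frame_size = frame_size;
    if (frame_size != 0)
        req.tp_frame_nr = (block_size / frame_size) * block_nr;
    return req;
}

/*
 * Hypothetical helper: set the ring variant, then create the receive ring.
 * Returns 0 on success, or -1 with errno set by setsockopt(2).
 */
int create_rx_ring(int fd, int version, const struct tpacket_req *req)
{
    if (setsockopt(fd, SOL_PACKET, PACKET_VERSION,
                   &version, sizeof(version)) == -1)
        return -1;
    return setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));
}
```

After a successful call, the application would map the ring with mmap(2),
using a length of tp_block_size times tp_block_nr.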
.BR PACKET_QDISC_BYPASS " (since Linux 3.14)"
.\" commit d346a3fae3ff1d99f5d0c819bf86edf9094a26a1
By default, packets sent through packet sockets pass through the kernel's
qdisc (traffic control) layer, which is fine for the vast majority of use
For traffic generator appliances using packet sockets
that intend to brute-force flood the network\[em]for example,
to test devices under load in a similar
fashion to pktgen\[em]this layer can be bypassed by setting
this integer option to 1.
A side effect is that packet buffering in the qdisc layer is avoided,
which will lead to increased drops when network
device transmit queues are busy;
therefore, use at your own risk.
can be used to receive the timestamp of the last received packet.
.\" FIXME Document SIOCGSTAMPNS
In addition, all standard ioctls defined in
are valid on packet sockets.
Packet sockets do no error handling other than for errors that occur
while passing the packet to the device driver.
They don't have the concept of a pending error.
Unknown multicast group address passed.
User passed invalid memory address.
Packet is bigger than interface MTU.
Not enough memory to allocate the packet.
Unknown device name or interface index specified in interface address.
No interface address passed.
Interface address contained an invalid interface index.
User has insufficient privileges to carry out this operation.
In addition, other errors may be generated by the low-level driver.
is a new feature in Linux 2.2.
Earlier Linux versions supported only
For portable programs it is suggested to use
although this covers only a subset of the
packet sockets make no attempt to create or parse the IEEE 802.2 LLC
header for an IEEE 802.3 frame.
is specified as protocol for sending, the kernel creates the
802.3 frame and fills out the length field; the user has to supply the LLC
header to get a fully conforming packet.
Incoming 802.3 packets are not multiplexed on the DSAP/SSAP protocol
fields; instead they are supplied to the user as protocol
with the LLC header prefixed.
It is thus not possible to bind to
instead and do the protocol multiplexing yourself.
The default for sending is the standard Ethernet DIX
encapsulation with the protocol filled in.
Packet sockets are not subject to the input or output firewall chains.
In Linux 2.0, the only way to get a packet socket was with the call:
.in +4n
.EX
socket(AF_INET, SOCK_PACKET, protocol)
.EE
.in
This is still supported, but deprecated and strongly discouraged.
The main difference between the two methods is that
.I struct sockaddr_pkt
to specify an interface, which doesn't provide physical-layer
.in +4n
.EX
struct sockaddr_pkt {
    unsigned short spkt_family;
    unsigned char  spkt_device[14];
    unsigned short spkt_protocol;
};
.EE
.in
is the IEEE 802.3 protocol type as defined in
is the device name as a null-terminated string, for example, eth0.
This structure is obsolete and should not be used in new code.
.SS LLC header handling
The IEEE 802.2/802.3 LLC handling could be considered a bug.
extension is an ugly hack and should be replaced by a control message.
There is currently no way to get the original destination address of
.SS spkt_device device name truncation
has a size of 14 bytes,
which is less than the constant
which is 16 bytes and describes the system limit for a network interface name.
This means the names of network devices longer than 14 bytes
will be truncated to fit into
All these lengths include the terminating null byte (\[aq]\e0\[aq]).
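.PP
For code that still has to deal with the obsolete structure,
a check along the following lines can detect names that would be truncated.
This is only a sketch; the helper name fits_spkt_device is hypothetical.

```c
#include <linux/if_packet.h>  /* struct sockaddr_pkt */
#include <string.h>

/*
 * Return 1 if the interface name, including its terminating null byte,
 * fits into the 14-byte spkt_device field, and 0 if it would be truncated.
 */
int fits_spkt_device(const char *name)
{
    return strlen(name) < sizeof(((struct sockaddr_pkt *)0)->spkt_device);
}
```

A name such as eth0 passes the check,
whereas a 15-character name built from a MAC address does not.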
With old code, the resulting problems typically show up with
very long interface names used by the
.B Predictable Network Interface Names
feature enabled by default in many modern Linux distributions.
The preferred solution is to rewrite code to avoid
Possible user solutions are to disable
.B Predictable Network Interface Names
or to rename the interface to a name of at most 13 bytes,
for example using the
.SS Documentation issues
Socket filters are not documented.
.\" This man page was written by Andi Kleen with help from Matthew Wilcox.
.\" AF_PACKET in Linux 2.2 was implemented
.\" by Alexey Kuznetsov, based on code by Alan Cox and others.
.BR capabilities (7),
RFC\ 894 for the standard IP Ethernet encapsulation.
RFC\ 1700 for the IEEE 802.3 IP encapsulation.
.I <linux/if_ether.h>
include file for physical-layer protocols.
The Linux kernel source tree.
.I Documentation/networking/filter.rst
describes how to apply Berkeley Packet Filters to packet sockets.
.I tools/testing/selftests/net/psock_tpacket.c
contains example source code for all available versions of