1 Rocker Network Switch Register Programming Guide
2 Copyright (c) Scott Feldman <sfeldma@gmail.com>
3 Copyright (c) Neil Horman <nhorman@tuxdriver.com>
4 Version 0.11, 12/29/2014
9 This program is free software; you can redistribute it and/or modify
10 it under the terms of the GNU General Public License as published by
11 the Free Software Foundation; either version 2 of the License, or
12 (at your option) any later version.
14 This program is distributed in the hope that it will be useful,
15 but WITHOUT ANY WARRANTY; without even the implied warranty of
16 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 GNU General Public License for more details.
19 SECTION 1: Introduction
20 =======================
This document describes the hardware/software interface for the Rocker switch
device. The intended audience is authors of OS drivers and device emulation
software.
29 Notations and Conventions
30 -------------------------
o In register descriptions, [n:m] indicates a range from bit n to bit m,
inclusive.
34 o Use of leading 0x indicates a hexadecimal number.
35 o Use of leading 0b indicates a binary number.
o The use of RSVD or Reserved indicates that a bit or field is reserved for
future use.
38 o Field width is in bytes, unless otherwise noted.
o Registers are (R) read-only, (R/W) read/write, (W) write-only, or (COR) clear
on read.
41 o TLV values in network-byte-order are designated with (N).
44 SECTION 2: PCI Configuration Registers
45 ======================================
47 PCI Configuration Space
48 -----------------------
50 Each switch instance registers as a PCI device with PCI configuration space:
offset  width  description                value
---------------------------------------------
0x0     2      Vendor ID                  0x1b36
0x2     2      Device ID                  0x0006
0x8     1      Revision ID                0x01
0x9     3      Class code                 0x2800
0xF     1      Built-in self test
0x10    4      Base address low
0x14    4      Base address high
0x2C    2      Subsystem vendor ID        *
0x3D    1      Interrupt pin              0x00
0x3F    1      Max latency                0x00

* Assigned by sub-system implementation
80 SECTION 3: Memory-Mapped Register Space
81 =======================================
83 There are two memory-mapped BARs. BAR0 maps device register space and is
84 0x2000 in size. BAR1 maps MSI-X vector and PBA tables and is also 0x2000 in
85 size, allowing for 256 MSI-X vectors.
87 All registers are 4 or 8 bytes long. It is assumed host software will access 4
88 byte registers with one 4-byte access, and 8 byte registers with either two
89 4-byte accesses or a single 8-byte access. In the case of two 4-byte accesses,
90 access must be lower and then upper 4-bytes, in that order.
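As an illustration of the access rule above, the following Python sketch models
a 64-bit register write as two 4-byte accesses, lower half first. It is a
minimal sketch, not driver code: FakeBar, write32 and read32 are hypothetical
stand-ins for whatever MMIO accessors the host OS provides.

```python
def write_reg64(write32, offset, value):
    """Write a 64-bit register as two 4-byte accesses, lower half first."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    write32(offset, value & 0xFFFFFFFF)   # lower 4 bytes first
    write32(offset + 4, value >> 32)      # then upper 4 bytes


def read_reg64(read32, offset):
    """Read a 64-bit register as two 4-byte accesses, lower half first."""
    low = read32(offset)
    high = read32(offset + 4)
    return (high << 32) | low


class FakeBar:
    """Hypothetical in-memory stand-in for BAR0, used only to run the sketch."""

    def __init__(self):
        self.regs = {}
        self.log = []   # records the order of 4-byte writes

    def write32(self, offset, value):
        self.log.append((offset, value))
        self.regs[offset] = value

    def read32(self, offset):
        return self.regs.get(offset, 0)
```

A driver that has a native 8-byte accessor can use it instead; the split form
matters only when the host issues 4-byte accesses.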
92 BAR0 device register space is organized as follows:
95 ------------------------------------------------------
96 0x0000-0x000f Bogus registers to catch misbehaving
97 drivers. Writes do nothing. Reads
99 0x0010-0x00ff Test registers
100 0x0300-0x03ff General purpose registers
101 0x1000-0x1fff Descriptor control
103 Holes in register space are reserved. Writes to reserved registers do nothing.
104 Reads to reserved registers read back as 0.
106 No fancy stuff like write-combining is enabled on any of the registers.
108 BAR1 MSI-X register space is organized as follows:
111 ------------------------------------------------------
112 0x0000-0x0fff MSI-X vector table (256 vectors total)
113 0x1000-0x1fff MSI-X PBA table
116 SECTION 4: Interrupts, DMA, and Endianness
117 ==========================================
122 The device supports only MSI-X interrupts. BAR1 memory-mapped region contains
123 the MSI-X vector and PBA tables, with support for up to 256 MSI-X vectors.
125 The vector assignment is:
128 -----------------------------------------------------
129 0 Command descriptor ring completion
130 1 Event descriptor ring completion
131 2 Test operation completion
133 4-255 Tx and Rx descriptor ring completion
An MSI-X vector table entry is 16 bytes:
139 field offset width description
140 -------------------------------------------------------------
141 lower_addr 0x0 4 [31:2] message address[31:2]
142 [1:0] Rsvd (4 byte alignment
upper_addr 0x4 4 [31:15] Rsvd
[14:0] message address[46:32]
146 data 0x8 4 message data[31:0]
147 control 0xc 4 [31:1] Rsvd
148 [0] mask (0 = enable,
151 Software should install the Interrupt Service Routine (ISR) before any ports
152 are enabled or any commands are issued on the command ring.
157 DMA operations are used for packet DMA to/from the CPU, command and event
158 processing. Command processing includes statistical counters and table dumps,
159 table insertion/deletion, and more. Event processing provides an async
160 notification method for device-originating events. Each DMA operation has a
161 set of control registers to manage a descriptor ring. The descriptor rings are
allocated from contiguous host DMA-able memory and registers specify the ring's
base address, size and current head and tail indices. Software always writes
164 the head, and hardware always writes the tail.
The high-order bit of DMA_DESC_COMP_ERR is used to mark hardware completion
167 of a descriptor. Software will clear this bit when posting a descriptor to the
168 ring, and hardware will set this bit when the descriptor is complete.
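The completion bit handling described above can be sketched in Python as
follows. This is an illustrative model only; the helper names are hypothetical,
and treating the remaining 15 bits as the status value is an assumption drawn
from the field description, not a statement of the device's exact encoding.

```python
# High-order bit of the 2-byte DMA_DESC_COMP_ERR field marks hardware completion.
COMP_ERR_DONE = 1 << 15


def prepare_for_post(comp_err):
    """Software clears the completion bit before posting a descriptor."""
    return comp_err & ~COMP_ERR_DONE & 0xFFFF


def is_complete(comp_err):
    """True once hardware has set the completion bit."""
    return bool(comp_err & COMP_ERR_DONE)


def status_bits(comp_err):
    """Assumption: the low 15 bits carry the completion status."""
    return comp_err & 0x7FFF
```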
170 Descriptor ring sizes must be a power of 2 and range from 2 to 64K entries.
171 Descriptor rings' base address must be 8-byte aligned. Descriptors must be
172 packed within ring. Each descriptor in each ring must also be aligned on an 8
173 byte boundary. Each descriptor ring will have these registers:
175 DMA_DESC_xxx_BASE_ADDR, offset 0x1000 + (x * 32), 64-bit, (R/W)
176 DMA_DESC_xxx_SIZE, offset 0x1008 + (x * 32), 32-bit, (R/W)
177 DMA_DESC_xxx_HEAD, offset 0x100c + (x * 32), 32-bit, (R/W)
178 DMA_DESC_xxx_TAIL, offset 0x1010 + (x * 32), 32-bit, (R)
179 DMA_DESC_xxx_CTRL, offset 0x1014 + (x * 32), 32-bit, (W)
180 DMA_DESC_xxx_CREDITS, offset 0x1018 + (x * 32), 32-bit, (R/W)
181 DMA_DESC_xxx_RSVD1, offset 0x101c + (x * 32), 32-bit, (R/W)
183 Where x is descriptor ring index:
201 Writing BASE_ADDR or SIZE will reset HEAD and TAIL to zero. HEAD cannot be
202 written past TAIL. To do so would wrap the ring. An empty ring is when HEAD
203 == TAIL. A full ring is when HEAD is one position behind TAIL. Both HEAD and
204 TAIL increment and modulo wrap at the ring size.
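The register layout and the HEAD/TAIL rules above can be sketched in Python.
This is a minimal model under the stated rules (32-byte stride per ring,
power-of-2 sizes from 2 to 64K, full when HEAD is one position behind TAIL);
the function names are hypothetical.

```python
DMA_DESC_BASE = 0x1000      # offset of ring 0 registers in BAR0
RING_STRIDE = 32            # each ring's register block is 32 bytes apart
REG_OFFSETS = {
    "BASE_ADDR": 0x00,      # 64-bit
    "SIZE": 0x08,
    "HEAD": 0x0C,
    "TAIL": 0x10,
    "CTRL": 0x14,
    "CREDITS": 0x18,
}


def ring_reg(x, name):
    """BAR0 offset of register `name` for descriptor ring index x."""
    return DMA_DESC_BASE + REG_OFFSETS[name] + x * RING_STRIDE


def valid_ring_size(size):
    """Ring sizes must be a power of 2 between 2 and 64K entries."""
    return 2 <= size <= 64 * 1024 and (size & (size - 1)) == 0


def ring_empty(head, tail):
    return head == tail


def ring_full(head, tail, size):
    """Full when HEAD is one position behind TAIL (modulo ring size)."""
    return (head + 1) % size == tail


def advance(index, size):
    """HEAD and TAIL increment and wrap modulo the ring size."""
    return (index + 1) % size
```

Because one slot is always left unused to distinguish full from empty, a ring
of N entries holds at most N - 1 outstanding descriptors.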
209 ------------------------------------------------------------------------
210 [0] CTRL_RESET Reset the descriptor ring
213 All descriptor types share some common fields:
215 field width description
216 -------------------------------------------------------------------
217 DMA_DESC_BUF_ADDR 8 Phys addr of desc payload, 8-byte
219 DMA_DESC_COOKIE 8 Desc cookie for completion matching,
220 upper-most bit is reserved
221 DMA_DESC_BUF_SIZE 2 Desc payload size in bytes
222 DMA_DESC_TLV_SIZE 2 Desc payload total size in bytes
223 used for TLVs. Must be <=
225 DMA_DESC_COMP_ERR 2 Completion status of associated
226 desc payload. High order bit is
227 clear on new descs, toggled by
228 hw for completed items.
230 To support forward- and backward-compatibility, descriptor and completion
231 payloads are specified in TLV format. Fields are packed with Type=field name,
232 Length=field length, and Value=field value. Software will ignore unknown fields
233 filled in by the switch. Likewise, the switch will ignore unknown fields
234 filled in by software.
236 Descriptor payload buffer is 8-byte aligned and TLVs are 8-byte aligned. The
237 value within a TLV is also 8-byte aligned. The (packed, 8 byte) TLV header is:
239 field width description
240 -----------------------------
242 len 2 TLV value length
The alignment requirements for descriptors and TLVs are to avoid unaligned
access exceptions in software. Note that the payload for each TLV is also
8-byte aligned.
249 Figure 1 shows an example descriptor buffer with two TLVs.
251 <------- 8 bytes ------->
253 8-byte +––––+ +–––––––––––+–––––+–––––+ +–+
254 align | type | len | pad | TLV#1 hdr |
255 +–––––––––––+–––––+–––––+ (len=22) |
| value | TLV#1 value |
258 | | (padded to 8-byte |
259 | +–––––+ alignment) |
261 8-byte +––––+ +–––––––––––+–––––––––––+ |
262 align | type | len | pad | TLV#2 hdr DESC_BUF_SIZE
263 +–––––+–––––+–––––+–––––+ (len=2) |
264 |value|/////////////////| TLV#2 value |
265 +–––––+/////////////////| |
266 |///////////////////////| |
267 |///////////////////////| |
268 |///////////////////////| |
269 |////////unused/////////| |
270 |////////space//////////| |
271 |///////////////////////| |
272 |///////////////////////| |
273 |///////////////////////| |
274 +–––––––––––––––––––––––+ +–+
278 TLVs can be nested within the NEST TLV type.
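The TLV layout described above can be sketched with Python's struct module.
This is an illustrative sketch, not the device's definitive encoding: the
header is assumed to be a 4-byte type, 2-byte length and 2 bytes of pad (which
adds up to the packed 8-byte header), both in little-endian, and the type
numbers used by callers are hypothetical.

```python
import struct

# Assumed header layout: type (4 bytes), len (2 bytes), pad (2 bytes), LE.
TLV_HDR = struct.Struct("<IH2x")


def align8(n):
    """Round n up to the next multiple of 8."""
    return (n + 7) & ~7


def pack_tlv(tlv_type, value):
    """Header plus value, with the value zero-padded to 8-byte alignment."""
    padding = bytes(align8(len(value)) - len(value))
    return TLV_HDR.pack(tlv_type, len(value)) + value + padding


def pack_nest(nest_type, *tlvs):
    """A nested TLV is a TLV whose value is a run of packed TLVs."""
    return pack_tlv(nest_type, b"".join(tlvs))


def unpack_tlvs(buf):
    """Split a buffer of back-to-back TLVs into (type, value) pairs."""
    out, pos = [], 0
    while pos + TLV_HDR.size <= len(buf):
        tlv_type, length = TLV_HDR.unpack_from(buf, pos)
        start = pos + TLV_HDR.size
        out.append((tlv_type, bytes(buf[start:start + length])))
        pos = start + align8(length)
    return out
```

With the two TLVs of Figure 1, a 22-byte value occupies 8 + 24 = 32 bytes and a
2-byte value occupies 8 + 8 = 16 bytes. When parsing a descriptor buffer, slice
it to DMA_DESC_TLV_SIZE first so the unused space is not walked as TLVs.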
283 MSI-X vectors used for descriptor ring completions use a credit mechanism for
284 efficient device, PCIe bus, OS and driver operations. Each descriptor ring has
285 a credit count which represents the number of outstanding descriptors to be
286 processed by the driver. As the device marks descriptors complete, the credit
287 count is incremented. As the driver processes those outstanding descriptors,
it returns credits back to the device. This way, the device knows the driver's
progress and can decide whether to fire the next interrupt.
290 When the credit count is zero, and the first descriptors are posted for the
291 driver, a single interrupt is fired. Once the interrupt is fired, the
292 interrupt is disabled (auto-masked*). In response to the interrupt, the driver
293 will process descriptors and PIO write a returned credit value for that
294 descriptor ring. If the driver returns all credits (the driver caught up with
295 the device and there is no outstanding work), then the interrupt is unmasked,
296 but not fired. If only partial credits are returned, the interrupt remains
297 masked but the device generates an interrupt, signaling the driver that more
298 outstanding work is available.
(* this masking is unrelated to the MSI-X interrupt mask register)
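The credit mechanism can be modeled with a small state machine. The Python
sketch below follows the rules in the paragraph above; it is a simplified model
for reasoning about driver behavior, and the class and method names are
hypothetical.

```python
class RingCredits:
    """Toy model of one descriptor ring's interrupt credit accounting."""

    def __init__(self):
        self.credits = 0        # descriptors completed but not yet returned
        self.masked = False     # auto-mask state (not the MSI-X mask bit)
        self.interrupts = 0     # number of interrupts fired so far

    def _fire(self):
        self.interrupts += 1
        self.masked = True      # interrupt is auto-masked once fired

    def device_completes(self, count=1):
        """Device marks `count` descriptors complete."""
        was_zero = self.credits == 0
        self.credits += count
        if was_zero and not self.masked:
            self._fire()        # first work after idle: single interrupt

    def driver_returns(self, count):
        """Driver writes `count` returned credits to the CREDITS register."""
        if not 0 < count <= self.credits:
            raise ValueError("cannot return more credits than outstanding")
        self.credits -= count
        if self.credits == 0:
            self.masked = False  # caught up: unmask, but do not fire
        else:
            self._fire()         # partial return: stay masked, fire again
```

A driver that always returns exactly the number of descriptors it processed
sees one interrupt per burst of work, regardless of how many descriptors the
device completes while the driver is busy.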
305 Device registers are hard-coded to little-endian (LE). The driver should
convert to/from host endianness to LE for device register accesses.
308 Descriptors are LE. Descriptor buffer TLVs will have LE type and length
309 fields, but the value field can either be LE or network-byte-order, depending
310 on context. TLV values containing network packet data will be in network-byte
311 order. A TLV value containing a field or mask used to compare against network
312 packet data is network-byte order. For example, flow match fields (and masks)
313 are network-byte-order since they're matched directly, byte-by-byte, against
314 network packet data. All non-network-packet TLV multi-byte values will be LE.
316 TLV values in network-byte-order are designated with (N).
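To make the two byte orders concrete, the following Python sketch encodes one
LE value and one (N) value. The choice of fields is only an example: a 4-byte
physical port number (LE, not marked (N)) and a 2-byte VLAN ID (marked (N) in
the flow tables). The helper names are hypothetical.

```python
import struct


def le32(value):
    """Little-endian 4-byte value, as used for non-network-packet fields."""
    return struct.pack("<I", value)


def net16(value):
    """Network-byte-order (big-endian) 2-byte value, for (N) fields."""
    return struct.pack("!H", value)


pport_value = le32(4)     # b"\x04\x00\x00\x00"
vlan_value = net16(100)   # b"\x00\x64"
```

MAC and IP addresses marked (N) are carried as the raw bytes they occupy in a
packet, so they need no conversion when copied from packet data.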
319 SECTION 5: Test Registers
320 =========================
322 Rocker has several test registers to support troubleshooting register access,
323 interrupt generation, and DMA operations:
325 TEST_REG, offset 0x0010, 32-bit (R/W)
326 TEST_REG64, offset 0x0018, 64-bit (R/W)
327 TEST_IRQ, offset 0x0020, 32-bit (R/W)
328 TEST_DMA_ADDR, offset 0x0028, 64-bit (R/W)
329 TEST_DMA_SIZE, offset 0x0030, 32-bit (R/W)
330 TEST_DMA_CTRL, offset 0x0034, 32-bit (R/W)
332 Reads to TEST_REG and TEST_REG64 will read a value equal to twice the last
333 value written to the register. The 32-bit and 64-bit versions are for testing
334 32-bit and 64-bit host accesses.
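A register access self-test based on the doubling behavior might look like the
Python sketch below. FakeTestRegs is a hypothetical stand-in for the device,
and the assumption that the doubled value wraps at the register width is not
stated in the text; test values that do not overflow avoid relying on it.

```python
TEST_REG = 0x0010
TEST_REG64 = 0x0018


class FakeTestRegs:
    """Hypothetical device model: reads return twice the last value written."""

    def __init__(self):
        self.last = {}

    def write(self, offset, value, bits):
        self.last[offset] = value & ((1 << bits) - 1)

    def read(self, offset, bits):
        # Assumption: the doubled value is truncated to the register width.
        return (self.last.get(offset, 0) * 2) & ((1 << bits) - 1)


def check_test_reg(dev, offset, bits, value):
    """Write `value`, read it back, and verify the device returned 2 * value."""
    dev.write(offset, value, bits)
    expected = (value * 2) & ((1 << bits) - 1)
    return dev.read(offset, bits) == expected
```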
A vector can be written to TEST_IRQ and the device will generate an interrupt
for that vector.
339 To test basic DMA operations, allocate a DMA-able host buffer and put the
340 buffer address into TEST_DMA_ADDR and size into TEST_DMA_SIZE. Then, write to
341 TEST_DMA_CTRL to manipulate the buffer contents. TEST_DMA_CTRL operations are:
343 operation value description
344 -----------------------------------------------------------
345 TEST_DMA_CTRL_CLEAR 1 clear buffer
346 TEST_DMA_CTRL_FILL 2 fill buffer bytes with 0x96
347 TEST_DMA_CTRL_INVERT 4 invert bytes in buffer
349 Various buffer address and sizes should be tested to verify no address boundary
350 issue exists. In particular, buffers that start on odd-8-byte boundary and/or
351 span multiple PAGE sizes should be tested.
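When writing a DMA self-test, it helps to compute the expected buffer contents
on the host and compare them with the buffer after each TEST_DMA_CTRL
operation. The Python sketch below computes those expected contents from the
operation table above; the function name is hypothetical.

```python
TEST_DMA_CTRL_CLEAR = 1
TEST_DMA_CTRL_FILL = 2
TEST_DMA_CTRL_INVERT = 4


def expected_after(op, buf):
    """Expected buffer contents after the device applies `op` to `buf`."""
    if op == TEST_DMA_CTRL_CLEAR:
        return bytes(len(buf))                  # all zeros
    if op == TEST_DMA_CTRL_FILL:
        return b"\x96" * len(buf)               # every byte 0x96
    if op == TEST_DMA_CTRL_INVERT:
        return bytes(b ^ 0xFF for b in buf)     # bitwise inversion per byte
    raise ValueError("unknown TEST_DMA_CTRL operation")
```

Running FILL followed by INVERT should leave every byte equal to 0x69, which
gives a quick check that both operations reached the whole buffer.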
Physical and Logical Ports
--------------------------
360 The switch supports up to 62 physical (front-panel) ports. Register
361 PORT_PHYS_COUNT returns the actual number of physical ports available:
363 PORT_PHYS_COUNT, offset 0x0304, 32-bit, (R)
In addition to front-panel ports, the switch supports logical ports for
tunnels.
368 Front-panel ports and logical tunnel ports are mapped into a single 32-bit port
369 space. A special CPU port is assigned port 0. The front-panel ports are
370 mapped to ports 1-62. A special loopback port is assigned port 63. Logical
tunnel ports are assigned ports 0x00010000-0x0001ffff.
372 To summarize the port assignments:
375 -------------------------------------------------------
376 0 CPU port (for packets to/from host CPU)
377 1-62 front-panel physical ports
380 0x00010000-0x0001ffff logical tunnel ports
381 0x00020000-0xffffffff RSVD
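The port space can be classified with a few range checks, as in the Python
sketch below. The function name and return strings are hypothetical. Ports
64-0x0000ffff are not described in the text above; treating them as reserved
is an assumption of this sketch.

```python
def classify_port(port):
    """Map a 32-bit port number to its role in the Rocker port space."""
    if not 0 <= port <= 0xFFFFFFFF:
        raise ValueError("port number must fit in 32 bits")
    if port == 0:
        return "cpu"
    if 1 <= port <= 62:
        return "front-panel"
    if port == 63:
        return "loopback"
    if 0x00010000 <= port <= 0x0001FFFF:
        return "tunnel"
    # 0x00020000-0xffffffff is RSVD; 64-0xffff is assumed reserved here.
    return "reserved"
```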
386 Switch front-panel ports operate in a mode. Currently, the only mode is
387 OF-DPA. OF-DPA[1] mode is based on OpenFlow Data Plane Abstraction (OF-DPA)
388 Abstract Switch Specification, Version 1.0, from Broadcom Corporation. To
389 set/get the mode for front-panel ports, see port settings, below.
394 Link status for all front-panel ports is available via PORT_PHYS_LINK_STATUS:
396 PORT_PHYS_LINK_STATUS, offset 0x0310, 64-bit, (R)
398 Value is port bitmap. Bits 0 and 63 always read 0. Bits 1-62
399 read 1 for link UP and 0 for link DOWN for respective front-panel ports.
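Decoding the link status bitmap is a matter of testing bits 1-62. A minimal
Python sketch, with a hypothetical function name:

```python
def link_up_ports(status):
    """Return the front-panel port numbers whose link bit is set."""
    # Bits 0 and 63 always read 0, so only bits 1-62 are examined.
    return [port for port in range(1, 63) if (status >> port) & 1]
```

For example, a PORT_PHYS_LINK_STATUS value of 0x6 means ports 1 and 2 are up.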
401 Other properties for front-panel ports are available via DMA CMD descriptors:
403 Get PORT_SETTINGS descriptor:
405 field width description
406 ----------------------------------------------
407 PORT_SETTINGS 2 CMD_GET
408 PPORT 4 Physical port #
410 Get PORT_SETTINGS completion:
412 field width description
413 ----------------------------------------------
414 PPORT 4 Physical port #
415 SPEED 4 Current port interface speed, in Mbps
416 DUPLEX 1 1 = Full, 0 = Half
417 AUTONEG 1 1 = enabled, 0 = disabled
418 MACADDR 6 Port MAC address
420 LEARNING 1 MAC address learning on port
423 PHYS_NAME <var> Physical port name (string)
425 Set PORT_SETTINGS descriptor:
427 field width description
428 ----------------------------------------------
429 PORT_SETTINGS 2 CMD_SET
430 PPORT 4 Physical port #
431 SPEED 4 Port interface speed, in Mbps
432 DUPLEX 1 1 = Full, 0 = Half
433 AUTONEG 1 1 = enabled, 0 = disabled
434 MACADDR 6 Port MAC address
440 Front-panel ports are initially disabled, which means port ingress and egress
441 packets will be dropped. To enable or disable a port, use PORT_PHYS_ENABLE:
443 PORT_PHYS_ENABLE: offset 0x0318, 64-bit, (R/W)
445 Value is bitmap of first 64 ports. Bits 0 and 63 are ignored
446 and always read as 0. Write 1 to enable port; write 0 to disable it.
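Since PORT_PHYS_ENABLE is a bitmap covering all ports, enabling or disabling a
single port is a read-modify-write of the register. The Python sketch below
computes the new register value from the current one; the names are
hypothetical and the actual register read and write are left to the caller.

```python
# Bits 1-62 correspond to front-panel ports; bits 0 and 63 are ignored.
FRONT_PANEL_MASK = sum(1 << port for port in range(1, 63))


def set_port_enabled(current, pport, enable):
    """Return the PORT_PHYS_ENABLE value with one port's bit changed."""
    if not 1 <= pport <= 62:
        raise ValueError("not a front-panel port")
    bit = 1 << pport
    new = (current | bit) if enable else (current & ~bit)
    return new & FRONT_PANEL_MASK
```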
450 SECTION 7: Switch Control
451 =========================
453 This section covers switch-wide register settings.
458 This register is used for low level control of the switch.
460 CONTROL: offset 0x0300, 32-bit, (W)
463 ------------------------------------------------------------------------
464 [0] CONTROL_RESET If set, device will perform reset
The switch has a SWITCH_ID to be used by software to uniquely identify the
switch:
473 SWITCH_ID: offset 0x0320, 64-bit, (R)
475 Value is opaque to switch software and no special encoding is implied.
481 Non-I/O asynchronous events from the device are notified to the host using the
482 event ring. The TLV structure for events is:
484 field width description
485 ---------------------------------------------------
486 TYPE 4 Event type, one of:
489 INFO <nest> Event info (details below)
494 When link status changes on a physical port, this event is generated.
496 field width description
497 ---------------------------------------------------
499 PPORT 4 Physical port
500 LINKUP 1 Link status:
507 When a packet ingresses on a port and the source MAC/VLAN isn't known to the
508 device, the device will generate this event. In response to the event, the
509 driver should install to the device the MAC/VLAN on the port into the bridge
510 table. Once installed, the MAC/VLAN is known on the port and this event will
511 no longer be generated.
513 field width description
514 ---------------------------------------------------
516 PPORT 4 Physical port
521 SECTION 9: CPU Packet Processing
522 ================================
524 Ingress packets directed to the host CPU for further processing are delivered
525 in the DMA RX ring. Likewise, host CPU originating packets destined to egress
526 on switch ports are scheduled by software using the DMA TX ring.
531 Software schedules packets for egress on switch ports using the DMA TX ring. A
532 TX descriptor buffer describes the packet location and size in host DMA-able
533 memory, the destination port, and any hardware-offload functions (such as L3
534 payload checksum offload). Software then bumps the descriptor head to signal
535 hardware of new Tx work. In response, hardware will DMA read Tx descriptors up
536 to head, DMA read descriptor buffer and packet data, perform offloading
functions, and finally frame packet on wire (network). Once packet processing
is complete, hardware will write back status to descriptor(s) to signal to
software that Tx is complete and software resources (e.g. skb) backing packet
can be released.
542 Figure 2 shows an example 3-fragment packet queued with one Tx descriptor. A
543 TLV is used for each packet fragment.
546 +–––––––+ +–+
549 +––––––––+ | | | |
550 Tx ring +–––+ +–––––+ | | |
551 +–––––––––+ | | TLVs | +–––––––+ |
552 | +–––+ +––––––––+ pkt frag 2 |
553 | desc 0 | | +–––––+ +–––––––+ |
554 +–––––––––+ | TLVs | +–––+ | |
555 head+–+ | +––––––––+ | | |
556 | desc 1 | | +–––––+ +–––––––+ |pkt
557 +–––––––––+ | TLVs | | |
558 | | +––––––––+ | pkt frag 3 |
559 | | | +–––––––+ |
560 +–––––––––+ +–––+ | |
563 +–––––––––+ | | |
566 +–––––––––+ | | |
567 | | +–––––––+ +–+
569 +–––––––––+
573 The TLVs for Tx descriptor buffer are:
575 field width description
576 ---------------------------------------------------------------------
577 PPORT 4 Destination physical port #
578 TX_OFFLOAD 1 Hardware offload modes:
580 1: insert IP csum (ipv4 only)
581 2: insert TCP/UDP csum
582 3: L3 csum calc and insert
583 into csum offset (TX_L3_CSUM_OFF)
584 16-bit 1's complement csum value.
585 IPv4 pseudo-header and IP
586 already calculated by OS
588 4: TSO (TCP Segmentation Offload)
589 TX_L3_CSUM_OFF 2 For L3 csum offload mode, the offset,
590 from the beginning of the packet,
591 of the csum field in the L3 header
592 TX_TSO_MSS 2 For TSO offload mode, the
593 Maximum Segment Size in bytes
594 TX_TSO_HDR_LEN 2 For TSO offload mode, the
595 length of ethernet, IP, and
596 TCP/UDP headers, including IP
598 TX_FRAGS <array> Packet fragments
599 TX_FRAG <nest> Packet fragment
600 TX_FRAG_ADDR 8 DMA address of packet fragment
601 TX_FRAG_LEN 2 Packet fragment length
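As a sketch of how the Tx TLVs nest, the Python below builds a descriptor
buffer for a multi-fragment packet. Several things here are assumptions and
not taken from this document: the TLV type numbers are made up, the TLV header
is assumed to be a 4-byte type, 2-byte length and 2 bytes of pad, and the
TX_FRAGS array is assumed to be encoded as a nest of TX_FRAG TLVs.

```python
import struct

# Hypothetical TLV type numbers, for illustration only.
T_PPORT, T_TX_FRAGS, T_TX_FRAG, T_TX_FRAG_ADDR, T_TX_FRAG_LEN = 1, 2, 3, 4, 5


def tlv(tlv_type, value):
    """8-byte LE header (assumed layout) plus value padded to 8 bytes."""
    pad = -len(value) % 8
    return struct.pack("<IH2x", tlv_type, len(value)) + value + bytes(pad)


def tx_buffer(pport, frags):
    """Build Tx TLVs for destination `pport` and a list of (addr, len) frags."""
    frag_tlvs = b"".join(
        tlv(T_TX_FRAG,
            tlv(T_TX_FRAG_ADDR, struct.pack("<Q", addr)) +
            tlv(T_TX_FRAG_LEN, struct.pack("<H", length)))
        for addr, length in frags
    )
    return tlv(T_PPORT, struct.pack("<I", pport)) + tlv(T_TX_FRAGS, frag_tlvs)
```

Under these assumptions, each fragment costs 40 bytes of descriptor buffer, so
the 3-fragment packet of Figure 2 needs 16 + 8 + 3 * 40 = 144 bytes before any
offload TLVs are added.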
603 Possible status return codes in descriptor on completion are:
606 --------------------------------------------------------------------
608 -ROCKER_ENXIO address or data read err on desc buf or packet
610 -ROCKER_EINVAL bad pport or TSO or csum offloading error
611 -ROCKER_ENOMEM no memory for internal staging tx fragment
Packets ingressing on switch ports that are not forwarded by the switch, but
rather directed to the host CPU for further processing, are delivered in the DMA
RX ring. Rx descriptor buffers are allocated by software and placed on the
ring. Hardware will fill Rx descriptor buffers with packet data, write the
completion, and signal to software that a new packet is ready. Since Rx packet
size is not known a priori, the Rx descriptor buffer must be allocated for
worst-case packet size. A single Rx descriptor will contain the entire Rx
packet data in one RX_FRAG. Other Rx TLVs describe any hardware offloads
performed on the packet, such as checksum validation.
626 The TLVs for Rx descriptor buffer are:
628 field width description
629 ---------------------------------------------------
630 PPORT 4 Source physical port #
631 RX_FLAGS 2 Packet parsing flags:
632 (1 << 0): IPv4 packet
633 (1 << 1): IPv6 packet
634 (1 << 2): csum calculated
635 (1 << 3): IPv4 csum good
636 (1 << 4): IP fragment
639 (1 << 7): TCP/UDP csum good
640 RX_CSUM 2 IP calculated checksum:
641 IPv4: IP payload csum
642 IPv6: header and payload csum
(Only valid if RX_FLAGS:csum calc is set)
644 RX_FRAG_ADDR 8 DMA address of packet fragment
645 RX_FRAG_MAX_LEN 2 Packet maximum fragment length
646 RX_FRAG_LEN 2 Actual packet fragment length after receive
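The RX_FLAGS bits listed above can be decoded with a lookup table, as in this
Python sketch. Only the bits that appear in the table are named; the flag
names and the function name are hypothetical.

```python
# Bit position -> name, for the RX_FLAGS bits listed in the table above.
RX_FLAG_NAMES = {
    0: "ipv4",
    1: "ipv6",
    2: "csum_calculated",
    3: "ipv4_csum_good",
    4: "ip_fragment",
    7: "tcp_udp_csum_good",
}


def decode_rx_flags(flags):
    """Return the names of the known flags set in a 2-byte RX_FLAGS value."""
    return [name for bit, name in sorted(RX_FLAG_NAMES.items())
            if (flags >> bit) & 1]


def rx_csum_valid(flags):
    """RX_CSUM is only meaningful when the csum-calculated flag is set."""
    return bool(flags & (1 << 2))
```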
648 Possible status return codes in descriptor on completion are:
651 --------------------------------------------------------------------
653 -ROCKER_ENXIO address or data read err on desc buf
654 -ROCKER_ENOMEM no memory for internal staging desc buf
655 -ROCKER_EMSGSIZE Rx descriptor buffer wasn't big enough to contain
656 packet data TLV and other TLVs.
SECTION 10: OF-DPA Mode
=======================
662 OF-DPA mode allows the switch to offload flow packet processing functions to
663 hardware. An OpenFlow controller would communicate with an OpenFlow agent
664 installed on the switch. The OpenFlow agent would (directly or indirectly)
665 communicate with the Rocker switch driver, which in turn would program switch
666 hardware with flow functionality, as defined in OF-DPA. The block diagram is:
668 +–––––––––––––––----–––+
670 | Remote Controller |
671 +––––––––+––----–––––––+
674 +––––––––+–––––––––+
677 +––––––––––––––––––+
680 +––––––––––––––––––+
682 +––––––––––––––––––+
685 +––––––––––––––––––+
To participate in flow functions, ports must be configured for OF-DPA mode
688 during switch initialization.
690 OF-DPA Flow Table Interface
691 ---------------------------
693 There are commands to add, modify, delete, and get stats of flow table entries.
694 The commands are issued using the DMA CMD descriptor ring. The following
695 commands are defined:
697 CMD_ADD: add an entry to flow table
698 CMD_MOD: modify an entry in flow table
699 CMD_DEL: delete an entry from flow table
700 CMD_GET_STATS: get stats for flow entry
702 TLVs for add and modify commands are:
704 field width description
705 ----------------------------------------------------
706 OF_DPA_CMD 2 CMD_[ADD|MOD]
707 OF_DPA_TBL 2 Flow table ID
712 40: multicast routing
715 OF_DPA_PRIORITY 4 Flow priority
716 OF_DPA_HARDTIME 4 Hard timeout for flow
717 OF_DPA_IDLETIME 4 Idle timeout for flow
718 OF_DPA_COOKIE 8 Cookie
720 Additional TLVs based on flow table ID:
722 Table ID 0: ingress port
724 field width description
725 ----------------------------------------------------
726 OF_DPA_IN_PPORT 4 ingress physical port number
727 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
731 field width description
732 ----------------------------------------------------
733 OF_DPA_IN_PPORT 4 ingress physical port number
734 OF_DPA_VLAN_ID 2 (N) vlan ID
735 OF_DPA_VLAN_ID_MASK 2 (N) vlan ID mask
736 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
737 OF_DPA_NEW_VLAN_ID 2 (N) new vlan ID
739 Table ID 20: termination mac
741 field width description
742 ----------------------------------------------------
743 OF_DPA_IN_PPORT 4 ingress physical port number
744 OF_DPA_IN_PPORT_MASK 4 ingress physical port number mask
745 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
746 OF_DPA_DST_MAC 6 (N) destination MAC
747 OF_DPA_DST_MAC_MASK 6 (N) destination MAC mask
748 OF_DPA_VLAN_ID 2 (N) vlan ID
749 OF_DPA_VLAN_ID_MASK 2 (N) vlan ID mask
750 OF_DPA_GOTO_TBL 2 only acceptable values are
751 unicast or multicast routing
753 OF_DPA_OUT_PPORT 2 if specified, must be
754 controller, set zero otherwise
756 Table ID 30: unicast routing
758 field width description
759 ----------------------------------------------------
760 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
761 OF_DPA_DST_IP 4 (N) destination IPv4 address.
762 Must be unicast address
763 OF_DPA_DST_IP_MASK 4 (N) IP mask. Must be prefix mask
764 OF_DPA_DST_IPV6 16 (N) destination IPv6 address.
765 Must be unicast address
766 OF_DPA_DST_IPV6_MASK 16 (N) IPv6 mask. Must be prefix mask
767 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
768 OF_DPA_GROUP_ID 4 data for GROUP action must
769 be an L3 Unicast group entry
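The "must be prefix mask" requirement can be checked before a flow is
submitted. The Python sketch below tests whether an IPv4 mask is a contiguous
run of leading one bits and builds the (N) encoding of a mask from a prefix
length; the function names are hypothetical.

```python
import struct


def is_prefix_mask(mask, bits=32):
    """True if `mask` is leading ones followed only by zeros."""
    inverted = ~mask & ((1 << bits) - 1)
    # A prefix mask inverts to 0...01...1, which plus one is a power of 2.
    return (inverted & (inverted + 1)) == 0


def network_order_mask(prefix_len):
    """4-byte network-byte-order IPv4 mask for a prefix length of 0-32."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be 0-32")
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return struct.pack("!I", mask)
```

The same check works for IPv6 masks by passing bits=128 with the mask as a
128-bit integer.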
771 Table ID 40: multicast routing
773 field width description
774 ----------------------------------------------------
775 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
776 OF_DPA_VLAN_ID 2 (N) vlan ID
777 OF_DPA_SRC_IP 4 (N) source IPv4. Optional,
778 can contain IPv4 address,
779 must be completely masked
781 OF_DPA_SRC_IP_MASK 4 (N) IP Mask
782 OF_DPA_DST_IP 4 (N) destination IPv4 address.
783 Must be multicast address
784 OF_DPA_SRC_IPV6 16 (N) source IPv6 Address. Optional.
785 Can contain IPv6 address,
786 must be completely masked
788 OF_DPA_SRC_IPV6_MASK 16 (N) IPv6 mask.
OF_DPA_DST_IPV6 16 (N) destination IPv6 address.
Must be multicast address
792 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
793 OF_DPA_GROUP_ID 4 data for GROUP action must
794 be an L3 multicast group entry
796 Table ID 50: bridging
798 field width description
799 ----------------------------------------------------
800 OF_DPA_VLAN_ID 2 (N) vlan ID
801 OF_DPA_TUNNEL_ID 4 tunnel ID
802 OF_DPA_DST_MAC 6 (N) destination MAC
803 OF_DPA_DST_MAC_MASK 6 (N) destination MAC mask
804 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
805 OF_DPA_GROUP_ID 4 data for GROUP action must
806 be a L2 Interface, L2
808 or L2 Overlay group entry
810 OF_DPA_TUNNEL_LPORT 4 unicast Tenant Bridging
811 flows specify a tunnel
813 OF_DPA_OUT_PPORT 2 data for OUTPUT action,
814 restricted to CONTROLLER,
817 Table ID 60: acl policy
819 field width description
820 ----------------------------------------------------
821 OF_DPA_IN_PPORT 4 ingress physical port number
822 OF_DPA_IN_PPORT_MASK 4 ingress physical port number mask
823 OF_DPA_ETHERTYPE 2 (N) ethertype
824 OF_DPA_VLAN_ID 2 (N) vlan ID
825 OF_DPA_VLAN_ID_MASK 2 (N) vlan ID mask
826 OF_DPA_VLAN_PCP 2 (N) vlan Priority Code Point
827 OF_DPA_VLAN_PCP_MASK 2 (N) vlan Priority Code Point mask
828 OF_DPA_SRC_MAC 6 (N) source MAC
829 OF_DPA_SRC_MAC_MASK 6 (N) source MAC mask
830 OF_DPA_DST_MAC 6 (N) destination MAC
831 OF_DPA_DST_MAC_MASK 6 (N) destination MAC mask
832 OF_DPA_TUNNEL_ID 4 tunnel ID
833 OF_DPA_SRC_IP 4 (N) source IPv4. Optional,
834 can contain IPv4 address,
835 must be completely masked
837 OF_DPA_SRC_IP_MASK 4 (N) IP Mask
838 OF_DPA_DST_IP 4 (N) destination IPv4 address.
839 Must be multicast address
840 OF_DPA_DST_IP_MASK 4 (N) IP Mask
841 OF_DPA_SRC_IPV6 16 (N) source IPv6 Address. Optional.
842 Can contain IPv6 address,
843 must be completely masked
845 OF_DPA_SRC_IPV6_MASK 16 (N) IPv6 mask
846 OF_DPA_DST_IPV6 16 (N) destination IPv6 Address. Must
847 be multicast address.
848 OF_DPA_DST_IPV6_MASK 16 (N) IPv6 mask
849 OF_DPA_SRC_ARP_IP 4 (N) source IPv4 address in the ARP
850 payload. Only used if ethertype
852 OF_DPA_SRC_ARP_IP_MASK 4 (N) IP Mask
853 OF_DPA_IP_PROTO 1 IP protocol
854 OF_DPA_IP_PROTO_MASK 1 IP protocol mask
855 OF_DPA_IP_DSCP 1 DSCP
856 OF_DPA_IP_DSCP_MASK 1 DSCP mask
858 OF_DPA_IP_ECN_MASK 1 ECN mask
859 OF_DPA_L4_SRC_PORT 2 (N) L4 source port, only for
861 OF_DPA_L4_SRC_PORT_MASK 2 (N) L4 source port mask
OF_DPA_L4_DST_PORT 2 (N) L4 destination port, only for
OF_DPA_L4_DST_PORT_MASK 2 (N) L4 destination port mask
865 OF_DPA_ICMP_TYPE 1 ICMP type, only if IP
867 OF_DPA_ICMP_TYPE_MASK 1 ICMP type mask
868 OF_DPA_ICMP_CODE 1 ICMP code
869 OF_DPA_ICMP_CODE_MASK 1 ICMP code mask
870 OF_DPA_IPV6_LABEL 4 (N) IPv6 flow label
871 OF_DPA_IPV6_LABEL_MASK 4 (N) IPv6 flow label mask
872 OF_DPA_GROUP_ID 4 data for GROUP action
873 OF_DPA_QUEUE_ID_ACTION 1 write the queue ID
874 OF_DPA_NEW_QUEUE_ID 1 queue ID
875 OF_DPA_VLAN_PCP_ACTION 1 write the VLAN priority
876 OF_DPA_NEW_VLAN_PCP 1 VLAN priority
877 OF_DPA_IP_DSCP_ACTION 1 write the DSCP
878 OF_DPA_NEW_IP_DSCP 1 new DSCP
OF_DPA_TUNNEL_LPORT 4 restrict to valid tunnel
880 logical port, set to 0
882 OF_DPA_OUT_PPORT 2 data for OUTPUT action,
883 restricted to CONTROLLER,
885 OF_DPA_CLEAR_ACTIONS 4 if 1 packets matching flow are
886 dropped (all other instructions
889 TLVs for flow delete and get stats command are:
891 field width description
892 ---------------------------------------------------
893 OF_DPA_CMD 2 CMD_[DEL|GET_STATS]
894 OF_DPA_COOKIE 8 Cookie
On completion of get stats command, the descriptor buffer is written back with
the following TLVs:
899 field width description
900 ---------------------------------------------------
901 OF_DPA_STAT_DURATION 4 Flow duration
902 OF_DPA_STAT_RX_PKTS 8 Received packets
903 OF_DPA_STAT_TX_PKTS 8 Transmit packets
905 Possible status return codes in descriptor on completion are:
907 DESC_COMP_ERR command reason
908 --------------------------------------------------------------------
910 -ROCKER_EFAULT all head or tail index outside
912 -ROCKER_ENXIO all address or data read err on
914 -ROCKER_EMSGSIZE GET_STATS cmd descriptor buffer wasn't
915 big enough to contain write-back
917 -ROCKER_EINVAL all invalid parameters passed in
918 -ROCKER_EEXIST ADD entry already exists
919 -ROCKER_ENOSPC ADD no space left in flow table
920 -ROCKER_ENOENT MOD|DEL|GET_STATS cookie invalid
922 Group Table Interface
923 ---------------------
925 There are commands to add, modify, delete, and get stats of group table
926 entries. The commands are issued using the DMA CMD descriptor ring. The
927 following commands are defined:
929 CMD_ADD: add an entry to group table
930 CMD_MOD: modify an entry in group table
931 CMD_DEL: delete an entry from group table
932 CMD_GET_STATS: get stats for group entry
934 TLVs for add and modify commands are:
936 field width description
937 -----------------------------------------------------------
938 FLOW_GROUP_CMD 2 CMD_[ADD|MOD]
939 FLOW_GROUP_ID 2 Flow group ID
940 FLOW_GROUP_TYPE 1 Group type:
950 FLOW_VLAN_ID 2 Vlan ID (types 0, 3, 4, 6)
951 FLOW_L2_PORT 2 Port (types 0)
952 FLOW_INDEX 4 Index (all types but 0)
953 FLOW_OVERLAY_TYPE 1 Overlay sub-type (type 8):
954 0: Flood unicast tunnel
955 1: Flood multicast tunnel
956 2: Multicast unicast tunnel
957 3: Multicast multicast tunnel
958 FLOW_GROUP_ACTION nest
959 FLOW_GROUP_ID 2 next group ID in chain (all
961 FLOW_OUT_PORT 4 egress port (types 0, 8)
962 FLOW_POP_VLAN_TAG 1 strip outer VLAN tag (type 1
964 FLOW_VLAN_ID 2 (types 1, 5)
965 FLOW_SRC_MAC 6 (types 1, 2, 5)
966 FLOW_DST_MAC 6 (types 1, 2)
TLVs for group delete and get stats command are:
970 field width description
971 -----------------------------------------------------------
972 FLOW_GROUP_CMD 2 CMD_[DEL|GET_STATS]
973 FLOW_GROUP_ID 2 Flow group ID
On completion of get stats command, the descriptor buffer is written back with
the following TLVs:
978 field width description
979 ---------------------------------------------------
980 FLOW_GROUP_ID 2 Flow group ID
981 FLOW_STAT_DURATION 4 Flow duration
982 FLOW_STAT_REF_COUNT 4 Flow reference count
983 FLOW_STAT_BUCKET_COUNT 4 Flow bucket count
985 Possible status return codes in descriptor on completion are:
987 DESC_COMP_ERR command reason
988 --------------------------------------------------------------------
990 -ROCKER_EFAULT all head or tail index outside
992 -ROCKER_ENXIO all address or data read err on
-ROCKER_EMSGSIZE GET_STATS cmd descriptor buffer wasn't
995 big enough to contain write-back
997 -ROCKER_EINVAL ADD|MOD invalid parameters passed in
998 -ROCKER_EEXIST ADD entry already exists
-ROCKER_ENOSPC ADD no space left in group table
1000 -ROCKER_ENOENT MOD|DEL|GET_STATS group ID invalid
1001 -ROCKER_EBUSY DEL group reference count non-zero
1002 -ROCKER_ENODEV ADD next group ID doesn't exist
1009 [1] OpenFlow Data Plane Abstraction (OF-DPA) Abstract Switch Specification,
1010 Version 1.0, from Broadcom Corporation, February 21, 2014.