<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta name="GENERATOR" content="Microsoft FrontPage 6.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title></title>
</head>

<body>
<p>This section documents the internals of the Netgroup Packet Filter (NPF), the kernel
portion of WinPcap. Normal users are probably interested in how to use WinPcap
rather than in its internal structure, so the information in this module is intended
mainly for WinPcap developers and maintainers, or for people interested in how the
driver works. In particular, a good knowledge of operating systems, networking, and
Win32 kernel programming and device driver development is needed to read this
section profitably.</p>
<p>NPF is the WinPcap component that does the hard work: it processes the packets
that transit on the network and exports capture, injection and analysis
capabilities to user level.</p>
<p>The following paragraphs describe the interaction of NPF with the
OS and its basic structure.</p>
<h2>NPF and NDIS</h2>
<p>NDIS (Network Driver Interface Specification) is a standard that defines the
communication between a network adapter (or, better, the driver that manages it)
and the protocol drivers (which implement, for example, TCP/IP). The main purpose
of NDIS is to act as a wrapper that allows protocol drivers to send and receive packets
on a network (LAN or WAN) without caring about either the particular adapter or the
particular Win32 operating system.</p>
<p>NDIS supports three types of network drivers:</p>
<ol>
<li><strong>Network interface card or NIC drivers</strong>. NIC drivers
directly manage network interface cards, referred to as NICs. The NIC
drivers interface directly to the hardware at their lower edge and at their
upper edge present an interface that allows upper layers to send packets on the
network, to handle interrupts, to reset the NIC, to halt the NIC and to
query and set the operational characteristics of the driver. NIC drivers can
be either miniports or legacy full NIC drivers.
<ul>
<li>Miniport drivers implement only the hardware-specific operations
necessary to manage a NIC, including sending and receiving data on the
NIC. Operations common to all lowest-level NIC drivers, such as
synchronization, are provided by NDIS. Miniports do not call operating
system routines directly; their interface to the operating system is
NDIS.<br>
A miniport does not keep track of bindings. It merely passes packets up
to NDIS, and NDIS makes sure that these packets are passed to the correct
protocols.</li>
<li>Full NIC drivers perform both the hardware-specific
operations and all the synchronization and queuing operations usually
done by NDIS. Full NIC drivers, for instance, maintain their own binding
information for indicating received data.</li>
</ul>
</li>
<li><strong>Intermediate drivers</strong>. Intermediate drivers interface
between an upper-level driver, such as a protocol driver, and a miniport. To
the upper-level driver, an intermediate driver looks like a miniport. To a
miniport, the intermediate driver looks like a protocol driver. An
intermediate driver can layer on top of another intermediate driver,
although such layering could have a negative effect on system performance. A
typical reason for developing an intermediate driver is to perform media
translation between an existing legacy protocol driver and a miniport that
manages a NIC for a new media type unknown to the protocol driver. For
instance, an intermediate driver could translate from LAN protocol to ATM
protocol. An intermediate driver cannot communicate with user-mode
applications, but only with other NDIS drivers.</li>
<li><b>Transport drivers or protocol drivers</b>. A protocol driver implements
a network protocol stack such as IPX/SPX or TCP/IP, offering its services
over one or more network interface cards. A protocol driver services
application-layer clients at its upper edge and connects to one or more NIC
driver(s) or intermediate NDIS driver(s) at its lower edge.</li>
</ol>
<p>NPF is implemented as a protocol driver. This is not the best possible choice
from the performance point of view, but it allows reasonable independence from the
MAC layer as well as complete access to the raw traffic.</p>
<p>Notice that the various Win32 operating systems have different versions of
NDIS: NPF is NDIS 5 compliant under Windows 2000 and its derivations (like
Windows XP), and NDIS 3 compliant on the other Win32 platforms.</p>
<p>The next figure shows the position of NPF inside the NDIS stack:</p>
<p align="center"><img border="0" src="npf-ndis.gif"></p>
<p align="center"><b>Figure 1: NPF inside NDIS.</b></p>
<p>The interaction with the OS is normally asynchronous. This means that the
driver provides a set of callback functions that are invoked by the system when
some operation is requested of NPF. NPF exports callback functions for all the I/O operations of the
applications: open, close, read, write, ioctl, etc.</p>
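<p>As a rough illustration of this first kind of interaction, the following sketch shows how
a Windows kernel driver exposes those I/O entry points to the system. The routine names are
illustrative, not the actual NPF symbols; only the general WDM mechanism is shown.</p>
<pre>
/* Sketch: how a kernel driver exports the callbacks that the I/O manager
   invokes when an application opens, closes, reads, writes or issues an
   IOCTL on the driver's device file. Routine names are illustrative. */
#include &lt;ntddk.h&gt;

DRIVER_DISPATCH MyOpen, MyClose, MyRead, MyWrite, MyIoControl;

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);

    /* CreateFile()  -&gt; MyOpen,    CloseHandle() -&gt; MyClose,
       ReadFile()    -&gt; MyRead,    WriteFile()   -&gt; MyWrite,
       DeviceIoControl()           -&gt; MyIoControl               */
    DriverObject-&gt;MajorFunction[IRP_MJ_CREATE]         = MyOpen;
    DriverObject-&gt;MajorFunction[IRP_MJ_CLOSE]          = MyClose;
    DriverObject-&gt;MajorFunction[IRP_MJ_READ]           = MyRead;
    DriverObject-&gt;MajorFunction[IRP_MJ_WRITE]          = MyWrite;
    DriverObject-&gt;MajorFunction[IRP_MJ_DEVICE_CONTROL] = MyIoControl;

    /* The device objects (one per adapter) are created elsewhere. */
    return STATUS_SUCCESS;
}
</pre>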
<p>The interaction with NDIS is asynchronous as well: events
like the arrival of a new packet are notified to NPF through a callback
function (Packet_tap() in this case). Furthermore, the interaction with NDIS and
the NIC driver always takes place by means of non-blocking functions: when NPF invokes an
NDIS function, the call returns immediately; when the processing ends, NDIS invokes
a specific NPF callback to signal that the function has finished. The
driver exports a callback for every low-level operation, like sending packets,
setting or requesting parameters on the NIC, etc.</p>
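<p>The set of NDIS callbacks is declared when the protocol driver registers with NDIS.
The following is a compressed sketch of the standard NDIS 5 registration; the handler
names (apart from the role of Packet_tap() as receive handler) are illustrative and do not
necessarily match the real NPF sources.</p>
<pre>
/* Sketch: NDIS 5 protocol registration. The *CompleteHandler fields are the
   callbacks that NDIS invokes when an asynchronous operation finishes; the
   ReceiveHandler is called for every incoming packet (the role played by
   Packet_tap() in NPF). Handler names are illustrative and assumed to be
   declared elsewhere. */
#include &lt;ndis.h&gt;

NDIS_HANDLE g_ProtocolHandle;

VOID RegisterWithNdis(VOID)
{
    NDIS_STATUS                   Status;
    NDIS_PROTOCOL_CHARACTERISTICS ProtoChar;
    NDIS_STRING                   ProtoName = NDIS_STRING_CONST("PacketDriver");

    NdisZeroMemory(&amp;ProtoChar, sizeof(ProtoChar));

    ProtoChar.MajorNdisVersion            = 5;
    ProtoChar.MinorNdisVersion            = 0;
    ProtoChar.Name                        = ProtoName;

    ProtoChar.OpenAdapterCompleteHandler  = MyOpenAdapterComplete;
    ProtoChar.CloseAdapterCompleteHandler = MyCloseAdapterComplete;
    ProtoChar.SendCompleteHandler         = MySendComplete;
    ProtoChar.TransferDataCompleteHandler = MyTransferDataComplete;
    ProtoChar.ResetCompleteHandler        = MyResetComplete;
    ProtoChar.RequestCompleteHandler      = MyRequestComplete;
    ProtoChar.ReceiveHandler              = MyReceiveHandler;   /* per-packet tap */
    ProtoChar.ReceiveCompleteHandler      = MyReceiveComplete;
    ProtoChar.StatusHandler               = MyStatusHandler;
    ProtoChar.StatusCompleteHandler       = MyStatusComplete;
    ProtoChar.BindAdapterHandler          = MyBindAdapter;
    ProtoChar.UnbindAdapterHandler        = MyUnbindAdapter;

    NdisRegisterProtocol(&amp;Status, &amp;g_ProtocolHandle,
                         &amp;ProtoChar, sizeof(ProtoChar));
}
</pre>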
<h2>NPF structure basics</h2>
<p>The next figure shows the structure of WinPcap, with particular reference to the
NPF driver.</p>
<p align="center"><img border="0" src="npf-npf.gif" width="500" height="412"></p>
<p align="center"><b>Figure 2: NPF device driver.</b></p>
<p>NPF is able to
perform a number of different operations: capture, monitoring, dump to disk and
packet injection. The following paragraphs briefly describe each of these
operations.</p>
<h4>Packet Capture</h4>
<p>The most important operation of NPF is packet capture.
During a capture, the driver sniffs the packets using a network interface and delivers them intact to the
user-level applications.
</p>
<p>The capture process relies on two main components:</p>
<ul>
<li>
<p>A packet filter that decides whether an
incoming packet has to be accepted and copied to the listening application.
Most applications using NPF reject far more packets than they accept,
therefore a versatile and efficient packet filter is critical for good
overall performance. A packet filter is a function with boolean output
that is applied to a packet. If the value of the function is true the
capture driver copies
the packet to the application; if it is false the packet is discarded. The NPF
packet filter is a bit more complex, because it determines not only whether the
packet should be kept, but also the number of bytes to keep. The filtering
system adopted by NPF derives from the <b>BSD Packet Filter</b> (BPF), a
virtual processor able to execute filtering programs expressed in a
pseudo-assembler and created at user level. The application takes a user-defined filter (e.g. “pick up all UDP packets”)
and, using wpcap.dll, compiles it into a BPF program (e.g. “if the
packet is IP and the <i>protocol type</i> field is equal to 17, then return
true”). Then, the application uses the <i>BIOCSETF</i>
IOCTL to inject the filter into the kernel; a user-level sketch of this is shown
after this list. At this point, the program
is executed for every incoming packet, and only the conformant packets are
accepted. Unlike traditional solutions, NPF does not <i>interpret</i>
the filters, but <i>executes</i> them. For performance reasons, before using the
filter NPF feeds it to a JIT compiler that translates it into a native 80x86
function. When a packet is captured, NPF calls this native function instead
of invoking the filter interpreter, and this makes the process very fast.
The concept behind this optimization is very similar to the one used by Java
JIT compilers.</p></li>
<li>
<p>A circular buffer to store the
packets and avoid loss. A packet is stored in the buffer with a header that
maintains information like the timestamp and the size of the packet.
Moreover, alignment padding is inserted between the packets in order to
speed up access to their data by the applications. Groups of packets can be copied
with a single operation from the NPF buffer to the applications. This
improves performance because it minimizes the number of reads. If the
buffer is full when a new packet arrives, the packet is discarded and
hence lost. Both the kernel and the user buffer can be
resized at runtime for maximum versatility: packet.dll and wpcap.dll provide functions for this purpose.</p></li>
</ul>
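<p>The following minimal user-level sketch shows the mechanism just described, using the
standard WinPcap/libpcap API exported by wpcap.dll. The adapter name is a placeholder and
the "udp" filter string is just an example.</p>
<pre>
/* Minimal capture sketch with the standard WinPcap/libpcap API.
   The adapter name is a placeholder; the "udp" filter is an example. */
#include &lt;pcap.h&gt;
#include &lt;stdio.h&gt;

static void packet_handler(u_char *user, const struct pcap_pkthdr *h,
                           const u_char *bytes)
{
    (void)user; (void)bytes;
    printf("captured %u bytes (%u on the wire)\n", h-&gt;caplen, h-&gt;len);
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    struct bpf_program fcode;

    /* 65536-byte snapshot, promiscuous mode, 1000 ms read timeout */
    pcap_t *fp = pcap_open_live("\\Device\\NPF_{ADAPTER-GUID}",
                                65536, 1, 1000, errbuf);
    if (fp == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return 1;
    }

    /* Compile "pick up all UDP packets" into a BPF program and inject it
       into NPF (wpcap.dll uses the BIOCSETF IOCTL internally). */
    if (pcap_compile(fp, &amp;fcode, "udp", 1, 0xffffff) &lt; 0 ||
        pcap_setfilter(fp, &amp;fcode) &lt; 0) {
        fprintf(stderr, "filter error: %s\n", pcap_geterr(fp));
        return 1;
    }

    pcap_loop(fp, 10, packet_handler, NULL);   /* capture 10 packets */
    pcap_close(fp);
    return 0;
}
</pre>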
<p>The size of the user buffer is very
important because it determines the <i>maximum</i> amount of data that can be
copied from kernel space to user space within a single system call. On the other
hand, the <i>minimum</i> amount of data that can be copied
in a single call is extremely important as well. With a large value for this
variable, the kernel waits for the arrival of several packets before copying the
data to the user. This guarantees a low number of system calls, i.e. low
processor usage, which is a good setting for applications like sniffers. With a
small value, the kernel copies the packets as soon as
the application is ready to receive them. This is excellent for real-time
applications (like, for example, ARP redirectors or bridges) that need the best
responsiveness from the kernel.
From this point of view, NPF has a configurable behavior that allows users to choose between
best efficiency and best responsiveness (or any intermediate setting).</p>
<p>The wpcap library includes a couple of calls that can be used to set both the timeout after
which a read expires and the minimum amount of data that can be transferred to
the application. By default, the read timeout is 1 second, and the minimum
amount of data copied between the kernel and the application is 16K.</p>
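<p>These parameters are exposed by the WinPcap-specific extensions of wpcap.dll. The sketch
below continues the capture example above (fp is the pcap_t* returned by pcap_open_live(),
whose last-but-one argument is the read timeout); the values are arbitrary examples.</p>
<pre>
/* Tuning sketch using WinPcap-specific wpcap.dll extensions. */
#include &lt;pcap.h&gt;

/* 'fp' is the handle opened in the previous example; values are examples. */
void tune_buffers(pcap_t *fp)
{
    /* Resize the kernel (NPF) circular buffer to 2 MB. */
    pcap_setbuff(fp, 2 * 1024 * 1024);

    /* Minimum amount of data in the kernel buffer that triggers the copy
       to the application: a small value favors responsiveness, a larger
       one (the default is 16K) reduces the number of system calls. */
    pcap_setmintocopy(fp, 0);
}
</pre>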
<h4>Packet injection</h4>
<p>NPF allows applications to write raw packets to the network. To send data, a
user-level application performs a WriteFile() system call on the NPF device file. The data is sent to the network as is, without being encapsulated in
any protocol, therefore the application has to build the various headers
of each packet. The application usually does not need to generate the FCS
because it is calculated by the network adapter hardware and attached
automatically to the end of a packet before sending it to the network.</p>
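<p>From user level the simplest way to reach this feature is pcap_sendpacket() in wpcap.dll,
which ends up in the WriteFile() call described above. A minimal sketch follows; the Ethernet
frame content is purely illustrative.</p>
<pre>
/* Raw injection sketch: the application builds every header itself.
   MAC addresses and payload are purely illustrative. */
#include &lt;pcap.h&gt;
#include &lt;string.h&gt;

int send_one(pcap_t *fp)
{
    u_char frame[64];
    memset(frame, 0, sizeof(frame));

    /* Ethernet header: destination MAC, source MAC, EtherType */
    memset(frame, 0xff, 6);                               /* broadcast dst  */
    memcpy(frame + 6, "\x02\x00\x00\x00\x00\x01", 6);     /* example src    */
    frame[12] = 0x08; frame[13] = 0x00;                   /* EtherType IPv4 */

    /* ... IP/UDP headers and payload would be built here ... */

    /* The frame is sent as is; the FCS is appended by the NIC hardware. */
    return pcap_sendpacket(fp, frame, (int)sizeof(frame));
}
</pre>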
<p>In normal situations, the sending rate of packets to the network is not
very high because a system call is needed for each packet. For this reason,
the possibility to send a single packet more than once with a single write
system call has been added. The user-level application can set, with an IOCTL
call (code pBIOCSWRITEREP), the number of times a single packet will be
repeated: for example, if this value is set to 1000, every raw packet written by
the application to the driver's device file will be sent 1000 times. This
feature can be used to generate high-speed traffic for testing purposes: the
overhead of context switches is no longer present, so performance is remarkably
better.</p>
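<p>This feature is reachable from user level through packet.dll. A sketch of its use, assuming
an LPADAPTER handle already opened with PacketOpenAdapter() and an already built frame, could
look like the following:</p>
<pre>
/* Repeated-write sketch through packet.dll. The adapter handle and the
   frame are assumed to exist already; names and values are examples. */
#include &lt;windows.h&gt;
#include &lt;packet32.h&gt;

void send_repeated(LPADAPTER adapter, unsigned char *frame, int len)
{
    LPPACKET pkt;

    /* Ask the driver to repeat each written packet 1000 times
       (presumably via the pBIOCSWRITEREP IOCTL described above). */
    PacketSetNumWrites(adapter, 1000);

    pkt = PacketAllocatePacket();
    PacketInitPacket(pkt, frame, len);

    /* A single write on the device file: the frame hits the wire 1000 times. */
    PacketSendPacket(adapter, pkt, TRUE);

    PacketFreePacket(pkt);
}
</pre>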
<h4>Network monitoring</h4>
<p>WinPcap offers a kernel-level programmable monitoring
module, able to calculate simple statistics on the network traffic. The
idea behind this module is shown in Figure
2: the statistics can be gathered without the need to copy the packets to
the application, which simply receives and displays the results obtained from the
monitoring engine. This avoids a large part of the capture overhead in
terms of memory and CPU clocks.</p>
<p>The monitoring engine is
made of a <i>classifier</i> followed by a <i>counter</i>. The packets are
classified using the filtering engine of NPF, which provides a configurable way
to select a subset of the traffic. The data that passes the filter goes to the
counter, which keeps some variables like the number of packets and
the amount of bytes accepted by the filter, and updates them with the data of the
incoming packets. These variables are passed to the user-level application at
regular intervals whose period can be configured by the user. No buffers are
allocated at kernel or user level.</p>
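<p>At user level, the monitoring engine is reachable through the WinPcap-specific
pcap_setmode() extension of wpcap.dll. The following sketch assumes the usual description of
statistical mode, in which the callback receives two 64-bit counters (packets and bytes that
passed the filter during the last interval) instead of packet data, and the sampling period
is the read timeout passed to pcap_open_live(); the filter string is just an example.</p>
<pre>
/* Statistical-mode sketch (WinPcap extension): the kernel counts, the
   application only receives the results at every read timeout. */
#include &lt;pcap.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

static void stats_handler(u_char *user, const struct pcap_pkthdr *h,
                          const u_char *data)
{
    long long packets, bytes;
    (void)user;

    /* In MODE_STAT the "packet data" is two 64-bit counters. */
    memcpy(&amp;packets, data, 8);
    memcpy(&amp;bytes,   data + 8, 8);
    printf("%ld.%06ld: %I64d packets, %I64d bytes\n",
           (long)h-&gt;ts.tv_sec, (long)h-&gt;ts.tv_usec, packets, bytes);
}

/* 'fp' is a pcap_t* opened with pcap_open_live(); the 1000 ms read
   timeout used there becomes the sampling interval. */
void monitor(pcap_t *fp)
{
    struct bpf_program fcode;

    pcap_compile(fp, &amp;fcode, "tcp port 80", 1, 0xffffff);  /* web traffic only */
    pcap_setfilter(fp, &amp;fcode);

    pcap_setmode(fp, MODE_STAT);          /* switch NPF to monitoring mode */
    pcap_loop(fp, 0, stats_handler, NULL);
}
</pre>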
<h4>Dump to disk</h4>
<p>The dump to disk
capability can be used to save the network data to disk directly from kernel
mode.
</p>
<p align="center"><img border="0" src="npf-dump.gif" width="400" height="187">
</p>
<p align="center"><b>Figure 3: packet capture versus kernel-level dump.</b>
</p>
<p>In
traditional systems, the path covered by the packets that are saved to disk is
the one followed by the black arrows in Figure
3: every packet is copied several times, and normally 4 buffers are
allocated: the one of the capture driver, the one in the application that keeps
the captured data, the one of the stdio functions (or similar) that are used by
the application to write to the file, and finally the one of the file system.
</p>
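<p>The traditional path corresponds, at user level, to the standard pcap_dump_open()/pcap_dump()
calls of wpcap.dll, as in this minimal sketch:</p>
<pre>
/* Traditional user-level dump: every packet crosses the kernel buffer,
   the application buffer and the stdio/file-system buffers. */
#include &lt;pcap.h&gt;

int dump_to_file(pcap_t *fp, const char *filename, int count)
{
    pcap_dumper_t *dumper = pcap_dump_open(fp, filename);
    if (dumper == NULL)
        return -1;

    /* pcap_dump() is itself a valid pcap_handler: it appends every
       captured packet, with its header, to the libpcap-format file. */
    pcap_loop(fp, count, pcap_dump, (u_char *)dumper);

    pcap_dump_close(dumper);
    return 0;
}
</pre>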
<p>When the
kernel-level traffic logging feature of NPF is enabled, the capture driver
addresses the file system directly, hence the path covered by the packets is the
one of the red dotted arrow: only two buffers and a single copy are necessary,
the number of system calls is drastically reduced, and therefore the performance is
considerably better.
</p>
<p>The current
implementation dumps the packets to disk in the widely used libpcap format. It also gives
the possibility to filter the traffic before the dump process, in order to
select the packets that will go to the disk.
</p>
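<p>The kernel-level dump is exposed by the WinPcap-specific pcap_live_dump() extension of
wpcap.dll. A sketch, with arbitrary file name and limits, could look like this:</p>
<pre>
/* Kernel-level dump sketch (WinPcap extension): NPF itself writes the
   libpcap-format file. File name and limits are arbitrary examples. */
#include &lt;pcap.h&gt;

int kernel_dump(pcap_t *fp)
{
    /* Start dumping directly from kernel mode; stop after 50 MB
       or 100000 packets, whichever comes first. */
    if (pcap_live_dump(fp, "capture.dump", 50 * 1024 * 1024, 100000) &lt; 0)
        return -1;

    /* Wait until one of the limits above has been reached. */
    pcap_live_dump_ended(fp, 1);
    return 0;
}
</pre>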
<h2>Further reading</h2>
<p>The structure of NPF and its filtering engine derive directly from those of
the BSD Packet Filter (BPF), so if you are interested in the subject you can read
the following papers:</p>
<p>- S. McCanne and V. Jacobson, <a href="ftp://ftp.ee.lbl.gov/papers/bpf-usenix93.ps.Z">The
BSD Packet Filter: A New Architecture for User-level Packet Capture</a>.
Proceedings of the 1993 Winter USENIX Technical Conference (San Diego, CA, Jan.
1993), USENIX.</p>
<p>- A. Begel, S. McCanne, S. L. Graham, <a href="http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123-begel.pdf">BPF+: Exploiting
Global Data-flow Optimization in a Generalized Packet Filter Architecture</a>,
Proceedings of ACM SIGCOMM '99, pages 123-134, Conference on Applications,
technologies, architectures, and protocols for computer communications, August
30 - September 3, 1999, Cambridge, USA.</p>
<h2>Note</h2>
<p>The code documented in this manual is that of the Windows NTx version of
NPF. The Windows 9x code is very similar, but it is less efficient and
lacks advanced features like kernel-mode dump.</p>
</body>

</html>