.\" Copyright (c) 2002 Luigi Rizzo
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" $FreeBSD: src/share/man/man4/polling.4,v 1.27 2007/04/06 14:25:14 brueffer Exp $
.Nd network device driver polling support
.Cd "options IFPOLL_ENABLE"
Network device polling
for brevity) refers to a technique that
lets the operating system periodically poll network devices, instead of
relying on the network devices to generate interrupts when they need attention.
This might seem inefficient and counterintuitive, but when done
gives more control to the operating system over
when and how to handle network devices, with a number of advantages in terms
of system responsiveness and performance.
reduces the context switch overhead
incurred when servicing interrupts, and
gives more control over the scheduling of a CPU between various
tasks (user processes, software interrupts, device handling),
which ultimately reduces the chances of livelock in the system.
.Ss Principles of Operation
In the normal, interrupt-based mode, network devices generate an interrupt
whenever they need attention.
context switch and the execution of an interrupt handler
which performs whatever processing is needed by the network device.
The duration of the interrupt handler is potentially unbounded
unless the network device driver has been programmed with real-time
concerns in mind (which is generally not the case for
Furthermore, under heavy traffic load, the system might be
persistently processing interrupts without being able to
complete other work, either in the kernel or in userland.
Network device polling disables interrupts by polling network devices on
This way, the context switch overhead is removed.
the operating system can accurately control how much work is spent
handling network device events, and thus prevent livelock by reserving
some amount of CPU time for other tasks.
also changes the way software network interrupts
are scheduled, which removes the risk of livelock caused by
packets not being processed to completion.
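The idea of bounding the work done per polling pass can be illustrated with a
small user-space model.
This is only a sketch under stated assumptions, not the kernel code: the
structure toy_dev, the function poll_tick, and the parameters budget and chunk
are hypothetical names chosen for the illustration.
Each pass handles at most budget packets in total and takes at most chunk
packets from a device per visit, so one busy device cannot monopolize the pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device model: only counts packets, no real hardware. */
struct toy_dev {
	int pending;	/* packets waiting to be processed */
	int handled;	/* packets processed so far */
};

/*
 * Run one polling pass: handle at most `budget' packets in total,
 * visiting devices round-robin and taking at most `chunk' packets
 * from a device per visit.  Returns the number of packets handled.
 */
int
poll_tick(struct toy_dev *devs, size_t ndevs, int budget, int chunk)
{
	int total = 0;
	int progress = 1;

	assert(budget >= 0 && chunk > 0);
	while (total < budget && progress) {
		progress = 0;
		for (size_t i = 0; i < ndevs && total < budget; i++) {
			int n = devs[i].pending;

			if (n > chunk)
				n = chunk;		/* per-visit limit */
			if (n > budget - total)
				n = budget - total;	/* per-pass limit */
			devs[i].pending -= n;
			devs[i].handled += n;
			total += n;
			if (n > 0)
				progress = 1;
		}
	}
	return (total);
}
```

Whatever is still pending when the budget runs out simply waits for the next
pass, which is what leaves CPU time for other tasks in this model.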
It is turned on and off with the help of
An interface does not have to be
in order to turn on its
The following tunables can be set from
.Bl -tag -width indent -compact
.It Va net.ifpoll.burst_max
.Va net.ifpoll.X.rx.burst_max
.It Va net.ifpoll.each_burst
.Va net.ifpoll.X.rx.each_burst
.It Va net.ifpoll.user_frac
.Va net.ifpoll.X.rx.user_frac
.It Va net.ifpoll.pollhz
.Va net.ifpoll.X.pollhz
.It Va net.ifpoll.status_frac
.Va net.ifpoll.0.status_frac
.It Va net.ifpoll.tx_frac
.Va net.ifpoll.X.tx_frac
is controlled by the following per-CPU
.Bl -tag -width indent -compact
.It Va net.ifpoll.X.pollhz
The polling frequency, whose range is 1 to 30000.
.It Va net.ifpoll.X.rx.user_frac
is enabled, and provided that there is some work to do,
up to this percentage of the CPU cycles is reserved for userland tasks,
the remaining fraction being available for
.It Va net.ifpoll.X.rx.burst
Maximum number of packets grabbed from each network interface in
This number is dynamically adjusted by the kernel,
according to the programmed
.Va user_frac , burst_max ,
CPU speed, and system load.
.It Va net.ifpoll.X.rx.each_burst
The burst above is split into smaller chunks of this number of
packets, going round-robin among all interfaces registered for
This prevents a large burst from a single interface
from saturating the IP interrupt queue.
.It Va net.ifpoll.X.rx.burst_max
.Va net.ifpoll.X.rx.burst .
is enabled, each interface can receive at most
.Pq Va pollhz No * Va burst_max
packets per second unless there are spare CPU cycles available for
This number should be tuned to match the expected load.
The default is 250, which is adequate for a 1000Mbit network with pollhz=6000.
.It Va net.ifpoll.X.rx.handlers
How many active network devices have registered for packet reception
.It Va net.ifpoll.X.tx_frac
Controls how often (every
.Va tx_frac No / Va pollhz
seconds) the transmission queue is checked for packet transmission
Increasing this value reduces the time spent checking for packet
transmission done events and thus reduces bus load,
but it also increases the chance
that the transmission queue becomes saturated.
.It Va net.ifpoll.X.tx.handlers
How many active network devices have registered for packet transmission
.It Va net.ifpoll.0.status_frac
Controls how often (every
.Va status_frac No / Va pollhz
seconds) the status registers of the network device are checked for error
conditions and the like.
Increasing this value reduces the load on the bus,
but also delays error detection.
.It Va net.ifpoll.0.status.handlers
How many active network devices have registered for status
.It Va net.ifpoll.X.rx.short_ticks
.It Va net.ifpoll.X.rx.lost_polls
.It Va net.ifpoll.X.rx.pending_polls
.It Va net.ifpoll.X.rx.residual_burst
.It Va net.ifpoll.X.rx.phase
.It Va net.ifpoll.X.rx.suspect
.It Va net.ifpoll.X.rx.stalled
.It Va net.ifpoll.X.tx.short_ticks
.It Va net.ifpoll.X.tx.lost_polls
.It Va net.ifpoll.X.tx.pending_polls
.It Va net.ifpoll.X.tx.residual_burst
.It Va net.ifpoll.X.tx.phase
.It Va net.ifpoll.X.tx.suspect
.It Va net.ifpoll.X.tx.stalled
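The relations stated above for the receive ceiling (pollhz * burst_max) and for
the check intervals (tx_frac / pollhz and status_frac / pollhz) can be worked
through with a short calculation.
The following is a minimal sketch; the helper names max_rx_pps and
check_interval_us are hypothetical and are not part of any kernel interface,
and the fraction value used in the usage note is an arbitrary example, not a
documented default.

```c
#include <assert.h>

/*
 * Upper bound on packets received per second by one interface
 * when no spare CPU cycles are available: pollhz * burst_max.
 */
long
max_rx_pps(long pollhz, long burst_max)
{
	assert(pollhz > 0 && burst_max > 0);
	return (pollhz * burst_max);
}

/*
 * Interval, in microseconds (rounded down), between checks that run
 * every frac / pollhz seconds, as for tx_frac and status_frac.
 */
long
check_interval_us(long frac, long pollhz)
{
	assert(frac > 0 && pollhz > 0);
	return ((frac * 1000000L) / pollhz);
}
```

With pollhz=6000 and the default burst_max of 250, max_rx_pps(6000, 250) gives
1500000 packets per second, slightly above the roughly 1.49 million
minimum-size frames per second that a 1000Mbit Ethernet link can carry.
With a fraction of 120 (an arbitrary example value) and pollhz=6000,
check_interval_us(120, 6000) gives 20000 microseconds between checks.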
.Sh SUPPORTED DEVICES
Network device polling requires explicit modifications to
the network device drivers.
As of this writing, the
devices are supported,
with others in the works.
support multiple reception queues based
support multiple transmission queues based
The modifications are rather straightforward, consisting of
extracting the inner part of the interrupt service routine
and writing a callback function,
to probe the network device for events and process them.
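The shape of such a driver change can be sketched in user space.
This is an illustration under stated assumptions, not the actual driver
interface: toy_nic, toy_rxeof, toy_intr, and toy_poll are hypothetical names,
and the device is reduced to a packet counter.
The inner routine is shared, the interrupt path drains everything, and the
polling callback stops after a caller-supplied count.

```c
#include <assert.h>

/* Hypothetical device state; a real driver would read hardware rings. */
struct toy_nic {
	int rx_pending;	/* packets waiting in the receive ring */
	int rx_done;	/* packets processed so far */
};

/*
 * Inner part extracted from the interrupt service routine:
 * process up to `count' packets, or all of them if count < 0.
 */
static int
toy_rxeof(struct toy_nic *sc, int count)
{
	int done = 0;

	while (sc->rx_pending > 0 && (count < 0 || done < count)) {
		sc->rx_pending--;
		sc->rx_done++;
		done++;
	}
	return (done);
}

/* Interrupt path: no bound on the work done per invocation. */
int
toy_intr(struct toy_nic *sc)
{
	return (toy_rxeof(sc, -1));
}

/* Polling callback: work is bounded by the count passed in. */
int
toy_poll(struct toy_nic *sc, int count)
{
	assert(count >= 0);
	return (toy_rxeof(sc, count));
}
```

Keeping one shared inner routine means the packet handling logic exists only
once, and the two entry points differ only in how much work they allow.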
conditionally compiled sections of the network devices mentioned above
In order to reduce the latency in processing packets,
it is advisable to set the
.Va net.ifpoll.X.pollhz
Network device polling first appeared in
The network device polling code was rewritten by
based on the original code by
.An Luigi Rizzo Aq Mt luigi@iet.unipi.it .
made the polling frequency settable at runtime,
added per-CPU polling,
and added multiple reception and transmission queue polling support.