-----------------------------------------------------------------------
                 T H E  /proc  F I L E S Y S T E M
-----------------------------------------------------------------------
/proc/sys      Terrehon Bowden <terrehon@wpi.com>      January 27 1999
               Bodo Bauer <bb@ricochet.net>
-----------------------------------------------------------------------
Version 1.1                                         Kernel version 2.2
-----------------------------------------------------------------------
1    Introduction/Credits

2    The /proc file system
2.1  Process specific subdirectories
2.3  IDE devices in /proc/ide
2.4  Networking info in /proc/net
2.6  Parallel port info in /proc/parport
2.7  TTY info in /proc/tty

3    Reading and modifying kernel parameters
3.1  /proc/sys/debug and /proc/sys/proc
3.2  /proc/sys/fs - File system data
3.3  /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
3.4  /proc/sys/kernel - General kernel parameters
3.5  /proc/sys/vm - The virtual memory subsystem
3.6  /proc/sys/dev - Device specific parameters
3.7  /proc/sys/sunrpc - Remote procedure calls
3.8  /proc/sys/net - Networking stuff
3.9  /proc/sys/net/ipv4 - IPV4 settings
42 -----------------------------------------------------------------------
44 1 Introduction/Credits
This documentation is part of a soon-to-be-released book published by
IDG Books on the SuSE Linux distribution. As there is no complete
documentation for the /proc file system and we've used many freely
available sources to write this chapter, it seems only fair to give
the work back to the Linux community. This work is based on the
2.1.132 and 2.2.0-pre kernel versions. We're afraid it's still far
from complete, but we hope it will be useful. As far as we know, it is
the first 'all-in-one' document about the /proc file system. It is
focused on Intel x86 hardware, so if you are looking for PPC, ARM,
SPARC, APX, etc., features, you probably won't find what you are
looking for. It also covers only IPv4 networking, not IPv6 or other
protocols.
59 We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov. We'd
60 also like to extend a special thank you to Andi Kleen for
61 documentation, which we relied on heavily to create this document, as
62 well as the additional information he provided. Thanks to everybody
63 else who contributed source or docs to the Linux kernel and helped
64 create a great piece of software... :)
66 If you have any comments, corrections or additions, please don't
67 hesitate to contact Bodo Bauer at bb@ricochet.net. We'll be happy to
68 add them to this document.
70 The latest version of this document is available online at
71 http://www.suse.com/~bb/Docs/proc.html in HTML, ASCII, and as
76 We don't guarantee the correctness of this document, and if you come
77 to us complaining about how you screwed up your system because of
78 incorrect documentation, we won't feel responsible...
80 -----------------------------------------------------------------------
82 2 The /proc file system
84 The proc file system acts as an interface to internal data structures
85 in the kernel. It can be used to obtain information about the system
86 and to change certain kernel parameters at runtime. It contains
87 (among other things) one subdirectory for each process running on the
88 system which is named after the process id (PID) of the process. The
89 link self points to the process reading the file system.
91 2.1 Process specific subdirectories
Each process subdirectory contains the entries listed in Table 1.1.
_________________________________________________
cmdline  Command line arguments
environ  Values of environment variables
fd       Directory, which contains all file descriptors
mem      Memory held by this process
status   Process status in human readable form
cwd      Link to the current working directory
exe      Link to the executable of this process
root     Link to the root directory of this process
statm    Process memory status information
_________________________________________________
Table 1.1: Process specific entries in /proc
110 For example, to get the status information of a process, all you have
111 to do is read the file /proc/PID/status:
113 > cat /proc/self/status
128 SigPnd: 0000000000000000
129 SigBlk: 0000000000000000
130 SigIgn: 0000000000000000
131 SigCgt: 0000000000000000
132 CapInh: 00000000fffffeff
133 CapPrm: 0000000000000000
134 CapEff: 0000000000000000
136 This shows you almost the same information as you would get if you
137 viewed it with the ps command. In fact, ps uses the proc file system
138 to obtain its information.
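
As an illustration of how a tool might consume this file, here is a
minimal Python sketch that splits the 'Key: value' lines of a status
file into a dictionary. The function name parse_status is our own
invention, and the sketch assumes every line of interest contains a
colon; it does not claim to mirror how ps is actually implemented.

```python
import os


def parse_status(text):
    """Turn 'Key: value' lines (as in /proc/PID/status) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon at all
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    path = "/proc/self/status"
    if os.path.exists(path):  # only there on Linux with /proc mounted
        with open(path) as fh:
            status = parse_status(fh.read())
        # Print two of the fields shown in the listing above.
        print(status.get("SigPnd"), status.get("CapEff"))
```

All values are kept as strings, since the file mixes numbers, names,
and hexadecimal masks.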
140 The statm file contains more detailed information about the process
141 memory usage. It contains seven values with the following meanings:
size      total program size
resident  size of in-memory portions
shared    number of pages that are shared
trs       number of pages that are 'code'
drs       number of pages of data/stack
lrs       number of pages of library
dt        number of dirty pages
The split into text, data, and library pages is only approximated by
heuristics.
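
To make the field order concrete, the following Python sketch attaches
the seven names listed above to the numbers read from a statm file.
The helper name parse_statm and the sample numbers are assumptions
made for this example only.

```python
# Field names in the order given in the list above.
STATM_FIELDS = ("size", "resident", "shared", "trs", "drs", "lrs", "dt")


def parse_statm(text):
    """Map the seven whitespace-separated statm values to their names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected %d values, got %d"
                         % (len(STATM_FIELDS), len(values)))
    return dict(zip(STATM_FIELDS, values))


if __name__ == "__main__":
    # Hypothetical sample line; on a live system you would read
    # /proc/PID/statm instead.
    print(parse_statm("439 290 257 5 0 285 33"))
```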
Similar to the process entries, there are files that give information
about the running kernel. These files are contained in /proc and are
listed in Table 1.2. Not all of them will be present on your system:
which files exist depends on the kernel configuration and the loaded
modules.
________________________________________________
apm          Advanced power management info
cmdline      Kernel command line
cpuinfo      Info about the CPU
devices      Available devices (block and character)
dma          Used DMA channels
filesystems  Supported filesystems
interrupts   Interrupt usage
ioports      I/O port usage
kcore        Kernel core image
ksyms        Kernel symbol table
modules      List of loaded modules
mounts       Mounted filesystems
partitions   Table of partitions known to the system
slabinfo     Slab pool info
stat         Overall statistics
swaps        Swap space utilization
version      Kernel version
________________________________________________
Table 1.2: Kernel info in /proc
189 You can, for example, check which interrupts are currently in use and
190 what they are used for by looking in the file /proc/interrupts:
192 > cat /proc/interrupts
194 0: 8728810 XT-PIC timer
195 1: 895 XT-PIC keyboard
197 3: 531695 XT-PIC aha152x
198 4: 2014133 XT-PIC serial
199 5: 44401 XT-PIC pcnet_cs
202 12: 182918 XT-PIC PS/2 Mouse
204 14: 1232265 XT-PIC ide0
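
If you want to process this listing in a script, each line can be
split into IRQ number, interrupt count, controller, and device name.
The Python sketch below assumes the single-CPU layout shown above
(one counter per line) and a numeric IRQ; the function name is
hypothetical, and lines in any other layout are not handled.

```python
def parse_interrupt_line(line):
    """Split one 'IRQ: count controller device' line into its parts."""
    irq, _, rest = line.partition(":")
    # The device name may contain spaces ("PS/2 Mouse"), so split the
    # remainder at most twice and keep the tail intact.
    count, controller, device = rest.split(None, 2)
    return {
        "irq": int(irq),
        "count": int(count),
        "controller": controller,
        "device": device.strip(),
    }


if __name__ == "__main__":
    for sample in (" 0: 8728810 XT-PIC timer",
                   "12: 182918 XT-PIC PS/2 Mouse"):
        print(parse_interrupt_line(sample))
```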
There are three more important subdirectories in /proc: net, scsi, and
sys. As a general rule, the contents, or even the existence, of these
directories depends on your kernel configuration. If SCSI is not
enabled, the directory scsi may not exist. The same is true of net,
which is only there when networking support is present in the kernel.
215 The slabinfo file gives information about memory usage on the slab
216 level. Linux uses slab pools for memory management above page level
217 in version 2.2. Commonly used objects have their own slab pool (like
218 network buffers, directory cache, etc.).
220 2.3 IDE devices in /proc/ide
This subdirectory contains information about all IDE devices that the
kernel is aware of. There is one subdirectory for each device
(e.g., a hard disk), containing the following files:
capacity          Capacity of the medium
driver            Driver and version
geometry          Physical and logical geometry
identify          Device identify block
model             Device identifier
settings          Device setup
smart_thresholds  IDE disk management thresholds
smart_values      IDE disk management values
237 2.4 Networking info in /proc/net
This directory follows the usual pattern. Table 1.3 lists the files
and their meanings.
242 ____________________________________________________
244 dev network devices with statistics
245 dev_mcast Lists the Layer2 multicast groups a
246 device is listening to (interface index,
247 label, number of references, number of
249 dev_stat network device status
250 ip_fwchains Firewall chain linkage
251 ip_fwnames Firewall chains
252 ip_masq Directory containing the masquerading
254 ip_masquerade Major masquerading table
255 netstat Network statistics
256 raw Raw device statistics
257 route Kernel routing table
258 rpc Directory containing rpc info
259 rt_cache Routing cache
261 sockstat Socket statistics
263 tr_rif Token ring RIF routing table
265 unix UNIX domain sockets
266 wireless Wireless interface data (Wavelan etc)
267 igmp IP multicast addresses, which this host joined
268 psched Global packet scheduler parameters.
269 netlink List of PF_NETLINK sockets.
270 ip_mr_vifs List of multicast virtual interfaces.
271 ip_mr_cache List of multicast routing cache.
272 udp6 UDP sockets (IPv6)
273 tcp6 TCP sockets (IPv6)
274 raw6 Raw device statistics (IPv6)
igmp6 IP multicast addresses, which this host joined (IPv6)
276 if_inet6 List of IPv6 interface addresses.
277 ipv6_route Kernel routing table for IPv6
278 rt6_stats global IPv6 routing tables statistics.
279 sockstat6 Socket statistics (IPv6)
280 snmp6 Snmp data (IPv6)
281 ____________________________________________________
282 Table 1.3: Network info in /proc/net
You can use this information to see which network devices are
available in your system and how much traffic was routed over those
devices:
290 face |bytes packets errs drop fifo frame compressed multicast|[...
291 lo: 908188 5596 0 0 0 0 0 0 [...
292 ppp0:15475140 20721 410 0 0 410 0 0 [...
293 eth0: 614530 7085 0 0 0 0 0 1 [...
296 ...] bytes packets errs drop fifo colls carrier compressed
297 ...] 908188 5596 0 0 0 0 0 0
298 ...] 1375103 17405 0 0 0 0 0 0
299 ...] 1703981 5535 0 0 0 3 0 0
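
The listing above is split in two for printing; in the file itself
each interface occupies one line with eight receive counters followed
by eight transmit counters. A small Python sketch of a parser for
such a line follows. The column names are taken from the header
shown above, the function name is our own, and the sample line is the
ppp0 entry joined back together.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev_line(line):
    """Return (interface, receive counters, transmit counters)."""
    # Split on the colon rather than on whitespace: a large counter
    # may follow the colon directly, as in 'ppp0:15475140'.
    name, _, counters = line.partition(":")
    values = [int(v) for v in counters.split()]
    if len(values) != len(RX_FIELDS) + len(TX_FIELDS):
        raise ValueError("unexpected number of counters")
    rx = dict(zip(RX_FIELDS, values[:len(RX_FIELDS)]))
    tx = dict(zip(TX_FIELDS, values[len(RX_FIELDS):]))
    return name.strip(), rx, tx


if __name__ == "__main__":
    sample = ("  ppp0:15475140 20721 410 0 0 410 0 0 "
              "1375103 17405 0 0 0 0 0 0")
    print(parse_net_dev_line(sample))
```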
303 If you have a SCSI host adapter in your system, you'll find a
304 subdirectory named after the driver for this adapter in /proc/scsi.
305 You'll also see a list of all recognized SCSI devices in /proc/scsi:
309 Host: scsi0 Channel: 00 Id: 00 Lun: 00
310 Vendor: QUANTUM Model: XP34550W Rev: LXY4
311 Type: Direct-Access ANSI SCSI revision: 02
312 Host: scsi0 Channel: 00 Id: 01 Lun: 00
313 Vendor: SEAGATE Model: ST34501W Rev: 0018
314 Type: Direct-Access ANSI SCSI revision: 02
315 Host: scsi0 Channel: 00 Id: 02 Lun: 00
316 Vendor: SEAGATE Model: ST34501W Rev: 0017
317 Type: Direct-Access ANSI SCSI revision: 02
318 Host: scsi0 Channel: 00 Id: 04 Lun: 00
319 Vendor: ARCHIVE Model: Python 04106-XXX Rev: 703b
320 Type: Sequential-Access ANSI SCSI revision: 02
322 The directory named after the driver has one file for each adapter
323 found in the system. These files contain information about
324 the controller, including the used IRQ and the IO address range:
326 >cat /proc/scsi/ncr53c8xx/0
328 Chip NCR53C875, device id 0xf, revision id 0x4
329 IO port address 0xec00, IRQ number 11
330 Synchronous period factor 12, max commands per lun 4
332 2.6 Parallel port info in /proc/parport
334 The directory /proc/parport contains information about the parallel
335 ports of your system. It has one subdirectory for each port, named
336 after the port number (0,1,2,...).
Each port subdirectory contains four files:
340 autoprobe Autoprobe results of this port
341 devices Connected device modules
342 hardware Hardware info (port type, io-port, DMA, IRQ, etc.)
343 irq Used interrupt, if any
345 2.7 TTY info in /proc/tty
Information about the available and actually used ttys can be
found in /proc/tty. You'll find entries for drivers and line
disciplines in this directory, as shown in the table below:
351 drivers List of drivers and their usage
352 ldiscs Registered line disciplines
353 driver/serial Usage statistic and status of single tty lines
355 To see which tty's are currently in use, you can simply look into the
356 file /proc/tty/drivers:
358 >cat /proc/tty/drivers
359 pty_slave /dev/pts 136 0-255 pty:slave
360 pty_master /dev/ptm 128 0-255 pty:master
361 pty_slave /dev/ttyp 3 0-255 pty:slave
362 pty_master /dev/pty 2 0-255 pty:master
363 serial /dev/cua 5 64-67 serial:callout
364 serial /dev/ttyS 4 64-67 serial
365 /dev/tty0 /dev/tty0 4 0 system:vtmaster
366 /dev/ptmx /dev/ptmx 5 2 system
367 /dev/console /dev/console 5 1 system:console
368 /dev/tty /dev/tty 5 0 system:/dev/tty
369 unknown /dev/tty 4 1-63 console
371 -----------------------------------------------------------------------
373 3 Reading and modifying kernel parameters
A very interesting part of /proc is the directory /proc/sys. It not
only provides information, it also allows you to change parameters
within the kernel. Be very careful when trying this. You can optimize
your system, but you can also crash it. Never play around with kernel
parameters on a production system. Set up a development machine and
test to make sure that everything works the way you want it to. You
may have no alternative but to reboot the machine once an error has
been made.
384 To change a value, simply echo the new value into the file. An example
385 is given below in the section on the file system data. You need to be
386 root to do this. You can create your own boot script to get this done
387 every time your system boots.
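
The same read-then-write pattern can be wrapped in two small helper
functions, sketched here in Python. The names read_param and
write_param and the root argument are assumptions made for this
example; the root argument exists only so that the demonstration can
run against a scratch directory instead of the real /proc/sys.
Writing to the real files still requires root.

```python
import os
import tempfile


def read_param(name, root="/proc/sys"):
    """Return the current value of a parameter such as 'fs/file-max'."""
    with open(os.path.join(root, name)) as fh:
        return fh.read().strip()


def write_param(name, value, root="/proc/sys"):
    """Write a new value, the equivalent of 'echo value > file'."""
    with open(os.path.join(root, name), "w") as fh:
        fh.write("%s\n" % value)


if __name__ == "__main__":
    # Use a scratch directory so that no kernel setting is touched
    # and no root privileges are needed for the demonstration.
    with tempfile.TemporaryDirectory() as scratch:
        os.makedirs(os.path.join(scratch, "fs"))
        write_param("fs/file-max", 8192, root=scratch)
        print(read_param("fs/file-max", root=scratch))
```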
389 The files in /proc/sys can be used to tune and monitor miscellaneous
390 and general things in the operation of the Linux kernel. Since some
391 of the files can inadvertently disrupt your system, it is advisable to
392 read both documentation and source before actually making
393 adjustments. In any case, be very careful when writing to any of these
394 files. The entries in /proc may change slightly between the 2.1.* and
395 the 2.2 kernel, so review the kernel documentation if there is any
396 doubt. You'll find the documentation in the directory
397 /usr/src/linux/Documentation/sys. This chapter is heavily based on the
documentation included in the pre-2.2 kernels. Thanks to Rik van Riel
for providing this information.
401 3.1 /proc/sys/debug and /proc/sys/proc
403 These two subdirectories are empty.
3.2 /proc/sys/fs - File system data
407 This subdirectory contains specific file system, file handle, inode,
408 dentry and quota information.
410 Currently, these files are in /proc/sys/fs:
Status of the directory cache. Since directory entries are
dynamically allocated and deallocated, this file gives information
about the current status. It holds six values, of which the last
two are not used and are always zero. The other four mean:

nr_dentry  Seems to be zero all the time
nr_unused  Number of unused cache entries
age_limit  Age in seconds after which the entry may be
           reclaimed when memory is short
424 dquot-nr and dquot-max
The file dquot-max shows the maximum number of cached disk quota
entries.

The file dquot-nr shows the number of allocated disk quota
entries and the number of free disk quota entries.

If the number of free cached disk quotas is very low and you have
a large number of simultaneous system users, you might want
to raise the limit.
436 The kernel allocates file handles dynamically, but as yet
437 doesn't free them again.
439 The value in file-max denotes the maximum number of file handles
440 that the Linux kernel will allocate. When you get a lot of error
441 messages about running out of file handles, you might want to raise
442 this limit. The default value is 4096. To change it, just write the
443 new number into the file:
445 # cat /proc/sys/fs/file-max
447 # echo 8192 > /proc/sys/fs/file-max
448 # cat /proc/sys/fs/file-max
This method of revision is useful for all customizable parameters
of the kernel: simply echo the new value to the corresponding
file.
455 The three values in file-nr denote the number of allocated file
456 handles, the number of used file handles, and the maximum number of
457 file handles. When the allocated file handles come close to the
458 maximum, but the number of actually used ones is far behind, you've
459 encountered a peak in your usage of file handles and you don't need
460 to increase the maximum.
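
The rule of thumb above can be written down as a short check. This
Python sketch follows the field order described here (allocated, used,
maximum). The function name and the 90% threshold are assumptions
chosen for illustration, not values taken from the kernel.

```python
def needs_higher_limit(file_nr_text, threshold=0.9):
    """Decide whether file-max looks too small.

    Only the number of handles actually in use matters: a high
    allocated count with low usage is just the trace of an earlier
    peak and is no reason to raise the limit.
    """
    allocated, used, maximum = (int(v) for v in file_nr_text.split())
    return used >= threshold * maximum


if __name__ == "__main__":
    # Hypothetical samples: an old peak, then genuine pressure.
    print(needs_higher_limit("4000 500 4096"))
    print(needs_higher_limit("4090 4000 4096"))
```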
However, there is still a per-process limit on open files, which
unfortunately can't be changed that easily. It is set to 1024 by
default. To change it, you have to edit the files limits.h and
fs.h in the directory /usr/src/linux/include/linux. Change the
definition of NR_OPEN and recompile the kernel.
468 inode-state, inode-nr and inode-max
469 As with file handles, the kernel allocates the inode structures
470 dynamically, but can't free them yet.
472 The value in inode-max denotes the maximum number of inode
473 handlers. This value should be 3 to 4 times larger than the value
474 in file-max, since stdin, stdout, and network sockets also need an
475 inode struct to handle them. If you regularly run out of inodes,
476 you should increase this value.
478 The file inode-nr contains the first two items from inode-state, so
479 we'll skip to that file...
481 inode-state contains three actual numbers and four dummy values. The
482 actual numbers are (in order of appearance) nr_inodes, nr_free_inodes,
486 Denotes the number of inodes the system has allocated. This can
487 be slightly more than inode-max because Linux allocates them one
491 Represents the number of free inodes and pre shrink is nonzero
492 when the nr_inodes > inode-max and the system needs to prune the
493 inode list instead of allocating more.
495 super-nr and super-max
496 Again, super block structures are allocated by the kernel,
497 but not freed. The file super-max contains the maximum number of
498 super block handlers, where super-nr shows the number of
499 currently allocated ones.
Every mounted file system needs a super block, so if you plan to
mount lots of file systems, you may want to increase these
numbers.
3.3 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
507 Besides these files, there is the subdirectory
508 /proc/sys/fs/binfmt_misc. This handles the kernel support for
509 miscellaneous binary formats.
Binfmt_misc provides the ability to register additional binary formats
with the kernel without compiling an additional module or kernel. To
do so, binfmt_misc needs to know the magic number at the beginning of
the binary or its filename extension.

It works by maintaining a linked list of structs that each contain a
description of a binary format, including a magic with size (or the
filename extension), offset and mask, and the interpreter name. On
request it invokes the given interpreter with the original program as
argument, as binfmt_java, binfmt_em86, and binfmt_mz do.
Since binfmt_misc does not define any default binary formats, you have
to register each additional binary format yourself.
524 There are two general files in binfmt_misc and one file per registered
525 format. The two general files are register and status.
527 Registering a new binary format
529 echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register
with an appropriate name (the name for the /proc-dir entry), offset
(defaults to 0 if omitted), magic, and mask (which can be omitted; it
defaults to all 0xff), and, last but not least, the interpreter that
is to be invoked (for example, '/bin/echo' for testing). Type can be M
for usual magic matching or E for filename extension matching (give
the extension in place of magic).
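
Because the registration string consists of colon-separated fields in
a fixed order, it is easy to get a field count wrong when typing it by
hand. The following Python sketch assembles the string from its parts.
The function name binfmt_entry is hypothetical, and the sketch only
builds the text; it does not write to the register file.

```python
def binfmt_entry(name, kind, magic, interpreter, offset="", mask=""):
    """Build a ':name:type:offset:magic:mask:interpreter:' string.

    kind is 'M' for magic matching or 'E' for extension matching,
    in which case the extension is passed in place of the magic.
    Empty offset and mask fields select the defaults.
    """
    if kind not in ("M", "E"):
        raise ValueError("type must be 'M' or 'E'")
    fields = [name, kind, str(offset), magic, mask, interpreter]
    return ":" + ":".join(fields) + ":"


if __name__ == "__main__":
    # A raw string keeps the backslashes of the magic literal.
    print(binfmt_entry("Java", "M", r"\xca\xfe\xba\xbe",
                       "/usr/local/java/bin/javawrapper"))
    print(binfmt_entry("HTML", "E", "html",
                       "/usr/local/java/bin/appletviewer"))
```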
538 To check or reset the status of the binary format handler:
540 If you do a cat on the file /proc/sys/fs/binfmt_misc/status, you will
541 get the current status (enabled/disabled) of binfmt_misc. Change the
542 status by echoing 0 (disables) or 1 (enables) or -1 (caution: this
543 clears all previously registered binary formats) to status. For
544 example echo 0 > status to disable binfmt_misc (temporarily).
546 Status of a single handler
Each registered handler has an entry in /proc/sys/fs/binfmt_misc.
These files perform the same function as status, but their scope is
limited to the actual binary format. Reading such a file with cat
also shows all related information about the interpreter/magic of
that format.
554 Example usage of binfmt_misc (emulate binfmt_java)
556 cd /proc/sys/fs/binfmt_misc
557 echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register
558 echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register
559 echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register
560 echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register
The first three lines add support for Java executables and Java
applets (like binfmt_java, additionally recognizing the .html
extension with no need to put <!--applet> in every applet file). The
fourth line registers DEXE binaries with /usr/bin/dosexec. You have to
install the JDK and the shell script /usr/local/java/bin/javawrapper
too. It works around the brokenness of the Java filename handling. To
add a Java binary, just create a link to the class file somewhere in
the path.
3.4 /proc/sys/kernel - General kernel parameters

This directory reflects general kernel behaviors. As I've said before,
the contents depend on your configuration. I'll list the most
important files, along with descriptions of what they mean and how to
use them.
The file contains three values: highwater, lowwater, and
frequency.

It exists only when BSD-style process accounting is enabled. These
values control its behavior. If the free space on the file system
where the log lives goes below lowwater%, accounting suspends. If
it goes above highwater%, accounting resumes. Frequency determines
how often the amount of free space is checked (the value is in
seconds). Default settings are 4, 2, and 30. That is, suspend
accounting if <= 2% of the space is free; resume it if >= 4% is
free; consider information about the amount of free space valid
for 30 seconds.
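
The two thresholds form a hysteresis: between lowwater and highwater
the current state is simply kept. The Python sketch below models this
using the default values quoted above. The function name and the
exact comparison operators are our reading of the description, not
code taken from the kernel.

```python
def accounting_active(active, free_percent, highwater=4, lowwater=2):
    """Return the new accounting state for a given free-space level."""
    if active and free_percent <= lowwater:
        return False  # too little space left: suspend
    if not active and free_percent >= highwater:
        return True   # enough space again: resume
    return active     # in between: leave the state unchanged


if __name__ == "__main__":
    state = True
    # Free space drops to 2%, recovers to 3%, then reaches 4%.
    for free in (5, 2, 3, 4):
        state = accounting_active(state, free)
        print(free, state)
```

Without the gap between the two values, accounting would switch on
and off repeatedly while free space hovers around a single limit.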
592 When the value in this file is 0, ctrl-alt-del is trapped and sent
593 to the init(1) program to handle a graceful restart. However, when
594 the value is > 0, Linux's reaction to this key combination will be
595 an immediate reboot, without syncing its dirty buffers.
Note: when a program (like dosemu) has the keyboard in raw mode,
the ctrl-alt-del is intercepted by the program before it ever
reaches the kernel tty layer, and it is up to the program to decide
what to do with it.
602 domainname and hostname
These files can be used to set the NIS domainname and
hostname of your box. For the classic darkstar.frop.org, a simple:
606 # echo "darkstar" > /proc/sys/kernel/hostname
607 # echo "frop.org" > /proc/sys/kernel/domainname
609 would suffice to set your hostname and NIS domainname.
611 osrelease, ostype and version
613 The names make it pretty obvious what these fields contain:
615 >cat /proc/sys/kernel/osrelease
617 >cat /proc/sys/kernel/ostype
619 >cat /proc/sys/kernel/version
620 #8 Mon Jan 25 19:45:02 PST 1999
622 The files osrelease and ostype should be clear enough. Version
623 needs a little more clarification however. The #8 means that this
624 is the 8th kernel built from this source base and the date behind
625 it indicates the time the kernel was built. The only way to tune
626 these values is to rebuild the kernel.
The value in this file represents the number of seconds the kernel
waits before rebooting on a panic. When you use the software
watchdog, the recommended setting is 60. If set to 0, the automatic
reboot after a kernel panic is disabled; this is the default
setting.
The four values in printk denote console_loglevel,
default_message_loglevel, minimum_console_loglevel, and
default_console_loglevel respectively.

These values influence printk() behavior when printing or logging
error messages that come from inside the kernel. See syslog(2)
for more information on the different log levels.

console_loglevel
Messages with a higher priority than this will be printed to
the console.

default_message_loglevel
Messages without an explicit priority will be printed with
this priority.

minimum_console_loglevel
Minimum (highest) value to which the console_loglevel can be set.

default_console_loglevel
Default value for console_loglevel.
659 This file shows the size of the generic SCSI (sg) buffer. At this
660 point, you can't tune it yet, but you can change it at compile time
661 by editing include/scsi/sg.h and changing the value of
If you use a scanner with SANE (Scanner Access Now Easy), you
might want to set this to a higher value. See the SANE
documentation on this issue.
The path to the modprobe binary. The kernel uses this program to
load modules on demand.
672 3.5 /proc/sys/vm - The virtual memory subsystem
674 The files in this directory can be used to tune the operation of the
675 virtual memory (VM) subsystem of the Linux kernel. In addition, one of
676 the files (bdflush) has a little influence on disk usage.
679 This file controls the operation of the bdflush kernel daemon. It
680 currently contains 9 integer values, 6 of which are actually used
683 nfract Percentage of buffer cache dirty to
685 ndirty Maximum number of dirty blocks to
686 write out per-wake-cycle
687 nrefill Number of clean buffers to try to obtain
688 each time we call refill
689 nref_dirt Dirty buffer threshold for activating bdflush
690 when trying to refill buffers.
692 age_buffer Time for normal buffer to age before you flush it
693 age_super Time for superblock to age before you flush it
698 This parameter governs the maximum number of dirty buffers
699 in the buffer cache. Dirty means that the contents of the
700 buffer still have to be written to disk (as opposed to a
701 clean buffer, which can just be forgotten about). Setting
702 this to a high value means that Linux can delay disk writes
703 for a long time, but it also means that it will have to do a
704 lot of I/O at once when memory becomes short. A low value
705 will spread out disk I/O more evenly.
708 Ndirty gives the maximum number of dirty buffers that
709 bdflush can write to the disk at one time. A high value will
710 mean delayed, bursty I/O, while a small value can lead to
711 memory shortage when bdflush isn't woken up often enough.
This is the number of buffers that bdflush will add to the list
715 of free buffers when refill_freelist() is called. It is
716 necessary to allocate free buffers beforehand, since the
717 buffers are often different sizes than the memory pages
718 and some bookkeeping needs to be done beforehand. The
719 higher the number, the more memory will be wasted and the
720 less often refill_freelist() will need to run.
723 When refill_freelist() comes across more than nref_dirt
724 dirty buffers, it will wake up bdflush.
726 age_buffer and age_super
Finally, the age_buffer and age_super parameters govern the
maximum time Linux waits before writing out a dirty buffer
to disk. The value is expressed in jiffies (clock ticks); there
are 100 jiffies per second. Age_buffer is the maximum age for
data blocks, while age_super is for file system metadata.
735 The three values in this file control how much memory should be
736 used for buffer memory. The percentage is calculated as a
737 percentage of total system memory.
742 This is the minimum percentage of memory that should be
743 spent on buffer memory.
When Linux is short on memory, and the buffer cache uses more
than it has been allotted, the memory management (MM) subsystem
will prune the buffer cache more heavily than other memory to
compensate.
This is the maximum amount of memory that can be used for
buffer memory.
756 This file contains three values: min, low and high:
759 When the number of free pages in the system reaches this number,
760 only the kernel can allocate more memory.
763 If the number of free pages gets below this point, the kernel
764 starts swapping aggressively.
767 The kernel tries to keep up to this amount of memory free; if
768 memory comes below this point, the kernel gently starts swapping
769 in the hopes that it never has to do really aggressive swapping.
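
The three thresholds divide the number of free pages into four
ranges. As a rough model of the description above, this Python sketch
names the range a given free-page count falls into. The function
name, the returned labels, and the sample threshold values are
assumptions made for illustration only.

```python
def memory_pressure(free_pages, min_pages, low_pages, high_pages):
    """Classify a free-page count against min, low, and high."""
    if free_pages <= min_pages:
        return "kernel-only allocation"
    if free_pages < low_pages:
        return "aggressive swapping"
    if free_pages < high_pages:
        return "gentle swapping"
    return "no swapping needed"


if __name__ == "__main__":
    # Hypothetical thresholds, not kernel defaults.
    for free in (100, 200, 300, 500):
        print(free, memory_pressure(free, 128, 256, 384))
```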
772 Kswapd is the kernel swap out daemon. That is, kswapd is that piece
773 of the kernel that frees memory when it gets fragmented or
774 full. Since every system is different, you'll probably want some
775 control over this piece of the system.
777 The file contains three numbers:
780 The maximum number of pages kswapd tries to free in one round is
781 calculated from this number. Usually this number will be divided
782 by 4 or 8 (see mm/vmscan.c), so it isn't as big as it looks.
784 When you need to increase the bandwidth to/from swap, you'll want
785 to increase this number.
788 This is the minimum number of times kswapd tries to free a page
789 each time it is called. Basically it's just there to make sure
790 that kswapd frees some pages even when it's being called with
This is probably the greatest influence on system
performance. swap_cluster is the number of pages kswapd writes in
one turn. You'll want this value to be large so that kswapd does
its I/O in large chunks and the disk doesn't have to seek as
often, but you don't want it to be too large since that would
flood the request queue.
803 This file contains one value. The following algorithm is used to
804 decide if there's enough memory: if the value of overcommit_memory
805 is positive, then there's always enough memory. This is a useful
806 feature, since programs often malloc() huge amounts of memory 'just
807 in case', while they only use a small part of it. Leaving this
808 value at 0 will lead to the failure of such a huge malloc(), when
809 in fact the system has enough memory for the program to run.
811 On the other hand, enabling this feature can cause you to run out
812 of memory and thrash the system to death, so large and/or important
813 servers will want to set this value to 0.
816 This file does exactly the same as buffermem, only this file
817 controls the amount of memory allowed for memory mapping and
818 generic caching of files.
You don't want the minimum level to be too low, otherwise your
system might thrash when memory is tight or fragmentation is
high.
825 The kernel keeps a number of page tables in a per-processor cache
826 (this helps a lot on SMP systems). The cache size for each
827 processor will be between the low and the high value.
On a low-memory, single-CPU system, you can safely set these values
to 0 so you don't waste memory. The cache is used on SMP systems so
that the system can perform fast page table allocations without
having to acquire the kernel memory lock.
834 For large systems, the settings are probably fine. For normal
835 systems they won't hurt a bit. For small systems (<16MB ram) it
836 might be advantageous to set both values to 0.
This file contains no fewer than 8 variables. All of these values
are used by kswapd.
842 The first four variables sc_max_page_age, sc_page_advance,
843 sc_page_decline and sc_page_initial_age are used to keep track of
844 Linux's page aging. Page aging is a bookkeeping method to track
845 which pages of memory are often used, and which pages can be
846 swapped out without consequences.
848 When a page is swapped in, it starts at sc_page_initial_age
849 (default 3) and when the page is scanned by kswapd, its age is
850 adjusted according to the following scheme:
852 o If the page was used since the last time we scanned, its age
853 is increased by sc_page_advance (default 3) up to a
854 maximum of sc_max_page_age (default 20).
856 o Else (meaning it wasn't used) its age is decreased by
857 sc_page_decline (default 1).
859 When a page reaches age 0, it's ready to be swapped out.
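
The aging scheme can be traced by hand with a few lines of Python.
The sketch uses the default values quoted above. The function name is
our own, and clamping the age at 0 is an assumption, since the text
only says that a page at age 0 is ready to be swapped out.

```python
def next_age(age, referenced, advance=3, decline=1, max_age=20):
    """Return the age of a page after one scan by kswapd."""
    if referenced:
        # Used since the last scan: age rises, capped at max_age.
        return min(age + advance, max_age)
    # Not used: age falls (assumed never to go below 0).
    return max(age - decline, 0)


if __name__ == "__main__":
    age = 3  # initial age of a page that was just swapped in
    for used in (True, False, False, False, False, False, False):
        age = next_age(age, used)
        print(used, age)
```

With these defaults, a page that is touched once after being swapped
in survives six idle scans before it becomes a candidate for swapping
out.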
861 The next four variables sc_age_cluster_fract, sc_age_cluster_min,
862 sc_pageout_weight and sc_bufferout_weight, can be used to control
863 kswapd's aggressiveness in swapping out pages.
865 Sc_age_cluster_fract is used to calculate how many pages from a
866 process are to be scanned by kswapd. The formula used is
869 -------------------- * resident set size
872 So if you want kswapd to scan the whole process,
873 sc_age_cluster_fract needs to have a value of 1024. The minimum
874 number of pages kswapd will scan is represented by
875 sc_age_cluster_min; this is done so that kswapd will also scan small
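As a rough worked example of this calculation, the following Python
sketch combines the fraction and the minimum. The use of integer
division and the sample numbers are assumptions made for illustration;
they are not taken from the kernel source or from the defaults.

```python
def pages_to_scan(rss_pages, cluster_fract, cluster_min):
    """Estimate how many pages of one process kswapd would scan.

    rss_pages     -- resident set size of the process, in pages
    cluster_fract -- value of sc_age_cluster_fract (1024 = whole process)
    cluster_min   -- value of sc_age_cluster_min (lower bound)
    """
    scaled = rss_pages * cluster_fract // 1024  # integer division assumed
    return max(scaled, cluster_min)


# Hypothetical values, chosen only to show both branches:
print(pages_to_scan(4096, 512, 32))   # half of a 4096-page process
print(pages_to_scan(100, 16, 32))     # small process: the minimum applies
```

The first call scans half of the resident set (2048 pages); in the
second the scaled value is only 1 page, so the minimum of 32 is used
instead.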
878 The values of sc_pageout_weight and sc_bufferout_weight are used
879 to control how many tries kswapd will make in order to swap out
880 one page/buffer. These values can be used to fine-tune the ratio
881 between user pages and buffer/cache memory. When you find that
882 your Linux system is swapping out too many process pages in order
883 to satisfy buffer memory demands, you might want to either
884 increase sc_bufferout_weight, or decrease the value of
887 3.6 /proc/sys/dev - Device specific parameters
889 Currently there is only support for CD-ROM drives, and for those, there
890 is only one read-only file containing information about the CD-ROM
891 drives attached to the system:
893 >cat /proc/sys/dev/cdrom/info
898 drive # of slots: 1 0
902 Can change speed: 1 1
904 Can read multisession: 1 1
906 Reports media changed: 1 1
909 You see two drives, sr0 and hdc, and their lists of features.
911 3.7 /proc/sys/sunrpc - Remote procedure calls
913 This directory contains four files, which enable or disable debugging
914 for the RPC functions NFS, NFS-daemon, RPC and NLM. The default values
915 are 0. They can be set to one, to turn debugging on. (The default
918 3.8 /proc/sys/net - Networking stuff
920 The interface to the networking parts of the kernel is located in
921 /proc/sys/net. The table below shows all possible subdirectories. You
922 may see only some of them, depending on the configuration of your
925 +----------------------------------------------------------------+
926 | core      General parameter   | appletalk   Appletalk protocol |
927 | unix      Unix domain sockets | netrom      NET/ROM            |
928 | 802       E802 protocol       | ax25        AX25               |
929 | ethernet  Ethernet protocol   | rose        X.25 PLP layer     |
930 | ipv4      IP version 4        | x25         X.25 protocol      |
931 | ipx       IPX                 | token-ring  IBM token ring     |
932 | bridge    Bridging            | decnet      DEC net            |
933 | ipv6      IP version 6        |                                |
934 +----------------------------------------------------------------+
936 We will concentrate on IP networking here. As AX25, X.25, and DEC Net
937 are only minor players in the Linux world, we'll skip them in this
938 chapter. You'll find some brief information on Appletalk and IPX further
939 down in sections 3.10 and 3.11. Please look in the online documentation and
940 the kernel source to get a detailed view of the parameters for those
941 protocols. In this section we'll discuss the subdirectories printed in
942 bold letters in the table above. As default values are suitable for
943 most needs, there is no need to change these values.
945 /proc/sys/net/core - Network core options
948 The default setting of the socket receive buffer in bytes.
951 The maximum receive socket buffer size in bytes.
954 The default setting (in bytes) of the socket send buffer.
957 The maximum send socket buffer size in bytes.
959 message_burst and message_cost
960 These parameters are used to limit the warning messages written to
961 the kernel log from the networking code. They enforce a rate limit
962 so that the log cannot be flooded as part of a denial-of-service
963 attack. The higher the message_cost factor is, the fewer messages
964 will be written. Message_burst controls when messages will be dropped. The
965 default settings limit warning messages to one every five seconds.
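One common way to build such a limiter is a token bucket, sketched below
in Python. Whether the kernel uses exactly this scheme is an assumption
here; the class name, the use of seconds as the unit, and the sample
values (cost 5, burst 10) are illustrative and chosen only to match the
"one message every five seconds" behavior described above.

```python
class MessageLimiter:
    """Token bucket: tokens accrue with time, each message costs some."""

    def __init__(self, cost, burst):
        self.cost = cost        # tokens consumed by one message
        self.burst = burst      # bucket capacity (limits a burst)
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the previous call, in seconds

    def allow(self, now):
        """Return True if a message may be logged at time `now`."""
        # Refill one token per elapsed second, up to the capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last))
        self.last = now
        if self.tokens >= self.cost:
            self.tokens -= self.cost
            return True
        return False            # message is dropped


limiter = MessageLimiter(cost=5, burst=10)
print([limiter.allow(0.0) for _ in range(3)])  # two pass, the third is dropped
```

In this sketch a larger cost lowers the sustained message rate, while a
larger burst lets more messages through before dropping starts.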
968 Maximum number of packets queued on the input side when the interface
969 receives packets faster than the kernel can process them.
972 Maximum ancillary buffer size allowed per socket. Ancillary data is
973 a sequence of struct cmsghdr structures with appended data.
975 /proc/sys/net/unix - Parameters for UNIX domain sockets
977 There are only two files in this subdirectory. They control the delays
978 for deleting and destroying socket descriptors.
980 3.9 /proc/sys/net/ipv4 - IPV4 settings
982 IP version 4 is still the most used protocol in Unix networking. It
983 will be replaced by IP version 6 in the next couple of years, but for
984 the moment it's the de facto standard for the internet and is used in
985 most networking environments around the world. Because of the
986 importance of this protocol, we'll have a deeper look into the subtree
987 controlling the behavior of the IPv4 subsystem of the Linux kernel.
989 Let's start with the entries in /proc/sys/net/ipv4 itself.
993 icmp_echo_ignore_all and icmp_echo_ignore_broadcasts
994 Set to 1 to make the kernel ignore all ICMP ECHO requests, or just
995 those sent to broadcast and multicast addresses; set to 0 to answer them.
997 Please note that if you accept ICMP echo requests with a
998 broadcast/multicast destination address your network may be used
999 as an exploder for denial of service packet flooding attacks to
1002 icmp_destunreach_rate, icmp_echoreply_rate,
1003 icmp_paramprob_rate and icmp_timeexceed_rate
1004 Sets limits for sending ICMP packets to specific targets. A value of
1005 zero disables all limiting. Any positive value sets the maximum
1006 packet rate in hundredths of a second (on Intel systems).
1011 This file contains 1 if the host got its IP configuration by
1012 RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is 0.
1015 TTL (Time To Live) for IPv4 interfaces. This is simply the
1016 maximum number of hops a packet may travel.
1019 Enable dynamic socket address rewriting on interface address change. This
1020 is useful for dial-up interfaces with changing IP addresses.
1023 Enable or disable forwarding of IP packets between interfaces.
1024 Changing this value resets all other parameters to their defaults,
1025 which depend on whether the kernel is configured as host or router.
1028 Range of ports used by TCP and UDP to choose the local
1029 port. Contains two numbers: the first is the lowest port,
1030 the second the highest local port. Default is 1024-4999.
1031 Should be changed to 32768-61000 for high-usage systems.
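To get a feel for the difference between the two ranges, the following
Python sketch parses a "low high" pair and counts the available local
ports. It assumes the file holds two whitespace-separated integers; the
function names are invented for this example.

```python
def parse_port_range(text):
    """Parse 'low high' into a validated (low, high) tuple."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError("invalid port range: %r" % text)
    return low, high


def range_size(text):
    """Number of local ports available in the given range."""
    low, high = parse_port_range(text)
    return high - low + 1


print(range_size("1024 4999"))     # default range
print(range_size("32768 61000"))   # range suggested for busy systems
```

The default range offers 3976 local ports, the larger one 28233, which
is why the latter is preferable on machines that open many outgoing
connections.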
1034 Global switch to turn path MTU discovery off. It can also be set
1035 on a per socket basis by the applications or on a per route
1039 Enable/disable debugging of IP masquerading.
1042 IP fragmentation settings
1044 ipfrag_high_thresh and ipfrag_low_thresh
1045 Maximum memory used to reassemble IP fragments. When
1046 ipfrag_high_thresh bytes of memory is allocated for this purpose,
1047 the fragment handler will toss packets until ipfrag_low_thresh is
1052 Time in seconds to keep an IP fragment in memory.
1056 tcp_retrans_collapse
1057 Bug-to-bug compatibility with some broken printers. On retransmit
1058 try to send bigger packets to work around bugs in certain TCP
1059 stacks. Can be turned off by setting it to zero.
1061 tcp_keepalive_probes
1062 Number of keep alive probes TCP sends out, until it decides that the
1063 connection is broken.
1066 How often TCP sends out keep alive messages, when keep alive is
1067 enabled. The default is 2 hours.
1070 Number of times initial SYNs for a TCP connection attempt will be
1071 retransmitted. Should not be higher than 255. This applies only
1072 to outgoing connections; for incoming connections the
1073 number of retransmits is defined by tcp_retries1.
1076 Enable selective acknowledgments as defined in RFC2018.
1079 Enable timestamps as defined in RFC1323.
1082 Enable the strict RFC793 interpretation of the TCP urgent pointer
1083 field. The default is to use the BSD compatible interpretation
1084 of the urgent pointer pointing to the first byte after the urgent
1085 data. The RFC793 interpretation is to have it point to the last
1086 byte of urgent data. Enabling this option may lead to
1087 interoperability problems. Disabled by default.
1090 Only valid when the kernel was compiled with
1091 CONFIG_SYNCOOKIES. Send out syncookies when the syn backlog queue
1092 of a socket overflows. This is to protect against the common 'syn
1093 flood attack'. Disabled by default.
1095 Note that the concept of a socket backlog is abandoned. This
1096 means the peer may not receive reliable error messages from an
1097 overloaded server with syncookies enabled.
1100 Enable window scaling as defined in RFC1323.
1103 How many seconds to wait for a final FIN before the socket is
1104 closed unconditionally. This is strictly a violation of the TCP
1105 specification, but required to prevent denial-of-service attacks.
1108 How many keepalive probes are sent per slow timer run. Shouldn't be
1109 set too high to prevent bursts.
1112 Length of the per socket backlog queue. Since Linux 2.2 the backlog
1113 specified in listen(2) only specifies the length of the backlog
1114 queue of already established sockets. When more connection requests
1115 arrive Linux starts to drop packets. When syncookies are enabled
1116 the packets are still answered and the maximum queue is effectively
1120 Defines how often an answer to a TCP connection request is
1121 retransmitted before giving up.
1124 Defines how often a TCP packet is retransmitted before giving up.
1126 Interface specific settings
1128 In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory
1129 for each interface the system knows about and one directory called
1130 all. Changes in the all subdirectory affect all interfaces, whereas
1131 changes in the other subdirectories affect only one interface.
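A small Python helper can make this layout visible by collecting the
value of one entry from every subdirectory. This is a sketch only: the
function name is made up, and the root parameter exists so the helper
can be tried against a test directory instead of the real /proc tree.

```python
from pathlib import Path


def read_conf_entry(entry, root="/proc/sys/net/ipv4/conf"):
    """Return {subdirectory name: value} for one entry.

    Subdirectories that lack the entry are skipped; a missing root
    directory yields an empty dictionary.
    """
    values = {}
    base = Path(root)
    if not base.is_dir():
        return values
    for subdir in sorted(base.iterdir()):
        path = subdir / entry
        if path.is_file():
            values[subdir.name] = path.read_text().strip()
    return values
```

Called with the name of any entry described below, it returns one value
for all and one for each interface, which makes it easy to spot an
interface whose setting differs from the rest.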
1133 All directories have the same entries:
1136 This switch decides if the kernel accepts ICMP redirect messages
1137 or not. The default is 'yes', if the kernel is configured for a
1138 regular host; and 'no' for a router configuration.
1141 Whether source-routed packets should be accepted or declined. The
1142 default is dependent on the kernel configuration. It's 'yes' for
1143 routers and 'no' for hosts.
1146 Accept packets with source address 0.b.c.d destined not to this
1147 host as local ones. The BOOTP relay daemon is expected to
1148 catch and forward such packets.
1150 The default is 'no', as this feature is not implemented yet
1151 (kernel version 2.2.0-pre?).
1154 Enable or disable IP forwarding on this interface.
1157 Log packets with source addresses with no known route to kernel log.
1160 Do multicast routing. The kernel needs to be compiled with
1161 CONFIG_MROUTE and a multicast routing daemon is required.
1164 Do (1) or don't (0) do proxy ARP.
1167 Integer value deciding if source validation should be made.
1168 1 means yes, 0 means no. Disabled by default, but protection
1169 against local/broadcast address spoofing is always on.
1171 If you set this to 1 on a router that is the only connection
1172 for a network to the net, it evidently prevents spoofing attacks
1173 against your internal networks (external addresses can still be
1174 spoofed), without the need for additional firewall rules.
1177 Accept ICMP redirect messages only for gateways listed in the
1178 default gateway list. Enabled by default.
1181 If it is not set the kernel does not assume that different subnets
1182 on this device can communicate directly. Default setting is 'yes'.
1185 Determines whether to send ICMP redirects to other hosts.
1190 The directory /proc/sys/net/ipv4/route contains several files to
1191 control routing issues.
1193 error_burst and error_cost
1194 These parameters are used to limit the warning messages written to
1195 the kernel log from the routing code. The higher the error_cost
1196 factor is, the fewer messages will be written. Error_burst controls
1197 when messages will be dropped. The default settings limit warning
1198 messages to one every five seconds.
1201 Writing to this file results in a flush of the routing cache.
1203 gc_elastic, gc_interval, gc_min_interval, gc_thresh, gc_timeout
1204 Values to control the frequency and behavior of the garbage
1205 collection algorithm for the routing cache.
1208 Maximum size of the routing cache. Old entries will be purged
1209 once the cache has this size.
1211 max_delay, min_delay
1212 Delays for flushing the routing cache.
1214 redirect_load, redirect_number
1215 Factors which determine if more ICMP redirects should be sent to
1216 a specific host. No redirects will be sent once the load limit or
1217 the maximum number of redirects has been reached.
1221 Timeout for redirects. After this period redirects will be sent
1222 again, even if sending was stopped because the load or number
1223 limit had been reached.
1225 Network Neighbor handling
1227 Settings about how to handle connections with direct neighbors (nodes
1228 attached to the same link) can be found in the directory
1229 /proc/sys/net/ipv4/neigh.
1231 As we saw in the conf directory, there is a default subdirectory
1232 which holds the default values, and one directory for each
1233 interface. The contents of the directories are identical, with the
1234 single exception that the default settings contain additional options
1235 to set garbage collection parameters.
1237 In the interface directories you'll find the following entries:
1240 A base value used for computing the random reachable time value
1241 as specified in RFC2461.
1244 The time, expressed in jiffies (1/100 sec), between retransmitted
1245 Neighbor Solicitation messages. Used for address resolution and to
1246 determine if a neighbor is unreachable.
1249 Maximum queue length for a pending ARP request - how many packets
1250 are accepted from other layers while the ARP address is still
1254 Maximum for random delay of answers to neighbor solicitation
1255 messages in jiffies (1/100 sec). Not yet implemented (Linux does
1256 not have anycast support yet).
1259 Maximum number of retries for unicast solicitation.
1262 Maximum number of retries for multicast solicitation.
1264 delay_first_probe_time
1265 Delay for the first time probe if the neighbor is reachable. (see
1269 An ARP/neighbor entry is only replaced with a new one if the old
1270 is at least locktime old. This prevents ARP cache thrashing.
1273 Maximum time (real time is random [0..proxytime]) before
1274 answering an ARP request for which we have a proxy ARP entry.
1275 In some cases, this is used to prevent network flooding.
1278 Maximum queue length of the delayed proxy ARP timer (see
1282 Determines the number of requests to send to the user-level ARP
1283 daemon. Set to 0 to turn this off.
1286 Determines how often to check for stale ARP entries. After an ARP
1287 entry is stale it will be resolved again (useful when an IP address
1288 migrates to another machine). When ucast_solicit is > 0 it first
1289 tries to send an ARP packet directly to the known host. When that
1290 fails and mcast_solicit is > 0, an ARP request is broadcast.
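The order of these attempts can be written down as a short Python
sketch. It assumes that all unicast attempts are made before the first
broadcast and that every attempt fails; the function name and the
sample values are hypothetical and are not the kernel defaults.

```python
def probe_plan(ucast_solicit, mcast_solicit):
    """List the probes sent for a stale entry if none is answered."""
    plan = []
    # First ask the last known host directly, if unicast probes are enabled.
    plan.extend(["unicast"] * max(ucast_solicit, 0))
    # Only then fall back to broadcasting an ARP request.
    plan.extend(["broadcast"] * max(mcast_solicit, 0))
    return plan


print(probe_plan(2, 1))  # ['unicast', 'unicast', 'broadcast']
print(probe_plan(0, 3))  # ['broadcast', 'broadcast', 'broadcast']
```

Setting ucast_solicit to 0 in this sketch skips the direct probes
entirely, so a stale entry is re-resolved by broadcast right away.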
1294 The /proc/sys/net/appletalk directory holds the Appletalk
1295 configuration data when Appletalk is loaded. The configurable
1299 The amount of time we keep an AARP entry before expiring
1300 it. Used to age out old hosts.
1303 The amount of time we will spend trying to resolve an Appletalk
1306 aarp-retransmit-limit
1307 The number of times we will retransmit a query before giving up.
1310 Controls the rate at which expiries are checked.
1313 The directory /proc/net/appletalk holds the list of active appletalk
1314 sockets on a machine.
1316 The fields indicate the DDP type, the local address (in network:node
1317 format), the remote address, the size of the transmit pending queue,
1318 the size of the received queue (bytes waiting for applications to
1319 read), the state and the uid owning the socket.
1321 /proc/net/atalk_iface lists all the interfaces configured for
1322 Appletalk. It shows the name of the interface, its Appletalk address,
1323 the network range on that address (or network number for phase 1
1324 networks), and the status of the interface.
1326 /proc/net/atalk_route lists each known network route. It lists the
1327 target (network) that the route leads to, the router (may be directly
1328 connected), the route flags, and the device the route is via.
1332 The IPX protocol has no tunable values in /proc/sys/net.
1334 The IPX protocol does, however, provide /proc/net/ipx. This lists each
1335 IPX socket giving the local and remote addresses in Novell format
1336 (that is network:node:port). In accordance with the strange Novell
1337 tradition, everything but the port is in hex. Not_Connected is
1338 displayed for sockets that are not tied to a specific remote
1339 address. The Tx and Rx queue sizes indicate the number of bytes
1340 pending for transmit and receive. The state indicates the state the
1341 socket is in and the uid is the owning uid of the socket.
1343 The /proc/net/ipx_interface file lists all IPX interfaces. For each
1344 interface it gives the network number, the node number, and indicates
1345 if the network is the primary network. It also indicates which
1346 device it is bound to (or Internal for internal networks) and the
1347 Frame Type if appropriate. Linux supports 802.3, 802.2, 802.2 SNAP
1348 and DIX (Blue Book) ethernet framing for IPX.
1350 The /proc/net/ipx_route table holds a list of IPX routes. For each
1351 route it gives the destination network, the router node (or Directly)
1352 and the network address of the router (or Connected) for internal