.\" This manpage is copyright (C) 2001 Paul Sheer.
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.\" very minor changes, aeb
.\"
.\" Modified 5 June 2002, Michael Kerrisk <mtk.manpages@gmail.com>
.\" 2006-05-13, mtk, removed much material that is redundant with select.2
.\"             various other changes
.\" 2008-01-26, mtk, substantial changes and rewrites
.\"
.TH SELECT_TUT 2 2021-03-22 "Linux" "Linux Programmer's Manual"
.SH NAME
select, pselect \- synchronous I/O multiplexing
.SH DESCRIPTION
The
.BR select ()
and
.BR pselect ()
system calls are used to efficiently monitor multiple file descriptors,
to see if any of them is, or becomes, "ready";
that is, to see whether I/O becomes possible,
or an "exceptional condition" has occurred on any of the file descriptors.
.PP
This page provides background and tutorial information
on the use of these system calls.
For details of the arguments and semantics of
.BR select ()
and
.BR pselect (),
see
.BR select (2).
.SS Combining signal and data events
.BR pselect ()
is useful if you are waiting for a signal as well as
for file descriptor(s) to become ready for I/O.
Programs that receive signals
normally use the signal handler only to raise a global flag.
The global flag will indicate that the event must be processed
in the main loop of the program.
A signal will cause the
.BR select ()
(or
.BR pselect ())
call to return with \fIerrno\fP set to \fBEINTR\fP.
This behavior is essential so that signals can be processed
in the main loop of the program; otherwise
.BR select ()
would block indefinitely.
.PP
Somewhere
in the main loop will be a conditional to check the global flag.
This raises a question:
what if a signal arrives after the conditional, but before the
.BR select ()
call?
In that case,
.BR select ()
would block indefinitely, even though an event is actually pending.
This race condition is solved by the
.BR pselect ()
call.
This call can be used to set the signal mask to a set of signals
that are to be received only within the
.BR pselect ()
call.
For instance, let us say that the event in question
was the exit of a child process.
Before the start of the main loop, we
would block \fBSIGCHLD\fP using
.BR sigprocmask (2).
Our
.BR pselect ()
call would enable
.B SIGCHLD
by using an empty signal mask.
Our program would look like:
.PP
.EX
static volatile sig_atomic_t got_SIGCHLD = 0;

static void
child_sig_handler(int sig)
{
    got_SIGCHLD = 1;    /* only raise the flag; work is done in main() */
}

int
main(int argc, char *argv[])
{
    sigset_t sigmask, empty_mask;
    struct sigaction sa;
    fd_set readfds, writefds, exceptfds;
    int r;

    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &sigmask, NULL) == \-1) {
        perror("sigprocmask");
        exit(EXIT_FAILURE);
    }

    sa.sa_flags = 0;
    sa.sa_handler = child_sig_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == \-1) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }

    sigemptyset(&empty_mask);

    for (;;) {          /* main loop */
        /* Initialize readfds, writefds, and exceptfds
           before the pselect() call. (Code omitted.) */

        r = pselect(nfds, &readfds, &writefds, &exceptfds,
                    NULL, &empty_mask);
        if (r == \-1 && errno != EINTR) {
            /* Handle error */
        }

        if (got_SIGCHLD) {
            got_SIGCHLD = 0;
            /* Handle signalled event here; e.g., wait() for all
               terminated children. (Code omitted.) */
        }

        /* main body of program */
    }
}
.EE
.PP
So what is the point of
.BR select ()?
Can't I just read and write to my file descriptors whenever I want?
The point of
.BR select ()
is that it watches
multiple descriptors at the same time and properly puts the process to
sleep if there is no activity.
UNIX programmers often find
themselves in a position where they have to handle I/O from more than one
file descriptor where the data flow may be intermittent.
If you were to merely create a sequence of
.BR read (2)
and
.BR write (2)
calls, you would
find that one of your calls may block waiting for data from/to a file
descriptor, while another file descriptor is unused though ready for I/O.
.BR select ()
efficiently copes with this situation.
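.PP
As a minimal sketch of this idea (not part of the original page),
a helper could wait on two descriptors at once and then read from
whichever one is ready, instead of blocking in a read on one of them
while the other has data pending.
The name read_from_either() and its parameters are hypothetical.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: sleep until fd_a or fd_b has input, then read
   from whichever is ready (fd_a is preferred if both are).
   Returns the result of read(), or -1 if select() fails. */
ssize_t
read_from_either(int fd_a, int fd_b, char *buf, size_t len)
{
    fd_set readfds;
    int nfds = (fd_a > fd_b ? fd_a : fd_b) + 1;

    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);

    /* No timeout: the process sleeps until a descriptor is ready. */
    if (select(nfds, &readfds, NULL, NULL, NULL) == -1)
        return -1;

    if (FD_ISSET(fd_a, &readfds))
        return read(fd_a, buf, len);
    return read(fd_b, buf, len);
}
```

.PP
A caller that receives signals would also need to check for EINTR
when this helper returns \-1.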
.PP
Many people who try to use
.BR select ()
come across behavior that is
difficult to understand and produces nonportable or borderline results.
For instance, the example program below is carefully written not to
block at any point, even though it does not set its file descriptors to
nonblocking mode.
It is easy to introduce
subtle errors that will remove the advantage of using
.BR select (),
so here is a list of essentials to watch for when using
.BR select ().
.TP 4
1.
You should always try to use
.BR select ()
without a timeout.
Your program
should have nothing to do if there is no data available.
Code that
depends on timeouts is not usually portable and is difficult to debug.
.TP
2.
The value \fInfds\fP must be properly calculated for efficiency as
the highest-numbered file descriptor in any of the three sets, plus 1.
.TP
3.
Do not add a file descriptor to any set unless you intend
to check its result after the
.BR select ()
call, and respond appropriately.
.TP
4.
After
.BR select ()
returns, all file descriptors in all sets
should be checked to see if they are ready.
.TP
5.
The functions
.BR read (2),
.BR recv (2),
.BR write (2),
and
.BR send (2)
do \fInot\fP necessarily read/write the full amount of data
that you have requested.
If they do read/write the full amount, it's
because you have a low traffic load and a fast stream.
This is not always going to be the case.
You should cope with the case of your
functions managing to send or receive only a single byte.
.TP
6.
Never read/write only in single bytes at a time unless you are really
sure that you have a small amount of data to process.
It is extremely
inefficient not to read/write as much data as you can buffer each time.
The buffers in the example below are 1024 bytes although they could
easily be made larger.
.TP
7.
Calls to
.BR read (2),
.BR recv (2),
.BR write (2),
.BR send (2),
and
.BR select ()
can fail with the error
.BR EINTR ,
and calls to
.BR read (2),
.BR recv (2),
.BR write (2),
and
.BR send (2)
can fail with
.I errno
set to \fBEAGAIN\fP (\fBEWOULDBLOCK\fP).
These results must be properly managed
(the example below does not do this properly).
If your program is not going to receive any signals, then
it is unlikely you will get \fBEINTR\fP.
If your program does not set nonblocking I/O,
you will not get \fBEAGAIN\fP.
.\" Nonetheless, you should still cope with these errors for completeness.
.TP
8.
Never call
.BR read (2),
.BR recv (2),
.BR write (2),
or
.BR send (2)
with a buffer length of zero.
.TP
9.
If the functions
.BR read (2),
.BR recv (2),
.BR write (2),
and
.BR send (2)
fail with errors other than those listed in \fB7.\fP,
or one of the input functions returns 0, indicating end of file,
then you should \fInot\fP pass that file descriptor to
.BR select ()
again.
In the example below,
I close the file descriptor immediately, and then set it to \-1
to prevent it being included in a set.
.TP
10.
The timeout value must be initialized with each new call to
.BR select (),
since some operating systems modify the structure.
.BR pselect ()
however does not modify its timeout structure.
.TP
11.
Since
.BR select ()
modifies its file descriptor sets,
if the call is being used in a loop,
then the sets must be reinitialized before each call.
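.PP
The last two rules (reinitialize the timeout and the descriptor sets
before every call) can be sketched as follows.
This is an illustration under assumptions, not code from the original
page: the helper name read_with_retries(), the 100 ms timeout, and the
use of ETIMEDOUT to report a timeout are all hypothetical choices, and
the timeout is present only to show the reinitialization (the first
rule still recommends avoiding timeouts where possible).

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to max_tries times, 100 ms each, for fd
   to become readable, then read from it.
   Returns the result of read(); returns -1 with errno set to ETIMEDOUT
   if every wait timed out, or -1 if select() fails. */
ssize_t
read_with_retries(int fd, char *buf, size_t len, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        fd_set readfds;
        struct timeval tv;
        int r;

        /* select() overwrites the set, so rebuild it every time. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        /* Some systems modify the timeout, so reset it every time. */
        tv.tv_sec = 0;
        tv.tv_usec = 100000;

        r = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (r == -1 && errno != EINTR)
            return -1;
        if (r > 0 && FD_ISSET(fd, &readfds))
            return read(fd, buf, len);
        /* Timed out or interrupted: loop and reinitialize. */
    }
    errno = ETIMEDOUT;
    return -1;
}
```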
.\" "I have heard" does not fill me with confidence, and doesn't
.\" belong in a man page, so I've commented this point out.
.\" I have heard that the Windows socket layer does not cope with OOB data
.\" properly.
.\" It also does not cope with
.\" .BR select ()
.\" calls when no file descriptors are set at all.
.\" Having no file descriptors set is a useful
.\" way to sleep the process with subsecond precision by using the timeout.
.\" (See further on.)
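.PP
The rules about partial transfers and about EINTR and EAGAIN
can be combined in one place.
The following is a sketch only, with a hypothetical helper name
write_all(); it assumes the caller wants every byte written before
continuing, which is not how the forwarding example on this page
works (that program keeps per-direction buffers instead).

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd, coping with
   short writes, EINTR, and (for nonblocking descriptors) EAGAIN.
   Returns 0 on success, or -1 on any other failure. */
int
write_all(int fd, const char *buf, size_t len)
{
    size_t written = 0;

    while (written < len) {     /* never calls write() with length 0 */
        ssize_t n = write(fd, buf + written, len - written);

        if (n > 0) {
            written += (size_t) n;  /* may be less than requested */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            fd_set writefds;

            /* Sleep until the descriptor is writable again. */
            FD_ZERO(&writefds);
            FD_SET(fd, &writefds);
            if (select(fd + 1, NULL, &writefds, NULL, NULL) == -1
                    && errno != EINTR)
                return -1;
            continue;
        }
        return -1;  /* other error: stop passing this fd to select() */
    }
    return 0;
}
```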
.PP
Generally speaking,
all operating systems that support sockets also support
.BR select ().
.BR select ()
can be used to solve
many problems in a portable and efficient way that naive programmers try
to solve in a more complicated manner using
threads, forking, IPCs, signals, memory sharing, and so on.
.PP
The
.BR poll (2)
system call has the same functionality as
.BR select (),
and is somewhat more efficient when monitoring sparse
file descriptor sets.
It is nowadays widely available, but historically was less portable than
.BR select ().
.PP
The Linux-specific
.BR epoll (7)
API provides an interface that is more efficient than
.BR select (2)
and
.BR poll (2)
when monitoring large numbers of file descriptors.
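.PP
For comparison, here is a minimal sketch of waiting on two descriptors
with poll(2) rather than select().
The helper name read_from_either_poll() is hypothetical.
The caller passes an array that lists only the descriptors of interest,
which is why sparse sets are cheaper to describe this way.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: sleep until fd_a or fd_b reports an event, then
   read from the first one that did.
   Returns the result of read(), or -1 if poll() fails. */
ssize_t
read_from_either_poll(int fd_a, int fd_b, char *buf, size_t len)
{
    struct pollfd fds[2] = {
        { .fd = fd_a, .events = POLLIN },
        { .fd = fd_b, .events = POLLIN },
    };

    /* A timeout of -1 means wait indefinitely. */
    if (poll(fds, 2, -1) == -1)
        return -1;

    /* revents may also carry POLLHUP or POLLERR; read() then reports
       end of file or the error for that descriptor. */
    for (int i = 0; i < 2; i++)
        if (fds[i].revents != 0)
            return read(fds[i].fd, buf, len);
    return -1;
}
```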
.PP
Here is an example that better demonstrates the true utility of
.BR select ().
The listing below is a TCP forwarding program that forwards
from one TCP port to another.
.PP
.EX
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
static int forward_port;
#define max(x,y) ((x) > (y) ? (x) : (y))
listen_socket(int listen_port)
    struct sockaddr_in addr;
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR,
            &yes, sizeof(yes)) == \-1) {
        perror("setsockopt");
    memset(&addr, 0, sizeof(addr));
    addr.sin_port = htons(listen_port);
    addr.sin_family = AF_INET;
    if (bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == \-1) {
    printf("accepting connections on port %d\en", listen_port);
connect_socket(int connect_port, char *address)
    struct sockaddr_in addr;
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&addr, 0, sizeof(addr));
    addr.sin_port = htons(connect_port);
    addr.sin_family = AF_INET;
    if (!inet_aton(address, (struct in_addr *) &addr.sin_addr.s_addr)) {
        fprintf(stderr, "inet_aton(): bad IP address format\en");
    if (connect(cfd, (struct sockaddr *) &addr, sizeof(addr)) == \-1) {
        shutdown(cfd, SHUT_RDWR);
#define SHUT_FD1 do { \e
        shutdown(fd1, SHUT_RDWR); \e
#define SHUT_FD2 do { \e
        shutdown(fd2, SHUT_RDWR); \e
#define BUF_SIZE 1024
main(int argc, char *argv[])
    int fd1 = \-1, fd2 = \-1;
    char buf1[BUF_SIZE], buf2[BUF_SIZE];
    int buf1_avail = 0, buf1_written = 0;
    int buf2_avail = 0, buf2_written = 0;
        fprintf(stderr, "Usage\en\etfwd <listen\-port> "
                "<forward\-to\-port> <forward\-to\-ip\-address>\en");
    signal(SIGPIPE, SIG_IGN);
    forward_port = atoi(argv[2]);
    h = listen_socket(atoi(argv[1]));
        fd_set readfds, writefds, exceptfds;
        if (fd1 > 0 && buf1_avail < BUF_SIZE)
            FD_SET(fd1, &readfds);
            /* Note: nfds is updated below, when fd1 is added to
               exceptfds. */
        if (fd2 > 0 && buf2_avail < BUF_SIZE)
            FD_SET(fd2, &readfds);
        if (fd1 > 0 && buf2_avail \- buf2_written > 0)
            FD_SET(fd1, &writefds);
        if (fd2 > 0 && buf1_avail \- buf1_written > 0)
            FD_SET(fd2, &writefds);
            FD_SET(fd1, &exceptfds);
            nfds = max(nfds, fd1);
            FD_SET(fd2, &exceptfds);
            nfds = max(nfds, fd2);
        ready = select(nfds + 1, &readfds, &writefds, &exceptfds, NULL);
        if (ready == \-1 && errno == EINTR)
        if (FD_ISSET(h, &readfds)) {
            struct sockaddr_in client_addr;
            addrlen = sizeof(client_addr);
            memset(&client_addr, 0, addrlen);
            fd = accept(h, (struct sockaddr *) &client_addr, &addrlen);
                buf1_avail = buf1_written = 0;
                buf2_avail = buf2_written = 0;
                fd2 = connect_socket(forward_port, argv[3]);
                    printf("connect from %s\en",
                            inet_ntoa(client_addr.sin_addr));
                /* Skip any events on the old, closed file
                   descriptors. */
        /* NB: read OOB data before normal reads. */
        if (fd1 > 0 && FD_ISSET(fd1, &exceptfds)) {
            nbytes = recv(fd1, &c, 1, MSG_OOB);
                send(fd2, &c, 1, MSG_OOB);
        if (fd2 > 0 && FD_ISSET(fd2, &exceptfds)) {
            nbytes = recv(fd2, &c, 1, MSG_OOB);
                send(fd1, &c, 1, MSG_OOB);
        if (fd1 > 0 && FD_ISSET(fd1, &readfds)) {
            nbytes = read(fd1, buf1 + buf1_avail,
                      BUF_SIZE \- buf1_avail);
                buf1_avail += nbytes;
        if (fd2 > 0 && FD_ISSET(fd2, &readfds)) {
            nbytes = read(fd2, buf2 + buf2_avail,
                      BUF_SIZE \- buf2_avail);
                buf2_avail += nbytes;
        if (fd1 > 0 && FD_ISSET(fd1, &writefds) && buf2_avail > 0) {
            nbytes = write(fd1, buf2 + buf2_written,
                       buf2_avail \- buf2_written);
                buf2_written += nbytes;
        if (fd2 > 0 && FD_ISSET(fd2, &writefds) && buf1_avail > 0) {
            nbytes = write(fd2, buf1 + buf1_written,
                       buf1_avail \- buf1_written);
                buf1_written += nbytes;
        /* Check if write data has caught read data. */
        if (buf1_written == buf1_avail)
            buf1_written = buf1_avail = 0;
        if (buf2_written == buf2_avail)
            buf2_written = buf2_avail = 0;
        /* One side has closed the connection, keep
           writing to the other side until empty. */
        if (fd1 < 0 && buf1_avail \- buf1_written == 0)
        if (fd2 < 0 && buf2_avail \- buf2_written == 0)
.EE
.PP
The above program properly forwards most kinds of TCP connections
including OOB signal data transmitted by \fBtelnet\fP servers.
It handles the tricky problem of having data flow in both directions
simultaneously.
You might think it more efficient to use a
.BR fork (2)
call and devote a process to each stream.
This becomes more tricky than you might suspect.
Another idea is to set nonblocking I/O using
.BR fcntl (2).
This also has its problems because you end up using
inefficient timeouts.
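.PP
For reference, the nonblocking alternative mentioned above might be set
up as in the following sketch, assuming the usual fcntl(2) approach
with the O_NONBLOCK flag.
The helper names set_nonblocking() and read_nowait() are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode.
   Returns 0 on success, or -1 on error. */
int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read without ever sleeping in read().
   Returns the result of read(); a return of -1 with errno set to
   EAGAIN (EWOULDBLOCK) means that no data is available yet. */
ssize_t
read_nowait(int fd, char *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);    /* retry if interrupted */
    return n;
}
```

.PP
When read_nowait() reports EAGAIN, the caller still has to decide when
to try again; doing that by sleeping for a fixed interval is the kind
of timeout the text warns about.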
.PP
The program does not handle more than one connection at a
time, although it could easily be extended to do this with a linked list
of buffers, one for each connection.
At the moment, new
connections cause the current connection to be dropped.
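.PP
One possible shape for such a list is sketched below.
This is an assumption about how the extension could look, not code from
the original page: the structure name conn and the helpers conn_add()
and conn_remove() are hypothetical, while the field names mirror the
variables of the single-connection program.

```c
#include <stdlib.h>

#define BUF_SIZE 1024

/* Hypothetical per-connection state: one node per forwarded
   connection, holding both descriptors and both buffers. */
struct conn {
    int fd1, fd2;
    char buf1[BUF_SIZE], buf2[BUF_SIZE];
    int buf1_avail, buf1_written;
    int buf2_avail, buf2_written;
    struct conn *next;
};

/* Allocate a zeroed connection and push it on the front of *head.
   Returns the new node, or NULL if allocation fails. */
struct conn *
conn_add(struct conn **head, int fd1, int fd2)
{
    struct conn *c = calloc(1, sizeof(*c));

    if (c == NULL)
        return NULL;
    c->fd1 = fd1;
    c->fd2 = fd2;
    c->next = *head;
    *head = c;
    return c;
}

/* Unlink and free c; does nothing if c is not on the list. */
void
conn_remove(struct conn **head, struct conn *c)
{
    for (struct conn **p = head; *p != NULL; p = &(*p)->next) {
        if (*p == c) {
            *p = c->next;
            free(c);
            return;
        }
    }
}
```

.PP
The main loop would then walk the list once to fill the descriptor
sets and compute nfds, and once more after the call returns to service
each connection.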
.\" This man page was written by Paul Sheer.