Non-blocking layered I/O
========================

*[last edited by AOF 24 March 1998 14:15]*

I've recently been working on a long-standing issue regarding NSPR's I/O
model. For a long time I've believed that the non-blocking I/O prevalent
in classic operating systems (e.g., UNIX) was the major deterrent to
having reasonable layered protocols. Now that I have some first-hand
experience, albeit just a silly little test program, I am more convinced
than ever of this truth.

This memo is some of what I think must be done in NSPR's I/O subsystem
to make layered, non-blocking protocols workable. It is just a proposal.
There is an API change.

NSPR 2.0 defines a structure by which one may define I/O layers. Each
layer looks basically like any other in that it still uses a
:ref:`PRFileDesc` as an object identifier, complete with the
**``IOMethods``** table of functions. However, each layer may override
the default behavior of a particular operation to implement other services.
For instance, the experiment at hand is one that implements a little
reliable echo protocol; the client sends *n* bytes, and the same bytes
get echoed back by the server. In the non-layered design of this it is

The goal of the experiment was to put a layer between the client and
the network, and not have the client know about it. This additional
layer is one that, before sending the client's data, must ask permission
from the peer layer to send that many bytes. It imposes an additional
send and response inside of each client-visible send operation. The
receive operations parallel the sends. Before actually receiving real
client data, the layer receives a notification that the other side would
like to send some bytes. The layer is responsible for granting permission
for that data to be sent, then actually receiving the data itself, which
is delivered to the client.

The synchronous form of the layer's operation is straightforward. A
call to receive (:ref:`PR_Recv`) first receives the request to send,
sends (:ref:`PR_Send`) the grant, then receives the actual data
(:ref:`PR_Recv`). All the client of the layer sees is the data coming
in. Similar behavior is observed on the sending side.
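
The synchronous sequence described above can be sketched in plain C. This
is only an illustration under stated assumptions, not NSPR code: ``Wire``,
``Layer``, ``layer_recv`` and ``layer_send`` are hypothetical names, the
"network" is an in-memory buffer, and the 4-byte host-order length used
for the request and the grant is an invented wire format. The peer's
messages are assumed to be already queued, which stands in for a blocking
read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory "wire" standing in for the lower layer. */
typedef struct {
    unsigned char data[256];
    size_t head, tail;
} Wire;

static void wire_put(Wire *w, const void *buf, size_t n) {
    assert(w->tail + n <= sizeof w->data);
    memcpy(w->data + w->tail, buf, n);
    w->tail += n;
}

/* Returns n on success, -1 if fewer than n bytes are queued. */
static int wire_get(Wire *w, void *buf, size_t n) {
    if (w->tail - w->head < n) return -1;
    memcpy(buf, w->data + w->head, n);
    w->head += n;
    return (int)n;
}

/* The intermediate layer: reads from `in`, writes to `out`. */
typedef struct { Wire *in; Wire *out; } Layer;

/* One client-visible receive: request-to-send, grant, then payload. */
static int layer_recv(Layer *l, void *buf, size_t cap) {
    uint32_t want;
    if (wire_get(l->in, &want, sizeof want) < 0) return -1; /* 1. request */
    if (want > cap) return -1;
    wire_put(l->out, &want, sizeof want);                   /* 2. grant   */
    return wire_get(l->in, buf, want);                      /* 3. data    */
}

/* One client-visible send: ask permission, await grant, then payload. */
static int layer_send(Layer *l, const void *buf, uint32_t n) {
    uint32_t granted;
    wire_put(l->out, &n, sizeof n);                         /* 1. request */
    if (wire_get(l->in, &granted, sizeof granted) < 0 || granted != n)
        return -1;                                          /* 2. grant   */
    wire_put(l->out, buf, n);                               /* 3. data    */
    return (int)n;
}
```

The caller of ``layer_recv`` only ever sees the payload; the request and
the grant stay inside the layer, which is the property the experiment was
after.
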

The non-blocking method is not so simple. Any of the I/O operations can
potentially result in an indication that no progress can be made. The
intermediate layers cannot act directly on this information, but must
store the state of the I/O operation until it can be resumed. The method
for determining that an I/O operation can make progress is to call
:ref:`PR_Poll`, indicating what type of progress is desired:
either input or output (or some others). Therein lies the problem.
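
As a rough illustration of what "store the state of the I/O operation
until it can be resumed" might look like, here is a small resumable state
machine for one client-visible receive. Everything in it is hypothetical:
``Lower`` is a mock lower layer whose counters say how many reads or
writes can succeed right now, ``WOULD_BLOCK`` is an invented return code,
and ``Phase`` is just one way a layer could remember where it stopped.

```c
#include <assert.h>
#include <stddef.h>

enum { WOULD_BLOCK = -2 };  /* invented "no progress possible" code */

/* Mock lower layer: counters of operations that can succeed right now. */
typedef struct {
    int reads_ready;
    int writes_ready;
} Lower;

static int lower_recv(Lower *lo) {
    if (lo->reads_ready == 0) return WOULD_BLOCK;
    lo->reads_ready--;
    return 0;
}

static int lower_send(Lower *lo) {
    if (lo->writes_ready == 0) return WOULD_BLOCK;
    lo->writes_ready--;
    return 0;
}

/* Where a client-visible receive currently stands inside the layer. */
typedef enum { WAIT_REQUEST, SEND_GRANT, WAIT_DATA } Phase;

typedef struct {
    Lower *lower;
    Phase phase;   /* saved across calls so the operation can resume */
} Layer;

/* Returns 0 once the payload has been received, WOULD_BLOCK otherwise.
 * On WOULD_BLOCK the phase is kept, so the next call picks up there. */
static int layer_recv_nb(Layer *l) {
    assert(l != NULL && l->lower != NULL);
    for (;;) {
        switch (l->phase) {
        case WAIT_REQUEST:
            if (lower_recv(l->lower) == WOULD_BLOCK) return WOULD_BLOCK;
            l->phase = SEND_GRANT;
            break;
        case SEND_GRANT:
            if (lower_send(l->lower) == WOULD_BLOCK) return WOULD_BLOCK;
            l->phase = WAIT_DATA;
            break;
        case WAIT_DATA:
            if (lower_recv(l->lower) == WOULD_BLOCK) return WOULD_BLOCK;
            l->phase = WAIT_REQUEST;
            return 0;
        }
    }
}

/* True when the stalled step is a write, even though the client
 * asked for a receive. */
static int layer_needs_write(const Layer *l) {
    return l->phase == SEND_GRANT;
}
```

Note that a stalled receive may be waiting on a write (the grant), which
the caller has no way of knowing from the return code alone.
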

The intermediate layer is performing operations of which the client is
unaware. So when the client calls send (:ref:`PR_Send`) and is told
that the operation would block, it is possible that the layer below is
actually doing a receive (:ref:`PR_Recv`). The problem is that the
flag bits passed to :ref:`PR_Poll` reflect only the
client's knowledge and desires. This is further complicated by the fact
that :ref:`PR_Poll` is not layered. That is, each layer does not have
the opportunity to override the behavior. It operates, not on a single
file descriptor (:ref:`PRFileDesc`), but on an arbitrary collection of

Into the picture comes another I/O method, **``poll()``**. Keep in mind
that all I/O methods are those that are part of the I/O methods table
structure (:ref:`PRIOMethods`). These functions are layered, and layers
may, and sometimes must, override their behavior by offering unique
implementations. The **``poll()``** method modifies the semantics of
:ref:`PR_Poll` in two ways: it redefines the
polling bits (i.e., what to poll for), and it indicates that a layer is
already able to make progress in the manner suggested by the polling

The **``poll()``** method is called by :ref:`PR_Poll` while the latter
is building the structure that it passes to the operating system call. The
stack's top layer is called first. Each layer's implementation is
responsible for performing appropriate operations and possibly calling
the next lower layer's **``poll()``** method.
The poll method returns the appropriate flags to assign to
the operating system's call. A layer would compute these based on the
values of the argument **``in_flags``** and possibly some state
maintained by the layer for the particular file descriptor.
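
A minimal sketch of that call chain follows. It does not use NSPR's
headers: ``FileDesc``, ``PollFn``, the ``POLL_READ``/``POLL_WRITE`` values,
the ``pending`` field and ``flags_for_os`` are all hypothetical stand-ins,
and the method signature is only modeled loosely on the description here.
The ``out_flags`` argument is carried along but left at zero. The middle
layer substitutes its own needs for the caller's ``in_flags`` when it has
an internal operation outstanding.

```c
#include <assert.h>
#include <stddef.h>

/* Invented flag values; not NSPR's actual polling constants. */
enum { POLL_READ = 0x1, POLL_WRITE = 0x2 };

typedef struct FileDesc FileDesc;
typedef short (*PollFn)(FileDesc *fd, short in_flags, short *out_flags);

struct FileDesc {
    PollFn    poll;     /* this layer's poll method                   */
    FileDesc *lower;    /* next lower layer, NULL at the bottom       */
    short     pending;  /* layer state: 0, or the flag it must await  */
};

/* Bottom layer: the requested flags go to the OS call unchanged. */
static short bottom_poll(FileDesc *fd, short in_flags, short *out_flags) {
    (void)fd;
    *out_flags = 0;
    return in_flags;
}

/* Middle layer: if an internal step is outstanding, poll for what that
 * step needs rather than for what the caller asked. */
static short handshake_poll(FileDesc *fd, short in_flags, short *out_flags) {
    short wanted = fd->pending ? fd->pending : in_flags;
    assert(fd->lower != NULL);
    return fd->lower->poll(fd->lower, wanted, out_flags);
}

/* Stand-in for the step where the polling routine asks the stack,
 * top layer first, which flags to hand to the operating system. */
static short flags_for_os(FileDesc *top, short in_flags) {
    short out_flags = 0;
    return top->poll(top, in_flags, &out_flags);
}
```

With ``pending`` set to ``POLL_READ``, a caller that asks to poll for write
ends up polling for read, which is exactly the redefinition of the polling
bits that the client cannot perform on its own.
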

Additionally, if the layer has buffered data that will allow the
operation defined by **``in_flags``** to make progress, it will set the
corresponding bits in **``out_flags``**. For instance, if
**``in_flags``** indicates that the client (or higher layer) wishes to
test for read ready and the layer has input data buffered, it would set
the read bits in **``out_flags``**. If that is the case, then it
should also suppress the calling of the next lower layer's
**``poll()``** method and return a value equal to that of