Add credits for Marek
[trojita.git] / docs / masters / imap-protocol.tex
blob1eab00a1a75e37f2122b4e18b124158eb8c0a3b0
1 % vim: spelllang=en spell textwidth=120
2 \documentclass[trojita]{subfiles}
4 \begin{document}
6 \chapter{IMAP Protocol Essentials}
7 \label{sec:imap-protocol}
9 This chapter provides a gentle introduction to peculiarities of the IMAP protocol and presents an analysis of how its
10 users can benefit from the unique protocol features.
12 \section{IMAP}
14 The IMAP protocol, as defined by RFC~3501~\cite{rfc3501}, is an internet protocol suitable for managing e-mail folders
15 and individual messages stored on a remote mail server. In contrast to the older POP3 protocol~\cite{rfc1939}, IMAP is
16 actually intended to serve as an {\em access} protocol. Where a POP3 client would happily download a full message from
17 the mail server, store it into a local mailbox and perform all further processing locally, the IMAP mode of operation is
18 much more complicated. These complications, however, bring a whole slew of new features and interesting applications
19 along.
21 For one thing, IMAP presents a single authoritative place storing messages --- that feature alone is a must in today's world
22 where everyone expects to be able to access mail from their cell phones. Furthermore, given that all messages
23 are located on a single place, it is possible to perform efficient server-side operations like searching or sorting over
24 the whole mail store. IMAP also makes it possible to access individual message parts like attachments separately,
25 eliminating the need to download a huge message before reading a short accompanying textual information. Finally,
26 advanced servers can recognize clients with limited resources and only present a subset of messages to them.
28 At the same time, IMAP is an old protocol burdened with many compatibility warts. Its designers were struggling with
29 people objecting to novel ideas due to legacy code in their mail implementations. Over the years, though, various
30 protocol extensions appeared. Some of them are extremely useful for contemporary clients, yet cannot be relied upon
31 as there is no general agreement on what extensions are really crucial, and hence available on most IMAP servers.
33 The rest of this chapter provides a quick overview of the basic IMAP concepts and how they relate to the usual client's
34 workflow. A detailed introduction to the basic IMAP concepts can be found in my bachelor thesis on this topic
35 \cite[p. 9 - 19]{jkt-bc-thesis}.
37 \subsection{Basic Features}
39 An IMAP server exports a set of {\em mailboxes}, folders which contain individual messages (and further mailboxes, if
40 the server allows that). Each message can be identified either by its {\em sequence number}, an order in which it
41 appears in mailbox, or by its {\em UID}. Sequence numbers are by definition very volatile (deleting the first message
42 in a mailbox changes sequence numbers of all subsequent messages, for example) while the UIDs provide better chances of
43 persistence across reconnects.~\footnote{It shall be noted that IMAP does {\em not} guarantee UIDs to be persistent at
44 all. The reason behind this decision was to allow IMAP to publish messages from obsolete mail stores which could not
45 have been extended to support UIDs at all. Even today, UID changes have to be expected when signalled through {\tt
46 UIDVALIDITY}.} When the UIDs have to be invalidated for some reason, a per-mailbox integer property {\tt UIDVALIDITY}
47 is incremented to signal all clients that previously used UIDs are no longer valid.
49 \subsection{Cache Filing Protocol}
51 As Mark Crispin, the principal author of the IMAP standard, has to
52 say~\cite{crispin-imap-cache-filing-1}~\cite{crispin-imap-cache-filing-2}, IMAP is a {\em cache filing protocol}. That
53 means that whatever the server thinks about a mailbox state is {\em the truth}, and any state stored on the clients can
54 be invalidated by the server at any time. This critical design choice has impact on all further operations. IMAP
55 clients which do not anticipate such a behavior~\footnote{Such clients are usually called ``POP3 clients converted to
56 speak IMAP'' on various IMAP-related mailing
57 lists.~\cite{shannon-imap-clients-glorified-pop}~\cite{crispin-imap-clients-glorified-pop}} are bound to operate in an
58 inefficient manner or fail in unexpected scenarios.
60 The first issue which typically comes up on the {\tt imap-protocol} mailing list is treating UIDs as a persistent
61 identifier of some kind. In fact, IMAP guarantees that a triple of (mailbox name, {\tt UIDVALIDITY}, {\tt UID}) will
62 not refer to {\em any other} message at any time, but there's no guarantee that the very same message, quite possibly in
63 the same mailbox, will not get another UID in future.~\footnote{People have been trying to solve this issue for quite
64 some time, but no standardized solution is ready yet. The recent iterations of these proposals concentrate on providing
65 a cryptographic hash of a message body, but is far from clear whether doing so would get any traction. Furthermore, the
66 hashes are typically too long to serve as the only identifier of a message, so UIDs will definitely be around in
67 future.} That said, on reasonable server implementations, the UIDs should not get invalidated too often under normal
68 circumstances. Given the IMAP protocol doesn't offer anything else, they are widely used (along with the {\tt
69 UIDVALIDITY} and when limited to the scope of a single mailbox) as a semi-persistent identification of a message.
71 UIDs are assigned to the messages in a strictly monotonic sense, i.e. if message $A$ has a sequence number $seq_A$ and
72 message $B$ has sequence number $seq_B$ such as $seq_A < seq_B$, it is true that $UID_A < UID_B$. UID numbers are also
73 guaranteed to never be reused in the same mailbox, unless the {\tt UIDVALIDITY} changes.
75 Due to the facts described above, virtually any IMAP client which maintains a persistent cache of the downloaded data
76 uses UIDs to assign the cached data to individual messages. Such an approach leads to a need to maintain a mapping
77 between the sequence numbers and the UID numbers of all messages in the mailbox --- upon reconnect, clients have to
78 recognize whether any messages previously available in the mailbox disappeared, and if they did, the clients should
79 remove the cached data for these messages.~\footnote{The reader shall be reminded that IMAP is a {\em cache filing
80 protocol}, i.e. the server is always right about what messages ``are'' in a mailbox and what messages are gone.} This
81 is in a strong contrast to the usual POP3 mode of operation where the clients are expected to prune their cache only
82 based on their local policy, perhaps moving older messages to a designated archive, but definitely not discarding the
83 retrieved data as soon as the server doesn't present the message any longer.
85 Furthermore, even during an established session the IMAP server informs about messages being permanently deleted through
86 the {\tt EXPUNGED} response which contains sequence number only. Given that the cache is usually addressed by UID, a
87 caching client shall maintain full UID mapping at any time.
89 \section{Mailbox Synchronization}
90 \label{sec:imap-mailbox-sync}
92 When an IMAP client opens a mailbox, the server provides it with a few data points about the state of the mail store.
93 Among these data, there's a number representing the total amount of messages in the mailbox through the {\tt EXISTS}
94 response, the current {\tt UIDVALIDITY} value and, finally, the {\tt UIDNEXT} which represents the lowest UID which the
95 next arriving message could possibly get assigned. Please note that the {\tt UIDNEXT} is merely a lower bound of the
96 future UID; there is no guarantee that a message with such UID would ever exist in the mailbox in future.
98 Having obtained these three values, the client can perform a few optimizations before it proceeds to fetch an updated
99 UID mapping from the IMAP server:
101 \begin{itemize}
102 \item If the {\tt UIDVALIDITY} has changed, the client is obliged to completely purge any data which it might have
103 accumulated in its local persistent cache. This is a hard requirement allowing the server to inform the client that
104 no state whatsoever can be reused from the previous connections. In the real world, this situation shall be only
105 reached under exceptional circumstances (like when migrating to a completely different server implementation, or
106 after having to restore the data after an inadvertent damage caused by a reckless system
107 administrator.~\footnote{The IMAP standard nevertheless allows servers to increment the {\tt UIDVALIDITY} upon each
108 reconnect to accommodate server implementations which are unable to assign persistent UIDs at all. It shall be
109 noted that although such servers are {\em compliant} with the IMAP specification, they offer severely limited user
110 experience and little room for further optimization --- the clients cannot reuse any data from previous connections,
111 so the overall efficiency is similar to accessing e-mail through the POP3 protocol.}
112 \item If {\tt UIDNEXT} is not available, the client has to resort to asking for the whole UID mapping from scratch.
113 \item If the {\tt UIDNEXT} has decreased, the IMAP server exhibits a bug. This situation is explicitly forbidden by
114 the IMAP standard. Trojitá will, nevertheless, try to work in this scenario by purging its cache and continuing as
115 if no state was cached locally.
116 \item If the {\tt UIDNEXT} has not changed since the last time the client has opened the mailbox, the IMAP protocol
117 says that no messages could have been delivered to the mailbox at all.
118 \begin{itemize}
119 \item If the {\tt EXISTS} remains constant as well, it is clear that no deletions have taken place. This means
120 that the cached sequence $\rightarrow$ UID mapping from the last time is directly usable, and the UID syncing
121 phase of the mailbox synchronization is concluded.
122 \item Otherwise, if the {\tt EXISTS} has grown, the client is talking to a non-compliant IMAP server which failed
123 to adjust either {\tt UIDNEXT} or {\tt UIDVALIDITY}, and cannot assume anything about the server's behavior.
124 Trojitá will gracefully degrade to a complete UID mapping resynchronization.
125 \item If the {\tt EXISTS} has decreased, one can be sure that some messages have been deleted. In this situation,
126 the client has two possible options on how to proceed:
127 \begin{itemize}
128 \item One can try to perform a binary search in the list of messages to find the first deleted message and ask
129 for UIDs of all messages at the subsequent positions. This is a heuristics which relies on an observation
130 that it is more likely for users working with big mailboxes to delete messages at the end of the mailbox.
131 However, each step in this incremental search requires a complete round trip to the IMAP server over a
132 network; with a mailbox with tens of thousands of messages, this could lead to 17 round trips. Given that
133 real-world cellular networks like the GPRS/EDGE infrastructure, unfortunately still common in the Czech
134 Republic, exhibit the RTT latencies which can often be larger than one second~\cite{gprs-rtt-report}, such
135 an approach to incremental synchronization of the UID mapping will have severe impact on the total
136 synchronization time.
137 \item Another way is to give up on possible bandwidth reduction possibility and to fetch the complete UID
138 mapping.
139 \end{itemize}
140 \end{itemize}
141 \item If the {\tt UIDNEXT} has grown, some messages might have arrived into the mailbox. There's no guarantee that
142 any of them are still present, though, so the clients could use another set of heuristics:
143 \begin{itemize}
144 \item If the increase in {\tt EXISTS} is exactly the same as the growth of the {\tt UIDNEXT}, all of the new
145 arrivals are still present in the mailbox and no message have been expunged since the last time. The client can
146 ask only for UIDs of the new arrivals.
147 \item In any other case, the situation is very similar to a changed {\tt EXISTS} with constant {\tt UIDNEXT} and
148 the same possible optimization about the binary search might apply. Alternatively, clients could fetch a
149 complete UID mapping.
150 \end{itemize}
151 \end{itemize}
153 If the decisions described above suggest that at least a part of the UID mapping shall be updated, an IMAP client can
154 --- in absence of the optional extensions --- use one of the following ways to update the map. The first one is through
155 the generic {\tt FETCH} command:
157 \begin{minted}{text}
158 C: y1 UID FETCH 1:* (UID)
159 S: * 1 FETCH (UID 123)
160 S: * 2 FETCH (UID 125)
161 S: * 4 FETCH (UID 127)
162 S: * 3 FETCH (UID 126)
163 S: y1 OK Fetched
164 \end{minted}
166 This command simply requests the {\tt FETCH} response containing UID for each and every message in the mailbox. The
167 sample results show that the received data are in no particular order and demonstrate that the UID range is not
168 necessarily continuous. If the heuristics shows that there is just a subset of messages with unknown UIDs, the sequence
169 range (the {\tt "1:*"} string in the example above) shall be changed to only refer to the relevant subset, like the {\tt
170 "last\_uidnext:*"}. It is also possible to request {\tt FLAGS} (which will be described later on) at this point.
172 Alternatively, the {\tt UID SEARCH} command can be used as follows:
174 \begin{minted}{text}
175 C: y1 UID SEARCH UID ALL
176 S: * SEARCH 123 125 127 126
177 S: y1 OK search completed
178 \end{minted}
180 As one can see, the {\tt SEARCH} response is much more compact. In practice, the bandwidth saving is slightly lower as
181 the UID discovery and {\tt FLAGS} synchronization can be merged into a single {\tt FETCH} command, but the overhead is
182 still at least four bytes for each message in the mailbox,~\footnote{If the {\tt FLAGS} are fetched as well, the real
183 overhead is just the {\tt "UID<space>"} string --- the number and its trailing space is present in the {\tt SEARCH}
184 response as well and the overhead of the {\tt FETCH} response format is required for the updated flags anyway.} which
185 leads to at least 200~kB of useless data on a mailbox with fifty thousands of messages.
187 \subsection{Message Flags}
189 I have mentioned message flags when describing the mailbox synchronization. These flags allow the system and the mail
190 user to attach a certain state to the messages --- information like whether the message has been read or replied to is
191 tracked at this level. Further applications include user-level arbitrary tagging of messages with flags like
192 ``important'' or ``todo''.
194 Strictly speaking, asking for message flags of all messages in a mailbox is not necessary, provided the program is
195 capable of lazy-loading --- flags could, for example, only be fetched for those messages which are immediately visible on
196 screen (probably with some intelligent preload of items which will likely get to the viewport in near future), avoiding
197 a potentially expensive operation. On the other hand, contemporary user agents typically have to display an aggregated
198 statistics like ``$X$ unread messages, $Y$ total'' to the user. IMAP certainly has methods for delivering such
199 statistics, however, the baseline specification's only two ways of conveying that information are through the {\tt
200 STATUS} command or via an explicit {\tt SEARCH}. In practice, this design leads to a pressing need to load {\em all}
201 flags for all messages at the start of the session.
203 The problem with the {\tt STATUS} command is that it is unfortunately forbidden from being used on an actively selected
204 mailbox~\cite[p. 43]{rfc3501}. That makes this command usable for an initial estimate, but prevents further updates --- consider
205 that an IMAP client has opened a big mailbox and scrolled to the end of the message listing. Suddenly, an {\tt *
206 EXPUNGE 3} arrives, informing the client that the third message in the mailbox is now gone. Because the flags of those
207 ``older'' messages haven't been loaded in this scenario, the client has no way of knowing whether the number of unread
208 messages shall be updated. At this point, the client has no choice but to explicitly ask for message flags of all
209 messages or conduct a special {\tt SEARCH}. The {\tt SEARCH} command looking for unread messages (or for any set of
210 messages tagged with a certain flag, for that matter) can surely be constructed, even the baseline IMAP4rev1 provides a
211 way of requesting that information. However, each {\tt SEARCH} only provides the client with an information about one
212 kind of a particular flag. It is not an unreasonable idea to design a client with further development in mind, most
213 notably it might make a lot of sense not to special-case the {\tt {\textbackslash}UnSeen} message flag --- after all, certain
214 applications will benefit from having access to all messages matching the {\tt \$SubmitPending} flag or those which were
215 marked as a ``Draft'' by the user, for example. Unfortunately, statistics about these user-defined flags cannot be
216 determined via the {\tt STATUS} command and have to be discovered explicitly, either through a lot of separate {\tt
217 SEARCH} commands, one for each ``interesting'' flag, or via an explicit synchronization through the {\tt FETCH (FLAGS)}
218 command.
220 In short, deferring the flag synchronization certainly has some merit, but at the same time, special-casing the {\tt
221 {\textbackslash}UnSeen} flag for unread messages is not a viable long-term solution. Given that extensions designed for
222 speeding up the flags resynchronization exist, Trojitá will always ask for a full flags mapping when synchronizing
223 through the baseline IMAP profile.
225 \subsection{Immutable Data}
226 \label{sec:imap-immutable-data}
228 In the previous sections, I have spoken about data which have to be resynchronized during each reconnect to the mailbox,
229 be it message flags or the UIDs. Other data available through IMAP are, however, immutable by nature. Examples of
230 these data are message headers or the individual body parts.
232 IMAP is pretty unique in allowing its implementors to dissect a MIME message~\cite{rfc2045} into individual body parts.
233 In practice, this is a very useful feature --- clients can read a short textual body of a message before deciding whether
234 to download a big binary attachment etc. On the other hand, it requires servers to include a full MIME parser. Some of
235 them, notably the Google's GImap, have been struggling with this requirement for many
236 years~\cite{gmail-bodystructure-sucks}.
238 IMAP defines a data structure called {\tt ENVELOPE} containing some of the most interesting headers from the
239 RFC~2822~\cite{rfc2822} message. Among them, the {\tt Subject}, {\tt Date}, {\tt From}, {\tt Sender}, {\tt Reply-To},
240 {\tt To}, {\tt Cc}, {\tt Bcc}, {\tt In-Reply-To} and {\tt Message-Id} are included. Unfortunately, the {\tt References}
241 header is missing.~\footnote{The {\tt References} header is useful when the client wants to be as compatible as possible
242 with the other agents that deal with message threading. Strictly speaking, the {\tt Message-Id} and {\tt In-Reply-To}
243 headers are sufficient for some forms of threading, but MUAs should strive to be ``good citizens'' and support the {\tt
244 References} header as well.} Even with this omission, the {\tt ENVELOPE} is useful for clients which do not necessarily
245 have to include RFC~2822-style header parsing code. However, this usefulness is unfortunately further limited by not
246 including an RFC~2047~\cite{rfc2047} decoder, so non-ASCII data in fields like senders' human readable names or in the
247 subject field have to be decoded by the clients.
249 \section{Protocol Design}
251 The baseline version of IMAP, as defined in RFC~3501~\cite{rfc3501}, contains a few features which limit its performance
252 by a fair amount. One example of these features are IMAP's {\em synchronizing literals}.
254 Before the client is allowed to send a big amount of data to the server, it has to ask for its explicit permission via a
255 continuation request. While such an idea is good on paper (and is probably intended to {\em save bandwidth} by allowing
256 the server to refuse huge uploads before the client sends them), in reality this leads to rather slow operation because
257 each transmission requires a full roundtrip over the network. Fortunately, extensions like the {\tt LITERAL+} (see
258 \secref{sec:imap-literalplus}) have eliminated this bottleneck.
260 Another manifestation of a situation which could potentially use an improvement is the protocol's requirement on clients
261 to accept any response at any time,~\footnote{{\tt ``The client MUST be prepared to accept any response at all
262 times.''}~\cite[p. 61]{rfc3501}} which is not applied consistently --- the same RFC also mandates that the servers
263 cannot send {\tt EXPUNGE} when no command is in progress~\cite[p. 72]{rfc3501}. This particular wording certainly has
264 merits (it encourages client implementors to {\em really} accept anything at any time) and is required for proper
265 synchronization --- if {\tt EXPUNGE}s were allowed when no command was in progress and a client issued a {\tt STORE}
266 command referencing a message through its sequence number, that action could affect a completely different message.
267 This design is probably required due to the old decision to support addressing through both sequence numbers and UIDs,
268 but has a side effect of requiring constant polling for mailbox updates. Again, extensions have emerged (see
269 \secref{sec:imap-idle}) which try to eliminate this drawback through a special mode.
271 \subsection{Additional Server-Side Features}
273 Having all messages available without much effort (or, certainly with much less effort than a client), servers are in a
274 unique position to make certain operations smoother, faster and more efficient than performing on the client side.
276 The baseline IMAP specification contains provisions for server-side searching. Features notably missing, however, are
277 server-side sorting and conveying information about message threading~\footnote{Message {\em threading} refers to a
278 mechanism which allows graphical user agents to present related messages together. Recently re-branded as
279 ``conversations'', threading is in fact a pretty old idea which builds on the {\tt Message-Id}, {\tt References} and
280 {\tt In-Reply-To} headers, or, in its crudest form, just on similarity of message subjects. Threading presents the
281 human with a tree of messages where each immediate child of a particular node represents a reply to the parent message.}
282 which are available through optional extensions.
284 Certain scenarios (like having a cell phone with severely limited resources) could benefit from server-side content
285 conversion, similar to how the Opera Mobile browser converts images to low-resolution versions for display on the
286 phone's screen. An extension for just that exists, but its support is rather scarce among the IMAP servers --- in fact,
287 the author of this thesis was not able to find a single reasonably-deployed server which would offer such a feature.
289 At the same time, the overall design of the IMAP protocol is rather promising; it allows executing commands in
290 parallel through pipelining and even the most basic profile provides enough features to implement a reasonably efficient
291 client which can maintain its state over reconnects. It is therefore reasonable to start with the existing state and
292 try to build upon its solid foundation, improving the whole experience in the process.
294 \end{document}