docs/masters/imap-protocol.tex

   1 % vim: spelllang=en spell textwidth=120
   2 \documentclass[trojita]{subfiles}
   3
   4 \begin{document}
   5
   6 \chapter{IMAP Protocol Essentials}
   7 \label{sec:imap-protocol}
   8
   9 This chapter provides a gentle introduction to peculiarities of the IMAP protocol and presents an analysis of how its
  10 users can benefit from the unique protocol features.
  11
  12 \section{IMAP}
  13
  14 The IMAP protocol, as defined by RFC~3501~\cite{rfc3501}, is an internet protocol suitable for managing e-mail folders
  15 and individual messages stored on a remote mail server.  In contrast to the older POP3 protocol~\cite{rfc1939}, IMAP is
  16 actually intended to serve as an {\em access} protocol.  Where a POP3 client would happily download a full message from
  17 the mail server, store it into a local mailbox and perform all further processing locally, the IMAP mode of operation is
  18 much more complicated.  These complications, however, bring a whole slew of new features and interesting applications
  19 along.
  20
  21 For one thing, IMAP presents a single authoritative place storing messages --- that feature alone is a must in today's world
  22 where everyone expects to be able to access mail from their cell phones.  Furthermore, given that all messages
  23 are located on a single place, it is possible to perform efficient server-side operations like searching or sorting over
  24 the whole mail store.  IMAP also makes it possible to access individual message parts like attachments separately,
  25 eliminating the need to download a huge message before reading a short accompanying textual information.  Finally,
  26 advanced servers can recognize clients with limited resources and only present a subset of messages to them.
  27
  28 At the same time, IMAP is an old protocol burdened with many compatibility warts.  Its designers were struggling with
  29 people objecting to novel ideas due to legacy code in their mail implementations.  Over the years, though, various
  30 protocol extensions appeared.  Some of them are extremely useful for contemporary clients, yet cannot be relied upon
  31 as there is no general agreement on what extensions are really crucial, and hence available on most IMAP servers.
  32
  33 The rest of this chapter provides a quick overview of the basic IMAP concepts and how they relate to the usual client's
  34 workflow.  A detailed introduction to the basic IMAP concepts can be found in my bachelor thesis on this topic
  35 \cite[p. 9 - 19]{jkt-bc-thesis}.
  36
  37 \subsection{Basic Features}
  38
  39 An IMAP server exports a set of {\em mailboxes}, folders which contain individual messages (and further mailboxes, if
  40 the server allows that).  Each message can be identified either by its {\em sequence number}, an order in which it
  41 appears in mailbox, or by its {\em UID}.  Sequence numbers are by definition very volatile (deleting the first message
  42 in a mailbox changes sequence numbers of all subsequent messages, for example) while the UIDs provide better chances of
  43 persistence across reconnects.~\footnote{It shall be noted that IMAP does {\em not} guarantee UIDs to be persistent at
  44 all.  The reason behind this decision was to allow IMAP to publish messages from obsolete mail stores which could not
  45 have been extended to support UIDs at all.  Even today, UID changes have to be expected when signalled through {\tt
  46 UIDVALIDITY}.} When the UIDs have to be invalidated for some reason, a per-mailbox integer property {\tt UIDVALIDITY}
  47 is incremented to signal all clients that previously used UIDs are no longer valid.
  48
  49 \subsection{Cache Filing Protocol}
  50
  51 As Mark Crispin, the principal author of the IMAP standard, has to
  52 say~\cite{crispin-imap-cache-filing-1}~\cite{crispin-imap-cache-filing-2}, IMAP is a {\em cache filing protocol}.  That
  53 means that whatever the server thinks about a mailbox state is {\em the truth}, and any state stored on the clients can
  54 be invalidated by the server at any time.  This critical design choice has impact on all further operations.  IMAP
  55 clients which do not anticipate such a behavior~\footnote{Such clients are usually called ``POP3 clients converted to
  56 speak IMAP'' on various IMAP-related mailing
  57 lists.~\cite{shannon-imap-clients-glorified-pop}~\cite{crispin-imap-clients-glorified-pop}} are bound to operate in an
  58 inefficient manner or fail in unexpected scenarios.
  59
  60 The first issue which typically comes up on the {\tt imap-protocol} mailing list is treating UIDs as a persistent
  61 identifier of some kind.  In fact, IMAP guarantees that a triple of (mailbox name, {\tt UIDVALIDITY}, {\tt UID}) will
  62 not refer to {\em any other} message at any time, but there's no guarantee that the very same message, quite possibly in
  63 the same mailbox, will not get another UID in future.~\footnote{People have been trying to solve this issue for quite
  64 some time, but no standardized solution is ready yet.  The recent iterations of these proposals concentrate on providing
  65 a cryptographic hash of a message body, but is far from clear whether doing so would get any traction.  Furthermore, the
  66 hashes are typically too long to serve as the only identifier of a message, so UIDs will definitely be around in
  67 future.}  That said, on reasonable server implementations, the UIDs should not get invalidated too often under normal
  68 circumstances.  Given the IMAP protocol doesn't offer anything else, they are widely used (along with the {\tt
  69 UIDVALIDITY} and when limited to the scope of a single mailbox) as a semi-persistent identification of a message.
  70
  71 UIDs are assigned to the messages in a strictly monotonic sense, i.e. if message $A$ has a sequence number $seq_A$ and
  72 message $B$ has sequence number $seq_B$ such as $seq_A < seq_B$, it is true that $UID_A < UID_B$.  UID numbers are also
  73 guaranteed to never be reused in the same mailbox, unless the {\tt UIDVALIDITY} changes.
  74
  75 Due to the facts described above, virtually any IMAP client which maintains a persistent cache of the downloaded data
  76 uses UIDs to assign the cached data to individual messages.  Such an approach leads to a need to maintain a mapping
  77 between the sequence numbers and the UID numbers of all messages in the mailbox --- upon reconnect, clients have to
  78 recognize whether any messages previously available in the mailbox disappeared, and if they did, the clients should
  79 remove the cached data for these messages.~\footnote{The reader shall be reminded that IMAP is a {\em cache filing
  80 protocol}, i.e. the server is always right about what messages ``are'' in a mailbox and what messages are gone.}  This
  81 is in a strong contrast to the usual POP3 mode of operation where the clients are expected to prune their cache only
  82 based on their local policy, perhaps moving older messages to a designated archive, but definitely not discarding the
  83 retrieved data as soon as the server doesn't present the message any longer.
  84
  85 Furthermore, even during an established session the IMAP server informs about messages being permanently deleted through
  86 the {\tt EXPUNGED} response which contains sequence number only.  Given that the cache is usually addressed by UID, a
  87 caching client shall maintain full UID mapping at any time.
  88
  89 \section{Mailbox Synchronization}
  90 \label{sec:imap-mailbox-sync}
  91
  92 When an IMAP client opens a mailbox, the server provides it with a few data points about the state of the mail store.
  93 Among these data, there's a number representing the total amount of messages in the mailbox through the {\tt EXISTS}
  94 response, the current {\tt UIDVALIDITY} value and, finally, the {\tt UIDNEXT} which represents the lowest UID which the
  95 next arriving message could possibly get assigned.  Please note that the {\tt UIDNEXT} is merely a lower bound of the
  96 future UID; there is no guarantee that a message with such UID would ever exist in the mailbox in future.
  97
  98 Having obtained these three values, the client can perform a few optimizations before it proceeds to fetch an updated
  99 UID mapping from the IMAP server:
 100
 101 \begin{itemize}
 102   \item If the {\tt UIDVALIDITY} has changed, the client is obliged to completely purge any data which it might have
 103     accumulated in its local persistent cache.  This is a hard requirement allowing the server to inform the client that
 104     no state whatsoever can be reused from the previous connections.  In the real world, this situation shall be only
 105     reached under exceptional circumstances (like when migrating to a completely different server implementation, or
 106     after having to restore the data after an inadvertent damage caused by a reckless system
 107     administrator.~\footnote{The IMAP standard nevertheless allows servers to increment the {\tt UIDVALIDITY} upon each
 108     reconnect to accommodate server implementations which are unable to assign persistent UIDs at all.  It shall be
 109     noted that although such servers are {\em compliant} with the IMAP specification, they offer severely limited user
 110     experience and little room for further optimization --- the clients cannot reuse any data from previous connections,
 111     so the overall efficiency is similar to accessing e-mail through the POP3 protocol.}
 112   \item If {\tt UIDNEXT} is not available, the client has to resort to asking for the whole UID mapping from scratch.
 113   \item If the {\tt UIDNEXT} has decreased, the IMAP server exhibits a bug.  This situation is explicitly forbidden by
 114     the IMAP standard.  Trojitá will, nevertheless, try to work in this scenario by purging its cache and continuing as
 115     if no state was cached locally.
 116   \item If the {\tt UIDNEXT} has not changed since the last time the client has opened the mailbox, the IMAP protocol
 117     says that no messages could have been delivered to the mailbox at all.
 118     \begin{itemize}
 119       \item If the {\tt EXISTS} remains constant as well, it is clear that no deletions have taken place. This means
 120         that the cached sequence $\rightarrow$ UID mapping from the last time is directly usable, and the UID syncing
 121         phase of the mailbox synchronization is concluded.
 122       \item Otherwise, if the {\tt EXISTS} has grown, the client is talking to a non-compliant IMAP server which failed
 123         to adjust either {\tt UIDNEXT} or {\tt UIDVALIDITY}, and cannot assume anything about the server's behavior.
 124         Trojitá will gracefully degrade to a complete UID mapping resynchronization.
 125       \item If the {\tt EXISTS} has decreased, one can be sure that some messages have been deleted.  In this situation,
 126         the client has two possible options on how to proceed:
 127         \begin{itemize}
 128           \item One can try to perform a binary search in the list of messages to find the first deleted message and ask
 129             for UIDs of all messages at the subsequent positions.  This is a heuristics which relies on an observation
 130             that it is more likely for users working with big mailboxes to delete messages at the end of the mailbox.
 131             However, each step in this incremental search requires a complete round trip to the IMAP server over a
 132             network; with a mailbox with tens of thousands of messages, this could lead to 17 round trips.  Given that
 133             real-world cellular networks like the GPRS/EDGE infrastructure, unfortunately still common in the Czech
 134             Republic, exhibit the RTT latencies which can often be larger than one second~\cite{gprs-rtt-report}, such
 135             an approach to incremental synchronization of the UID mapping will have severe impact on the total
 136             synchronization time.
 137           \item Another way is to give up on possible bandwidth reduction possibility and to fetch the complete UID
 138             mapping.
 139         \end{itemize}
 140     \end{itemize}
 141   \item If the {\tt UIDNEXT} has grown, some messages might have arrived into the mailbox.  There's no guarantee that
 142     any of them are still present, though, so the clients could use another set of heuristics:
 143     \begin{itemize}
 144       \item If the increase in {\tt EXISTS} is exactly the same as the growth of the {\tt UIDNEXT}, all of the new
 145         arrivals are still present in the mailbox and no message have been expunged since the last time.  The client can
 146         ask only for UIDs of the new arrivals.
 147       \item In any other case, the situation is very similar to a changed {\tt EXISTS} with constant {\tt UIDNEXT} and
 148         the same possible optimization about the binary search might apply.  Alternatively, clients could fetch a
 149         complete UID mapping.
 150     \end{itemize}
 151 \end{itemize}
 152
 153 If the decisions described above suggest that at least a part of the UID mapping shall be updated, an IMAP client can
 154 --- in absence of the optional extensions --- use one of the following ways to update the map.  The first one is through
 155 the generic {\tt FETCH} command:
 156
 157 \begin{minted}{text}
 158   C: y1 UID FETCH 1:* (UID)
 159   S: * 1 FETCH (UID 123)
 160   S: * 2 FETCH (UID 125)
 161   S: * 4 FETCH (UID 127)
 162   S: * 3 FETCH (UID 126)
 163   S: y1 OK Fetched
 164 \end{minted}
 165
 166 This command simply requests the {\tt FETCH} response containing UID for each and every message in the mailbox.  The
 167 sample results show that the received data are in no particular order and demonstrate that the UID range is not
 168 necessarily continuous.  If the heuristics shows that there is just a subset of messages with unknown UIDs, the sequence
 169 range (the {\tt "1:*"} string in the example above) shall be changed to only refer to the relevant subset, like the {\tt
 170 "last\_uidnext:*"}.  It is also possible to request {\tt FLAGS} (which will be described later on) at this point.
 171
 172 Alternatively, the {\tt UID SEARCH} command can be used as follows:
 173
 174 \begin{minted}{text}
 175     C: y1 UID SEARCH UID ALL
 176     S: * SEARCH 123 125 127 126
 177     S: y1 OK search completed
 178 \end{minted}
 179
 180 As one can see, the {\tt SEARCH} response is much more compact.  In practice, the bandwidth saving is slightly lower as
 181 the UID discovery and {\tt FLAGS} synchronization can be merged into a single {\tt FETCH} command, but the overhead is
 182 still at least four bytes for each message in the mailbox,~\footnote{If the {\tt FLAGS} are fetched as well, the real
 183 overhead is just the {\tt "UID<space>"} string --- the number and its trailing space is present in the {\tt SEARCH}
 184 response as well and the overhead of the {\tt FETCH} response format is required for the updated flags anyway.} which
 185 leads to at least 200~kB of useless data on a mailbox with fifty thousands of messages.
 186
 187 \subsection{Message Flags}
 188
 189 I have mentioned message flags when describing the mailbox synchronization.  These flags allow the system and the mail
 190 user to attach a certain state to the messages --- information like whether the message has been read or replied to is
 191 tracked at this level.  Further applications include user-level arbitrary tagging of messages with flags like
 192 ``important'' or ``todo''.
 193
 194 Strictly speaking, asking for message flags of all messages in a mailbox is not necessary, provided the program is
 195 capable of lazy-loading --- flags could, for example, only be fetched for those messages which are immediately visible on
 196 screen (probably with some intelligent preload of items which will likely get to the viewport in near future), avoiding
 197 a potentially expensive operation.  On the other hand, contemporary user agents typically have to display an aggregated
 198 statistics like ``$X$ unread messages, $Y$ total'' to the user.  IMAP certainly has methods for delivering such
 199 statistics, however, the baseline specification's only two ways of conveying that information are through the {\tt
 200 STATUS} command or via an explicit {\tt SEARCH}.  In practice, this design leads to a pressing need to load {\em all}
 201 flags for all messages at the start of the session.
 202
 203 The problem with the {\tt STATUS} command is that it is unfortunately forbidden from being used on an actively selected
 204 mailbox~\cite[p. 43]{rfc3501}.  That makes this command usable for an initial estimate, but prevents further updates --- consider
 205 that an IMAP client has opened a big mailbox and scrolled to the end of the message listing.  Suddenly, an {\tt *
 206 EXPUNGE 3} arrives, informing the client that the third message in the mailbox is now gone.  Because the flags of those
 207 ``older'' messages haven't been loaded in this scenario, the client has no way of knowing whether the number of unread
 208 messages shall be updated.  At this point, the client has no choice but to explicitly ask for message flags of all
 209 messages or conduct a special {\tt SEARCH}.  The {\tt SEARCH} command looking for unread messages (or for any set of
 210 messages tagged with a certain flag, for that matter) can surely be constructed, even the baseline IMAP4rev1 provides a
 211 way of requesting that information.  However, each {\tt SEARCH} only provides the client with an information about one
 212 kind of a particular flag.  It is not an unreasonable idea to design a client with further development in mind, most
 213 notably it might make a lot of sense not to special-case the {\tt {\textbackslash}UnSeen} message flag --- after all, certain
 214 applications will benefit from having access to all messages matching the {\tt \$SubmitPending} flag or those which were
 215 marked as a ``Draft'' by the user, for example.  Unfortunately, statistics about these user-defined flags cannot be
 216 determined via the {\tt STATUS} command and have to be discovered explicitly, either through a lot of separate {\tt
 217 SEARCH} commands, one for each ``interesting'' flag, or via an explicit synchronization through the {\tt FETCH (FLAGS)}
 218 command.
 219
 220 In short, deferring the flag synchronization certainly has some merit, but at the same time, special-casing the {\tt
 221 {\textbackslash}UnSeen} flag for unread messages is not a viable long-term solution.  Given that extensions designed for
 222 speeding up the flags resynchronization exist, Trojitá will always ask for a full flags mapping when synchronizing
 223 through the baseline IMAP profile.
 224
 225 \subsection{Immutable Data}
 226 \label{sec:imap-immutable-data}
 227
 228 In the previous sections, I have spoken about data which have to be resynchronized during each reconnect to the mailbox,
 229 be it message flags or the UIDs.  Other data available through IMAP are, however, immutable by nature.  Examples of
 230 these data are message headers or the individual body parts.
 231
 232 IMAP is pretty unique in allowing its implementors to dissect a MIME message~\cite{rfc2045} into individual body parts.
 233 In practice, this is a very useful feature --- clients can read a short textual body of a message before deciding whether
 234 to download a big binary attachment etc.  On the other hand, it requires servers to include a full MIME parser.  Some of
 235 them, notably the Google's GImap, have been struggling with this requirement for many
 236 years~\cite{gmail-bodystructure-sucks}.
 237
 238 IMAP defines a data structure called {\tt ENVELOPE} containing some of the most interesting headers from the
 239 RFC~2822~\cite{rfc2822} message.  Among them, the {\tt Subject}, {\tt Date}, {\tt From}, {\tt Sender}, {\tt Reply-To},
 240 {\tt To}, {\tt Cc}, {\tt Bcc}, {\tt In-Reply-To} and {\tt Message-Id} are included.  Unfortunately, the {\tt References}
 241 header is missing.~\footnote{The {\tt References} header is useful when the client wants to be as compatible as possible
 242 with the other agents that deal with message threading.  Strictly speaking, the {\tt Message-Id} and {\tt In-Reply-To}
 243 headers are sufficient for some forms of threading, but MUAs should strive to be ``good citizens'' and support the {\tt
 244 References} header as well.}  Even with this omission, the {\tt ENVELOPE} is useful for clients which do not necessarily
 245 have to include RFC~2822-style header parsing code.  However, this usefulness is unfortunately further limited by not
 246 including an RFC~2047~\cite{rfc2047} decoder, so non-ASCII data in fields like senders' human readable names or in the
 247 subject field have to be decoded by the clients.
 248
 249 \section{Protocol Design}
 250
 251 The baseline version of IMAP, as defined in RFC~3501~\cite{rfc3501}, contains a few features which limit its performance
 252 by a fair amount.  One example of these features are IMAP's {\em synchronizing literals}.
 253
 254 Before the client is allowed to send a big amount of data to the server, it has to ask for its explicit permission via a
 255 continuation request.  While such an idea is good on paper (and is probably intended to {\em save bandwidth} by allowing
 256 the server to refuse huge uploads before the client sends them), in reality this leads to rather slow operation because
 257 each transmission requires a full roundtrip over the network.  Fortunately, extensions like the {\tt LITERAL+} (see
 258 \secref{sec:imap-literalplus}) have eliminated this bottleneck.
 259
 260 Another manifestation of a situation which could potentially use an improvement is the protocol's requirement on clients
 261 to accept any response at any time,~\footnote{{\tt ``The client MUST be prepared to accept any response at all
 262 times.''}~\cite[p. 61]{rfc3501}} which is not applied consistently --- the same RFC also mandates that the servers
 263 cannot send {\tt EXPUNGE} when no command is in progress~\cite[p. 72]{rfc3501}.  This particular wording certainly has
 264 merits (it encourages client implementors to {\em really} accept anything at any time) and is required for proper
 265 synchronization --- if {\tt EXPUNGE}s were allowed when no command was in progress and a client issued a {\tt STORE}
 266 command referencing a message through its sequence number, that action could affect a completely different message.
 267 This design is probably required due to the old decision to support addressing through both sequence numbers and UIDs,
 268 but has a side effect of requiring constant polling for mailbox updates.  Again, extensions have emerged (see
 269 \secref{sec:imap-idle}) which try to eliminate this drawback through a special mode.
 270
 271 \subsection{Additional Server-Side Features}
 272
 273 Having all messages available without much effort (or, certainly with much less effort than a client), servers are in a
 274 unique position to make certain operations smoother, faster and more efficient than performing on the client side.
 275
 276 The baseline IMAP specification contains provisions for server-side searching.  Features notably missing, however, are
 277 server-side sorting and conveying information about message threading~\footnote{Message {\em threading} refers to a
 278 mechanism which allows graphical user agents to present related messages together.  Recently re-branded as
 279 ``conversations'', threading is in fact a pretty old idea which builds on the {\tt Message-Id}, {\tt References} and
 280 {\tt In-Reply-To} headers, or, in its crudest form, just on similarity of message subjects.  Threading presents the
 281 human with a tree of messages where each immediate child of a particular node represents a reply to the parent message.}
 282 which are available through optional extensions.
 283
 284 Certain scenarios (like having a cell phone with severely limited resources) could benefit from server-side content
 285 conversion, similar to how the Opera Mobile browser converts images to low-resolution versions for display on the
 286 phone's screen.  An extension for just that exists, but its support is rather scarce among the IMAP servers --- in fact,
 287 the author of this thesis was not able to find a single reasonably-deployed server which would offer such a feature.
 288
 289 At the same time, the overall design of the IMAP protocol is rather promising; it allows executing commands in
 290 parallel through pipelining and even the most basic profile provides enough features to implement a reasonably efficient
 291 client which can maintain its state over reconnects.  It is therefore reasonable to start with the existing state and
 292 try to build upon its solid foundation, improving the whole experience in the process.
 293
 294 \end{document}