Add credits for Marek
[trojita.git] / docs / masters / extensions.tex
blob93601322437343986cc4e9366ae4d8c4d71a56c3
1 % vim: spelllang=en spell textwidth=120
2 \documentclass[trojita]{subfiles}
4 \begin{document}
6 \chapter{IMAP Extensions}
7 \label{sec:imap-extensions}
9 It might be concluded from the brief analysis presented in the previous chapter that certain features of the IMAP
10 protocol are rather limiting in its real-world deployment. Fortunately, IMAP was designed to include support for
11 extensions which allow rather substantial changes to its mode of operation. Throughout this chapter, I provide a
12 detailed analysis of opportunities where the baseline IMAP protocol leaves something to be desired. Many such gaps have
13 been addressed by various extensions over the last twenty years; these extensions are thoroughly explained and evaluated
14 on their merits. Information about their support in Trojitá is also included.
16 \section{Optimizing the Protocol}
18 Before dwelling into more advanced topics like improving the synchronization performance or adding new features, let's
19 have a look at the basic layer of the IMAP protocol and investigate how these affect performance.
21 \subsection{The LITERAL+ Extension}
22 \label{sec:imap-literalplus}
24 One of the lowest-hanging optimization fruit to cater are IMAP's synchronizing literals. In the basic IMAP, before a
25 client proceeds with tasks involving upload of binary data (or any data over a certain size, for that matter), it has to
26 ask for an explicit server's approval based on the length of the data in question. As it has been shown previously,
27 this confirmation imposes a full round trip over the network, inducing latency and destroying any potential pipelining
28 improvements.
30 The LITERAL+ extension (RFC~2088~\cite{rfc2088}) simply lifts the requirement of having to wait for the server's
31 continuation requests by a subtle change of the syntax. Adding an overhead of just one byte, the latency is completely
32 eliminated and communication gets rapidly streamlined. One can go as far as to say that the code paths for dealing with
33 LITERAL+ data are actually simpler than having to deal with the old-fashioned synchronizing literals. Consider the
34 following example:
36 \begin{minted}{text}
37 S: * OK
38 C: A001 LOGIN {11}
39 # The client has to wait for server's response before proceeding any further
40 S: + go ahead
41 C: FRED FOOBAR {7}
42 # A second round-trip wait occurs here
43 S: + go ahead
44 C: fat man
45 S: A001 OK LOGIN completed
46 \end{minted}
48 When using the LITERAL+ syntax, the whole interaction happens without having to wait for the server:
50 \begin{minted}{text}
51 S: * OK
52 C: A001 LOGIN {11+}
53 C: FRED FOOBAR {7+}
54 C: fat man
55 S: A001 OK LOGIN completed
56 \end{minted}
58 \begin{trojitabehavior}
59 Trojitá includes full support for the LITERAL+ extension --- when it detects the {\tt LITERAL+} capability, it will
60 immediately switch to using non-synchronizing literals for increased performance.
61 \end{trojitabehavior}
63 \subsection{Data Compression}
65 IMAP is a textual, line-based protocol. As such, it presents extremely good opportunities for compression --- using the
66 tried DEFLATE algorithm~\cite{rfc1951}, the basic IMAP chatter can be easily compressed to 25~-~40~\% of its original
67 size~\cite[p. 4]{rfc4978}. RFC~4978~\cite{rfc4978} provides mechanism for exactly this functionality through the {\tt
68 COMPRESS=DEFLATE} capability.
70 \begin{trojitabehavior}
71 Trojitá ships with full support for this extension through the permissively licensed {\tt zlib} library. Unfortunately,
72 the Qt's {\tt QSslSocket} currently doesn't provide a way to reliably tell whether the SSL connection is already
73 employing compression. When combined with IMAP servers hidden behind SSL accelerators or load balancers (i.e. in
74 situations where the server does not have a clear idea whether the session is already compressed either), this has a
75 risk of needlessly trying to compress data twice. This is a limitation in system libraries which cannot be overcome
76 without resorting to patching system components or conducting non-portable hacks.
77 \end{trojitabehavior}
79 \subsection{Improving Security through Cryptography}
81 RFC~2595~\cite{rfc2595} deals with best practices for establishing SSL/TLS connections to the IMAP server.
83 \begin{trojitabehavior}
84 Trojitá
85 follows these recommendations, most notably it tries to establish a secure channel over {\tt STARTTLS} command even
86 without an explicit action on the user's side, should the server be configured to advertise itself as not accepting
87 logins over insecure connections through the {\tt LOGINDISABLED} capability. A manual override is available in
88 situations where the SSL encryption is not available.
90 In advent of the recent breaches of many well-known (and widely trusted) Certificate Authorities~\cite{ssl-breaches},
91 Trojitá also comes with support for SSL key pinning~\cite{ssl-pinning}. The trust model presented to the user is
92 similar to handling of SSH servers' public keys with OpenSSH --- upon first connection, the user is always presented with
93 a choice of whether to accept the certificate or not, along with a confirmation about whether the operating system and
94 its policy considers the certificate as ``trusted''. No matter what the system-wide policy says, a changed public key
95 is always considered a threat and the situation is presented to the user accordingly.
96 \end{trojitabehavior}
98 \subsection{The IDLE Mode}
99 \label{sec:imap-idle}
101 I have mentioned that even though the protocol requires clients to be ready to accept any responses at any time, in
102 practice, servers are forbidden to send {\tt EXPUNGE}s when no command is in progress. This requirement is necessary to
103 prevent a dangerous resynchronization as the server cannot possibly know whether the client has started to issue an
104 UID-less {\tt STORE} command which references messages through their sequence numbers. Unfortunately, this directly
105 translates to clients having to {\em poll} the server quite often if they care about updates concerning the deleted
106 messages.
108 Any protocol which uses polling looks bad on paper --- having to poll leads to increased latency and higher power usage
109 because the equipment has to actively check for updates every now and then. In contrast, {\em push-based} updates allow
110 the client to enter a low-power state where it merely waits to be woken up when a change occurs. Such a mode is exactly
111 what the IDLE extension defined by RFC~2177~\cite{rfc2177} adds to IMAP. It must, however, be said that real-world
112 concerns related to firewall timeouts and especially the NAT traversal has limited the usefulness of the IDLE command
113 somewhat, even to the extent where Mark Crispin, the original author of the IMAP protocol, claims that ``I see no
114 particular benefit to use of IDLE on a desktop machine''~\cite{crispin-idle-useless} --- a view which is not shared by
115 the wider community~\cite{tss-idle-keepalive} \cite{android-idle}, yet certainly worth a consideration.
117 The IDLE extension is basically a hack on top of the IMAP protocol which reverses a mantra of the basic IMAP
118 specification~\cite[p. 72]{rfc3501}:
120 \begin{quote}
121 A command is not ``in progress'' until the complete command has been received; in particular, a command is not ``in
122 progress'' during the negotiation of command continuation.
123 \end{quote}
125 With IDLE, a typical interaction might look like this one:
127 \begin{minted}{text}
128 C: A004 IDLE
129 S: * 2 EXPUNGE
130 S: * 3 EXISTS
131 S: + idling
132 ...time passes; another client expunges message 3...
133 S: * 3 EXPUNGE
134 S: * 2 EXISTS
135 ...time passes; new mail arrives...
136 S: * 3 EXISTS
137 C: DONE
138 S: A004 OK IDLE terminated
139 C: A005 FETCH 3 ALL
140 S: * 3 FETCH (...)
141 S: A005 OK FETCH completed
142 C: A006 IDLE
143 \end{minted}
145 The whole effect of the {\tt IDLE} command is therefore to indicate to the server that the client is {\em really}
146 willing to listen for any updates to the mailbox state. Because of compatibility concerns with legacy mail stores, the
147 IDLE extension still does {\em not} mandate the server to actually send updates about any changes as soon as they are
148 conducted --- indeed, a server which internally polls every fifteen minutes to check whether a message has arrived is
149 fully compliant with the IDLE extension, albeit rather useless to user who might expect (and, one might add, rightly so)
150 to be {\em instantly} notified about changes to the mailbox.
152 \begin{trojitabehavior}
153 Trojitá includes full support for the IDLE extension and will enter that mode automatically shortly after a mailbox is
154 selected. A simple heuristics is implemented which delays re-entering the IDLE command if it is likely that the
155 connection will be reused for any other purpose in near future, further eliminating needless data transfers.
156 Unfortunately, Trojitá is at the mercy of the IMAP server when it comes to superfluous data transfers, so it cannot
157 prevent the ``pings'' sent even when the connection does not contain a gateway with overly short timeouts.
158 \end{trojitabehavior}
160 \section{Improving Mailbox Synchronization}
162 The previous section dealt with optimizing the overall IMAP protocol as a whole. At this stage, let's have a look at
163 more specific issues which cannot be easily overcome through generic measures like data compression using off-the-shelf
164 algorithms or updates to the basic protocol flows.
166 In the basic IMAP, neither the server nor the client are required to keep any persistent state. Clearly, it is
167 beneficiary for a client to keep downloaded copies of the immutable mailbox/message data (consult
168 \secref{sec:imap-immutable-data} in its persistent cache for some time, should the device constraints allow such a
169 storage. There is still quite a lot of other data which has to be validated while the mailbox is being resynchronized.
170 Consider the following scenario where a mail user agent opens a mailbox with a thousand of messages which has witnessed
171 expunges and new arrivals since the last time it was opened:
173 \begin{minted}{text}
174 C: y1 SELECT foo
175 S: * 1000 EXISTS
176 S: * OK [UIDVALIDITY 12345] UIDs valid
177 S: * OK [UIDNEXT 2345] Next UID
178 S: y1 OK Selected
179 C: y2 UID SEARCH ALL
181 The following response has been shortened for demonstration purposes. In
182 practice, it will have to contain a thousand of numbers.
184 S: * SEARCH 2 4 5 6 7 8 9 10 ... 2290 2310 2311 2312 2333
185 S: y2 OK Search completed
186 C: y3 FETCH 1:1000 (FLAGS)
187 S: * 1 FETCH (FLAGS ())
188 S: * 2 FETCH (FLAGS (\\Seen))
189 S: * 3 FETCH (FLAGS (\\Recent \$Answered))
191 996 additional FETCH responses were omitted from this example for brevity.
193 S: * 1000 FETCH (FLAGS (\\Seen))
194 S: y3 OK fetched
195 \end{minted}
197 Let's identify two steps which substantially contribute to the transferred data:
199 \begin{itemize}
200 \item synchronizing the UIDs,
201 \item updating flags.
202 \end{itemize}
204 The rest of this section takes a look at optimization opportunities at each of these stages. Please keep in mind that
205 some basic optimization heuristics concerning the UID synchronization were discussed in \secref{sec:imap-mailbox-sync}
206 along with reasons on why these steps are necessary in clients willing to maintain an offline cache of immutable data.
208 \subsection{The ESEARCH Extension}
210 As seen in the protocol sample, the {\tt SEARCH} response containing UIDs of all messages in a mailbox can be rather
211 large. At the same time, chances are that at least some of the adjacent messages might have been assigned contiguous
212 UIDs --- this is certainly not a requirement per se, but quite a few IMAP servers internally {\em do} assign UIDs from a
213 per-mailbox counter. Real-world, albeit anecdotal evidence \cite{cridland-uids-are-often-monotonic} indicates that this
214 scenario is very common, and therefore it might make sense to transmit the UIDs of all messages using the {\tt
215 sequence-set} \cite[p. 89]{rfc3501} syntax. The ESEARCH extension, as defined in RFC~4731~\cite{rfc4731}, allows
216 exactly that:
218 \begin{minted}{text}
219 S: * ESEARCH [TAG "y12"] UID 1,3:9,17:25,30:1000
220 \end{minted}
222 At the time of the ESEARCH adoption, the imap-protocol mailing list witnessed a disagreement on how exactly the {\tt
223 sequence-set} shall be interpreted. Mark Crispin, the author of the original IMAP protocol (but not of the ESEARCH
224 extension) implemented ESEARCH in a different manner. He chose to take an advantage of the RFC3501-style definition of
225 UID sequences where the RFC mandates that servers shall treat non-existent UIDs given in sequence sets as if they
226 weren't referenced from the command at all. For example, if the mailbox contained just UIDs 3, 5 and 10, a client using
227 the {\tt 3:10} construct has to be interpreted as if it requested {\tt sequence-set 3,5,10}. Doing so present certain
228 optimization opportunities to the servers, for example when the client already knows the UID mapping and performs a
229 server-side search for messages matching certain criteria {\em and} the result set accurately matches an adjacent range
230 of messages, the server could take advantage of this adjacency a return a {\tt sequence-set} in the form of {\tt
231 10:150}, even though the mailbox contains only a few UIDs from this range~\cite{crispin-esearch-flawed}. Furthermore,
232 his another point is that the clients already have two other ways of obtaining the UID mapping, either through the {\tt
233 UID SEARCH ALL} command or via an explicit {\tt UID FETCH 1:*}. Needless to say, such a reasoning fails to take into
234 account potential bandwidth savings which can be rather substantial on ``reasonable'' mailboxes. In the end, the
235 authors of the RFC 4731 disagreed with Crispin \cite{melnikov-esearch-interpretation}
236 \cite{cridland-esearch-interpretation}.
238 \begin{trojitabehavior}
239 The ESEARCH extension allows nice bandwidth savings, so Trojitá tries to use it if the server says that it is supported.
240 \end{trojitabehavior}
242 In addition to that, format of the returned responses is changed so that it also includes the tag of the command which
243 caused it, allowing much more aggressive pipelining --- for example, clients are free to perform the UID discovery at the
244 same time as running a user-initiated search. On the other hand, even in presence of ESEARCH, the UID mapping still has
245 to be synchronized explicitly. This requirement is only lifted in the QRESYNC extension
246 (\secref{sec:extension-qresync}). Before describing that, though, it is necessary to have a look at the CONDSTORE.
248 \subsection{Avoiding Flags Resynchronization via CONDSTORE}
250 Leaving the UID synchronization alone for a while, let's have a look at various ways of eliminating the need to ask for
251 changed message flags. In this case, no extension trying to reduce the data overhead of the {\tt FETCH} response was
252 proposed, but the problem got attacked from another side.
254 The whole point of flags synchronization is to be able to pick up changes which have happened since the last time the
255 mailbox was selected. If only the server was somehow able to assign a ``serial number'' to each change, clients could
256 subsequently ask for all changes which have happened after a certain point. The CONDSTORE extension from RFC 4551
257 \cite{rfc4551} works in this way.
259 CONDSTORE-capable servers share a concept of ``modification sequence'', a {\tt MODSEQ}. Each message in a mailbox is
260 assigned an unsigned 64bit integer. Whenever message metadata (like its flags) change, the {\tt MODSEQ} of that
261 particular message gets increased. Each increment is also required to reach a value which is higher than {\tt MODSEQ}
262 of any other message in that mailbox. Similarly, a mailbox is assigned a {\tt HIGHESTMODSEQ}, an unsigned 64bit integer
263 which is interpreted as ``no message has ever had a {\tt MODSEQ} higher than this number'' --- of course subject to the
264 usual {\tt UIDVALIDITY} rules.~\footnote{As always, any change in {\tt UIDVALIDITY} directly translates to a full cache
265 flush and discarding any data previously remembered for the affected mailbox. This includes not only the immutable data
266 of messages, but also the UIDs, message flags and --- in this extension --- the {\tt MODSEQ} and {\tt HIGHESTMODSEQ}
267 values.}
269 When a CONDSTORE-capable client opens a mailbox which was previously synced (and if the server supports CONDSTORE as
270 well, of course --- keep in mind that the IMAP extensions are strictly voluntary in their nature), at first it synces the
271 UID mapping as usual, possibly through the ESEARCH command discussed earlier. After that, the client can use an
272 extended variant of the {\tt FETCH} command to ask for flags of those messages whose {\em current} {\tt MODSEQ} is
273 higher than the {\tt HIGHESTMODSEQ} which the client has remembered previously. The server will respond with regular
274 {\tt FETCH} responses for each affected message. In result, after this interaction is completed, the client is aware of
275 all pending flag changes and is fully resynchronized again.
277 This is how a typical synchronization might look like:
279 \begin{minted}{text}
280 C: y1 SELECT foo (CONDSTORE)
281 S: * 1000 EXISTS
282 S: * OK [UIDVALIDITY 12345] UIDs valid
283 S: * OK [UIDNEXT 2345] Next UID
284 S: * OK [HIGHESTMODSEQ 715194045007]
285 S: y1 OK Selected
287 At this point, the client will obtain the UID mapping, likely through the
288 UID SEARCH or its ESEARCH variant:
290 C: y2 UID SEARCH RETURN () ALL
291 S: * ESEARCH (TAG "y2") UID ALL 2,4:10,21:1008,2290,2310:2312,2333
292 S: y2 OK Search completed
294 At this point the CONDSTORE extension can be finally utilized -- only changed
295 flags will be transmitted:
297 C: y3 UID FETCH 1:1000 (FLAGS) (CHANGEDSINCE 613184045007)
298 S: * 997 FETCH (UID 2310 MODSEQ 715194045007 FLAGS (\\Seen \\Deleted))
299 S: * 1000 FETCH (UID 2333 MODSEQ 715194045005 FLAGS (\\Seen))
300 S: y3 OK Fetched
301 \end{minted}
303 The algorithm is race-free --- as every message has a separate {\tt MODSEQ} counter, the delay between the {\tt SELECT}
304 and {\tt FETCH} command doesn't lead to data loss; by the time the {\tt FETCH} completes, the server guarantees that the
305 client has received any pending updates since the last synchronization.
307 The CONDSTORE is an extremely valuable extension; its savings on big mailboxes are predictable and automatic --- instead
308 of having to transmit $O(n)$ responses where $n$ is the number of {\em messages}, only $O(m)$ are required under QRESYNC
309 with $m$ being the number of {\em modifications}. This is an extension which, unfortunately, places a certain burden on
310 the IMAP server which has to track the serial numbers of messages' metadata; however, given the obvious reductions in
311 bandwidth, many servers have already implemented it, most notably the Dovecot and Cyrus open source IMAP servers.
313 \begin{trojitabehavior}
314 Trojitá includes full support for this extension, making use of it whenever it is available.
315 \end{trojitabehavior}
317 \subsection{Optimizing UID Synchronization with QRESYNC}
318 \label{sec:extension-qresync}
320 The CONDSTORE extension discussed earlier has brought in the concept of a server-side state tracking and used that to
321 allow bandwidth-efficient way of synchronizing flag changes. Given that a CONDSTORE-capable server already tracks
322 certain state, it might be worthwhile to somehow extend this state to cover the deleted messages as well. As it turns
323 out, such a mechanism is implemented in the QRESYNC extension which is defined by RFC 5162 \cite{rfc5162}.
325 The basic idea behind QRESYNC is that as long as the UID mapping was fresh at some point in past, it is only necessary
326 to inform the client about which UIDs from that set are no longer there and push out the UIDs of newly arrived messages.
328 The QRESYNC extension modifies the {\tt SELECT command} so that it includes a few more parameters. First of all, the
329 updated version of this command includes a tuple of {\tt (UIDNEXT, HIGHESTMODSEQ)} as known to the client. If the {\tt
330 UIDNEXT} did not change, the server will have a look at the {\tt HIGHESTMODSEQ} value and in addition to essentially
331 behaving as a CONDSTORE server, it also sends out a list of expunged messages.~\footnote{Technically, the expunges are
332 sent out before the information about the updated flags, but that isn't the point here.} A new response is defined for
333 this purpose, the {\tt VANISHED EARLIER}.
335 In format similar to the {\tt ESEARCH} response, the {\tt VANISHED EARLIER} contains a {\tt sequence-set} of UIDs which
336 the server believes that the client considers to be present in the mailbox. Not only are these UIDs transmitted in a
337 compact syntax, thanks to the {\tt sequence-set} format, but the response typically contains only such UIDs which were
338 {\em just} removed. The actual wording (and therefore the implications of this extension) are slightly different --- the
339 server is free to inform the clients about any UIDs, as long as they aren't in the mailbox right now, at the time of the
340 sync. This is motivated by the need to relieve the servers from having to maintain a list of expunged UIDs
341 indefinitely, just in case a QRESYNC-enabled client reconnects after two years of inactivity. When such a situation
342 happens, a server which cannot remember expunges going so far in history has no other option but to send a {\tt
343 VANISHED EARLIER} for {\em all} UIDs lower than the {\tt UIDNEXT}, no matter if they {\em ever} were present in the
344 mailbox. This fallback suggests that the QRESYNC extension could very well have a negative net effect overall, at least
345 in certain pathological situations --- essentially when the list of expunges grows so long that the server decides to
346 prune some of its records.
348 In order to mitigate this issue, a few other options were added to QRESYNC. The first of them is a way of indicating to
349 the server the range of UIDs about which the client actually cares. The idea here is that if the client only cached a
350 subset of messages (for example those with UID higher than 50000), there isn't much point in informing the client about
351 each and every UID which might had been in the mailbox before (like those 49,999 of UIDs lower than 50,000).
353 However, chances are that this optimization is not enough to overcome the danger of having to sync too many UIDs --- and
354 indeed, some user agents might want to preemptively load messages both from the beginning {\em and} from the end of the
355 mailbox in an effort to optimize preloading. Such user agents would not be able to benefit from the ``range of known
356 UIDs'' optimization.
358 Fortunately, the {\tt SELECT QRESYNC} command includes provisions for passing another type of data around --- it is also
359 possible to provide a representative list of $(sequence, UID)$ pairs. Using technique similar to the proposed binary
360 search when discovering UIDs, the client can decide to send along a UID from roughly middle of the mailbox as the first
361 one, followed by another one located at circa 75~\% of the mapping, next one from $7 \over 8$ etc., halving the interval
362 in each step. The concrete strategy to be pursuit is left to the client, as it is basically a policy decision. Using
363 more fine-grained interval means that more data is sent along during each resynchronization without a direct merit (i.e.
364 when the server {\em still} remembers the previous {\tt HIGHESTMODSEQ} and can provide the client with relevant data),
365 while on the other hand sending less UIDs leads to minor data savings during ordinary reconnects while causing
366 potentially huge amounts of data to be transfered when the server is unable to use the --- perhaps very old or otherwise
367 stale --- {\tt HIGHESTMODSEQ}.
369 \begin{trojitabehavior}
370 The presented version of Trojitá always uses halving of sequences, effectively transmitting $log_2(n)$ sequence-UID
371 pairs:
372 \end{trojitabehavior}
374 \begin{minted}{c++}
375 Sequence knownSeq, knownUid;
376 int i = oldUidMap.size() / 2;
377 while (i < oldUidMap.size()) {
378 // Message sequence number is one-based, our indexes are zero-based
379 knownSeq.add(i + 1);
380 knownUid.add(oldUidMap[i]);
381 i += (oldUidMap.size() - i) / 2 + 1;
383 \end{minted}
385 Our example synchronization (a thousand messages in a mailbox, unspecified changed since the last time) could therefore
386 look like this --- the client has fresh enough UID mapping up to UID 1008 (sequence number 995). The server does not
387 remember deletions so far to the past, but the passed UID mapping fragment could nevertheless be used to optimize the
388 delivered data:
390 \begin{minted}{text}
391 C: SELECT foo (QRESYNC 12345 613184045007 (500,750,875,937,968,983,990,995,997
392 512,772,887,949,980,985,995,1002,1008,1111))
393 S: * 1000 EXISTS
394 S: * OK [UIDVALIDITY 12345] UIDs valid
395 S: * OK [UIDNEXT 2345] Next UID
396 S: * OK [HIGHESTMODSEQ 715194045007]
397 S: * VANISHED (EARLIER) 1009:2289,2291:2309,2313:2332,2343,2344
398 S: * 997 FETCH (UID 2310 MODSEQ 715194045007 FLAGS (\\Seen \\Deleted))
399 S: * 1000 FETCH (UID 2333 MODSEQ 715194045005 FLAGS (\\Seen))
400 S: y1 OK Selected
401 \end{minted}
403 The received data contain enough information to reconstruct complete UID mapping even under these unfavorable
404 conditions. The redundant UID information in the {\tt FETCH} responses can be also used to make sure that both sides of
405 the connection arrive to the same final state.
407 \begin{trojitabehavior}
408 Trojitá will throw an error when it detects failure at any time.
409 \end{trojitabehavior}
411 To further reduce the amount of data transmitted during the IMAP session, the QRESYNC extension also introduces a second
412 kind of the {\tt VANISHED} response --- the one without the {\tt EARLIER} modifier. Serving as a substitute for the
413 ordinary {\tt EXPUNGE}, the {\tt VANISHED}'s biggest advantage is that it can inform about multiple expunges in a single
414 response. Somewhat ironically, this modification also relieves the clients of their need to maintain a complete
415 UID-sequence mapping at all times --- but only after providing a method of making this synchronization severely less
416 painful in the first place. The whole matter is a bit more complicated by the wording of the RFC which is pretty clear
417 on that the {\tt VANISHED} responses {\em should} be sent instead of {\tt EXPUNGE} --- a language which, in RFC terms,
418 means that the servers are supposed to do so, yet the clients are forbidden from relying on such behavior because under
419 special circumstances, the servers might very well have a good reason to defer back to the {\tt EXPUNGE}~\cite{rfc2119}.
421 Unfortunately, the combination of offset-based {\tt EXISTS} (which is a response used to inform the client about growth
422 of the number of messages in a mailbox) with UID-based {\tt VANISHED} can lead to races, a dangerous condition which I
423 have identified~\cite{kundrat-vanished-race}. The problem lies in QRESYNC's allowance for non-existent UIDs to be
424 included in the {\tt VANISHED} response. Consider the following scenario where the client is fully synced with a
425 mailbox with just a single message bearing UID 5. The mailbox' {\tt UIDNEXT} is 11:
427 \begin{minted}{text}
428 S: * 3 EXISTS
429 C: x UID FETCH 11:* (FLAGS)
430 S: * VANISHED 12:20
431 \end{minted}
433 The server is telling the client that any UIDs between 12 and 20 are gone. The problem is that the client cannot
434 possibly know whether any message has got any of these particular UIDs, i.e. whether the messages \#2 and \#3 (the first
435 and second arrival) fall into that range. Trojitá will immediately send out a request for UIDs of the new arrivals
436 (that is the {\tt UID FETCH} command in the previous example), but due to the timing issues, it is perfectly possible
437 that these messages are ``long'' gone (and the appropriate {\tt VANISHED} sent) by the time the server receives the {\tt
438 UID FETCH} command. There isn't much that a compliant IMAP client can do at this point besides issuing an explicit
439 command for finding out whether any new messages have actually remained in the mailbox. This is a minor deficiency in
440 the QRESYNC extension which could be easily avoided by replacing the {\tt EXISTS} in manner similar to how {\tt EXPUNGE}
441 got replaced by {\tt VANISHED}. The previous example would look like this one, eliminating any possibility of races:
443 \begin{minted}{text}
444 S: * ARRIVED 12,33
445 C: x UID FETCH 11:* (FLAGS)
446 S: * VANISHED 12:20
447 \end{minted}
449 The {\tt ARRIVED} command is defined in a proposed draft in \secref{sec:draft-arrived}. A similar functionality could
450 be achieved through the NOTIFY extension defined by RFC 5465 \cite{rfc5465}, but supporting NOTIFY is a rather strong
451 undertaking for an IMAP server and --- as of July 2012 --- support for that extension is still rather
452 scarce.~\footnote{None of the widely deployed open source IMAP servers supported {\tt NOTIFY} at the time this thesis
453 was written. The Dovecot IMAP server had an experimental branch with partial support which would be enough to serve as
454 a replacement for {\tt ARRIVED}, but it could not be discovered through the IMAP capabilities because {\tt NOTIFY} is an
455 all-or-nothing extension; it mandates full server support and doesn't include provisions for partial functionality which
456 would have been enough in this case.}
458 The QRESYNC extension also mandates an interesting mechanism for its activation. The {\tt SELECT \ldots QRESYNC}
459 command does not work when the IMAP client opens a mailbox for the first time --- the client has no cached state, so it
460 cannot construct any of the optional arguments to the {\tt QRESYNC} select modifier. Furthermore, it cannot fabricate a
461 proper {\tt HIGHESTMODSEQ} in a safe way -- using value too low would result in needlessly sending a huge number of {\tt
462 VANISHED EARLIER} responses, while using a too high value could confuse the server and treat the client as a buggy one.
463 This is why a special {\tt ENABLE QRESYNC} command (from the ENABLE extension, RFC~5161 \cite{rfc5161}) is defined to be
464 a {\em QRESYNC-enabling command}, activating any necessary {\tt MODSEQ} tracking on the server side. This requirement
465 might be non-obvious at first, for the {\tt SELECT \ldots QRESYNC} on its own should be sufficient to inform the server
466 that the client indeed wants to speak QRESYNC.
468 Unfortunately, there is also a certain murkiness about the {\tt ENABLE} command --- the errata \#1365 for RFC 5162
469 \cite{rfc5162-errata} proposes to add an explicit note that ``(A server MUST respond with a tagged BAD
470 response if) (\ldots) or the server has not positively responded to that command with "ENABLED QRESYNC", in the current
471 connection'', even though the RFC 5161 explicitly allows for aggressive pipelining of {\tt ENABLE} and {\tt
472 SELECT}.~\footnote{``There are no limitations on pipelining ENABLE. For example, it is possible to send ENABLE and then
473 immediately SELECT, or a LOGIN immediately followed by ENABLE. \cite[p. 2]{rfc5161}} I have raised this issue on the
474 {\tt imap-protocol} mailing list and the consensus there was that it is indeed allowed not to wait for the server's {\tt
475 ENABLED} before issuing a {\tt SELECT \ldots QRESYNC} \cite{melnikov-qresync-enable}. I fully support such an outcome
476 as it would be rather awkward to see a requirement for extra network round trips in contemporary IMAP extensions.
478 \section{Fetching the Data}
480 The IMAP protocol contains rather rich set of features aimed at downloading the message data in an efficient manner;
481 clients can defer parsing of the MIME message parts~\cite{rfc2045} to the server and deal with the individual data
482 separately. In spite of that, there are a few optimization opportunities which can drastically reduce the amount of
483 data to be transfered.
485 \subsection{The BINARY Extension}
487 Historically, e-mail messages could only contain English text, for which a 7-bit character set and the US-ASCII encoding
488 was adequate. However, with the advent of ``multimedia'', a steady pressure had emerged, leading to the MIME standard
489 family. Using MIME, complex tree-like structures can be embedded in e-mail messages and transmitted over the Internet
490 mail. However, at the time these were introduced, there was a real risk of not being able to transmit such complex
491 messages over traditional communication channels which were often only 7-bit safe. Due to these backward compatibility
492 concerns, a few standard method of converting arbitrary data to a textual form were conceived under the name of {\em
493 Content-Transfer-Encoding}.
495 The two most common encoding schemes are Quoted-Printable~\cite[p. 18]{rfc2045} and Base64~\cite[p. 23]{rfc2045}, the
496 latter of which is especially suitable for converting arbitrary binary data to a 7-bit form. However, it is clear that
497 mapping generic 8-bit data into 7-bit octets (and eliminating the appearance of certain ``magic'' characters in the
498 process) cannot possibly work without inflating the total size of the transferred data. For the Base64
499 Content-Transfer-Encoding, the mapping converts 8-bit input (i.e. 256 values per octets) into a target alphabet of only
500 64 characters, imposing an overhead of roughly 33~\% compared to the raw binary form. Whenever the MIME-encoded message
501 is transmitted, the amount of transferred data is therefore roughly one-third higher than strictly required.
503 RFC 3516 \cite{rfc3516} adds a feature to work around this limitation through the {\tt BINARY} extension. When the
504 server supports this feature, clients can delegate the Content-Transfer-Encoding processing to the server and receive
505 the raw binary data. Perhaps surprisingly, the {\tt BINARY} extension was not supported in Dovecot, one of the most
506 popular IMAP servers, at the time this thesis was written.
508 \begin{trojitabehavior}
509 Nevertheless, Trojitá includes full support for this extension and will automatically fetch data using the appropriate
510 FETCH modifier in order to reduce the amount of data to send over the network.
511 \end{trojitabehavior}
513 \subsection{Server-side Conversions via CONVERT}
515 Certain devices might have limitations which the sender might not have expected when she was preparing the message. For
516 example, a screen of a cell phone could have a very low resolution. Unless the user really wants to see the full
517 details after zooming in eight times, it might make sense to reduce the resolution of that 22-megapixel $5760 \times
518 3840$ image produced by Canon 5D~Mk.~III to fit on a $480 \times 800$ pixels screen of a high-end smart phone from 2012.
519 Even if the user actually wants to see the real image, it might be worthwhile to offer an access to a lower-resolution
520 version for a quick preview. This server-side conversion is what the {\tt CONVERT} extension from RFC 5259
521 \cite{rfc5259} enables.
523 Unfortunately, it appears that there are actually {\em no} publicly available servers which offer support for
524 server-side conversions and the most popular open source implementations have not expressed much interest when asked for
525 their future plans.
527 \begin{trojitabehavior}
528 Due to this induced inability to test this feature for interoperability, Trojitá doesn't support the {\tt CONVERT}
529 extension at this point.
530 \end{trojitabehavior}
532 \subsection{Metadata Decoding}
534 IMAP requires compliant servers to support MIME message parsing and RFC 2822 header decoding. One feature which is
535 notably absent, though, is a support for server-side decoding of RFC 2047-formatted message headers and IMAP's {\tt
536 ENVELOPE} fields. This shortcoming is partially addressed by two RFCs --- the already mentioned {\tt CONVERT} extension
537 mandates support for character set decoding and conversions of RFC 2822 message headers while an experimental RFC 5738
538 \cite{rfc5738} adds an ``UTF-8'' mode which switches all {\tt FETCH} commands to return the decoded Unicode data,
539 including the IMAP's {\tt ENVELOPE}. Sadly, support for this extension is similar to what has been already said for the
540 {\tt CONVERT} and one can also safely claim that the possible data savings are minuscule, if any. In addition, the
541 UTF-8 extension changes quite a lot of assumptions from the traditional IMAP protocol, to the extent that one could very
542 well propose yet another extension which would {\em just} enable access to the decoded {\tt ENVELOPE} and RFC 2822
543 header data. No such extension exists at this point, though.
545 \begin{trojitabehavior}
546 If the IMAP protocol was designed today, mandating a full-featured RFC 2047 decoder would be an obvious addition, but
547 with the legacy of the protocol history, a requirement to implement a client-side decoder anyway and no available server
548 support and the RFC being marked as {\em experimental}, Trojitá does not try to use the {\tt SELECT \ldots UTF8}
549 parameter.
550 \end{trojitabehavior}
552 \section{Updating Mailboxes}
554 Previous sections have introduced the existing extensions aimed at improving mailbox synchronization and data download.
555 In this part, I will talk about how to optimize access to an already selected mailbox and how to get updated information
556 about other mailboxes.
558 \subsection{Sorting Messages}
560 In a typical IMAP scenario, the server can access data for any message in a mailbox in a very cheap way. This is in a
561 strong contrast to its clients which are typically connected over a network whose quality could leave much to be
562 desired. It might therefore make sense to offload the data-intensive processing to the IMAP server and only send the
563 results to the client.
565 RFC 5256 \cite{rfc5256} adds support for server-side sorting. Using these features allows clients to request the server
566 to sort a subset of messages using predefined sorting criteria like the message date, the time of its arrival, name of
567 the sender, contents of the subject field etc. This feature set is subsequently augmented by RFC 5957 \cite{rfc5957}
568 which adds another sorting method which prefers the ``real name'' included in the e-mail addresses instead of using the
569 raw {\tt foo@example.org} format. The results of the {\tt SORT} operation are transmitted in a format similar to the
570 {\tt SEARCH} response.
572 \begin{trojitabehavior}
573 Trojitá includes full support for both of the mentioned RFCs.
574 \end{trojitabehavior}
576 This feature alone is a tremendous improvements over the traditional method where clients would have to download
577 envelope data for all messages in a mailbox and then sort the messages based on the obtained data. However, clients
578 have no way of reusing the sort result when a new message arrives, mandating at least a limited support for client-side
579 sorting as well. This limitation is mitigated only by the {\tt CONTEXT} family of extensions which is discussed later
580 in this chapter.
582 \subsection{Threads and Conversations}
584 RFC 5256 \cite{rfc5256} also includes support for organizing messages into threads. The threading algorithm takes a
585 look at various pieces of message metadata like their subjects or the {\tt Message-Id}, {\tt References} and {\tt
586 In-Reply-To} headers and builds a tree of messages where children of a particular node represent messages which were
587 made as responses to the parent one. This RFC specifies a few threading algorithms from almost useless one (the {\tt
588 ORDEREDSUBJECT} which groups together messages sharing a similar subject, effectively bundling unrelated items like
589 generic ``info'', ``inquiry'' etc. messages) to almost perfect ones (the {\tt REFERENCES} which works like the {\tt
590 ORDEREDSUBJECT}, but also takes the special machine-readable headers into account). Sadly, even the {\tt REFERENCES}
591 algorithm only sorts the threads by the time stamp of the thread root and not by the latest message in a thread,
592 effectively ``hiding'' new responses deep in the mailbox history. It also still looks at the message subjects,
593 potentially lumping unrelated messages together.
595 Both of these limitations are removed by the {\tt REFS} threading algorithm \cite{draft-ietf-morg-inthread}. Despite
596 being still in the draft phase, the Dovecot IMAP server includes full support for it.
598 \begin{trojitabehavior}
599 Trojitá will use it if the {\tt THREAD=REFS} capability is advertised. In absence of the {\tt REFS} algorithm, Trojitá
600 degrades to {\tt REFERENCES} and {\tt ORDEREDSUBJECT}, respectively.
601 \end{trojitabehavior}
603 Unfortunately, none of these extensions addresses the need to re-download the whole thread mapping when a new message
604 arrives; matters are in fact even more complicated than when sorting because a single arriving message could have a wide
605 ranging effect on the whole threading information. This issue is addressed by the proposed ``extended {\tt INTHREAD}''
606 described in \secref{sec:draft-incthread}.
608 \subsection{Incremental Sorting and Searching}
610 Soon after the original {\tt SORT} command got standardized, it became apparent that having to request a full update of
611 the sort order whenever a new message arrived would nullify some of the bandwidth saving opportunitites of performing
612 the server-side sort. A similar concern existed for server-side searching --- here, too, the client would have to
613 explicitly check if the new arrivals match the search criteria. This limitation was addressed by introduction of the
614 so-called {\em contexts} in RFC 5267 \cite{rfc5267}.
616 This RFC defines three extensions, the {\tt ESORT}, {\tt CONTEXT=SEARCH} and {\tt CONTEXT=SORT}. The first one is
617 similar to the {\tt ESEARCH} (and in fact reuses the {\tt ESEARCH} response) in that it could reduce the amount of data
618 transmitted in response to the {\tt SORT} command. The two {\tt CONTEXT=\ldots} capabilities extend the {\tt SEARCH}
619 and {\tt SORT} commands by a way to tell the server that it should tell the client whenever the results are updated. An
620 efficient way of communicating the changes through the {\tt ADDTO} and {\tt REMOVEFROM} {\tt ESEARCH} return values was
621 introduced.
623 In absence of the search/sort contexts, a newly arriving message would typically result in client repeating the
624 operation.~\footnote{When speaking about searching, there is a pretty straightforward optimization opportunity where the
625 {\tt SEARCH} command can be augmented to include an explicit ``and the messages' UID is higher than the
626 $old\_uidnext$''. Unfortunately, no such optimization can be performed for the {\tt SORT} command where the sort order
627 can possibly depend on each and every other message in a mailbox.} This is turn leads to excess data transfers like in
628 the following example where a single arrival results in transferring the whole mapping again:
630 \begin{minted}{text}
631 C: x1 UID SORT (SUBJECT) utf-8 ALL
632 S: * SORT 1 2 10 5 50 20 ... [rest of UIDs goes here]
633 S: x1 OK sorted
634 ...time passes...
635 S: * 1007 EXISTS
636 C: x2 UID SORT (SUBJECT) utf-8 ALL
637 S: * SORT 1 2 1111 10 5 50 20 ... [rest of UIDs goes here]
638 S: x2 OK sorted
639 \end{minted}
641 For brevity purposes, the sort order of the other 1,000 messages has been omitted. In presence of the {\tt
642 CONTEXT=SORT} extension, though, the same protocol interaction would look like this one:
644 \begin{minted}{text}
645 C: x1 UID SORT RETURN (ALL UPDATE) (SUBJECT) utf-8 ALL
646 S: * ESEARCH (TAG "x1") UID ALL 1:2,10,5,50,20,90:1090
647 S: x1 OK sorted
648 ...time passes...
649 S: * 1007 EXISTS
650 S: * ESEARCH (TAG "x1") UID ADDTO (2 1111)
651 \end{minted}
653 Using the sort contexts therefore leads to dramatic bandwidth savings on ``busy'' mailboxes which keep receiving new
654 mail over time; the {\tt ESEARCH} response can also occasionally contribute to a reduced amount of transferred data when
655 the UIDs were assigned in a favorable way. The biggest contribution is, however, the introduction of the {\tt ADDTO}
656 {\tt ESEARCH} return response which obliterates the need to transfer sort order of the whole mailbox along with the
657 single updated item.
659 The search and sort contexts also impose certain, albeit abysmal overhead, though --- whenever a message is removed, an
660 explicit {\tt ESEARCH REMOVEFROM} has to be issued. This functionality is mandated by the relevant RFC, even though it
661 doesn't provide any extra functionality --- all clients {\em still} have to listen for {\tt EXPUNGE} or {\tt VANISHED}
662 untagged responses (and rightly so, obviously), so there is no technical obstacle in requiring them to remove the
663 freshly removed items from their search/sort result cache. Etiquette also dictates that clients shall cancel these
664 updates when they are no longer needed through the {\tt CANCELUPDATE} command. These drawback are however extremely
665 small when working with larger mailboxes and absolutely worth the increased benefit of not transferring the whole set of
666 UIDs over and over again.
668 \begin{trojitabehavior}
669 Trojitá includes full support for both {\tt CONTEXT=SEARCH} and {\tt CONTEXT=SORT}. The {\tt CONTEXT=SEARCH} has been
670 tested for interoperability against the Dovecot IMAP server, one of the few implementations which actually offer this
671 functionality. Unfortunately, I was not able to locate a single IMAP server supporting {\tt CONTEXT=SORT}, so no
672 real-world interoperability tests could have been performed.~\footnote{As any other feature of Trojitá, though, the
673 support for {\tt CONTEXT=SORT} is covered by the automated test suite which was carefully written to minimize the chance
674 of any unwanted side effects and guard against unintended results when the servers deviate from the expected behavior.}
675 This is a rather surprising outcome given that the RFC 5267 is not exactly a fresh standard now and that its authors
676 work from a commercial software vendor offering a pretty advanced IMAP server implementation.
677 \end{trojitabehavior}
679 Sadly, no equivalent of these incremental updates is defined for the {\tt THREAD} command. I suspect that this is
680 caused by the fact that a single new arrival can affect each and every message in the previously received thread
681 mapping. I have attempted to address this limitation by augmenting the {\tt SEARCH=INTHREAD} draft, see
682 \secref{sec:draft-incthread} for details about the selected approach.
684 \subsection{Advanced Searching}
686 Although the basic IMAP specification provides quite a rich set of features aimed at searching the currently selected
687 mailbox, the specification leaves quite a fertile ground for improvements.
689 The biggest problem with IMAP's searching is that it requires a strict substring matching, a requirement which is openly
690 ignored by at least Google's IMAP implementation \cite{gmail-imap-unsupported-features}. Google's partial answer to
691 this problem is at least an ability to issue ``raw'' GMail-like searches through the {\tt X-GM-RAW} SEARCH operator
692 \cite{gmail-x-gm-raw}. Others have tried to add a similar feature through the {\tt SEARCH=FUZZY} extension as defined
693 in RFC 6203 \cite{rfc6203}.
695 \begin{trojitabehavior}
696 Trojitá makes use of the fuzzy searching if available and announced in the capability response. There were also some
697 experimental extensions \cite{draft-ietf-imapext-regex} aimed at introducing search based on regular expressions, but
698 they have not gained much traction.
699 \end{trojitabehavior}
701 A long-missing feature from IMAP is an ability to search multiple mailboxes at once. A very crude hack is the
702 (unofficial) {\tt SCAN} extension \cite{crispin-scan} which is said to be private to the University of Washington's {\tt
703 uw-imapd} daemon. More recent attempt at tackling down this problem is the {\tt MULTISEARCH} extension \cite{rfc6237}
704 built on top of the {\tt NOTIFY} \cite{rfc5465}. As of July 2012, no publicly available IMAP servers have announced
705 support for this extension.
707 \subsection{Obtaining Statistics for Other Mailboxes}
709 Many IMAP clients start their session by requesting a {\tt LIST} of all top-level mailboxes. This command is then
710 followed by a {\tt STATUS} for each of them in order to obtain information like the number of messages in each mailbox.
711 The obtained information is typically used in a GUI of some kind to show the mailbox list to the user.
713 The basic IMAP specification unfortunately doesn't convey the critical piece of information about whether a mailbox
714 contains any child mailboxes --- a data point typically required by any GUI to be able to show a proper widget for
715 opening a list of these child items. In absence of the {\tt CHILDREN} extensions, as defined by RFC 3348
716 \cite{rfc3348}, clients have no choice but to issue an explicit {\tt LIST} command for all mailboxes,~\footnote{The only
717 exception being those marked with the {\tt {\textbackslash}NoInferiors} which is meant to indicate that this mailbox
718 could {\em never} contain any child mailboxes, perhaps due to technical limitations on the server side --- a very
719 different case from not containing any child mailboxes at {\em this} time.} trying to list their children.
721 The {\tt CHILDREN} is a pretty straightforward extension which shall arguably be supported by any IMAP server worth its
722 salt; its absence does not improve the situation in any way.
724 \begin{trojitabehavior}
725 Trojitá supports it fully and will gracefully fall back to extra {\tt LIST} requests in case it is not available.
726 \end{trojitabehavior}
728 The initial discovery of mailboxes also mandates a separate {\tt STATUS} command for each mailbox --- behavior which
729 arguably goes against the spirit behind that command which was intended to {\em not} serve as a generic ``tell me about
730 updates to other mailboxes'' feature. This initial idea no longer has its merit, unfortunately --- users simply {\em
731 expect} being able to see a number of unread messages right next to the mailbox name, and client authors have to deliver
732 this information to them. Having to send an extra {\tt STATUS} command for each mailbox during the initial discovery is
733 not too evil thing per se (the total wasted bandwidth is negligible when compared to mailbox synchronization), but worth
734 optimizing anyway. RFC 5819 \cite{rfc5819} adds an option to the {\tt LIST} command to request automatic sending of the
735 untagged {\tt STATUS} command along.
737 \begin{trojitabehavior}
738 When the IMAP server announces the {\tt LIST-STATUS} capability, Trojitá will automatically make use of this extension.
739 \end{trojitabehavior}
741 \subsection{Push-notification of Other Mailboxes' State}
743 A feature most notably absent from the {\tt IDLE} is any support of passively monitoring changes of non-selected
744 mailboxes. Over the time, many extensions have appeared in the state of various drafts
745 \cite{draft-wener-lemonade-clearidle} \cite{draft-gulbrandsen-imap-nostore} \cite{draft-magicaltux-imap4-idleplus},
746 often simply requesting unsolicited delivery of {\tt STATUS} and {\tt FETCH} responses. None of these extensions gained
747 widespread support nor reached the state of a proposed standard, though.
749 Said status was reached by the {\tt NOTIFY} extension codified in RFC 5465 \cite{rfc5465}. It adds an impressive amount
750 of features; compliant agents can listen for creation and deletion of mailboxes (eliminating the need to redo a
751 top-to-bottom {\tt LIST} discovery), changes in amount of messages in specified list of mailboxes and even for their
752 flag changes. Sadly, support for this extension is scarce among the existing IMAP servers and its author reportedly has
753 mixed feelings \cite{arnt-good-bad-rfc} about its fate where basically ``noone implements it''. In early 2012, a
754 posting on the Dovecot mailing list announced \cite{dovecot-imap-notify} preliminary support for a part of this
755 extension in Dovecot, one of the most widely deployed open source IMAP daemons. Unfortunately, this code was not
756 complete as of July 2012 \cite{dovecot-hg-notify} and the branch I have tried contained regressions which prevented
757 regular use of other features of the IMAP server altogether.~\footnote{This is not to say that the Dovecot is a bad IMAP
758 daemon --- not at all. The version which was used is a development snapshot which is clearly marked as an experimental
759 version which requires future work.}
761 \begin{trojitabehavior}
762 Due to not being able to verify the {\tt NOTIFY} operation against any available IMAP server implementation, Trojitá
763 will not try to leverage this extension for the time being.
764 \end{trojitabehavior}
766 \section{Composing and Delivering Mail}
768 The baseline versions of the IMAP protocol does not offer any substantial assistance in composing and delivery of new
769 messages --- the only feature even remotely related to this topic is the {\tt APPEND} command which saves a message
770 passed by the client into a mailbox. Over the time, several extensions appeared aiming at improving this area.
772 The first extension is the {\tt MULTIAPPEND} command (RFC 3502 \cite{rfc3502}) which allows the client to atomically
773 upload many messages at once. Having such a feature could be a terrific boon in clients which support batched import of
774 data from the existing mail store, but it is not so valuable in a generic client.
776 \begin{trojitabehavior}
777 If Trojitá grows and a support for batched import becomes a wanted feature, the {\tt MULTIAPPEND} command will
778 doubtlessly contribute to a smoother experience.
779 \end{trojitabehavior}
781 Much more useful is the {\tt CATENATE} extension \cite{rfc4469} which allows clients to build a message from a
782 combination of uploaded parts and data already available on the IMAP server. This extension is crucial for implementing
783 advanced forward-without-download feature. Suppose a user who is currently on her vacation high above some Nordic
784 fjord, accessing e-mail over a metered GPRS connection, has just received a huge e-mail consisting of a big binary
785 attachment. IMAP already has a feature which allows her to check the accompanying text without downloading the full
786 message body. What is missing is some kind of support of forwarding the original message to another recipient.
788 This task consists of several steps --- first of all, the body of the resulting message to be sent has to be composed.
789 The {\tt CATENATE} extension tremendously helps with this task. Using {\tt CATENATE}, a client can compose a message
790 consisting of a mixture of data, some of which is coming from the client over the network as raw literals, others being
791 recycled from server-side data, be it full messages, arbitrary message parts or even byte-sized chunks of these. After
792 the message composing is concluded, the message shall be submitted to an MTA~\footnote{Mail Transfer Agent}, typically
793 over the SMTP protocol \cite{rfc6409}. Finally, a copy of the message shall be stored in the user's ``sent'' folder for
794 future reference.
796 \begin{trojitabehavior}
797 Trojitá will make use of the {\tt CATENATE} extension when available.
798 \end{trojitabehavior}
800 It can be seen that in absence of specialized extensions, this an interaction could possibly involve up to three
801 transfers of the huge binary data, possibly in an inefficient transport encoding, over the unreliable or expensive
802 network connection. Clearly, there's a huge room for improvements. The {\tt CATENATE} extension assists in a
803 server-side IMAP message assembly, but does not provide a way of improving actual message submission.
805 The first possible way of improving relies on a whole family of extensions concerning both SMTP and IMAP. The SMTP
806 protocol is extended by the {\tt BURL} command \cite{rfc4468} while the IMAP server has to support the {\tt URLAUTH}
807 extension \cite{rfc4467} and both daemons have to be properly configured. Using this protocol combination (which itself
808 depends on quite a few more extensions, please see the respective RFC documents for details), the IMAP client can
809 generate a single-use authorization token which --- if used --- enables its holder to access the given message part.
810 This token is then passed to the SMTP daemon which will combine data obtained directly from the MUA\footnote{Mail User
811 Agent, i.e. the program which acts on the user's behalf in accessing and submitting the Internet mail} (like the
812 accompanying text) with the data downloaded from the IMAP server via the passed authorization token, build a MIME
813 message and take care of its further delivery through the usual means. As was already mentioned, if the IMAP server
814 also supports the {\tt CATENATE} extension, the client can build the message on the server at once from the mentioned
815 fragments (the new accompanying text and the attachment from the original message) and pass this newly-formed message
816 for submission through the {\tt BURL} SMTP extension. This has a potential of eliminating the need to transfer the data
817 to/from the client {\em at all}, leading to a drastic bandwidth reduction.
819 \begin{trojitabehavior}
820 Trojitá includes full support for the {\tt BURL} and {\tt URLAUTH} extensions. Due to the interoperability troubles
821 with misconfigured servers which were observed in real world \cite{qmf-fastmail-burl-bug}, making use of {\tt BURL} has
822 to be explicitly allowed in the settings dialog of the application.
823 \end{trojitabehavior}
825 Unfortunately, such mode of operation comes at a cost. Support for the required extensions is not
826 ubiquitous among the deployed servers and even if the code contains all required features, it is often a policy decision
827 whether an IMAP server should ever allow access to possibly privacy-sensitive data to the outbound MTAs.
829 Another possibility presenting slightly different set of features, advantages and disadvantages is implemented by
830 deferring the actual message submission to the IMAP server as well. The {\tt CATENATE} extension is still useful
831 because it shall nonetheless be used to build the MIME message body from existing parts, eliminating the need to upload
832 the data to the IMAP server. When the message body is built (no matter how, using {\tt CATENATE} or deferring to legacy
833 means), the IMAP server can be asked to submit the resulting data for delivery.
835 This former approach has been traditionally dismissed by many IMAP proponents, including Mark Crispin, on various
836 grounds. The most common complaint is that the SMTP protocol is extremely mature, widely deployed and also complex ---
837 and that this inherent complexity is {\em required} by various use cases crucial for proper operation of today's
838 Internet mail. The critics often argue that covering the whole feature set of the (E)SMTP-based submission would lead
839 only to an equivalent of tunneling the SMTP session over IMAP \cite{crispin-smtp-tunneling}
840 \cite{cridland-imap-submission-sendmail-not-enough}, an outcome which is clearly not desirable. This approach was
841 also subject to various standardization efforts, often literally tunneling the (E)SMTP conversations \cite[p.
842 30]{draft-maes-lemonade-p-imap} over an IMAP connection. Proponents of this submit-over-IMAP approach, however,
843 counter-argument by stating that optimizing for a common use case has its own merits
844 \cite{brong-common-sendmail-makes-sense}. The critics concur that having two ways of obtaining an identical result is
845 suboptimal and that an easier, yet more limited way would threaten the existence of the older, more flexible solution
846 \cite{crispin-submission-would-kill-smtp}.
848 Finally, a third way of solving the forward-without-download problem was presented in September 2010 through the {\tt
849 POSTADDRESS} extension \cite{draft-melnikov-imap-postaddress}. Using this mechanism, a client first obtains a valid
850 e-mail address serving as a ``delivery address'' for the user's {\em Sent} mailbox. After getting hold of that
851 information, the SMTP session then proceeds as usual with one difference --- the obtained address is added as another
852 recipient of the submitted e-mail message.
854 \begin{trojitabehavior}
855 Besides including full support for the already mentioned ``Lemonade trio'' of extension ({\tt CATENATE}, {\tt URLAUTH}
856 and {\tt BURL}, Trojitá also includes experimental support for the second approach of forward-without-download, the
857 submission-over-IMAP variant.
858 \end{trojitabehavior}
860 Mail submission is an important topic, so I have dedicated a full section to a more detailed analysis of various
861 advantages and disadvantages of competing approaches. More information is available in section
862 (\secref{sec:draft-sendmail}) where I present my Internet-Draft documenting a proposed extension in which I've tried to
863 address many concerns raised during the previous discussion rounds.
865 \section{Further Improvements}
867 Many additional extensions have been defined over years, covering various areas of the protocol. This section deals
868 with those extensions which do not quite fit into any of the previous categories.
870 \subsection{Debugging}
872 Most of other communication protocols contain a way of letting the other party know what software implementation and in
873 which version it is talking to. In the web, this is usually accomplished through the {\tt User-Agent} and {\tt Server}
874 headers, e-mail messages often contain either an {\tt X-Mailer} or {\tt User-Agent} field in theirs RFC 2822 headers,
875 etc. IMAP adds a similar feature through the {\tt ID} extension defined in RFC 2971 \cite{rfc2971}.
877 This RFC is pretty clear on that implementations are explicitly {\em forbidden} from using {\em any} knowledge obtained
878 through the {\tt ID} extension to alter their behavior. This is a reasonable decision intended to prevent clients
879 implementing blacklists and whitelists of ``known'' servers. All IMAP protocol speakers are intended to only use the
880 {\tt CAPABILITY} responses (and the {\tt ENABLE} extension, if present) to change their behavior.
882 \begin{trojitabehavior}
883 Trojitá is fully compliant with these requirements.
884 \end{trojitabehavior}
886 \subsection{Internationalization}
888 The ``internationalization'' RFC (RFC 5255 \cite{rfc5255}) is focused on server implementations, mainly specifying how
889 to perform search/sort collation under various circumstances. That said, it also presents two commands to the IMAP
890 clients. The first of them is the {\tt LANGUAGE} command suitable for changing the language in which various error and
891 notifying messages are generated and sent by the server.
893 \begin{trojitabehavior}
894 Trojitá tries hard to rely on machine-readable response codes (RFC 5530 \cite{rfc5530}) instead.
895 \end{trojitabehavior}
897 The second command is the {\tt COMPARATOR}, a feature intended to let clients
898 specify which comparators and collators to use when performing search or sort operations (these comparators are defined
899 in RFC 4790 \cite{rfc4790}). By default, servers supporting at least the {\tt I18NLEVEL=1} extension are required to
900 perform collations using the {\tt i;unicode-casemap} comparator \cite{rfc5051} --- a feature which is very useful (and
901 often sufficient) in countries using the Latin alphabet. Those servers which support {\tt I18NLEVEL=2} also accept
902 client-specified preference about how to perform these operations.
904 \begin{trojitabehavior}
905 Adding support for this higher level would be trivial on a technical front (the {\tt LANGUAGE} and {\tt COMPARATOR}
906 commands are very simple with much more demanding requirements on servers), but no requests for such a feature in
907 Trojitá were received yet --- suggesting that the {\tt i;unicode-casemap} comparator works well for most users at this
908 point.
909 \end{trojitabehavior}
911 There is also the experimental ``UTF-8'' RFC \cite{rfc5738} whose aim is to get rid of any non-UTF8 data being
912 transferred over IMAP. Unfortunately, as designed, this document presents backward-incompatible changes to the IMAP
913 protocol and is hence not widely supported by the common server implementations. The client authors would very much
914 prefer to stop dealing with various Unicode encoding schemes, but this RFC does not completely address all of the
915 issues. An ongoing discussion is nevertheless taking place on official IETF's channels; it will be promising to watch
916 its future and especially the fate of the ``5738bis'' Internet Draft \cite{draft-ietf-eai-5738bis}.
918 \begin{trojitabehavior}
919 It shall be noted that Trojitá's source is completely ready for internationalization and localisation of the
920 application.~\footnote{This support does not come at no cost, unfortunately. Qt's {\tt QDateTime} class which is used
921 for keeping track of the date and time information in all Qt-using projects unfortunately does not support tracking of
922 timezone information \cite{qt-qdatetime-tz}. Trojitá works around this limitation of the public API by manual hack in
923 {\tt Imap::Mailbox::formatDateTimeWithTimeZoneAtEnd()} method.}
924 \end{trojitabehavior}
926 \subsection{Other Supported RFCs}
928 Trojitá is fully conforming to the description of the ``Distributed Electronic Mail Models in IMAP4'', as defined in RFC
929 1733 \cite{rfc1733}. It also respect the recommendations about concurrent access to mailboxes from RFC 2180
930 \cite{rfc2180}, generic suggestions to the IMAP implementors (RFC 2683 \cite{rfc2683}), Mark Crispin's famous ``Ten
931 Commandments of How to Write an IMAP client'' \cite{crispin-ten-commandments} and Timo Sirainen's suggestions for client
932 authors \cite{tss-client-coding-howto} \cite{imapwiki-client-best-practices}.
934 It will make use of the {\tt UNSELECT} command for internal technical reasons (RFC 3691 \cite{rfc3691}), if available;
935 if it isn't present, it will gracefully revert to using fabricated mailbox names with the {\tt EXAMINE} command from the
936 baseline IMAP specification. This failover is thoroughly verified by a suite of automated unit tests.
938 Trojitá is capable of recording responses from the {\tt UIDPLUS} extension (RFC 4315 \cite{rfc4315}); the author have
939 also contributed \cite{jkt-move-uidplus} to the discussion related to the {\tt COPYUID} equivalent for the proposed {\tt
940 UID MOVE} command.
942 The code also includes full support for recognizing various extensions to the IMAP's grammar (RFC 4466 \cite{rfc4466}),
943 response codes (RFC 5530 \cite{rfc5530}) and the reserved set of keywords (RFC 5788 \cite{rfc5788}).
945 RFC 5258 \cite{rfc5258} is used when available to explicitly encourage the IMAP server to send additional metadata in
946 {\tt LIST} responses --- the biggest benefit of this extension is eliminating the need of sending explicit {\tt LSUB}
947 requests to discover mailbox subscriptions.
949 Finally, the Lemonade extension (RFC 5550 \cite{rfc5550} and the obsolete RFC 4550 \cite{rfc4550}) has compiled a set of
950 requirements crucial to the mobile IMAP e-mail clients. Comparison of the proposed set of extensions with the Lemonade
951 profile is presented in a dedicated section of this thesis on page \pageref{sec:lemonade-comparison}.
953 \subsection{Out-of-scope Features}
955 The extensions presented so far all have a certain affinity towards the ``mobile IMAP''. Many other extensions have
956 been introduced, though, often solving a real problem.
958 The first family of extensions with debatable merit are extensions providing support for so-called referrals. The RFC
959 2221 \cite{rfc2221} adds a mechanism for servers to redirect clients based on their identity, a feature which was
960 originally supposed to come handy in large corporate environment. Similar to that, the RFC 2193 \cite{rfc2193} adds
961 mailbox referrals, a feature where a subset of user's mailboxes might be stored on a remote server. As it happens,
962 these features have not attained a big market share among client developers \cite{thunderbird-referrals} and the servers
963 which supports that are generally willing to act as a transparent proxy for their clients anyway
964 \cite{cyrus-referral-proxy}.
966 Extensions which are useful in a general-purpose e-mail client are the {\tt NAMESPACE} extension (RFC 2342
967 \cite{rfc2342}) which would allow compliant clients to automatically discover where e.g. other users' mailboxes are
968 located, support for managing access-control lists (ACLs) on the server (RFC 4314 \cite{rfc4314} and its obsolete form
969 given in RFC 2086 \cite{rfc2086}) and finally support for reporting and managing storage quotas (RFC 2087
970 \cite{rfc2087}).
972 \begin{trojitabehavior}
973 These have not been implemented in Trojitá yet.
974 \end{trojitabehavior}
976 A mechanism for increasing interoperability with organizations which have invested in a single-sign-on infrastructure
977 like Kerberos could be improved through better support for SASL (RFC 1731 \cite{rfc1731}, RFC4959 \cite{rfc4959}).
979 \begin{trojitabehavior}
980 A request from users which would very much prefer to have a GSSAPI-enabled Trojitá was already received, but
981 unfortunately this feature remains unimplemented due to time constraints.
982 \end{trojitabehavior}
984 A few extensions might improve the general comfort of users setting up their e-mail clients for the first time. Without
985 any doubt, autoconfiguration through the DNS SRV records (RFC 6186 \cite{rfc6186}) falls into this category.
987 There are also certain features which might add a whole new level of functionality to working with e-mail -- examples
988 are the {\tt ANNOTATE} extension (RFC 5257 \cite{rfc5257}) for adding arbitrary annotations to individual messages or
989 even their parts, which is still marked as experimental, or the similar {\tt METADATA} feature (RFC 5464 \cite{rfc5464})
990 adding the same functionality on a server or mailbox level. Needless to say, support for these extensions is scarce
991 among the IMAP clients.
993 Other extensions try to fill a certain niche. Examples are the {\tt WITHIN} extension (RFC 5032 \cite{rfc5032}) which
994 allows clients to search among messages of a certain age or the {\tt SEARCHRES} (RFC 5182 \cite{rfc5182}) adding a
995 low-level pipelining optimization which would allow the client to re-use the previous search result in the subsequent
996 commands. RFC 5466 \cite{rfc5466} adds support for persistent storage of search criteria on the server through the
997 already mentioned {\tt METADATA} extension.
999 \begin{trojitabehavior}
1000 I have not found a use case for having these optional extensions utilized from Trojitá in any place.
1001 \end{trojitabehavior}
1003 The RFC 3503 \cite{rfc3503} deals with how to generate the message delivery notifications (MDNs) in IMAP.
1005 \begin{trojitabehavior}
1006 This document clearly does not apply to clients which on purpose do not create MDNs for privacy reasons, such as
1007 Trojitá.
1008 \end{trojitabehavior}
1010 Finally, certain extensions improve the user experience in specialized environments. One of them is RFC 5616
1011 \cite{rfc5616}, an extension aimed at ``Streaming Internet Messaging Attachments''. One could imagine a use case where
1012 a carrier-level voice mailbox was implemented over IMAP; in similar situations, such a solution would have its merit.
1013 At the same time, this specific extension has so special requirements on the network architecture that it is clearly
1014 out-of-scope for a general-purpose e-mail client merely running on a cell phone.
1016 \section{Obsolete Extensions}
1018 IMAP is a rather old protocol (its history is slightly older than the author of this thesis, for that matter). Certain
1019 features have been therefore deprecated over time and it took years to grow to the current IMAP4rev1 version from the
1020 old standards of IMAP2 \cite{rfc1064} \cite{rfc1176}, IMAP3 \cite{rfc1203} and IMAP4 \cite{rfc1730}. Attention has been
1021 paid to make this transition as smooth as possible through various compatibility recommendations \cite{rfc1732}
1022 \cite{rfc2060} \cite{rfc2061} \cite{rfc2062}.
1024 \begin{trojitabehavior}
1025 Trojitá requires the server to at least announce the {\tt IMAP4rev1} capability. This protocol revision is currently
1026 defined by RFC 3501 \cite{rfc3501} from March 2003; however, changes since the previous version from December 1996
1027 (\cite{rfc2060}) are mostly backward compatible --- and in practice, no report of Trojitá not being able to work against
1028 any IMAP server implementation out there were received.
1029 \end{trojitabehavior}
1031 The {\tt UIDPLUS} extension got redefined from its former shape \cite{rfc2359} to the current revision \cite{rfc4315};
1032 the new document states that the reason for this revision was to prevent sending of bogus UID replies when the target
1033 mailbox did not support persistent UIDs.
1035 \begin{trojitabehavior}
1036 As such, Trojitá can deal with both revision of the {\tt UIDPLUS} document.
1037 \end{trojitabehavior}
1039 \end{document}