Fix spelling error in docs.
[PostgreSQL.git] / doc / src / sgml / datetime.sgml
blob2caf39a27d4b7b7158a998f4468924383170acc1
1 <!-- $PostgreSQL$ -->
3 <appendix id="datetime-appendix">
4 <title>Date/Time Support</title>
6 <para>
7 <productname>PostgreSQL</productname> uses an internal heuristic
8 parser for all date/time input support. Dates and times are input as
9 strings, and are broken up into distinct fields with a preliminary
10 determination of what kind of information can be in the
11 field. Each field is interpreted and either assigned a numeric
12 value, ignored, or rejected.
13 The parser contains internal lookup tables for all textual fields,
14 including months, days of the week, and time zones.
15 </para>
17 <para>
18 This appendix includes information on the content of these
19 lookup tables and describes the steps used by the parser to decode
20 dates and times.
21 </para>
23 <sect1 id="datetime-input-rules">
24 <title>Date/Time Input Interpretation</title>
26 <para>
27 The date/time type inputs are all decoded using the following procedure.
28 </para>
30 <procedure>
31 <step>
32 <para>
33 Break the input string into tokens and categorize each token as
34 a string, time, time zone, or number.
35 </para>
37 <substeps>
38 <step>
39 <para>
40 If the numeric token contains a colon (<literal>:</>), this is
41 a time string. Include all subsequent digits and colons.
42 </para>
43 </step>
45 <step>
46 <para>
47 If the numeric token contains a dash (<literal>-</>), slash
48 (<literal>/</>), or two or more dots (<literal>.</>), this is
49 a date string which might have a text month. If a date token has
50 already been seen, it is instead interpreted as a time zone
51 name (e.g., <literal>America/New_York</>).
52 </para>
53 </step>
55 <step>
56 <para>
57 If the token is numeric only, then it is either a single field
58 or an ISO 8601 concatenated date (e.g.,
59 <literal>19990113</literal> for January 13, 1999) or time
60 (e.g., <literal>141516</literal> for 14:15:16).
61 </para>
62 </step>
64 <step>
65 <para>
66 If the token starts with a plus (<literal>+</>) or minus
67 (<literal>-</>), then it is either a numeric time zone or a special
68 field.
69 </para>
70 </step>
71 </substeps>
72 </step>
74 <step>
75 <para>
76 If the token is a text string, match up with possible strings:
77 </para>
79 <substeps>
80 <step>
81 <para>
82 Do a binary-search table lookup for the token as a time zone
83 abbreviation.
84 </para>
85 </step>
87 <step>
88 <para>
89 If not found, do a similar binary-search table lookup to match
90 the token as either a special string (e.g., <literal>today</literal>),
91 day (e.g., <literal>Thursday</literal>),
92 month (e.g., <literal>January</literal>),
93 or noise word (e.g., <literal>at</literal>, <literal>on</literal>).
94 </para>
95 </step>
97 <step>
98 <para>
99 If still not found, throw an error.
100 </para>
101 </step>
102 </substeps>
103 </step>
105 <step>
106 <para>
107 When the token is a number or number field:
108 </para>
110 <substeps>
111 <step>
112 <para>
113 If there are eight or six digits,
114 and if no other date fields have been previously read, then interpret
115 as a <quote>concatenated date</quote> (e.g.,
116 <literal>19990118</literal> or <literal>990118</literal>).
117 The interpretation is <literal>YYYYMMDD</> or <literal>YYMMDD</>.
118 </para>
119 </step>
121 <step>
122 <para>
123 If the token is three digits
124 and a year has already been read, then interpret as day of year.
125 </para>
126 </step>
128 <step>
129 <para>
130 If four or six digits and a year has already been read, then
131 interpret as a time (<literal>HHMM</> or <literal>HHMMSS</>).
132 </para>
133 </step>
135 <step>
136 <para>
137 If three or more digits and no date fields have yet been found,
138 interpret as a year (this forces yy-mm-dd ordering of the remaining
139 date fields).
140 </para>
141 </step>
143 <step>
144 <para>
145 Otherwise the date field ordering is assumed to follow the
146 <varname>DateStyle</> setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd.
147 Throw an error if a month or day field is found to be out of range.
148 </para>
149 </step>
150 </substeps>
151 </step>
153 <step>
154 <para>
155 If BC has been specified, negate the year and add one for
156 internal storage. (There is no year zero in the Gregorian
157 calendar, so numerically 1 BC becomes year zero.)
158 </para>
159 </step>
161 <step>
162 <para>
163 If BC was not specified, and if the year field was two digits in length,
164 then adjust the year to four digits. If the field is less than 70, then
165 add 2000, otherwise add 1900.
167 <tip>
168 <para>
169 Gregorian years AD 1-99 can be entered by using 4 digits with leading
170 zeros (e.g., <literal>0099</> is AD 99).
171 </para>
172 </tip>
173 </para>
174 </step>
175 </procedure>
176 </sect1>
179 <sect1 id="datetime-keywords">
180 <title>Date/Time Key Words</title>
182 <para>
183 <xref linkend="datetime-month-table"> shows the tokens that are
184 recognized as names of months.
185 </para>
187 <table id="datetime-month-table">
188 <title>Month Names</title>
189 <tgroup cols="2">
190 <thead>
191 <row>
192 <entry>Month</entry>
193 <entry>Abbreviations</entry>
194 </row>
195 </thead>
196 <tbody>
197 <row>
198 <entry>January</entry>
199 <entry>Jan</entry>
200 </row>
201 <row>
202 <entry>February</entry>
203 <entry>Feb</entry>
204 </row>
205 <row>
206 <entry>March</entry>
207 <entry>Mar</entry>
208 </row>
209 <row>
210 <entry>April</entry>
211 <entry>Apr</entry>
212 </row>
213 <row>
214 <entry>May</entry>
215 <entry></entry>
216 </row>
217 <row>
218 <entry>June</entry>
219 <entry>Jun</entry>
220 </row>
221 <row>
222 <entry>July</entry>
223 <entry>Jul</entry>
224 </row>
225 <row>
226 <entry>August</entry>
227 <entry>Aug</entry>
228 </row>
229 <row>
230 <entry>September</entry>
231 <entry>Sep, Sept</entry>
232 </row>
233 <row>
234 <entry>October</entry>
235 <entry>Oct</entry>
236 </row>
237 <row>
238 <entry>November</entry>
239 <entry>Nov</entry>
240 </row>
241 <row>
242 <entry>December</entry>
243 <entry>Dec</entry>
244 </row>
245 </tbody>
246 </tgroup>
247 </table>
249 <para>
250 <xref linkend="datetime-dow-table"> shows the tokens that are
251 recognized as names of days of the week.
252 </para>
254 <table id="datetime-dow-table">
255 <title>Day of the Week Names</title>
256 <tgroup cols="2">
257 <thead>
258 <row>
259 <entry>Day</entry>
260 <entry>Abbreviations</entry>
261 </row>
262 </thead>
263 <tbody>
264 <row>
265 <entry>Sunday</entry>
266 <entry>Sun</entry>
267 </row>
268 <row>
269 <entry>Monday</entry>
270 <entry>Mon</entry>
271 </row>
272 <row>
273 <entry>Tuesday</entry>
274 <entry>Tue, Tues</entry>
275 </row>
276 <row>
277 <entry>Wednesday</entry>
278 <entry>Wed, Weds</entry>
279 </row>
280 <row>
281 <entry>Thursday</entry>
282 <entry>Thu, Thur, Thurs</entry>
283 </row>
284 <row>
285 <entry>Friday</entry>
286 <entry>Fri</entry>
287 </row>
288 <row>
289 <entry>Saturday</entry>
290 <entry>Sat</entry>
291 </row>
292 </tbody>
293 </tgroup>
294 </table>
296 <para>
297 <xref linkend="datetime-mod-table"> shows the tokens that serve
298 various modifier purposes.
299 </para>
301 <table id="datetime-mod-table">
302 <title>Date/Time Field Modifiers</title>
303 <tgroup cols="2">
304 <thead>
305 <row>
306 <entry>Identifier</entry>
307 <entry>Description</entry>
308 </row>
309 </thead>
310 <tbody>
311 <row>
312 <entry><literal>ABSTIME</literal></entry>
313 <entry>Ignored</entry>
314 </row>
315 <row>
316 <entry><literal>AM</literal></entry>
317 <entry>Time is before 12:00</entry>
318 </row>
319 <row>
320 <entry><literal>AT</literal></entry>
321 <entry>Ignored</entry>
322 </row>
323 <row>
324 <entry><literal>JULIAN</>, <literal>JD</>, <literal>J</></entry>
325 <entry>Next field is Julian Day</entry>
326 </row>
327 <row>
328 <entry><literal>ON</literal></entry>
329 <entry>Ignored</entry>
330 </row>
331 <row>
332 <entry><literal>PM</literal></entry>
333 <entry>Time is on or after 12:00</entry>
334 </row>
335 <row>
336 <entry><literal>T</literal></entry>
337 <entry>Next field is time</entry>
338 </row>
339 </tbody>
340 </tgroup>
341 </table>
343 <para>
344 The key word <literal>ABSTIME</literal> is ignored for historical
345 reasons: In very old releases of
346 <productname>PostgreSQL</productname>, invalid values of type <type>abstime</type>
347 were emitted as <literal>Invalid Abstime</literal>. This is no
348 longer the case however and this key word will likely be dropped in
349 a future release.
350 </para>
351 </sect1>
353 <sect1 id="datetime-config-files">
354 <title>Date/Time Configuration Files</title>
356 <indexterm>
357 <primary>time zone</primary>
358 <secondary>input abbreviations</secondary>
359 </indexterm>
361 <para>
362 Since timezone abbreviations are not well standardized,
363 <productname>PostgreSQL</productname> provides a means to customize
364 the set of abbreviations accepted by the server. The
365 <xref linkend="guc-timezone-abbreviations"> run-time parameter
366 determines the active set of abbreviations. While this parameter
367 can be altered by any database user, the possible values for it
368 are under the control of the database administrator &mdash; they
369 are in fact names of configuration files stored in
370 <filename>.../share/timezonesets/</> of the installation directory.
371 By adding or altering files in that directory, the administrator
372 can set local policy for timezone abbreviations.
373 </para>
375 <para>
376 <literal>timezone_abbreviations</> can be set to any file name
377 found in <filename>.../share/timezonesets/</>, if the file's name
378 is entirely alphabetic. (The prohibition against non-alphabetic
379 characters in <literal>timezone_abbreviations</> prevents reading
380 files outside the intended directory, as well as reading editor
381 backup files and other extraneous files.)
382 </para>
384 <para>
385 A timezone abbreviation file can contain blank lines and comments
386 beginning with <literal>#</>. Non-comment lines must have one of
387 these formats:
389 <synopsis>
390 <replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable>
391 <replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable> D
392 @INCLUDE <replaceable>file_name</replaceable>
393 @OVERRIDE
394 </synopsis>
395 </para>
397 <para>
398 A <replaceable>time_zone_name</replaceable> is just the abbreviation
399 being defined. The <replaceable>offset</replaceable> is the zone's
400 offset in seconds from UTC, positive being east from Greenwich and
401 negative being west. For example, -18000 would be five hours west
402 of Greenwich, or North American east coast standard time. <literal>D</>
403 indicates that the zone name represents local daylight-savings time
404 rather than standard time. Since all known time zone offsets are on
405 15 minute boundaries, the number of seconds has to be a multiple of 900.
406 </para>
408 <para>
409 The <literal>@INCLUDE</> syntax allows inclusion of another file in the
410 <filename>.../share/timezonesets/</> directory. Inclusion can be nested,
411 to a limited depth.
412 </para>
414 <para>
415 The <literal>@OVERRIDE</> syntax indicates that subsequent entries in the
416 file can override previous entries (i.e., entries obtained from included
417 files). Without this, conflicting definitions of the same timezone
418 abbreviation are considered an error.
419 </para>
421 <para>
422 In an unmodified installation, the file <filename>Default</> contains
423 all the non-conflicting time zone abbreviations for most of the world.
424 Additional files <filename>Australia</> and <filename>India</> are
425 provided for those regions: these files first include the
426 <literal>Default</> file and then add or modify timezones as needed.
427 </para>
429 <para>
430 For reference purposes, a standard installation also contains files
431 <filename>Africa.txt</>, <filename>America.txt</>, etc, containing
432 information about every time zone abbreviation known to be in use
433 according to the <literal>zoneinfo</> timezone database. The zone name
434 definitions found in these files can be copied and pasted into a custom
435 configuration file as needed. Note that these files cannot be directly
436 referenced as <literal>timezone_abbreviations</> settings, because of
437 the dot embedded in their names.
438 </para>
440 <note>
441 <para>
442 If an error occurs while reading the time zone data sets, no new value is
443 applied but the old set is kept. If the error occurs while starting the
444 database, startup fails.
445 </para>
446 </note>
448 <caution>
449 <para>
450 Time zone abbreviations defined in the configuration file override
451 non-timezone meanings built into <productname>PostgreSQL</productname>.
452 For example, the <filename>Australia</> configuration file defines
453 <literal>SAT</> (for South Australian Standard Time). When this
454 file is active, <literal>SAT</> will not be recognized as an abbreviation
455 for Saturday.
456 </para>
457 </caution>
459 <caution>
460 <para>
461 If you modify files in <filename>.../share/timezonesets/</>,
462 it is up to you to make backups &mdash; a normal database dump
463 will not include this directory.
464 </para>
465 </caution>
467 </sect1>
469 <sect1 id="datetime-units-history">
470 <title>History of Units</title>
472 <para>
473 The Julian calendar was introduced by Julius Caesar in 45 BC.
474 It was in common use in the Western world
475 until the year 1582, when countries started changing to the Gregorian
476 calendar. In the Julian calendar, the tropical year is
477 approximated as 365 1/4 days = 365.25 days. This gives an error of
478 about 1 day in 128 years.
479 </para>
481 <para>
482 The accumulating calendar error prompted
483 Pope Gregory XIII to reform the calendar in accordance with
484 instructions from the Council of Trent.
485 In the Gregorian calendar, the tropical year is approximated as
486 365 + 97 / 400 days = 365.2425 days. Thus it takes approximately 3300
487 years for the tropical year to shift one day with respect to the
488 Gregorian calendar.
489 </para>
491 <para>
492 The approximation 365+97/400 is achieved by having 97 leap years
493 every 400 years, using the following rules:
495 <simplelist>
496 <member>
497 Every year divisible by 4 is a leap year.
498 </member>
499 <member>
500 However, every year divisible by 100 is not a leap year.
501 </member>
502 <member>
503 However, every year divisible by 400 is a leap year after all.
504 </member>
505 </simplelist>
507 So, 1700, 1800, 1900, 2100, and 2200 are not leap years. But 1600,
508 2000, and 2400 are leap years.
510 By contrast, in the older Julian calendar all years divisible by 4 are leap
511 years.
512 </para>
514 <para>
515 The papal bull of February 1582 decreed that 10 days should be dropped
516 from October 1582 so that 15 October should follow immediately after
517 4 October.
518 This was observed in Italy, Poland, Portugal, and Spain. Other Catholic
519 countries followed shortly after, but Protestant countries were
520 reluctant to change, and the Greek Orthodox countries didn't change
521 until the start of the 20th century.
523 The reform was observed by Great Britain and Dominions (including what is
524 now the USA) in 1752.
525 Thus 2 September 1752 was followed by 14 September 1752.
527 This is why Unix systems have the <command>cal</command> program
528 produce the following:
530 <screen>
531 $ <userinput>cal 9 1752</userinput>
532 September 1752
533 S M Tu W Th F S
534 1 2 14 15 16
535 17 18 19 20 21 22 23
536 24 25 26 27 28 29 30
537 </screen>
538 </para>
540 <para>
541 The SQL standard states that <quote>Within the definition of a
542 <quote>datetime literal</quote>, the <quote>datetime
543 value</quote>s are constrained by the natural rules for dates and
544 times according to the Gregorian calendar</quote>. Dates between
545 1582-10-05 and 1582-10-14, although eliminated in some countries
546 by Papal fiat, conform to <quote>natural rules</quote> and are
547 hence valid dates. <productname>PostgreSQL</> follows the SQL
548 standard's lead by counting dates exclusively in the Gregorian
549 calendar, even for years before that calendar was in use.
550 </para>
552 <para>
553 Different calendars have been developed in various parts of the
554 world, many predating the Gregorian system.
556 For example,
557 the beginnings of the Chinese calendar can be traced back to the 14th
558 century BC. Legend has it that the Emperor Huangdi invented that
559 calendar in 2637 BC.
561 The People's Republic of China uses the Gregorian calendar
562 for civil purposes. The Chinese calendar is used for determining
563 festivals.
564 </para>
566 <para>
567 The <quote>Julian Date</quote> is unrelated to the <quote>Julian
568 calendar</quote>.
569 The Julian Date system was invented by the French scholar
570 Joseph Justus Scaliger (1540-1609)
571 and probably takes its name from Scaliger's father,
572 the Italian scholar Julius Caesar Scaliger (1484-1558).
573 In the Julian Date system, each day has a sequential number, starting
574 from JD 0 (which is sometimes called <emphasis>the</> Julian Date).
575 JD 0 corresponds to 1 January 4713 BC in the Julian calendar, or
576 24 November 4714 BC in the Gregorian calendar. Julian Date counting
577 is most often used by astronomers for labeling their nightly observations,
578 and therefore a date runs from noon UTC to the next noon UTC, rather than
579 from midnight to midnight: JD 0 designates the 24 hours from noon UTC on
580 1 January 4713 BC to noon UTC on 2 January 4713 BC.
581 </para>
583 <para>
584 Although <productname>PostgreSQL</> supports Julian Date notation for
585 input and output of dates (and also uses them for some internal datetime
586 calculations), it does not observe the nicety of having dates run from
587 noon to noon. <productname>PostgreSQL</> treats a Julian Date as running
588 from midnight to midnight.
589 </para>
591 </sect1>
592 </appendix>