Make GROUP BY work properly for datatypes that only support hashing and not
[PostgreSQL.git] / doc / src / sgml / lobj.sgml
blobf97a52feb3e378085aeb69f1101ca876788d69ff
1 <!-- $PostgreSQL$ -->
3 <chapter id="largeObjects">
4 <title id="largeObjects-title">Large Objects</title>
6 <indexterm zone="largeobjects"><primary>large object</></>
7 <indexterm><primary>BLOB</><see>large object</></>
9 <para>
10 <productname>PostgreSQL</productname> has a <firstterm>large object</>
11 facility, which provides stream-style access to user data that is stored
12 in a special large-object structure. Streaming access is useful
13 when working with data values that are too large to manipulate
14 conveniently as a whole.
15 </para>
17 <para>
18 This chapter describes the implementation and the programming and
19 query language interfaces to <productname>PostgreSQL</productname>
20 large object data. We use the <application>libpq</application> C
21 library for the examples in this chapter, but most programming
22 interfaces native to <productname>PostgreSQL</productname> support
23 equivalent functionality. Other interfaces might use the large
24 object interface internally to provide generic support for large
25 values. This is not described here.
26 </para>
28 <sect1 id="lo-intro">
29 <title>Introduction</title>
31 <indexterm>
32 <primary>TOAST</primary>
33 <secondary>versus large objects</secondary>
34 </indexterm>
36 <para>
37 All large objects are placed in a single system table called
38 <classname>pg_largeobject</classname>.
39 <productname>PostgreSQL</productname> also supports a storage system called
40 <quote><acronym>TOAST</acronym></quote> that automatically stores values
41 larger than a single database page into a secondary storage area per table.
42 This makes the large object facility partially obsolete. One
43 remaining advantage of the large object facility is that it allows values
44 up to 2 GB in size, whereas <acronym>TOAST</acronym>ed fields can be at
45 most 1 GB. Also, large objects can be randomly modified using a read/write
46 API that is more efficient than performing such operations using
47 <acronym>TOAST</acronym>.
48 </para>
50 </sect1>
52 <sect1 id="lo-implementation">
53 <title>Implementation Features</title>
55 <para>
56 The large object implementation breaks large
57 objects up into <quote>chunks</quote> and stores the chunks in
58 rows in the database. A B-tree index guarantees fast
59 searches for the correct chunk number when doing random
60 access reads and writes.
61 </para>
62 </sect1>
64 <sect1 id="lo-interfaces">
65 <title>Client Interfaces</title>
67 <para>
68 This section describes the facilities that
69 <productname>PostgreSQL</productname> client interface libraries
70 provide for accessing large objects. All large object
71 manipulation using these functions <emphasis>must</emphasis> take
72 place within an SQL transaction block.
73 The <productname>PostgreSQL</productname> large object interface is modeled after
74 the <acronym>Unix</acronym> file-system interface, with analogues of
75 <function>open</function>, <function>read</function>,
76 <function>write</function>,
77 <function>lseek</function>, etc.
78 </para>
80 <para>
81 Client applications which use the large object interface in
82 <application>libpq</application> should include the header file
83 <filename>libpq/libpq-fs.h</filename> and link with the
84 <application>libpq</application> library.
85 </para>
87 <sect2>
88 <title>Creating a Large Object</title>
90 <para>
91 The function
92 <synopsis>
93 Oid lo_creat(PGconn *conn, int mode);
94 </synopsis>
95 <indexterm><primary>lo_creat</></>
96 creates a new large object.
97 The return value is the OID that was assigned to the new large object,
98 or <symbol>InvalidOid</symbol> (zero) on failure.
100 <replaceable class="parameter">mode</replaceable> is unused and
101 ignored as of <productname>PostgreSQL</productname> 8.1; however, for
102 backwards compatibility with earlier releases it is best to
103 set it to <symbol>INV_READ</symbol>, <symbol>INV_WRITE</symbol>,
104 or <symbol>INV_READ</symbol> <literal>|</> <symbol>INV_WRITE</symbol>.
105 (These symbolic constants are defined
106 in the header file <filename>libpq/libpq-fs.h</filename>.)
107 </para>
109 <para>
110 An example:
111 <programlisting>
112 inv_oid = lo_creat(conn, INV_READ|INV_WRITE);
113 </programlisting>
114 </para>
116 <para>
117 The function
118 <synopsis>
119 Oid lo_create(PGconn *conn, Oid lobjId);
120 </synopsis>
121 <indexterm><primary>lo_create</></>
122 also creates a new large object. The OID to be assigned can be
123 specified by <replaceable class="parameter">lobjId</replaceable>;
124 if so, failure occurs if that OID is already in use for some large
125 object. If <replaceable class="parameter">lobjId</replaceable>
126 is <symbol>InvalidOid</symbol> (zero) then <function>lo_create</> assigns an unused
127 OID (this is the same behavior as <function>lo_creat</>).
128 The return value is the OID that was assigned to the new large object,
129 or <symbol>InvalidOid</symbol> (zero) on failure.
130 </para>
132 <para>
133 <function>lo_create</> is new as of <productname>PostgreSQL</productname>
134 8.1; if this function is run against an older server version, it will
135 fail and return <symbol>InvalidOid</symbol>.
136 </para>
138 <para>
139 An example:
140 <programlisting>
141 inv_oid = lo_create(conn, desired_oid);
142 </programlisting>
143 </para>
144 </sect2>
146 <sect2>
147 <title>Importing a Large Object</title>
149 <para>
150 To import an operating system file as a large object, call
151 <synopsis>
152 Oid lo_import(PGconn *conn, const char *filename);
153 </synopsis>
154 <indexterm><primary>lo_import</></>
155 <replaceable class="parameter">filename</replaceable>
156 specifies the operating system name of
157 the file to be imported as a large object.
158 The return value is the OID that was assigned to the new large object,
159 or <symbol>InvalidOid</symbol> (zero) on failure.
160 Note that the file is read by the client interface library, not by
161 the server; so it must exist in the client file system and be readable
162 by the client application.
163 </para>
165 <para>
166 The function
167 <synopsis>
168 Oid lo_import_with_oid(PGconn *conn, const char *filename, Oid lobjId);
169 </synopsis>
170 <indexterm><primary>lo_import_with_oid</></>
171 also imports a new large object. The OID to be assigned can be
172 specified by <replaceable class="parameter">lobjId</replaceable>;
173 if so, failure occurs if that OID is already in use for some large
174 object. If <replaceable class="parameter">lobjId</replaceable>
175 is <symbol>InvalidOid</symbol> (zero) then <function>lo_import_with_oid</> assigns an unused
176 OID (this is the same behavior as <function>lo_import</>).
177 The return value is the OID that was assigned to the new large object,
178 or <symbol>InvalidOid</symbol> (zero) on failure.
179 </para>
181 <para>
182 <function>lo_import_with_oid</> is new as of <productname>PostgreSQL</productname>
183 8.4 and uses <function>lo_create</function> internally which is new in 8.1; if this function is run against 8.0 or before, it will
184 fail and return <symbol>InvalidOid</symbol>.
185 </para>
186 </sect2>
188 <sect2>
189 <title>Exporting a Large Object</title>
191 <para>
192 To export a large object
193 into an operating system file, call
194 <synopsis>
195 int lo_export(PGconn *conn, Oid lobjId, const char *filename);
196 </synopsis>
197 <indexterm><primary>lo_export</></>
198 The <parameter>lobjId</parameter> argument specifies the OID of the large
199 object to export and the <parameter>filename</parameter> argument
200 specifies the operating system name of the file. Note that the file is
201 written by the client interface library, not by the server. Returns 1
202 on success, -1 on failure.
203 </para>
204 </sect2>
206 <sect2>
207 <title>Opening an Existing Large Object</title>
209 <para>
210 To open an existing large object for reading or writing, call
211 <synopsis>
212 int lo_open(PGconn *conn, Oid lobjId, int mode);
213 </synopsis>
214 <indexterm><primary>lo_open</></>
215 The <parameter>lobjId</parameter> argument specifies the OID of the large
216 object to open. The <parameter>mode</parameter> bits control whether the
217 object is opened for reading (<symbol>INV_READ</>), writing
218 (<symbol>INV_WRITE</symbol>), or both.
219 (These symbolic constants are defined
220 in the header file <filename>libpq/libpq-fs.h</filename>.)
221 A large object cannot be opened before it is created.
222 <function>lo_open</function> returns a (non-negative) large object
223 descriptor for later use in <function>lo_read</function>,
224 <function>lo_write</function>, <function>lo_lseek</function>,
225 <function>lo_tell</function>, and <function>lo_close</function>.
226 The descriptor is only valid for
227 the duration of the current transaction.
228 On failure, -1 is returned.
229 </para>
231 <para>
232 The server currently does not distinguish between modes
233 <symbol>INV_WRITE</symbol> and <symbol>INV_READ</> <literal>|</>
234 <symbol>INV_WRITE</symbol>: you are allowed to read from the descriptor
235 in either case. However there is a significant difference between
236 these modes and <symbol>INV_READ</> alone: with <symbol>INV_READ</>
237 you cannot write on the descriptor, and the data read from it will
238 reflect the contents of the large object at the time of the transaction
239 snapshot that was active when <function>lo_open</> was executed,
240 regardless of later writes by this or other transactions. Reading
241 from a descriptor opened with <symbol>INV_WRITE</symbol> returns
242 data that reflects all writes of other committed transactions as well
243 as writes of the current transaction. This is similar to the behavior
244 of <literal>SERIALIZABLE</> versus <literal>READ COMMITTED</> transaction
245 modes for ordinary SQL <command>SELECT</> commands.
246 </para>
248 <para>
249 An example:
250 <programlisting>
251 inv_fd = lo_open(conn, inv_oid, INV_READ|INV_WRITE);
252 </programlisting>
253 </para>
254 </sect2>
256 <sect2>
257 <title>Writing Data to a Large Object</title>
259 <para>
260 The function
261 <synopsis>
262 int lo_write(PGconn *conn, int fd, const char *buf, size_t len);
263 </synopsis>
264 <indexterm><primary>lo_write</></> writes
265 <parameter>len</parameter> bytes from <parameter>buf</parameter>
266 to large object descriptor <parameter>fd</>. The <parameter>fd</parameter>
267 argument must have been returned by a previous
268 <function>lo_open</function>. The number of bytes actually
269 written is returned. In the event of an error, the return value
270 is negative.
271 </para>
272 </sect2>
274 <sect2>
275 <title>Reading Data from a Large Object</title>
277 <para>
278 The function
279 <synopsis>
280 int lo_read(PGconn *conn, int fd, char *buf, size_t len);
281 </synopsis>
282 <indexterm><primary>lo_read</></> reads
283 <parameter>len</parameter> bytes from large object descriptor
284 <parameter>fd</parameter> into <parameter>buf</parameter>. The
285 <parameter>fd</parameter> argument must have been returned by a
286 previous <function>lo_open</function>. The number of bytes
287 actually read is returned. In the event of an error, the return
288 value is negative.
289 </para>
290 </sect2>
292 <sect2>
293 <title>Seeking in a Large Object</title>
295 <para>
296 To change the current read or write location associated with a
297 large object descriptor, call
298 <synopsis>
299 int lo_lseek(PGconn *conn, int fd, int offset, int whence);
300 </synopsis>
301 <indexterm><primary>lo_lseek</></> This function moves the
302 current location pointer for the large object descriptor identified by
303 <parameter>fd</> to the new location specified by
304 <parameter>offset</>. The valid values for <parameter>whence</>
305 are <symbol>SEEK_SET</> (seek from object start),
306 <symbol>SEEK_CUR</> (seek from current position), and
307 <symbol>SEEK_END</> (seek from object end). The return value is
308 the new location pointer, or -1 on error.
309 </para>
310 </sect2>
312 <sect2>
313 <title>Obtaining the Seek Position of a Large Object</title>
315 <para>
316 To obtain the current read or write location of a large object descriptor,
317 call
318 <synopsis>
319 int lo_tell(PGconn *conn, int fd);
320 </synopsis>
321 <indexterm><primary>lo_tell</></> If there is an error, the
322 return value is negative.
323 </para>
324 </sect2>
326 <sect2>
327 <title>Truncating a Large Object</title>
329 <para>
330 To truncate a large object to a given length, call
331 <synopsis>
332 int lo_truncate(PGcon *conn, int fd, size_t len);
333 </synopsis>
334 <indexterm><primary>lo_truncate</></> truncates the large object
335 descriptor <parameter>fd</> to length <parameter>len</>. The
336 <parameter>fd</parameter> argument must have been returned by a
337 previous <function>lo_open</function>. If <parameter>len</> is
338 greater than the current large object length, the large object
339 is extended with null bytes ('\0').
340 </para>
342 <para>
343 The file offset is not changed.
344 </para>
346 <para>
347 On success <function>lo_truncate</function> returns
348 zero. On error, the return value is negative.
349 </para>
351 <para>
352 <function>lo_truncate</> is new as of <productname>PostgreSQL</productname>
353 8.3; if this function is run against an older server version, it will
354 fail and return a negative value.
355 </para>
356 </sect2>
358 <sect2>
359 <title>Closing a Large Object Descriptor</title>
361 <para>
362 A large object descriptor can be closed by calling
363 <synopsis>
364 int lo_close(PGconn *conn, int fd);
365 </synopsis>
366 <indexterm><primary>lo_close</></> where <parameter>fd</> is a
367 large object descriptor returned by <function>lo_open</function>.
368 On success, <function>lo_close</function> returns zero. On
369 error, the return value is negative.
370 </para>
372 <para>
373 Any large object descriptors that remain open at the end of a
374 transaction will be closed automatically.
375 </para>
376 </sect2>
378 <sect2>
379 <title>Removing a Large Object</title>
381 <para>
382 To remove a large object from the database, call
383 <synopsis>
384 int lo_unlink(PGconn *conn, Oid lobjId);
385 </synopsis>
386 <indexterm><primary>lo_unlink</></> The
387 <parameter>lobjId</parameter> argument specifies the OID of the
388 large object to remove. Returns 1 if successful, -1 on failure.
389 </para>
390 </sect2>
392 </sect1>
394 <sect1 id="lo-funcs">
395 <title>Server-Side Functions</title>
397 <para>
398 There are server-side functions callable from SQL that correspond to
399 each of the client-side functions described above; indeed, for the
400 most part the client-side functions are simply interfaces to the
401 equivalent server-side functions. The ones that are actually useful
402 to call via SQL commands are
403 <function>lo_creat</function><indexterm><primary>lo_creat</></>,
404 <function>lo_create</function><indexterm><primary>lo_create</></>,
405 <function>lo_unlink</function><indexterm><primary>lo_unlink</></>,
406 <function>lo_import</function><indexterm><primary>lo_import</></>, and
407 <function>lo_export</function><indexterm><primary>lo_export</></>.
408 Here are examples of their use:
410 <programlisting>
411 CREATE TABLE image (
412 name text,
413 raster oid
416 SELECT lo_creat(-1); -- returns OID of new, empty large object
418 SELECT lo_create(43213); -- attempts to create large object with OID 43213
420 SELECT lo_unlink(173454); -- deletes large object with OID 173454
422 INSERT INTO image (name, raster)
423 VALUES ('beautiful image', lo_import('/etc/motd'));
425 INSERT INTO image (name, raster) -- same as above, but specify OID to use
426 VALUES ('beautiful image', lo_import('/etc/motd', 68583));
428 SELECT lo_export(image.raster, '/tmp/motd') FROM image
429 WHERE name = 'beautiful image';
430 </programlisting>
431 </para>
433 <para>
434 The server-side <function>lo_import</function> and
435 <function>lo_export</function> functions behave considerably differently
436 from their client-side analogs. These two functions read and write files
437 in the server's file system, using the permissions of the database's
438 owning user. Therefore, their use is restricted to superusers. In
439 contrast, the client-side import and export functions read and write files
440 in the client's file system, using the permissions of the client program.
441 The client-side functions can be used by any
442 <productname>PostgreSQL</productname> user.
443 </para>
444 </sect1>
446 <sect1 id="lo-examplesect">
447 <title>Example Program</title>
449 <para>
450 <xref linkend="lo-example"> is a sample program which shows how the large object
451 interface
452 in <application>libpq</> can be used. Parts of the program are
453 commented out but are left in the source for the reader's
454 benefit. This program can also be found in
455 <filename>src/test/examples/testlo.c</filename> in the source distribution.
456 </para>
458 <example id="lo-example">
459 <title>Large Objects with <application>libpq</application> Example Program</title>
460 <programlisting>
461 /*--------------------------------------------------------------
463 * testlo.c--
464 * test using large objects with libpq
466 * Copyright (c) 1994, Regents of the University of California
468 *--------------------------------------------------------------
470 #include &lt;stdio.h&gt;
471 #include &quot;libpq-fe.h&quot;
472 #include &quot;libpq/libpq-fs.h&quot;
474 #define BUFSIZE 1024
477 * importFile
478 * import file &quot;in_filename&quot; into database as large object &quot;lobjOid&quot;
482 importFile(PGconn *conn, char *filename)
484 Oid lobjId;
485 int lobj_fd;
486 char buf[BUFSIZE];
487 int nbytes,
488 tmp;
489 int fd;
492 * open the file to be read in
494 fd = open(filename, O_RDONLY, 0666);
495 if (fd &lt; 0)
496 { /* error */
497 fprintf(stderr, &quot;cannot open unix file %s\n&quot;, filename);
501 * create the large object
503 lobjId = lo_creat(conn, INV_READ | INV_WRITE);
504 if (lobjId == 0)
505 fprintf(stderr, &quot;cannot create large object\n&quot;);
507 lobj_fd = lo_open(conn, lobjId, INV_WRITE);
510 * read in from the Unix file and write to the inversion file
512 while ((nbytes = read(fd, buf, BUFSIZE)) &gt; 0)
514 tmp = lo_write(conn, lobj_fd, buf, nbytes);
515 if (tmp &lt; nbytes)
516 fprintf(stderr, &quot;error while reading large object\n&quot;);
519 (void) close(fd);
520 (void) lo_close(conn, lobj_fd);
522 return lobjId;
525 void
526 pickout(PGconn *conn, Oid lobjId, int start, int len)
528 int lobj_fd;
529 char *buf;
530 int nbytes;
531 int nread;
533 lobj_fd = lo_open(conn, lobjId, INV_READ);
534 if (lobj_fd &lt; 0)
536 fprintf(stderr, &quot;cannot open large object %d\n&quot;,
537 lobjId);
540 lo_lseek(conn, lobj_fd, start, SEEK_SET);
541 buf = malloc(len + 1);
543 nread = 0;
544 while (len - nread &gt; 0)
546 nbytes = lo_read(conn, lobj_fd, buf, len - nread);
547 buf[nbytes] = ' ';
548 fprintf(stderr, &quot;&gt;&gt;&gt; %s&quot;, buf);
549 nread += nbytes;
551 free(buf);
552 fprintf(stderr, &quot;\n&quot;);
553 lo_close(conn, lobj_fd);
556 void
557 overwrite(PGconn *conn, Oid lobjId, int start, int len)
559 int lobj_fd;
560 char *buf;
561 int nbytes;
562 int nwritten;
563 int i;
565 lobj_fd = lo_open(conn, lobjId, INV_WRITE);
566 if (lobj_fd &lt; 0)
568 fprintf(stderr, &quot;cannot open large object %d\n&quot;,
569 lobjId);
572 lo_lseek(conn, lobj_fd, start, SEEK_SET);
573 buf = malloc(len + 1);
575 for (i = 0; i &lt; len; i++)
576 buf[i] = 'X';
577 buf[i] = ' ';
579 nwritten = 0;
580 while (len - nwritten &gt; 0)
582 nbytes = lo_write(conn, lobj_fd, buf + nwritten, len - nwritten);
583 nwritten += nbytes;
585 free(buf);
586 fprintf(stderr, &quot;\n&quot;);
587 lo_close(conn, lobj_fd);
591 * exportFile
592 * export large object &quot;lobjOid&quot; to file &quot;out_filename&quot;
595 void
596 exportFile(PGconn *conn, Oid lobjId, char *filename)
598 int lobj_fd;
599 char buf[BUFSIZE];
600 int nbytes,
601 tmp;
602 int fd;
605 * open the large object
607 lobj_fd = lo_open(conn, lobjId, INV_READ);
608 if (lobj_fd &lt; 0)
610 fprintf(stderr, &quot;cannot open large object %d\n&quot;,
611 lobjId);
615 * open the file to be written to
617 fd = open(filename, O_CREAT | O_WRONLY, 0666);
618 if (fd &lt; 0)
619 { /* error */
620 fprintf(stderr, &quot;cannot open unix file %s\n&quot;,
621 filename);
625 * read in from the inversion file and write to the Unix file
627 while ((nbytes = lo_read(conn, lobj_fd, buf, BUFSIZE)) &gt; 0)
629 tmp = write(fd, buf, nbytes);
630 if (tmp &lt; nbytes)
632 fprintf(stderr, &quot;error while writing %s\n&quot;,
633 filename);
637 (void) lo_close(conn, lobj_fd);
638 (void) close(fd);
640 return;
643 void
644 exit_nicely(PGconn *conn)
646 PQfinish(conn);
647 exit(1);
651 main(int argc, char **argv)
653 char *in_filename,
654 *out_filename;
655 char *database;
656 Oid lobjOid;
657 PGconn *conn;
658 PGresult *res;
660 if (argc != 4)
662 fprintf(stderr, &quot;Usage: %s database_name in_filename out_filename\n&quot;,
663 argv[0]);
664 exit(1);
667 database = argv[1];
668 in_filename = argv[2];
669 out_filename = argv[3];
672 * set up the connection
674 conn = PQsetdb(NULL, NULL, NULL, NULL, database);
676 /* check to see that the backend connection was successfully made */
677 if (PQstatus(conn) == CONNECTION_BAD)
679 fprintf(stderr, &quot;Connection to database '%s' failed.\n&quot;, database);
680 fprintf(stderr, &quot;%s&quot;, PQerrorMessage(conn));
681 exit_nicely(conn);
684 res = PQexec(conn, &quot;begin&quot;);
685 PQclear(res);
687 printf(&quot;importing file %s\n&quot;, in_filename);
688 /* lobjOid = importFile(conn, in_filename); */
689 lobjOid = lo_import(conn, in_filename);
691 printf(&quot;as large object %d.\n&quot;, lobjOid);
693 printf(&quot;picking out bytes 1000-2000 of the large object\n&quot;);
694 pickout(conn, lobjOid, 1000, 1000);
696 printf(&quot;overwriting bytes 1000-2000 of the large object with X's\n&quot;);
697 overwrite(conn, lobjOid, 1000, 1000);
700 printf(&quot;exporting large object to file %s\n&quot;, out_filename);
701 /* exportFile(conn, lobjOid, out_filename); */
702 lo_export(conn, lobjOid, out_filename);
704 res = PQexec(conn, &quot;end&quot;);
705 PQclear(res);
706 PQfinish(conn);
707 exit(0);
709 </programlisting>
710 </example>
712 </sect1>
713 </chapter>