# Support for UNOIDL Registry Formats

`Library_unoidl` contains the `unoidl::Manager` and `unoidl::Provider` implementations
for the following registry formats:

* The new `UNOIDL` binary `types.rdb` format.
* The legacy binary `types.rdb` format (based on modules "store" and
  "registry").
* A source-file format, reading (multiple) `UNOIDL` entity definitions directly
  from a single `.idl` source file.
* A source-tree format, reading `UNOIDL` entity definitions directly from a tree
  of `.idl` source files rooted at a given directory.  (An entity named
  `foo.bar.Baz` is expected in a file named `foo/bar/Baz.idl` within that tree.)

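The naming rule of the source-tree format can be illustrated with a small Python sketch. The helper name `idl_path_for` is hypothetical and not part of `unoidl`; it only shows how a dotted entity name maps to a path relative to the tree root.

```python
from pathlib import PurePosixPath


def idl_path_for(entity_name: str) -> PurePosixPath:
    """Map a dotted UNOIDL entity name to its expected path in a source tree.

    Hypothetical helper for illustration only.
    """
    parts = entity_name.split(".")
    if not all(parts):
        # Rejects "", "foo..Baz", ".foo", and similar malformed names.
        raise ValueError(f"malformed entity name: {entity_name!r}")
    # Each name segment becomes a directory, the last one the .idl file.
    return PurePosixPath(*parts).with_suffix(".idl")
```

For example, `idl_path_for("foo.bar.Baz")` yields the relative path `foo/bar/Baz.idl`.
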
(While `.idl` files still contain `#include` directives for legacy idlc, the
source-based formats ignore any preprocessing directives starting with `#` in the
`.idl` files.)  `unoidl::Manager::addProvider` transparently detects the registry
format for a given URI and instantiates the corresponding provider implementation.

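As a rough illustration of what ignoring preprocessing directives can mean, the following Python sketch blanks out every line whose first non-blank character is `#`. This is an assumption about the effect only; how the actual source providers skip such directives is not described here, and `strip_directives` is a hypothetical name.

```python
def strip_directives(idl_text: str) -> str:
    """Blank out lines that look like preprocessing directives.

    Assumption: a directive is any line whose first non-blank character
    is '#'. Blanked lines are kept as empty lines so that line numbers in
    later diagnostics still match the original file.
    """
    kept = []
    for line in idl_text.splitlines():
        kept.append("" if line.lstrip().startswith("#") else line)
    return "\n".join(kept)
```
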
`Executable_unoidl-write` is a helper tool to convert from any of the registry
formats to the `UNOIDL` format.  It is used at build-time to compile `UNOIDL` format
`.rdb` files (that are used at build-time only, or included in installation sets
in `URE` or `program/types/` or as part of bundled extensions that are created
during the build and not merely included as pre-built `.oxt` files) from source
`.idl` files.  (The SDK still uses idlc and generates legacy format `.rdb` files for
now.)

`Executable_unoidl-read` is a helper tool to convert from any of the registry
formats to the source-file format.  It can be used manually after a LibreOffice
version update to create new reference registries for `Executable_unoidl-check`.

`Executable_unoidl-check` is a helper tool to check that one registry is
backwards-compatible with another registry.  It is used at build-time to detect
inadvertent breakage of the udkapi and offapi APIs.

## Specification of the New UNOIDL types.rdb Format

The format uses byte-oriented, platform-independent, binary files.  Larger
quantities are stored LSB first, without alignment requirements.  Offsets are
32 bit, effectively limiting the overall file size to 4GB, but that is not
considered a limitation in practice (and avoids unnecessary bloat compared to
64 bit offsets).

Annotations can be added for (non-module) entities and certain parts of such
entities, and consist of arbitrary sequences of name/value strings.  (For example,
both an interface type definition and a direct method of an interface type
definition can be annotated; the idea is that annotations can be added for direct
parts that form a "many-to-one" relationship.  There is a tradeoff between
generality of concept and size of representation, esp. for the C++ representation
types in namespace `unoidl`.)
Each name/value string is encoded as a single UTF-8 string containing a name (an
arbitrary sequence of Unicode code points not containing `U+003D EQUALS SIGN`),
optionally followed by `U+003D EQUALS SIGN` and a value (an arbitrary sequence of
Unicode code points).  The only annotation name currently in use is "deprecated"
(without a value).

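The name/value encoding can be sketched in Python as follows. `parse_annotation` is a hypothetical helper, and the `since` annotation used in the usage note is invented purely to show a value; only "deprecated" is stated to be in use.

```python
from typing import Optional, Tuple


def parse_annotation(raw: bytes) -> Tuple[str, Optional[str]]:
    """Split one UTF-8 encoded annotation string into (name, value).

    The name ends at the first U+003D EQUALS SIGN; everything after it,
    including any further '=' characters, is the value. An annotation
    without '=' has no value, reported here as None.
    """
    text = raw.decode("utf-8")
    name, sep, value = text.partition("=")
    return name, (value if sep else None)
```

With this sketch, `parse_annotation(b"deprecated")` gives `("deprecated", None)`, while a made-up `b"since=4.0"` would give `("since", "4.0")`.
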
The following definitions are used throughout:

* `UInt16`: 2-byte value, LSB first
* `UInt32`: 4-byte value, LSB first
* `UInt64`: 8-byte value, LSB first
* Offset: `UInt32` value, counting bytes from start of file
* `NUL`-Name: zero or more non-`NUL` US-ASCII bytes followed by a `NUL` byte
* Len-String: `UInt32` number of characters, with `0x80000000` bit 0, followed by
  that many US-ASCII (for `UNOIDL` related names) or UTF-8 (for annotations)
  bytes
* Idx-String: either an Offset (with `0x80000000` bit 1) of a Len-String, or a
  Len-String
* Annotations: `UInt32` number `N` of annotations followed by `N * Idx-String`
* Entry: Offset of `NUL`-Name followed by Offset of payload
* Map: zero or more Entries

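The definitions above translate fairly directly into small reader functions. The following Python sketch is one possible reading, not the `unoidl` implementation: all function names are hypothetical, each reader returns the decoded value together with the position just past it, the Len-String length is assumed to count bytes, and the target of an indirect Idx-String is assumed to be the low 31 bits of the word.

```python
import struct

OFFSET_FLAG = 0x80000000  # set on an Idx-String that is an Offset


def read_u16(buf, pos):
    return struct.unpack_from("<H", buf, pos)[0], pos + 2


def read_u32(buf, pos):
    return struct.unpack_from("<I", buf, pos)[0], pos + 4


def read_u64(buf, pos):
    return struct.unpack_from("<Q", buf, pos)[0], pos + 8


def read_nul_name(buf, pos):
    end = buf.index(b"\0", pos)  # raises ValueError if unterminated
    return buf[pos:end].decode("ascii"), end + 1


def read_len_string(buf, pos, encoding="ascii"):
    length, pos = read_u32(buf, pos)
    if length & OFFSET_FLAG:
        raise ValueError("expected a Len-String, found an Offset")
    # Assumption: the length counts bytes, not decoded characters.
    return buf[pos:pos + length].decode(encoding), pos + length


def read_idx_string(buf, pos, encoding="ascii"):
    word, _ = read_u32(buf, pos)
    if word & OFFSET_FLAG:
        # Indirect form: follow the Offset, then continue after the word.
        text, _ = read_len_string(buf, word & 0x7FFFFFFF, encoding)
        return text, pos + 4
    return read_len_string(buf, pos, encoding)


def read_annotations(buf, pos):
    count, pos = read_u32(buf, pos)
    items = []
    for _ in range(count):
        text, pos = read_idx_string(buf, pos, "utf-8")
        items.append(text)
    return items, pos
```

The `"<"` prefix in the `struct` format strings selects LSB-first byte order with no alignment padding, which matches the "without alignment requirements" rule of the format.
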
The file starts with an 8 byte header, followed by information about the root
map (`unoidl-write` generates files in a single depth-first pass, so the root map
itself is at the end of the file):

* 7 byte magic header `UNOIDL\xFF`
* version byte 0
* Offset of root Map
* `UInt32` number of entries of root Map
...

Files generated by `unoidl-write` follow that by a

    "\0** Created by LibreOffice " LIBO_VERSION_DOTTED " unoidl-write **\0"

banner (cf. `config_host/config_version.h.in`), as a debugging aid.  (Old versions
used `reg2unoidl` instead of `unoidl-write` in that banner.)

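A minimal Python sketch of reading the header and the Entries of a Map is shown below. The function names are hypothetical, and the sketch assumes that the Offset of a Map points at its first Entry and that the Entries are stored contiguously, 8 bytes each.

```python
import struct

MAGIC = b"UNOIDL\xff"


def parse_header(buf):
    """Return (offset of root Map, number of root Map entries)."""
    if len(buf) < 16:
        raise ValueError("file too short for header")
    if buf[:7] != MAGIC:
        raise ValueError("bad magic header")
    if buf[7] != 0:
        raise ValueError(f"unsupported version {buf[7]}")
    root_offset, root_count = struct.unpack_from("<II", buf, 8)
    return root_offset, root_count


def read_map_entries(buf, offset, count):
    """Return [(name, payload offset)] for the Entries of one Map.

    Assumption: Entries are laid out back to back starting at `offset`.
    """
    entries = []
    for i in range(count):
        name_off, payload_off = struct.unpack_from("<II", buf, offset + 8 * i)
        end = buf.index(b"\0", name_off)  # NUL-Name terminator
        entries.append((buf[name_off:end].decode("ascii"), payload_off))
    return entries
```

Because the root Map sits at the end of the file, a reader along these lines needs random access to the whole file (for example via `mmap` or by reading it into memory) rather than a forward-only stream.
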
Layout of per-entry payload in the root or a module Map:

* kind byte:

    * 0: module
        * followed by:
            * `UInt32` number `N1` of entries of Map
            * `N1 * Entry`

    * otherwise:
        * `0x80` bit: 1 if published
        * `0x40` bit: 1 if annotated
        * `0x20` bit: flag (may only be 1 for certain kinds, see below)
        * remaining bits:

            * 1: enum type
                * followed by:
                    * `UInt32` number `N1` of members
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * `UInt32`
                        * if annotated: Annotations

            * 2: plain struct type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 3: polymorphic struct type template
                * followed by:
                    * `UInt32` number `N1` of type parameters
                    * `N1 * Idx-String`
                    * `UInt32` number `N2` of members
                    * `N2 * tuple` of:
                        * kind byte: `0x01` bit is 1 if parameterized type
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 4: exception type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 5: interface type
                * followed by:
                    * `UInt32` number `N1` of direct mandatory bases
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional bases
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct attributes
                    * `N3 * tuple` of:
                        * kind byte:
                            * `0x02` bit: 1 if read-only
                            * `0x01` bit: 1 if bound
                        * `Idx-String` name
                        * `Idx-String` type
                        * `UInt32` number `N4` of get exceptions
                        * `N4 * Idx-String`
                        * `UInt32` number `N5` of set exceptions
                        * `N5 * Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N6` of direct methods
                    * `N6 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` return type
                        * `UInt32` number `N7` of parameters
                        * `N7 * tuple` of:
                            * direction byte: 0 for in, 1 for out, 2 for in-out
                            * `Idx-String` name
                            * `Idx-String` type
                        * `UInt32` number `N8` of exceptions
                        * `N8 * Idx-String`
                        * if annotated: Annotations

            * 6: typedef
                * followed by:
                    * `Idx-String`

            * 7: constant group
                * followed by:
                    * `UInt32` number `N1` of entries of Map
                    * `N1 * Entry`

            * 8: single-interface--based service (with default constructor if flag is 1)
                * followed by:
                    * `Idx-String`
                    * if not "with default constructor":
                        * `UInt32` number `N1` of constructors
                        * `N1 * tuple` of:
                            * `Idx-String`
                            * `UInt32` number `N2` of parameters
                            * `N2 * tuple` of:
                                * kind byte: `0x04` bit is 1 if rest parameter
                                * `Idx-String` name
                                * `Idx-String` type
                            * `UInt32` number `N3` of exceptions
                            * `N3 * Idx-String`
                            * if annotated: Annotations

            * 9: accumulation-based service
                * followed by:
                    * `UInt32` number `N1` of direct mandatory base services
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional base services
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct mandatory base interfaces
                    * `N3 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N4` of direct optional base interfaces
                    * `N4 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N5` of direct properties
                    * `N5 * tuple` of:
                        * `UInt16` kind:
                            * `0x0100` bit: 1 if optional
                            * `0x0080` bit: 1 if removable
                            * `0x0040` bit: 1 if maybedefault
                            * `0x0020` bit: 1 if maybeambiguous
                            * `0x0010` bit: 1 if readonly
                            * `0x0008` bit: 1 if transient
                            * `0x0004` bit: 1 if constrained
                            * `0x0002` bit: 1 if bound
                            * `0x0001` bit: 1 if maybevoid
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 10: interface-based singleton
                * followed by:
                    * `Idx-String`

            * 11: service-based singleton
                * followed by:
                    * `Idx-String`

        * if annotated, followed by: Annotations

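The bit layout of the entity kind byte and of the property `UInt16` kind can be decoded as in the Python sketch below. All names are hypothetical, and the restriction of the `0x20` flag to kinds 2, 4, and 8 is read off the "with base" and "with default constructor" notes in the layout above.

```python
from typing import List, NamedTuple

ENTITY_KINDS = {
    1: "enum type", 2: "plain struct type",
    3: "polymorphic struct type template", 4: "exception type",
    5: "interface type", 6: "typedef", 7: "constant group",
    8: "single-interface-based service", 9: "accumulation-based service",
    10: "interface-based singleton", 11: "service-based singleton",
}
FLAG_ALLOWED = {2, 4, 8}  # with base / with default constructor

PROPERTY_FLAGS = [
    (0x0100, "optional"), (0x0080, "removable"), (0x0040, "maybedefault"),
    (0x0020, "maybeambiguous"), (0x0010, "readonly"), (0x0008, "transient"),
    (0x0004, "constrained"), (0x0002, "bound"), (0x0001, "maybevoid"),
]


class EntityKind(NamedTuple):
    kind: str
    published: bool
    annotated: bool
    flag: bool


def decode_entity_kind(byte: int) -> EntityKind:
    if byte == 0:
        return EntityKind("module", False, False, False)
    code = byte & 0x1F  # bits left over after 0x80, 0x40 and 0x20
    if code not in ENTITY_KINDS:
        raise ValueError(f"unknown entity kind {code}")
    flag = bool(byte & 0x20)
    if flag and code not in FLAG_ALLOWED:
        raise ValueError(f"flag bit not allowed for kind {code}")
    return EntityKind(ENTITY_KINDS[code], bool(byte & 0x80),
                      bool(byte & 0x40), flag)


def decode_property_flags(kind: int) -> List[str]:
    """Return the names of all attribute bits set in a property kind."""
    return [name for bit, name in PROPERTY_FLAGS if kind & bit]
```

For instance, the byte `0xC5` decodes to a published, annotated interface type, and the property kind `0x0112` decodes to `optional`, `readonly`, and `bound`.
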
Layout of per-entry payload in a constant group Map:

* kind byte:
    * `0x80` bit: 1 if annotated
    * remaining bits:

        * 0: `BOOLEAN`
            * followed by value byte, 0 represents false, 1 represents true

        * 1: `BYTE`
            * followed by value byte, representing values with two's complement

        * 2: `SHORT`
            * followed by `UInt16` value, representing values with two's complement

        * 3: `UNSIGNED SHORT`
            * followed by `UInt16` value

        * 4: `LONG`
            * followed by `UInt32` value, representing values with two's complement

        * 5: `UNSIGNED LONG`
            * followed by `UInt32` value

        * 6: `HYPER`
            * followed by `UInt64` value, representing values with two's complement

        * 7: `UNSIGNED HYPER`
            * followed by `UInt64` value

        * 8: `FLOAT`
            * followed by 4-byte value, representing values in ISO/IEC 60559 binary32 format,
              LSB first

        * 9: `DOUBLE`
            * followed by 8-byte value, representing values in ISO/IEC 60559 binary64 format,
              LSB first

* if annotated, followed by: Annotations

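Decoding a constant payload can be sketched in Python with the `struct` module, whose signed format codes interpret the stored bytes as two's complement and whose `f` and `d` codes use the binary32 and binary64 formats. The names `CONSTANT_FORMATS` and `read_constant` are hypothetical, and the sketch stops before any trailing Annotations.

```python
import struct

# kind code -> (type name, struct format); "<" means LSB first, unpadded
CONSTANT_FORMATS = {
    0: ("BOOLEAN", "<B"),
    1: ("BYTE", "<b"),
    2: ("SHORT", "<h"),
    3: ("UNSIGNED SHORT", "<H"),
    4: ("LONG", "<i"),
    5: ("UNSIGNED LONG", "<I"),
    6: ("HYPER", "<q"),
    7: ("UNSIGNED HYPER", "<Q"),
    8: ("FLOAT", "<f"),
    9: ("DOUBLE", "<d"),
}


def read_constant(buf, pos):
    """Return (type name, value, annotated, position after the value)."""
    kind = buf[pos]
    annotated = bool(kind & 0x80)
    code = kind & 0x7F
    if code not in CONSTANT_FORMATS:
        raise ValueError(f"unknown constant kind {code}")
    name, fmt = CONSTANT_FORMATS[code]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    if code == 0:
        if value not in (0, 1):
            raise ValueError("BOOLEAN value byte must be 0 or 1")
        value = bool(value)
    return name, value, annotated, pos + 1 + struct.calcsize(fmt)
```

If the returned `annotated` flag is true, the Annotations would start at the returned position.
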