manual/startup.texi

   1 @node Program Basics, Processes, Signal Handling, Top
   2 @c %MENU% Writing the beginning and end of your program
   3 @chapter The Basic Program/System Interface
   4
   5 @cindex process
   6 @cindex program
   7 @cindex address space
   8 @cindex thread of control
   9 @dfn{Processes} are the primitive units for allocation of system
  10 resources.  Each process has its own address space and (usually) one
  11 thread of control.  A process executes a program; you can have multiple
  12 processes executing the same program, but each process has its own copy
  13 of the program within its own address space and executes it
  14 independently of the other copies.  Though it may have multiple threads
  15 of control within the same program and a program may be composed of
  16 multiple logically separate modules, a process always executes exactly
  17 one program.
  18
  19 Note that we are using a specific definition of ``program'' for the
  20 purposes of this manual, which corresponds to a common definition in the
  21 context of Unix systems.  In popular usage, ``program'' enjoys a much
  22 broader definition; it can refer for example to a system's kernel, an
  23 editor macro, a complex package of software, or a discrete section of
  24 code executing within a process.
  25
  26 Writing the program is what this manual is all about.  This chapter
  27 explains the most basic interface between your program and the system
  28 that runs, or calls, it.  This includes passing of parameters (arguments
  29 and environment) from the system, requesting basic services from the
  30 system, and telling the system the program is done.
  31
  32 A program starts another program with the @code{exec} family of system calls.
  33 This chapter looks at program startup from the execee's point of view.  To
  34 see the event from the execor's point of view, see @ref{Executing a File}.
  35
  36 @menu
  37 * Program Arguments::           Parsing your program's command-line arguments
  38 * Environment Variables::       Less direct parameters affecting your program
  39 * Auxiliary Vector::            Least direct parameters affecting your program
  40 * System Calls::                Requesting service from the system
  41 * Program Termination::         Telling the system you're done; return status
  42 @end menu
  43
  44 @node Program Arguments, Environment Variables, , Program Basics
  45 @section Program Arguments
  46 @cindex program arguments
  47 @cindex command line arguments
  48 @cindex arguments, to program
  49
  50 @cindex program startup
  51 @cindex startup of program
  52 @cindex invocation of program
  53 @cindex @code{main} function
  54 @findex main
  55 The system starts a C program by calling the function @code{main}.  It
  56 is up to you to write a function named @code{main}---otherwise, you
  57 won't even be able to link your program without errors.
  58
  59 In @w{ISO C} you can define @code{main} either to take no arguments, or to
  60 take two arguments that represent the command line arguments to the
  61 program, like this:
  62
  63 @smallexample
  64 int main (int @var{argc}, char *@var{argv}[])
  65 @end smallexample
  66
  67 @cindex argc (program argument count)
  68 @cindex argv (program argument vector)
  69 The command line arguments are the whitespace-separated tokens given in
  70 the shell command used to invoke the program; thus, in @samp{cat foo
  71 bar}, the arguments are @samp{foo} and @samp{bar}.  The only way a
  72 program can look at its command line arguments is via the arguments of
  73 @code{main}.  If @code{main} doesn't take arguments, then you cannot get
  74 at the command line.
  75
  76 The value of the @var{argc} argument is the number of command line
  77 arguments.  The @var{argv} argument is a vector of C strings; its
  78 elements are the individual command line argument strings.  The file
  79 name of the program being run is also included in the vector as the
  80 first element; the value of @var{argc} counts this element.  A null
  81 pointer always follows the last element: @code{@var{argv}[@var{argc}]}
  82 is this null pointer.
  83
  84 For the command @samp{cat foo bar}, @var{argc} is 3 and @var{argv} has
  85 three elements, @code{"cat"}, @code{"foo"} and @code{"bar"}.
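
The relationship between @var{argc}, @var{argv} and the terminating null
pointer can be checked with a few lines of code.  The following is only a
minimal sketch; the helper name @code{count_arguments} is made up for
this illustration and is not part of any library.

```c
#include <stddef.h>

/* Count the entries of an argv-style vector by walking it up to the
   terminating null pointer.  For the vector passed to main, the
   result equals argc.  */
int
count_arguments (char *const argv[])
{
  int count = 0;

  while (argv[count] != NULL)
    count++;
  return count;
}
```

Called from @code{main} as @code{count_arguments (argv)}, this should
return the same value as @var{argc}; for @samp{cat foo bar} that is 3.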
  86
  87 In Unix systems you can define @code{main} a third way, using three arguments:
  88
  89 @smallexample
  90 int main (int @var{argc}, char *@var{argv}[], char *@var{envp}[])
  91 @end smallexample
  92
  93 The first two arguments are just the same.  The third argument
  94 @var{envp} gives the program's environment; it is the same as the value
  95 of @code{environ}.  @xref{Environment Variables}.  POSIX.1 does not
  96 allow this three-argument form, so to be portable it is best to write
  97 @code{main} to take two arguments, and use the value of @code{environ}.
  98
  99 @menu
 100 * Argument Syntax::             By convention, options start with a hyphen.
 101 * Parsing Program Arguments::   Ways to parse program options and arguments.
 102 @end menu
 103
 104 @node Argument Syntax, Parsing Program Arguments, , Program Arguments
 105 @subsection Program Argument Syntax Conventions
 106 @cindex program argument syntax
 107 @cindex syntax, for program arguments
 108 @cindex command argument syntax
 109
 110 POSIX recommends these conventions for command line arguments.
 111 @code{getopt} (@pxref{Getopt}) and @code{argp_parse} (@pxref{Argp}) make
 112 it easy to implement them.
 113
 114 @itemize @bullet
 115 @item
 116 Arguments are options if they begin with a hyphen delimiter (@samp{-}).
 117
 118 @item
 119 Multiple options may follow a hyphen delimiter in a single token if
 120 the options do not take arguments.  Thus, @samp{-abc} is equivalent to
 121 @samp{-a -b -c}.
 122
 123 @item
 124 Option names are single alphanumeric characters (as for @code{isalnum};
 125 @pxref{Classification of Characters}).
 126
 127 @item
 128 Certain options require an argument.  For example, the @samp{-o} option
 129 of the @code{ld} command requires an argument: an output file name.
 130
 131 @item
 132 An option and its argument may or may not appear as separate tokens.  (In
 133 other words, the whitespace separating them is optional.)  Thus,
 134 @w{@samp{-o foo}} and @samp{-ofoo} are equivalent.
 135
 136 @item
 137 Options typically precede other non-option arguments.
 138
 139 The implementations of @code{getopt} and @code{argp_parse} in @theglibc{}
 140 normally make it appear as if all the option arguments were
 141 specified before all the non-option arguments for the purposes of
 142 parsing, even if the user of your program intermixed option and
 143 non-option arguments.  They do this by reordering the elements of the
 144 @var{argv} array.  This behavior is nonstandard; if you want to suppress
 145 it, define the @code{_POSIX_OPTION_ORDER} environment variable.
 146 @xref{Standard Environment}.
 147
 148 @item
 149 The argument @samp{--} terminates all options; any following arguments
 150 are treated as non-option arguments, even if they begin with a hyphen.
 151
 152 @item
 153 A token consisting of a single hyphen character is interpreted as an
 154 ordinary non-option argument.  By convention, it is used to specify
 155 input from or output to the standard input and output streams.
 156
 157 @item
 158 Options may be supplied in any order, or appear multiple times.  The
 159 interpretation is left up to the particular application program.
 160 @end itemize
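
Several of the conventions listed above can be seen in a short
@code{getopt} loop.  This is a sketch under stated assumptions, not a
recommended program structure: the option letters @samp{-a}, @samp{-b}
and @samp{-o}, the @code{struct options} type and the function name
@code{parse_options} are all invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Hypothetical option set for this sketch: -a and -b are flags,
   -o takes an argument.  */
struct options
{
  int a;
  int b;
  const char *output;
};

/* Parse ARGV into *OPTS.  Return the index of the first non-option
   argument, or -1 if an unknown option or a missing option argument
   was found.  */
int
parse_options (int argc, char *argv[], struct options *opts)
{
  int c;

  opts->a = 0;
  opts->b = 0;
  opts->output = NULL;
  optind = 1;                   /* Start scanning at argv[1].  */

  while ((c = getopt (argc, argv, "abo:")) != -1)
    switch (c)
      {
      case 'a':
        opts->a = 1;
        break;
      case 'b':
        opts->b = 1;
        break;
      case 'o':
        opts->output = optarg;  /* Points into ARGV; not copied.  */
        break;
      default:
        return -1;
      }
  return optind;
}
```

With this option string, @samp{-ab} sets both flags, @samp{-ofoo} and
@w{@samp{-o foo}} store the same output name, and everything after
@samp{--} is left for the caller as non-option arguments.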
 161
 162 @cindex long-named options
 163 GNU adds @dfn{long options} to these conventions.  Long options consist
 164 of @samp{--} followed by a name made of alphanumeric characters and
 165 dashes.  Option names are typically one to three words long, with
 166 hyphens to separate words.  Users can abbreviate the option names as
 167 long as the abbreviations are unique.
 168
 169 To specify an argument for a long option, write
 170 @samp{--@var{name}=@var{value}}.  This syntax enables a long option to
 171 accept an argument that is itself optional.
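
The @samp{--@var{name}=@var{value}} form can be tried out with
@code{getopt_long}.  The sketch below assumes two made-up long options,
@samp{--output} (which requires a value) and @samp{--verbose}; the
function name @code{find_output_option} is likewise hypothetical.

```c
#include <getopt.h>
#include <stddef.h>

/* Return the value given to the hypothetical --output (or -o) option,
   or a null pointer if the option does not appear in ARGV.  */
const char *
find_output_option (int argc, char *argv[])
{
  static const struct option long_options[] = {
    { "output", required_argument, NULL, 'o' },
    { "verbose", no_argument, NULL, 'v' },
    { NULL, 0, NULL, 0 }
  };
  const char *output = NULL;
  int c;

  optind = 1;                   /* Start scanning at argv[1].  */
  while ((c = getopt_long (argc, argv, "o:v", long_options, NULL)) != -1)
    if (c == 'o')
      output = optarg;
  return output;
}
```

Because abbreviations are accepted as long as they are unique,
@samp{--out=foo} selects the same option as @samp{--output=foo} here.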
 172
 173 Eventually, @gnusystems{} will provide completion for long option names
 174 in the shell.
 175
 176 @node Parsing Program Arguments, , Argument Syntax, Program Arguments
 177 @subsection Parsing Program Arguments
 178
 179 @cindex program arguments, parsing
 180 @cindex command arguments, parsing
 181 @cindex parsing program arguments
 182 If the syntax for the command line arguments to your program is simple
 183 enough, you can simply pick the arguments off from @var{argv} by hand.
 184 But unless your program takes a fixed number of arguments, or all of the
 185 arguments are interpreted in the same way (as file names, for example),
 186 you are usually better off using @code{getopt} (@pxref{Getopt}) or
 187 @code{argp_parse} (@pxref{Argp}) to do the parsing.
 188
 189 @code{getopt} is more standard (the short-option only version of it is a
 190 part of the POSIX standard), but using @code{argp_parse} is often
 191 easier, both for very simple and very complex option structures, because
 192 it does more of the dirty work for you.
 193
 194 @menu
 195 * Getopt::                      Parsing program options using @code{getopt}.
 196 * Argp::                        Parsing program options using @code{argp_parse}.
 197 * Suboptions::                  Some programs need more detailed options.
 198 * Suboptions Example::          This shows how it could be done for @code{mount}.
 199 @end menu
 200
 201 @c Getopt and argp start at the @section level so that there's
 202 @c enough room for their internal hierarchy (mostly a problem with
 203 @c argp).         -Miles
 204
 205 @include getopt.texi
 206 @include argp.texi
 207
 208 @node Suboptions, Suboptions Example, Argp, Parsing Program Arguments
 209 @c This is a @section so that it's at the same level as getopt and argp
 210 @subsubsection Parsing of Suboptions
 211
 212 Having a single level of options is sometimes not enough.  A program
 213 might need too many options for a flat list to stay manageable, or some
 214 of its options might be closely related.
 215
 216 In such cases some programs use suboptions.  One of the most prominent
 217 examples is @code{mount}(8), whose @code{-o} option takes one
 218 argument which is itself a comma-separated list of options.  The
 219 function @code{getsubopt} is available to make code like this easier
 220 to write.
 221
 222 @deftypefun int getsubopt (char **@var{optionp}, char *const *@var{tokens}, char **@var{valuep})
 223 @standards{???, stdlib.h}
 224 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 225 @c getsubopt ok
 226 @c  strchrnul dup ok
 227 @c  memchr dup ok
 228 @c  strncmp dup ok
 229
 230 The @var{optionp} parameter must be a pointer to a variable containing
 231 the address of the string to process.  When the function returns, the
 232 reference is updated to point to the next suboption or to the
 233 terminating @samp{\0} character if there are no more suboptions available.
 234
 235 The @var{tokens} parameter references an array of strings containing the
 236 known suboptions.  Each string must be @samp{\0} terminated, and a null
 237 pointer must mark the end of the array.  When @code{getsubopt} finds a
 238 syntactically valid suboption, it compares it with the strings in the
 239 @var{tokens} array and returns the index of the matching string in
 240 that array.
 241
 242 In case the suboption has an associated value introduced by a @samp{=}
 243 character, a pointer to the value is returned in @var{valuep}.  The
 244 string is @samp{\0} terminated.  If no argument is available
 245 @var{valuep} is set to the null pointer.  By doing this the caller can
 246 check whether a necessary value is given or whether no unexpected value
 247 is present.
 248
 249 In case the next suboption in the string is not mentioned in the
 250 @var{tokens} array the starting address of the suboption including a
 251 possible value is returned in @var{valuep} and the return value of the
 252 function is @samp{-1}.
 253 @end deftypefun
 254
 255 @node Suboptions Example, , Suboptions, Parsing Program Arguments
 256 @subsection Parsing of Suboptions Example
 257
 258 The code which might appear in the @code{mount}(8) program is a perfect
 259 example of the use of @code{getsubopt}:
 260
 261 @smallexample
 262 @include subopt.c.texi
 263 @end smallexample
 264
 265
 266 @node Environment Variables, Auxiliary Vector, Program Arguments, Program Basics
 267 @section Environment Variables
 268
 269 @cindex environment variable
 270 When a program is executed, it receives information about the context in
 271 which it was invoked in two ways.  The first mechanism uses the
 272 @var{argv} and @var{argc} arguments to its @code{main} function, and is
 273 discussed in @ref{Program Arguments}.  The second mechanism uses
 274 @dfn{environment variables} and is discussed in this section.
 275
 276 The @var{argv} mechanism is typically used to pass command-line
 277 arguments specific to the particular program being invoked.  The
 278 environment, on the other hand, keeps track of information that is
 279 shared by many programs, changes infrequently, and is less
 280 frequently used.
 281
 282 The environment variables discussed in this section are the same
 283 environment variables that you set using assignments and the
 284 @code{export} command in the shell.  Programs executed from the shell
 285 inherit all of the environment variables from the shell.
 286 @c !!! xref to right part of bash manual when it exists
 287
 288 @cindex environment
 289 Standard environment variables are used for information about the user's
 290 home directory, terminal type, current locale, and so on; you can define
 291 additional variables for other purposes.  The set of all environment
 292 variables that have values is collectively known as the
 293 @dfn{environment}.
 294
 295 Names of environment variables are case-sensitive and must not contain
 296 the character @samp{=}.  System-defined environment variables are
 297 invariably uppercase.
 298
 299 The values of environment variables can be anything that can be
 300 represented as a string.  A value must not contain an embedded null
 301 character, since this is assumed to terminate the string.
 302
 303
 304 @menu
 305 * Environment Access::          How to get and set the values of
 306                                  environment variables.
 307 * Standard Environment::        These environment variables have
 308                                  standard interpretations.
 309 @end menu
 310
 311 @node Environment Access
 312 @subsection Environment Access
 313 @cindex environment access
 314 @cindex environment representation
 315
 316 The value of an environment variable can be accessed with the
 317 @code{getenv} function.  This is declared in the header file
 318 @file{stdlib.h}.
 319 @pindex stdlib.h
 320
 321 Libraries should use @code{secure_getenv} instead of @code{getenv}, so
 322 that they do not accidentally use untrusted environment variables.
 323 Modifications of environment variables are not allowed in
 324 multi-threaded programs.  The @code{getenv} and @code{secure_getenv}
 325 functions can be safely used in multi-threaded programs.
 326
 327 @deftypefun {char *} getenv (const char *@var{name})
 328 @standards{ISO, stdlib.h}
 329 @safety{@prelim{}@mtsafe{@mtsenv{}}@assafe{}@acsafe{}}
 330 @c Unguarded access to __environ.
 331 This function returns a string that is the value of the environment
 332 variable @var{name}.  You must not modify this string.  In some non-Unix
 333 systems not using @theglibc{}, it might be overwritten by subsequent
 334 calls to @code{getenv} (but not by any other library function).  If the
 335 environment variable @var{name} is not defined, the value is a null
 336 pointer.
 337 @end deftypefun
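
Since @code{getenv} returns a null pointer for an undefined variable,
callers usually need a fallback.  One way to write that, as a minimal
sketch with the invented helper name @code{getenv_or_default}:

```c
#include <stdlib.h>

/* Return the value of the environment variable NAME, or FALLBACK if
   NAME is not defined.  The returned string must not be modified.  */
const char *
getenv_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);

  return value != NULL ? value : fallback;
}
```

A variable that is defined with an empty value is still returned as an
empty string, not replaced by the fallback.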
 338
 339 @deftypefun {char *} secure_getenv (const char *@var{name})
 340 @standards{GNU, stdlib.h}
 341 @safety{@prelim{}@mtsafe{@mtsenv{}}@assafe{}@acsafe{}}
 342 @c Calls getenv unless secure mode is enabled.
 343 This function is similar to @code{getenv}, but it returns a null
 344 pointer if the environment is untrusted.  This happens when the
 345 program file has SUID or SGID bits set.  General-purpose libraries
 346 should always prefer this function over @code{getenv} to avoid
 347 vulnerabilities if the library is referenced from a SUID/SGID program.
 348
 349 This function is a GNU extension.
 350 @end deftypefun
 351
 352
 353 @deftypefun int putenv (char *@var{string})
 354 @standards{SVID, stdlib.h}
 355 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{}}}
 356 @c putenv @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 357 @c  strchr dup ok
 358 @c  strndup dup @ascuheap @acsmem
 359 @c  add_to_environ dup @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 360 @c  free dup @ascuheap @acsmem
 361 @c  unsetenv dup @mtasuconst:@mtsenv @asulock @aculock
 362 The @code{putenv} function adds or removes definitions from the environment.
 363 If the @var{string} is of the form @samp{@var{name}=@var{value}}, the
 364 definition is added to the environment.  Otherwise, the @var{string} is
 365 interpreted as the name of an environment variable, and any definition
 366 for this variable in the environment is removed.
 367
 368 If the function is successful it returns @code{0}.  Otherwise the return
 369 value is nonzero and @code{errno} is set to indicate the error.
 370
 371 The difference from the @code{setenv} function is that the exact string
 372 given as the parameter @var{string} is put into the environment.  If the
 373 program changes the string after the @code{putenv} call, the change is
 374 reflected automatically in the environment.  This also requires that
 375 @var{string} not be an automatic variable whose scope is left before the
 376 variable is removed from the environment.  The same applies of course to
 377 dynamically allocated strings which are freed later.
 378
 379 This function is part of the extended Unix interface.  You should define
 380 @code{_XOPEN_SOURCE} before including any header.
 381 @end deftypefun
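
The fact that @code{putenv} stores the caller's string itself, not a
copy, can be made visible with a small experiment.  This is only a
sketch: the variable name @code{DEMO_MODE} and both function names are
invented for the illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Static storage, so the string outlives every caller.  putenv puts
   this exact array into the environment.  */
static char demo_entry[] = "DEMO_MODE=fast";

/* Add the entry to the environment.  Return 0 on success.  */
int
install_demo_entry (void)
{
  return putenv (demo_entry);
}

/* Overwrite the value in place, without calling putenv again.  The
   new value has the same length as the old one, so it fits.  */
void
switch_demo_mode_to_slow (void)
{
  strcpy (demo_entry + strlen ("DEMO_MODE="), "slow");
}
```

After @code{install_demo_entry}, a call to @code{getenv ("DEMO_MODE")}
yields @samp{fast}; after @code{switch_demo_mode_to_slow} it yields
@samp{slow}, even though the environment was never updated explicitly.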
 382
 383
 384 @deftypefun int setenv (const char *@var{name}, const char *@var{value}, int @var{replace})
 385 @standards{BSD, stdlib.h}
 386 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{}}}
 387 @c setenv @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 388 @c  add_to_environ @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 389 @c   strlen dup ok
 390 @c   libc_lock_lock @asulock @aculock
 391 @c   strncmp dup ok
 392 @c   realloc dup @ascuheap @acsmem
 393 @c   libc_lock_unlock @aculock
 394 @c   malloc dup @ascuheap @acsmem
 395 @c   free dup @ascuheap @acsmem
 396 @c   mempcpy dup ok
 397 @c   memcpy dup ok
 398 @c   KNOWN_VALUE ok
 399 @c    tfind(strcmp) [no @mtsrace guarded access]
 400 @c     strcmp dup ok
 401 @c   STORE_VALUE @ascuheap @acucorrupt @acsmem
 402 @c    tsearch(strcmp) @ascuheap @acucorrupt @acsmem [no @mtsrace or @asucorrupt guarded access makes for mtsafe and @asulock]
 403 @c     strcmp dup ok
 404 The @code{setenv} function can be used to add a new definition to the
 405 environment.  The entry with the name @var{name} is set to
 406 @samp{@var{name}=@var{value}}.  Please note that this is also true
 407 if @var{value} is the empty string.  To do this a new string is created
 408 and the strings @var{name} and @var{value} are copied.  A null pointer
 409 for the @var{value} parameter is illegal.  If the environment already
 410 contains an entry with key @var{name}, the @var{replace} parameter
 411 controls the action.  If @var{replace} is zero, nothing happens.  Otherwise
 412 the old entry is replaced by the new one.
 413
 414 Please note that you cannot remove an entry completely using this function.
 415
 416 If the function is successful it returns @code{0}.  Otherwise the
 417 environment is unchanged and the return value is @code{-1} and
 418 @code{errno} is set.
 419
 420 This function was originally part of the BSD library but is now part of
 421 the Unix standard.
 422 @end deftypefun
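
The effect of the @var{replace} parameter is easiest to see with two
thin wrappers.  The names @code{set_default} and @code{set_override}
are hypothetical and exist only for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Define NAME as VALUE only if NAME is not already in the
   environment; an existing entry is left untouched.  */
int
set_default (const char *name, const char *value)
{
  return setenv (name, value, 0);
}

/* Define NAME as VALUE, replacing any existing entry.  */
int
set_override (const char *name, const char *value)
{
  return setenv (name, value, 1);
}
```

Both wrappers return 0 when the variable already exists; with
@code{set_default} that success simply means nothing was changed.
Passing an empty @var{value} to @code{set_override} leaves the variable
defined with an empty value, it does not remove the entry.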
 423
 424 @deftypefun int unsetenv (const char *@var{name})
 425 @standards{BSD, stdlib.h}
 426 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@asulock{}}@acunsafe{@aculock{}}}
 427 @c unsetenv @mtasuconst:@mtsenv @asulock @aculock
 428 @c  strchr dup ok
 429 @c  strlen dup ok
 430 @c  libc_lock_lock @asulock @aculock
 431 @c  strncmp dup ok
 432 @c  libc_lock_unlock @aculock
 433 Using this function one can remove an entry completely from the
 434 environment.  If the environment contains an entry with the key
 435 @var{name} this whole entry is removed.  A call to this function is
 436 equivalent to a call to @code{putenv} with a string that consists of
 437 @var{name} alone, without a @samp{=} character.
 438
 439 The function returns @code{-1} if @var{name} is a null pointer, points to
 440 an empty string, or points to a string containing a @code{=} character.
 441 It returns @code{0} if the call succeeded.
 442
 443 This function was originally part of the BSD library but is now part of
 444 the Unix standard.  The BSD version had no return value, though.
 445 @end deftypefun
 446
 447 There is one more function to modify the whole environment.  This
 448 function is said to be used in POSIX.9 (POSIX bindings for Fortran
 449 77), so one would expect it to have made it into POSIX.1.  This never
 450 happened, but we still provide this function as a GNU extension
 451 to enable writing standard compliant Fortran environments.
 452
 453 @deftypefun int clearenv (void)
 454 @standards{GNU, stdlib.h}
 455 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 456 @c clearenv @mtasuconst:@mtsenv @ascuheap @asulock @aculock @acsmem
 457 @c  libc_lock_lock @asulock @aculock
 458 @c  free dup @ascuheap @acsmem
 459 @c  libc_lock_unlock @aculock
 460 The @code{clearenv} function removes all entries from the environment.
 461 Using @code{putenv} and @code{setenv} new entries can be added again
 462 later.
 463
 464 If the function is successful it returns @code{0}.  Otherwise the return
 465 value is nonzero.
 466 @end deftypefun
 467
 468
 469 You can deal directly with the underlying representation of environment
 470 objects to add more variables to the environment (for example, to
 471 communicate with another program you are about to execute;
 472 @pxref{Executing a File}).
 473
 474 @deftypevar {char **} environ
 475 @standards{POSIX.1, unistd.h}
 476 The environment is represented as an array of strings.  Each string is
 477 of the format @samp{@var{name}=@var{value}}.  The order in which
 478 strings appear in the environment is not significant, but the same
 479 @var{name} must not appear more than once.  The last element of the
 480 array is a null pointer.
 481
 482 This variable is declared in the header file @file{unistd.h}.
 483
 484 If you just want to get the value of an environment variable, use
 485 @code{getenv}.
 486 @end deftypevar
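
Walking the array is straightforward because of the terminating null
pointer.  The following sketch counts the entries; the function name
@code{count_environment_entries} is invented here, and the check for a
null @code{environ} is a defensive assumption, covering implementations
where @code{clearenv} leaves the variable as a null pointer.

```c
#include <stddef.h>

/* Declared by the program itself, as POSIX permits.  */
extern char **environ;

/* Return the number of NAME=VALUE strings in the environment.  */
size_t
count_environment_entries (void)
{
  size_t count = 0;

  if (environ != NULL)
    for (char **entry = environ; *entry != NULL; entry++)
      count++;
  return count;
}
```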
 487
 488 Unix systems, and @gnusystems{}, pass the initial value of
 489 @code{environ} as the third argument to @code{main}.
 490 @xref{Program Arguments}.
 491
 492 @node Standard Environment
 493 @subsection Standard Environment Variables
 494 @cindex standard environment variables
 495
 496 These environment variables have standard meanings.  This doesn't mean
 497 that they are always present in the environment; but if these variables
 498 @emph{are} present, they have these meanings.  You shouldn't try to use
 499 these environment variable names for some other purpose.
 500
 501 @comment Extra blank lines make it look better.
 502 @table @code
 503 @item HOME
 504 @cindex @code{HOME} environment variable
 505 @cindex home directory
 506
 507 This is a string representing the user's @dfn{home directory}, or
 508 initial default working directory.
 509
 510 The user can set @code{HOME} to any value.
 511 If you need to make sure to obtain the proper home directory
 512 for a particular user, you should not use @code{HOME}; instead,
 513 look up the user's name in the user database (@pxref{User Database}).
 514
 515 For most purposes, it is better to use @code{HOME}, precisely because
 516 this lets the user specify the value.
 517
 518 @c !!! also USER
 519 @item LOGNAME
 520 @cindex @code{LOGNAME} environment variable
 521
 522 This is the name that the user used to log in.  Since the value in the
 523 environment can be tweaked arbitrarily, this is not a reliable way to
 524 identify the user who is running a program; a function like
 525 @code{getlogin} (@pxref{Who Logged In}) is better for that purpose.
 526
 527 For most purposes, it is better to use @code{LOGNAME}, precisely because
 528 this lets the user specify the value.
 529
 530 @item PATH
 531 @cindex @code{PATH} environment variable
 532
 533 A @dfn{path} is a sequence of directory names which is used for
 534 searching for a file.  The variable @code{PATH} holds a path used
 535 for searching for programs to be run.
 536
 537 The @code{execlp} and @code{execvp} functions (@pxref{Executing a File})
 538 use this environment variable, as do many shells and other utilities
 539 which are implemented in terms of those functions.
 540
 541 The syntax of a path is a sequence of directory names separated by
 542 colons.  An empty string instead of a directory name stands for the
 543 current directory (@pxref{Working Directory}).
 544
 545 A typical value for this environment variable might be a string like:
 546
 547 @smallexample
 548 :/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local/bin
 549 @end smallexample
 550
 551 This means that if the user tries to execute a program named @code{foo},
 552 the system will look for files named @file{foo}, @file{/bin/foo},
 553 @file{/etc/foo}, and so on.  The first of these files that exists is
 554 the one that is executed.
 555
 556 @c !!! also TERMCAP
 557 @item TERM
 558 @cindex @code{TERM} environment variable
 559
 560 This specifies the kind of terminal that is receiving program output.
 561 Some programs can make use of this information to take advantage of
 562 special escape sequences or terminal modes supported by particular kinds
 563 of terminals.  Many programs which use the termcap library
 564 (@pxref{Finding a Terminal Description,Find,,termcap,The Termcap Library
 565 Manual}) use the @code{TERM} environment variable, for example.
 566
 567 @item TZ
 568 @cindex @code{TZ} environment variable
 569
 570 This specifies the time zone.  @xref{TZ Variable}, for information about
 571 the format of this string and how it is used.
 572
 573 @item LANG
 574 @cindex @code{LANG} environment variable
 575
 576 This specifies the default locale to use for attribute categories where
 577 neither @code{LC_ALL} nor the specific environment variable for that
 578 category is set.  @xref{Locales}, for more information about
 579 locales.
 580
 581 @ignore
 582 @c I doubt this really exists
 583 @item LC_ALL
 584 @cindex @code{LC_ALL} environment variable
 585
 586 This is similar to the @code{LANG} environment variable.  However, its
 587 value takes precedence over any values provided for the individual
 588 attribute category environment variables, or for the @code{LANG}
 589 environment variable.
 590 @end ignore
 591
 592 @item LC_ALL
 593 @cindex @code{LC_ALL} environment variable
 594
 595 If this environment variable is set it overrides the selection for all
 596 the locales done using the other @code{LC_*} environment variables.  The
 597 value of the other @code{LC_*} environment variables is simply ignored
 598 in this case.
 599
 600 @item LC_COLLATE
 601 @cindex @code{LC_COLLATE} environment variable
 602
 603 This specifies what locale to use for string sorting.
 604
 605 @item LC_CTYPE
 606 @cindex @code{LC_CTYPE} environment variable
 607
 608 This specifies what locale to use for character sets and character
 609 classification.
 610
 611 @item LC_MESSAGES
 612 @cindex @code{LC_MESSAGES} environment variable
 613
 614 This specifies what locale to use for printing messages and parsing
 615 responses.
 616
 617 @item LC_MONETARY
 618 @cindex @code{LC_MONETARY} environment variable
 619
 620 This specifies what locale to use for formatting monetary values.
 621
 622 @item LC_NUMERIC
 623 @cindex @code{LC_NUMERIC} environment variable
 624
 625 This specifies what locale to use for formatting numbers.
 626
 627 @item LC_TIME
 628 @cindex @code{LC_TIME} environment variable
 629
 630 This specifies what locale to use for formatting date/time values.
 631
 632 @item NLSPATH
 633 @cindex @code{NLSPATH} environment variable
 634
 635 This specifies the directories in which the @code{catopen} function
 636 looks for message translation catalogs.
 637
 638 @item _POSIX_OPTION_ORDER
 639 @cindex @code{_POSIX_OPTION_ORDER} environment variable
 640
 641 If this environment variable is defined, it suppresses the usual
 642 reordering of command line arguments by @code{getopt} and
 643 @code{argp_parse}.  @xref{Argument Syntax}.
 644
 645 @c !!! GNU also has COREFILE, CORESERVER, EXECSERVERS
 646 @end table
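
The path syntax described under @code{PATH} above, including the rule
that an empty string stands for the current directory, can be sketched
as follows.  The function @code{path_component} is a hypothetical helper
written for this illustration; it is not how @code{execvp} itself is
implemented.

```c
#include <string.h>

/* Copy component number INDEX (counting from zero) of the
   colon-separated PATH into BUF, which has room for SIZE bytes.  An
   empty component is returned as ".", the current directory.  Return
   0 on success, or -1 if there is no such component or BUF is too
   small.  */
int
path_component (const char *path, size_t index, char *buf, size_t size)
{
  const char *start = path;

  for (;;)
    {
      size_t len = strcspn (start, ":");

      if (index == 0)
        {
          if (len == 0)
            {
              start = ".";
              len = 1;
            }
          if (len >= size)
            return -1;
          memcpy (buf, start, len);
          buf[len] = '\0';
          return 0;
        }
      if (start[len] == '\0')
        return -1;              /* Ran out of components.  */
      start += len + 1;         /* Skip the component and its colon.  */
      index--;
    }
}
```

For the value @samp{:/bin:/etc}, component 0 is @file{.}, component 1
is @file{/bin} and component 2 is @file{/etc}.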
 647
 648 @node Auxiliary Vector
 649 @section Auxiliary Vector
 650 @cindex auxiliary vector
 651
 652 When a program is executed, it receives information from the operating
 653 system about the environment in which it is operating.  The form of this
 654 information is a table of key-value pairs, where the keys are from the
 655 set of @samp{AT_} values in @file{elf.h}.  Some of the data is provided
 656 by the kernel for libc consumption, and may be obtained by ordinary
 657 interfaces, such as @code{sysconf}.  However, on a platform-by-platform
 658 basis there may be information that is not available any other way.
 659
 660 @subsection Definition of @code{getauxval}
 661 @deftypefun {unsigned long int} getauxval (unsigned long int @var{type})
 662 @standards{???, sys/auxv.h}
 663 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 664 @c Reads from hwcap or iterates over constant auxv.
 665 This function is used to inquire about the entries in the auxiliary
 666 vector.  The @var{type} argument should be one of the @samp{AT_} symbols
 667 defined in @file{elf.h}.  If a matching entry is found, the value is
 668 returned; if the entry is not found, zero is returned and @code{errno} is
 669 set to @code{ENOENT}.
 670 @end deftypefun
 671
 672 For some platforms, the key @code{AT_HWCAP} is the easiest way to inquire
 673 about any instruction set extensions available at runtime.  In this case,
 674 there will (of necessity) be a platform-specific set of @samp{HWCAP_}
 675 values ORed together that describe the capabilities of the CPU on which
 676 the program is being executed.
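
Because zero is both a possible value and the ``not found'' result, a
caller of @code{getauxval} has to look at @code{errno} to tell the two
apart.  A minimal sketch, with the invented wrapper name
@code{auxv_lookup}:

```c
#include <elf.h>
#include <errno.h>
#include <sys/auxv.h>

/* Look up the auxiliary vector entry TYPE.  On success store its
   value in *VALUE and return 0.  Return -1 if there is no such
   entry.  */
int
auxv_lookup (unsigned long int type, unsigned long int *value)
{
  errno = 0;
  unsigned long int v = getauxval (type);

  if (v == 0 && errno == ENOENT)
    return -1;
  *value = v;
  return 0;
}
```

For example, @code{auxv_lookup (AT_HWCAP, &caps)} retrieves the
capability mask, whose individual bits can then be tested against the
platform's @samp{HWCAP_} values.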
 677
 678 @node System Calls
 679 @section System Calls
 680
 681 @cindex system call
 682 A system call is a request for service that a program makes of the
 683 kernel.  The service is generally something that only the kernel has
 684 the privilege to do, such as doing I/O.  Programmers don't normally
 685 need to be concerned with system calls because there are functions in
 686 @theglibc{} to do virtually everything that system calls do.
 687 These functions work by making system calls themselves.  For example,
 688 there is a system call that changes the permissions of a file, but
 689 you don't need to know about it because you can just use @theglibc{}'s
 690 @code{chmod} function.
 691
 692 @cindex kernel call
 693 System calls are sometimes called kernel calls.
 694
 695 However, there are times when you want to make a system call explicitly,
 696 and for that, @theglibc{} provides the @code{syscall} function.
 697 @code{syscall} is harder to use and less portable than functions like
 698 @code{chmod}, but easier and more portable than coding the system call
 699 in assembler instructions.
 700
@code{syscall} is most useful when you are working with a system call
that is special to your system or is newer than the version of @theglibc{}
you are using.  @code{syscall} is implemented in an entirely generic way;
 704 the function does not know anything about what a particular system
 705 call does or even if it is valid.
 706
The description of @code{syscall} in this section assumes a certain
protocol for system calls on the various platforms on which @theglibc{}
runs.  That protocol is not defined by any strong authority, and we do
not describe it here either: anyone who is coding @code{syscall} will
probably accept nothing less than kernel and C library source code as a
specification of the interface between them.
 714
 715
 716 @code{syscall} is declared in @file{unistd.h}.
 717
 718 @deftypefun {long int} syscall (long int @var{sysno}, @dots{})
 719 @standards{???, unistd.h}
 720 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 721
 722 @code{syscall} performs a generic system call.
 723
 724 @cindex system call number
 725 @var{sysno} is the system call number.  Each kind of system call is
 726 identified by a number.  Macros for all the possible system call numbers
are defined in @file{sys/syscall.h}.
 728
The remaining arguments are the arguments for the system call, in
order, and their meanings depend on the kind of system call.  Each kind
of system call has a definite number of arguments; on Linux, this is
from zero to six.  If you pass more arguments than the system call
takes, the extra ones to the right are ignored.
 734
 735 The return value is the return value from the system call, unless the
 736 system call failed.  In that case, @code{syscall} returns @code{-1} and
 737 sets @code{errno} to an error code that the system call returned.  Note
 738 that system calls do not return @code{-1} when they succeed.
 739 @cindex errno
 740
If you specify an invalid @var{sysno}, @code{syscall} returns @code{-1}
and sets @code{errno} to @code{ENOSYS}.
 743
 744 Example:
 745
 746 @smallexample
 747
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>
 751
 752 @dots{}
 753
 754 int rc;
 755
 756 rc = syscall(SYS_chmod, "/etc/passwd", 0444);
 757
 758 if (rc == -1)
 759    fprintf(stderr, "chmod failed, errno = %d\n", errno);
 760
 761 @end smallexample
 762
 763 This, if all the compatibility stars are aligned, is equivalent to the
 764 following preferable code:
 765
 766 @smallexample
 767
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
 771
 772 @dots{}
 773
 774 int rc;
 775
 776 rc = chmod("/etc/passwd", 0444);
 777 if (rc == -1)
 778    fprintf(stderr, "chmod failed, errno = %d\n", errno);
 779
 780 @end smallexample
 781
 782 @end deftypefun
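
The same pattern applies to any system call number.  The following is a
minimal sketch, assuming that @code{SYS_getpid} is defined on your
platform; the helper name @code{raw_getpid} is hypothetical and is not
part of @theglibc{}.  In real code the @code{getpid} function remains
preferable.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1   /* for the declaration of syscall */
#endif
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>     /* SYS_* system call numbers */
#include <unistd.h>          /* syscall */

/* Hypothetical helper: obtain the process ID with an explicit system
   call instead of the getpid wrapper.  Returns -1 if the system call
   fails, in which case errno holds the error code.  */
long int
raw_getpid (void)
{
  long int rc = syscall (SYS_getpid);
  if (rc == -1)
    fprintf (stderr, "getpid system call failed, errno = %d\n", errno);
  return rc;
}
```

On a system where the assumption holds, @code{raw_getpid ()} returns the
same value as @code{getpid ()}.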
 783
 784
 785 @node Program Termination
 786 @section Program Termination
 787 @cindex program termination
 788 @cindex process termination
 789
 790 @cindex exit status value
 791 The usual way for a program to terminate is simply for its @code{main}
 792 function to return.  The @dfn{exit status value} returned from the
 793 @code{main} function is used to report information back to the process's
 794 parent process or shell.
 795
 796 A program can also terminate normally by calling the @code{exit}
 797 function.
 798
 799 In addition, programs can be terminated by signals; this is discussed in
 800 more detail in @ref{Signal Handling}.  The @code{abort} function causes
 801 a signal that kills the program.
 802
 803 @menu
 804 * Normal Termination::          If a program calls @code{exit}, a
 805                                  process terminates normally.
* Exit Status::                 The exit status provides information
 807                                  about why the process terminated.
 808 * Cleanups on Exit::            A process can run its own cleanup
 809                                  functions upon normal termination.
 810 * Aborting a Program::          The @code{abort} function causes
 811                                  abnormal program termination.
 812 * Termination Internals::       What happens when a process terminates.
 813 @end menu
 814
 815 @node Normal Termination
 816 @subsection Normal Termination
 817
 818 A process terminates normally when its program signals it is done by
 819 calling @code{exit}.  Returning from @code{main} is equivalent to
 820 calling @code{exit}, and the value that @code{main} returns is used as
 821 the argument to @code{exit}.
 822
 823 @deftypefun void exit (int @var{status})
 824 @standards{ISO, stdlib.h}
 825 @safety{@prelim{}@mtunsafe{@mtasurace{:exit}}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{} @aculock{}}}
 826 @c Access to the atexit/on_exit list, the libc_atexit hook and tls dtors
 827 @c is not guarded.  Streams must be flushed, and that triggers the usual
 828 @c AS and AC issues with streams.
 829 The @code{exit} function tells the system that the program is done, which
 830 causes it to terminate the process.
 831
@var{status} is the program's exit status, which becomes part of the
process's termination status.  This function does not return.
 834 @end deftypefun
 835
 836 Normal termination causes the following actions:
 837
 838 @enumerate
 839 @item
 840 Functions that were registered with the @code{atexit} or @code{on_exit}
 841 functions are called in the reverse order of their registration.  This
 842 mechanism allows your application to specify its own ``cleanup'' actions
 843 to be performed at program termination.  Typically, this is used to do
things like saving program state information in a file, or unlocking
locks in shared databases.
 846
 847 @item
 848 All open streams are closed, writing out any buffered output data.  See
 849 @ref{Closing Streams}.  In addition, temporary files opened
 850 with the @code{tmpfile} function are removed; see @ref{Temporary Files}.
 851
 852 @item
 853 @code{_exit} is called, terminating the program.  @xref{Termination Internals}.
 854 @end enumerate
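
The ordering in the first step can be observed with a sketch like the
following.  It assumes a POSIX system, because it uses @code{fork} and
@code{pipe} so that the handlers of a child process can report back to
the parent; all function names in it are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by handlers */

static void
report (char c)
{
  if (write (report_fd, &c, 1) != 1)
    _exit (EXIT_FAILURE);
}

static void first_registered (void)  { report ('1'); }
static void second_registered (void) { report ('2'); }

/* Run a child that registers two handlers and then calls exit.  Store
   the order in which the handlers ran in BUF as a string.  Returns 0
   on success and -1 on failure.  */
int
exit_handler_order (char *buf, size_t size)
{
  int fds[2];
  if (size == 0 || pipe (fds) == -1)
    return -1;

  fflush (NULL);             /* avoid duplicated stdio output in the child */
  pid_t pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: register the handlers, then terminate normally.  */
      close (fds[0]);
      report_fd = fds[1];
      if (atexit (first_registered) != 0 || atexit (second_registered) != 0)
        _exit (EXIT_FAILURE);
      exit (EXIT_SUCCESS);
    }

  /* Parent: collect whatever the child's handlers wrote.  */
  close (fds[1]);
  size_t used = 0;
  ssize_t n;
  while (used < size - 1
         && (n = read (fds[0], buf + used, size - 1 - used)) > 0)
    used += (size_t) n;
  buf[used] = '\0';
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return 0;
}
```

Because the handlers run in the reverse order of their registration,
the buffer should hold the string @code{"21"} afterwards.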
 855
 856 @node Exit Status
 857 @subsection Exit Status
 858 @cindex exit status
 859
 860 When a program exits, it can return to the parent process a small
 861 amount of information about the cause of termination, using the
 862 @dfn{exit status}.  This is a value between 0 and 255 that the exiting
 863 process passes as an argument to @code{exit}.
 864
 865 Normally you should use the exit status to report very broad information
 866 about success or failure.  You can't provide a lot of detail about the
 867 reasons for the failure, and most parent processes would not want much
 868 detail anyway.
 869
 870 There are conventions for what sorts of status values certain programs
 871 should return.  The most common convention is simply 0 for success and 1
 872 for failure.  Programs that perform comparison use a different
 873 convention: they use status 1 to indicate a mismatch, and status 2 to
 874 indicate an inability to compare.  Your program should follow an
 875 existing convention if an existing convention makes sense for it.
 876
 877 A general convention reserves status values 128 and up for special
 878 purposes.  In particular, the value 128 is used to indicate failure to
 879 execute another program in a subprocess.  This convention is not
 880 universally obeyed, but it is a good idea to follow it in your programs.
 881
 882 @strong{Warning:} Don't try to use the number of errors as the exit
 883 status.  This is actually not very useful; a parent process would
 884 generally not care how many errors occurred.  Worse than that, it does
 885 not work, because the status value is truncated to eight bits.
 886 Thus, if the program tried to report 256 errors, the parent would
 887 receive a report of 0 errors---that is, success.
 888
For the same reason, it does not work to use the value of @code{errno}
as the exit status: error codes can exceed 255.
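
The truncation can be demonstrated with a sketch like the following,
which assumes a POSIX system (it relies on @code{fork} and
@code{waitpid}; @pxref{Process Completion}).  The helper name
@code{observed_exit_status} is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls exit (STATUS) and return
   the exit status that the parent observes, or -1 if the child could
   not be created or did not exit normally.  */
int
observed_exit_status (int status)
{
  fflush (NULL);             /* avoid duplicated stdio output in the child */
  pid_t pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    exit (status);           /* child terminates here */

  int wstatus;
  if (waitpid (pid, &wstatus, 0) == -1 || !WIFEXITED (wstatus))
    return -1;
  return WEXITSTATUS (wstatus);
}
```

With this helper, a child that calls @code{exit (3)} is reported as
status 3, but a child that calls @code{exit (256)} is reported as
status 0, which the parent cannot distinguish from success.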
 891
 892 @strong{Portability note:} Some non-POSIX systems use different
 893 conventions for exit status values.  For greater portability, you can
 894 use the macros @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} for the
 895 conventional status value for success and failure, respectively.  They
 896 are declared in the file @file{stdlib.h}.
 897 @pindex stdlib.h
 898
 899 @deftypevr Macro int EXIT_SUCCESS
 900 @standards{ISO, stdlib.h}
 901 This macro can be used with the @code{exit} function to indicate
 902 successful program completion.
 903
 904 On POSIX systems, the value of this macro is @code{0}.  On other
 905 systems, the value might be some other (possibly non-constant) integer
 906 expression.
 907 @end deftypevr
 908
 909 @deftypevr Macro int EXIT_FAILURE
 910 @standards{ISO, stdlib.h}
 911 This macro can be used with the @code{exit} function to indicate
 912 unsuccessful program completion in a general sense.
 913
 914 On POSIX systems, the value of this macro is @code{1}.  On other
 915 systems, the value might be some other (possibly non-constant) integer
 916 expression.  Other nonzero status values also indicate failures.  Certain
 917 programs use different nonzero status values to indicate particular
kinds of ``non-success''.  For example, @code{diff} uses status value
 919 @code{1} to mean that the files are different, and @code{2} or more to
 920 mean that there was difficulty in opening the files.
 921 @end deftypevr
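
A comparison program that follows this convention could compute its exit
status along the following lines.  This is only a sketch: the function
name @code{compare_files} is hypothetical, and the values 1 and 2 are
written out literally because the convention assigns them specific
meanings.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: compare two files byte by byte and return a
   value suitable for passing to exit: EXIT_SUCCESS if they match, 1 if
   they differ, and 2 if they could not be compared.  */
int
compare_files (const char *name_a, const char *name_b)
{
  FILE *a = fopen (name_a, "rb");
  FILE *b = fopen (name_b, "rb");
  int status = EXIT_SUCCESS;

  if (a == NULL || b == NULL)
    status = 2;                 /* trouble: a file could not be opened */
  else
    {
      int ca, cb;
      do
        {
          ca = getc (a);
          cb = getc (b);
        }
      while (ca == cb && ca != EOF);

      if (ferror (a) || ferror (b))
        status = 2;             /* trouble: read error */
      else if (ca != cb)
        status = 1;             /* the files differ */
    }

  if (a != NULL)
    fclose (a);
  if (b != NULL)
    fclose (b);
  return status;
}
```

The @code{main} function of such a program would simply return the
result of @code{compare_files}, or pass it to @code{exit}.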
 922
Don't confuse a program's exit status with a process's termination status.
There are lots of ways a process can terminate besides having its program
finish.  In the event that the process termination @emph{is} caused by program
termination (i.e., @code{exit}), though, the program's exit status becomes
part of the process's termination status.
 928
 929 @node Cleanups on Exit
 930 @subsection Cleanups on Exit
 931
 932 Your program can arrange to run its own cleanup functions if normal
 933 termination happens.  If you are writing a library for use in various
 934 application programs, then it is unreliable to insist that all
 935 applications call the library's cleanup functions explicitly before
 936 exiting.  It is much more robust to make the cleanup invisible to the
 937 application, by setting up a cleanup function in the library itself
 938 using @code{atexit} or @code{on_exit}.
 939
 940 @deftypefun int atexit (void (*@var{function}) (void))
 941 @standards{ISO, stdlib.h}
 942 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 943 @c atexit @ascuheap @asulock @aculock @acsmem
 944 @c  cxa_atexit @ascuheap @asulock @aculock @acsmem
 945 @c   __internal_atexit @ascuheap @asulock @aculock @acsmem
 946 @c    __new_exitfn @ascuheap @asulock @aculock @acsmem
 947 @c     __libc_lock_lock @asulock @aculock
 948 @c     calloc dup @ascuheap @acsmem
 949 @c     __libc_lock_unlock @aculock
 950 @c    atomic_write_barrier dup ok
 951 The @code{atexit} function registers the function @var{function} to be
 952 called at normal program termination.  The @var{function} is called with
 953 no arguments.
 954
 955 The return value from @code{atexit} is zero on success and nonzero if
 956 the function cannot be registered.
 957 @end deftypefun
 958
 959 @deftypefun int on_exit (void (*@var{function})(int @var{status}, void *@var{arg}), void *@var{arg})
 960 @standards{SunOS, stdlib.h}
 961 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 962 @c on_exit @ascuheap @asulock @aculock @acsmem
 963 @c  new_exitfn dup @ascuheap @asulock @aculock @acsmem
 964 @c  atomic_write_barrier dup ok
 965 This function is a somewhat more powerful variant of @code{atexit}.  It
 966 accepts two arguments, a function @var{function} and an arbitrary
 967 pointer @var{arg}.  At normal program termination, the @var{function} is
 968 called with two arguments:  the @var{status} value passed to @code{exit},
 969 and the @var{arg}.
 970
This function is included in @theglibc{} only for compatibility
with SunOS, and may not be supported by other implementations.
 973 @end deftypefun
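
The extra arguments that @code{on_exit} passes to its handler can be
seen in the following sketch.  It assumes a system that provides
@code{on_exit} together with the POSIX functions @code{fork} and
@code{pipe}; the names @code{report_status} and
@code{status_seen_by_on_exit} are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1   /* for the declaration of on_exit */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler for on_exit.  ARG points to a file descriptor; the exit
   status is written there as a single byte.  */
static void
report_status (int status, void *arg)
{
  unsigned char byte = (unsigned char) status;
  if (write (*(int *) arg, &byte, 1) != 1)
    _exit (EXIT_FAILURE);
}

/* Hypothetical helper: run a child that registers report_status and
   calls exit (STATUS).  Returns the status the handler received, or
   -1 on failure.  */
int
status_seen_by_on_exit (int status)
{
  int fds[2];
  if (pipe (fds) == -1)
    return -1;

  fflush (NULL);             /* avoid duplicated stdio output in the child */
  pid_t pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: fds is still in scope when exit runs the handler.  */
      close (fds[0]);
      if (on_exit (report_status, &fds[1]) != 0)
        _exit (EXIT_FAILURE);
      exit (status);
    }

  /* Parent: read the byte written by the child's handler.  */
  close (fds[1]);
  unsigned char byte;
  ssize_t n = read (fds[0], &byte, 1);
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return n == 1 ? byte : -1;
}
```

The handler receives both the value given to @code{exit} and the pointer
supplied at registration, so it needs no global variable to find the
file descriptor.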
 974
 975 Here's a trivial program that illustrates the use of @code{exit} and
 976 @code{atexit}:
 977
 978 @smallexample
 979 @include atexit.c.texi
 980 @end smallexample
 981
 982 @noindent
 983 When this program is executed, it just prints the message and exits.
 984
 985 @node Aborting a Program
 986 @subsection Aborting a Program
 987 @cindex aborting a program
 988
 989 You can abort your program using the @code{abort} function.  The prototype
 990 for this function is in @file{stdlib.h}.
 991 @pindex stdlib.h
 992
 993 @deftypefun void abort (void)
 994 @standards{ISO, stdlib.h}
 995 @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{}}@acunsafe{@aculock{} @acucorrupt{}}}
 996 @c The implementation takes a recursive lock and attempts to support
 997 @c calls from signal handlers, but if we're in the middle of flushing or
 998 @c using streams, we may encounter them in inconsistent states.
 999 The @code{abort} function causes abnormal program termination.  This
1000 does not execute cleanup functions registered with @code{atexit} or
1001 @code{on_exit}.
1002
1003 This function actually terminates the process by raising a
1004 @code{SIGABRT} signal, and your program can include a handler to
1005 intercept this signal; see @ref{Signal Handling}.
1006 @end deftypefun
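
A parent process can recognize this kind of termination from the status
reported by @code{waitpid}.  The following is a minimal sketch, assuming
a POSIX system and the default action for @code{SIGABRT}; the helper
name @code{child_killed_by_sigabrt} is hypothetical.  Depending on the
system configuration, the aborting child may also leave a core file.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls abort.  Returns 1 if
   the child was terminated by SIGABRT, 0 if it ended in some other
   way, and -1 if the child could not be created or waited for.  */
int
child_killed_by_sigabrt (void)
{
  fflush (NULL);             /* avoid duplicated stdio output in the child */
  pid_t pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    abort ();                /* child terminates abnormally here */

  int wstatus;
  if (waitpid (pid, &wstatus, 0) == -1)
    return -1;
  return WIFSIGNALED (wstatus) && WTERMSIG (wstatus) == SIGABRT;
}
```

Under the stated assumptions the helper returns 1.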
1007
1008 @c Put in by rms.  Don't remove.
1009 @cartouche
1010 @strong{Future Change Warning:} Proposed Federal censorship regulations
1011 may prohibit us from giving you information about the possibility of
1012 calling this function.  We would be required to say that this is not an
1013 acceptable way of terminating a program.
1014 @end cartouche
1015
1016 @node Termination Internals
1017 @subsection Termination Internals
1018
1019 The @code{_exit} function is the primitive used for process termination
1020 by @code{exit}.  It is declared in the header file @file{unistd.h}.
1021 @pindex unistd.h
1022
1023 @deftypefun void _exit (int @var{status})
1024 @standards{POSIX.1, unistd.h}
1025 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1026 @c Direct syscall (exit_group or exit); calls __task_terminate on hurd,
1027 @c and abort in the generic posix implementation.
1028 The @code{_exit} function is the primitive for causing a process to
1029 terminate with status @var{status}.  Calling this function does not
1030 execute cleanup functions registered with @code{atexit} or
1031 @code{on_exit}.
1032 @end deftypefun
1033
1034 @deftypefun void _Exit (int @var{status})
1035 @standards{ISO, stdlib.h}
1036 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1037 @c Alias for _exit.
1038 The @code{_Exit} function is the @w{ISO C} equivalent to @code{_exit}.
The @w{ISO C} committee members were not sure whether the definitions of
@code{_exit} and @code{_Exit} were compatible, so they did not use the
POSIX name.
1042
1043 This function was introduced in @w{ISO C99} and is declared in
1044 @file{stdlib.h}.
1045 @end deftypefun
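
The difference between @code{exit} and @code{_exit} can be seen in the
following sketch, in which a child process registers a cleanup function
and then terminates in one of the two ways.  The sketch assumes a POSIX
system, and the names @code{cleanup} and @code{cleanup_ran} are
hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by cleanup */

static void
cleanup (void)
{
  if (write (report_fd, "c", 1) != 1)
    _exit (EXIT_FAILURE);
}

/* Hypothetical helper: run a child that registers cleanup with atexit
   and then calls _exit (if USE_UNDERSCORE_EXIT is nonzero) or exit.
   Returns 1 if the cleanup function ran, 0 if it did not, and -1 on
   failure.  */
int
cleanup_ran (int use_underscore_exit)
{
  int fds[2];
  if (pipe (fds) == -1)
    return -1;

  fflush (NULL);             /* avoid duplicated stdio output in the child */
  pid_t pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      report_fd = fds[1];
      if (atexit (cleanup) != 0)
        _exit (EXIT_FAILURE);
      if (use_underscore_exit)
        _exit (EXIT_SUCCESS);    /* skips the registered function */
      exit (EXIT_SUCCESS);       /* runs the registered function */
    }

  /* Parent: one byte arrives if cleanup ran; otherwise read sees
     end of file once the child has terminated.  */
  close (fds[1]);
  char c;
  ssize_t n = read (fds[0], &c, 1);
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return (int) n;
}
```

Here @code{cleanup_ran (0)} should return 1 and @code{cleanup_ran (1)}
should return 0.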
1046
1047 When a process terminates for any reason---either because the program
1048 terminates, or as a result of a signal---the
1049 following things happen:
1050
1051 @itemize @bullet
1052 @item
1053 All open file descriptors in the process are closed.  @xref{Low-Level I/O}.
1054 Note that streams are not flushed automatically when the process
1055 terminates; see @ref{I/O on Streams}.
1056
1057 @item
The process's termination status is saved to be reported back to the
parent process via @code{wait} or @code{waitpid}; see @ref{Process
Completion}.  If the program exited, this status includes the low-order
8 bits of the program's exit status.
1062
1063
1064 @item
1065 Any child processes of the process being terminated are assigned a new
1066 parent process.  (On most systems, including GNU, this is the @code{init}
1067 process, with process ID 1.)
1068
1069 @item
1070 A @code{SIGCHLD} signal is sent to the parent process.
1071
1072 @item
1073 If the process is a session leader that has a controlling terminal, then
1074 a @code{SIGHUP} signal is sent to each process in the foreground job,
1075 and the controlling terminal is disassociated from that session.
1076 @xref{Job Control}.
1077
1078 @item
1079 If termination of a process causes a process group to become orphaned,
1080 and any member of that process group is stopped, then a @code{SIGHUP}
1081 signal and a @code{SIGCONT} signal are sent to each process in the
1082 group.  @xref{Job Control}.
1083 @end itemize