   1 % $Id: xorcyst.texinfo,v 1.10 2005/01/09 11:20:49 kenth Exp $
   2 % $Log: xorcyst.texinfo,v $
   3 % Revision 1.10  2005/01/09 11:20:49  kenth
   4 % xorcyst 1.4.5
   5 %
   6 % Revision 1.9  2005/01/05 09:40:48  kenth
   7 % xorcyst 1.4.4
   8 %
   9 % Revision 1.8  2005/01/05 02:29:50  kenth
  10 % xorcyst 1.4.3
  11 %
  12 % Revision 1.7  2004/12/29 21:51:58  kenth
  13 % xorcyst 1.4.2
  14 %
  15 % Revision 1.6  2004/12/25 02:25:51  kenth
  16 % xorcyst 1.4.1
  17 %
  18 % Revision 1.5  2004/12/19 20:56:25  kenth
  19 % xorcyst 1.4.0
  20 %
  21 % Revision 1.4  2004/12/16 13:26:22  kenth
  22 % xorcyst 1.3.5
  23 %
  24 % Revision 1.3  2004/12/14 01:51:07  kenth
  25 % xorcyst 1.3.0
  26 %
  27 % Revision 1.2  2004/12/11 02:12:02  kenth
  28 % xorcyst 1.2.0
  29 %
  30 % Revision 1.1  2004/12/10 20:49:34  kenth
  31 % Initial revision
  32 %
  33
  34 \input texinfo @c -*-texinfo-*-
  35 @c %**start of header
  36 @setfilename xorcyst.info
  37 @settitle The XORcyst Manual
  38 @c %**end of header
  39
  40 @copying
  41 This is the manual for The XORcyst version 1.4.5.
  42
  43 Copyright @copyright{} 2004, 2005, 2007, 2008, 2009 Kent Hansen.
  44 @end copying
  45
  46 @titlepage
  47 @title The XORcyst Manual
  48
  49 @c The following two commands start the copyright page.
  50 @page
  51 @vskip 0pt plus 1filll
  52 @insertcopying
  53 @end titlepage
  54
  55 @c Output the table of contents at the beginning.
  56 @contents
  57
  58 @ifnottex
  59 @node Top
  60 @top The XORcyst Manual
  61
  62 @insertcopying
  63 @end ifnottex
  64
  65 @menu
  66 * What's New::       An overview of the latest improvements.
  67 * Overview::         What is this thing?
  68 * The Assembler::    Describes the use and operation of the XORcyst assembler.
  69 * The Linker::       Describes the use and operation of the XORcyst linker.
  70 * Implementation Details:: Nice-to-know technical details concerning The XORcyst's implementation.
  71 * Known Bugs and Limitations:: Known bugs and limitations.
  72 * Assembler Directives:: Assembler directives.
  73 * Linker Script Commands:: Linker script commands.
  74 * Object Code Format:: Describes the format of the assembler's output.
  75 * Custom Character Maps:: Describes the valid contents of custom character maps.
  76 * Error and Warning Messages:: Alphabetical listing.
  77 @end menu
  78
  79 @node What's New
  80 @chapter What's New
  81
  82 @heading Version 1.5.2
  83
  84 More bug fixes.
  85
  86 @heading Version 1.5.0
  87
  88 Added some stuff, fixed some bugs.
  89
  90 @heading Version 1.4.5
  91
  92 @strong{Assembler}
  93
  94 @itemize
  95
  96 @item Fixed bug that prevented local labels from being used as operand to @code{DB}, @code{DW}, @code{DD} directives.
  97
  98 @item Fixed bug in processing of array of operands to @code{DB}, @code{DW}, @code{DD} directives (some high-level constructs were only reduced in first item).
  99
 100 @item Negative immediate operand no longer gives truncation warning as long as it fits in signed byte (@code{DB}, immediate mode instructions) or word (@code{DW}).
 101
 102 @item Added @code{BLT}, @code{BGE} as aliases for @code{BCC}, @code{BCS} (unsigned comparison).
 103
 104 @end itemize
 105
 106 @strong{Linker}
 107
 108 @itemize
 109
 110 @item Prints physical addresses of relocated public symbols when @code{--verbose}.
 111
 112 @end itemize
 113
 114 @heading Version 1.4.4
 115
 116 @strong{Linker}
 117
 118 @itemize
 119
 120 @item Fixed bug in RAM allocator.
 121
 122 @item Prints statistics on RAM management (total, used, left) when @code{--verbose}.
 123
 124 @end itemize
 125
 126 @heading Version 1.4.3
 127
 128 @strong{Assembler}
 129
 130 @itemize
 131
 132 @item Support for anonymous unions.
 133
 134 @item Fixed bug in result of @code{sizeof} operator when applied to an initialized structure variable.
 135
 136 @item Returns an error code so that, e.g., Make stops after the first failed invocation.
 137
 138 @end itemize
 139
 140 @strong{Linker}
 141
 142 @itemize
 143
 144 @item Returns an error code so that, e.g., Make stops after the first failed invocation.
 145
 146 @end itemize
 147
 148 @heading Version 1.4.2
 149
 150 @strong{Assembler}
 151
 152 @itemize
 153
 154 @item Symbols can be indexed statically, C-style; see section 3.2.16, ``Indexing symbols statically''.
 155
 156 @item @code{sizeof} operator now works correctly when applied to an array.
 157
 158 @item Fixed bug that led to dysfunctional symbol table when using `=' equates.
 159
 160 @end itemize
 161
 162 @strong{Linker}
 163
 164 @itemize
 165
 166 @item Fixed bug in RAM allocator.
 167
 168 @item Fixed line number bug in error messages.
 169
 170 @item Removed duplicate error message (unresolved symbols).
 171
 172 @end itemize
 173
 174 @heading Version 1.4.1
 175
 176 This is a bugfix release.
 177
 178 @strong{Assembler}
 179
 180 @itemize
 181
 182 @item Fixed bug in processing of declaration of array of user-defined type (!).
 183
 184 @item Fixed bug that led to no error message when declaring an uninitialized variable of non-existing user-defined type.
 185
 186 @end itemize
 187
 188 @strong{Linker}
 189
 190 @itemize
 191
 192 @item Fixed imperfection in allocation of alignment-constrained data.
 193
 194 @item Fixed memory leak in RAM allocator.
 195
 196 @end itemize
 197
 198 @heading Version 1.4.0
 199
 200 @strong{Assembler}
 201
 202 @itemize
 203
 204 @item Added @code{--debug} switch (short form: @code{-g}). When this switch is given, the assembler will retain file and line information in the object file, which the linker can use to produce more descriptive link-time warning and error messages.
 205
 206 @item @code{LABEL} directive can take a specific address as argument, so that ``pointers'' can be made to any part of memory (e.g. you can address memory location @code{$200} as a structure (or array of structures), without having to explicitly define storage for it).
 207
 208 @item Constraints can be communicated to the linker on how contents of data segments should be mapped to RAM; see section 3.2.19, ``Controlling data mapping''.
 209
 210 @item PUBLIC modifier can be specified directly when defining a variable.
 211
 212 @item Fixed a bug in code generation of exported string constants.
 213
 214 @end itemize
 215
 216 @strong{Linker}
 217
 218 @itemize
 219
 220 @item Uses the information generated by the assembler's @code{--debug} switch to produce descriptive warning and error messages.
 221
 222 @item Rewrote data segment mapping function to take zeropage and alignment constraints into account.
 223
 224 @item Improved code relocation; as a result, the current PC ($) can be used freely in any expression, and the @code{origin} argument for the @code{pad} command works.
 225
 226 @item Fixed a linker script parsing bug.
 227
 228 @end itemize
 229
 230 @heading Version 1.3.5
 231
 232 @strong{Assembler}
 233
 234 @itemize
 235
 236 @item Added ability to declare storage for array of user-defined types, C-style (works for native types too).
 237
 238 @item Added ability to specify the type of data that a label addresses.
 239
 240 @item Fixed bug in code generation of storage of user-defined types.
 241
 242 @item Fixed some error detection and parsing woes.
 243
 244 @item Added @code{DEFINE} directive (same semantics as @code{EQU}, but potentially more compact).
 245
 246 @end itemize
 247
 248 @strong{Linker}
 249
 250 @itemize
 251
 252 @item Fixed a bad code relocation bug.
 253
 254 @item Implemented bank operator (@code{^}).
 255
 256 @item @code{--verbose} switch now gives helpful info on what the linker is doing.
 257
 258 @end itemize
 259
 260 @heading Version 1.3.0
 261
 262 @itemize
 263
 264 @item Added support for user-defined records (@code{RECORD} directive, @code{MASK} operator).
 265
 266 @item Added @code{WHILE} directive.
 267
 268 @item Implemented @code{ELIF} directive.
 269
 270 @item Improved @code{--define} switch: A value can now be assigned to the identifier (e.g. @code{--define a=10}).
 271
 272 @item @code{SIZEOF} operator now works on variable identifiers too.
 273
 274 @item Fixed bug that prevented single-character identifiers from working.
 275
 276 @item Added @code{--no-warn} switch to suppress assembler warnings.
 277
 278 @item Early support for @code{--verbose} switch.
 279
 280 @end itemize
 281
 282 @heading Version 1.2.0
 283
 284 @itemize
 285
 286 @item Added support for forward/backward branches (@code{-}, @code{--}, @code{+}, @code{++}, and so on, up to eight levels (@code{++++++++})).
 287
 288 @item Fixed bug that caused the assembler to run out of file handles when including a large number of files.
 289
 290 @item Fixed bug that caused @code{.db <a, >b} and other lines with @code{< >} to be parsed erroneously.
 291
 292 @end itemize
 293
 294 @heading Version 1.1.0
 295
 296 @itemize
 297
 298 @item Full support for user-defined types: Structures, unions and enums.
 299
 300 @item Better separation of symbol types. In the previous versions, @emph{everything} was a label. The assembler now distinguishes properly between labels, procedures, variables, constants and user-defined types.
 301
 302 @item Support for anonymous macros (@code{rept} directive).
 303
 304 @item Crash-bug fixes (@code{if} directive, @code{incbin} directive).
 305
 306 @item Preliminary support for @code{--define=@var{IDENT}} assembler switch (can be used in @code{ifdef} and @code{ifndef} directives).
 307
 308 @item Added @code{message} directive.
 309
 310 @item Improved literal expression folding. @var{"hello " + 123} will now be folded to @var{"hello 123"}.
 311
 312 @item Added assembler switch @code{--swap-parens}, which swaps the operators used for indirection from [ ] to ( ).
 313
 314 @item Syntax of @code{extrn} directive changed slightly: Must now specify the symbol type.
 315
 316 @item Relaxed syntax of @code{db}, @code{dsb} and similar directives. If no expression is given as argument, a single item is allocated.
 317
 318 @end itemize
 319
 320 @node Overview
 321 @chapter Overview
 322
 323 The XORcyst is a set of languages and command-line tools for assembling and linking code to be run on a 6502 processor.
 324
 325 @node The Assembler
 326 @chapter The Assembler
 327
 328 The XORcyst assembler takes a @dfn{plaintext file} containing a sequence of 6502 instructions and assembler
 329 directives (collectively referred to as assembler statements), and produces from it an @dfn{object file} (usually referred to as a @dfn{unit}) that can be passed on to the XORcyst linker.
 330
 331 @section Invoking the assembler (@command{xasm})
 332
 333 The basic usage is
 334
 335 @samp{@command{xasm} @var{assembler-file}}
 336
 337 where @var{assembler-file} is the (top-level) file of assembler statements.
 338 If all goes well, this will produce a similarly named file with the extension @file{.o}.
 339
 340 For example,
 341 @example
 342 xasm driver.asm
 343 @end example
 344 produces the object file @file{driver.o} if no errors are encountered by the assembler.
 345
 346 @subsection Switches
 347
 348 @table @code
 349
 350 @item --define IDENT[=VALUE]
 351 Enters the identifier @code{IDENT} into the global symbol table, optionally assigning it the value @code{VALUE}. The default value is integer @code{0}. To assign a string, the enclosing double quotes must be escaped, e.g. @code{--define my_string=\"Have a nice day\"}.
 352
 353 @item --output FILE
 354 Directs output to the file @code{FILE} rather than the default file.
 355
 356 @item --pure-binary
 357 Specifies that the output should be in the form of pure 6502 code. This will only succeed if the input has no external dependencies.
 358
 359 @item --swap-parens
 360 Changes the operators used to specify indirection from @code{[ ]} to @code{( )}. @code{[ ]} takes over @code{( )}'s role in arithmetic expressions.
 361
 362 @item --include-path DIR
 363 Adds @code{DIR} to the set of paths that are searched when a file is included.
 364
 365 @item --case-insensitive
 366 Ignores case of identifiers.
 367
 368 @item --no-warn
 369 Suppresses assembler warning messages.
 370
 371 @item --verbose
 372 Instructs the assembler to print some messages about what it is doing.
 373
 374 @item --debug
 375 Retains file and line information, so that the linker can produce more descriptive warning and error messages.
 376
 377 @end table
 378
 379 For the full list of switches, run @code{xasm --help}.
 380
 381 @section Assembler statements
 382 @strong{Note:} This is not meant to be an introductory guide to 6502 assembly; only the XORcyst-specific features and quirks are explained. (For readers new to the 6502 and assemblers, @uref{http://www.google.com/search?q=6502+tutorial} may be a good starting point.)
 383
 384 Because the assembler aims to enforce completely position-independent code, it does not allow the @code{.org @var{address}} or @code{.base @var{address}} directives commonly employed by 6502 assemblers. Most other constructs familiar from other 6502 assemblers are in place, however. These and additional features are explained in the following subsections. (For a complete list of directives, see @ref{Assembler Directives}.)
 385
 386 In the code templates given in this section, any arguments enclosed in italic square brackets @emph{[ ... ]} are optional.
 387
 388 @subsection A simple assembler example
 389
 390 Here is a short assembler file which demonstrates basic functionality:
 391
 392 @example
 393 .dataseg                   ; begin data segment
 394
 395   my_variable .byte          ; define a byte variable
 396
 397   my_array .word[16]         ; define an array of 16 words
 398
 399 .codeseg                   ; begin code segment
 400
 401 .include "config.h"        ; include another source file
 402
 403 ; conditional definition of constant my_priority
 404 .ifdef HAVE_CONFIG_H
 405   my_priority = 10
 406 .else
 407   my_priority = 0
 408 .endif
 409
 410 ; declare a macro named store_const with parameters value and addr
 411 .macro store_const value, addr
 412   lda #value
 413   sta addr
 414 .endm                      ; end macro
 415
 416 ; a subroutine entrypoint is here
 417 .proc my_subroutine
 418   store_const $10, my_array+10           ; macro invocation
 419   store_const my_priority, my_variable   ; macro invocation
 420
 421   lda [$0A], y               ; NOTE: [ ] used for indirection, not ( ), unless --swap-parens switch used
 422   beq +
 423   jsr some_function          ; call external function
 424
 425 ; produce a short delay
 426 + ldx #60
 427   @@@@delay:
 428   dex
 429   bne @@@@delay
 430
 431 ; exit with my_priority in accumulator
 432   lda #my_priority
 433   rts
 434 .endp                      ; end of procedure definition
 435
 436 .public my_subroutine      ; make my_subroutine visible to other units
 437 .extrn some_function:proc  ; some_function is located in another unit
 438
 439 .end                       ; end of assembler input
 440 @end example
 441
 442 While the example itself doesn't do anything useful, it illustrates the constructs that the following subsections explain in detail.
 443
 444 @subsection Literals
 445
 446 The following kinds of integer literal are understood by the assembler (examples given in parentheses):
 447
 448 @itemize
 449
 450 @item @strong{Decimal:} Non-zero decimal digit followed by zero or more decimal digits (@code{1234})
 451
 452 @item @strong{Hexadecimal:} @code{0x} or @code{$} followed by one or more hexadecimal digits (@code{0xFACE, $BEEF}); one or more hexadecimal digits followed by @code{h} (@code{95Ah}). In the latter case numbers beginning with A through F must be preceded by a 0 (otherwise, say, @code{BABEh} would be interpreted as an identifier).
 453
 454 @item @strong{Binary:} String of binary digits either preceded by @code{%} or followed by @code{b} (@code{%010110, 11001100b}).
 455
 456 @item @strong{Octal:} A string of octal digits preceded by a 0 (@code{0755}).
 457
 458 @end itemize
 459
 460 String literals must be enclosed in a pair of @code{"} (as in @code{"You are a dweeb"}).
 461
 462 Character literals must be of the form @code{'A'}.
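
As a small illustration of the literal forms listed above, the following sketch writes the value 255 in each supported notation, followed by a character and a string. It assumes the statements appear in a code segment; the choice of @code{.db} and @code{.char} here is only for demonstration.

@example
.db 255            ; decimal
.db $FF, 0xFF      ; hexadecimal, prefix forms
.db 0FFh           ; hexadecimal, suffix form (leading 0 required before F)
.db %11111111      ; binary, prefix form
.db 11111111b      ; binary, suffix form
.db 0377           ; octal
.db 'A'            ; character literal
.char "Hello", 0   ; string literal followed by a zero byte
@end example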
 463
 464 @subsection Identifiers
 465
 466 Identifiers must conform to the regular expression @code{[[:alpha:]_][[:alnum:]_]*}. They are case sensitive unless the @code{--case-insensitive} switch is given.
 467 Examples of valid identifiers are
 468 @example
 469 no_brainer, schools_out, my_2nd_home, catch22, FunkyMama
 470 @end example
 471 Examples of invalid identifiers are
 472 @example
 473 3stooges, i-was-here, f00li$h
 474 @end example
 475
 476 @subsection Expressions
 477
 478 Operands to assembler statements are expressions. An expression can contain any number of operators, identifiers and literals, and parentheses to group terms. The operators are the familiar arithmetic, bitwise, shift and relational ones (much the same as in C), plus a few more which are useful when writing code for a machine which has a 16-bit address space but only 8-bit registers:
 479
 480 @itemize
 481
 482 @item @code{< @var{expression}} : Get low 8 bits of @var{expression}
 483
 484 @item @code{> @var{expression}} : Get high 8 bits of @var{expression}
 485
 486 @end itemize
 487
 488 @code{$} can be used in an expression to refer to the address where the current instruction is assembled.
 489
 490 @code{^@var{symbol}} gets the bank number in which @var{symbol} is located (determined at link time).
 491
 492 @code{sizeof(@var{symbol})} gets the size of @var{symbol} in bytes.
 493
 494 When both operands to an operator are strings, the semantics are as follows: @var{str1} + @var{str2} concatenates; the relational operators perform string comparison; and all other operators are invalid. When one operand is a string and the other is an integer, the integer is implicitly converted to a string and concatenated with the string operand to produce a string as result.
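
A minimal sketch of how the address operators might be used together follows. It assumes that @code{my_table} is a label defined elsewhere (a hypothetical name), that zeropage locations @code{$00} and @code{$01} are free to use as a pointer, and that the size of @code{my_table} fits in a byte.

@example
  lda #<my_table          ; low 8 bits of my_table's address
  sta $00
  lda #>my_table          ; high 8 bits of my_table's address
  sta $01
  ldy #0
  lda [$00],y             ; read the first byte of my_table through the pointer
  ldx #sizeof(my_table)   ; size of my_table in bytes (assumed to be below 256)
@end example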
 495
 496 @subsection Global labels
 497
 498 There are two ways to define a global label.
 499
 500 @itemize
 501
 502 @item @code{@var{identifier}@strong{:}} at the beginning of a source line defines the label @var{identifier} and assigns it the address of the current Program Counter. The colon is mandatory.
 503
 504 @item Using the @code{.label} directive. It is of the form
 505
 506 @example
 507 .label @var{identifier} @emph{[= @var{address}]} @emph{[ : @var{type}]}
 508 @end example
 509
 510 The absolute address of the label can be specified. If no address is given, the address is the current Program Counter.
 511
 512 The type of data that the label addresses can also be specified. The valid type specifiers are @code{byte}, @code{word}, @code{dword}, or an identifier, which must be the name of a user-defined type.
 513
 514 @end itemize
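
The following sketch shows both optional parts of the @code{.label} template in use. The names @code{sprite_ram} and @code{score} are hypothetical, and the address @code{$200} is chosen only for illustration.

@example
.label sprite_ram = $200 : byte   ; absolute address $200, addresses bytes
.label score : word               ; current Program Counter, addresses a word

  lda sprite_ram                  ; read the byte at $200
  sta sprite_ram+1                ; write the byte at $201
@end example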
 515
 516 @subsection Local labels
 517
 518 A @dfn{local label} is only visible in the scope consisting of the statements between two regular labels; or, for macros, only in the body of the macro. Just as a regular label must be unique in the whole program scope, a local label must be unique in the scope in which it is defined. The big advantage here is that the name of the local label can be reused as long as the definitions exist in different local scopes. Local labels are prefixed by @code{@@@@}. Unlike regular labels the local name itself can start with a digit, so for instance @code{@@@@10} is valid.
 519 The following example shows how a local label can exist unambiguously in two scopes.
 520 @example
 521 my_first_delay:        ; new local scope begins here
 522 ldx #100
 523 @@@@loop:                ; this label exists in my_first_delay's namespace
 524 dex
 525 bne @@@@loop
 526 rts
 527
 528 my_second_delay:        ; new local scope begins here
 529 ldy #200
 530 @@@@loop:                 ; this label exists in my_second_delay's namespace
 531 dey
 532 bne @@@@loop
 533 rts
 534 @end example
 535
 536 As mentioned, the same local label cannot be redefined within a scope. So having, say, two labels called @code{@@@@loop} in the same scope would produce an assembler error. Also, something like the following would produce an error:
 537 @example
 538 adc #10
 539 bvs @@@@handle_overflow
 540 barrier:
 541 rts
 542 @@@@handle_overflow:
 543 ; ...
 544 @end example
 545 since the branch instruction refers to a local label defined in a different scope (because of the strategic placement of the label @code{barrier}).
 546
 547 @subsection Forward/backward branches
 548
 549 These are ``anonymous'' labels that can be redefined as many times as you want. A reference to a forward/backward label is resolved to the closest matching definition in the succeeding assembly statements (forward branches) or preceding assembly statements (backward branches).
 550
 551 A forward branch consists of one or more (up to eight) consecutive @code{+} (plus) symbols. A backward branch consists of one or more (up to eight) consecutive @code{-} (minus) symbols. The following examples illustrate use of forward and backward branches.
 552
 553 @example
 554    lda $50
 555    bmi ++
 556    lda $40
 557    bne +         ; branches to first forward label
 558    ; do something ...
 559 +  dex           ; first forward label
 560    beq +         ; branches to second forward label
 561    ; do something more ...
 562 +  sta $40       ; second forward label
 563 ++ rts
 564
 565 @end example
 566
 567 @example
 568    lda $60
 569    bmi +
 570  - lda $2002      ; first backward label
 571    bne -          ; branches to first backward label
 572  - lda $2002      ; second backward label
 573    bne -          ; branches to second backward label
 574  + rts
 575 @end example
 576
 577 @subsection Equates
 578
 579 There are three ways to define equates.
 580 @itemize
 581
 582 @item With the @code{=} operator. An equate defined this way can be redefined, and it obeys program order.
 583
 584 @example
 585 i = 10
 586 ldx #i
 587 i = i + 1
 588 ldy #i
 589 @end example
 590
 591 In the example above, the assembler will substitute @code{10} for the first occurrence of @code{i} and @code{11} for the last.
 592
 593 @item With the @code{.equ} directive. An equate defined this way can only be defined once, and it does not obey program order (that is, it can be defined at a later point than where it is used). An equate of this type can be exported, so that it may be accessed by other units (more on exporting symbols later).
 594
 595 @example
 596 lib_version .equ $10
 597 lib_author .equ "The Godfather"
 598 @end example
 599
 600 @item With the @code{.define} directive. This directive is semantically equivalent to @code{.equ}, but the value is optional, which allows compact CPP-like defines. When no value is given, the symbol is defined as integer 0.
 601
 602 @example
 603 .ifndef MYHEADER_H
 604 .define MYHEADER_H
 605 ; ...
 606 .endif     ; !MYHEADER_H
 607 @end example
 608
 609 @end itemize
 610
 611 @subsection Conditional assembly
 612
 613 There are two ways to go about doing conditional assembly. One way is to test if a certain identifier has been defined (that is, equated) using the @code{.ifdef} directive, as shown in the next two templates.
 614
 615 @example
 616 .ifdef @var{identifier}
 617 @var{statements}
 618 .endif
 619 @end example
 620
 621 @example
 622 .ifdef @var{identifier}
 623 @var{true-statements}
 624 .else
 625 @var{false-statements}
 626 .endif
 627 @end example
 628
 629 The other way is to test a full-fledged expression, as shown in the next template.
 630
 631 @example
 632 .if @var{expression}
 633 @var{statements}
 634 .elif @var{expression-II}
 635 @var{statements-II}
 636 .else
 637 @var{other-statements}
 638 .endif
 639 @end example
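
As a concrete sketch of the expression-based template, the following selects a constant depending on the value of an identifier. It assumes that @code{machine} has been given a value beforehand (for instance with @code{--define machine=1} on the command line) and that the C-style @code{==} operator is among the supported relational operators; both names are hypothetical.

@example
.if machine == 1
  frame_rate = 60
.elif machine == 2
  frame_rate = 50
.else
  frame_rate = 0
.endif
@end example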
 640
 641 @subsection Macros
 642
 643 Macro definitions are of the form
 644
 645 @example
 646 .macro @var{name} @emph{[@var{parameter1}, @var{parameter2}, ...]}
 647 @var{statements}
 648 .endm
 649 @end example
 650 The parameters must be legal identifiers.
 651
 652 To invoke (expand) the statements (body) of a macro in your program, issue the assembler statement @code{@var{name}}, where @var{name} is the macro name, followed by a comma-separated list of actual arguments, if the macro has any. The arguments will be substituted for the respective parameter names in the resulting statements.
 653
 654 You can use local labels in the body of a macro. These labels will be completely local and unique to each expanded macro instance; any local labels defined outside the expanded body are not ``seen''. For example, if you have the following macro definition
 655 @example
 656 .macro my_macro
 657 @@@@loop:
 658 dey
 659 bne @@@@loop
 660 .endm
 661 @end example
 662 and then use the macro as shown in the following
 663 @example
 664 @@@@loop:
 665 my_macro
 666 my_macro
 667 dex
 668 bne @@@@loop
 669 @end example
 670 each expansion of @code{my_macro} will have its own local label @code{@@@@loop}, neither of which interferes with the local label @code{@@@@loop} in the scope where the macro is invoked.
 671
 672 Macros can be nested to arbitrary depth.
 673
 674 @subsection Anonymous macros
 675
 676 An anonymous REPT (REPeaT) macro is of the form
 677
 678 @example
 679 i = 1
 680 @strong{.rept 8}
 681 .db i
 682 i = i*2
 683 @strong{.endm}
 684 @end example
 685
 686 The statements between @code{rept} and @code{endm} will be repeated as many times as specified by the argument to @code{rept}. In the preceding example, the resulting expansion is equivalent to
 687
 688 @example
 689 .db 1, 2, 4, 8, 16, 32, 64, 128
 690 @end example
 691
 692 Similarly, an anonymous WHILE macro is of the form
 693
 694 @example
 695 i = 1
 696 @strong{.while i <= 128}
 697 .db i
 698 i = i*2
 699 @strong{.endm}
 700 @end example
 701
 702 The statements between @code{while} and @code{endm} will be repeated while the expression given as argument to @code{while} is true (non-zero). The code inside the macro body is responsible for updating the variables involved in the expression, so that it will eventually become false. In the preceding example, the resulting expansion is equivalent to
 703
 704 @example
 705 .db 1, 2, 4, 8, 16, 32, 64, 128
 706 @end example
 707
 708 @subsection Including files
 709
 710 There are two directives for including files.
 711
 712 @itemize
 713
 714 @item @code{.incsrc "@var{src-file}"} (can also be written @code{.include}) interprets the specified file as textual assembler statements.
 715
 716 @item @code{.incbin "@var{bin-file}"} interprets the specified file as a binary buffer.
 717
 718 @end itemize
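
A brief sketch of both directives follows. The file names and the label @code{title_screen} are hypothetical; the label simply gives the program a name for the start of the included binary data.

@example
.incsrc "constants.asm"   ; file contents are assembled as source statements

title_screen:
.incbin "title.bin"       ; file contents are included as binary data
@end example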
 719
 720 @subsection Defining native data
 721
 722 There is a class of directives for defining data storage and values.
 723
 724 @itemize
 725
 726 @item @code{.db} @emph{[@var{expression}, ...]} : Defines a string of bytes
 727 @item @code{.dw} @emph{[@var{expression}, ...]} : Defines a string of words
 728 @item @code{.dd} @emph{[@var{expression}, ...]} : Defines a string of doublewords
 729 @item @code{.char} @emph{[@var{expression}, ...]} : Defines a string of characters (explained later)
 730 @item @code{.dsb} @emph{[@var{expression}]} : Defines a storage of size @var{expression} bytes
 731 @item @code{.dsw} @emph{[@var{expression}]} : Defines a storage of size @var{expression} words
 732 @item @code{.dsd} @emph{[@var{expression}]} : Defines a storage of size @var{expression} doublewords
 733
 734 @end itemize
 735
 736 If no argument is given to the directive, a single item of the respective datatype is allocated. For example,
 737 @example
 738 .db
 739 @end example
 740 is equivalent to
 741 @example
 742 .dsb 1
 743 @end example
 744
 745 Alternatively, data arrays can be allocated using square brackets [ ] like in C:
 746
 747 @example
 748 .db[100]
 749 @end example
 750 which is equivalent to
 751 @example
 752 .dsb 100
 753 @end example
 754
 755 @code{.byte}, @code{.word} and @code{.dword} are more verbose aliases for @code{.db}, @code{.dw} and @code{.dd}, respectively.
 756
 757 Note that data cannot be initialized in a data segment; only storage for the data can be allocated there.
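
The distinction between allocating storage and defining values can be sketched as follows. The variable and label names are hypothetical; the variable syntax follows the simple assembler example given earlier.

@example
.dataseg
frame_counter .byte        ; storage for one byte, no initial value
line_buffer .byte[40]      ; storage for 40 bytes, no initial value

.codeseg
powers_of_two:
.db 1, 2, 4, 8             ; four initialized bytes
.dw $1234, $ABCD           ; two initialized words
@end example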
 758
 759 @heading Defining non-ASCII text data
 760
 761 Use the @code{.charmap} directive to specify a map file describing the mapping from regular ASCII-coded characters to your custom set. See @ref{Custom Character Maps} for a description of the format of such a custom character map file. Once the character map has been set, you can define your textual data by using the @code{.char} directive. The information in the character map is applied to the given data by the assembler in order to transform it to a regular @code{.db} directive internally. The @code{.charmap} directive obeys program order, meaning you can use different character maps at different points in your code. If no character map has been set, @code{.char} is equivalent to @code{.db}. A simple example of the use of @code{.charmap} and @code{.char} follows.
 762
 763 @example
 764 .charmap "my_map.tbl"          ; set the custom character map to the one defined in my_map.tbl
 765 .char "It is a delight for me to be encoded in non-ASCII form", 0
 766 @end example
 767
 768 @subsection User-defined types
 769
 770 There are currently four kinds of types that can be defined by the user. For further information on the concepts of their use, consult a C manual.
 771
 772 @itemize
 773
 774 @item @strong{Structures}.
 775
 776 @example
 777 .struc my_struc
 778 my_1st_field .db
 779 my_2nd_field .dw
 780 my_3rd_field .type my_other_struc
 781 .ends
 782 @end example
 783
 784 Using ``flat'' addressing, structure members are accessed just like in C.
 785
 786 @example
 787 lda the_player.inventory.sword
 788 @end example
 789
 790 For indirect addressing, the scope operator can be used to get the offset of the field.
 791
 792 @example
 793 ldy #(player_struct::inventory + inventory_struct::sword)
 794 lda [$00],y     ; load ($00).inventory.sword
 795 @end example
 796
 797 @item @strong{Unions}.
 798
 799 @example
 800 .union my_union
 801 byte_value .db
 802 word_value .dw
 803 string_value .char[32]
 804 .ends
 805 @end example
 806
 807 In a union, the fields are ``overlaid''; that is, they share the same storage, and in general only one of the fields is used (at a time) for a particular instance of the union. A typical usage is to define a structure with two members: An enumerated type that selects one of the union fields, and the actual union containing the fields.
 808
 809 Anonymous unions can be defined ``inline'' as part of a structure, as shown in the following example:
 810
 811 @example
 812 .struc my_struc
 813 type    .byte
 814 @strong{    .union}
 815 @strong{    byte_value .byte[4]}
 816 @strong{    word_value .word[2]}
 817 @strong{    dword_value .dword}
 818 @strong{    .ends}
 819 .ends
 820 @end example
 821
 822 @code{byte_value}, @code{word_value} and @code{dword_value} may then be accessed as top-level members of the structure, but do in fact share storage.
 823
 824 @item @strong{Records} (bitfields).
 825
 826 @example
 827 .record my_record top_bits:3, middle_bits:2, bottom_bits:3
 828 @end example
 829
 830 A record can be at most 8 bits (1 byte) wide. The bitfields are arranged from high to low; for example, in the record shown above, @code{top_bits} would occupy bits 7:5, @code{middle_bits} 4:3 and @code{bottom_bits} 2:0. Lower bits are padded if necessary to fill the byte.
 831
 832 The scope operator (@code{::}) returns the number of right shifts necessary to bring the LSb of a bitfield into the LSb of the accumulator. The @code{MASK} operator returns a bitfield's logical AND mask. For example, using the record definition shown above,
 833
 834 @example
 835 my_record::middle_bits
 836 @end example
 837 returns @code{3}, and
 838 @example
 839 MASK my_record::middle_bits
 840 @end example
 841 returns @code{%00011000}. These are the two basic operations necessary to manipulate bitfields. The following macro shows how a field can be extracted:
 842
 843 @example
 844 ; IN:  ACC = instance of record `rec'
 845 ;      rec = record type identifier
 846 ;      fld = bitfield identifier
 847 ; OUT: ACC = field `fld' of `rec' in lower bits; upper bits zero
 848 .macro get_field rec, fld
 849     and #(mask rec::fld)       ; ditch other fields
 850     .rept rec::fld             ; shift down to bit 0
 851     lsr
 852     .endm
 853 .endm
 854 @end example
 855
 856 @item @strong{Enumerations}.
 857
 858 @example
 859 .enum my_enum
 860 option_1 = 1
 861 option_2
 862 option_3
 863 option_4
 864 .ende
 865 @end example
 866
 867 Note that an enumerated value is encoded as a @code{byte}.
 868
 869 @end itemize
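
To make the record arithmetic concrete, the following sketch lists the shift counts and masks that follow from the @code{my_record} definition shown earlier (fields of 3, 2 and 3 bits, allocated from bit 7 downwards). The @code{put_field} macro is a hypothetical counterpart to @code{get_field}, not something provided by the assembler; it assumes the new field value is already in the lower bits of the accumulator.

@example
; my_record::top_bits    = 5     MASK my_record::top_bits    = %11100000
; my_record::middle_bits = 3     MASK my_record::middle_bits = %00011000
; my_record::bottom_bits = 0     MASK my_record::bottom_bits = %00000111

; IN:  ACC = field value in lower bits
;      rec = record type identifier
;      fld = bitfield identifier
; OUT: ACC = value moved to the position of `fld'; other bits zero
.macro put_field rec, fld
    .rept rec::fld             ; shift up to the field's position
    asl
    .endm
    and #(mask rec::fld)       ; drop any bits outside the field
.endm
@end example

The result can then be merged into an existing record instance, typically after clearing the old contents of the field.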
 870
 871 @subsection Defining data of user-defined types
 872
 873 The general syntax is
 874
 875 @example
 876 .type @var{identifier}
 877 @end example
 878
 879 or just
 880
 881 @example
 882 .@var{identifier}
 883 @end example
 884
where @var{identifier} is the name of a user-defined type. This allocates @code{sizeof(@var{identifier})} bytes of storage. Optionally, a value initializer can be specified (only in code segments). The form of this initializer depends on the type of data.
 886
 887 @itemize
 888
 889 @item @strong{Structure}. The initializer is of the form
 890
 891 @example
 892 @{ @var{field1-value}, @emph{[@var{field2-value}, ..., ]} @}
 893 @end example
 894
 895 The field initializers must match the order of the fields in the type definition. To leave a field blank, leave its initializer empty. For example
 896
 897 @example
 898 my_array .type my_struc @{ 10, , "hello" @}, @{ , , "cool!" @}, @{ 45 @}
 899 @end example
 900
 901 defines three instances of type @code{my_struc}, with various fields explicitly initialized and others implicitly padded by the assembler.
 902
 903 Since structures can contain sub-structures, so can a structure initializer. To initialize a sub-structure, simply start a new pair of @{ @} and specify field values, recursively.
 904
 905 @item @strong{Union}. The initializer is of the same form as a structure initializer, except only one of the fields in the union can be initialized.
 906
 907 @item @strong{Record}. The initializer is of the same form as a structure initializer, but cannot contain sub-structure initializers (each bitfield is a ``simple'' value).
 908
 909 @item @strong{Enum}. The initializer is simply an identifier that must be one of the identifiers appearing in the type definition.
 910
 911 @end itemize
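
As a sketch of the record and enum initializer forms, the lines below assume the @code{my_record} and @code{my_enum} definitions shown earlier, and assume that a single-instance definition follows the same pattern as the @code{my_struc} example above. The variable names are made up for illustration.

@example
my_flags  .type my_record @{ 5, 1, 7 @}   ; top_bits=5, middle_bits=1, bottom_bits=7
my_choice .type my_enum option_2         ; must name a member of my_enum
@end example

Given the bit layout of @code{my_record}, the first line would be expected to produce the byte @code{%10101111}.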
 912
 913 To define an array of (uninitialized) values of a user-defined type, use the C-style method, for example:
 914
 915 @example
 916 my_array .my_struc@strong{[100]}        ; array of 100 values of type my_struc
 917 @end example
 918
 919 @subsection Indexing symbols statically
 920
 921 A symbol can be indexed statically using the C-style syntax
 922
 923 @example
 924 @var{identifier}@strong{[}@var{expression}@strong{]}
 925 @end example
 926
 927 For byte arrays, this is simply equivalent to the expression
 928
 929 @example
 930 @var{identifier} + @var{expression}
 931 @end example
 932
 933 In general, it is equivalent to
 934
 935 @example
 936 @var{identifier} + @var{expression} * sizeof @var{identifier-type}
 937 @end example
 938
 939 where @var{identifier-type} is the type of @var{identifier}.
 940
 941 An example:
 942
 943 @example
 944 my_array .my_struc[10]        ; array of 10 values of type my_struc
 945 lda #1
 946 i = 0
 947 .while i < 10
 948 sta my_array[i].my_field               ; initialize my_field to 1
 949 i = i + 1
 950 .endm
 951
 952 @end example
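
For a worked instance of the general formula, suppose (purely as an assumption for illustration) that @code{sizeof my_struc} is 8 and that @code{my_field} lies at offset 2 within the structure. If member access simply adds the field offset, the operands in the loop above resolve as follows:

@example
my_array[0].my_field  =  my_array + 0*8 + 2  =  my_array + 2
my_array[1].my_field  =  my_array + 1*8 + 2  =  my_array + 10
my_array[9].my_field  =  my_array + 9*8 + 2  =  my_array + 74
@end example

Since the @code{.while} block is expanded during assembly, the loop results in ten separate @code{sta} instructions rather than a run-time loop.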
 953
 954 @subsection Procedures
 955
 956 A procedure is of the form
 957
 958 @example
 959 .proc @var{name}
 960 @var{statements}
 961 .endp
 962 @end example
 963
Currently, the assembler treats a procedure exactly like a label internally. However, @code{.proc} states more precisely what the symbol is meant to be, so using it makes the intent of the code clearer.
 965
 966 @subsection Importing and exporting symbols
 967
 968 To specify that a symbol used in your code is defined in a different unit, use the @code{.extrn} directive. This way you can call procedures or access constants exported by that unit. When you use the linker to create a final executable you also have to link in the unit(s) where the external symbols you use are defined.
 969
 970 The @code{extrn} directive takes as arguments a comma-separated list of identifiers, followed by a colon (:), followed by a @var{symbol type}. The symbol type must be one of @code{BYTE}, @code{WORD}, @code{DWORD}, @code{LABEL}, @code{PROC}, or the name of a user-defined type, such as a structure or union.
 971
 972 To export a symbol defined in your own code, thereby making it accessible to other units, use the @code{.public} directive. The next example shows how both directives may be used.
 973
 974 @example
 975 .extrn proc1, proc2, proc3 : proc  ; these are defined somewhere else
 976 my_proc:
 977 jsr proc1
 978 jsr proc2
 979 jsr proc3
 980 rts
 981 .public my_proc                ; make my_proc accessible to the outside world
 982
 983 @end example
 984
 985 You can also specify the @code{.public} keyword directly when defining a variable, so you don't need a separate directive to make it public:
 986
 987 @example
 988 .public my_public_variable .word
 989 @end example
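
The following sketch shows both sides of an import/export pair. The unit contents and symbol names are hypothetical; only the directive syntax is taken from the description above.

@example
; --- unit A: defines and exports a variable and a procedure ---
.dataseg
.public frame_count .word
.codeseg
.proc wait_frame
rts
.endp
.public wait_frame

; --- unit B: imports and uses them ---
.extrn frame_count : word
.extrn wait_frame : proc
jsr wait_frame
lda frame_count
@end example

Both units would then have to be given to the linker so that the references in unit B can be resolved.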
 990
 991 @subsection Controlling data mapping
By default, the linker takes the members of data segments and maps them to the best free RAM locations it finds. However, there are times when you want to specify some constraints on the mapping. For example, you may want a variable to always be mapped to the 6502's zero page. Or you may have a large array that should be aligned to a suitable boundary, so that you don't risk page-crossing penalties on indexed accesses.
 993
 994 The XORcyst assembler provides the following ways to communicate mapping constraints to the linker.
 995
 996 @itemize
 997
 998 @item To specify that a data segment variable should always be mapped to zero page, precede its definition by the @code{.zeropage} keyword:
 999
1000 @example
1001 .zeropage my_zeropage_variable .byte
1002 @end example
1003
1004 Alternatively, specify the @code{.zeropage} keyword as argument to the @code{.dataseg} directive:
1005
1006 @example
1007 .dataseg .zeropage       ; turn on .zeropage constraint
1008 my_1st_var .byte         ; .zeropage constraint will be set automatically
1009 my_2nd_var .word         ; ditto
1010 .dataseg                 ; turn off .zeropage constraint
1011 @end example
1012
1013 @item To specify that one or more data variables should be aligned, use the @code{.align} directive. It takes a list of identifiers followed by the alignment boundary, for example
1014
1015 @example
1016 .dataseg
1017 my_array .byte[64]
1018 .align my_array 64       ; my_array should be aligned on a 64-byte boundary
1019 @end example
1020
1021 @end itemize
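
A sketch that combines both kinds of constraint is shown below. The variable names are hypothetical, and the list form of @code{.align} follows the directive summary in the appendix.

@example
.dataseg .zeropage
ptr_lo .byte                           ; both pointer bytes end up in zero page
ptr_hi .byte
.dataseg
sine_table   .byte[256]
cosine_table .byte[256]
.align sine_table, cosine_table 256    ; each table starts on a 256-byte page boundary
@end example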
1022
1023 @subsection An important note on indirect addressing
1024
1025 If you're familiar with 6502 assembly, you know that parentheses ( ) are normally used to indicate indirect addressing modes. Unfortunately, this clashes with the use of parentheses in operand expressions. I couldn't get Bison (the parser generator) to deal with this context dependency. As I'm used to coding Intel X86 assembly, which uses brackets for indirection, I opted for [ ] as the default indirection operators. This could be a source of bugs, since if you type it the ``old'' way, @code{LDA ($FA),Y} is equivalent to @code{LDA $FA,Y} -- which probably isn't what you wanted. However, by specifying the switch
1026
1027 @example
1028 --swap-parens
1029 @end example
1030
upon invoking the assembler, the behaviour of [ ] and ( ) will be reversed. That is, the ``normal'' way of specifying indirection is used, e.g. @code{LDA ($00),Y}, while expression operands are grouped with [ ], e.g. @code{A/[B+C]}.
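
The two modes can be compared side by side. This is only an illustrative sketch; the operand values are arbitrary.

@example
; default: [ ] means indirection, ( ) groups expressions
lda [$FA],y
lda #(1+2)*4

; with --swap-parens: ( ) means indirection, [ ] groups expressions
lda ($FA),y
lda #[1+2]*4
@end example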
1032
1033 @node The Linker
1034 @chapter The Linker
1035
1036 The main job of the linker is to take object code files (units) created by the assembler, resolve any dependencies among them and reduce them to pure 6502 binaries.
1037
1038 The XORcyst linker takes as input a linker script. The linker script is a plaintext file containing a sequence of commands which describe the layout and contents of the linker output. (For a complete list and description of script commands, see @ref{Linker Script Commands}.) The final output of the linker process is a single binary file containing all the 6502 code properly relocated and resolved, plus any other data specified in the linker script.
1039
1040 @section Invoking the linker (@command{xlnk})
1041
1042 The basic usage is
1043
1044 @samp{@command{xlnk} @var{script-file}}
1045
1046 where @var{script-file} is the linker script file containing commands to be processed by the linker.
1047
1048 To have the linker print some information on what it is doing, give the @code{--verbose} switch.
1049
1050 @section A simple linker script example
1051
1052 The example below shows what a very simple linker script may look like. It is the simplest case, where you have a single unit @file{my_unit.o} (created by the assembler, presumably from @file{my_unit.asm}), and want to create executable 6502 code from it. For small, single-source projects you won't need much more than this.
1053
1054 @example
1055 ram@{start=0x0000,end=0x0800@}           # define an available range of 6502 RAM
1056 output@{file=program.bin@}               # set the output file
1057 link@{file=my_unit.o, origin=0xC000@}    # relocate my_unit.o to 0xC000 and write it to output
1058 @end example
1059
Commands in the script are of the form @code{@var{command-name}@{@emph{[@var{arg-name}=@var{value}, @var{arg-name}=@var{value}, ...]}@}}. The kind and number of valid arguments depend on the particular command. Some arguments are optional while others are mandatory, again depending on the particular command. (Even if the command has no arguments, you have to supply a pair of empty braces.)
1061
The @code{ram}-command tells the linker that it has available a chunk of RAM in the 6502's memory starting at address 0x0000 and ending just before 0x0800 (the end address is non-inclusive). The linker will map the contents of data segments to physical addresses in this region.
1063
1064 The @code{output}-command is used to tell the linker which file to direct its output to.
1065
1066 The @code{link}-command tells the linker to relocate the given unit and output the resulting binary representation.
1067
1068 As you can see, a line comment in the script is initiated with a @code{#}-character.
1069
1070 @section Linking multiple units
1071
1072 In principle, linking more than one unit into the same output file is simple: Just add appropriate @code{link}-commands to the linker script. For example, say you have written a small library of functions you commonly use across all your projects and assembled it to @file{my_lib.o}. Assume that your main program, say, @file{my_unit.o} depends on @file{my_lib.o}; it calls one or more functions exported from the library. You would then add an additional line to the previous example script:
1073
1074 @example
1075 ram@{start=0x0000,end=0x0800@}
1076 output@{file=program.bin@}
1077 link@{file=my_unit.o, origin=0xC000@}
1078 link@{file=my_lib.o@}                     # my_lib will be relocated to directly after my_unit.o
1079 @end example
1080
1081 Note that there is no @option{origin}-argument to the latter @code{link}-command. This is because we generally don't know how much space the code from @file{my_unit.o} will occupy. So we let the linker take care of it; when no origin is specified, the unit will be relocated to the location where the previous entity processed by the linker ended (the linker manages a ``pseudo-Program Counter'' internally to keep track of where it is in 6502 memory). So if the code for @file{my_unit.o} was @code{0x0ABC} bytes in size, @file{my_lib.o} would be relocated to @code{0xCABC}.
1082
1083 @section Separating units into banks
1084
You will get an error during linking if the Program Counter exceeds 64K. To write larger programs you normally have to divide the program into banks and manually switch them in and out of 6502 memory as they are needed. How the switching is done is very system-specific, so The XORcyst doesn't concern itself with that. However, it does allow you to manage banks.
1086
1087 The linker script command @code{bank} is used to start a new bank. There are two (semi-optional) arguments to @code{bank}:
1088 @itemize
1089
1090 @item @option{size}, which specifies the bank size in bytes. If a size is not specified, the size of the previous bank is used; and
1091
1092 @item @option{origin}, which specifies the bank's origin in 6502 memory. This is the address where the bank must be located when it resides in memory during program execution. If an origin is not specified, the origin of the previous bank is used.
1093
1094 @end itemize
1095
1096 For example,
1097 @example
1098 bank@{size=0x4000, origin=0x8000@}
1099 @end example
1100 indicates the start of a bank that is to be 16KBytes in size, and its contents should be linked relative to address 0x8000.
1101
1102 So to build on our previous example script, say that you for some reason want to put the library in a separate bank from your main program:
1103
@example
ram@{start=0x0000,end=0x0800@}
output@{file=program.bin@}
bank@{size=0x4000, origin=0x8000@}
link@{file=my_unit.o@}
bank@{size=0x4000, origin=0xC000@}
link@{file=my_lib.o@}                     # my_lib.o is relocated to the start of the second bank (0xC000)
@end example
1112
1113 This will create an output 32KBytes in size, the first 16KBytes being the bank containing the code from @file{my_unit.o} and the latter 16KBytes containing @file{my_lib.o}'s code.
1114
1115 A couple of things worth mentioning:
1116 @enumerate
1117
1118 @item It isn't necessary to specify the origin in any of the @code{link}-commands anymore, since an origin is specified in the owning bank instead. If you do specify an origin in the @code{link}-command, it will override the internal linker origin.
1119
1120 @item When you start a new bank, the previous bank may not have been completely ``filled up'' with code and/or data; in this case the output is automatically padded with zeroes so that the size of the output matches the given bank size. (In addition to the 16-bit Program Counter, the linker also keeps track of the current 0-relative bank offset, and advances it as stuff is added to output.)
1121
1122 @end enumerate
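
Because both arguments default to the values of the previous bank, a series of banks that share the same size and origin can be written more compactly. The sketch below assumes two hypothetical units, @file{bank0.o} and @file{bank1.o}, that are switched into the same address range at run time.

@example
ram@{start=0x0000,end=0x0800@}
output@{file=program.bin@}
bank@{size=0x4000, origin=0x8000@}
link@{file=bank0.o@}
bank@{@}                                 # same size and origin as the previous bank
link@{file=bank1.o@}
@end example

Both units would be linked relative to address 0x8000, each occupying its own 16KByte bank in the output.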
1123
1124 @section Partitioning 6502 RAM
1125
Usually you don't want to let the linker have @emph{all} the 6502 RAM at its disposal for data mapping; some regions of memory have special meaning and should generally be off-limits to the linker. For example, the 6502 has a stack which grows down from address 0x01FF. So it would be a good idea to reserve some space there for the stack.
1127
1128 Partitioning the RAM is easy. Just put multiple @code{ram}-commands in the linker script, leaving out the reserved regions. For example, this is a typical configuration I use for NES game programming:
@example
ram@{start=0x0000, end=0x0180@}
ram@{start=0x0300, end=0x0800@}
ram@{start=0x6000, end=0x8000@}    # only if the board has WRAM
@end example
1134
1135 Here I have left out the region @code{0x0180}...@code{0x0300}. Address @code{0x0180} up to and including @code{0x01FF} is where the stack lives, while the page starting at @code{0x0200} is used to hold game sprite data; this address is hard-coded in the assembly source.
1136
1137 The order of @code{ram}-commands is significant. The order defines the order in which the linker will attempt to map data segments' symbols to RAM. This is why the region containing the zeropage should preferably come first, since we generally want as much data as possible to be mapped here. Only when the linker runs out of space in the first region will it try the next one, and so on.
1138
1139 @section Copying files to linker output
1140
1141 Analogous to the assembler directive @code{.incbin}, the linker script command @code{copy} allows you to copy a file straight to the linker's output file. For example, you might like to prepend your 6502 executable with a header. So you create a custom header file called, say, @file{header.bin} and, prior to the @code{bank}-commands, you issue the command
1142
1143 @example
1144 copy@{file=header.bin@}
1145 @end example
1146
1147 You can also use the @code{copy}-command inside banks of course, anywhere you like. In this case the internal Program Counter and bank offset will be advanced in the same manner as when a unit is linked and copied to the output with the @code{link}-command. The only difference is that you can tell in advance how much the offsets will be increased (by looking at the size of the file that is copied).
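
Put together, a script that prepends a header might look like the following sketch. The file names are the ones used in the earlier examples; the bank size and origin are arbitrary choices for illustration.

@example
ram@{start=0x0000,end=0x0800@}
output@{file=program.bin@}
copy@{file=header.bin@}                  # written first, before any bank
bank@{size=0x4000, origin=0xC000@}
link@{file=my_unit.o@}
@end example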
1148
1149 @section Padding the output
1150
1151 You can pad the output explicitly with the @code{pad}-command. This will write an appropriate number of zero-bytes to the output file. The following are the (mutually exclusive) arguments to the command.
1152 @itemize
1153
1154 @item @code{size} : Pad as many bytes as indicated
1155
1156 @c @item @code{origin} : Pad until Program Counter equals the given origin
1157
1158 @item @code{offset} : Pad until bank offset equals the given offset
1159
1160 @end itemize
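
As a sketch of the @code{offset} form, the script fragment below pads a 16KByte bank up to the point where the 6502 interrupt vectors belong, and then copies them in from a hypothetical 6-byte file, @file{vectors.bin}. With a bank origin of 0xC000, bank offset 0x3FFA corresponds to address 0xFFFA.

@example
bank@{size=0x4000, origin=0xC000@}
link@{file=my_unit.o@}
pad@{offset=0x3FFA@}                     # zero-fill up to bank offset 0x3FFA
copy@{file=vectors.bin@}                 # occupies the last 6 bytes of the bank
@end example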
1161
1162 @c @section Specifying options
1163
1164 @node Known Bugs and Limitations
1165 @chapter Known Bugs and Limitations
1166
1167 Every source file must end with a newline.
1168
1169 @node Implementation Details
1170 @appendix Implementation Details
1171
Some deep discussion will eventually go here; in the meantime, have a look at the source code, which is full of comments.
1173
1174 @node Assembler Directives
1175 @appendix Assembler Directives
1176
1177 It is considered good practice to prepend a period to a directive when invoking it (to differentiate it from identifiers), but this is not a strict requirement.
1178
1179 The following is an alphabetical listing of the directives supported by the assembler and the arguments they may take. Arguments enclosed in square brackets [ ] are optional.
1180
1181 @table @code
1182
1183 @item align @var{identifier} @emph{[, @var{identifier-2}, ...]} @var{boundary}
1184 Specifies alignment constraints for a list of data variables.
1185
1186 @item asc (@emph{Alias for} char)
1187
1188 @item byte (@emph{Alias for} db)
1189
1190 @item char @var{expression} @emph{[, @var{expression}, ...]}
1191 Define (array of) character transformed by custom character map
1192
1193 @item charmap "@var{filename}"
1194 Set custom character map
1195
1196 @item codeseg, code
1197 Switch to code segment
1198
1199 @item dataseg, data @emph{[zeropage]}
1200 Switch to data segment
1201
1202 @item db @var{expression} @emph{[, @var{expression}, ...]}
1203 Define (array of) byte
1204
1205 @item dd @var{expression} @emph{[, @var{expression}, ...]}
1206 Define (array of) doubleword
1207
1208 @item define @var{identifier} @emph{[@var{expression}]}
1209 See @code{equ} directive
1210
1211 @item dsb @var{expression}
1212 Define storage of bytes
1213
1214 @item dsd @var{expression}
1215 Define storage of doublewords
1216
1217 @item dsw @var{expression}
1218 Define storage of words
1219
1220 @item dw @var{expression} @emph{[, @var{expression}, ...]}
1221 Define (array of) word
1222
1223 @item dword (@emph{Alias for} dd)
1224
1225 @item elif
1226 Used in conjunction with if
1227
1228 @item else
1229 Used in conjunction with if, ifdef or ifndef
1230
1231 @item endif
1232 Ends a statement block preceded by an if
1233
1234 @item end
1235 Ends the assembly unit
1236
1237 @item ende
1238 Ends an enum definition
1239
1240 @item endm
1241 Ends a macro definition
1242
1243 @item endp
1244 Ends a procedure definition
1245
1246 @item ends
1247 Ends a structure or union definition
1248
1249 @item enum @var{identifier}
1250 Begins an enum definition
1251
1252 @item error @var{expression}
1253 Prints an error
1254
1255 @item extrn @var{identifier} @emph{[, @var{identifier}, ...]} : @var{type}
1256 Flag identifier(s) as external (imported) of type @var{type}
1257
1258 @item @var{identifier} equ @var{expression}
1259 Define equate
1260
1261 @item if @var{expression}
1262 Assemble the following statement block only if @var{expression} evaluates to non-zero
1263
1264 @item ifdef @var{identifier}
1265 Assemble the following statement block only if @var{identifier} is defined
1266
1267 @item ifndef @var{identifier}
1268 Assemble the following statement block only if @var{identifier} is not defined
1269
1270 @item incbin "@var{filename}"
1271 Include contents of @var{filename} as binary data
1272
1273 @item include (@emph{Alias for} incsrc)
1274
1275 @item incsrc "@var{filename}"
1276 Include contents of @var{filename} as assembler statements
1277
1278 @item label @var{identifier} @emph{[= @var{address}]} @emph{[ : @var{type}]}
1279 Defines a global label
1280
1281 @item macro @var{identifier} @emph{[@var{identifier}, ...]}
1282 Begins a macro definition
1283
1284 @item message @var{expression}
1285 Prints a message to stdout during assembly
1286
1287 @item org @var{expression}
1288 Sets the origin address
1289
1290 @item pad @var{expression}  (@emph{Alias for} dsb)
1291
1292 @item proc @var{identifier}
1293 Begins a procedure definition
1294
1295 @item public @var{identifier} @emph{[, @var{identifier}, ...]}
1296 Flag identifier(s) as public (exported)
1297
1298 @item record @var{identifier} @var{identifier}:@var{width} @emph{[, @var{identifier}:@var{width}, ...]}
1299 Defines a record consisting of bitfields.
1300
1301 @item rept @var{count}
1302 Begins an anonymous macro to be repeated @var{count} times
1303
1304 @item struc @var{identifier}
1305 Begins a structure definition
1306
1307 @item type @var{identifier} @emph{[@var{expression}, ...]}
1308 Define data of user-defined type @var{identifier}
1309
1310 @item union @var{identifier}
1311 Begins a union definition
1312
1313 @item warning @var{expression}
1314 Prints a warning
1315
1316 @item while @var{expression}
1317 Begins an anonymous macro to be repeated while @var{expression} is true (non-zero)
1318
1319 @item word (@emph{Alias for} dw)
1320
1321 @end table
1322
1323 @node Linker Script Commands
1324 @appendix Linker Script Commands
1325
1326 The following is an alphabetical listing of the script commands recognized by the linker and the arguments they may take. Note that not all arguments are mandatory and some are mutually exclusive.
1327
1328 @table @code
1329
1330 @item bank @{ size=@var{size}, origin=@var{origin-address} @}
1331 Start a new bank of size @var{size} bytes and set initial relocation address to @var{origin-address}.
1332
1333 @item copy @{ file=@var{filename} @}
1334 Copy contents of @var{filename} to output.
1335
1336 @item link @{ file=@var{filename}, origin=@var{origin-address} @}
1337 Relocate code in the unit @var{filename} to @var{origin-address} and copy the result to output. (If an origin is not specified, the internally managed linker origin is used.)
1338
1339 @c @item options
1340
1341 @item output @{ file=@var{filename} @}
1342 Set the linker output file.
1343
1344 @item pad @{ origin=@var{origin-address}, offset=@var{offset}, size=@var{size} @}
1345 Pad to the given origin or bank offset, or pad @var{size} bytes (only one of the arguments should be given).
1346
1347 @item ram @{ start=@var{start-address}, end=@var{end-address} @}
1348 Specify that 6502 RAM in the range @var{start-address}...@var{end-address} (non-inclusive) may be used by the linker to map contents of data segments.
1349
1350 @end table
1351
1352 @node Object Code Format
1353 @appendix Object Code Format
1354
1355 An object code file, or unit, produced by the assembler has the following major sections:
1356
1357 @itemize
1358
1359 @item Magic number and assembler version
1360
1361 @item Definitions of exported constants
1362
1363 @item Descriptors for imported symbols
1364
1365 @item Data segment bytecodes
1366
1367 @item Code segment bytecodes
1368
1369 @item Definitions of expressions referred to by bytecodes
1370
1371 @end itemize
1372
Each of these is described in the following sections.
1374
1375 @section Magic number and assembler version
1376
1377 The magic number is a 16-bit constant (@code{0xCAFE}, if you must know), used to validate the object file. It is followed by 1 byte which denotes the version of the assembler that was used to build the file; the major version in the upper nibble and minor version in the lower nibble (should be @code{0x10}).
1378
1379 @section Definitions of exported constants
1380
1381 This is a series of triplets @var{(identifier, type, value)}, each describing a constant made publicly available.
1382
1383 @section Descriptors for imported symbols
1384
1385 This is a list of descriptors for the symbols used by this unit which are not defined in the unit itself; that is, they are external dependencies.
1386
1387 @section Data segment bytecodes
1388
1389 Statements in the original assembler file are encoded in a compact bytecode form. The bytecodes here define the labels and data storages located in the unit's @code{dataseg} section(s). The bytecode commands are a subset of the ones described in the next section.
1390
1391 @section Code segment bytecodes
1392
1393 These bytecodes are a compact representation of the contents of the unit's @code{codeseg} section(s). The commands and their arguments are as follows:
1394
1395 @table @code
1396
1397 @item CMD_END
1398 Indicates the end of the segment.
1399
1400 @item CMD_BIN8 @var{count} @var{byte1, byte2, ...}
1401 The next @var{count} bytes are binary data which needn't be processed in any special way. @var{count} is an 8-bit quantity.
1402
1403 @item CMD_BIN16 @var{count} @var{byte1, byte2, ...}
1404 The next @var{count} bytes are binary data which needn't be processed in any special way. @var{count} is a 16-bit quantity.
1405
1406 @item CMD_LABEL @var{flag} @var{identifier}
1407 Define a label. If bit 0 of the byte @var{flag} is set, this is a public variable and its identifier follows.
1408
1409 @item CMD_INSTR @var{opcode} @var{expression-id}
1410 An instruction whose operand must ultimately be resolved. @var{opcode} is the 6502 operation code. @var{expression-id} is a 16-bit quantity which refers to the expression which is the (symbolic) operand of the instruction (see the next section).
1411
1412 @item CMD_DB @var{expression-id}
1413 Define a byte symbolically. @var{expression-id} refers to the expression which is the operand.
1414
1415 @item CMD_DW @var{expression-id}
1416 Define a word symbolically. @var{expression-id} refers to the expression which is the operand.
1417
1418 @item CMD_DD @var{expression-id}
1419 Define a doubleword symbolically. @var{expression-id} refers to the expression which is the operand.
1420
1421 @item CMD_DSI8 @var{size}
1422 Define data storage of @var{size} bytes. @var{size} is an 8-bit quantity.
1423
@item CMD_DSI16 @var{size}
1425 Define data storage of @var{size} bytes. @var{size} is a 16-bit quantity.
1426
1427 @item CMD_DSB @var{expression-id}
1428 Define data storage of bytes, the size of which is determined by the expression referred to by @var{expression-id}.
1429
1430 @end table
1431
1432 @section Expressions
1433
As you may have noticed in the preceding section, many bytecodes have an expression identifier as argument. This is just an index into the list of expressions defined in the final part of the object file. The main advantages of separating the @emph{use} of an expression (through its identifier) from its @emph{definition} are ease of parsing and processing (each bytecoded instruction will always occupy 4 bytes), and the ability to share the same expression among several instructions without having to redefine it every time. In most cases, the expression will just be a stand-alone reference to a symbol (local or external). But any expression that the assembler understands can be encoded here.
1435
1436 @node Custom Character Maps
1437 @appendix Custom Character Maps
1438
1439 @c Using custom character maps is a convenient way of mapping ASCII characters to a different encoding.
1440 A custom character map is a plaintext file containing statements of the form
1441 @example
1442 @var{key} = @var{value}
1443 @end example
1444
where @var{key} is a character or escape sequence and @var{value} is the integer literal that instances of this character should be mapped to when occurring as argument to the @code{.char} assembler directive.
1446
There is also a shorthand form for mapping a range of characters at once:
1448 @example
1449 @var{low_key}-@var{high_key} = @var{value}
1450 @end example
1451
This is most useful when mapping the decimal digits and the alphabet. Instead of typing repetitive statements like
1453 @example
1454 0=0x10
1455 1=0x11
1456 ...
1457 9=0x19
1458 @end example
1459 you can achieve the same result from the statement
1460 @example
1461 0-9=0x10
1462 @end example
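
A small map covering digits and upper-case letters could therefore be sketched as below. The target values are arbitrary and the file name @file{my_map.tbl} is hypothetical.

@example
0-9=0x10
A-Z=0x1A
@end example

Assuming ranges are mapped to consecutive values as in the digit example above, using the map from assembly source would look something like this:

@example
.charmap "my_map.tbl"
.char "A1"               ; expected to emit 0x1A, 0x11 under the map above
@end example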
1463
1464 @node Error and Warning Messages
1465 @appendix Error and Warning Messages
1466
1467 @section Assembler Error Messages
1468
1469 @table @samp
1470
1471 @item cannot expand `@var{identifier}'; not a macro
1472 Make sure @var{identifier} is a macro and not a label, constant or other type of symbol.
1473
1474 @item conditional expression does not evaluate to literal
Conditional assembly with the @code{.if}-directive requires that the expression tested can be evaluated immediately, so it can't contain references to labels and such (these aren't computed by the assembler; that's the linker's job).
1476
1477 @item could not open `@var{filename}' for reading
1478 Check that the file exists and that you have read privileges.
1479
1480 @item duplicate symbol `@var{identifier}'
1481 You tried to define the same symbol more than once. Global labels must be unique across the entire program. Local labels must be unique within the relevant local scope.
1482
1483 @item field declaration expected
1484 A structure or union field must be of the form @code{@var{identifier} @var{datatype} @var{[count]}}; for example, @code{my_field .db}, @code{my_2nd_field .dsw 10}, @code{my_3rd_field .type other_struc}.
1485
1486 @item data initialization not allowed here
1487 A structure or union declaration cannot contain initialization of its fields.
1488
1489 @item initializer does not evaluate to integer literal
1490 A member of an enumerated datatype must be assigned a constant value.
1491
1492 @item initializer for field `@var{identifier}' exceeds field size
1493 When defining the value of a structure or union field, the value must not exceed the number of bytes of storage allocated for that field. For example, if the field definition is @code{a_string .dsb 4}, then the value @code{"too long"} won't fit since it is 8 bytes long.
1494
1495 @item instructions not allowed in data segment
1496 Instructions can only be contained in a code segment (after a @code{codeseg} directive).
1497
1498 @item invalid addressing mode
1499 The combination of mnemonic and addressing mode for the 6502 instruction is invalid. For example, @code{LDX $00,X} is invalid since the @code{LDX} instruction does not have a ZeroPage,X-mode version. Consult a 6502 manual to see what modes are valid for each instruction.
1500
1501 @item invalid dataseg statement
1502 A data segment only supports a subset of the statements allowed in code segments. You can't put instructions or initialized data in a data segment.
1503
1504 @item invalid operand
1505 You supplied an invalid operand to a statement, for example a string as operand to the @code{LDA} instruction.
1506
1507 @item macro `@var{identifier}' does not take @var{count} argument(s)
You supplied the wrong number of arguments to the macro. Check the macro definition if you're unsure how many arguments it takes, and try again.
1509
1510 @item member `@var{sub-struct-identifier}' of `@var{struct-identifier}' is not a structure
1511 The expression @code{@var{struct-identifier}.@var{sub-struct-identifier}.@var{some-member}} did not resolve because @var{sub-struct-identifier} is not a structure.
1512
1513 @item member '@var{field-identifier}' of '@var{struct-identifier}' is of unknown type (`@var{type-identifier}')
1514 @var{type-identifier} is an undefined type.
1515
1516 @item only one field of union can be initialized
1517 When defining an instance of a union, only one of the possible fields can be given a value between the pair of enclosing braces @{ @}.
1518
1519 @item operand out of range
1520 A 6502 instruction has either an 8-bit or 16-bit operand, so its value has to fit in that many bits. However, the value you supplied was too large to fit.
1521
1522 @item procedures not allowed in data segment
1523 A procedure contains code. Code cannot be contained in a data segment.
1524
1525 @item repeat count does not evaluate to literal
1526 Anonymous macros are expanded as soon as they are encountered. Thus, the argument of a @code{rept} directive must be an immediate expression.
1527
1528 @item size of `@var{identifier}' is unknown
1529 The operand to the @code{sizeof} operator must be one of @code{BYTE}, @code{WORD}, @code{DWORD}, or the name of a structure definition.
1530
1531 @item string or integer argument expected
1532 The @code{message} directive takes a string or integer as its argument.
1533
1534 @item structure initializer expected
1535 When defining data that is an instance of a structure or union, the field value(s) must be enclosed in a pair of braces @{ @}.
1536
1537 @item too many field initializers
1538 There are too many values given compared to the actual number of fields in the structure or union.
1539
1540 @item union member must be of constant size
1541 The size of a union member must be known at assembly time. This restriction does not apply to structs.
1542
1543 @item unknown macro or directive `@var{identifier}'
1544 You attempted to invoke a macro or directive that the assembler doesn't recognize. Check your spelling (remember that identifiers are case sensitive) and/or your macro definitions.
1545
1546 @item unknown namespace `@var{identifier}'
1547 The expression @code{@var{identifier}::@var{symbol}} did not resolve because @var{identifier} is not a namespace.
1548
1549 @item unknown symbol `@var{identifier}'
1550 Your code refers to a symbol that has neither been defined locally nor declared external.
1551
1552 @item value not allowed in data segment
1553 Data cannot be initialized in a data segment; the data segment can only specify how many bytes of storage will be needed at runtime.
1554
1555 @item `@var{identifier}' declared as extrn but is defined locally
1556 Defining a label in your own code and then declaring it as an external symbol doesn't make much sense.
1557
1558 @item `@var{identifier}' already declared extrn
1559 An identifier already specified in a @code{public} directive cannot at the same time be external.
1560
1561 @item `@var{identifier}' is of non-exportable type
1562 Macros and other volatile symbols cannot be exported.
1563
1564 @item `@var{field-identifier}' is not a member of `@var{struct-identifier}'
1565 The expression @code{@var{struct-identifier}.@var{field-identifier}} did not resolve.
1566
1567 @end table
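
As an illustration of the @samp{invalid addressing mode} error, the following sketch contrasts the rejected form with two accepted ones. It assumes that @code{;} starts a comment; the mnemonics and addressing modes are standard 6502.

@example
    LDX $00,X     ; rejected: LDX has no ZeroPage,X form
    LDX $00,Y     ; accepted: LDX indexes zero page with Y
    LDA $00,X     ; accepted: LDA does have a ZeroPage,X form
@end example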
1568
1569 @section Assembler Warning Messages
1570
1571 @table @samp
1572
1573 @item `@var{identifier}' declared as public but is not defined
1574 You cannot export a symbol that isn't defined in your code.
1575
1576 @item `@var{identifier}' defined but not used
1577 Usually there is a reason for defining a symbol, so the assembler will warn you if there are no references to it in the code.
1578
1579 @item operand out of range; truncated
1580 Operand exceeds 8 or 16 bits, so the upper bits are chopped off.
1581
1582 @item redefinition of `@var{identifier}' is not identical; ignored
1583 When using the @code{.equ} directive you can only define each identifier once. (Use the @code{=} operator instead if appropriate.)
1584
1585 @end table
1586
1587 @section Linker Error Messages
1588
1589 @table @samp
1590
1591 @item branch out of range
1592 A relative branch instruction went too far. Trim your code or do an inverse-branch-followed-by-jump combo instead.
1593
1594 @item duplicate symbol `@var{identifier}'
1595 A symbol with the same name is exported from two or more of the units being linked. When linking, exported names must be unique across all units.
1596
1597 @item incompatible operand(s) to `@var{operator}' in expression
1598
1599 @item instruction operand doesn't fit in 1 byte
1600 A rather fatal error. A zeropage instruction's operand address won't fit.
1601
1602 @item instruction operand doesn't fit in 2 bytes
1603 A rather fatal error which shouldn't even occur.
1604
1605 @item invalid instruction operand (string)
1606 6502 instructions only take integer operands.
1607
1608 @item negative count
1609 A storage directive must have a positive integer operand.
1610
1611 @item out of 6502 RAM while allocating unit `@var{unit}'
1612 The linker couldn't map the data segments to 6502 RAM because there was too little of it available. Check your @code{ram}-commands in the script or reduce your program's memory requirements.
1613
1614 @item PC went beyond 64K when linking `@var{unit}'
1615
1616 @item unexpected string operand (`@var{string}') to storage directive
1617 Storage directives only take integer operands.
1618
1619 @item unknown symbol `@var{identifier}' referenced from @var{unit}
1620 The external symbol couldn't be resolved. You need to link the unit containing the symbol.
1621
1622 @end table
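
The inverse-branch-followed-by-jump fix for @samp{branch out of range} might look like the sketch below. The label names are hypothetical, and the @code{label:} and @code{;} comment syntax is assumed; the instructions themselves are standard 6502.

@example
    ; Original, fails when far_target is out of branch range:
    ;     BEQ far_target
    BNE skip_jump     ; inverse condition hops over the jump
    JMP far_target    ; JMP takes a full 16-bit address
skip_jump:
    ; execution continues here when the zero flag is clear
@end example

A relative branch encodes a signed 8-bit offset, so its target must lie within roughly 128 bytes of the branch; @code{JMP} has no such limit.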
1623
1624 @subsection Linker Script Error Messages
1625
1626 @table @samp
1627
1628 @item bank size (@var{size}) exceeded by @var{count} bytes
1629 The bank output exceeded the size of the current bank. Either the bank size is wrong, your code is too large, or the files you are copying to the bank are too large.
1630
1631 @item cannot pad backwards
1632 If you start a bank, copy (say) a 2K file to it, and then attempt to pad to offset 1K, you will get this error. Padding can only proceed from a smaller offset to a larger or equal one. Either your pad offset is wrong or the data preceding it is too large.
1633
1634 @item could not open `@var{filename}' for reading
1635 The specified input file could not be opened. Check that it exists and that you have permission to read it.
1636
1637 @item could not open `@var{filename}' for writing
1638 The specified output file could not be created.
1639
1640 @item `end' is smaller than `start'
1641 The end address must be larger than or equal to the start address, not the other way around.
1642
1643 @item failed to load `@var{unit}'
1644 The object file could not be loaded from storage. Either the file is missing, you don't have access to it, or it is corrupted.
1645
1646 @item invalid size
1647 The size must be a positive (larger than zero) quantity.
1648
1649 @item missing argument `@var{name}'
1650 The script command requires an argument which you did not supply.
1651
1652 @item no bank size set
1653 At a minimum, the first @code{bank}-command in the script must supply a bank size.
1654
1655 @item no output open
1656 When executing a script command which writes to the linker's output, an output file must have been specified first. Make sure that all @code{link}-, @code{copy}-, @code{pad}-commands etc. are preceded by the proper @code{output}-command.
1657
1658 @item value of argument `@var{name}' is out of range
1659 The script command argument's value is outside the expected range. For example, an argument which specifies a 6502 address should be between 0 and 64K.
1660
1661 @end table
1662
1663 @section Linker Warning Messages
1664
1665 @table @samp
1666
1667 @item `.D(B|W)' operand @var{integer} out of range; truncated
1668 Operand exceeds 8 or 16 bits, so the upper bits are chopped off.
1669
1670 @end table
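
To make the truncation concrete, here is a small worked sketch. It assumes that ``chopped off'' means only the low-order bits are kept, that @code{$} introduces a hexadecimal literal, and that @code{;} starts a comment; the values are made up for illustration.

@example
    .db $1234     ; 8-bit storage:  $1234 AND $FF    = $34
    .dw $12345    ; 16-bit storage: $12345 AND $FFFF = $2345
@end example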
1671
1672 @bye