beta-0.89.2
[luatex.git] / source / texk / web2c / triptrap / trapman.tex
blob7968c6e44eae3fc896f2c0a4d5c6fab93ada4836
1 % The TRAP manual: How to validate MF --- last updated by D E Knuth on 4 Dec 89
2 \font\eighttt= cmtt8
3 \font\eightrm= cmr8
4 \font\titlefont=cmssdc10 at 40pt
5 \let\mc=\eightrm
6 \font\logo=manfnt % font used for the METAFONT logo
7 \def\MF{{\logo META}\-{\logo FONT}}
8 \rm
9 \let\mainfont=\tenrm
11 \def\.#1{\hbox{\tt#1}}
12 \def\\#1{\hbox{\it#1\/\hskip.05em}} % italic type for identifiers
14 \parskip 2pt plus 1pt
15 \baselineskip 12pt plus .25pt
17 \def\verbatim#1{\begingroup \frenchspacing
18 \def\do##1{\catcode`##1=12 } \dospecials
19 \parskip 0pt \parindent 0pt
20 \catcode`\ =\active \catcode`\^^M=\active
21 \tt \def\par{\ \endgraf} \obeylines \obeyspaces
22 \input #1 \endgroup}
23 % a blank line will be typeset at the end of the file;
24 % if you're unlucky it will appear on a page by itself!
25 {\obeyspaces\global\let =\ }
27 \output{\shipout\box255\global\advance\pageno by 1} % for the title page only
28 \null
29 \vfill
30 \centerline{\titlefont A Torture Test}
31 \vskip8pt
32 \centerline{\titlefont for \logo ()*+,-.*}
33 \vskip 24pt
34 \centerline{by Donald E. Knuth}
35 \centerline{Stanford University}
36 \vskip 6pt
37 \centerline{({\sl Version 2, January 1990\/})}
38 \vfill
39 \centerline{\vbox{\hsize 4in
40 \noindent Programs that claim to be implementations of \MF84 are
41 supposed to be able to process the test routine contained in this
42 report, producing the outputs contained in this report.}}
43 \vskip 24pt
44 {\baselineskip 9pt
45 \eightrm\noindent
46 The preparation of this report was supported in part by the National Science
47 Foundation under grants IST-8201926 and MCS-8300984,
48 and by the System Development Foundation.
49 {\logo opqrstuq} is a trademark of Addison-Wesley Publishing Company.
52 }\pageno=0\eject
54 \output{\shipout\vbox{ % for subsequent pages
55 \baselineskip0pt\lineskip0pt
56 \hbox to\hsize{\strut
57 \ifodd\pageno \hfil\eightrm\firstmark\hfil
58 \mainfont\the\pageno
59 \else\mainfont\the\pageno\hfil
60 \eightrm\firstmark\hfil\fi}
61 \vskip 10pt
62 \box255}
63 \global\advance\pageno by 1}
64 \let\runninghead=\mark
65 \outer\def\section#1.{\noindent{\bf#1.}\quad
66 \runninghead{\uppercase{#1} }\ignorespaces}
68 \section Introduction.
69 People often think that their programs are ``debugged'' when large applications
70 have been run successfully. But system programmers know that a typical large
71 application tends to use at most about 50 per cent of the instructions
72 in a typical compiler. Although the other half of the code---which tends
73 to be the ``harder half''---might be riddled with errors, the system seems
74 to be working quite impressively until an unusual case shows up on the
75 next day. And on the following day another error manifests itself, and so on;
76 months or years go by before certain parts of the compiler are even
77 activated, much less tested in combination with other portions of the system,
78 if user applications provide the only tests.
80 How then shall we go about testing a compiler? Ideally we would like to
81 have a formal proof of correctness, certified by a computer.
82 This would give us a lot of confidence,
83 although of course the formal verification program might itself be incorrect.
84 A more serious drawback of automatic verification is that the formal
85 specifications of the compiler are likely to be wrong, since they aren't
86 much easier to write than the compiler itself. Alternatively, we can
87 substitute an informal proof of correctness: The programmer writes his or
88 her code in a structured manner and checks that appropriate relations
89 remain invariant, etc. This helps greatly to reduce errors, but it cannot
90 be expected to remove them completely; the task of checking a large
91 system is sufficiently formidable that human beings cannot do it without
92 making at least a few slips here and there.
94 Thus, we have seen that test programs are unsatisfactory if they are simply
95 large user applications; yet some sort of test program is needed because
96 proofs of correctness aren't adequate either. People have proposed schemes
97 for constructing test data automatically from a program text, but such
98 approaches run the risk of circularity, since they cannot assume that a
99 given program has the right structure.
101 I have been having good luck with a somewhat different approach,
102 first used in 1960 to debug an {\mc ALGOL} compiler. The idea is to
103 construct a test file that is about as different from a typical user
104 application as could be imagined. Instead of testing things that people
105 normally want to do, the file tests complicated things that people would
106 never dare to think of, and it embeds these complexities in still
107 more arcane constructions. Instead of trying to make the compiler do the
108 right thing, the goal is to make it fail (until the bugs have all been found).
110 To write such a fiendish test routine, one simply gets into a nasty frame
111 of mind and tries to do everything in the unexpected way. Parameters
112 that are normally positive are set negative or zero; borderline cases
113 are pushed to the limit; deliberate errors are made in hopes that the
114 compiler will not be able to recover properly from them.
116 A user's application tends to exercise 50\%\ of a compiler's logic,
117 but my first fiendish tests tend to improve this to about 90\%. As the
118 next step I generally make use of frequency-counting software to identify
119 the instructions that have still not been called upon. Then I add ever more
120 fiendishness to the test routine, until more than 99\%\ of the code
121 has been used at least once. (The remaining bits are things that
122 can occur only if the source program is really huge, or if certain
123 fatal errors are detected; or they are cases so similar to other well-tested
124 things that there can be little doubt of their validity.)
126 Of course, this is not guaranteed to work. But my experience in 1960 was
127 that only two bugs were ever found in that {\mc ALGOL} compiler after it
128 correctly translated that original fiendish test. And one of those bugs
129 was actually present in the results of the test; I simply had failed to
130 notice that the output was incorrect. Similar experiences occurred later
131 during the 60s and 70s, with respect to a few assemblers, compilers,
132 and simulators that I wrote.
134 This method of debugging, combined with the methodology of structured
135 programming and informal proofs (otherwise known as careful desk checking),
136 leads to greater reliability of production software than any other
137 method I know. Therefore I have used it in developing \MF84, and the
138 main bulk of this report is simply a presentation of the test program
139 that was used to get the bugs out of \MF.
141 Such a test file is useful also after a program has been debugged, since
142 it can be used to give some assurance that subsequent modifications don't
143 mess things up.
145 The test file is called \.{TRAP.MF}, because of my warped sense of humor:
146 \MF's companion system, \TeX, has a similar test file called \.{TRIP}, and I
147 couldn't help thinking about Billy Goat Gruff and the story of ``trip,
148 trap, trip, trap.''
150 The contents of this test file are so remote from what people actually
151 do with \MF, I feel apologetic if I have to explain the correct
152 translation of \.{TRAP.MF}; nobody really cares about most of the
153 nitty-gritty rules that are involved. Yet I believe \.{TRAP} exemplifies
154 the sort of test program that has outstanding diagnostic ability, as
155 explained above.
157 If somebody claims to have a correct implementation of \MF, I will not
158 believe it until I see that \.{TRAP.MF} is translated properly.
159 I propose, in fact, that a program must meet two criteria before it
160 can justifiably be called \MF: (1)~The person who wrote it must be
161 happy with the way it works at his or her installation; and (2)~the
162 program must produce the correct results from \.{TRAP.MF}.
164 \MF\ is in the public domain, and its algorithms are published;
165 I've done this since I do not want to discourage its use by placing
166 proprietary restrictions on the software. However, I don't want
167 faulty imitations to masquerade as \MF\ processors, since users
168 want \MF\ to produce identical results on different machines.
169 Hence I am planning to do whatever I can to suppress any systems that
170 call themselves \MF\ without meeting conditions (1) and~(2).
171 I have copyrighted the programs so that I have some chance to forbid
172 unauthorized copies; I explicitly authorize copying of correct
173 \MF\ implementations, and not of incorrect ones!
175 The remainder of this report consists of appendices, whose contents ought
176 to be described briefly here:
178 Appendix A explains in detail how to carry out a test of \MF, given
179 a tape that contains copies of the other appendices.
181 Appendix B is \.{TRAP.MF}, the fiendish test file that has already
182 been mentioned. People who think that they understand \MF\ are challenged
183 to see if they know what \MF\ is supposed to do with this file.
184 People who know only a little about \MF\ might still find it
185 interesting to study Appendix~B, just to get some insights into the
186 methodology advocated here.
188 Appendix C is \.{TRAPIN.LOG}, a correct transcript file \.{TRAP.LOG}
189 that results if \.{INIMF} is applied to \.{TRAP.MF}. (\.{INIMF} is
190 the name of a version of \MF\ that does certain initializations;
191 this run of \.{INIMF} also creates a binary base file called \.{TRAP.BASE}.)
193 Appendix D is a correct transcript file \.{TRAP.LOG} that results if
194 \.{INIMF} or any other version of \MF\ is applied to \.{TRAP.MF}
195 with base file \.{TRAP.BASE}.
197 Appendix E is \.{TRAP.TYP}, the symbolic version of a correct output
198 file \.{TRAP.72270GF} that was produced at the same time as the \.{TRAP.LOG}
199 file of Appendix~D.
201 Appendix F is \.{TRAP.PL}, the symbolic version of a correct output
202 file \.{TRAP.TFM} that was produced at the same time as the \.{TRAP.LOG}
203 file of Appendix~D.
205 Appendix G is \.{TRAP.FOT}, an abbreviated version of Appendix D that
206 appears on the user's terminal during the run that produces \.{TRAP.LOG},
207 \.{TRAP.72270GF}, and \.{TRAP.TFM}.
209 The debugging of \MF\ and the testing of the adequacy of \.{TRAP.MF}
210 could not have been done nearly as well as reported here except for
211 the magnificent software support provided by my colleague David R. Fuchs.
212 In particular, he extended our local Pascal compiler so that
213 frequency counting and a number of other important features were added
214 to its online debugging abilities.
216 The method of testing advocated here has one chief difficulty that deserves
217 comment: I had to verify by hand that \MF\ did the right things
218 to \.{TRAP.MF}. This took many hours, and perhaps I have missed
219 something (as I did in 1960); I must confess that I have not checked
220 every single number in Appendices D, E, and~F. However, I'm willing to pay
221 $\$$81.92 to the first finder of any remaining bug in \MF, and I will
222 be surprised if that bug doesn't show up also in one of these appendices.
224 \vfill\eject
226 \section Appendix A: How to test \MF.
228 \item{0.} Let's assume that you have a tape containing \.{TRAP.MF},
229 \.{TRAPIN.LOG}, \.{TRAP.LOG}, \.{TRAP.TYP}, \.{TRAP.PL}, and \.{TRAP.FOT},
230 as in Appendices B, C, D, E, F, and~G. Furthermore, let's suppose that you
231 have a working \.{WEB} system, and that you have working programs
232 \.{TFtoPL} and \.{GFtype}, as described in the \TeX ware and \MF ware reports.
234 \item{1.} Prepare a version of \.{INIMF}. (This means that your \.{WEB}
235 change file should have {\bf init} and {\bf tini} defined to be null.)
236 The {\bf debug} and {\bf gubed} macros should be null, in order to
237 activate special printouts that occur when $\\{tracingedges}>1.0$.
238 The {\bf stat} and {\bf tats} macros should also be null, so that
239 statistics are kept. Set \\{mem\_top} and \\{mem\_max} to 3000
240 (or to \\{mem\_min} plus 3000, if \\{mem\_min} isn't zero),
241 for purposes of this test version.
242 Also set $\\{error\_line}=64$, $\\{half\_error\_line}=32$,
243 $\\{max\_print\_line}=72$, $\\{screen\_width}=100$, and
244 $\\{screen\_depth}=200$; these parameters affect many of the lines of
245 the test output, so your job will be much easier if you use the same
246 settings that were used to produce Appendix~E. Also (if possible) set
247 $\\{gf\_buf\_size}=8$, since this tests more parts of the program.
248 You probably should also use the ``normal'' settings of other parameters
249 found in \.{MF.WEB} (e.g., $\\{max\_internal}=100$, $\\{buf\_size}=500$,
250 etc.), since these show up in a few lines of the test output. Finally,
251 change \MF's screen-display routines by putting the following simple lines
252 in the change file:
253 $$\vbox{\halign{\tt#\hfil\cr
254 \char`\@x Screen routines:\cr
255 begin init\char`\_screen:=false;\cr
256 \char`\@y\cr
257 begin init\char`\_screen:=true;
258 \char`\{screen instructions will be logged\char`\}\cr
259 \char`\@z\cr}}$$
260 None of the other screen routines (\\{update\_screen}, \\{blank\_rectangle},
261 \\{paint\_row}) should be changed in any way; the effect will be to have
262 \MF's actions recorded in the transcript files instead of on the screen,
263 in a machine-independent way.
265 \item{2.} Run the \.{INIMF} prepared in step 1. In response to the first
266 `\.{**}' prompt, type carriage return (thus getting another `\.{**}').
267 Then type `\.{\char`\\input trap}'. You should get an output that matches
268 the file \.{TRAPIN.LOG} (Appendix~C). Don't be alarmed by the error
269 messages that you see, unless they are different from those in Appendix~C.
271 \def\sp{{\char'40}}
272 \item{3.} Run \.{INIMF} again. This time type `\.{\sp\&trap\sp\sp trap\sp}'.
273 (The spaces in this input help to check certain parts of \MF\ that
274 aren't otherwise used.) You should get outputs \.{TRAP.LOG}, \.{TRAP.72270GF},
275 and \.{TRAP.TFM}.
276 Furthermore, your terminal should receive output that matches \.{TRAP.FOT}
277 (Appendix~G). During the middle part of this test, however, the terminal
278 will not be getting output, because \.{batchmode} is being
279 tested; don't worry if nothing seems to be happening for a while---nothing
280 is supposed to.
282 \item{4.} Compare the \.{TRAP.LOG} file from step 3 with the ``master''
283 \.{TRAP.LOG} file of step~0. (Let's hope you put that master file in a
284 safe place so that it wouldn't be clobbered.) There should be perfect
285 agreement between these files except in the following respects:
287 \itemitem{a)} The dates and possibly the file names will
288 naturally be different.
290 \itemitem{b)} If you had different values for \\{stack\_size}, \\{buf\_size},
291 etc., the corresponding capacity values will be different when they
292 are printed out at the end.
294 \itemitem{c)} Help messages may be different; indeed, the author encourages
295 non-English help messages in versions of \MF\ for people who don't
296 understand English as well as some other language.
298 \itemitem{d)} The total number and length of strings at the end and/or
299 ``still untouched'' may well be different.
301 \itemitem{e)} If your \MF\ uses a different memory allocation or
302 packing scheme, the memory usage statistics may change.
304 \itemitem{f)} If you use a different storage allocation scheme, the
305 capsule numbers will probably be different, but the order of variables
306 should be unchanged when dependent variables are shown. \MF\ should also
307 choose the same variables to be dependent.
309 \itemitem{g)} If your computer handles integer division of negative operands
310 in a nonstandard way, you may get results that are rounded differently.
311 Although \TeX\ is careful to be machine-independent in this regard,
312 \MF\ is not, because integer divisions are present in so many places.
314 \item{5.} Use \.{GFtype} to convert your file \.{TRAP.72270GF} to a file
315 \.{TRAP.TYP}. (Both of \.{GFtype}'s options, i.e., mnemonic output and image
316 output, should be enabled so that you get the maximum amount of output.)
317 The resulting file should agree with the master \.{TRAP.TYP} file of step~0,
318 assuming that your \.{GFtype} has the ``normal'' values of compile-time
319 constants ($\\{top\_pixel}=69$, etc.).
321 \item{6.} Use \.{TFtoPL} to convert your file \.{TRAP.TFM} to a file
322 \.{TRAP.PL}. The resulting file should agree with the master \.{TRAP.PL}
323 file of step~0.
325 \item{7.} You might also wish to test \.{TRAP} with other versions of
326 \MF\ (i.e., \.{VIRMF} or a production version with another base file
327 preloaded). It should work unless \MF's primitives have been redefined in
328 the base file. However, this step isn't essential, since all the code of
329 \.{VIRMF} appears in \.{INIMF}; you probably won't catch any more errors
330 this way, unless they would already become obvious from normal use of
331 the~system.
333 \vfill\eject
335 \section Appendix B: The \.{TRAP.MF} file.
336 The contents of the test routine are prefixed here with line numbers, for
337 ease in comparing this file with the error messages printed later; the
338 line numbers aren't actually present.
339 \runninghead{APPENDIX B: \.{TRAP.MF} (CONTINUED)}
341 \vskip 8pt
342 \begingroup\count255=0
343 \everypar{\global\advance\count255 by 1
344 \hbox to 20pt{\sevenrm\hfil\the\count255\ \ }}
345 \verbatim{trap.mf}
346 \endgroup
347 \vfill\eject
349 \section Appendix C: The \.{TRAPIN.LOG} file.
350 When \.{INIMF} makes the \.{TRAP.BASE} file, it also creates a file called
351 \.{TRAP.LOG} that looks like this.
352 \runninghead{APPENDIX C: \.{TRAPIN.LOG} (CONTINUED)}
354 \vskip8pt
355 \verbatim{trapin.log}
356 \vfill\eject
358 \section Appendix D: The \.{TRAP.LOG} file.
359 Here is the major output of the \.{TRAP} test; it is generated by running
360 \.{INIMF} and loading \.{TRAP.BASE}, then reading \.{TRAP.MF}.
361 \runninghead{APPENDIX D: \.{TRAP.LOG} (CONTINUED)}
363 {\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
364 \vskip8pt
365 \verbatim{trap.log}
367 \vfill\eject
369 \section Appendix E: The \.{TRAP.TYP} file.
370 Here is another major component of the test. It shows the output of \.{GFtype}
371 applied to the file \.{TRAP.72270GF} that is created at the same time
372 Appendix D was produced.
373 \runninghead{APPENDIX E: \.{TRAP.TYP} (CONTINUED)}
375 {\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
376 \vskip8pt
377 \verbatim{trap.typ}
379 \vfill\eject
381 \section Appendix F: The \.{TRAP.PL} file.
382 In this case we have the output of \.{TFtoPL}
383 applied to the file \.{TRAP.TFM} that is created at the same time
384 Appendix D was produced.
385 \runninghead{APPENDIX F: \.{TRAP.PL} (CONTINUED)}
387 {\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
388 \vskip8pt
389 \verbatim{trap.pl}
391 \vfill\eject
393 \section Appendix G: The \.{TRAP.FOT} file.
394 This shows what appeared on the terminal while Appendix D was being produced.
395 \runninghead{APPENDIX G: \.{TRAP.FOT} (CONTINUED)}
397 \vskip8pt
398 \verbatim{trap.fot}
400 \vfill\end