  (declare (simple-string s))
  (declare (optimize (speed 3) (safety 0) (debug 0)))
    (dotimes (i (length s))
      (when (eql (aref s i) #\1)
* On X86 I is represented as a tagged integer.
* EQL uses "CMP reg,reg" instead of "CMP reg,im". This causes
  allocation of an extra register and an extra move.
   3: SLOT S!11[EDX] {SB-C::VECTOR-LENGTH 1 7} => t23[EAX]
   4: MOVE t23[EAX] => t24[EBX]
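
For experimenting with this entry, a self-contained function of the same
shape might look like the sketch below. The name COUNT-ONES and the counter C
are assumptions made for illustration; only the declarations and the
DOTIMES/WHEN forms are taken from the fragment above.

```lisp
;;; Hypothetical reconstruction: COUNT-ONES and C are illustrative names,
;;; not taken from the notes above.
(defun count-ones (s)
  (declare (simple-string s))
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((c 0))
    (declare (fixnum c))
    ;; Count the occurrences of the character #\1 in S.
    (dotimes (i (length s))
      (when (eql (aref s i) #\1)
        (incf c)))
    c))

;; To inspect the generated code (the output depends on the
;; implementation and its version):
;; (disassemble 'count-ones)
```

Whether the register-to-register compare still appears has to be checked in
the disassembly of the compiler version at hand.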
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2) (debug 0)))
  (declare (type (simple-array double-float 1) v))
    (declare (type double-float s))
    (dotimes (i (length v))
      (setq s (+ s (aref v i))))
* Python does not combine + with AREF, so it generates an extra move and
* On X86 Python thinks that all FP registers are directly accessible
  and emits costly MOVE ... => FR1.
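
A complete function around the loop above could be sketched as follows. The
name SUM-VECTOR and the initial value of S are assumptions; the declarations
and the loop body follow the fragment.

```lisp
;;; Hypothetical reconstruction: SUM-VECTOR is an illustrative name.
(defun sum-vector (v)
  (declare (optimize (speed 3) (safety 0) (space 2) (debug 0)))
  (declare (type (simple-array double-float 1) v))
  (let ((s 0d0))                        ; assumed initial value
    (declare (type double-float s))
    ;; Accumulate every element of V into S.
    (dotimes (i (length v))
      (setq s (+ s (aref v i))))
    s))

;; (disassemble 'sum-vector) shows whether the addition reads the array
;; element directly or goes through an intermediate register.
```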
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2))
  (let ((v (make-list n)))
    (setq v (make-array n))
* IR1 does not optimize away (MAKE-LIST N).
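
The pattern can be sketched as a pair of functions: one with the dead initial
value, and the hand-written form that removing it would amount to. The
function names, the type of N, and the final (LENGTH V) are assumptions added
so that the sketch is complete.

```lisp
;;; Hypothetical names; only the LET/SETQ shape follows the fragment above.
(defun dead-init (n)
  (declare (optimize (speed 3) (safety 0) (space 2))
           (type (integer 0 1000) n))   ; assumed bound on N
  (let ((v (make-list n)))              ; this list is never read
    (setq v (make-array n))             ; V is overwritten immediately
    (length v)))

;;; The same computation without the unused MAKE-LIST.
(defun no-dead-init (n)
  (declare (optimize (speed 3) (safety 0) (space 2))
           (type (integer 0 1000) n))
  (let ((v (make-array n)))
    (length v)))
```

Both functions return the same value; the first one additionally allocates a
list that is immediately discarded.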
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2))
           (type (simple-array base-char 1) v1 v2))
  (dotimes (i (length v1))
    (setf (aref v2 i) (aref v1 i))))

VOP DATA-VECTOR-SET/SIMPLE-STRING V2!14[EDI] t32[EAX] t30[S2]>t33[CL]
        MOV     #<TN t33[CL]>, #<TN t30[S2]>
        MOV     BYTE PTR [EDI+EAX+1], #<TN t33[CL]>
        MOV     #<TN t35[AL]>, #<TN t33[CL]>
        MOV     #<TN t34[S2]>, #<TN t35[AL]>
* The value of DATA-VECTOR-SET is not used, so there is no need for the
--------------------------------------------------------------------------------
uses generic arithmetic
--------------------------------------------------------------------------------
09:49:05 <jtra> I have found a case in those where suboptimal code is
  generate with nested loops, it might be moderately easy to fix that
http://www.bagley.org/~doug/shootout/bench/nestedloop/nestedloop.cmucl
09:50:30 <jtra> if you add declarations to dotimes, generated code is
  almost optimal, but most inner loops run out of registers and use
  memory location for iteration variable
;;; -*- mode: lisp -*-
;;; http://www.bagley.org/~doug/shootout/
;;; from Friedrich Dominicus
  (let ((n (parse-integer (or (car (last extensions:*command-line-strings*)) "1")))
             (optimize (speed 3) (debug 0) (safety 0)))
    (format t "~A~%" x)))
--------------------------------------------------------------------------------
  (declare (optimize speed (debug 0)))
  (if (< x 0) x (foo (1- x))))

SBCL generates a full call of FOO (but CMUCL does not).
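
The sketch below restates the function under a hypothetical name and adds a
LABELS variant for comparison. The LABELS form is not taken from these notes;
it is a common way to make the recursive call refer to a lexically bound
function, so that a later redefinition of the global name cannot affect it.
Whether a particular compiler version then emits a local call should be
confirmed with DISASSEMBLE.

```lisp
;;; Hypothetical names. COUNT-DOWN mirrors the shape of the fragment above.
(defun count-down (x)
  (declare (optimize speed (debug 0)))
  (if (< x 0) x (count-down (1- x))))

;;; Variant with a local function: REC is lexically bound, so the
;;; recursive call does not go through the global function name.
(defun count-down/local (x)
  (declare (optimize speed (debug 0)))
  (labels ((rec (x)
             (if (< x 0) x (rec (1- x)))))
    (rec x)))

;; Compare the two with:
;; (disassemble 'count-down)
;; (disassemble 'count-down/local)
```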
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (declare (type (double-float 0d0 1d0) d))
  (loop for i fixnum from 1 to 5
        for x1 double-float = (sin d) ;;; !!!
        do (loop for j fixnum from 1 to 4
                 sum x1 double-float)))

Without the marked declaration Python will use a boxed representation for X1.

This is equivalent to
    ;; use of X as DOUBLE-FLOAT
The initial binding has no effect, and without it X is of type
DOUBLE-FLOAT. Unfortunately, IR1 does not optimize away SETs/bindings
that have no effect, and IR2 does not perform type inference.
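
A runnable variant of the loop, with the declaration on X1 kept in place, is
sketched below. The name SUM-SINES and the accumulator TOTAL are assumptions
added so that the result is observable, and OF-TYPE is used because
DOUBLE-FLOAT is not one of the simple type specifiers that standard LOOP
guarantees to accept without it.

```lisp
;;; Hypothetical reconstruction: SUM-SINES and TOTAL are illustrative names.
(defun sum-sines (d)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (declare (type (double-float 0d0 1d0) d))
  (let ((total 0d0))
    (declare (double-float total))
    (loop for i fixnum from 1 to 5
          ;; The declaration on X1 is the one marked in the fragment above.
          for x1 of-type double-float = (sin d)
          do (loop for j fixnum from 1 to 4
                   do (incf total x1)))
    total))
```

Removing the OF-TYPE DOUBLE-FLOAT on X1 and comparing the compiler notes or
the disassembly is one way to see whether the boxed representation is still
chosen.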
--------------------------------------------------------------------------------
#9 "Multi-path constant folding"
  (if (= (cond ((irgh x) 0)
This code could be optimized to
  (cond ((irgh x) :yes)
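
The transformation can be illustrated with a pair of functions that should
agree on every input. Everything beyond the first COND clause is an
assumption: MINUSP and ZEROP stand in for the predicates of the entry, which
are not defined in the text, and the remaining clauses and the :NO result are
made up for the example.

```lisp
;;; Original shape: each COND branch yields a constant that is then
;;; compared against 0. MINUSP and ZEROP are stand-in predicates.
(defun classify (x)
  (if (= (cond ((minusp x) 0)
               ((zerop x) 1)
               (t 2))
         0)
      :yes
      :no))

;;; Folded shape: the comparison is evaluated per branch by hand, so
;;; each branch returns the final result directly.
(defun classify/folded (x)
  (cond ((minusp x) :yes)
        ((zerop x) :no)
        (t :no)))
```

Each branch constant is known, so (= constant 0) can be decided per branch,
which is what the folded version spells out.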
--------------------------------------------------------------------------------
(inverted variant of #9)
  (let ((y (sap-alien x c-string)))
It could be optimized to

(lambda (x) (list x x))

(if Y were used only once, the current compiler would optimize it)
--------------------------------------------------------------------------------