Sverre says the link to the UGFWIINI contest was bothering him
[git/dscho.git] / index.html
blob3c11ee67c6a012b42399a5bbe35a59f921783621
1 <html>
2 <head>
3 <title>Dscho's Git log</title>
4 <meta http-equiv="Content-Type"
5 content="text/html; charset=UTF-8"/>
6 </head>
7 <body style="width:800px;background-image:url(dscho.git?a=blob_plain;hb=832be85c785c80202f17b87db7f063ae57ec2cac;f=paper.jpg);background-repeat:repeat-y;background-attachment:scroll;padding:0px;">
8 <div style="width:610px;margin-left:120px;margin-top:50px;align:left;vertical-align:top;">
9 <h1>Dscho's Git log</h1>
10 <div style="position:absolute;top:50px;left:810px;width=400px">
11 <table width=400px bgcolor=#e0e0e0 border=1>
12 <tr><th>Table of contents:</th></tr>
13 <tr><td>
14 <p><ul>
15 <li><a href=#1233707628>04 Feb 2009 New valgrind series</a>
16 <li><a href=#1233706294>04 Feb 2009 Problems with split-topic-branches.sh</a>
17 <li><a href=#1233277286>30 Jan 2009 More valgrind fun</a>
18 <li><a href=#1233193467>29 Jan 2009 Interactive stash</a>
19 <li><a href=#1233154567>28 Jan 2009 Splitting topic branches</a>
20 <li><a href=#1233102919>28 Jan 2009 Showing off that you're an Alpine user ... priceless!</a>
21 <li><a href=#1233101919>28 Jan 2009 Progress with the interactive rebase preserving merges</a>
22 <li><a href=#1233099894>28 Jan 2009 Another midnight riddle?</a>
23 <li><a href=#1233022809>27 Jan 2009 Fun with calculus after midnight</a>
24 <li><a href=#1232997290>26 Jan 2009 Valgrind takes a loooong time</a>
25 </ul></p>
26 <a href=dscho.git?a=blob_plain;hb=8e403b2b24aab139684bdb61736f4a5e8500b496;f=index.html>Older posts</a>
27 </td></tr></table>
28 <br>
29 <div style="text-align:right;">
30 <a href="dscho.git?a=blob_plain;hb=blog;f=blog.rss"
31 title="Subscribe to my RSS feed"
32 class="rss" rel="nofollow"
33 style="background-color:orange;text-decoration:none;color:white;font-family:sans-serif;">RSS</a>
34 </div>
35 <br>
36 <table width=400px bgcolor=#e0e0e0 border=1>
37 <tr><th>About this blog:</th></tr>
38 <tr><td>
39 <p>It is an active <a href=http://repo.or.cz/w/git/dscho.git?a=blob;f=source-1232626236.txt;h=1edde0467a>abuse</a> of <a href=http://repo.or.cz/>repo.or.cz</a>,
40 letting gitweb unpack the objects in the current tip of the branch <i>blog</i>,
41 including the images and the RSS feed.
42 </p><p>
43 Publishing means running a script that collects the posts, turns them into
44 HTML, makes sure all the images are checked in, and pushes the result.
45 </p><p>
46 This blog also serves to grace the world with Dscho's random thoughts on and
47 around Git.
48 </p>
49 </td></tr></table>
50 <br>
51 <table width=400px bgcolor=#e0e0e0 border=1>
52 <tr><th>Links:</th></tr>
53 <tr><td>
54 <ul>
55 <li> <a href=http://git-scm.com/>Git's homepage</a>
56 <li> <a href=http://gitster.livejournal.com/>Junio's blog</a>
57 <li> <a href=http://www.spearce.org/>Shawn's blog</a> seems to be sitting
58 idle ever since he started working for Google...
59 <li> <a href=http://torvalds-family.blogspot.com/>Linus' blog</a> does not
60 talk much about Git...
61 <li> Scott Chacon's <a href=http://whygitisbetterthanx.com/>Why Git is better
62 than X</a> site
63 <li> <a href=http://vilain.net/>The blog of mugwump</a>
64 <li> <a href=http://blogs.gnome.org/newren/>Elijah Newren</a> chose the
65 same path as Cogito, offering an alternative porcelain (an approach
66 that is doomed in my opinion)
67 <li> <a href=http://msysgit.googlecode.com/>The msysGit project</a>, a (mostly)
68 failed experiment to lure the many Windows developers out there to
69 contribute to Open Source for a change.
70 </ul>
71 </td></tr></table>
72 <br>
73 <table width=400px bgcolor=#e0e0e0 border=1>
74 <tr><th>Google Ads:</th></tr>
75 <tr><td>
76 <script type="text/javascript"><!--
77 google_ad_client = "pub-5106407705643819";
78 /* 300x250, created 1/22/09 */
79 google_ad_slot = "6468207338";
80 google_ad_width = 300;
81 google_ad_height = 250;
82 //-->
83 </script>
84 <script type="text/javascript"
85 src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
86 </script>
87 </td></tr></table>
88 </div>
89 <h6>Wednesday, 4th of February, Anno Domini MMIX, at the hour of the Buffalo</h6>
90 <a name=1233707628>
91 <h2>New valgrind series</h2>
93 <p>
94 </p><p>
95 I spent quite some time cleaning up that patch series, and feel pretty
96 exhausted.
97 </p><p>
98 Granted, the new <i>git rebase -i -p</i> does its job without complaint so far
99 (so much so that I think I'll release a version of my <i>rebase</i> series
100 soonish), but it <u>is</u> a hassle when you have patches that you have a hard
101 time to decide upon the order/commit boundaries.
102 </p><p>
103 For example, I could imagine that the patch making the location of the
104 templates independent of the location of the Git binaries should come
105 <u>before</u> my patch series, and the valgrind specific part should then
106 be squashed into the first valgrind commit.
107 </p><p>
108 Also, it uses two features of valgrind 3.4.0:
109 </p><p>
110 <ul>
111 <li><i>...</i> in the suppression file, and
112 <li><i>--track-origins=yes</i>
113 </ul>
114 </p><p>
115 The latter is actually the reason I am pretty willing to keep the
116 requirement of that valgrind version, as it is really, really useful.
117 </p><p>
118 I guess we will see what happens to it.
119 </p>
120 <h6>Wednesday, 4th of February, Anno Domini MMIX, at the hour of the Buffalo</h6>
121 <a name=1233706294>
122 <h2>Problems with split-topic-branches.sh</h2>
125 </p><p>
126 So my little script that should help me to split my topic branches does
127 not work properly.
128 </p><p>
129 First some background: the idea was to let <i>git blame</i> do the hard work
130 to find overlapping changes, i.e. changes that would conflict when
131 changing the order (or skipping the first change, on which the next builds).
132 </p><p>
133 The first problem with that approach: when lines are <u>removed</u> by one
134 commit, and the next commit touches the same location, <i>git blame</i> does
135 not find that the first commit is required by the second.
136 </p><p>
137 Therefore I introduced a really slow reverse thing which tries to find
138 those commits whose removals survived until the parent of a particular
139 commit, but not further.
140 </p><p>
141 However, it does not work properly. Basically, only context sizes that
142 span the whole files lead to conflict-free topic branches so far.
143 </p><p>
144 As a consequence, I think I'll add an option --sprout to the revision
145 walker which will fake octopus merges (or a series of two-parent merges)
146 whenever it finds a perl of non-merge commits that are theoretically
147 independent, i.e. whose patches apply cleanly.
148 </p>
149 <h6>Friday, 30th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
150 <a name=1233277286>
151 <h2>More valgrind fun</h2>
154 </p><p>
155 So I spent quite a number of hours on that funny zlib/valgrind issue. The
156 thing is, zlib people claim that even if their code accesses uninitialized
157 memory, it does not produce erroneous data (by cutting out the results of the
158 uninitialized data, which is cheaper than checking for the end of the buffer
159 in an unaligned manner), so zlib will always be special for valgrind.
160 </p><p>
161 However, the bug I was chasing is funny, and different from said issue. zlib
162 deflates an input buffer to an output buffer that is exactly 58 bytes long.
163 But valgrind claims that the 52nd of those bytes is uninitialized, and <u>only</u>
164 that one.
165 </p><p>
166 But it is not. It must be 0x2c, otherwise zlib refuses to inflate the
167 buffer.
168 </p><p>
169 Now, I went into a debugging frenzy, and finally found out that zlib just
170 passes fine (with the default suppressions because of the "cute" way it
171 uses uninitialized memory), <u>except</u> when it is compiled with UNALIGNED_OK
172 defined.
173 </p><p>
174 Which Ubuntu does, of course. Ubuntu, the biggest forker of all.
175 </p><p>
176 The bad part is that it sounds like a bug in valgrind, and I <u>could</u> imagine
177 that it is an issue of an optimized memcpy() that copies int by int, and
178 that valgrind misses out on the fact that a part of that int is actually
179 <u>not</u> uninitialized.
180 </p><p>
181 But my debugging session's results disagree with that.
182 </p><p>
183 With the help of Julian Seward, the original author of valgrind, I instrumented
184 zlib's source code so that valgrind checks earlier if the byte is initialized
185 or not, to find out where the reason of the issue lies.
186 </p><p>
187 The sad part is that when I added the instrumentation to both the <u>end</u> of
188 the while() loop in compress_block() in zlib's trees.c, and just <u>after</u> the
189 while() loop (whose condition is a plain <i>variable < variable</i> comparison,
190 nothing fancy, certainly not changing any memory), only the <u>latter</u> catches
191 a valgrind error.
192 </p><p>
193 And that is truly strange.
194 </p>
195 <h6>Thursday, 29th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
196 <a name=1233193467>
197 <h2>Interactive stash</h2>
200 </p><p>
201 There is an easy way to split a patch:
202 </p><p>
203 <table
204 border=1 bgcolor=black>
205 <tr><td bgcolor=lightblue colspan=3>
206 <pre> </pre>
207 </td></tr>
208 <tr><td>
209 <table cellspacing=5 border=0
210 style="color:white;">
211 <tr><td>
212 <pre>
213 $ git reset HEAD^
214 $ git add -i
215 $ git commit
216 $ git diff -R HEAD@{1} | git apply --index
217 $ git commit
218 </pre>
219 </td></tr>
220 </table>
221 </td></tr>
222 </table>
223 </p><p>
224 but it misses out on the fact that the first of both commits does not
225 reflect the state of the working directory at any time.
226 </p><p>
227 So I think something like an interactive <i>stash</i> is needed. A method
228 to specify what you want to keep in the working directory, the rest should
229 be stashed. The idea would be something like this:
230 </p><p>
231 <ol>
232 <li>Add the desired changes into a temporary index.
233 <li>Put the rest of the changes in another temporary index.
234 <li>Stash the latter index.
235 <li>Synchronize the working directory with the first index.
236 <li>Clean up temporary indices.
237 </ol>
238 </p><p>
239 Or in code:
240 </p><p>
241 <table
242 border=1 bgcolor=black>
243 <tr><td bgcolor=lightblue colspan=3>
244 <pre> </pre>
245 </td></tr>
246 <tr><td>
247 <table cellspacing=5 border=0
248 style="color:white;">
249 <tr><td>
250 <pre>
251 $ cp .git/index .git/interactive-stash-1
252 $ GIT_INDEX_FILE=.git/interactive-stash-1 git add -i
253 $ cp .git/index .git/interactive-stash-2
254 $ GIT_INDEX_FILE=.git/interactive-stash-1 git diff -R |
255 (GIT_INDEX_FILE=.git/interactive-stash-2 git apply--index)
256 $ tree=$(GIT_INDEX_FILE=.git/index git write-tree)
257 $ commit=$(echo Current index | git commit-tree $tree -p HEAD)
258 $ tree=$(GIT_INDEX_FILE=.git/interactive-stash-2 git write-tree)
259 $ commit=$(echo Edited out | git commit-tree $tree -p HEAD -p $commit)
260 $ git update-ref refs/stash $commit
261 $ GIT_INDEX_FILE=.git/interactive-stash-1 git checkout-index -a -f
262 $ rm .git/interactive-stash-1 .git/interactive-stash-2
263 </pre>
264 </td></tr>
265 </table>
266 </td></tr>
267 </table>
268 </p><p>
269 This should probably go into <i>git-stash.sh</i>, maybe even with a switch
270 to start git-gui to do the interactive adding instead of git-add.
271 </p>
272 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Monkey</h6>
273 <a name=1233154567>
274 <h2>Splitting topic branches</h2>
277 </p><p>
278 One might be put off easily by the overarching use of buzzwords in the
279 description of how <i>Darcs</i> works. I, for one, do not expect an intelligent
280 author when I read <i>Theory of patches</i> and <i>based on quantum physics</i>.
281 </p><p>
282 The true story, however, is much simpler, and is actually not that dumb:
283 Let's call two commits "conflicting" when they contain at least one
284 overlapping change.
285 </p><p>
286 The idea is now: Given a list of commits (not a set, as the order is important),
287 to sort them into smaller lists such that conflicting commits are in the
288 sublists ("topic branches") and the sublists are minimal, i.e. no two
289 non-conflicting commits are in the same sublist.
290 </p><p>
291 The idea has flaws, of course, as you can have a patch changing the code,
292 and another changing the documentation, but splitting a list of commits
293 in that way is a first step to sort out my <i>my-next</i> mess, where I have
294 a linear perl of not-necessarily-dependent commits.
295 </p><p>
296 And actually, my whole rebase revamp aimed at the clean-up for my own
297 <i>my-next</i> branch, so I am currently writing a script that can be used
298 as a GIT_EDITOR for git-rebase which implements the Darcs algorithm. Kind of:
299 the result is not implicit, but explicit and can be fixed up later.
300 </p>
301 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
302 <a name=1233102919>
303 <h2>Showing off that you're an Alpine user ... priceless!</h2>
306 </p><p>
307 So I was in a hurry to send the patches, and sent all the patches as replies
308 to the cover-letter, and therefore typed in <i>rnyn</i> all the time, which is the
309 mantra I need to say to Alpine for <i>Reply</i>, ... include quoted message?
310 <i>No</i>, ... reply to all recipients? <i>Yes</i>, ... use first role?
311 <i>No, use default role</i>.
312 </p><p>
313 That was pretty embarassing, as it shows everybody that I still do not trust
314 <i>send-email</i>, and rather paste every single patch by hand. Which is rather
315 annoying.
316 </p><p>
317 So I started using format-patch today, to output directly to Alpine's
318 <i>postponed-msgs</i> folder, so that I can do some touchups in the mailer
319 before sending the patch series on its way.
320 </p><p>
321 However, when running format-patch with <i>--thread</i>, it generates Message-ID
322 strings that Alpine does not like, and therefore replaces.
323 </p><p>
324 Oh, well, I'll probably just investigate how the Message-IDs are supposed to
325 look, and then use sed to rewrite the generated ones by Alpine-friendly ones
326 during the redirection to <i>postponed-msgs</i>.
327 </p><p>
328 But I alread realized that doing it that way is dramatically faster than the
329 workflow I had before.
330 </p><p>
331 And safer: no more <i>rnyn</i>.
332 </p>
333 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
334 <a name=1233101919>
335 <h2>Progress with the interactive rebase preserving merges</h2>
338 </p><p>
339 I thought about the "dropped" commits a bit more, after all, and it is
340 probably a good thing to substitute them by their parent, as Stephen did it.
341 </p><p>
342 Imagine that you have merged a branch with two commits. One is in upstream,
343 and you want to rebase (preserving merges) onto upstream. Then you still
344 want to merge the single commit.
345 </p><p>
346 Even better, if there is no commit left, the <i>$REWRITTEN</i> mechanism will
347 substitute the commit onto which we are rebasing, so a merge will just
348 result in a fast-forward!
349 </p><p>
350 Oh, another thing: merge commits should not have a patch id, as they have
351 <u>multiple</u> patches. However, I borked the code long time ago (9c6efa36)
352 and merges get the patch-id of their diff to the first parent. Which is
353 probably wrong. So I guess I'll have to fix that with my rebase revamp.
354 </p><p>
355 So what about a root commit? If that was dropped, we will just substitute
356 it with the commit onto which we rebase (as a root commit did not really
357 have a parent, but will get the onto-commit as new parent)..
358 </p><p>
359 Now that I finally realized that t3410 is so strange because of a bug <u>I</u>
360 introduced, I can finally go about fixing it.
361 </p>
362 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Rat</h6>
363 <a name=1233099894>
364 <h2>Another midnight riddle?</h2>
367 </p><p>
368 Okay, here's another riddle: what is the next line?
369 </p><p>
370 <pre>
374 1 1 1 2
375 3 1 1 2
376 2 1 1 2 1 3
378 </pre>
379 </p><p>
380 And when does the line get wider than 10 digits?
381 </p>
382 <h6>Tuesday, 27th of January, Anno Domini MMIX, at the hour of the Tiger</h6>
383 <a name=1233022809>
384 <h2>Fun with calculus after midnight</h2>
387 </p><p>
388 Problem: what is the shortest way of defining a variable consisting of <i>N</i>
389 spaces? I.e. for <i>N=80</i> the result will look something like
390 </p><p>
391 <table
392 border=1 bgcolor=black>
393 <tr><td bgcolor=lightblue colspan=3>
394 <pre> </pre>
395 </td></tr>
396 <tr><td>
397 <table cellspacing=5 border=0
398 style="color:white;">
399 <tr><td>
400 <pre>
401 s=' '
402 s="$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
403 </pre>
404 </td></tr>
405 </table>
406 </td></tr>
407 </table>
408 </p><p>
409 Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
410 simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
411 cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
412 for the name, 1 for the equal sign, 2 for the quotes, and one for the semicolon
413 or newline character (whichever).
414 </p><p>
415 Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
416 be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
417 exactly <i>K</i> spaces and then use that to define the end result (in the example
418 above, <i>K=5</i> and <i>L=20</i>).
419 </p><p>
420 So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
421 </p><p>
422 Simple calculus says that <i>5+K*L>5+K+5+2*L</i> must hold true, or (after some
423 scribbling): <i>L>1+7/(K-2)</i>. Which means that it makes no sense to define
424 a variable with 1 or 2 spaces first, which is kinda obvious (writing '$s'
425 alone would use two characters, so we could write the spaces right away).
426 </p><p>
427 But what for the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
428 sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
429 to be greater or equal to 5 (<i>N>=20</i>), the next pairs are <i>(5,4)</i>,
430 <i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i> and starting with <i>K=10</i>, any
431 <i>L>1</i> makes sense.
432 </p><p>
433 The second definition can also contain spaces at the end, however, so for any
434 <i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
435 </p><p>
436 Not surprisingly, this leads to exactly the same <i>L>1+7/(K-2)</i> (as we can
437 append the <i>M</i> spaces in the last definition, no matter if we use 1 or
438 2 definitions).
439 </p><p>
440 However, that means that as soon as <i>N>=18</i>, we should use two definitions,
441 prior to that, it makes no sense.
442 </p><p>
443 So for <i>N<18</i>, <i>A(N)=5+N</i>.
444 </p><p>
445 But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
446 In other words, what is <i>A(N)</i> given that we use two definitions?
447 </p><p>
448 That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
449 and with 80 characters, you can define a string of 9900 spaces...
450 </p>
451 <h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Dog</h6>
452 <a name=1232997290>
453 <h2>Valgrind takes a loooong time</h2>
456 </p><p>
457 Yesterday, I started a run on a fast machine, and it took roughly 5.5
458 hours by the machine's clock.
459 </p><p>
460 And of course, I redirected stdout only... *sigh*
461 </p><p>
462 Which triggered a Google search how to force redirection of all the output
463 in the test scripts to a file and the terminal at the same time.
464 </p><p>
465 It seems as if that is not easily done. I tried
466 <center><table
467 border=1 bgcolor=black>
468 <tr><td bgcolor=lightblue colspan=3>
469 <pre> </pre>
470 </td></tr>
471 <tr><td>
472 <table cellspacing=5 border=0
473 style="color:white;">
474 <tr><td>
475 <pre>
476 exec >(tee out) 2>&1
477 </pre>
478 </td></tr>
479 </table>
480 </td></tr>
481 </table></center>
482 </p><p>
483 but that did not work: it mumbled something about invalid file handles or some
484 such.
485 </p><p>
486 The only solution I found was:
487 <center><table
488 border=1 bgcolor=black>
489 <tr><td bgcolor=lightblue colspan=3>
490 <pre> </pre>
491 </td></tr>
492 <tr><td>
493 <table cellspacing=5 border=0
494 style="color:white;">
495 <tr><td>
496 <pre>
497 mkpipe pipe
498 tee out < pipe &
499 exec > pipe 2>&1
500 </pre>
501 </td></tr>
502 </table>
503 </td></tr>
504 </table></center>
505 </p><p>
506 That is a problem for parallel execution, though, so I am still looking for a
507 better way to do it.
508 </p><p>
509 Once I have the output, it is relatively easy to analyze it, as I already
510 made a script which disects the output into valgrind output and the test
511 case it came from, then groups by common valgrind output and shows the
512 result to the user.
513 </p>
514 </div>
515 </body>
516 </html>