Update Friday, 30th of January, Anno Domini MMIX, at the hour of the Buffalo
1 <?xml version="1.0" encoding="utf-8"?>
2 <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
3 <channel>
4 <title>Dscho's blog</title>
5 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html</link>
6 <atom:link href="http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=blog.rss" rel="self" type="application/rss+xml"/>
7 <description>A few stories told by Dscho</description>
8 <lastBuildDate>Fri, 30 Jan 2009 02:01:26 +0100</lastBuildDate>
9 <language>en-us</language>
10 <item>
11 <title>More valgrind fun</title>
12 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233277286</link>
13 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233277286</guid>
14 <pubDate>Fri, 30 Jan 2009 02:01:26 +0100</pubDate>
15 <description><![CDATA[More valgrind fun
16 </p><p>
17 So I spent quite a number of hours on that funny zlib/valgrind issue. The
18 thing is, zlib people claim that even if their code accesses uninitialized
19 memory, it does not produce erroneous data (by cutting out the results of the
20 uninitialized data, which is cheaper than checking for the end of the buffer
21 in an unaligned manner), so zlib will always be special for valgrind.
22 </p><p>
23 However, the bug I was chasing is funny, and different from said issue. zlib
24 deflates an input buffer to an output buffer that is exactly 58 bytes long.
25 But valgrind claims that the 52nd of those bytes is uninitialized, and <u>only</u>
26 that one.
27 </p><p>
28 But it is not. It must be 0x2c, otherwise zlib refuses to inflate the
29 buffer.
30 </p><p>
31 Now, I went into a debugging frenzy, and finally found out that zlib just
32 passes fine (with the default suppressions because of the "cute" way it
33 uses uninitialized memory), <u>except</u> when it is compiled with UNALIGNED_OK
34 defined.
35 </p><p>
36 Which Ubuntu does, of course. Ubuntu, the biggest forker of all.
37 </p><p>
38 The bad part is that it sounds like a bug in valgrind, and I <u>could</u> imagine
39 that it is an issue of an optimized memcpy() that copies int by int, and
40 that valgrind misses out on the fact that a part of that int is actually
41 <u>not</u> uninitialized.
42 </p><p>
43 But my debugging session's results disagree with that.
44 </p><p>
45 With the help of Julian Seward, the original author of valgrind, I instrumented
46 zlib's source code so that valgrind checks earlier if the byte is initialized
47 or not, to find out where the cause of the issue lies.
48 </p><p>
49 The sad part is that when I added the instrumentation to both the <u>end</u> of
50 the while() loop in compress_block() in zlib's trees.c, and just <u>after</u> the
51 while() loop (whose condition is a plain <i>variable < variable</i> comparison,
52 nothing fancy, certainly not changing any memory), only the <u>latter</u> catches
53 a valgrind error.
54 </p><p>
55 And that is truly strange.]]></description>
56 </item>
57 <item>
58 <title>Interactive stash</title>
59 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233193467</link>
60 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233193467</guid>
61 <pubDate>Thu, 29 Jan 2009 02:44:27 +0100</pubDate>
62 <description><![CDATA[Interactive stash
63 </p><p>
64 There is an easy way to split a patch:
65 </p><p>
66 <table
67 border=1 bgcolor=white>
68 <tr><td bgcolor=lightblue colspan=3>
69 <pre> </pre>
70 </td></tr>
71 <tr><td>
72 <table cellspacing=5 border=0
73 style="color:black;">
74 <tr><td>
75 <pre>
76 $ git reset HEAD^
77 $ git add -i
78 $ git commit
79 $ git diff -R HEAD@{1} | git apply --index
80 $ git commit
81 </pre>
82 </td></tr>
83 </table>
84 </td></tr>
85 </table>
86 </p><p>
87 but it misses out on the fact that the first of the two commits does not
88 reflect the state of the working directory at any time.
89 </p><p>
90 So I think something like an interactive <i>stash</i> is needed. A method
91 to specify what you want to keep in the working directory, the rest should
92 be stashed. The idea would be something like this:
93 </p><p>
94 <ol>
95 <li>Add the desired changes into a temporary index.
96 <li>Put the rest of the changes in another temporary index.
97 <li>Stash the latter index.
98 <li>Synchronize the working directory with the first index.
99 <li>Clean up temporary indices.
100 </ol>
101 </p><p>
102 Or in code:
103 </p><p>
104 <table
105 border=1 bgcolor=white>
106 <tr><td bgcolor=lightblue colspan=3>
107 <pre> </pre>
108 </td></tr>
109 <tr><td>
110 <table cellspacing=5 border=0
111 style="color:black;">
112 <tr><td>
113 <pre>
114 $ cp .git/index .git/interactive-stash-1
115 $ GIT_INDEX_FILE=.git/interactive-stash-1 git add -i
116 $ cp .git/index .git/interactive-stash-2
117 $ GIT_INDEX_FILE=.git/interactive-stash-1 git diff -R |
118 (GIT_INDEX_FILE=.git/interactive-stash-2 git apply --index)
119 $ tree=$(GIT_INDEX_FILE=.git/index git write-tree)
120 $ commit=$(echo Current index | git commit-tree $tree -p HEAD)
121 $ tree=$(GIT_INDEX_FILE=.git/interactive-stash-2 git write-tree)
122 $ commit=$(echo Edited out | git commit-tree $tree -p HEAD -p $commit)
123 $ git update-ref refs/stash $commit
124 $ GIT_INDEX_FILE=.git/interactive-stash-1 git checkout-index -a -f
125 $ rm .git/interactive-stash-1 .git/interactive-stash-2
126 </pre>
127 </td></tr>
128 </table>
129 </td></tr>
130 </table>
131 </p><p>
132 This should probably go into <i>git-stash.sh</i>, maybe even with a switch
133 to start git-gui to do the interactive adding instead of git-add.]]></description>
134 </item>
135 <item>
136 <title>Splitting topic branches</title>
137 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233154567</link>
138 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233154567</guid>
139 <pubDate>Wed, 28 Jan 2009 15:56:07 +0100</pubDate>
140 <description><![CDATA[Splitting topic branches
141 </p><p>
142 One might be put off easily by the overarching use of buzzwords in the
143 description of how <i>Darcs</i> works. I, for one, do not expect an intelligent
144 author when I read <i>Theory of patches</i> and <i>based on quantum physics</i>.
145 </p><p>
146 The true story, however, is much simpler, and is actually not that dumb:
147 Let's call two commits "conflicting" when they contain at least one
148 overlapping change.
149 </p><p>
150 The idea is now: given a list of commits (not a set, as the order is important),
151 sort them into smaller lists such that conflicting commits end up in the same
152 sublist ("topic branch") and the sublists are minimal, i.e. no two
153 non-conflicting commits are in the same sublist.
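</p><p>
As a rough illustration of the "conflicting" test, here is a minimal sketch in
POSIX shell. It is only an approximation: it treats two commits as conflicting
when they touch a common <i>path</i>, which is coarser than the overlapping
changes meant above, and it ignores root and merge commits. The function names
<i>touched_paths</i>, <i>paths_overlap</i> and <i>commits_conflict</i> are made
up for this example.

```shell
#!/bin/sh
# Assumption: "conflicting" is approximated by "touching a common path".
# The post means overlapping changes (hunks), which is a stricter test.

# List the paths a (non-root, non-merge) commit touches, one per line.
touched_paths() {
    git diff-tree --no-commit-id --name-only -r "$1"
}

# Succeed if two newline-separated path lists share at least one entry.
paths_overlap() {
    # Empty lists never overlap (and would confuse grep below).
    [ -n "$1" ] && [ -n "$2" ] || return 1
    # -F: fixed strings, -x: whole-line match, -q: quiet;
    # every line of $2 is used as a separate pattern.
    printf '%s\n' "$1" | grep -Fxq -e "$2"
}

# Succeed if the two given commits conflict in the coarse sense above.
commits_conflict() {
    paths_overlap "$(touched_paths "$1")" "$(touched_paths "$2")"
}
```

Something like <i>commits_conflict HEAD HEAD~1 && echo same topic</i> would
then be the building block for sorting a commit list into sublists.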
154 </p><p>
155 The idea has flaws, of course, as you can have a patch changing the code,
156 and another changing the documentation, but splitting a list of commits
157 in that way is a first step to sort out my <i>my-next</i> mess, where I have
158 a linear perl of not-necessarily-dependent commits.
159 </p><p>
160 And actually, my whole rebase revamp aimed at the clean-up for my own
161 <i>my-next</i> branch, so I am currently writing a script that can be used
162 as a GIT_EDITOR for git-rebase which implements the Darcs algorithm. Kind of:
163 the result is not implicit, but explicit and can be fixed up later.]]></description>
164 </item>
165 <item>
166 <title>Showing off that you're an Alpine user ... priceless!</title>
167 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233102919</link>
168 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233102919</guid>
169 <pubDate>Wed, 28 Jan 2009 01:35:19 +0100</pubDate>
170 <description><![CDATA[Showing off that you're an Alpine user ... priceless!
171 </p><p>
172 So I was in a hurry to send the patches, and sent all the patches as replies
173 to the cover-letter, and therefore typed in <i>rnyn</i> all the time, which is the
174 mantra I need to say to Alpine for <i>Reply</i>, ... include quoted message?
175 <i>No</i>, ... reply to all recipients? <i>Yes</i>, ... use first role?
176 <i>No, use default role</i>.
177 </p><p>
178 That was pretty embarrassing, as it shows everybody that I still do not trust
179 <i>send-email</i>, and rather paste every single patch by hand. Which is rather
180 annoying.
181 </p><p>
182 So I started using format-patch today, to output directly to Alpine's
183 <i>postponed-msgs</i> folder, so that I can do some touchups in the mailer
184 before sending the patch series on its way.
185 </p><p>
186 However, when running format-patch with <i>--thread</i>, it generates Message-ID
187 strings that Alpine does not like, and therefore replaces.
188 </p><p>
189 Oh, well, I'll probably just investigate how the Message-IDs are supposed to
190 look, and then use sed to replace the generated ones with Alpine-friendly ones
191 during the redirection to <i>postponed-msgs</i>.
192 </p><p>
193 But I already realized that doing it that way is dramatically faster than the
194 workflow I had before.
195 </p><p>
196 And safer: no more <i>rnyn</i>.]]></description>
197 </item>
198 <item>
199 <title>Progress with the interactive rebase preserving merges</title>
200 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233101919</link>
201 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233101919</guid>
202 <pubDate>Wed, 28 Jan 2009 01:18:39 +0100</pubDate>
203 <description><![CDATA[Progress with the interactive rebase preserving merges
204 </p><p>
205 I thought about the "dropped" commits a bit more, after all, and it is
206 probably a good thing to substitute them with their parent, as Stephen did.
207 </p><p>
208 Imagine that you have merged a branch with two commits. One is in upstream,
209 and you want to rebase (preserving merges) onto upstream. Then you still
210 want to merge the single commit.
211 </p><p>
212 Even better, if there is no commit left, the <i>$REWRITTEN</i> mechanism will
213 substitute the commit onto which we are rebasing, so a merge will just
214 result in a fast-forward!
215 </p><p>
216 Oh, another thing: merge commits should not have a patch id, as they have
217 <u>multiple</u> patches. However, I borked the code a long time ago (9c6efa36)
218 and merges get the patch-id of their diff to the first parent. Which is
219 probably wrong. So I guess I'll have to fix that with my rebase revamp.
220 </p><p>
221 So what about a root commit? If that was dropped, we will just substitute
222 it with the commit onto which we rebase (as a root commit did not really
223 have a parent, but will get the onto-commit as new parent).
224 </p><p>
225 Now that I finally realized that t3410 is so strange because of a bug <u>I</u>
226 introduced, I can finally go about fixing it.]]></description>
227 </item>
228 <item>
229 <title>Another midnight riddle?</title>
230 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233099894</link>
231 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233099894</guid>
232 <pubDate>Wed, 28 Jan 2009 00:44:54 +0100</pubDate>
233 <description><![CDATA[Another midnight riddle?
234 </p><p>
235 Okay, here's another riddle: what is the next line?
236 </p><p>
237 <pre>
241 1 1 1 2
242 3 1 1 2
243 2 1 1 2 1 3
245 </pre>
246 </p><p>
247 And when does the line get wider than 10 digits?]]></description>
248 </item>
249 <item>
250 <title>Fun with calculus after midnight</title>
251 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233022809</link>
252 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233022809</guid>
253 <pubDate>Tue, 27 Jan 2009 03:20:09 +0100</pubDate>
254 <description><![CDATA[Fun with calculus after midnight
255 </p><p>
256 Problem: what is the shortest way of defining a variable consisting of <i>N</i>
257 spaces? I.e. for <i>N=80</i> the result will look something like
258 </p><p>
259 <table
260 border=1 bgcolor=white>
261 <tr><td bgcolor=lightblue colspan=3>
262 <pre> </pre>
263 </td></tr>
264 <tr><td>
265 <table cellspacing=5 border=0
266 style="color:black;">
267 <tr><td>
268 <pre>
269 s=' '
270 s="$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
271 </pre>
272 </td></tr>
273 </table>
274 </td></tr>
275 </table>
276 </p><p>
277 Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
278 simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
279 cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
280 for the name, 1 for the equal sign, 2 for the quotes, <i>N</i> for the spaces, and 1 for the semicolon
281 or newline character (whichever).
282 </p><p>
283 Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
284 be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
285 exactly <i>K</i> spaces and then use that to define the end result (in the example
286 above, <i>K=5</i> and <i>L=20</i>).
287 </p><p>
288 So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
289 </p><p>
290 Simple calculus says that <i>5+K*L>5+K+5+2*L</i> must hold true, or (after some
291 scribbling): <i>L>1+7/(K-2)</i>. Which means that it makes no sense to define
292 a variable with 1 or 2 spaces first, which is kinda obvious (writing '$s'
293 alone would use two characters, so we could write the spaces right away).
294 </p><p>
295 But what about the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
296 sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
297 to be greater than or equal to 5 (<i>N>=20</i>), the next pairs are <i>(5,4)</i>,
298 <i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i> and starting with <i>K=10</i>, any
299 <i>L>1</i> makes sense.
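</p><p>
These thresholds can be double-checked with a few lines of shell arithmetic.
This is only a sketch: it rewrites <i>L>1+7/(K-2)</i> in integers as
<i>L*(K-2)>K+5</i> and prints the smallest <i>L</i> that satisfies it; the
function name <i>min_l_for_k</i> is made up for this example, and it is only
meaningful for <i>K>2</i>.

```shell
#!/bin/sh
# Smallest integer L with L*(K-2) > K+5, i.e. L > 1 + 7/(K-2).
# The smallest integer strictly greater than x is floor(x) + 1.
min_l_for_k() {
    k=$1
    echo $(( (k + 5) / (k - 2) + 1 ))
}

# Print the threshold for each K, together with the resulting minimal N = K*L.
for k in 3 4 5 6 7 8 9 10; do
    l=$(min_l_for_k "$k")
    echo "K=$k L>=$l N>=$((k * l))"
done
```

The smallest product it prints is for <i>K=6</i>, <i>L=3</i>.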
300 </p><p>
301 The second definition can also contain spaces at the end, however, so for any
302 <i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
303 </p><p>
304 Not surprisingly, this leads to exactly the same <i>L>1+7/(K-2)</i> (as we can
305 append the <i>M</i> spaces in the last definition, no matter if we use 1 or
306 2 definitions).
307 </p><p>
308 However, that means that as soon as <i>N>=18</i>, we should use two definitions;
309 prior to that, it makes no sense.
310 </p><p>
311 So for <i>N<18</i>, <i>A(N)=5+N</i>.
312 </p><p>
313 But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
314 In other words, what is <i>A(N)</i> given that we use two definitions?
315 </p><p>
316 That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
317 and with 80 characters, you can define a string of 9900 spaces...]]></description>
318 </item>
319 <item>
320 <title>Valgrind takes a loooong time</title>
321 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232997290</link>
322 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232997290</guid>
323 <pubDate>Mon, 26 Jan 2009 20:14:50 +0100</pubDate>
324 <description><![CDATA[Valgrind takes a loooong time
325 </p><p>
326 Yesterday, I started a run on a fast machine, and it took roughly 5.5
327 hours by the machine's clock.
328 </p><p>
329 And of course, I redirected stdout only... *sigh*
330 </p><p>
331 Which triggered a Google search for how to force redirection of all the output
332 in the test scripts to a file and the terminal at the same time.
333 </p><p>
334 It seems as if that is not easily done. I tried
335 <center><table
336 border=1 bgcolor=white>
337 <tr><td bgcolor=lightblue colspan=3>
338 <pre> </pre>
339 </td></tr>
340 <tr><td>
341 <table cellspacing=5 border=0
342 style="color:black;">
343 <tr><td>
344 <pre>
345 exec >(tee out) 2>&1
346 </pre>
347 </td></tr>
348 </table>
349 </td></tr>
350 </table></center>
351 </p><p>
352 but that did not work: it mumbled something about invalid file handles or some
353 such.
354 </p><p>
355 The only solution I found was:
356 <center><table
357 border=1 bgcolor=white>
358 <tr><td bgcolor=lightblue colspan=3>
359 <pre> </pre>
360 </td></tr>
361 <tr><td>
362 <table cellspacing=5 border=0
363 style="color:black;">
364 <tr><td>
365 <pre>
366 mkfifo pipe
367 tee out < pipe &
368 exec > pipe 2>&1
369 </pre>
370 </td></tr>
371 </table>
372 </td></tr>
373 </table></center>
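</p><p>
One possible way to make the named-pipe approach safe for parallel runs is to
give every process its own pipe. The following is a minimal sketch, assuming a
POSIX shell and a <i>mktemp</i> that supports <i>-d</i> (not strictly POSIX,
but widely available); the names <i>log_all_output</i>, <i>fifo_dir</i> and
<i>tee_pid</i> are made up for this example.

```shell
#!/bin/sh
# Sketch: send stdout and stderr of the current shell to the terminal
# and to a log file at the same time, using a private FIFO.
log_all_output() {
    logfile=$1
    # A fresh temporary directory per call, so parallel runs never share a pipe.
    fifo_dir=$(mktemp -d) || return 1
    fifo=$fifo_dir/pipe
    mkfifo "$fifo" || return 1
    # tee copies everything it reads from the FIFO to the log file
    # and to the original stdout.
    tee "$logfile" <"$fifo" &
    tee_pid=$!
    # From here on, both stdout and stderr of this shell go into the FIFO.
    # Opening the FIFO for writing blocks until tee has opened it for reading.
    exec >"$fifo" 2>&1
    # Both ends are open now, so the name in the file system can go away.
    rm -r "$fifo_dir"
}

# Usage (in a test script):
#   log_all_output out
#   ... run the tests ...
```

In bash specifically, the process substitution variant needs a second
redirection operator to work: <i>exec > >(tee out) 2>&1</i>.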
374 </p><p>
375 That is a problem for parallel execution, though, so I am still looking for a
376 better way to do it.
377 </p><p>
378 Once I have the output, it is relatively easy to analyze it, as I already
379 made a script which dissects the output into valgrind output and the test
380 case it came from, then groups by common valgrind output and shows the
381 result to the user.]]></description>
382 </item>
383 <item>
384 <title>A day full of rebase... and a little valgrind</title>
385 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232927812</link>
386 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232927812</guid>
387 <pubDate>Mon, 26 Jan 2009 00:56:52 +0100</pubDate>
388 <description><![CDATA[A day full of rebase... and a little valgrind
389 </p><p>
390 I think that I am progressing nicely with my rebase -p work, so much so
391 that I will soon be able to use it myself to work on topic branches <u>and</u>
392 rebase all the time without much hassle.
393 </p><p>
394 In other words, I would like to be able to rebase all my topic branches
395 to Junio's <i>next</i> branch whenever that has new commits. With a single
396 rebase.
397 </p><p>
398 And finally, I got the idea of the thing Stephen implemented for dropped
399 commits; however, I am quite sure I do not like it.
400 </p><p>
401 So what are "dropped" commits?
402 </p><p>
403 When you rebase, chances are that the upstream already has applied at
404 least some of your patches. So we filter those out with <i>--cherry-pick</i>.
405 Stephen calls those "dropped" commits.
406 </p><p>
407 Then he goes on to reinvent the "$REWRITTEN" system: a directory containing
408 the mappings of old commit names to new commit names. That is easily fixed.
409 </p><p>
410 But worse, he substitutes the dropped commits with their <u>parents</u>, instead
411 of substituting them with the corresponding commits in upstream.
412 </p><p>
413 I guess this will be a medium-sized fight on the mailing list, depending
414 on how much energy Stephen wants to put in to defend his strategy.
415 </p><p>
416 Anyway, I finally got to a point where only three of the tests are failing,
417 t3404, t3410 and t3412. Somewhat disappointing is t3404, as its name pretends
418 not to exercise -p at all. Oh well, I guess I'll see what is broken tomorrow.
419 </p><p>
420 Another part of the day was dedicated to the Valgrind patch series, which
421 should give us yet another level of code quality.
422 </p><p>
423 After having confused myself with several diverging/obsolete branches, I did
424 indeed finally manage to send that patch series off. Woohoo.]]></description>
425 </item>
426 <item>
427 <title>Regular diff with word coloring (as opposed to word diff)</title>
428 <link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232888842</link>
429 <guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232888842</guid>
430 <pubDate>Sun, 25 Jan 2009 14:07:22 +0100</pubDate>
431 <description><![CDATA[Regular diff with word coloring (as opposed to word diff)
432 </p><p>
433 You know, if I were a bit faster with everything I do, I could do so much more!
434 </p><p>
435 For example, Junio's idea that you could keep showing a regular diff, only
436 coloring the words that have been removed/added.
437 </p><p>
438 Just imagine looking at the diff of a long line in LaTeX source code. It
439 should be much nicer to the eye to see the complete removed/added sentences
440 instead of one sentence with colored words in between, disrupting your reading
441 flow.
442 </p><p>
443 Compare these two versions:
444 </p><p>
445 Regular diff with colored words:
446 <blockquote><tt>
447 -This sentence has a <font color=red>tyop</font> in it.<br>
448 +This sentence has a <font color=green>typo</font> in it.<br>
449 </tt></blockquote>
450 </p><p>
451 Word diff:
452 <blockquote><tt>
453 This sentence has a <font color=red>tyop</font><font color=green>typo</font> in it.<br>
454 </tt></blockquote>
455 </p><p>
456 And it should not be hard to do at all!
457 </p><p>
458 In <i>diff_words_show()</i>, we basically get the minus lines as
459 <i>diff_words->minus</i> and the plus lines as <i>diff_words->plus</i>. The
460 function then prepares the word lists and calls the xdiff engine to do all the
461 hard work, analyzing the result from xdiff and printing the lines in
462 <i>fn_out_diff_words_aux()</i>.
463 </p><p>
464 So all that would have to be changed would be to <u>record</u> the positions
465 of the removed/added words instead of outputting them, and at the end printing
466 the minus/plus buffers using the recorded information to color the words.
467 </p><p>
468 This would involve
469 </p><p>
470 <ul>
471 <li>adding two new members holding the offsets in the <i>diff_words</i>
472 struct,
473 <li>having a special handling for that mode in
474 <i>fn_out_diff_words_aux()</i> that appends the offsets and
475 returns,
476 <li>adding a function <i>show_lines_with_colored_words()</i> that
477 outputs a buffer with a given prefix ('-' or '+') and coloring the words at
478 given offsets with a given color,
479 <li>modifying <i>diff_words_show()</i> to call that function for the "special
480 case: only removal" and at the end of the function, and
481 <li> disabling the <i>fwrite()</i> at the end of <i>diff_words_show()</i> for that
482 mode.
483 </ul>
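</p><p>
To make the output step concrete, here is a toy shell analogue of what such a
function could print for a single line. It is not the proposed C code, just a
sketch: the name <i>show_line_with_colored_word</i>, its argument order and the
restriction to one word per line are all assumptions made for this example.

```shell
#!/bin/sh
# Print one diff line with a prefix ('-' or '+') and color exactly one
# word, given as a 0-based character offset and a length.
# $1 prefix, $2 ANSI color code, $3 line, $4 offset, $5 length
show_line_with_colored_word() {
    prefix=$1
    color=$2
    line=$3
    start=$4
    length=$5
    # awk's substr() is 1-based, hence the "+ 1".
    printf '%s\n' "$line" |
    awk -v p="$prefix" -v c="$color" -v s="$start" -v n="$length" '{
        printf "%s%s\033[%sm%s\033[m%s\n",
            p, substr($0, 1, s), c, substr($0, s + 1, n), substr($0, s + n + 1)
    }'
}

# The example from above: the word starts at offset 20 and is 4 characters long.
show_line_with_colored_word - 31 'This sentence has a tyop in it.' 20 4
show_line_with_colored_word + 32 'This sentence has a typo in it.' 20 4
```

Here 31 and 32 are the ANSI color codes for red and green; the real
implementation would loop over all recorded offsets of a buffer instead of
handling a single word.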
484 </p><p>
485 Of course, the hardest part is to find a nice user interface for that. Maybe
486 <i>--colored-words</i>? &#x263a;]]></description>
487 </item>
488 </channel>
489 </rss>