1 <html>
2 <head>
3 <title>Dscho's blog</title>
4 <meta http-equiv="Content-Type"
5 content="text/html; charset=UTF-8"/>
6 </head>
7 <body style="width:800px;background-image:url(dscho.git?a=blob_plain;hb=832be85c785c80202f17b87db7f063ae57ec2cac;f=paper.jpg);background-repeat:repeat-y;background-attachment:scroll;padding:0px;">
8 <div style="width:610px;margin-left:120px;margin-top:50px;align:left;vertical-align:top;">
9 <h1>Dscho's blog</h1>
10 <div style="position:absolute;top:50px;left:810px;width:400px">
11 <table width=400px bgcolor=#e0e0e0 border=1>
12 <tr><th>Table of contents:</th></tr>
13 <tr><td>
14 <p><ul>
15 <li><a href=#1233193467>29 Jan 2009 Interactive stash</a>
16 <li><a href=#1233154567>28 Jan 2009 Splitting topic branches</a>
17 <li><a href=#1233102919>28 Jan 2009 Showing off that you're an Alpine user ... priceless!</a>
18 <li><a href=#1233101919>28 Jan 2009 Progress with the interactive rebase preserving merges</a>
19 <li><a href=#1233099894>28 Jan 2009 Another midnight riddle?</a>
20 <li><a href=#1233022809>27 Jan 2009 Fun with calculus after midnight</a>
21 <li><a href=#1232997290>26 Jan 2009 Valgrind takes a loooong time</a>
22 <li><a href=#1232927812>26 Jan 2009 A day full of rebase... and a little valgrind</a>
23 <li><a href=#1232888842>25 Jan 2009 Regular diff with word coloring (as opposed to word diff)</a>
24 <li><a href=#1232828715>24 Jan 2009 Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></a>
25 </ul></p>
26 <a href=dscho.git?a=blob_plain;hb=a910b3bb049f5fa34908772ab008a90223a598ac;f=index.html>Older posts</a>
27 </td></tr></table>
28 <br>
29 <div style="text-align:right;">
30 <a href="dscho.git?a=blob_plain;hb=blog;f=blog.rss"
31 title="Subscribe to my RSS feed"
32 class="rss" rel="nofollow"
33 style="background-color:orange;text-decoration:none;color:white;font-family:sans-serif;">RSS</a>
34 </div>
35 <br>
36 <table width=400px bgcolor=#e0e0e0 border=1>
37 <tr><th>About this blog:</th></tr>
38 <tr><td>
39 <p>It is an active <a href=http://repo.or.cz/w/git/dscho.git?a=blob;f=source-1232626236.txt;h=1edde0467a>abuse</a> of <a href=http://repo.or.cz/>repo.or.cz</a>,
40 letting gitweb unpack the objects in the current tip of the branch <i>blog</i>,
41 including the images and the RSS feed.
42 </p><p>
43 Publishing means running a script that collects the posts, turns them into
44 HTML, makes sure all the images are checked in, and pushes the result.
45 </p><p>
46 This blog also serves to grace the world with Dscho's random thoughts on and
47 around Git.
48 </p>
49 </td></tr></table>
50 <br>
51 <table width=400px bgcolor=#e0e0e0 border=1>
52 <tr><th>Links:</th></tr>
53 <tr><td>
54 <ul>
55 <li> <a href=http://git-scm.com/>Git's homepage</a>
56 <li> <a href=http://gitster.livejournal.com/>Junio's blog</a>
57 <li> <a href=http://www.spearce.org/>Shawn's blog</a> seems to be sitting
58 idle ever since he started working for Google...
59 <li> <a href=http://torvalds-family.blogspot.com/>Linus' blog</a> does not
60 talk much about Git...
61 <li> Scott Chacon's <a href=http://whygitisbetterthanx.com/>Why Git is better
62 than X</a> site
63 <li> <a href=http://vilain.net/>The blog of mugwump</a>
64 <li> <a href=http://blogs.gnome.org/newren/>Elijah Newren</a> chose the
65 same path as Cogito, offering an alternative porcelain (an approach
66 that is doomed in my opinion)
67 <li> <a href=http://msysgit.googlecode.com/>The msysGit project</a>, a (mostly)
68 failed experiment to lure the many Windows developers out there to
69 contribute to Open Source for a change.
70 </ul>
71 </td></tr></table>
88 </div>
89 <h6>Thursday, 29th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
90 <a name=1233193467>
91 <h2>Interactive stash</h2>
93 <p>
94 </p><p>
95 There is an easy way to split a patch:
96 </p><p>
97 <table
98 border=1 bgcolor=black>
99 <tr><td bgcolor=lightblue colspan=3>
100 <pre> </pre>
101 </td></tr>
102 <tr><td>
103 <table cellspacing=5 border=0
104 style="color:white;">
105 <tr><td>
106 <pre>
107 $ git reset HEAD^
108 $ git add -i
109 $ git commit
110 $ git diff -R HEAD@{1} | git apply --index
111 $ git commit
112 </pre>
113 </td></tr>
114 </table>
115 </td></tr>
116 </table>
117 </p><p>
118 but it glosses over the fact that the first of the two commits does not
119 reflect the state of the working directory at any time.
120 </p><p>
121 So I think something like an interactive <i>stash</i> is needed: a method
122 to specify what you want to keep in the working directory, while the rest
123 gets stashed. The idea would be something like this:
124 </p><p>
125 <ol>
126 <li>Add the desired changes into a temporary index.
127 <li>Put the rest of the changes in another temporary index.
128 <li>Stash the latter index.
129 <li>Synchronize the working directory with the first index.
130 <li>Clean up temporary indices.
131 </ol>
132 </p><p>
133 Or in code:
134 </p><p>
135 <table
136 border=1 bgcolor=black>
137 <tr><td bgcolor=lightblue colspan=3>
138 <pre> </pre>
139 </td></tr>
140 <tr><td>
141 <table cellspacing=5 border=0
142 style="color:white;">
143 <tr><td>
144 <pre>
145 $ cp .git/index .git/interactive-stash-1
146 $ GIT_INDEX_FILE=.git/interactive-stash-1 git add -i
147 $ cp .git/index .git/interactive-stash-2
148 $ GIT_INDEX_FILE=.git/interactive-stash-1 git diff -R |
149 (GIT_INDEX_FILE=.git/interactive-stash-2 git apply --index)
150 $ tree=$(GIT_INDEX_FILE=.git/index git write-tree)
151 $ commit=$(echo Current index | git commit-tree $tree -p HEAD)
152 $ tree=$(GIT_INDEX_FILE=.git/interactive-stash-2 git write-tree)
153 $ commit=$(echo Edited out | git commit-tree $tree -p HEAD -p $commit)
154 $ git update-ref refs/stash $commit
155 $ GIT_INDEX_FILE=.git/interactive-stash-1 git checkout-index -a -f
156 $ rm .git/interactive-stash-1 .git/interactive-stash-2
157 </pre>
158 </td></tr>
159 </table>
160 </td></tr>
161 </table>
162 </p><p>
163 This should probably go into <i>git-stash.sh</i>, maybe even with a switch
164 to start git-gui to do the interactive adding instead of git-add.
165 </p>
166 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Monkey</h6>
167 <a name=1233154567>
168 <h2>Splitting topic branches</h2>
171 <p>
172 One might be put off easily by the overarching use of buzzwords in the
173 description of how <i>Darcs</i> works. I, for one, do not expect an intelligent
174 author when I read <i>Theory of patches</i> and <i>based on quantum physics</i>.
175 </p><p>
176 The true story, however, is much simpler, and is actually not that dumb:
177 Let's call two commits "conflicting" when they contain at least one
178 overlapping change.
179 </p><p>
180 The idea is now: given a list of commits (not a set, as the order is important),
181 sort them into smaller lists such that conflicting commits end up in the same
182 sublist ("topic branch") and the sublists are minimal, i.e. no two
183 non-conflicting commits are in the same sublist.
184 </p><p>
185 The idea has flaws, of course, as you can have a patch changing the code,
186 and another changing the documentation, but splitting a list of commits
187 in that way is a first step to sort out my <i>my-next</i> mess, where I have
188 a linear string of not-necessarily-dependent commits.
189 </p><p>
190 And actually, my whole rebase revamp aimed at the clean-up for my own
191 <i>my-next</i> branch, so I am currently writing a script that can be used
192 as a GIT_EDITOR for git-rebase which implements the Darcs algorithm. Kind of:
193 the result is not implicit, but explicit and can be fixed up later.
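</p><p>
A minimal sketch of the grouping step, under a loud assumption: two commits are treated as "conflicting" as soon as they touch the same file, which is coarser than the overlapping-change test described above. The helper name <i>split_topics</i> and its input format (one "commit file" pair per line, in commit order) are made up for illustration; this is not the GIT_EDITOR script mentioned above.
</p><p>

```shell
# Hypothetical helper: read "commit file" pairs (in commit order) on stdin and
# print "topic-number commit" lines. ASSUMPTION: commits conflict when they
# touch the same file; real overlap detection would look at the hunks.
split_topics () {
	awk '
	# Follow the parent links up to the representative of the group.
	function find(x) {
		while (parent[x] != x)
			x = parent[x]
		return x
	}
	{
		commit = $1
		file = $2
		if (!(commit in parent)) {
			parent[commit] = commit
			order[++n] = commit
		}
		if (file in owner) {
			# Same file seen before: join the two groups.
			a = find(owner[file])
			b = find(commit)
			if (a != b)
				parent[b] = a
		} else {
			owner[file] = commit
		}
	}
	END {
		# Number the groups in order of their first commit.
		for (i = 1; i <= n; i++) {
			root = find(order[i])
			if (!(root in topic))
				topic[root] = ++topics
			print topic[root], order[i]
		}
	}'
}
```

</p><p>
The pairs could come from running <i>git diff-tree --no-commit-id --name-only -r</i> on each commit and prefixing every file name with the commit name; file names containing spaces would need extra care.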
194 </p>
195 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
196 <a name=1233102919>
197 <h2>Showing off that you're an Alpine user ... priceless!</h2>
200 <p>
201 So I was in a hurry to send the patches, and sent all the patches as replies
202 to the cover-letter, and therefore typed in <i>rnyn</i> all the time, which is the
203 mantra I need to say to Alpine for <i>Reply</i>, ... include quoted message?
204 <i>No</i>, ... reply to all recipients? <i>Yes</i>, ... use first role?
205 <i>No, use default role</i>.
206 </p><p>
207 That was pretty embarrassing, as it shows everybody that I still do not trust
208 <i>send-email</i>, and rather paste every single patch by hand. Which is rather
209 annoying.
210 </p><p>
211 So I started using format-patch today, to output directly to Alpine's
212 <i>postponed-msgs</i> folder, so that I can do some touchups in the mailer
213 before sending the patch series on its way.
214 </p><p>
215 However, when running format-patch with <i>--thread</i>, it generates Message-ID
216 strings that Alpine does not like, and therefore replaces.
217 </p><p>
218 Oh, well, I'll probably just investigate how the Message-IDs are supposed to
219 look, and then use sed to replace the generated ones with Alpine-friendly ones
220 during the redirection to <i>postponed-msgs</i>.
221 </p><p>
222 But I already realized that doing it that way is dramatically faster than the
223 workflow I had before.
224 </p><p>
225 And safer: no more <i>rnyn</i>.
226 </p>
227 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
228 <a name=1233101919>
229 <h2>Progress with the interactive rebase preserving merges</h2>
232 <p>
233 I thought about the "dropped" commits a bit more, after all, and it is
234 probably a good thing to substitute them with their parent, as Stephen did.
235 </p><p>
236 Imagine that you have merged a branch with two commits. One is in upstream,
237 and you want to rebase (preserving merges) onto upstream. Then you still
238 want to merge the single commit.
239 </p><p>
240 Even better, if there is no commit left, the <i>$REWRITTEN</i> mechanism will
241 substitute the commit onto which we are rebasing, so a merge will just
242 result in a fast-forward!
243 </p><p>
244 Oh, another thing: merge commits should not have a patch id, as they have
245 <u>multiple</u> patches. However, I borked the code a long time ago (9c6efa36)
246 and merges get the patch-id of their diff to the first parent. Which is
247 probably wrong. So I guess I'll have to fix that with my rebase revamp.
248 </p><p>
249 So what about a root commit? If that was dropped, we will just substitute
250 it with the commit onto which we rebase (as a root commit did not really
251 have a parent, but will get the onto-commit as new parent).
252 </p><p>
253 Now that I finally realized that t3410 is so strange because of a bug <u>I</u>
254 introduced, I can finally go about fixing it.
255 </p>
256 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Rat</h6>
257 <a name=1233099894>
258 <h2>Another midnight riddle?</h2>
261 <p>
262 Okay, here's another riddle: what is the next line?
263 </p><p>
264 <pre>
268 1 1 1 2
269 3 1 1 2
270 2 1 1 2 1 3
272 </pre>
273 </p><p>
274 And when does the line get wider than 10 digits?
275 </p>
276 <h6>Tuesday, 27th of January, Anno Domini MMIX, at the hour of the Tiger</h6>
277 <a name=1233022809>
278 <h2>Fun with calculus after midnight</h2>
281 <p>
282 Problem: what is the shortest way of defining a variable consisting of <i>N</i>
283 spaces? I.e. for <i>N=80</i> the result will look something like
284 </p><p>
285 <table
286 border=1 bgcolor=black>
287 <tr><td bgcolor=lightblue colspan=3>
288 <pre> </pre>
289 </td></tr>
290 <tr><td>
291 <table cellspacing=5 border=0
292 style="color:white;">
293 <tr><td>
294 <pre>
295 s='    '
296 s="$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
297 </pre>
298 </td></tr>
299 </table>
300 </td></tr>
301 </table>
302 </p><p>
303 Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
304 simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
305 cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
306 for the name, 1 for the equal sign, 2 for the quotes, and one for the semicolon
307 or newline character (whichever).
308 </p><p>
309 Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
310 be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
311 exactly <i>K</i> spaces and then use that to define the end result (in the example
312 above, <i>K=4</i> and <i>L=20</i>).
313 </p><p>
314 So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
315 </p><p>
316 Simple calculus says that <i>5+K*L>5+K+5+2*L</i> must hold true, or (after some
317 scribbling): <i>L>1+7/(K-2)</i>. Which means that it makes no sense to define
318 a variable with 1 or 2 spaces first, which is kinda obvious (writing '$s'
319 alone would use two characters, so we could write the spaces right away).
320 </p><p>
321 But what about the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
322 sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
323 to be greater than or equal to 5 (<i>N>=20</i>); the next pairs are <i>(5,4)</i>,
324 <i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i> and starting with <i>K=10</i>, any
325 <i>L>1</i> makes sense.
326 </p><p>
327 The second definition can also contain spaces at the end, however, so for any
328 <i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
329 </p><p>
330 Not surprisingly, this leads to exactly the same <i>L>1+7/(K-2)</i> (as we can
331 append the <i>M</i> spaces in the last definition, no matter if we use 1 or
332 2 definitions).
333 </p><p>
334 However, that means that as soon as <i>N>=18</i>, we should use two definitions;
335 below that, it makes no sense.
336 </p><p>
337 So for <i>N<18</i>, <i>A(N)=5+N</i>.
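</p><p>
The bounds above can be checked by brute force. The following is only a sketch: the function name <i>best_two_defs</i> is made up, it considers at most two definitions, and for each <i>K</i> it only tries the largest <i>L</i> that fits, padding with <i>M = N mod K</i> spaces.
</p><p>

```shell
# Hypothetical helper: print the cheapest length for N spaces using at most
# two definitions, followed by the K that achieves it (0 = one definition).
# Cost model taken from the text: 5 + N for one definition,
# (5 + K) + (5 + 2*L + M) for two, with N = K*L + M.
best_two_defs () {
	n=$1
	best=$((5 + n))
	best_k=0
	k=1
	while [ "$k" -le "$n" ]
	do
		l=$((n / k))
		m=$((n % k))
		cost=$((5 + k + 5 + 2 * l + m))
		# Keep the first K that strictly improves on the best so far.
		if [ "$cost" -lt "$best" ]
		then
			best=$cost
			best_k=$k
		fi
		k=$((k + 1))
	done
	echo "$best $best_k"
}
```

</p><p>
Under these assumptions, <i>best_two_defs 17</i> prints "22 0" (one definition wins), <i>best_two_defs 18</i> prints "22 6", and <i>best_two_defs 80</i> prints "36 10".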
338 </p><p>
339 But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
340 In other words, what is <i>A(N)</i> given that we use two definitions?
341 </p><p>
342 That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
343 and with 80 characters, you can define a string of 9900 spaces...
344 </p>
345 <h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Dog</h6>
346 <a name=1232997290>
347 <h2>Valgrind takes a loooong time</h2>
350 <p>
351 Yesterday, I started a run on a fast machine, and it took roughly 5.5
352 hours by the machine's clock.
353 </p><p>
354 And of course, I redirected stdout only... *sigh*
355 </p><p>
356 Which triggered a Google search for how to force redirection of all the output
357 in the test scripts to a file and the terminal at the same time.
358 </p><p>
359 It seems as if that is not easily done. I tried
360 <center><table
361 border=1 bgcolor=black>
362 <tr><td bgcolor=lightblue colspan=3>
363 <pre> </pre>
364 </td></tr>
365 <tr><td>
366 <table cellspacing=5 border=0
367 style="color:white;">
368 <tr><td>
369 <pre>
370 exec >(tee out) 2>&1
371 </pre>
372 </td></tr>
373 </table>
374 </td></tr>
375 </table></center>
376 </p><p>
377 but that did not work: it mumbled something about invalid file handles or some
378 such.
379 </p><p>
380 The only solution I found was:
381 <center><table
382 border=1 bgcolor=black>
383 <tr><td bgcolor=lightblue colspan=3>
384 <pre> </pre>
385 </td></tr>
386 <tr><td>
387 <table cellspacing=5 border=0
388 style="color:white;">
389 <tr><td>
390 <pre>
391 mkfifo pipe
392 tee out < pipe &
393 exec > pipe 2>&1
394 </pre>
395 </td></tr>
396 </table>
397 </td></tr>
398 </table></center>
399 </p><p>
400 That is a problem for parallel execution, though, so I am still looking for a
401 better way to do it.
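</p><p>
One way that might sidestep the clash between parallel runs is to create the pipe under a private temporary directory instead of a fixed name. This is only a sketch: the helper name <i>tee_all</i> and the variables <i>tee_dir</i> and <i>tee_pid</i> are made up, and it assumes that <i>mktemp -d</i> is available.
</p><p>

```shell
# Hypothetical helper: duplicate all further output (stdout and stderr) of the
# current shell into the file $1, using a FIFO in a private temporary
# directory so that parallel runs do not fight over one pipe name.
tee_all () {
	tee_dir=$(mktemp -d) || return 1
	mkfifo "$tee_dir/pipe" || return 1
	# The reader copies everything to the original stdout and to the log.
	tee "$1" <"$tee_dir/pipe" &
	tee_pid=$!
	# From here on, stdout and stderr both go through the FIFO.
	exec >"$tee_dir/pipe" 2>&1
	# Both ends are open now, so the name is no longer needed.
	rm -rf "$tee_dir"
}
```

</p><p>
A script using this should redirect its output elsewhere at the very end and then <i>wait "$tee_pid"</i>, so that tee has a chance to flush the last lines before the script exits.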
402 </p><p>
403 Once I have the output, it is relatively easy to analyze it, as I already
404 made a script which dissects the output into valgrind output and the test
405 case it came from, then groups by common valgrind output and shows the
406 result to the user.
407 </p>
408 <h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Rat</h6>
409 <a name=1232927812>
410 <h2>A day full of rebase... and a little valgrind</h2>
413 <p>
414 I think that I am progressing nicely with my rebase -p work, so much so
415 that I will soon be able to use it myself to work on topic branches <u>and</u>
416 rebase all the time without much hassle.
417 </p><p>
418 In other words, I would like to be able to rebase all my topic branches
419 to Junio's <i>next</i> branch whenever that has new commits. With a single
420 rebase.
421 </p><p>
422 And finally, I got the idea of the thing Stephen implemented for dropped
423 commits; however, I am quite sure I do not like it.
424 </p><p>
425 So what are "dropped" commits?
426 </p><p>
427 When you rebase, chances are that the upstream already has applied at
428 least some of your patches. So we filter those out with <i>--cherry-pick</i>.
429 Stephen calls those "dropped" commits.
430 </p><p>
431 Then he goes on to reinvent the "$REWRITTEN" system: a directory containing
432 the mappings of old commit names to new commit names. That is easily fixed.
433 </p><p>
434 But worse, he substitutes the dropped commits with their <u>parents</u>, instead
435 of substituting them with the corresponding commits in upstream.
436 </p><p>
437 I guess this will be a medium-sized fight on the mailing list, depending
438 on how much energy Stephen wants to put into defending his strategy.
439 </p><p>
440 Anyway, I finally got to a point where only three of the tests are failing,
441 t3404, t3410 and t3412. Somewhat disappointing is t3404, as its name pretends
442 not to exercise -p at all. Oh well, I guess I'll see what is broken tomorrow.
443 </p><p>
444 Another part of the day was dedicated to the Valgrind patch series, which
445 should give us yet another level of code quality.
446 </p><p>
447 After having confused myself with several diverging/obsolete branches, I did
448 indeed finally manage to send that patch series off. Woohoo.
449 </p>
450 <h6>Sunday, 25th of January, Anno Domini MMIX, at the hour of the Goat</h6>
451 <a name=1232888842>
452 <h2>Regular diff with word coloring (as opposed to word diff)</h2>
455 <p>
456 You know, if I were a bit faster with everything I do, I could do so much more!
457 </p><p>
458 For example, Junio's idea that you could keep showing a regular diff, only
459 coloring the words that have been removed/added.
460 </p><p>
461 Just imagine looking at the diff of a long line in LaTeX source code. It
462 should be much nicer to the eye to see the complete removed/added sentences
463 instead of one sentence with colored words in between, disrupting your reading
464 flow.
465 </p><p>
466 Compare these two versions:
467 </p><p>
468 Regular diff with colored words:
469 <blockquote><tt>
470 -This sentence has a <font color=red>tyop</font> in it.<br>
471 +This sentence has a <font color=green>typo</font> in it.<br>
472 </tt></blockquote>
473 </p><p>
474 Word diff:
475 <blockquote><tt>
476 This sentence has a <font color=red>tyop</font><font color=green>typo</font> in it.<br>
477 </tt></blockquote>
478 </p><p>
479 And it should not be hard to do at all!
480 </p><p>
481 In <i>diff_words_show()</i>, we basically get the minus lines as
482 <i>diff_words->minus</i> and the plus lines as <i>diff_words->plus</i>. The
483 function then prepares the word lists and calls the xdiff engine to do all the
484 hard work, analyzing the result from xdiff and printing the lines in
485 <i>fn_out_diff_words_aux()</i>.
486 </p><p>
487 So all that would have to be changed would be to <u>record</u> the positions
488 of the removed/added words instead of outputting them, and at the end printing
489 the minus/plus buffers using the recorded information to color the words.
490 </p><p>
491 This would involve
492 </p><p>
493 <ul>
494 <li>adding two new members holding the offsets in the <i>diff_words</i>
495 struct,
496 <li>having a special handling for that mode in
497 <i>fn_out_diff_words_aux()</i> that appends the offsets and
498 returns,
499 <li>adding a function <i>show_lines_with_colored_words()</i> that
500 outputs a buffer with a given prefix ('-' or '+') and coloring the words at
501 given offsets with a given color,
502 <li>modifying <i>diff_words_show()</i> to call that function for the "special
503 case: only removal" and at the end of the function, and
504 <li> disabling the <i>fwrite()</i> at the end of <i>diff_words_show()</i> for that
505 mode.
506 </ul>
507 </p><p>
508 Of course, the hardest part is to find a nice user interface for that. Maybe
509 <i>--colored-words</i>? &#x263a;
510 </p>
511 <h6>Saturday, 24th of January, Anno Domini MMIX, at the hour of the Pig</h6>
512 <a name=1232828715>
513 <h2>Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></h2>
516 <p>
517 As probably everybody agrees, the code to preserve merges is a big mess
518 right now.
519 </p><p>
520 Worse, the whole concept of "pick &lt;merge-sha1&gt;" just does not fly well.
521 </p><p>
522 So I started a <u>major</u> cleanup, which happens to reduce the code very
523 nicely so far.
524 </p><p>
525 It will take a few days to flesh out, I guess, but these are the major
526 ideas of my work:
527 </p><p>
528 <b>pick $sha1</b><br>
529 <blockquote>will only work on non-merges in the future.</blockquote>
530 <b>merge $sha1 [$sha1...] was $sha1 Merge ...</b><br>
531 <blockquote>will merge the given list of commits into the current HEAD. For
532 the user's reference, and to keep track of what was rewritten,
533 the original merge is shown after the keyword "was" (which is not
534 a valid SHA-1, luckily).</blockquote>
535 <b>goto $sha1</b><br>
536 <blockquote>will reset the HEAD to the given commit.</blockquote>
537 <b>$sha1'</b><br>
538 <blockquote>for merge and goto, if a $sha1 ends in a single quote, the
539 rewritten commit is substituted (if there is one).</blockquote>
540 </p><p>
541 Example:
542 </p><p>
543 <pre>
544 A - B - - - E
545   \       /
546     C - D
547 </pre>
548 </p><p>
549 could yield this TODO script:
550 </p><p>
551 <pre>
552 pick A
553 pick C
554 pick D
555 goto A'
556 pick B
557 merge D' was E
558 </pre>
559 </p><p>
560 This should lead to a much more intuitive user experience.
561 </p><p>
562 I am very sorry if somebody actually scripted <i>rebase -i -p</i> (by setting
563 GIT_EDITOR with a script), but I am very certain that this cleanup is
564 absolutely necessary to make <i>rebase -i -p</i> useful.
565 </p>
566 </div>
567 </body>
568 </html>