Update Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo
1 <html>
2 <head>
3 <title>Dscho's blog</title>
4 <meta http-equiv="Content-Type"
5 content="text/html; charset=UTF-8"/>
6 </head>
7 <body style="width:800px;background-image:url(dscho.git?a=blob_plain;hb=832be85c785c80202f17b87db7f063ae57ec2cac;f=paper.jpg);background-repeat:repeat-y;background-attachment:scroll;padding:0px;">
8 <div style="width:610px;margin-left:120px;margin-top:50px;align:left;vertical-align:top;">
9 <h1>Dscho's blog</h1>
10 <div style="position:absolute;top:50px;left:810px;width:400px">
11 <table width=400px bgcolor=#e0e0e0 border=1>
12 <tr><th>Table of contents:</th></tr>
13 <tr><td>
14 <p><ul>
15 <li><a href=#1233107175>28 Jan 2009 </a>
16 <li><a href=#1233102919>28 Jan 2009 Showing off that you're an Alpine user ... priceless!</a>
17 <li><a href=#1233101919>28 Jan 2009 Progress with the interactive rebase preserving merges</a>
18 <li><a href=#1233099894>28 Jan 2009 Another midnight riddle?</a>
19 <li><a href=#1233022809>27 Jan 2009 Fun with calculus after midnight</a>
20 <li><a href=#1232997290>26 Jan 2009 Valgrind takes a loooong time</a>
21 <li><a href=#1232927812>26 Jan 2009 A day full of rebase... and a little valgrind</a>
22 <li><a href=#1232888842>25 Jan 2009 Regular diff with word coloring (as opposed to word diff)</a>
23 <li><a href=#1232828715>24 Jan 2009 Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></a>
24 <li><a href=#1232778113>24 Jan 2009 Thoughts about <i>interactive rebase</i></a>
25 </ul></p>
26 <a href=dscho.git?a=blob_plain;hb=a910b3bb049f5fa34908772ab008a90223a598ac;f=index.html>Older posts</a>
27 </td></tr></table>
28 <br>
29 <div style="text-align:right;">
30 <a href="dscho.git?a=blob_plain;hb=blog;f=blog.rss"
31 title="Subscribe to my RSS feed"
32 class="rss" rel="nofollow"
33 style="background-color:orange;text-decoration:none;color:white;font-family:sans-serif;">RSS</a>
34 </div>
35 <br>
36 <table width=400px bgcolor=#e0e0e0 border=1>
37 <tr><th>About this blog:</th></tr>
38 <tr><td>
39 <p>It is an active <a href=http://repo.or.cz/w/git/dscho.git?a=blob;f=source-1232626236.txt;h=1edde0467a>abuse</a> of <a href=http://repo.or.cz/>repo.or.cz</a>,
40 letting gitweb unpack the objects in the current tip of the branch <i>blog</i>,
41 including the images and the RSS feed.
42 </p><p>
43 Publishing means running a script that collects the posts, turns them into
44 HTML, makes sure all the images are checked in, and pushes the result.
45 </p><p>
46 This blog also serves to grace the world with Dscho's random thoughts on and
47 around Git.
48 </p>
49 </td></tr></table>
50 <br>
51 <table width=400px bgcolor=#e0e0e0 border=1>
52 <tr><th>Links:</th></tr>
53 <tr><td>
54 <ul>
55 <li> <a href=http://git-scm.com/>Git's homepage</a>
56 <li> <a href=http://gitster.livejournal.com/>Junio's blog</a>
57 <li> <a href=http://www.spearce.org/>Shawn's blog</a> seems to be sitting
58 idle ever since he started working for Google...
59 <li> <a href=http://torvalds-family.blogspot.com/>Linus' blog</a> does not
60 talk much about Git...
61 <li> Scott Chacon's <a href=http://whygitisbetterthanx.com/>Why Git is better
62 than X</a> site
63 <li> <a href=http://vilain.net/>The blog of mugwump</a>
64 <li> <a href=http://blogs.gnome.org/newren/>Elijah Newren</a> chose the
65 same path as Cogito, offering an alternative porcelain (an approach
66 that is doomed in my opinion)
67 <li> <a href=http://msysgit.googlecode.com/>The msysGit project</a>, a (mostly)
68 failed experiment to lure the many Windows developers out there to
69 contribute to Open Source for a change.
70 </ul>
71 </td></tr></table>
88 </div>
89 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
90 <a name=1233107175>
91 <h2></h2>
93 <p>
94 </p>
95 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
96 <a name=1233102919>
97 <h2>Showing off that you're an Alpine user ... priceless!</h2>
99 <p>
100 </p><p>
101 So I was in a hurry to send the patches, and sent all the patches as replies
102 to the cover-letter, and therefore typed in <i>rnyn</i> all the time, which is the
103 mantra I need to say to Alpine for <i>Reply</i>, ... include quoted message?
104 <i>No</i>, ... reply to all recipients? <i>Yes</i>, ... use first role?
105 <i>No, use default role</i>.
106 </p><p>
107 That was pretty embarrassing, as it shows everybody that I still do not trust
108 <i>send-email</i>, and rather paste every single patch by hand. Which is rather
109 annoying.
110 </p><p>
111 So I started using format-patch today, to output directly to Alpine's
112 <i>postponed-msgs</i> folder, so that I can do some touchups in the mailer
113 before sending the patch series on its way.
114 </p><p>
115 However, when running format-patch with <i>--thread</i>, it generates Message-ID
116 strings that Alpine does not like, and therefore replaces.
117 </p><p>
118 Oh, well, I'll probably just investigate how the Message-IDs are supposed to
119 look, and then use sed to replace the generated ones with Alpine-friendly ones
120 during the redirection to <i>postponed-msgs</i>.
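</p><p>
A minimal sketch of what that sed step could look like. The function name
<i>rewrite_message_ids</i> and the <i>alpine.</i> prefix are made up for
illustration; the format Alpine actually accepts still has to be looked up, and
folded (multi-line) headers are not handled.
</p>

```shell
#!/bin/sh
# Hypothetical filter: rewrite the IDs in the threading headers that
# format-patch --thread generates. The "alpine." prefix is only a
# placeholder, not the format Alpine really expects.
rewrite_message_ids () {
	sed -E '/^(Message-I[dD]|In-Reply-To|References):/s/<([^@>]+)@([^>]+)>/<alpine.\1@\2>/g'
}

# Example input (made-up IDs): only the threading headers change,
# all other lines pass through untouched.
printf '%s\n' \
	'Message-Id: <1233102919-1234-1-git-send-email-me@example.com>' \
	'In-Reply-To: <cover.1233102919.git.me@example.com>' \
	'Subject: [PATCH 1/2] example <not@an.id>' |
rewrite_message_ids
```

<p>
In the workflow described here, the filter would sit in the pipe, along the lines of
<i>git format-patch --thread --stdout origin/master.. | rewrite_message_ids &gt;&gt; postponed-msgs</i>
(the revision range and the folder path are placeholders).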
121 </p><p>
122 But I already realized that doing it that way is dramatically faster than the
123 workflow I had before.
124 </p><p>
125 And safer: no more <i>rnyn</i>.
126 </p>
127 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Buffalo</h6>
128 <a name=1233101919>
129 <h2>Progress with the interactive rebase preserving merges</h2>
132 </p><p>
133 I thought about the "dropped" commits a bit more, after all, and it is
134 probably a good thing to substitute them with their parent, as Stephen did.
135 </p><p>
136 Imagine that you have merged a branch with two commits. One is in upstream,
137 and you want to rebase (preserving merges) onto upstream. Then you still
138 want to merge the single commit.
139 </p><p>
140 Even better, if there is no commit left, the <i>$REWRITTEN</i> mechanism will
141 substitute the commit onto which we are rebasing, so a merge will just
142 result in a fast-forward!
143 </p><p>
144 Oh, another thing: merge commits should not have a patch id, as they have
145 <u>multiple</u> patches. However, I borked the code a long time ago (9c6efa36)
146 and merges get the patch-id of their diff to the first parent. Which is
147 probably wrong. So I guess I'll have to fix that with my rebase revamp.
148 </p><p>
149 So what about a root commit? If that was dropped, we will just substitute
150 it with the commit onto which we rebase (as a root commit did not really
151 have a parent, but will get the onto-commit as new parent).
152 </p><p>
153 Now that I finally realized that t3410 is so strange because of a bug <u>I</u>
154 introduced, I can finally go about fixing it.
155 </p>
156 <h6>Wednesday, 28th of January, Anno Domini MMIX, at the hour of the Rat</h6>
157 <a name=1233099894>
158 <h2>Another midnight riddle?</h2>
161 </p><p>
162 Okay, here's another riddle: what is the next line?
163 </p><p>
164 <pre>
168 1 1 1 2
169 3 1 1 2
170 2 1 1 2 1 3
172 </pre>
173 </p><p>
174 And when does the line get wider than 10 digits?
175 </p>
176 <h6>Tuesday, 27th of January, Anno Domini MMIX, at the hour of the Tiger</h6>
177 <a name=1233022809>
178 <h2>Fun with calculus after midnight</h2>
181 </p><p>
182 Problem: what is the shortest way of defining a variable consisting of <i>N</i>
183 spaces? I.e. for <i>N=80</i> the result will look something like
184 </p><p>
185 <table
186 border=1 bgcolor=black>
187 <tr><td bgcolor=lightblue colspan=3>
188 <pre> </pre>
189 </td></tr>
190 <tr><td>
191 <table cellspacing=5 border=0
192 style="color:white;">
193 <tr><td>
194 <pre>
195 s=' '
196 s="$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
197 </pre>
198 </td></tr>
199 </table>
200 </td></tr>
201 </table>
202 </p><p>
203 Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
204 simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
205 cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
206 for the name, 1 for the equal sign, 2 for the quotes, and one for the semicolon
207 or newline character (whichever).
208 </p><p>
209 Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
210 be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
211 exactly <i>K</i> spaces and then use that to define the end result (in the example
212 above, <i>K=5</i> and <i>L=20</i>).
213 </p><p>
214 So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
215 </p><p>
216 Simple calculus says that <i>5+K*L>5+K+5+2*L</i> must hold true, or (after some
217 scribbling, via <i>(K-2)*(L-1)>7</i>): <i>L>1+7/(K-2)</i>. Which means that it makes no sense to define
218 a variable with 1 or 2 spaces first, which is kinda obvious (writing '$s'
219 alone would use two characters, so we could write the spaces right away).
220 </p><p>
221 But what about the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
222 sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
223 to be greater than or equal to 5 (<i>N>=20</i>); the next pairs are <i>(5,4)</i>,
224 <i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i> and starting with <i>K=10</i>, any
225 <i>L>1</i> makes sense.
226 </p><p>
227 The second definition can also contain spaces at the end, however, so for any
228 <i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
229 </p><p>
230 Not surprisingly, this leads to exactly the same <i>L>1+7/(K-2)</i> (as we can
231 append the <i>M</i> spaces in the last definition, no matter if we use 1 or
232 2 definitions).
233 </p><p>
234 However, that means that as soon as <i>N>=18</i>, we should use two definitions;
235 prior to that, it makes no sense.
236 </p><p>
237 So for <i>N<18</i>, <i>A(N)=5+N</i>.
238 </p><p>
239 But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
240 In other words, what is <i>A(N)</i> given that we use two definitions?
241 </p><p>
242 That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
243 and with 80 characters, you can define a string of 9900 spaces...
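</p><p>
The numbers above can be double-checked with a small brute-force sketch. It
assumes the cost model of this post with one variable and at most two
definitions, the first of which is always a plain literal (so
<i>A(K)=5+K</i>). The function name <i>shortest_length</i> is made up.
</p>

```shell
#!/bin/sh
# Brute-force sketch (hypothetical helper, not part of any tool): shortest
# script length for N spaces using at most two definitions, where the first
# one is a literal of K spaces (5+K characters) and the second one repeats
# it L times and appends M literal spaces (5+2*L+M characters).
shortest_length () {
	n=$1
	best=$((5 + n))		# a single literal definition
	k=1
	while [ "$k" -le "$n" ]
	do
		l=1
		while [ $((k * l)) -le "$n" ]
		do
			m=$((n - k * l))
			cost=$((5 + k + 5 + 2 * l + m))
			if [ "$cost" -lt "$best" ]
			then
				best=$cost
			fi
			l=$((l + 1))
		done
		k=$((k + 1))
	done
	echo "$best"
}

for count in 17 18 80
do
	echo "N=$count: $(shortest_length "$count") characters"
done
```

<p>
Under these assumptions the sketch reports 22 characters for both <i>N=17</i>
(one definition) and <i>N=18</i> (<i>K=6</i>, <i>L=3</i>), and 36 characters
for <i>N=80</i>, reached for instance with <i>K=10</i> and <i>L=8</i>.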
244 </p>
245 <h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Dog</h6>
246 <a name=1232997290>
247 <h2>Valgrind takes a loooong time</h2>
250 </p><p>
251 Yesterday, I started a run on a fast machine, and it took roughly 5.5
252 hours by the machine's clock.
253 </p><p>
254 And of course, I redirected stdout only... *sigh*
255 </p><p>
256 Which triggered a Google search for how to force redirection of all the output
257 in the test scripts to a file and the terminal at the same time.
258 </p><p>
259 It seems as if that is not easily done. I tried
260 <center><table
261 border=1 bgcolor=black>
262 <tr><td bgcolor=lightblue colspan=3>
263 <pre> </pre>
264 </td></tr>
265 <tr><td>
266 <table cellspacing=5 border=0
267 style="color:white;">
268 <tr><td>
269 <pre>
270 exec >(tee out) 2>&1
271 </pre>
272 </td></tr>
273 </table>
274 </td></tr>
275 </table></center>
276 </p><p>
277 but that did not work: it mumbled something about invalid file handles or some
278 such.
279 </p><p>
280 The only solution I found was:
281 <center><table
282 border=1 bgcolor=black>
283 <tr><td bgcolor=lightblue colspan=3>
284 <pre> </pre>
285 </td></tr>
286 <tr><td>
287 <table cellspacing=5 border=0
288 style="color:white;">
289 <tr><td>
290 <pre>
291 mkfifo pipe
292 tee out < pipe &
293 exec > pipe 2>&1
294 </pre>
295 </td></tr>
296 </table>
297 </td></tr>
298 </table></center>
299 </p><p>
300 That is a problem for parallel execution, though, so I am still looking for a
301 better way to do it.
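</p><p>
One way to sidestep the named pipe, sketched under the assumption that the
redirection may happen from the outside (in a wrapper) rather than inside the
test script itself: pipe the merged output through tee. The helper name
<i>run_logged</i> and the demo function <i>noisy</i> are made up.
</p>

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with stdout and stderr merged, showing
# the output on the terminal and saving a copy in a log file. No named pipe
# is needed, so parallel runs only have to use different log files.
run_logged () {
	log=$1
	shift
	"$@" 2>&1 | tee "$log"
}

# Stand-in for a test script that writes to both streams.
noisy () {
	echo "this goes to stdout"
	echo "this goes to stderr" >&2
}

logfile=$(mktemp)
run_logged "$logfile" noisy
```

<p>
The catch is that the exit status of the pipeline is that of tee, not of the
command. In bash, <i>exec &gt; &gt;(tee out) 2&gt;&amp;1</i> (with a
redirection operator in front of the process substitution) should also work
from inside a script, but that is bash-only.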
302 </p><p>
303 Once I have the output, it is relatively easy to analyze it, as I already
304 made a script which dissects the output into valgrind output and the test
305 case it came from, then groups by common valgrind output and shows the
306 result to the user.
307 </p>
308 <h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Rat</h6>
309 <a name=1232927812>
310 <h2>A day full of rebase... and a little valgrind</h2>
313 </p><p>
314 I think that I am progressing nicely with my rebase -p work, so much so
315 that I will soon be able to use it myself to work on topic branches <u>and</u>
316 rebase all the time without much hassle.
317 </p><p>
318 In other words, I would like to be able to rebase all my topic branches
319 to Junio's <i>next</i> branch whenever that has new commits. With a single
320 rebase.
321 </p><p>
322 And finally, I got the idea of the thing Stephen implemented for dropped
323 commits; however, I am quite sure I do not like it.
324 </p><p>
325 So what are "dropped" commits?
326 </p><p>
327 When you rebase, chances are that the upstream already has applied at
328 least some of your patches. So we filter those out with <i>--cherry-pick</i>.
329 Stephen calls those "dropped" commits.
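</p><p>
To see which commits would count as "dropped", <i>git cherry</i> can help: it
compares patches in a similar way and marks commits whose change is already
upstream with a minus sign. The following self-contained sketch builds a
throwaway repository; all branch and file names in it are made up.
</p>

```shell
#!/bin/sh
# Sketch with made-up names: build a throwaway repository in which upstream
# has already applied one of two topic commits, then ask git cherry which
# commits a rebase would filter out ("-") and which it would keep ("+").
repo=$(mktemp -d) && cd "$repo" && git init -q . || exit 1
git config user.name "Demo" && git config user.email "demo@example.com"

# Create a file named after its content and commit it.
commit_file () {
	echo "$1" >"$1" && git add "$1" && git commit -q -m "add $1"
}

commit_file base
git branch upstream
git checkout -q -b topic
commit_file one
commit_file two

git checkout -q upstream
commit_file other			# unrelated upstream work
git cherry-pick topic~1 >/dev/null	# upstream applies "add one"

git cherry -v upstream topic
```

<p>
In this setup the first topic commit ("add one") is listed with a minus sign,
i.e. it would be dropped, while "add two" is listed with a plus sign.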
330 </p><p>
331 Then he goes on to reinvent the "$REWRITTEN" system: a directory containing
332 the mappings of old commit names to new commit names. That is easily fixed.
333 </p><p>
334 But worse, he substitutes the dropped commits with their <u>parents</u>, instead
335 of substituting them with the corresponding commits in upstream.
336 </p><p>
337 I guess this will be a medium-sized fight on the mailing list, depending
338 on how much energy Stephen wants to put in to defend his strategy.
339 </p><p>
340 Anyway, I finally got to a point where only three of the tests are failing,
341 t3404, t3410 and t3412. Somewhat disappointing is t3404, as its name pretends
342 not to exercise -p at all. Oh well, I guess I'll see what is broken tomorrow.
343 </p><p>
344 Another part of the day was dedicated to the Valgrind patch series, which
345 should give us yet another level of code quality.
346 </p><p>
347 After having confused myself with several diverging/obsolete branches, I did
348 indeed finally manage to send that patch series off. Woohoo.
349 </p>
350 <h6>Sunday, 25th of January, Anno Domini MMIX, at the hour of the Goat</h6>
351 <a name=1232888842>
352 <h2>Regular diff with word coloring (as opposed to word diff)</h2>
355 </p><p>
356 You know, if I were a bit faster with everything I do, I could do so much more!
357 </p><p>
358 For example, Junio's idea that you could keep showing a regular diff, only
359 coloring the words that have been removed/added.
360 </p><p>
361 Just imagine looking at the diff of a long line in LaTeX source code. It
362 should be much nicer to the eye to see the complete removed/added sentences
363 instead of one sentence with colored words in between, disrupting your reading
364 flow.
365 </p><p>
366 Compare these two versions:
367 </p><p>
368 Regular diff with colored words:
369 <blockquote><tt>
370 -This sentence has a <font color=red>tyop</font> in it.<br>
371 +This sentence has a <font color=green>typo</font> in it.<br>
372 </tt></blockquote>
373 </p><p>
374 Word diff:
375 <blockquote><tt>
376 This sentence has a <font color=red>tyop</font><font color=green>typo</font> in it.<br>
377 </tt></blockquote>
378 </p><p>
379 And it should not be hard to do at all!
380 </p><p>
381 In <i>diff_words_show()</i>, we basically get the minus lines as
382 <i>diff_words->minus</i> and the plus lines as <i>diff_words->plus</i>. The
383 function then prepares the word lists and calls the xdiff engine to do all the
384 hard work, analyzing the result from xdiff and printing the lines in
385 <i>fn_out_diff_words_aux()</i>.
386 </p><p>
387 So all that would have to be changed would be to <u>record</u> the positions
388 of the removed/added words instead of outputting them, and at the end printing
389 the minus/plus buffers using the recorded information to color the words.
390 </p><p>
391 This would involve
392 </p><p>
393 <ul>
394 <li>adding two new members holding the offsets in the <i>diff_words</i>
395 struct,
396 <li>having a special handling for that mode in
397 <i>fn_out_diff_words_aux()</i> that appends the offsets and
398 returns,
399 <li>adding a function <i>show_lines_with_colored_words()</i> that
400 outputs a buffer with a given prefix ('-' or '+') and coloring the words at
401 given offsets with a given color,
402 <li>modifying <i>diff_words_show()</i> to call that function for the "special
403 case: only removal" and at the end of the function, and
404 <li> disabling the <i>fwrite()</i> at the end of <i>diff_words_show()</i> for that
405 mode.
406 </ul>
407 </p><p>
408 Of course, the hardest part is to find a nice user interface for that. Maybe
409 <i>--colored-words</i>? &#x263a;
410 </p>
411 <h6>Saturday, 24th of January, Anno Domini MMIX, at the hour of the Pig</h6>
412 <a name=1232828715>
413 <h2>Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></h2>
416 </p><p>
417 As probably everybody agrees, the code to preserve merges is a big mess
418 right now.
419 </p><p>
420 Worse, the whole concept of "pick &lt;merge-sha1&gt;" just does not fly well.
421 </p><p>
422 So I started a <u>major</u> cleanup, which happens to reduce the code very
423 nicely so far.
424 </p><p>
425 It will take a few days to flesh out, I guess, but these are the major
426 ideas of my work:
427 </p><p>
428 <b>pick $sha1</b><br>
429 <blockquote>will only work on non-merges in the future.</blockquote>
430 <b>merge $sha1 [$sha1...] was $sha1 Merge ...</b><br>
431 <blockquote>will merge the given list of commits into the current HEAD. For
432 the user's reference, and to keep track of what was rewritten,
433 the original merge is shown after the keyword "was" (which is not
434 a valid SHA-1, luckily).</blockquote>
435 <b>goto $sha1</b><br>
436 <blockquote>will reset the HEAD to the given commit.</blockquote>
437 <b>$sha1'</b><br>
438 <blockquote>for merge and goto, if a $sha1 ends in a single quote, the
439 rewritten commit is substituted (if there is one).</blockquote>
440 </p><p>
441 Example:
442 </p><p>
443 <pre>
444 A - B - - - E
445   \       /
446     C - D
447 </pre>
448 </p><p>
449 could yield this TODO script:
450 </p><p>
451 <pre>
452 pick A
453 pick C
454 pick D
455 goto A'
456 pick B
457 merge D' was E
458 </pre>
459 </p><p>
460 This should lead to a much more intuitive user experience.
461 </p><p>
462 I am very sorry if somebody actually scripted <i>rebase -i -p</i> (by setting
463 GIT_EDITOR with a script), but I am very certain that this cleanup is
464 absolutely necessary to make <i>rebase -i -p</i> useful.
465 </p>
466 <h6>Saturday, 24th of January, Anno Domini MMIX, at the hour of the Dragon</h6>
467 <a name=1232778113>
468 <h2>Thoughts about <i>interactive rebase</i></h2>
471 </p><p>
472 Somebody mentioned that my <i>my-next</i> branch is a mess, as it mixes all
473 kinds of topics.
474 </p><p>
475 That is undeniably true, however, there is a good reason that I do not
476 have a lot of topic branches: I work on more than just one computer.
477 </p><p>
478 To make sure that I do not lose a commit by mistake, I always <i>rebase -i</i>
479 the <i>my-next</i> branch of the computer I happen to work on on top of the
480 <i>my-next</i> branch I fetch from <a href=http://repo.or.cz>repo.or.cz</a>.
481 </p><p>
482 To rebase a lot of topic branches at the same time seems a bit complicated.
483 But that is actually what the <i>-p</i> option (preserve merges) is all about.
484 </p><p>
485 The only problem is that the code for <i>rebase -i -p</i> has been messed up
486 recently, quite successfully, I might add.
487 </p><p>
488 Worse, some people are pushing for a completely and totally unintuitive syntax.
489 </p><p>
490 So maybe I will start to work on <i>-p</i> again, for my own use (I should learn
491 to heed the principle more: work on things I can use myself).
492 </p><p>
493 My current idea is to implement a "goto" statement that will jump to another
494 commit. To make it easily usable, I will add the semantics that "goto" will
495 always try to go to the <u>rewritten</u> version of the given commit; if the user
496 wanted to have the original commit, she has to paste the unabbreviated commit
497 name.
498 </p><p>
499 The more I think about it, the more I actually like this idea &#x263a;
500 </p><p>
501 Of course, working on this little project means that I will have to cope with
502 that ugly code again. *urgh*
503 </p>
504 </div>
505 </body>
506 </html>