<title>Dscho's blog</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<body style="width:800px;background-image:url(dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=paper.jpg);background-repeat:repeat-y;background-attachment:scroll;padding:0px;">
<div style="width:610px;margin-left:120px;margin-top:50px;align:left;vertical-align:top;">
<div style="position:absolute;top:50px;left:810px;width:400px">
<table width=400px bgcolor=#e0e0e0 border=1>
<tr><th>Table of contents:</th></tr>
<li><a href=#1233022809>27 Jan 2009 Fun with calculus after midnight</a>
<li><a href=#1232997290>26 Jan 2009 Valgrind takes a loooong time</a>
<li><a href=#1232927812>26 Jan 2009 A day full of rebase... and a little valgrind</a>
<li><a href=#1232888842>25 Jan 2009 Regular diff with word coloring (as opposed to word diff)</a>
<li><a href=#1232828715>24 Jan 2009 Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></a>
<li><a href=#1232778113>24 Jan 2009 Thoughts about <i>interactive rebase</i></a>
<li><a href=#1232745071>23 Jan 2009 Git Logos</a>
<li><a href=#1232742582>23 Jan 2009 How to deal with files that are not source code when merging</a>
<li><a href=#1232626236>22 Jan 2009 The UGFWIINI contest</a>
<li><a href=#1232611542>22 Jan 2009 Top-posting</a>
<a href=dscho.git?a=blob_plain;hb=37f819e4ce4788efd55091e9e5d0c562f023c096;f=index.html>Older posts</a>
<div style="text-align:right;">
<a href="dscho.git?a=blob_plain;hb=blog;f=blog.rss"
title="Subscribe to my RSS feed"
class="rss" rel="nofollow"
style="background-color:orange;text-decoration:none;color:white;font-family:sans-serif;">RSS</a>
<table width=400px bgcolor=#e0e0e0 border=1>
<tr><th>About this blog:</th></tr>
<p>It is an active <a href=http://repo.or.cz/w/git/dscho.git?a=blob;f=source-1232626236.txt;h=1edde0467a>abuse</a> of <a href=http://repo.or.cz/>repo.or.cz</a>,
letting gitweb unpack the objects in the current tip of the branch <i>blog</i>,
including the images and the RSS feed.

Publishing means running a script that collects the posts, turns them into
HTML, makes sure all the images are checked in, and pushes the result.

This blog also serves to grace the world with Dscho's random thoughts on and
<table width=400px bgcolor=#e0e0e0 border=1>
<tr><th>Links:</th></tr>
<li> <a href=http://git-scm.com/>Git's homepage</a>
<li> <a href=http://gitster.livejournal.com/>Junio's blog</a>
<li> <a href=http://www.spearce.org/>Shawn's blog</a> seems to be sitting
idle ever since he started working for Google...
<li> <a href=http://torvalds-family.blogspot.com/>Linus' blog</a> does not
talk much about Git...
<li> Scott Chacon's <a href=http://whygitisbetterthanx.com/>Why Git is better
<li> <a href=http://vilain.net/>The blog of mugwump</a>
<li> <a href=http://blogs.gnome.org/newren/>Elijah Newren</a> chose the
same path as Cogito, offering an alternative porcelain (an approach
that is doomed in my opinion)
<li> <a href=http://msysgit.googlecode.com/>The msysGit project</a>, a (mostly)
failed experiment to lure the many Windows developers out there to
contribute to Open Source for a change.
<h6>Tuesday, 27th of January, Anno Domini MMIX, at the hour of the Tiger</h6>
<h2>Fun with calculus after midnight</h2>
Problem: what is the shortest way of defining a variable consisting of <i>N</i>
spaces? I.e. for <i>N=80</i> the result will look something like
99 border=
1 bgcolor=black
>
100 <tr><td bgcolor=lightblue colspan=
3>
104 <table cellspacing=
5 border=
0
105 style=
"color:white;">
109 s=
"$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
for the name, 1 for the equal sign, 2 for the quotes, and 1 for the semicolon
or newline character (whichever), plus the <i>N</i> spaces themselves.
Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
exactly <i>K</i> spaces and then use that to define the end result, with each
'$s' costing 2 characters (in the example above, <i>K=5</i> and <i>L=20</i>).
So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
Simple calculus says that <i>5+K*L&gt;5+K+5+2*L</i> must hold true, or (after some
scribbling, via <i>(K-2)*(L-1)&gt;7</i>): <i>L&gt;1+7/(K-2)</i> for <i>K&gt;2</i>. Which means
that it makes no sense to define a variable with 1 or 2 spaces first, which is
kinda obvious (writing '$s' alone would use two characters, so we could write
the spaces right away).
But what about the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
to be greater than or equal to 5 (<i>N&gt;=20</i>); the next pairs are <i>(5,4)</i>,
<i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i>, and starting with <i>K=10</i>, any
<i>L&gt;1</i> makes sense.
The second definition can also contain spaces at the end, however, so for any
<i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
Not surprisingly, this leads to exactly the same <i>L&gt;1+7/(K-2)</i> (as we can
append the <i>M</i> spaces in the last definition, no matter if we use 1 or
However, that means that as soon as <i>N&gt;=18</i>, we should use two definitions
(the smallest product among the pairs above is <i>6*3=18</i>); prior to that, it
makes no sense.
So for <i>N&lt;18</i>, <i>A(N)=5+N</i>.
But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
In other words, what is <i>A(N)</i> given that we use two definitions?
That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
and with 80 characters, you can define a string of 9900 spaces...
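The bounds above are easy to check by brute force. The following is only a
sketch under this post's own cost model (one variable, at most two definitions,
5 characters of overhead per definition, 2 characters per '$s'); the function
name <i>min_chars</i> is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: smallest character count for N spaces when using
# either one definition (5+N) or two definitions (5+K, then 5+2*L+M),
# where N = K*L + M.
min_chars() {
    n=$1
    best=$((5 + n))                 # one definition: s="<N spaces>";
    k=1
    while [ "$k" -le "$n" ]; do
        l=1
        while [ $((k * l)) -le "$n" ]; do
            m=$((n - k * l))        # spaces written literally at the end
            cost=$((5 + k + 5 + 2 * l + m))
            if [ "$cost" -lt "$best" ]; then
                best=$cost
            fi
            l=$((l + 1))
        done
        k=$((k + 1))
    done
    echo "$best"
}

min_chars 17    # one definition is still the shortest here
min_chars 80    # two definitions win
```

For <i>N=80</i> this reports 36, reached for instance with <i>K=16</i> and <i>L=5</i>,
which agrees with the teaser; for <i>N=17</i> it reports 22, i.e. <i>5+N</i>.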
<h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Dog</h6>
<h2>Valgrind takes a loooong time</h2>
Yesterday, I started a run on a fast machine, and it took roughly 5.5
hours by the machine's clock.

And of course, I redirected stdout only... *sigh*

Which triggered a Google search on how to force redirection of all the output
in the test scripts to a file and the terminal at the same time.
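When the redirection can be done from outside the script, a common shell idiom
is to merge stderr into stdout and pipe the result through <i>tee</i>. A minimal
sketch; the helper name <i>run_logged</i> and the file name in the example are
placeholders, and this is not the approach from the test scripts themselves.

```shell
# Hypothetical helper: run a command, showing stdout and stderr on the
# terminal while also saving both to a log file.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Example (placeholder names):
#   run_logged valgrind.out make test
```

Note that the exit status seen by the caller is that of <i>tee</i>, not of the
command, so a failing run has to be detected some other way. Doing the same
from inside the test scripts is another matter.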
It seems as if that is not easily done. I tried
174 border=
1 bgcolor=black
>
175 <tr><td bgcolor=lightblue colspan=
3>
179 <table cellspacing=
5 border=
0
180 style=
"color:white;">
but that did not work: it mumbled something about invalid file handles or some

The only solution I found was:
195 border=
1 bgcolor=black
>
196 <tr><td bgcolor=lightblue colspan=
3>
200 <table cellspacing=
5 border=
0
201 style=
"color:white;">
That is a problem for parallel execution, though, so I am still looking for a

Once I have the output, it is relatively easy to analyze it, as I already
made a script which dissects the output into valgrind output and the test
case it came from, then groups by common valgrind output and shows the
<h6>Monday, 26th of January, Anno Domini MMIX, at the hour of the Rat</h6>
<h2>A day full of rebase... and a little valgrind</h2>
I think that I am progressing nicely with my rebase -p work, so much so
that I will soon be able to use it myself to work on topic branches <u>and</u>
rebase all the time without much hassle.

In other words, I would like to be able to rebase all my topic branches
to Junio's <i>next</i> branch whenever that has new commits. With a single

And finally, I got the idea of the thing Stephen implemented for dropped
commits; however, I am quite sure I do not like it.

So what are "dropped" commits?

When you rebase, chances are that the upstream already has applied at
least some of your patches. So we filter those out with <i>--cherry-pick</i>.
Stephen calls those "dropped" commits.

Then he goes on to reinvent the "$REWRITTEN" system: a directory containing
the mappings of old commit names to new commit names. That is easily fixed.

But worse, he substitutes the dropped commits with their <u>parents</u>, instead
of substituting them with the corresponding commits in upstream.

I guess this will be a medium-sized fight on the mailing list, depending
on how much energy Stephen wants to put into defending his strategy.

Anyway, I finally got to a point where only three of the tests are failing,
t3404, t3410 and t3412. Somewhat disappointing is t3404, as its name pretends
not to exercise -p at all. Oh well, I guess I'll see what is broken tomorrow.

Another part of the day was dedicated to the Valgrind patch series, which
should give us yet another level of code quality.

After having confused myself with several diverging/obsolete branches, I did
indeed finally manage to send that patch series off. Woohoo.
<h6>Sunday, 25th of January, Anno Domini MMIX, at the hour of the Goat</h6>
<h2>Regular diff with word coloring (as opposed to word diff)</h2>
You know, if I were a bit faster with everything I do, I could do so much more!

For example, Junio's idea that you could keep showing a regular diff, only
coloring the words that have been removed/added.

Just imagine looking at the diff of a long line in LaTeX source code. It
should be much nicer to the eye to see the complete removed/added sentences
instead of one sentence with colored words in between, disrupting your read

Compare these two versions:

Regular diff with colored words:
-This sentence has a <font color=red>tyop</font> in it.<br>
+This sentence has a <font color=green>typo</font> in it.<br>
This sentence has a <font color=red>tyop</font><font color=green>typo</font> in it.<br>
And it should not be hard to do at all!

In <i>diff_words_show()</i>, we basically get the minus lines as
<i>diff_words-&gt;minus</i> and the plus lines as <i>diff_words-&gt;plus</i>. The
function then prepares the word lists and calls the xdiff engine to do all the
hard work, analyzing the result from xdiff and printing the lines in
<i>fn_out_diff_words_aux()</i>.

So all that would have to be changed would be to <u>record</u> the positions
of the removed/added words instead of outputting them, and at the end printing
the minus/plus buffers using the recorded information to color the words.
<li>adding two new members holding the offsets in the <i>diff_words</i>
<li>having a special handling for that mode in
<i>fn_out_diff_words_aux()</i> that appends the offsets and
<li>adding a function <i>show_lines_with_colored_words()</i> that
outputs a buffer with a given prefix ('-' or '+') and coloring the words at
given offsets with a given color,
<li>modifying <i>diff_words_show()</i> to call that function for the "special
case: only removal" and at the end of the function, and
<li> disabling the <i>fwrite()</i> at the end of <i>diff_words_show()</i> for that
Of course, the hardest part is to find a nice user interface for that. Maybe
<i>--colored-words</i>? ☺
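The record-then-print plan described in this post can be illustrated outside
of Git's C code. This is only a shell sketch of the printing half, not the
proposed patch; <i>show_colored</i>, its argument order, and the use of the ANSI
color codes 31 (red) and 32 (green) are assumptions made for the example.

```shell
# Hypothetical helper: print PREFIX followed by LINE, coloring LEN characters
# that start right after the first OFFSET characters, using ANSI code COLOR.
show_colored() {
    prefix=$1 line=$2 offset=$3 len=$4 color=$5
    printf '%s\n' "$line" |
    awk -v p="$prefix" -v o="$offset" -v n="$len" -v c="$color" '{
        printf "%s%s\033[%sm%s\033[m%s\n", p, substr($0, 1, o), c,
            substr($0, o + 1, n), substr($0, o + n + 1)
    }'
}

# The word starts after 20 characters and is 4 characters long.
show_colored - "This sentence has a tyop in it." 20 4 31
show_colored + "This sentence has a typo in it." 20 4 32
```

On a color terminal, both lines are printed in full, with only "tyop" in red
and "typo" in green, like the regular diff with colored words shown above.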
<h6>Saturday, 24th of January, Anno Domini MMIX, at the hour of the Pig</h6>
<h2>Ideas for a major revamp of the <i>--preserve-merges</i> handling in <i>git rebase</i></h2>
As probably everybody agrees, the code to preserve merges is a big mess

Worse, the whole concept of "pick &lt;merge-sha1&gt;" just does not fly well.

So I started a <u>major</u> cleanup, which happens to reduce the code very

It will take a few days to flesh out, I guess, but these are the major
<b>pick $sha1</b><br>
<blockquote>will only work on non-merges in the future.</blockquote>
<b>merge $sha1 [$sha1...] was $sha1 Merge ...</b><br>
<blockquote>will merge the given list of commits into the current HEAD; for
the user's reference and to keep up-to-date what was rewritten,
the original merge is shown after the keyword "was" (which is not
a valid SHA-1, luckily).</blockquote>
<b>goto $sha1</b><br>
<blockquote>will reset the HEAD to the given commit.</blockquote>
<blockquote>for merge and goto, if a $sha1 ends in a single quote, the
rewritten commit is substituted (if there is one).</blockquote>
could yield this TODO script:

This should lead to a much more intuitive user experience.

I am very sorry if somebody actually scripted <i>rebase -i -p</i> (by setting
GIT_EDITOR with a script), but I am very certain that this cleanup is
absolutely necessary to make <i>rebase -i -p</i> useful.
<h6>Saturday, 24th of January, Anno Domini MMIX, at the hour of the Dragon</h6>
<h2>Thoughts about <i>interactive rebase</i></h2>
Somebody mentioned that my <i>my-next</i> branch is a mess, as it mixes all

That is undeniably true; however, there is a good reason that I do not
have a lot of topic branches: I work on more than just one computer.

To make sure that I do not lose a commit by mistake, I always <i>rebase -i</i>
the <i>my-next</i> branch of the computer I happen to work on on top of the
<i>my-next</i> branch I fetch from <a href=http://repo.or.cz>repo.or.cz</a>.

To rebase a lot of topic branches at the same time seems a bit complicated.
But that is actually what the <i>-p</i> option (preserve merges) is all about.

The only problem is that the code for <i>rebase -i -p</i> has been messed up
recently, quite successfully, I might add.

Worse, some people are pushing for a completely and totally unintuitive syntax.

So maybe I will start to work on <i>-p</i> again, for my own use (I should learn
to heed the principle more: work on things I can use myself).

My current idea is to implement a "goto" statement that will jump to another
commit. To make it easily usable, I will add the semantics that "goto" will
always try to go to the <u>rewritten</u> version of the given commit; if the user
wanted to have the original commit, she has to paste the unabbreviated commit

The more I think about it, the more I actually like this idea ☺

Of course, working on this little project means that I will have to cope with
that ugly code again. *urgh*
<h6>Friday, 23rd of January, Anno Domini MMIX, at the hour of the Pig</h6>
The other day, when I did not exactly have too much time on my hands, but
definitely too much motivation, I played around creating several logos.

An ambigram (if you turn it 180 degrees around the appropriate axis, it looks
exactly the same as unrotated):

<embed type="image/svg+xml"
src="dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-ambigram.svg" width=317 />
<a href=dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-ambigram.svg>git-ambigram.svg</a>

<embed type="image/svg+xml"
src="dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-gitk-logo.svg" width=325 />
<a href=dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-gitk-logo.svg>git-gitk-logo.svg</a>

A play on the test you have to go through before getting new glasses:

<embed type="image/svg+xml"
src="dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-visual-test.svg" width=325 />
<a href=dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=git-visual-test.svg>git-visual-test.svg</a>

This is Henrik Nyh's logo (converted to .svg by yours truly):

<embed type="image/svg+xml"
src="dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=gitlogo.svg" width=165 />
<a href=dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=gitlogo.svg>gitlogo.svg</a>

And of course, the original logo...

<embed type="image/svg+xml"
src="dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=original-git-logo.svg" width=165 />
<a href=dscho.git?a=blob_plain;hb=aaa9edafbe6ca5349ad7b36848fb294e5f4fc529;f=original-git-logo.svg>original-git-logo.svg</a>

Maybe some of you have fun with them...
<h6>Friday, 23rd of January, Anno Domini MMIX, at the hour of the Pig</h6>
<h2>How to deal with files that are not source code when merging</h2>
Last week, one of the mentors of last year's <a href=http://code.google.com/soc>Summer of Code</a>
mentioned the idea that merge strategies are dearly needed
for file types other than source code.

I think this idea is awesome, even if I cannot bring myself to believe that
any of the file types would make a good Summer of Code project: either they
are too complicated (think raster images such as .png or even .jpg), or they
are too straightforward (think LaTeX, where all that is needed is a good
graphical user interface to inspect the three versions: <i>ours</i>, <i>baseline</i>

The LaTeX idea would be a good project for me to mentor, though: I have a
pretty clear idea how it should be done; I just lack the time (and motivation)

As for OpenOffice text documents, vector graphics (such as .svg), or more
specific data such as spreadsheets, I think that all of these are really
difficult: the problem is not so much the implementation (i.e. the programming
part of it), but the design.

This design should involve much more than a Summer of Code project is about:
you would need to survey users' expectations, and at least the mentor -- if
not the student -- would need to be an expert in usability questions, which
is rather unlikely in the realm of Open Source.

Maybe this is the missing part in Open Source: we have many brilliant
programmers, but next to nobody with a good idea of how to design intuitive

That might be related to the fact that brilliant software engineers, as they
can be found in Open Source, are not exactly known for their social skills,
a human trait that seems to be a very important prerequisite for designing
intuitive user interfaces.

Well, I have <a href=http://git.or.cz/gitwiki/SoC2009Ideas#head-6188833471f79f277e162ef9fbe1592aa10b5f6c>added</a>
the proposal to Git's Summer of Code idea page on the Git Wiki; we will
see what comes out of it.
<h6>Thursday, 22nd of January, Anno Domini MMIX, at the hour of the Goat</h6>
<h2>The UGFWIINI contest</h2>
Just in case somebody finds this blog, here is a challenge. Inspired by my
own little hack (this blog), I announce the "Using Git For What It Is Not

And it is especially cool, since the acronym sounds cool! You might miss
this fact if you do not know that I pronounce the "F" like an "A" so that

This will be a running contest; whenever I have 10 valid applications, I
will announce a winner on the Git mailing list.

So, what counts as a valid application?

<li> You must use a Git program (the term is used loosely here, GitWeb is
considered a Git program, for example).
<li> The program must be intended for something completely different than
what you are using it for. E.g. GitWeb -- which was intended to let
you browse through the history using your web browser -- is used
to serve a blog to the wide world.
<li> You must be able to prove that you actually used the Git program for
the purpose you claim, preferably in a live demonstration like this
<li> Nobody and nothing must be harmed in the process (except your
laughing muscle, that's okay).

So, what does such an abuse look like?

<li> ... like this blog.
<li> Managing your mail (in maildir format) in a Git repository.
<li> Finding duplicate files by
603 border=
1 bgcolor=black
>
604 <tr><td bgcolor=lightblue colspan=
3>
608 <table cellspacing=
5 border=
0
609 style=
"color:white;">
$ git ls-files --stage | sort -k2 | uniq -d -s7 -w40
<li> Abusing the Git alias mechanism to call scripts defined directly in

I am really looking forward to all of your submissions... *chuckles*
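A note on the duplicate-finding one-liner in the list above: each line of
<i>git ls-files --stage</i> has the form mode, SHA-1, stage, then a TAB and the
path. So <i>sort -k2</i> orders by SHA-1, and <i>uniq -d -s7 -w40</i> skips the 7
characters of mode plus space and compares the next 40. It prints only one
path per group, though. A possible variant that lists every path, assuming
the same input format (the function name <i>list_duplicates</i> is invented here):

```shell
# Hypothetical helper: reads `git ls-files --stage` output on stdin and
# prints each blob SHA-1 that occurs more than once, followed by its paths.
list_duplicates() {
    awk -F'\t' '{
        split($1, meta, " ")        # meta[1]=mode, meta[2]=SHA-1, meta[3]=stage
        sha = meta[2]
        count[sha]++
        paths[sha] = paths[sha] "\n  " $2
    }
    END {
        for (sha in count)
            if (count[sha] > 1)
                printf "%s%s\n", sha, paths[sha]
    }'
}

# Usage inside a Git work tree:
#   git ls-files --stage | list_duplicates
```

The order in which the groups are printed is not specified by awk; pipe the
result through further tools if a stable order matters.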
<h6>Thursday, 22nd of January, Anno Domini MMIX, at the hour of the Snake</h6>
Okay, last post for a while. But this is something that is nagging me
tremendously. I should probably just let go, but in my deepest inner self,
really close to my heart, I refuse to believe that any human beings could
be incapable of certain degrees of reason.

Take the example of top-posting. Everybody who has read a top-posted email
knows that you have to scroll down, possibly weeding through tons of
pages to find out what the heck the author of the last reply was replying

Never mind that it would take the author of the reply just a couple of
seconds to remove all the irrelevant stuff -- as she already knows what
the relevant part is, saving minutes, in case of mailing lists hours,
easily, to the readers who otherwise would have to discern what is
irrelevant and what is relevant first.

It is a horrible waste of time. But of course not for the top-poster.

The problem is that I frequently run into such people, and when I write
them a polite mail, explaining to them that it is impolite to top-post,
and why, the answers I get sometimes make me check if the sky is still up
and the earth down. Yesterday was an example of such a dubitable

Most funny are the ridiculous attempts by those persons at explaining why
top-posting is <i>so</i> much superior to anything else.

Which is good, because if they were not that funny, they would be pretty sad.