<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Dscho's blog</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html</link>
<atom:link href="http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=blog.rss" rel="self" type="application/rss+xml"/>
<description>A few stories told by Dscho</description>
<lastBuildDate>Wed, 04 Feb 2009 01:33:49 +0100</lastBuildDate>
<language>en-us</language>
<item>
<title>New valgrind series</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233707628</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233707628</guid>
<pubDate>Wed, 04 Feb 2009 01:33:48 +0100</pubDate>
<description><![CDATA[New valgrind series
</p><p>
I spent quite some time cleaning up that patch series, and feel pretty
exhausted.
</p><p>
Granted, the new <i>git rebase -i -p</i> does its job without complaint so far
(so much so that I think I'll release a version of my <i>rebase</i> series
soonish), but it <u>is</u> a hassle when you have patches for which you have
a hard time deciding on the order or the commit boundaries.
</p><p>
For example, I could imagine that the patch making the location of the
templates independent of the location of the Git binaries should come
<u>before</u> my patch series, and the valgrind-specific part should then
be squashed into the first valgrind commit.
</p><p>
Also, it uses two features of valgrind 3.4.0:
</p><p>
<ul>
<li><i>...</i> in the suppression file, and
<li><i>--track-origins=yes</i>
</ul>
</p><p>
The latter is actually the reason I am pretty willing to keep the
requirement of that valgrind version, as it is really, really useful.
</p><p>
I guess we will see what happens to it.]]></description>
</item>
<item>
<title>Problems with split-topic-branches.sh</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233706294</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233706294</guid>
<pubDate>Wed, 04 Feb 2009 01:11:34 +0100</pubDate>
<description><![CDATA[Problems with split-topic-branches.sh
</p><p>
So my little script that should help me to split my topic branches does
not work properly.
</p><p>
First some background: the idea was to let <i>git blame</i> do the hard work
to find overlapping changes, i.e. changes that would conflict when
changing the order (or skipping the first change, on which the next builds).
</p><p>
The first problem with that approach: when lines are <u>removed</u> by one
commit, and the next commit touches the same location, <i>git blame</i> does
not find that the first commit is required by the second.
</p><p>
Therefore I introduced a really slow reverse thing which tries to find
those commits whose removals survived until the parent of a particular
commit, but not further.
</p><p>
However, it does not work properly. Basically, only context sizes that
span whole files lead to conflict-free topic branches so far.
</p><p>
As a consequence, I think I'll add an option --sprout to the revision
walker which will fake octopus merges (or a series of two-parent merges)
whenever it finds a string of non-merge commits that are theoretically
independent, i.e. whose patches apply cleanly.]]></description>
</item>
<item>
<title>More valgrind fun</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233277286</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233277286</guid>
<pubDate>Fri, 30 Jan 2009 02:01:26 +0100</pubDate>
<description><![CDATA[More valgrind fun
</p><p>
So I spent quite a number of hours on that funny zlib/valgrind issue. The
thing is, zlib people claim that even if their code accesses uninitialized
memory, it does not produce erroneous data (by cutting out the results of the
uninitialized data, which is cheaper than checking for the end of the buffer
in an unaligned manner), so zlib will always be special for valgrind.
</p><p>
However, the bug I was chasing is funny, and different from said issue. zlib
deflates an input buffer to an output buffer that is exactly 58 bytes long.
But valgrind claims that the 52nd of those bytes is uninitialized, and <u>only</u>
that one.
</p><p>
But it is not. It must be 0x2c, otherwise zlib refuses to inflate the
buffer.
</p><p>
Now, I went into a debugging frenzy, and finally found out that zlib
passes just fine (with the default suppressions because of the "cute" way it
uses uninitialized memory), <u>except</u> when it is compiled with UNALIGNED_OK
defined.
</p><p>
Which Ubuntu does, of course. Ubuntu, the biggest forker of all.
</p><p>
The bad part is that it sounds like a bug in valgrind, and I <u>could</u> imagine
that it is an issue of an optimized memcpy() that copies int by int, and
that valgrind misses out on the fact that a part of that int is actually
<u>not</u> uninitialized.
</p><p>
But my debugging session's results disagree with that.
</p><p>
With the help of Julian Seward, the original author of valgrind, I instrumented
zlib's source code so that valgrind checks earlier if the byte is initialized
or not, to find out where the cause of the issue lies.
</p><p>
The sad part is that when I added the instrumentation to both the <u>end</u> of
the while() loop in compress_block() in zlib's trees.c, and just <u>after</u> the
while() loop (whose condition is a plain <i>variable < variable</i> comparison,
nothing fancy, certainly not changing any memory), only the <u>latter</u> catches
a valgrind error.
</p><p>
And that is truly strange.]]></description>
</item>
<item>
<title>Interactive stash</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233193467</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233193467</guid>
<pubDate>Thu, 29 Jan 2009 02:44:27 +0100</pubDate>
<description><![CDATA[Interactive stash
</p><p>
There is an easy way to split a patch:
</p><p>
<table
border=1 bgcolor=white>
<tr><td bgcolor=lightblue colspan=3>
<pre> </pre>
</td></tr>
<tr><td>
<table cellspacing=5 border=0
style="color:black;">
<tr><td>
<pre>
$ git reset HEAD^
$ git add -i
$ git commit
$ git diff -R HEAD@{1} | git apply --index
$ git commit
</pre>
</td></tr>
</table>
</td></tr>
</table>
</p><p>
but it misses out on the fact that the first of the two commits does not
reflect the state of the working directory at any time.
</p><p>
So I think something like an interactive <i>stash</i> is needed. A method
to specify what you want to keep in the working directory, the rest should
be stashed. The idea would be something like this:
</p><p>
<ol>
<li>Add the desired changes into a temporary index.
<li>Put the rest of the changes in another temporary index.
<li>Stash the latter index.
<li>Synchronize the working directory with the first index.
<li>Clean up temporary indices.
</ol>
</p><p>
Or in code:
</p><p>
<table
border=1 bgcolor=white>
<tr><td bgcolor=lightblue colspan=3>
<pre> </pre>
</td></tr>
<tr><td>
<table cellspacing=5 border=0
style="color:black;">
<tr><td>
<pre>
$ cp .git/index .git/interactive-stash-1
$ GIT_INDEX_FILE=.git/interactive-stash-1 git add -i
$ cp .git/index .git/interactive-stash-2
$ GIT_INDEX_FILE=.git/interactive-stash-1 git diff -R |
(GIT_INDEX_FILE=.git/interactive-stash-2 git apply --index)
$ tree=$(GIT_INDEX_FILE=.git/index git write-tree)
$ commit=$(echo Current index | git commit-tree $tree -p HEAD)
$ tree=$(GIT_INDEX_FILE=.git/interactive-stash-2 git write-tree)
$ commit=$(echo Edited out | git commit-tree $tree -p HEAD -p $commit)
$ git update-ref refs/stash $commit
$ GIT_INDEX_FILE=.git/interactive-stash-1 git checkout-index -a -f
$ rm .git/interactive-stash-1 .git/interactive-stash-2
</pre>
</td></tr>
</table>
</td></tr>
</table>
</p><p>
This should probably go into <i>git-stash.sh</i>, maybe even with a switch
to start git-gui to do the interactive adding instead of git-add.]]></description>
</item>
<item>
<title>Splitting topic branches</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233154567</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233154567</guid>
<pubDate>Wed, 28 Jan 2009 15:56:07 +0100</pubDate>
<description><![CDATA[Splitting topic branches
</p><p>
One might be put off easily by the overarching use of buzzwords in the
description of how <i>Darcs</i> works. I, for one, do not expect an intelligent
author when I read <i>Theory of patches</i> and <i>based on quantum physics</i>.
</p><p>
The true story, however, is much simpler, and is actually not that dumb:
Let's call two commits "conflicting" when they contain at least one
overlapping change.
</p><p>
The idea is now: Given a list of commits (not a set, as the order is important),
to sort them into smaller lists such that conflicting commits are in the
same sublists ("topic branches") and the sublists are minimal, i.e. no two
non-conflicting commits are in the same sublist.
</p><p>
The idea has flaws, of course, as you can have a patch changing the code,
and another changing the documentation, but splitting a list of commits
in that way is a first step to sort out my <i>my-next</i> mess, where I have
a linear string of not-necessarily-dependent commits.
</p><p>
And actually, my whole rebase revamp aimed at the clean-up of my own
<i>my-next</i> branch, so I am currently writing a script that can be used
as a GIT_EDITOR for git-rebase which implements the Darcs algorithm. Kind of:
the result is not implicit, but explicit and can be fixed up later.]]></description>
</item>
<item>
<title>Showing off that you're an Alpine user ... priceless!</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233102919</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233102919</guid>
<pubDate>Wed, 28 Jan 2009 01:35:19 +0100</pubDate>
<description><![CDATA[Showing off that you're an Alpine user ... priceless!
</p><p>
So I was in a hurry to send the patches, and sent all the patches as replies
to the cover-letter, and therefore typed in <i>rnyn</i> all the time, which is the
mantra I need to say to Alpine for <i>Reply</i>, ... include quoted message?
<i>No</i>, ... reply to all recipients? <i>Yes</i>, ... use first role?
<i>No, use default role</i>.
</p><p>
That was pretty embarrassing, as it shows everybody that I still do not trust
<i>send-email</i>, and rather paste every single patch by hand. Which is rather
annoying.
</p><p>
So I started using format-patch today, to output directly to Alpine's
<i>postponed-msgs</i> folder, so that I can do some touchups in the mailer
before sending the patch series on its way.
</p><p>
However, when running format-patch with <i>--thread</i>, it generates Message-ID
strings that Alpine does not like, and therefore replaces.
</p><p>
Oh, well, I'll probably just investigate how the Message-IDs are supposed to
look, and then use sed to replace the generated ones with Alpine-friendly ones
during the redirection to <i>postponed-msgs</i>.
</p><p>
But I already realized that doing it that way is dramatically faster than the
workflow I had before.
</p><p>
And safer: no more <i>rnyn</i>.]]></description>
</item>
<item>
<title>Progress with the interactive rebase preserving merges</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233101919</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233101919</guid>
<pubDate>Wed, 28 Jan 2009 01:18:39 +0100</pubDate>
<description><![CDATA[Progress with the interactive rebase preserving merges
</p><p>
I thought about the "dropped" commits a bit more, after all, and it is
probably a good thing to substitute them with their parent, as Stephen did.
</p><p>
Imagine that you have merged a branch with two commits. One is in upstream,
and you want to rebase (preserving merges) onto upstream. Then you still
want to merge the single commit.
</p><p>
Even better, if there is no commit left, the <i>$REWRITTEN</i> mechanism will
substitute the commit onto which we are rebasing, so a merge will just
result in a fast-forward!
</p><p>
Oh, another thing: merge commits should not have a patch id, as they have
<u>multiple</u> patches. However, I borked the code a long time ago (9c6efa36)
and merges get the patch-id of their diff to the first parent. Which is
probably wrong. So I guess I'll have to fix that with my rebase revamp.
</p><p>
So what about a root commit? If that was dropped, we will just substitute
it with the commit onto which we rebase (as a root commit did not really
have a parent, but will get the onto-commit as new parent).
</p><p>
Now that I finally realized that t3410 is so strange because of a bug <u>I</u>
introduced, I can finally go about fixing it.]]></description>
</item>
<item>
<title>Another midnight riddle?</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233099894</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233099894</guid>
<pubDate>Wed, 28 Jan 2009 00:44:54 +0100</pubDate>
<description><![CDATA[Another midnight riddle?
</p><p>
Okay, here's another riddle: what is the next line?
</p><p>
<pre>
1 1 1 2
3 1 1 2
2 1 1 2 1 3
</pre>
</p><p>
And when does the line get wider than 10 digits?]]></description>
</item>
<item>
<title>Fun with calculus after midnight</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233022809</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1233022809</guid>
<pubDate>Tue, 27 Jan 2009 03:20:09 +0100</pubDate>
<description><![CDATA[Fun with calculus after midnight
</p><p>
Problem: what is the shortest way of defining a variable consisting of <i>N</i>
spaces? I.e. for <i>N=80</i> the result will look something like
</p><p>
<table
border=1 bgcolor=white>
<tr><td bgcolor=lightblue colspan=3>
<pre> </pre>
</td></tr>
<tr><td>
<table cellspacing=5 border=0
style="color:black;">
<tr><td>
<pre>
s='    '
s="$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s"
</pre>
</td></tr>
</table>
</td></tr>
</table>
</p><p>
Let's see. Let the minimal number of characters needed be <i>A(N)</i>. For
simplicity, let's say that we only use one variable. Then, certainly, <i>A(N)</i>
cannot be larger than <i>5+N</i>, as we could define a variable using 1 character
for the name, 1 for the equal sign, 2 for the quotes, and one for the semicolon
or newline character (whichever).
</p><p>
Now, let's assume <i>N</i> is a product <i>K*L</i>. Then certainly, <i>A(N)</i> cannot
be larger than <i>A(K)+5+2*L</i>, as we could first define a variable that has
exactly <i>K</i> spaces and then use that to define the end result (in the example
above, <i>K=4</i> and <i>L=20</i>).
</p><p>
So, for which <i>N=K*L</i> is it better to use two definitions instead of one?
</p><p>
Simple calculus says that <i>5+K*L>5+K+5+2*L</i> must hold true, or (after some
scribbling): <i>L>1+7/(K-2)</i>. Which means that it makes no sense to define
a variable with 1 or 2 spaces first, which is kinda obvious (writing '$s'
alone would use two characters, so we could write the spaces right away).
</p><p>
But what about the other values? For <i>K=3</i>, <i>L</i> must be at least 9 to make
sense (in other words, <i>N</i> must be at least 27). For <i>K=4</i>, <i>L</i> needs
to be greater than or equal to 5 (<i>N>=20</i>), the next pairs are <i>(5,4)</i>,
<i>(6,3)</i>, <i>(7,3)</i>, <i>(8,3)</i>, <i>(9,3)</i> and starting with <i>K=10</i>, any
<i>L>1</i> makes sense.
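</p><p>
The pairs above follow from rearranging <i>5+K*L>5+K+5+2*L</i> into
<i>L*(K-2)>K+5</i>. A quick sketch to double-check them, assuming plain POSIX
shell arithmetic (the function name <i>min_l</i> is made up for this example):

```shell
# Smallest L for which two definitions beat one, from L > 1 + 7/(K-2).
# (K+5)/(K-2) is the same bound; integer division plus one gives the
# smallest integer strictly greater than it (valid for K > 2 only).
min_l () {
    k=$1
    echo $(( (k + 5) / (k - 2) + 1 ))
}

for k in 3 4 5 6 7 8 9 10
do
    l=$(min_l $k)
    echo "K=$k: L>=$l, N>=$((k * l))"
done
```

The smallest <i>N</i> in that list is 18, reached for <i>K=6</i> and <i>L=3</i>.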
</p><p>
The second definition can also contain spaces at the end, however, so for any
<i>N=K*L+M</i>, <i>A(N)</i> cannot be larger than <i>A(K)+5+2*L+M</i>.
</p><p>
Not surprisingly, this leads to exactly the same <i>L>1+7/(K-2)</i> (as we can
append the <i>M</i> spaces in the last definition, no matter if we use 1 or
2 definitions).
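</p><p>
This bound is easy to evaluate by brute force. The following is only a sketch
under the assumptions made so far: exactly two definitions, one variable, and
<i>A(K)=5+K</i> for the first definition. The function names are invented for
this example.

```shell
# Cost of two definitions for N spaces with K spaces in the first one:
# (5 + K) for s='...' plus (5 + 2*L + M) for s="$s...$s" followed by
# M literal spaces, where L = N / K and M = N % K.
two_def_cost () {
    n=$1 k=$2
    echo $(( 5 + k + 5 + 2 * (n / k) + n % k ))
}

# Try every K from 1 to N and print the cheapest "cost K" pair
# (the first K wins when several are equally cheap).
best_two_def () {
    n=$1 best= best_k=
    k=1
    while test $k -le $n
    do
        c=$(two_def_cost $n $k)
        if test -z "$best" || test $c -lt $best
        then
            best=$c best_k=$k
        fi
        k=$((k + 1))
    done
    echo "$best $best_k"
}

best_two_def 80
```

For <i>N=80</i> this prints a cost of 36 characters, first reached at <i>K=10</i>
(<i>K=16</i> is equally cheap).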
</p><p>
So as soon as <i>N>=18</i>, we should use two definitions;
below that, it makes no sense.
</p><p>
So for <i>N<18</i>, <i>A(N)=5+N</i>.
</p><p>
But what <i>K</i> should one choose, i.e. how many spaces in the first definition?
In other words, what is <i>A(N)</i> given that we use two definitions?
</p><p>
That will have to wait for another midnight. Just a teaser: <i>A(80)=36</i>. Oh,
and with 80 characters, you can define a string of 9900 spaces...]]></description>
</item>
<item>
<title>Valgrind takes a loooong time</title>
<link>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232997290</link>
<guid>http://repo.or.cz/w/git/dscho.git?a=blob_plain;hb=blog;f=index.html#1232997290</guid>
<pubDate>Mon, 26 Jan 2009 20:14:50 +0100</pubDate>
<description><![CDATA[Valgrind takes a loooong time
</p><p>
Yesterday, I started a run on a fast machine, and it took roughly 5.5
hours by the machine's clock.
</p><p>
And of course, I redirected stdout only... *sigh*
</p><p>
Which triggered a Google search for how to force redirection of all the output
in the test scripts to a file and the terminal at the same time.
</p><p>
It seems as if that is not easily done. I tried
<center><table
border=1 bgcolor=white>
<tr><td bgcolor=lightblue colspan=3>
<pre> </pre>
</td></tr>
<tr><td>
<table cellspacing=5 border=0
style="color:black;">
<tr><td>
<pre>
exec >(tee out) 2>&1
</pre>
</td></tr>
</table>
</td></tr>
</table></center>
</p><p>
but that did not work: it mumbled something about invalid file handles or some
such.
</p><p>
The only solution I found was:
<center><table
border=1 bgcolor=white>
<tr><td bgcolor=lightblue colspan=3>
<pre> </pre>
</td></tr>
<tr><td>
<table cellspacing=5 border=0
style="color:black;">
<tr><td>
<pre>
mkfifo pipe
tee out < pipe &
exec > pipe 2>&1
</pre>
</td></tr>
</table>
</td></tr>
</table></center>
</p><p>
That is a problem for parallel execution, though, so I am still looking for a
better way to do it.
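</p><p>
One possible way around that, assuming the clash comes from every script
using the same FIFO name, is a private FIFO per invocation. A minimal sketch
(the helper name <i>run_logged</i> and the use of <i>mktemp -d</i> are
assumptions made for this example, not part of the test suite):

```shell
# usage: run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdout and stderr going both to LOGFILE and to the
# caller's stdout, through a FIFO in a freshly created private directory.
run_logged () {
    log=$1
    shift
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pipe" || return 1

    # The reader blocks until the command below opens the FIFO for writing.
    tee "$log" <"$dir/pipe" &
    tee_pid=$!

    "$@" >"$dir/pipe" 2>&1
    status=$?

    # The writer is gone now, so tee sees end-of-file and finishes.
    wait "$tee_pid"
    rm -rf "$dir"
    return $status
}
```

It would be called as <i>run_logged out ./some-test.sh</i>, where
<i>some-test.sh</i> stands for any test script. Because each call gets its own
directory, two runs in parallel no longer fight over a file named
<i>pipe</i>.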
</p><p>
Once I have the output, it is relatively easy to analyze it, as I already
made a script which dissects the output into valgrind output and the test
case it came from, then groups by common valgrind output and shows the
result to the user.]]></description>
</item>
</channel>
</rss>