The following metrics will be used to determine whether a
commit is a fix:
* If a commit is made on a specific branch after a specific commit.

For example, all commits that are made on a maintenance
branch are guaranteed to be fixes. The problem here is
that a maintenance branch is usually 'branched off' the
mainstream branch at the point of a release, so there
has to be a lower limit (say, the 'creation point' of the
branch) from which commits are seen as 'part of the
maintenance branch'. This means that we need a mechanism
to determine whether a commit belongs on a branch or not.
This can be done by determining how 'diluted' the commit
is at the head of a branch. Here we define 'diluting' as
merging: if a commit is part of a branch only because it
got merged in, that is a strong dilution. If another
branch is merged into the branch, that is only a slight
dilution (or perhaps none at all).

We want to create a test repository with a structure
similar to what is depicted below.

o-o-o-1-o-o-o-o-..-o-A

Main points of interest are the numbered commits.
This repository will serve as a testing ground for the
'belongs to a branch' metric.
It is evident that commit 1 belongs to branch A, since
there is no other branch head that contains it.
Commit 2, however, could be seen as belonging to either
branch A or branch B, since both heads contain it.
It should belong 'most' to branch B, since that is the
branch it was made on, whereas it was 'only' merged into
branch A.
Commit 3, on the other hand, clearly belongs 'most' to
branch B, since that is the only branch it is on, even
though it was 'only' merged into branch B just as much as
commit 2 was 'only' merged into branch A. The distinction
here is that commit 3 occurs on no other branch than B.
Commit 4 is a merge commit: it merges commit 3 into
branch B and is only of interest because merge commits
might need extra attention.
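
One way to make the 'dilution' idea concrete is to count how
many merged-in (non-first) parent links have to be followed to
get from a branch head back to a commit. The sketch below does
this on a small hand-written graph. The graph, the commit names
(c1, c2, ...) and the helpers dilution and belongs_most are all
assumptions made for illustration, and treating the first parent
as 'the branch itself' is only a convention, not something the
metric above prescribes.

```python
from collections import deque

# Hypothetical history: commit -> list of parents (first parent first).
PARENTS = {
    "root": [],
    "c1": ["root"],        # made on A
    "c2": ["root"],        # made on B
    "mA": ["c1", "c2"],    # merge of B into A
    "c3": ["c2"],          # made on a side branch
    "c4": ["c2", "c3"],    # merge commit on B that brings in c3
}
HEADS = {"A": "mA", "B": "c4"}


def dilution(parents, head, target):
    """Fewest merged-in (non-first-parent) links from head to target.

    Returns None when target is not reachable from head.
    """
    best = {head: 0}
    queue = deque([head])
    while queue:
        commit = queue.popleft()
        for index, parent in enumerate(parents[commit]):
            cost = best[commit] + (0 if index == 0 else 1)
            if parent not in best or cost < best[parent]:
                best[parent] = cost
                # 0-1 BFS: free links are explored before costly ones.
                if index == 0:
                    queue.appendleft(parent)
                else:
                    queue.append(parent)
    return best.get(target)


def belongs_most(parents, heads, target):
    """Name of the branch whose head contains target least diluted."""
    scores = {}
    for name, head in heads.items():
        score = dilution(parents, head, target)
        if score is not None:
            scores[name] = score
    return min(scores, key=scores.get) if scores else None
```

With this graph, belongs_most(PARENTS, HEADS, "c2") picks B,
because B reaches c2 through first-parent links only, while A
reaches it through a merge.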

* If the commit message matches a certain regexp.

For example, if the commit message contains the phrase
'this fixes', it is a fix. The regexp can be as simple
as matching 'fix ' or 'fixes '; there will probably
be little need for advanced regexp machinery.
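
A minimal sketch of such a message check, assuming Python's re
module is used. The pattern and the function name
message_looks_like_fix are illustrative choices; the word
boundaries are added so that words such as 'prefix' do not match.

```python
import re

# Assumed pattern: the whole words 'fix' or 'fixes', in any letter case.
FIX_PATTERN = re.compile(r"\bfix(es)?\b", re.IGNORECASE)


def message_looks_like_fix(message):
    """True when the commit message contains 'fix' or 'fixes' as a word."""
    return FIX_PATTERN.search(message) is not None
```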

* If the diff contains a specific change, matching a certain regexp.

For example, in a test suite there might be an indicator
for known breakages; when this indicator is changed from
'known breakage' to 'should pass', the commit fixed the
known breakage.
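
The indicator check could be sketched as follows for a unified
diff. The marker names test_expect_failure and
test_expect_success are assumptions standing in for whatever
'known breakage' and 'should pass' indicators the test suite
actually uses, and diff_marks_test_fixed is a hypothetical
helper.

```python
import re

# Assumed markers; replace them with the indicators the test suite uses.
KNOWN_BREAKAGE = re.compile(r"^-\s*test_expect_failure\b(.*)$")
SHOULD_PASS = re.compile(r"^\+\s*test_expect_success\b(.*)$")


def diff_marks_test_fixed(diff_text):
    """True when a removed 'known breakage' line reappears as 'should pass'."""
    removed, added = set(), set()
    for line in diff_text.splitlines():
        match = KNOWN_BREAKAGE.match(line)
        if match:
            removed.add(match.group(1).strip())
        match = SHOULD_PASS.match(line)
        if match:
            added.add(match.group(1).strip())
    # Only count tests whose description is the same on both sides.
    return bool(removed & added)
```

Comparing the rest of the line keeps a commit that merely adds a
new passing test from being counted as a fix.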

* If the commit reverts another commit.

Whenever a commit reverts another one, the reverted
commit was a mistake, and as such the reverting commit is
a fix. By first checking which commits touch the same
files, the number of commits that have to be checked can
be reduced.
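
The two-step check (filter on touched files first, then compare
the changes) might look like the sketch below. The data model is
an assumption: each commit is a dict mapping a path to a pair
(removed_lines, added_lines), and the helper names are
hypothetical. Only exact reverts are detected.

```python
def touched_files(change):
    """Set of paths a change touches."""
    return set(change)


def is_exact_revert(candidate, original):
    """True when candidate removes what original added and vice versa."""
    # Cheap filter first: a revert must touch exactly the same files.
    if touched_files(candidate) != touched_files(original):
        return False
    for path, (removed, added) in original.items():
        if candidate[path] != (added, removed):
            return False
    return True


def find_reverted(candidate, history):
    """Names of earlier commits in history that candidate reverts."""
    return [
        name
        for name, change in history.items()
        if is_exact_revert(candidate, change)
    ]
```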

* If a commit touches nearly all the same lines as another commit.

Commits that are not actual reverts, but that do touch
nearly all the same lines as an earlier commit, are very
likely to be fixes. However, if the commit is from the
same author and was made shortly after the first commit,
it is likely to be part of the same 'patch series' and as
such should not be treated as a fix.
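
A rough sketch of this heuristic follows. The Commit record, the
0.8 overlap threshold and the one-hour 'patch series' window are
all assumptions chosen for illustration; suitable values would
have to be tuned per project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str
    timestamp: int      # seconds since the epoch
    lines: frozenset    # {(path, line_number), ...} touched by the commit


def overlap(later, earlier):
    """Fraction of the later commit's lines that the earlier one touched."""
    if not later.lines:
        return 0.0
    return len(later.lines & earlier.lines) / len(later.lines)


def likely_fix(later, earlier, min_overlap=0.8, series_window=3600):
    """True when later looks like a fix of earlier, not a series follow-up."""
    if overlap(later, earlier) < min_overlap:
        return False
    same_series = (
        later.author == earlier.author
        and 0 <= later.timestamp - earlier.timestamp <= series_window
    )
    return not same_series
```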