The following metrics will be used to determine whether a commit is a fix:
* If a commit is made on a specific branch after a specific commit.

For example, all commits that are made in a maintenance
branch are guaranteed to be fixes. The problem here is
that a maintenance branch is usually 'branched off' the
mainstream branch at the point of a release, so there has
to be a lower limit (say, the 'creation point' of the
branch) from which commits are seen as 'part of the
maintenance branch'. This means that we need a mechanism
to determine whether a commit belongs to a branch or not.
This can be done by determining how 'diluted' the commit
is at the head of a branch. Here we define 'diluting' as
merging: if a commit is part of a branch only because it
got merged in, that is a strong dilution. If instead
another branch is merged into the branch the commit was
made on, that is only a slight dilution (or perhaps none
at all).

We want to create a test repository with a structure
similar to what is depicted below.

o-o-o-1-o-o-o-o-o-o-o-o-5-o-o-o-A
o-2-o-o-4-o-o-o-o-o-o-7-o-o-B

The main points of interest are the numbered commits.
This repository will serve as a testing ground for the
'belongs to a branch' metric.
1. It is evident that commit 1 belongs to branch A, since
that is the branch it was made on.
2. Could be seen as belonging to branch A, B, or C,
since all of those branches contain it. However, it
should belong 'most' to branch B, since that is the
branch it was made on, whereas it was 'only' merged into
the other branches.
3. Belongs 'most' to branch B, since the branch it was
made on was merged into B and then deleted, whereas
branches A and C got it through a merge.
4. A merge commit; it merges commit 3 into branch B and
is only of interest because merge commits might need
special treatment.
5. Belongs to branch A since that is the only branch it
exists on. It does occur before a merge with branch
B, but this should not change its dilution.
6. Contained in all branches, but should belong most to
branch C, then to B, and only lastly to branch A.
7. Interesting because it is a merged commit; it belongs
to branch B, although branch A also contains it.
8. A single commit on branch C; it should obviously
belong only to branch C, with no dilution.
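
As a rough illustration of the 'dilution' idea described for this
metric, the sketch below counts how many times a commit had to be
'merged in' to reach a branch head. This is only one possible
reading, not a settled definition: it assumes that following a
commit's first parent stays on the same branch (cost 0) and that
following any other parent counts as one dilution step. The function
name `dilution`, the `parents` mapping, and the toy history are all
hypothetical.

```python
from collections import deque


def dilution(parents, head, target):
    """Smallest number of non-first-parent (merged-in) edges on any path
    from `head` back to `target`, or None if `target` is unreachable.

    `parents` maps a commit id to its list of parents, first parent first.
    """
    best = {head: 0}
    queue = deque([head])
    while queue:
        commit = queue.popleft()
        for index, parent in enumerate(parents.get(commit, [])):
            # Assumption: following the first parent stays on the branch
            # (cost 0); any other parent means "merged in" (cost 1).
            step = 0 if index == 0 else 1
            cost = best[commit] + step
            if parent not in best or cost < best[parent]:
                best[parent] = cost
                # 0-1 BFS: cheap edges go to the front of the queue.
                if step == 0:
                    queue.appendleft(parent)
                else:
                    queue.append(parent)
    return best.get(target)


# Hypothetical toy history: branch B forks from a1 and is merged into A at a3.
toy = {
    "a1": [],
    "a2": ["a1"],
    "b1": ["a1"],
    "b2": ["b1"],
    "a3": ["a2", "b2"],  # merge commit: first parent a2, second parent b2
}

print(dilution(toy, "a3", "a2"))  # 0: made on A
print(dilution(toy, "a3", "b1"))  # 1: reached A only through the merge
print(dilution(toy, "b2", "b1"))  # 0: made on B
```

Note that in this toy history `a1` is reachable from the head of B
with a dilution of 0, even though it predates the branch. That is
exactly why a lower limit such as the 'creation point' of the branch
is still needed on top of a measure like this.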

* If the commit message matches a certain regexp.

For example, if the commit message contains the phrase
'this fixes', it is a fix. The regexp can be as simple
as matching 'fix ' or 'fixes '; there will probably
be little need for advanced regexp machinery.
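
A minimal sketch of such a message check, assuming a case-insensitive
match on the whole words 'fix' or 'fixes'. The exact pattern and the
name `message_indicates_fix` are illustrative choices, not part of
the plan above.

```python
import re

# Assumed pattern: "fix" or "fixes" as a whole word, in any letter case.
FIX_PATTERN = re.compile(r"\bfix(es)?\b", re.IGNORECASE)


def message_indicates_fix(message):
    """Return True if the commit message mentions 'fix' or 'fixes'."""
    return FIX_PATTERN.search(message) is not None


print(message_indicates_fix("This fixes the crash on empty input"))  # True
print(message_indicates_fix("Add prefix handling"))                  # False
```

The word boundaries keep words such as 'prefix' from matching, which
a plain substring search for 'fix' would not do.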

* If the diff contains a specific change, matching a certain regexp.

For example, in a test suite there might be an indicator
for known breakages. When this indicator is changed from
'known breakage' to 'should pass', the commit fixed the
breakage.
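
The diff check could be sketched as follows. The markers used here,
`test_expect_failure` and `test_expect_success`, are borrowed from
Git's own test suite purely as an assumed example; any other pair of
indicators could be passed in instead. The function name
`flips_known_breakage` is hypothetical, and the parsing is
deliberately simplistic (it looks only at added and removed lines of
a unified diff).

```python
import re


def flips_known_breakage(diff_text,
                         broken=r"test_expect_failure",
                         passing="test_expect_success"):
    """Return True if some removed line carrying the `broken` marker
    reappears as an added line with the `passing` marker in its place."""
    removed, added = [], []
    for line in diff_text.splitlines():
        if line.startswith("---") or line.startswith("+++"):
            continue  # file headers (simplification: no content check)
        if line.startswith("-"):
            removed.append(line[1:])
        elif line.startswith("+"):
            added.append(line[1:])
    # What each 'known breakage' line would look like once it should pass.
    expected = {
        re.sub(broken, lambda match: passing, line)
        for line in removed
        if re.search(broken, line)
    }
    return any(line in expected for line in added)


example_diff = "\n".join([
    "--- a/t/t0001.sh",
    "+++ b/t/t0001.sh",
    "@@ -1,2 +1,2 @@",
    "-test_expect_failure 'empty input' '",
    "+test_expect_success 'empty input' '",
    " \trun_tool </dev/null",
])
print(flips_known_breakage(example_diff))  # True
```

Requiring the rest of the line to stay identical avoids counting a
commit that merely adds a new passing test next to an unrelated
removal.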

* If the commit reverts another commit.

Whenever a commit reverts another one, the reverted
commit was a mistake, and as such the reverting commit is
a fix. By first checking which commits touch the same
files, the number of commits that have to be checked can
be reduced.
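
One way the revert check with its file-based pre-filter might look is
sketched below. It assumes each commit's change has already been
parsed into a mapping from file path to a pair of (removed lines,
added lines), and it treats only an exact inverse as a revert, which
is a stricter reading than may be wanted in practice. All names are
hypothetical.

```python
def touches_same_files(change_a, change_b):
    """Cheap pre-filter: do the two changes share at least one file?"""
    return bool(set(change_a) & set(change_b))


def is_revert(original, candidate):
    """True if `candidate` removes exactly what `original` added and
    adds back exactly what `original` removed, file by file."""
    if set(original) != set(candidate):
        return False
    return all(
        candidate[path] == (original[path][1], original[path][0])
        for path in original
    )


def find_reverted(candidate, earlier_changes):
    """Return the ids of earlier commits that `candidate` undoes."""
    return [
        commit_id
        for commit_id, change in earlier_changes.items()
        if touches_same_files(change, candidate) and is_revert(change, candidate)
    ]


# Hypothetical data: path -> (removed lines, added lines)
earlier = {
    "c1": {"a.c": (("x = 1;",), ("x = 2;",))},
    "c2": {"b.c": ((), ("y = 0;",))},
}
revert = {"a.c": (("x = 2;",), ("x = 1;",))}
print(find_reverted(revert, earlier))  # ['c1']
```

The pre-filter means the full comparison runs only for commits that
share a file with the candidate; here `c2` is skipped without
comparing any lines.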

* If a commit touches nearly all the same lines as another commit.

Commits that are not actual reverts, but that do touch
the same files, are very likely to be fixes. However, if
the commit is from the same author and was made shortly
after the first commit, it is likely to be part of the
same 'patch series' and as such should not be treated as
a fix.
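
The overlap rule and its 'patch series' exception could be sketched
like this. The thresholds (80% overlap, a one-hour window) and the
commit representation (a dict with `author`, `date`, and a set of
touched `(path, line number)` pairs) are assumptions made for the
example, not values taken from the plan.

```python
from datetime import datetime, timedelta


def line_overlap(lines_first, lines_second):
    """Fraction of the lines touched by the second commit that the
    first commit touched as well."""
    if not lines_second:
        return 0.0
    return len(lines_first & lines_second) / len(lines_second)


def looks_like_fix(first, second,
                   min_overlap=0.8,
                   series_window=timedelta(hours=1)):
    """Treat `second` as a fix of `first` if it touches nearly the same
    lines, unless both look like parts of one patch series."""
    same_series = (
        first["author"] == second["author"]
        and abs(second["date"] - first["date"]) <= series_window
    )
    if same_series:
        return False
    return line_overlap(first["lines"], second["lines"]) >= min_overlap


original = {
    "author": "alice",
    "date": datetime(2008, 6, 1, 12, 0),
    "lines": {("a.c", 10), ("a.c", 11)},
}
later_by_other = {
    "author": "bob",
    "date": datetime(2008, 6, 2, 9, 0),
    "lines": {("a.c", 10), ("a.c", 11)},
}
soon_by_same = {
    "author": "alice",
    "date": datetime(2008, 6, 1, 12, 10),
    "lines": {("a.c", 10), ("a.c", 11)},
}
print(looks_like_fix(original, later_by_other))  # True
print(looks_like_fix(original, soon_by_same))    # False
```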