doc/metrics.txt

The following metrics will be used to determine whether a commit is a fix:
* If a commit is made on a specific branch after a specific commit.

  For example, all commits made on a maintenance branch are
  guaranteed to be fixes. The problem is that a maintenance
  branch is usually 'branched off' the mainstream branch at
  the point of a release, so there has to be a lower limit
  (say, the 'creation point' of the branch) from which
  commits are seen as 'part of the maintenance branch'. This
  means that we need a mechanism to determine whether a
  commit belongs to a branch or not. This can be done by
  determining how 'diluted' the commit is at the head of a
  branch. Here we define 'diluting' as merging: if a commit
  is part of a branch only because it got merged in, that is
  a strong dilution. If another branch is merged into the
  commit's own branch, that is only a slight dilution (or
  perhaps none at all).
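  One possible way to make 'dilution' concrete is to count how
  many merged-in parents have to be followed to get from the
  head of a branch back to the commit. The Python sketch below
  assumes that definition, assumes that the first parent of a
  merge is the branch that was merged into (the usual Git
  convention), and uses a hypothetical 'parents' mapping instead
  of a real repository. None of these choices come from the
  design above.

```python
from collections import deque


def dilution(parents, head, commit):
    """Return the fewest merged-in (non-first-parent) edges from head to commit.

    parents maps each commit to a list of its parents, first parent first.
    Returns None when commit is not reachable from head.
    """
    best = {head: 0}
    queue = deque([head])
    while queue:
        current = queue.popleft()
        for index, parent in enumerate(parents.get(current, [])):
            # Following the first parent is free; any other parent is a merge.
            cost = best[current] + (0 if index == 0 else 1)
            if parent not in best or cost < best[parent]:
                best[parent] = cost
                if index == 0:
                    queue.appendleft(parent)
                else:
                    queue.append(parent)
    return best.get(commit)


# Hypothetical history: 's' is made on a side branch and merged by 'm'.
#   a - b - m
#        \ /
#         s
parents = {"a": [], "b": ["a"], "s": ["b"], "m": ["b", "s"]}

print(dilution(parents, "m", "b"))  # 0: made on the branch itself
print(dilution(parents, "m", "s"))  # 1: only present through a merge
print(dilution(parents, "s", "m"))  # None: not contained in the side branch
```

  The sketch does not handle the 'creation point' lower limit:
  commit 'b' gets a dilution of 0 for the side branch as well,
  so a separate cut-off would still be needed.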

  We want to create a test repository with a structure
  similar to the one depicted below.

  o-o-o-1-o-o-o-o-o-o-o-o-5-o-o-o-A
     \     /         \       /
      o-2-o-o-4-o-o-o-o-o-o-7-o-o-B
         \   /     \     /
          3-o       o-6-o-o-o-8-o-C

  The main points of interest are the numbered commits.
  This repository will serve as a testing ground for the
  'belongs to a branch' metric.
  1. Commit 1 evidently belongs to branch A, since that is
  the branch it was made on.
  2. Could be seen as belonging to any of branches A, B or
  C, since all of those branches contain it. However, it
  should belong 'most' to branch B, since that is the
  branch it was made on, whereas it was 'only' merged into
  branches A and C.
  3. Belongs 'most' to branch B, since the branch it was
  made on was merged into B and then deleted, whereas
  branches A and C got it through a merge.
  4. A merge commit; it merges commit 3 into branch B and
  is only of interest because merge commits might need
  extra attention.
  5. Belongs to branch A, since that is the only branch it
  exists on. It does occur before a merge with branch B,
  but this should not change its dilution.
  6. Contained in all branches, but should belong most to
  branch C, then to B, and only lastly to branch A.
  7. Interesting because it is a merged commit; it belongs
  to branch B, although branch A also contains it.
  8. A single commit on branch C; it should obviously
  belong only to branch C, with no dilution.
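  The expectations listed above amount to an ordering of
  branches per commit. As a rough sketch, assuming each branch
  has already been given a numeric dilution score for a commit
  (lower means 'belongs more', None means the branch does not
  contain the commit), the ordering could be derived as follows.
  The function name and the scores are made up for
  illustration.

```python
def rank_branches(dilution_by_branch):
    """Order branches from 'belongs most' to 'belongs least'.

    Branches with a score of None do not contain the commit and are left out.
    Ties are broken by branch name, which is an arbitrary choice.
    """
    contained = {
        branch: score
        for branch, score in dilution_by_branch.items()
        if score is not None
    }
    return sorted(contained, key=lambda branch: (contained[branch], branch))


# Hypothetical scores for commit 6: made on C, merged into B, then into A.
print(rank_branches({"A": 2, "B": 1, "C": 0}))        # ['C', 'B', 'A']

# Hypothetical scores for commit 8: only branch C contains it.
print(rank_branches({"A": None, "B": None, "C": 0}))  # ['C']
```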

* If the commit message matches a certain regexp.

  For example, if the commit message contains the phrase
  'this fixes', it is a fix. The regexp can be as simple
  as matching 'fix ' or 'fixes '; there will probably
  be little need for advanced regexp machinery.
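  A minimal sketch of such a check in Python, using the
  standard 're' module. The pattern is an assumption: it uses
  word boundaries rather than a trailing space, so that 'fix'
  at the end of a line still matches while 'prefix' does not.

```python
import re

# Assumed pattern: the whole word 'fix' or 'fixes', in any letter case.
FIX_PATTERN = re.compile(r"\bfix(es)?\b", re.IGNORECASE)


def message_indicates_fix(message):
    """Return True when the commit message mentions 'fix' or 'fixes'."""
    return FIX_PATTERN.search(message) is not None


print(message_indicates_fix("This fixes the crash on startup"))  # True
print(message_indicates_fix("Add a prefix option"))              # False
```

  Note that this pattern deliberately does not match 'fixed';
  whether it should is a policy decision left open here.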

* If the diff contains a specific change, matching a certain regexp.

  For example, in a test suite there might be an indicator
  for known breakages. When this indicator is changed from
  'known breakage' to 'should pass', the commit fixed the
  known breakage.
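  As an illustration, the sketch below scans a unified diff
  for lines where the indicator flips while the rest of the
  line stays the same. The indicator names are an assumption,
  modelled on the 'test_expect_failure' and
  'test_expect_success' helpers of Git's own test suite; the
  function names and the sample diff are hypothetical.

```python
import re

# Assumed indicator spellings; other test suites will use different ones.
KNOWN_BREAKAGE = "test_expect_failure"
SHOULD_PASS = "test_expect_success"


def _indicator_arguments(text, indicator):
    """Return what follows the indicator on this line, or None if absent."""
    match = re.match(r"\s*" + re.escape(indicator) + r"\b(.*)", text)
    return match.group(1).strip() if match else None


def fixed_breakages(diff_text):
    """Return the tests whose indicator changed from known breakage to pass."""
    removed, added = set(), set()
    for line in diff_text.splitlines():
        if line.startswith(("---", "+++")):
            continue  # file header lines, not content changes
        if line.startswith("-"):
            rest = _indicator_arguments(line[1:], KNOWN_BREAKAGE)
            if rest is not None:
                removed.add(rest)
        elif line.startswith("+"):
            rest = _indicator_arguments(line[1:], SHOULD_PASS)
            if rest is not None:
                added.add(rest)
    # A fix is a test that lost the old indicator and gained the new one.
    return sorted(removed & added)


sample_diff = """\
--- a/t/t0001-example.sh
+++ b/t/t0001-example.sh
@@ -10,4 +10,4 @@
-test_expect_failure 'handles empty input' '
+test_expect_success 'handles empty input' '
     run_example </dev/null
 '
"""

print(fixed_breakages(sample_diff))
```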

* If the commit reverts another commit.

  Whenever a commit reverts another one, the reverted
  commit was a mistake, and as such the reverting commit is
  a fix. By first checking which commits touch the same
  files, the number of commits that have to be checked can
  be reduced to a feasible amount.
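  The two steps described above, a cheap file-level filter
  followed by the actual revert check, could be sketched as
  below. The representation of a commit is an assumption made
  for the example: a dict mapping each touched path to a pair
  of tuples (removed lines, added lines).

```python
def touches_same_files(first, second):
    """Cheap pre-filter: do both commits touch exactly the same paths?"""
    return set(first) == set(second)


def is_revert(original, candidate):
    """True when candidate removes what original added and re-adds what it removed."""
    if not touches_same_files(original, candidate):
        return False
    return all(
        candidate[path] == (added, removed)
        for path, (removed, added) in original.items()
    )


# Hypothetical commits: path -> (removed lines, added lines)
original = {"main.c": (("return 0;",), ("return 1;",))}
revert = {"main.c": (("return 1;",), ("return 0;",))}
follow_up = {"main.c": (("return 1;",), ("return 2;",))}
unrelated = {"util.c": ((), ("int x;",))}

print(is_revert(original, revert))     # True
print(is_revert(original, follow_up))  # False
print(is_revert(original, unrelated))  # False
```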

* If a commit touches nearly all the same lines as another commit.

  Commits that are not actual reverts, but that do touch
  nearly all the same lines, are very likely to be fixes.
  However, if the commit is from the same author and was
  made shortly after the first commit, it is likely to be
  part of the same 'patch series' and as such should not be
  treated as a fix.
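  A rough sketch of this rule follows. The 'Commit' record,
  the overlap threshold and the time window are all
  hypothetical values picked for the example, and the sketch
  ignores that line numbers shift between commits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str
    timestamp: int    # seconds since the epoch
    lines: frozenset  # (path, line number) pairs touched by the commit


OVERLAP_THRESHOLD = 0.9  # assumed meaning of 'nearly all'
SERIES_WINDOW = 3600     # assumed meaning of 'shortly after', in seconds


def overlap(first, second):
    """Fraction of the lines touched by second that first touched as well."""
    if not second.lines:
        return 0.0
    return len(first.lines & second.lines) / len(second.lines)


def looks_like_fix(first, second):
    """Apply the overlap rule, exempting probable patch series."""
    if overlap(first, second) < OVERLAP_THRESHOLD:
        return False
    same_series = (
        second.author == first.author
        and 0 <= second.timestamp - first.timestamp <= SERIES_WINDOW
    )
    return not same_series


touched = frozenset({("a.c", 1), ("a.c", 2)})
first = Commit("alice", 1000, touched)
quick = Commit("alice", 1600, touched)   # same author, ten minutes later
later = Commit("bob", 90000, touched)    # different author, a day later

print(looks_like_fix(first, quick))  # False: probably the same patch series
print(looks_like_fix(first, later))  # True
```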