#!/usr/bin/perl -- -*- mode: cperl -*-

=head1 NAME

ctgetreports - Quickly fetch cpantesters results with all reports

=head1 SYNOPSIS

  ctgetreports [options] distroname ...
  ctgetreports [options] --report number ...
  ctgetreports -h

=head1 OPTIONS

=over 8

=cut

my $optpod = <<'=back';

=item B<--cachedir=s>

Directory to keep mirrored data in. Defaults to C<$HOME/var/cpantesters>.

=item B<--ctformat=s>

Format of the cpan-testers file that should be downloaded. Available
options are C<html> and C<yaml>. The default should be html but is
temporarily switched to yaml because the cpantesters site is under
reconstruction.

=item B<--cturl=s>

Base URL of the cpantesters website. Defaults to
C<http://www.cpantesters.org/show>, but it is sometimes useful to set
it to the old URL, C<http://cpantesters.perl.org/show>, for example to
diagnose bugs.

=item B<--dumpfile=s>

If dumpvars are specified, dump them into this file. Defaults to "ctgetreports.out".

=item B<--dumpvars=s>

Dump all queryable variables matching the regular expression given as
argument at the end of the loop for a distro.

=item B<--help|h>

Prints a brief message and exits.

=item B<--interactive|i>

After every parsed report, asks if you want to see it in a pager.

=item B<--local>

Do not mirror, use a local *.html file. Dies if the HTML or YAML file
is missing, skips missing report files.

=item B<--pager=s>

Pager (needed when -i is given). Defaults to C<less>.

=item B<--q=s@>

Query, may be repeated.

Example: C<--q mod:Clone --q meta:writer>

=item B<--quiet!>

Do not output the usual query lines per parsed report. Quiet
overrules verbose.

=item B<--raw!>

Boolean which, if set, causes the full (HTML) report to be
printed to STDOUT after every status line.

=item B<--report=s@>

Avoid going through a cpan testers index, go straight to the report
with this number.

Example: C<--report 1238673>

=item B<--solve!>

Calls the solve function, which tries to identify the best contenders
for a blame using Statistics::Regression. Currently limited to
single variables and simple heuristics. Implies C<--dumpvars=.>
unless the caller sets dumpvars explicitly.

The function currently prints to STDOUT the top 3 (set with
C<--solvetop>) candidates according to R^2 with their regression
analysis.

A few words of advice: never take the results as proof. Take
them just as a hint where you can most probably prove a causal
relationship. And keep in mind that causal relationships can run in
the other direction as well.

If you want to extend on that approach, I recommend you study the
ctgetreports.out file where you find all the data you'd need and feed
your assumptions to Statistics::Regression.

=item B<--solvetop=i>

The number of top candidates from the C<--solve> regression analysis
to display.

=item B<--vdistro=s>

Versioned distro. Needed if we do not want the most recent. Makes no
sense if there is more than one argument on the command line.

Example: C<--vdistro IPC-Run-0.80>

=item B<--verbose|v+>

Feedback during download.

=item B<--ycb=s>

Only used during --solve. Provides perl code to be used as a callback
from the regression to determine the B<Y> of the regression equation.
The callback function gets a record (hashref) as the only argument and
must return a value or undef. If it returns undef, the record
is skipped, otherwise this record is processed with the returned
value. The callback is pure perl code without any surrounding sub
declaration.

The following example analyses diagnostic output from Acme-Study-Perl:

  ctgetreports --q qr:"#(.*native big math float/int.*)" --solve \
    --ycb 'my $rec = shift;
  my $nbfi = $rec->{"qr:#(.*native big math float/int.*)"};
  return undef unless defined $nbfi;
  my $VAR1 = eval($nbfi);
  return $VAR1->{">"}' Acme-Study-Perl

=back
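
=pod

The interplay of C<--solve> and C<--ycb> described in the list above can
be sketched in plain Perl. This is only an illustration under stated
assumptions, not the code used by CPAN::Testers::ParseReport: the helper
names C<compile_ycb> and C<r_squared>, the sample records and the
C<result> key are made up, and R^2 is computed by hand as the squared
correlation instead of calling Statistics::Regression.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a --ycb style code body (no surrounding
# sub declaration) in an anonymous sub via string eval.
sub compile_ycb {
    my ($body) = @_;
    my $cb = eval "sub { $body }";
    die "could not compile callback: $@" unless $cb;
    return $cb;
}

# Hypothetical helper: R^2 of a least-squares line through the
# (x, y) pairs, computed as the squared correlation coefficient.
sub r_squared {
    my ($xs, $ys) = @_;
    my $n = @$xs;
    return undef if $n < 2;
    my ($mean_x, $mean_y) = (0, 0);
    $mean_x += $_ / $n for @$xs;
    $mean_y += $_ / $n for @$ys;
    my ($sxx, $syy, $sxy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my ($dx, $dy) = ($xs->[$i] - $mean_x, $ys->[$i] - $mean_y);
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
        $sxy += $dx * $dy;
    }
    return undef if $sxx == 0 || $syy == 0;    # a constant explains nothing
    return $sxy**2 / ($sxx * $syy);
}

# Made-up records; the "result" key is an assumption for this sketch.
my @records = (
    { "conf:ivsize" => 8, "conf:usethreads" => 1, result => "PASS" },
    { "conf:ivsize" => 8, "conf:usethreads" => 0, result => "PASS" },
    { "conf:ivsize" => 4, "conf:usethreads" => 1, result => "FAIL" },
    { "conf:ivsize" => 4, "conf:usethreads" => 0, result => "FAIL" },
    { "conf:ivsize" => 8, "conf:usethreads" => 1, result => "UNKNOWN" },
);

# The callback body maps a record to Y, or to undef to skip the record.
my $ycb = compile_ycb(q{
    my $rec = shift;
    return undef if $rec->{result} eq "UNKNOWN";
    return $rec->{result} eq "PASS" ? 1 : 0;
});

# One single-variable regression per candidate variable.
my %r2;
for my $var ("conf:ivsize", "conf:usethreads") {
    my (@x, @y);
    for my $rec (@records) {
        my $y = $ycb->($rec);
        next unless defined $y;    # undef means: skip this record
        push @x, $rec->{$var};
        push @y, $y;
    }
    my $r2 = r_squared(\@x, \@y);
    $r2{$var} = $r2 if defined $r2;
}

# Best candidates first, as a ranking by R^2 would list them.
for my $var (sort { $r2{$b} <=> $r2{$a} } keys %r2) {
    printf "%-16s R^2 = %.2f\n", $var, $r2{$var};
}
```

With these made-up records, C<conf:ivsize> separates PASS from FAIL
completely while C<conf:usethreads> carries no information, so the
former ranks first. As noted for C<--solve>, such a ranking is a hint,
not a proof.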

=head1 DESCRIPTION

!!!!Alert: alpha quality software, subject to change without warning!!!!

The intent is to get at both the summary at cpantesters and the
individual reports, parse the reports, and collect the data for
further inspection.

We only ever fetch the reports for the most recent (optionally
picked) release. Target root directory is C<$HOME/var/cpantesters>
(can be overridden with the --cachedir option).

The C<--q> parameter can be repeated. It takes one argument which
stands for a query. This query must consist of two parts, a qualifier
and the query itself. Qualifiers are one of the following:

  conf    parameters from the output of 'perl -V'
          e.g.: conf:usethreads, conf:cc
  mod     for installed modules, either from prerequisites or from the toolchain
          e.g.: mod:Test::Simple, mod:Imager
  env     environment variables
          e.g.: env:TERM
  meta    all other parameters
          e.g.: meta:perl, meta:from, meta:date, meta:writer
  qr      boolean set if the appended regexp matches the report
          e.g.: qr:'division by zero'

The conf parameters specify a word used by the C<Config> module.

The mod parameters consist of a package name.

The meta parameters are the following: C<perl> for the perl version,
C<from> for the sender of the report, C<date> for the date in the mail
header, C<writer> for the module that produced the report,
C<output_from> for the line that is reported to have produced the output.
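
The two-part shape of a query can be illustrated with a small sketch.
It assumes that the qualifier is everything before the first colon. The
helper name C<split_query> is hypothetical and is not taken from
CPAN::Testers::ParseReport.

```perl
use strict;
use warnings;

# The qualifiers listed in the table above.
my %known = map { $_ => 1 } qw(conf mod env meta qr);

# Hypothetical helper: split "qualifier:query" at the first colon only,
# so that package names such as Test::Simple stay intact.
sub split_query {
    my ($q) = @_;
    my ($qualifier, $query) = split /:/, $q, 2;
    die "query '$q' needs the form qualifier:query\n"
        unless defined $query && length $query;
    die "unknown qualifier '$qualifier'\n" unless $known{$qualifier};
    return ($qualifier, $query);
}

for my $q ("mod:Test::Simple", "conf:usethreads", "qr:division by zero") {
    my ($qualifier, $query) = split_query($q);
    print "$qualifier => $query\n";
}
```

Limiting the split to two fields is what keeps the double colons of a
package name inside the query part.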

=head2 Examples

This gets all recent reports for Object-Relation and outputs the
version number of the prerequisite Clone:

  $0 --q mod:Clone Object-Relation

Collects reports about Clone and reports the default set of metadata:

  $0 Clone

Collect reports for Devel-Events and report the version number of
Moose in these reports and sort by success/failure. If Moose broke
Devel-Events, it becomes pretty obvious:

  $0 --q mod:Moose Devel-Events |sort

Which tool was used to write how many reports, sorted by frequency:

  $0 --q meta:writer Template-Timer | sed -e 's/.*meta:writer//' | sort | uniq -c | sort -n
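
The counting done by the shell pipeline above can also be written in
Perl. The sketch below assumes that each relevant output line contains
C<meta:writer> followed by the writer's name. The sample lines are
invented, and the real output format of ctgetreports may differ.

```perl
use strict;
use warnings;

# Invented sample lines; the real ctgetreports output may look different.
my @lines = (
    "r1 meta:writer CPAN-Reporter",
    "r2 meta:writer Test-Reporter",
    "r3 meta:writer CPAN-Reporter",
);

my %count;
for my $line (@lines) {
    # Unlike the sed call, ignore lines that lack the marker.
    next unless $line =~ /meta:writer/;
    # Strip everything up to and including "meta:writer", as sed does.
    (my $writer = $line) =~ s/.*meta:writer//;
    $count{$writer}++;
}

# Ascending by frequency, like "sort | uniq -c | sort -n".
for my $writer (sort { $count{$a} <=> $count{$b} || $a cmp $b } keys %count) {
    printf "%7d%s\n", $count{$writer}, $writer;
}
```

In a real run the lines would be read from the output of the command
instead of from a fixed list.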

Who was in the From field of the mails whose report writer was not determined:

  $0 --q meta:writer --q meta:from Template-Timer | grep 'UNDEF'

At the time of this writing this collected the results of
IPC-Run-0.80_91 which was not really the latest release. In this case
manual investigations were necessary to find out that 0.80 was the
most recent:

  $0 IPC-Run

Pick the specific release IPC-Run-0.80:

  $0 --vdistro IPC-Run-0.80 IPC-Run

The following displays in its own column whether the report contains
the regexp C<division by zero>:

  $0 --q qr:"division by zero" --vdistro 'CPAN-Testers-ParseReport-0.0.7' CPAN-Testers-ParseReport

The following is a simple job to refresh all HTML pages we already
have and fetch new reports referenced there too:

  perl -le '
  for my $dirent (glob "$ENV{HOME}/var/cpantesters/cpantesters-show/*.html"){
    my($distro) = $dirent =~ m|/([^/]+)\.html$| or next;
    print $distro;
    my $system = "ctgetreports --verbose --verbose $distro";
    0 == system $system or die;
  }'

=cut

use strict;
use warnings;

use CPAN::Testers::ParseReport;
use Getopt::Long;
use Pod::Usage qw(pod2usage);

our %Opt;
# Derive the Getopt::Long specs from the =item lines of the OPTIONS pod;
# options without an explicit type become negatable booleans.
my @opt = $optpod =~ /B<--(\S+)>/g;
for (@opt) {
    $_ .= "!" unless /[+!=]/;
}

GetOptions(\%Opt,
           @opt,
          ) or pod2usage(2);

if ($Opt{help}) {
    pod2usage(0);
}

# --report excludes distro arguments; without it at least one is required.
if ($Opt{report}) {
    if (@ARGV) {
        pod2usage(2);
    }
} else {
    if (! @ARGV) {
        pod2usage(2);
    }
}

if ($Opt{solve}) {
    eval { require Statistics::Regression };
    if ($@) {
        die "Statistics::Regression required for solve option: $@";
    }
}

if ($Opt{dumpvars}) {
    eval { require YAML::Syck };
    if ($@) {
        die "YAML::Syck required for dumpvars option: $@";
    }
}

$|=1;
if (my $reports = delete $Opt{report}) {
 REPORT: for my $report (@$reports) {
        CPAN::Testers::ParseReport::parse_single_report({id => $report},+{},%Opt);
        last REPORT if $CPAN::Testers::ParseReport::Signal;
    }
} else {
 DISTRO: for my $distro (@ARGV) {
        CPAN::Testers::ParseReport::parse_distro($distro,%Opt);
        last DISTRO if $CPAN::Testers::ParseReport::Signal;
    }
}

__END__