swallow everything up to a slash in ARGV[0] for convenient copy&pasting of distronames
#!/usr/bin/perl -- -*- mode: cperl -*-

=head1 NAME

ctgetreports - Quickly fetch cpantesters results with all reports

=head1 SYNOPSIS

  ctgetreports [options] distroname
  ctgetreports [options] --report number ...
  ctgetreports -h

=head1 OPTIONS

A distroname is either unversioned as in C<IPC-Run> or versioned as in
C<IPC-Run-0.80>.

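As a rough illustration of the two forms, the following sketch splits a
distroname into its name and optional version. The helper
C<split_distroname> and its regular expression are hypothetical and are
not taken from ctgetreports itself, which may recognize versions
differently.

```perl

  use strict;
  use warnings;

  # Hypothetical helper, not part of ctgetreports: split a distroname
  # into (name, version). The version is undef for unversioned names.
  sub split_distroname {
    my ($distroname) = @_;
    # Assumption: a version is a trailing "-" followed by an optional "v",
    # a digit, and then any run of digits, dots or underscores.
    if ($distroname =~ /^(.+)-(v?\d[\d._]*)$/) {
      return ($1, $2);
    }
    return ($distroname, undef);
  }

  for my $distroname (qw(IPC-Run IPC-Run-0.80)) {
    my ($name, $version) = split_distroname($distroname);
    printf "%s => name=%s version=%s\n",
      $distroname, $name, defined $version ? $version : "(none)";
  }

```
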
=over 2

=cut

my $optpod = <<'=back';

=item B<--cachedir=s>

Directory to keep mirrored data in. Defaults to C<$HOME/var/cpantesters>.

=item B<--ctformat=s>

Format of the cpan-testers file that should be downloaded. Available
options were originally C<html> and C<yaml>. With major construction
work going on, the HTML on cpantesters is unsupported for the time
being.

=item B<--cturl=s>

Base URL of the cpantesters website. Defaults to
C<http://www.cpantesters.org/show>, but sometimes it is interesting to
set it to the old URL, C<http://cpantesters.perl.org/show>, for example
to diagnose bugs.

=item B<--dumpfile=s>

If dumpvars are specified, dump them into this file. Defaults to "ctgetreports.out".

=item B<--dumpvars=s>

Dump all queryable variables matching the regular expression given as
argument at the end of the loop for a distro.

=item B<--help|h>

Prints a brief message and exits.

=item B<--interactive|i>

After every parsed report, asks if you want to see it in a pager.

=item B<--local>

Do not mirror, use a local *.html file. Dies if the HTML or YAML file
is missing, skips missing report files.

=item B<--pager=s>

Pager (needed when -i is given). Defaults to C<less>.

=item B<--q=s@>

Query, may be repeated.

Example: C<--q mod:Clone --q meta:writer>

=item B<--quiet!>

Do not output the usual query lines per parsed report. Quiet
overrules verbose.

=item B<--raw!>

Boolean which, if set, causes the full (HTML) report to be
written to STDOUT after every status line.

=item B<--report=s@>

Avoid going through a cpan testers index, go straight to the report
with this number.

Example: C<--report 1238673>

=item B<--solve!>

Calls the solve function, which tries to identify the most likely
culprits using Statistics::Regression. Currently limited to single
variables and simple heuristics. Implies C<--dumpvars=.> unless the
caller sets dumpvars explicitly.

The function currently prints to STDOUT the top 3 (set with
C<--solvetop>) candidates according to R^2 with their regression
analysis.

A few words of advice: never take the results as proof. Take them
only as a hint about where you can most probably prove a causal
relationship. And keep in mind that causal relationships can run in
the other direction as well.

If you want to extend that approach, I recommend you study the
ctgetreports.out file, where you find all the data you'd need, and
feed your assumptions to Statistics::Regression.

=item B<--solvetop=i>

The number of top candidates from the C<--solve> regression analysis
to display.

=item B<--transport=s>

Specifies the transport used to get the reports. C<nntp> uses
Net::NNTP, C<http> uses LWP::UserAgent. Defaults to C<nntp>.

=item B<--vdistro=s>

Versioned distro, e.g.

  IPC-Run-0.80

Usually not needed because a versioned distro name can be specified as
a normal command-line argument.

=item B<--verbose|v+>

Feedback during download.

=item B<--ycb=s>

Only used during --solve. Provides perl code to be used as a callback
from the regression to determine the B<Y> of the regression equation.
The callback function gets a record (hashref) as the only argument and
must return a value or undefined. If it returns undefined, the record
is skipped, otherwise this record is processed with the returned
value. The callback is pure perl code without any surrounding sub
declaration.

The following example analyses diagnostic output from Acme-Study-Perl:

  ctgetreports --q qr:"#(.*native big math float/int.*)" --solve \
    --ycb 'my $rec = shift;
    my $nbfi = $rec->{"qr:#(.*native big math float/int.*)"};
    return undef unless defined $nbfi;
    my $VAR1 = eval($nbfi);
    return $VAR1->{">"}' Acme-Study-Perl

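To illustrate what a callback "without any surrounding sub declaration"
means, here is a minimal sketch of how such a code string could be
turned into a callable. Wrapping the string in C<sub { ... }> and
compiling it with a string C<eval> is an assumption made for this
sketch; the records and the key C<meta:ok> are made up, and the actual
implementation in CPAN::Testers::ParseReport may differ.

```perl

  use strict;
  use warnings;

  # The bare callback body as it would be passed on the command line.
  # (The record layout and the key "meta:ok" are made up for this sketch.)
  my $ycb_code = 'my $rec = shift;
    return undef unless defined $rec->{"meta:ok"};
    return $rec->{"meta:ok"} eq "PASS" ? 1 : 0;';

  # Assumption: wrap the body in an anonymous sub and compile it once.
  my $ycb = eval "sub { $ycb_code }";
  die "could not compile callback: $@" unless $ycb;

  my @records = (
    { "meta:ok"   => "PASS" },
    { "meta:ok"   => "FAIL" },
    { "meta:from" => "nobody" },   # no meta:ok, so this one is skipped
  );

  my @y;
  for my $rec (@records) {
    my $y = $ycb->($rec);
    next unless defined $y;        # undefined means: skip the record
    push @y, $y;
  }
  print "Y values: @y\n";

```
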
=back

=head1 DESCRIPTION

!!!!Alert: alpha quality software, subject to change without warning!!!!

The intent is to fetch both the summary at cpantesters and the
individual reports, parse the reports, and collect the data for
further inspection.

We always fetch the reports for one release only: the most recent
one, unless a specific release is picked. Target root directory is
C<$HOME/var/cpantesters> (can be overridden with the --cachedir option).

The C<--q> parameter can be repeated. It takes one argument, which
stands for a query. This query must consist of two parts, a qualifier
and the query itself. Qualifiers are one of the following:

  conf   parameters from the output of 'perl -V'
         e.g.: conf:usethreads, conf:cc
  mod    for installed modules, either from prerequisites or from the toolchain
         e.g.: mod:Test::Simple, mod:Imager
  env    environment variables
         e.g.: env:TERM
  meta   all other parameters
         e.g.: meta:perl, meta:from, meta:date, meta:writer
  qr     boolean set if the appended regexp matches the report
         e.g.: qr:'division by zero'

The conf parameters specify a word used by the C<Config> module.

The mod parameters consist of a package name.

The meta parameters are the following: C<perl> for the perl version,
C<from> for the sender of the report, C<date> for the date in the mail
header, C<writer> for the module that produced the report,
C<output_from> for the line that is reported to have produced the output.

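The qualifier/query structure described above can be sketched as
follows. The helper C<parse_query> is hypothetical and only illustrates
splitting at the first colon; the parsing inside
CPAN::Testers::ParseReport may be implemented differently.

```perl

  use strict;
  use warnings;

  # Qualifiers as listed in the table above.
  my %known_qualifier = map { $_ => 1 } qw(conf mod env meta qr);

  # Hypothetical helper: split "qualifier:query" at the first colon only,
  # so that queries such as mod:Test::Simple keep their own colons.
  sub parse_query {
    my ($q) = @_;
    my ($qualifier, $query) = split /:/, $q, 2;
    die "query '$q' needs the form qualifier:query\n"
      unless defined $query && length $query;
    die "unknown qualifier '$qualifier' in '$q'\n"
      unless $known_qualifier{$qualifier};
    return ($qualifier, $query);
  }

  for my $q ("mod:Test::Simple", "conf:usethreads", "qr:division by zero") {
    my ($qualifier, $query) = parse_query($q);
    print "$qualifier => $query\n";
  }

```
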
=head2 Examples

This gets all recent reports for Object-Relation and outputs the
version number of the prerequisite Clone:

  $0 --q mod:Clone Object-Relation

Collects reports about Clone and reports the default set of metadata:

  $0 Clone

Collect reports for Devel-Events, report the version number of
Moose in these reports, and sort by success/failure. If Moose broke
Devel-Events, this becomes pretty obvious:

  $0 --q mod:Moose Devel-Events |sort

Which tool was used to write how many reports, sorted by frequency:

  $0 --q meta:writer Template-Timer | sed -e 's/.*meta:writer//' | sort | uniq -c | sort -n

Who was in the From field of the mails whose report writer was not determined:

  $0 --q meta:writer --q meta:from Template-Timer | grep 'UNDEF'

At the time of this writing, this collected the results of
IPC-Run-0.80_91, which was not really the latest release. In this case
manual investigations were necessary to find out that 0.80 was the
most recent:

  $0 IPC-Run

Pick the specific release IPC-Run-0.80:

  $0 IPC-Run-0.80

The following displays in its own column if the report contains the
regexp C<division by zero>:

  $0 --q qr:"division by zero" CPAN-Testers-ParseReport-0.0.7

The following is a simple job to refresh all HTML pages we already
have and fetch new reports referenced there too:

  perl -le '
  for my $dirent (glob "$ENV{HOME}/var/cpantesters/cpantesters-show/*.html"){
    my($distro) = $dirent =~ m|/([^/]+)\.html$| or next;
    print $distro;
    my $system = "ctgetreports --verbose --verbose $distro";
    0 == system $system or die;
  }
  '

=cut

use strict;
use warnings;

use CPAN::Testers::ParseReport;
use Getopt::Long;
use Pod::Usage qw(pod2usage);

our %Opt;
# Derive the Getopt::Long specifications from the option items in $optpod.
my @opt = $optpod =~ /B<--(\S+)>/g;
for (@opt) {
  # Options without a type or modifier become negatable booleans.
  $_ .= "!" unless /[+!=]/;
}

GetOptions(\%Opt,
           @opt,
          ) or pod2usage(2);

if ($Opt{help}) {
  pod2usage(0);
}

# With --report no distroname is allowed; otherwise exactly one is required.
if ($Opt{report}) {
  if (@ARGV) {
    pod2usage(2);
  }
} else {
  if (1 != @ARGV) {
    pod2usage(2);
  }
}

if ($Opt{solve}) {
  eval { require Statistics::Regression };
  if ($@) {
    die "Statistics::Regression required for solve option: $@";
  }
}

if ($Opt{dumpvars}) {
  eval { require YAML::Syck };
  if ($@) {
    die "YAML::Syck required for dumpvars option: $@";
  }
}

$|=1;
if (my $reports = delete $Opt{report}) {
  my $dumpvars = {};
 REPORT: for my $report (@$reports) {
    CPAN::Testers::ParseReport::parse_single_report({id => $report},$dumpvars,%Opt);
    last REPORT if $CPAN::Testers::ParseReport::Signal;
  }
  my $dumpfile = $Opt{dumpfile} || "ctgetreports.out";
  YAML::Syck::DumpFile($dumpfile,$dumpvars);
} else {
  # Swallow everything up to the last slash so that pasted paths work
  # as distronames.
  $ARGV[0] =~ s|.+/||;
  CPAN::Testers::ParseReport::parse_distro($ARGV[0],%Opt);
}

__END__
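
The script above builds its Getopt::Long specification from the option
items of its own documentation, so the two cannot drift apart. The
following is a small self-contained sketch of that idea. The three-option
POD snippet and the sample argument list are made up for illustration, and
GetOptionsFromArray is used here only so that the sketch does not touch
the real @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Made-up option documentation for this sketch; the real script keeps
# its full option list in the heredoc assigned to $optpod.
my $optpod = join "\n",
  '=item B<--cachedir=s>', '',
  '=item B<--local>', '',
  '=item B<--verbose|v+>', '';

# Every option item yields one Getopt::Long specification.
my @opt = $optpod =~ /B<--(\S+)>/g;
for (@opt) {
  # Specs without a type or modifier become negatable booleans.
  $_ .= "!" unless /[+!=]/;
}

# Parse a sample argument list instead of the real @ARGV.
my %Opt;
my @args = qw(--cachedir /tmp/ct --nolocal -v -v Clone);
GetOptionsFromArray(\@args, \%Opt, @opt) or die "bad options";

print join(" ", @opt), "\n";
print "$_=$Opt{$_}\n" for sort keys %Opt;
print "remaining: @args\n";
```

Because a plain item such as --local gets the "!" modifier appended, it
also accepts the negated form --nolocal without any extra documentation.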