#!/usr/bin/perl -w
# Copyright (C) 2004,2005,2007 Olly Betts
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
# USA

use strict;

if (grep {$_ eq '--help'} @ARGV) {
    die <<EOT;
Syntax: $0 [MBOX...]

Run this script with one or more mailbox filenames on the command line (or
pipe a mailbox in on stdin).  It produces output suitable for feeding to
scriptindex using the mbox2omega.script index script.  For example:

    $0 *.mbox | scriptindex /path/to/database mbox2omega.script

The index script tells scriptindex how to process the dump file, so you can
customise that to change how the indexing is done.

Note that this script is mainly intended as a simple example of how you might
generate scriptindex dump files from a data source, and its handling of mail
messages is quite primitive - e.g. it doesn't handle MIME or character sets.
EOT
}

# True while we are reading the headers of a message.
my $hdr = 1;
line: while (<>) {
    if ($hdr) {
        chomp;
        while (1) {
            # A blank line ends the headers, so start the body field.
            if (/^$/) {
                print "body=\n";
                $hdr = 0;
                next line;
            }
            # Handle continuation lines
            my $line = $_;
            while (<>) {
                chomp;
                last unless /^[ \t]/;
                $line .= $_;
            }
            # Only the Message-ID and Subject headers are output.
            if ($line =~ s/^Message-ID:\s*<?(.*?)>?\s*$/$1/i) {
                print "id=$line\n" if length $line;
            } elsif ($line =~ s/^Subject:\s*(.*?)\s*$/$1/i) {
                print "title=$line\n" if length $line;
            }
        }
    }

    # A "From " line starts the next message, so end the current record.
    if (/^From /) {
        print "\n";
        $hdr = 1;
        next;
    }
    # Output body lines as continuation lines, skipping whitespace-only ones.
    if ($_ !~ /^\s+$/) {
        print "=$_";
    }
}