6 use vars qw($VERSION @ISA @EXPORT);
12 @EXPORT = qw(read_log_records write_log_records);
17 Debbugs::Log - an interface to debbugs .log files
21 The Debbugs::Log module provides a convenient way for scripts to read and
22 write the .log files used by debbugs to store the complete textual records
23 of all bug transactions.
25 =head2 The .log File Format
27 .log files consist of a sequence of records, of one of the following four
28 types. ^A, ^B, etc. represent those control characters.
38 C<[mail]> must start with /^Received: \(at \S+\) by \S+;/, and is copied to
43 Auto-forwarded messages are recorded like this:
49 C<[mail]> must contain /^X-Debian-Bugs(-\w+)?: This is an autoforward from
50 \S+/. The first line matching that is removed; all lines in the message body
51 that begin with 'X' will be copied to the output, minus the 'X'.
53 Nothing in debbugs actually generates this record type any more, but it may
54 still be in old .logs at some sites.
59 [recip]^D[recip]^D[...] OR -t
64 Each [recip] is output after "Message sent"; C<-t> represents the same
65 sendmail option, indicating that the recipients are taken from the headers
66 of the message itself.
74 [html] is copied unescaped to the output. The record immediately following
75 this one is considered "boring" and only shown in certain output modes.
77 (This is a design flaw in the log format, since it makes it difficult to
78 change the HTML presentation later, or to present the data in an entirely
83 No other types of records are permitted, and the file must end with a ^C
86 =head2 Perl Record Representation
88 Each record is a hash. The C<type> field is C<incoming-recv>, C<autocheck>,
89 C<recips>, or C<html> as above; C<text> contains text from C<[mail]> or
90 C<[html]> as above; C<recips> is a reference to an array of recipients
91 (strings), or undef for C<-t>.
97 =item read_log_records
99 Takes a .log filehandle as input, and returns an array of all records in
100 that file. Throws exceptions using die(), so you may want to wrap this in an
105 sub read_log_records (*)
109 # This comes from bugreport.cgi, but is much simpler since it doesn't
110 # worry about the details of output.
118 7 => 'incoming-recv',
123 my $normstate = 'kill-init';
127 while (defined (my $line = <$logfh>)) {
130 if (length($line) == 1 and exists $states{ord($line)}) {
132 my $newstate = $states{ord($line)};
134 # disallowed transitions
135 $_ = "$normstate $newstate";
136 unless (/^(go|go-nox|html) kill-end$/ or
137 /^(kill-init|kill-end) (incoming-recv|autocheck|recips|html)$/ or
139 die "transition from $normstate to $newstate at $linenum disallowed";
142 if ($newstate =~ /^(autocheck|recips|html|incoming-recv)$/) {
143 $record->{type} = $newstate;
144 } elsif ($newstate eq 'kill-end') {
145 push @records, $record;
149 $normstate = $newstate;
154 if ($normstate eq 'incoming-recv') {
156 unless (/^Received: \(at \S+\) by \S+;/) {
157 die "bad line '$pl' in state incoming-recv";
160 $record->{text} .= "$_\n";
161 } elsif ($normstate eq 'html') {
162 $record->{text} .= "$_\n";
163 } elsif ($normstate eq 'go') {
165 $record->{text} .= "$_\n";
166 } elsif ($normstate eq 'go-nox') {
167 $record->{text} .= "$_\n";
168 } elsif ($normstate eq 'recips') {
170 undef $record->{recips};
172 # preserve trailing null fields, e.g. #2298
173 $record->{recips} = [split /\04/, $_, -1];
175 $normstate = 'kill-body';
176 } elsif ($normstate eq 'autocheck') {
177 $record->{text} .= "$_\n";
178 next if !/^X-Debian-Bugs(-\w+)?: This is an autoforward from (\S+)/;
179 $normstate = 'autowait';
180 } elsif ($normstate eq 'autowait') {
181 $record->{text} .= "$_\n";
183 $normstate = 'go-nox';
185 die "state $normstate at line $linenum ('$_')";
188 die "state $normstate at end" unless $normstate eq 'kill-end';
193 =item write_log_records
195 Takes a filehandle and a list of records as input, and prints the .log
196 format representation of those records to that filehandle.
200 sub write_log_records (*@)
205 for my $record (@records) {
206 my $type = $record->{type};
207 my $text = $record->{text};
208 die "type '$type' with no text field" unless defined $text;
209 if ($type eq 'autocheck') {
210 print $logfh "\01\n$text\03\n";
211 } elsif ($type eq 'recips') {
212 print $logfh "\02\n";
213 my $recips = $record->{recips};
214 if (defined $recips) {
215 die "recips not undef or array"
216 unless ref($recips) eq 'ARRAY';
217 print $logfh join("\04", @$recips) . "\n";
221 $text =~ s/^([\01-\07\030])/\030$1/gm;
222 print $logfh "\05\n$text\03\n";
223 } elsif ($type eq 'html') {
224 print $logfh "\06\n$text\03\n";
225 } elsif ($type eq 'incoming-recv') {
226 $text =~ s/^([\01-\07\030])/\030$1/gm;
227 print $logfh "\07\n$text\03\n";
229 die "unknown type '$type'";
240 This module does none of the formatting that bugreport.cgi et al do. It's
241 simply a means for extracting and rewriting raw records.