## 2021 xx xx
+ - Added a new option '--code-skipping', requested in git #65, in which code
+ between comment lines '#<<V' and '#>>V' is passed verbatim to the output
+ stream without error checking. It is similar to --format-skipping,
+ except that the skipped code is not checked for errors, which makes it
+ useful for skipping an extended syntax.
+
- Added a new option for closing paren placement, -vtc=3, requested in rt #136417.
- Some nested structures formatted with the -lp indentation option may have
=back
-
=back
=head2 Skipping Selected Sections of Code
Selected lines of code may be passed verbatim to the output without any
-formatting. This feature is enabled by default but can be disabled with
-the B<--noformat-skipping> or B<-nfs> flag. It should be used sparingly to
-avoid littering code with markers, but it might be helpful for working
-around occasional problems. For example it might be useful for keeping
-the indentation of old commented code unchanged, keeping indentation of
-long blocks of aligned comments unchanged, keeping certain list
-formatting unchanged, or working around a glitch in perltidy.
-
-=over 4
+formatting by marking the starting and ending lines with special comments.
+There are two options for doing this. The first option is called
+B<--format-skipping> or B<-fs>, and the second option is called
+B<--code-skipping> or B<-cs>.
-=item B<-fs>, B<--format-skipping>
+In both cases the lines of code will be output without any changes.
+The difference is that in B<--format-skipping>
+perltidy will still parse the marked lines of code and check for errors,
+whereas in B<--code-skipping> perltidy will simply pass the lines to the
+output without any checking.
-This flag, which is enabled by default, causes any code between
-special beginning and ending comment markers to be passed to the
-output without formatting. The default beginning marker is #<<<
-and the default ending marker is #>>> but they
-may be changed (see next items below). Additional text may appear on
-these special comment lines provided that it is separated from the
-marker by at least one space. For example
+Both of these features are enabled by default and are invoked with special
+comment markers. B<--format-skipping> uses starting and ending markers '#<<<'
+and '#>>>', like this:
- #<<< do not let perltidy touch this
+ #<<< format skipping: do not let perltidy change my nice formatting
my @list = (1,
1, 1,
1, 2, 1,
1, 4, 6, 4, 1,);
#>>>
-Format skipping begins when a format skipping comment is seen and continues
-until either a format-skipping end pattern is found or until the end of file.
+B<--code-skipping> uses starting and ending markers '#<<V' and '#>>V', like
+this:
+
+ #<<V code skipping: perltidy will pass this verbatim without error checking
+
+ token ident_digit {
+ [ [ <?word> | _ | <?digit> ] <?ident_digit>
+ | <''>
+ ]
+ };
+
+ #>>V
+
+Additional text may appear on the special comment lines provided that it
+is separated from the marker by at least one space, as in the above examples.
+
+It is recommended to use B<--code-skipping> only if you need to hide a block of
+an extended syntax which would produce errors if parsed by perltidy, and use
+B<--format-skipping> otherwise. This is because the B<--format-skipping>
+option provides the benefits of error checking, and there are essentially no
+limitations on the lines to which it can be applied. The B<--code-skipping>
+option, on the other hand, does not do error checking and its use is more
+restrictive because the code which remains, after skipping the marked lines,
+must be syntactically correct code with balanced containers.
+
+These features should be used sparingly to avoid littering code with markers,
+but they can be helpful for working around occasional problems.
+
+Note that it may be possible to avoid the use of B<--format-skipping> for the
+specific case of a comma-separated list of values, as in the above example, by
+simply inserting a blank or comment somewhere between the opening and closing
+parens. See the section L<Controlling List Formatting>.
+
+The following sections describe the available controls for these options. They
+should not normally be needed.
-The comment markers may be placed at any location that a block comment may
-appear. If they do not appear to be working, use the -log flag and examine the
-F<.LOG> file. Use B<-nfs> to disable this feature.
+=over 4
+
+=item B<-fs>, B<--format-skipping>
+
+As explained above, this flag, which is enabled by default, causes any code
+between special beginning and ending comment markers to be passed to the output
+without formatting. The code between the comments is still checked for errors,
+however. The default beginning marker is #<<< and the default ending marker is
+#>>>.
-This method works for any code. For the specific case of a comma-separated list
-of values, as in this example, another possibility is to insert a blank or
-comment somewhere between the opening and closing parens. See the section
-L<Controlling List Formatting>.
+Format skipping begins when a format-skipping beginning comment is seen and
+continues until a format-skipping ending comment is found.
+
+This feature can be disabled with B<-nfs>. This should not normally be necessary.
=item B<-fsb=string>, B<--format-skipping-begin=string>
+This and the next parameter allow the special beginning and ending comments to
+be changed. However, it is recommended that they only be changed if there is a
+conflict between the default values and some other use. If they are changed, it
+is recommended that they only be entered in a B<.perltidyrc> file, rather than
+on a command line. This is because properly escaping these parameters on a
+command line can be difficult.
+
+If changed comment markers do not appear to be working, use the B<-log> flag and
+examine the F<.LOG> file to see if and where they are being detected.
+
The B<-fsb=string> parameter may be used to change the beginning marker for
format skipping. The default is equivalent to -fsb='#<<<'. The string that
you enter must begin with a # and should be in quotes as necessary to get past
The beginning and ending strings may be the same, but it is preferable
to make them different for clarity.
+=item B<-cs>, B<--code-skipping>
+
+As explained above, this flag, which is enabled by default, causes any code
+between special beginning and ending comment markers to be directly passed to
+the output without any error checking or formatting. Essentially, perltidy
+treats it as if it were a block of arbitrary text. The default beginning
+marker is #<<V and the default ending marker is #>>V.
+
+This feature can be disabled with B<-ncs>. This should not normally be
+necessary.
+
+=item B<-csb=string>, B<--code-skipping-begin=string>
+
+This may be used to change the beginning comment for a B<--code-skipping>
+section, and its use is similar to that of B<-fsb=string>.
+The default is equivalent to -csb='#<<V'.
+
+=item B<-cse=string>, B<--code-skipping-end=string>
+
+This may be used to change the ending comment for a B<--code-skipping>
+section, and its use is similar to that of B<-fse=string>.
+The default is equivalent to -cse='#>>V'.
+
=back
=head2 Line Break Control
$add_option->( 'closing-side-comment-warnings', 'cscw', '!' );
$add_option->( 'closing-side-comments', 'csc', '!' );
$add_option->( 'closing-side-comments-balanced', 'cscb', '!' );
+ $add_option->( 'code-skipping', 'cs', '!' );
+ $add_option->( 'code-skipping-begin', 'csb', '=s' );
+ $add_option->( 'code-skipping-end', 'cse', '=s' );
$add_option->( 'format-skipping', 'fs', '!' );
$add_option->( 'format-skipping-begin', 'fsb', '=s' );
$add_option->( 'format-skipping-end', 'fse', '=s' );
trim-qw
format=tidy
backup-file-extension=bak
+ code-skipping
format-skipping
default-tabsize=8
length_function => $length_function
);
+ write_logfile_entry("\nStarting tokenization pass...\n");
+
if ( $rOpts->{'entab-leading-whitespace'} ) {
write_logfile_entry(
"Leading whitespace will be entabbed with $rOpts->{'entab-leading-whitespace'} spaces per tab\n"
$line_of_tokens->{_ended_in_blank_token} = undef;
my $line_type = $line_of_tokens_old->{_line_type};
- my $input_line_no = $line_of_tokens_old->{_line_number} - 1;
+ my $input_line_no = $line_of_tokens_old->{_line_number};
my $CODE_type = "";
my $tee_output;
$rblock_type->[$j], $rcontainer_environment->[$j],
$rtype_sequence->[$j], $rlevels->[$j],
$rlevels->[$j], $slevel,
- $rci_levels->[$j], $input_line_no,
+ $rci_levels->[$j], $input_line_no - 1,
);
push @{$rLL}, \@tokary;
} ## end foreach my $j ( 0 .. $jmax )
} ## end if ( $jmax >= 0 )
$CODE_type =
- $self->get_CODE_type( $line_of_tokens, $Kfirst, $Klimit );
+ $self->get_CODE_type( $line_of_tokens, $Kfirst, $Klimit,
+ $input_line_no );
$tee_output ||=
$rOpts_tee_block_comments
}
sub get_CODE_type {
- my ( $self, $line_of_tokens, $Kfirst, $Klast ) = @_;
+ my ( $self, $line_of_tokens, $Kfirst, $Klast, $input_line_no ) = @_;
# We are looking at a line of code and setting a flag to
# describe any special processing that it requires
/$format_skipping_pattern_end/ )
{
$In_format_skipping_section = 0;
- write_logfile_entry("Exiting formatting skip section\n");
+ write_logfile_entry(
+ "Line $input_line_no: Exiting format-skipping section\n");
}
$CODE_type = 'FS';
goto RETURN;
/$format_skipping_pattern_begin/ )
{
$In_format_skipping_section = 1;
- write_logfile_entry("Entering formatting skip section\n");
+ write_logfile_entry(
+ "Line $input_line_no: Entering format-skipping section\n");
$CODE_type = 'FS';
goto RETURN;
}
$self->{_wrote_column_headings} = 1;
my $routput_array = $self->{_output_array};
push @{$routput_array}, <<EOM;
+
+Starting formatting pass...
The nesting depths in the table below are at the start of the lines.
The indicated output line numbers are not always exact.
ci = levels of continuation indentation; bk = 1 if in BLOCK, 0 if not.
%is_package
%is_comma_question_colon
%other_line_endings
+ $code_skipping_pattern_begin
+ $code_skipping_pattern_end
};
+# GLOBAL VARIABLES which are constant after being configured by user-supplied
+# parameters. They remain constant as a file is being processed.
+my (
+
+ $rOpts_code_skipping,
+ $code_skipping_pattern_begin,
+ $code_skipping_pattern_end,
+);
+
# possible values of operator_expected()
use constant TERM => -1;
use constant UNKNOWN => 0;
_in_format_ => $i++,
_in_error_ => $i++,
_in_pod_ => $i++,
+ _in_skipped_ => $i++,
_in_attribute_list_ => $i++,
_in_quote_ => $i++,
_quote_target_ => $i++,
exit 1;
}
+sub Die {
+ my ($msg) = @_;
+ Perl::Tidy::Die($msg);
+ croak "unexpected return from Perl::Tidy::Die";
+}
+
+sub bad_pattern {
+
+ # See if a pattern will compile. We have to use a string eval here,
+ # but it should be safe because the pattern has been constructed
+ # by this program.
+ my ($pattern) = @_;
+ eval "'##'=~/$pattern/";
+ return $@;
+}
+
+sub make_code_skipping_pattern {
+ my ( $rOpts, $opt_name, $default ) = @_;
+ my $param = $rOpts->{$opt_name};
+ unless ($param) { $param = $default }
+ $param =~ s/^\s*//; # allow leading spaces to be like format-skipping
+ if ( $param !~ /^#/ ) {
+ Die("ERROR: the $opt_name parameter '$param' must begin with '#'\n");
+ }
+ my $pattern = '^\s*' . $param . '\b';
+ if ( bad_pattern($pattern) ) {
+ Die(
+"ERROR: the $opt_name parameter '$param' causes the invalid regex '$pattern'\n"
+ );
+ }
+ return $pattern;
+}
+
sub check_options {
# Check Tokenizer parameters
$is_sub{$word} = 1;
}
}
+
+ $rOpts_code_skipping = $rOpts->{'code-skipping'};
+ $code_skipping_pattern_begin =
+ make_code_skipping_pattern( $rOpts, 'code-skipping-begin', '#<<V' );
+ $code_skipping_pattern_end =
+ make_code_skipping_pattern( $rOpts, 'code-skipping-end', '#>>V' );
return;
}
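
The marker handling added above can be tried in isolation. The following is a minimal sketch, not the code in the patch: `build_marker_pattern` is a hypothetical stand-alone helper that mirrors the steps of `make_code_skipping_pattern` (trim leading spaces, require a leading `#`, anchor at line start, end on a word boundary). It swaps the string eval of `bad_pattern` for a block eval around `qr//`, and it uses plain `die` instead of `Perl::Tidy::Die`.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl::Tidy): build the regex string
# for a skipping marker the same way the patch does.
sub build_marker_pattern {
    my ($marker) = @_;

    # Allow leading spaces, as the patch does
    $marker =~ s/^\s*//;

    # A marker must be a comment
    if ( $marker !~ /^#/ ) {
        die "marker '$marker' must begin with '#'\n";
    }

    # Optional indentation, then the marker, then a word boundary
    my $pattern = '^\s*' . $marker . '\b';

    # Assumption: a block eval around qr// stands in for bad_pattern()
    my $ok = eval { qr/$pattern/; 1 };
    die "marker '$marker' causes the invalid regex '$pattern'\n" unless $ok;

    return $pattern;
}

my $begin   = build_marker_pattern('#<<V');
my @samples = (
    '#<<V code skipping: extended syntax follows',
    '    #<<V',
    '#<<Verbose',
    '#<<<',
);
for my $line (@samples) {
    my $result = $line =~ /$begin/ ? 'match' : 'no match';
    printf "%-48s %s\n", "'$line'", $result;
}
```

Because of the trailing `\b`, the first two sample lines match and `#<<Verbose` does not; the format-skipping marker `#<<<` also does not match, so the two features do not trigger each other with their default markers.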
# _line_start_quote_ line where we started looking for a long quote
# _in_here_doc_ flag indicating if we are in a here-doc
# _in_pod_ flag set if we are in pod documentation
+ # _in_skipped_ flag set if we are in a skipped section
# _in_error_ flag set if we saw severe error (binary in script)
# _in_data_ flag set if we are in __DATA__ section
# _in_end_ flag set if we are in __END__ section
$self->[_in_format_] = 0;
$self->[_in_error_] = 0;
$self->[_in_pod_] = 0;
+ $self->[_in_skipped_] = 0;
$self->[_in_attribute_list_] = 0;
$self->[_in_quote_] = 0;
$self->[_quote_target_] = "";
warning("hit EOF while in format description\n");
}
+ if ( $tokenizer_self->[_in_skipped_] ) {
+ warning("hit EOF while in lines skipped with --code-skipping\n");
+ }
+
if ( $tokenizer_self->[_in_pod_] ) {
# Just write log entry if this is after __END__ or __DATA__
my $input_line_number = ++$tokenizer_self->[_last_line_number_];
+ my $write_logfile_entry = sub {
+ my ($msg) = @_;
+ write_logfile_entry("Line $input_line_number: $msg");
+ };
+
# Find and remove what characters terminate this line, including any
# control r
my $input_line_separator = "";
if ( $candidate_target eq $here_doc_target ) {
$tokenizer_self->[_nearly_matched_here_target_at_] = undef;
$line_of_tokens->{_line_type} = 'HERE_END';
- write_logfile_entry("Exiting HERE document $here_doc_target\n");
+ $write_logfile_entry->("Exiting HERE document $here_doc_target\n");
my $rhere_target_list = $tokenizer_self->[_rhere_target_list_];
if ( @{$rhere_target_list} ) { # there can be multiple here targets
$tokenizer_self->[_here_doc_target_] = $here_doc_target;
$tokenizer_self->[_here_quote_character_] =
$here_quote_character;
- write_logfile_entry(
+ $write_logfile_entry->(
"Entering HERE document $here_doc_target\n");
$tokenizer_self->[_nearly_matched_here_target_at_] = undef;
$tokenizer_self->[_started_looking_for_here_target_at_] =
# This is the end when count reaches 0
if ( !$tokenizer_self->[_in_format_] ) {
- write_logfile_entry("Exiting format section\n");
+ $write_logfile_entry->("Exiting format section\n");
$line_of_tokens->{_line_type} = 'FORMAT_END';
}
}
$line_of_tokens->{_line_type} = 'POD';
if ( $input_line =~ /^=cut/ ) {
$line_of_tokens->{_line_type} = 'POD_END';
- write_logfile_entry("Exiting POD section\n");
+ $write_logfile_entry->("Exiting POD section\n");
$tokenizer_self->[_in_pod_] = 0;
}
if ( $input_line =~ /^\#\!.*perl\b/ && !$tokenizer_self->[_in_end_] ) {
return $line_of_tokens;
}
+ # print line unchanged if in skipped section
+ elsif ( $tokenizer_self->[_in_skipped_] ) {
+
+ # NOTE: marked as the existing type 'FORMAT' to keep html working
+ $line_of_tokens->{_line_type} = 'FORMAT';
+ if ( $input_line =~ /$code_skipping_pattern_end/ ) {
+ $write_logfile_entry->("Exiting code-skipping section\n");
+ $tokenizer_self->[_in_skipped_] = 0;
+ }
+ return $line_of_tokens;
+ }
+
# must print line unchanged if we have seen a severe error (i.e., we
# are seeing illegal tokens and cannot continue. Syntax errors do
# not pass this route). Calling routine can decide what to do, but
# end of a pod section
if ( $input_line =~ /^=(\w+)\b/ && $1 ne 'cut' ) {
$line_of_tokens->{_line_type} = 'POD_START';
- write_logfile_entry("Entering POD section\n");
+ $write_logfile_entry->("Entering POD section\n");
$tokenizer_self->[_in_pod_] = 1;
return $line_of_tokens;
}
# end of a pod section
if ( $input_line =~ /^=(\w+)\b/ && $1 ne 'cut' ) {
$line_of_tokens->{_line_type} = 'POD_START';
- write_logfile_entry("Entering POD section\n");
+ $write_logfile_entry->("Entering POD section\n");
$tokenizer_self->[_in_pod_] = 1;
return $line_of_tokens;
}
# now we know that it is ok to tokenize the line...
# the line tokenizer will modify any of these private variables:
- # _rhere_target_list
- # _in_data
- # _in_end
- # _in_format
- # _in_error
- # _in_pod
- # _in_quote
+ # _rhere_target_list_
+ # _in_data_
+ # _in_end_
+ # _in_format_
+ # _in_error_
+ # _in_skipped_
+ # _in_pod_
+ # _in_quote_
my $ending_in_quote_last = $tokenizer_self->[_in_quote_];
tokenize_this_line($line_of_tokens);
warning(
"=cut starts a pod section .. this can fool pod utilities.\n"
);
- write_logfile_entry("Entering POD section\n");
+ $write_logfile_entry->("Entering POD section\n");
}
}
else {
$line_of_tokens->{_line_type} = 'POD_START';
- write_logfile_entry("Entering POD section\n");
+ $write_logfile_entry->("Entering POD section\n");
}
return $line_of_tokens;
}
+ # handle start of skipped section
+ if ( $tokenizer_self->[_in_skipped_] ) {
+
+ # NOTE: marked as the existing type 'FORMAT' to keep html working
+ $line_of_tokens->{_line_type} = 'FORMAT';
+ $write_logfile_entry->("Entering code-skipping section\n");
+ return $line_of_tokens;
+ }
+
# Update indentation levels for log messages.
# Skip blank lines and also block comments, unless a logfile is requested.
# Note that _line_of_text_ is the input line but trimmed from left to right.
$tokenizer_self->[_in_here_doc_] = 1;
$tokenizer_self->[_here_doc_target_] = $here_doc_target;
$tokenizer_self->[_here_quote_character_] = $here_quote_character;
- write_logfile_entry("Entering HERE document $here_doc_target\n");
+ $write_logfile_entry->("Entering HERE document $here_doc_target\n");
$tokenizer_self->[_started_looking_for_here_target_at_] =
$input_line_number;
}
# which are not tokenized (and cannot be read with <DATA> either!).
if ( $tokenizer_self->[_in_data_] ) {
$line_of_tokens->{_line_type} = 'DATA_START';
- write_logfile_entry("Starting __DATA__ section\n");
+ $write_logfile_entry->("Starting __DATA__ section\n");
$tokenizer_self->[_saw_data_] = 1;
# keep parsing after __DATA__ if use SelfLoader was seen
if ( $tokenizer_self->[_saw_selfloader_] ) {
$tokenizer_self->[_in_data_] = 0;
- write_logfile_entry(
+ $write_logfile_entry->(
"SelfLoader seen, continuing; -nlsl deactivates\n");
}
elsif ( $tokenizer_self->[_in_end_] ) {
$line_of_tokens->{_line_type} = 'END_START';
- write_logfile_entry("Starting __END__ section\n");
+ $write_logfile_entry->("Starting __END__ section\n");
$tokenizer_self->[_saw_end_] = 1;
# keep parsing after __END__ if use AutoLoader was seen
if ( $tokenizer_self->[_saw_autoloader_] ) {
$tokenizer_self->[_in_end_] = 0;
- write_logfile_entry(
+ $write_logfile_entry->(
"AutoLoader seen, continuing; -nlal deactivates\n");
}
return $line_of_tokens;
# Note: if keyword 'format' occurs in this line code, it is still CODE
# (keyword 'format' need not start a line)
if ( $tokenizer_self->[_in_format_] ) {
- write_logfile_entry("Entering format section\n");
+ $write_logfile_entry->("Entering format section\n");
}
if ( $tokenizer_self->[_in_quote_]
/^\s*$/ )
{
$tokenizer_self->[_line_start_quote_] = $input_line_number;
- write_logfile_entry(
+ $write_logfile_entry->(
"Start multi-line quote or pattern ending in $quote_target\n");
}
}
&& !$tokenizer_self->[_in_quote_] )
{
$tokenizer_self->[_line_start_quote_] = -1;
- write_logfile_entry("End of multi-line quote or pattern\n");
+ $write_logfile_entry->("End of multi-line quote or pattern\n");
}
# we are returning a line of CODE
# stage 1 is a very simple pre-tokenization
my $max_tokens_wanted = 0; # this signals pre_tokenize to get all tokens
- # a little optimization for a full-line comment
+ # optimize for a full-line comment
if ( !$in_quote && substr( $input_line, 0, 1 ) eq '#' ) {
- $max_tokens_wanted = 1 # no use tokenizing a comment
+ $max_tokens_wanted = 1; # no use tokenizing a comment
+
+ # and check for skipped section
+ if ( $rOpts_code_skipping
+ && $input_line =~ /$code_skipping_pattern_begin/ )
+ {
+ $tokenizer_self->[_in_skipped_] = 1;
+ return;
+ }
}
# start by breaking the line into pre-tokens
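
The tokenizer changes above amount to a two-state machine driven by the `_in_skipped_` flag. The sketch below illustrates that idea only; `classify_lines` is a hypothetical function, not the tokenizer's interface, and it ignores quotes, here-docs, and POD, all of which the real tokenizer checks before it looks for a beginning marker. Both marker lines are counted as part of the skipped section, as in the patch.

```perl
use strict;
use warnings;

# Hypothetical simplification of the skipping logic: label each line as
# CODE or SKIPPED and report whether the input ended inside a section.
sub classify_lines {
    my ( $rlines, $begin_pattern, $end_pattern ) = @_;
    my $in_skipped = 0;
    my @types;
    for my $line ( @{$rlines} ) {
        if ($in_skipped) {

            # Inside a section: pass the line through, watch for the end
            push @types, 'SKIPPED';
            $in_skipped = 0 if $line =~ /$end_pattern/;
        }
        elsif ( $line =~ /$begin_pattern/ ) {

            # The beginning marker itself is also passed through verbatim
            $in_skipped = 1;
            push @types, 'SKIPPED';
        }
        else {
            push @types, 'CODE';
        }
    }
    return ( \@types, $in_skipped );
}

my @lines = (
    '$.=0;',
    '#<<V code skipping: pass this verbatim',
    '   }}} {{{',
    '#>>V',
    'my $self=shift;',
);
my ( $rtypes, $unterminated ) =
  classify_lines( \@lines, '^\s*#<<V\b', '^\s*#>>V\b' );

print "$rtypes->[$_]\t$lines[$_]\n" for 0 .. $#lines;
warn "hit EOF while in a skipped section\n" if $unterminated;
```

The unbalanced braces in the middle line are never examined, which is why the remaining code must be syntactically complete on its own once the marked lines are removed.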
=over 4
+=item B<Add --code-skipping option, see git #65>
+
+Added a new option '--code-skipping', requested in git #65, in which code
+between comment lines '#<<V' and '#>>V' is passed verbatim to the output stream
+without error checking. It is similar to --format-skipping, except that the
+skipped code is not checked for errors, which makes it useful for skipping an
+extended syntax.
+
=item B<Handle nested print format blocks>
Perltidy was producing an error at nested print format blocks,
--- /dev/null
+%Hdr=%U2E=%E2U=%Fallback=();
+$in_charmap=$nerror=$nwarning=0;
+$.=0;
+#<<V code skipping: perltidy will pass this verbatim without error checking
+
+ }}} {{{
+
+#>>V
+my $self=shift;
+my $cloning=shift;
--- /dev/null
+%Hdr = %U2E = %E2U = %Fallback = ();
+$in_charmap = $nerror = $nwarning = 0;
+$. = 0;
+#<<V code skipping: perltidy will pass this verbatim without error checking
+
+ }}} {{{
+
+#>>V
+my $self = shift;
+my $cloning = shift;
../snippets24.t align35.def
../snippets24.t rt136417.def
../snippets24.t rt136417.rt136417
+../snippets24.t numbers.def
../snippets3.t ce_wn1.ce_wn
../snippets3.t ce_wn1.def
../snippets3.t colin.colin
../snippets9.t rt98902.def
../snippets9.t rt98902.rt98902
../snippets9.t rt99961.def
-../snippets24.t numbers.def
+../snippets24.t code_skipping.def
#13 rt136417.def
#14 rt136417.rt136417
#15 numbers.def
+#16 code_skipping.def
# To locate test #13 you can search for its name or the string '#13'
use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5;
use constant COUNTUP => reverse 1, 2, 3, 4, 5;
use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5;
+----------
+
+ 'code_skipping' => <<'----------',
+%Hdr=%U2E=%E2U=%Fallback=();
+$in_charmap=$nerror=$nwarning=0;
+$.=0;
+#<<V code skipping: perltidy will pass this verbatim without error checking
+
+ }}} {{{
+
+#>>V
+my $self=shift;
+my $cloning=shift;
----------
'fpva' => <<'----------',
);
#15...........
},
+
+ 'code_skipping.def' => {
+ source => "code_skipping",
+ params => "def",
+ expect => <<'#16...........',
+%Hdr = %U2E = %E2U = %Fallback = ();
+$in_charmap = $nerror = $nwarning = 0;
+$. = 0;
+#<<V code skipping: perltidy will pass this verbatim without error checking
+
+ }}} {{{
+
+#>>V
+my $self = shift;
+my $cloning = shift;
+#16...........
+ },
};
my $ntests = 0 + keys %{$rtests};