* Precompiled regression tests::
* Compiling regression tests::
* Regtest comparison::
+* Pixel-based regtest comparison::
* Finding the cause of a regression::
* Memory and coverage tests::
* MusicXML tests::
-* Grand Regression Test Checking::
@end menu
The test comparison shows all of the changes that occurred between
the current release and the prior release. Each test that has a
-significant difference in output is displayed, with the old
-version on the left and the new version on the right.
+significant (noticeable) difference in output is displayed, with
+the old version on the left and the new version on the right.
+
+Some small changes can be ignored (slightly different slur shapes,
+small variations in note spacing), but this is not always the case:
+sometimes even the smallest change means that something is wrong.
+To help distinguish these cases, we use a bigger staff size when
+small differences matter.
+
+Staff size 30 generally means "pay extra attention to details".
+Staff size 40 (twice the default size of 20) or more means that
+the regtest @strong{is} about the details.
+
+A staff size smaller than the default carries no special meaning.
Regression tests whose output is the same for both versions are
not shown in the test comparison.
@item
Profile files: give information about
TODO? I don't know what they're for.
+Apparently they give some information about CPU usage.  If you see
+a large number of changes in cell counts, this probably means that
+you ran @code{make test-baseline} with a different number of CPU
+threads than @code{make check}.  Try redoing the tests from scratch
+with the same number of threads each time; see
+@ref{Saving time with the -j option}.
@end itemize
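The thread-count caveat above can be sketched in a few lines.  The
following Python snippet is illustrative only (the variable names and
the use of @code{os.cpu_count()} are assumptions, not part of the
LilyPond build system): it derives one job count and reuses it for both
@code{make} invocations so the profiles stay comparable.

```python
import os

# Illustrative sketch only: pick one job count and reuse it for both
# 'make test-baseline' and 'make check', so that cell-count profiles
# are produced with the same number of threads.
jobs = os.cpu_count() or 1  # fall back to 1 if the count is unknown
commands = [f"make -j{jobs} {target}" for target in ("test-baseline", "check")]
for command in commands:
    print(command)
```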
@node Regtest comparison
@section Regtest comparison
-Before modified code is committed to master, a regression test
+Before modified code is committed to @code{master} (via @code{staging}),
+a regression test
comparison must be completed to ensure that the changes have
not caused problems with previously working code. The comparison
is made automatically upon compiling the regression test suite
@item
Before making changes to the code, establish a baseline for the comparison by
-going to the @file{lilypond-git/build/} directory and running:
+going to the @file{$LILYPOND_GIT/build/} directory and running:
@example
make test-baseline
@code{make test-clean}, @code{make} and @code{make check} it without doing
@code{make test-baseline} again.}
+@node Pixel-based regtest comparison
+@section Pixel-based regtest comparison
+
+As an alternative to the @code{make test} method for regtest checking
+(which relies on @code{.signature} files that are created during a
+LilyPond run and describe the placement of grobs), there is a script
+that compares the output of two LilyPond versions pixel by pixel.  To
+use it, start by checking out the version of LilyPond you want to use
+as a baseline, and run @code{make}.  Then, do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -o
+@end example
+
+The @code{-j9} option tells the script to use 9 CPUs to create the
+images; change this to your own CPU count plus 1.  The @code{-o} option
+means this is the "old" version.  This will create images of all the
+regtests in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/old-regtest-results/
+@end example
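The "CPU count plus 1" rule of thumb above can be computed rather than
hard-coded.  A hedged Python sketch (@code{os.cpu_count()} is a
standard-library call; the printed command mirrors the invocation
above):

```python
import os

# Sketch: derive the -jN value (CPU count + 1) instead of hard-coding -j9.
jobs = (os.cpu_count() or 1) + 1
command = f"./make-regtest-pngs.sh -j{jobs} -o"
print(command)
```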
+
+Now check out the version you want to compare with the baseline.  Run
+@code{make} again to recreate the LilyPond binary.  Then, do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -n
+@end example
+
+The @code{-n} option tells the script to make a "new" version of the
+images. They are created in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/new-regtest-results/
+@end example
+
+Once the new images have been created, the script compares the old
+images with the new ones pixel by pixel and prints a list of the images
+that differ to the terminal, together with a count of how many
+differences were found.  The results of the checks are in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/regtest-diffs/
+@end example
+
+To check for differences, browse that directory with an image
+viewer. Differences are shown in red. Be aware that some images with complex
+fonts or spacing annotations always display a few minor differences. These can
+safely be ignored.
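The pixel-by-pixel idea itself is simple to sketch.  The following
Python snippet is illustrative only, not the actual
@code{make-regtest-pngs.sh} logic: it counts differing pixels between
two same-sized images, each represented here as rows of
@code{(r, g, b)} tuples.

```python
# Illustrative sketch, not the real comparison script: count pixels
# that differ between two same-sized images, each given as rows of
# (r, g, b) tuples.
def count_pixel_diffs(old, new):
    return sum(
        1
        for old_row, new_row in zip(old, new)
        for old_px, new_px in zip(old_row, new_row)
        if old_px != new_px
    )

white, black, red = (255, 255, 255), (0, 0, 0), (255, 0, 0)
old_image = [[white, black], [white, white]]
new_image = [[white, black], [red, white]]
print(count_pixel_diffs(old_image, new_image))  # → 1
```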
+
@node Finding the cause of a regression
@section Finding the cause of a regression
For tracking memory usage as part of this test, you will need the
CVS version of GUILE, with the following patch applied:
@smallexample
-@uref{http://www.lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}.
+@uref{http://lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}.
@end smallexample
@subheading Code coverage
@uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
@end example
-
-@node Grand Regression Test Checking
-@section Grand Regression Test Checking
-
-@subheading What is this all about?
-
-Regression tests (usually abbreviated "regtests") is a collection
-of @file{.ly} files used to check whether LilyPond is working correctly.
-Example: before version 2.15.12 breve noteheads had incorrect width,
-which resulted in collisions with other objects. After the issue was fixed,
-a small @file{.ly} file demonstrating the problem was added to the regression
-tests as a proof that the fix works. If someone will accidentally break
-breve width again, we will notice this in the output of that regression test.
-
-We are asking you to help us by checking a regtest or two from time to time.
-You don't need programming skills to do this, not even LilyPond skills -
-just basic music notation knowledge; checking one regtest takes less than
-a minute. Simply go here:
-
-@example
-@uref{http://www.philholmes.net/lilypond/regtests/}
-@end example
-
-@subheading Some tips on checking regtests
-
-@subsubheading Description text
-
-The description should be clear even for a music beginner.
-If there are any special terms used in the description,
-they all should be explained in our @rglosnamed{Top, Music Glossary}
-or @rinternalsnamed{Top, Internals Reference}.
-Vague descriptions (like "behaves well", "looks reasonable") shouldn't be used.
-
-@ignore
-@subsubheading Is regtest straightforward and systematic?
-
-Unfortunately some regtests are written poorly. A good regtest should be
-straightforward: it should be obvious what it checks and how. Also, it
-usually shouldn't check everything at once. For example it's a bad idea to test
-accidental placement by constucting one huge chord with many suspended notes
-and loads of accidentals. It's better to divide such problem into a series
-of clearly separated cases.
-@end ignore