X-Git-Url: https://git.donarmstrong.com/?a=blobdiff_plain;f=Documentation%2Fcontributor%2Fregressions.itexi;h=365582e87f7a8beef00a6e437f67d7b005fba98a;hb=a775c0535512573cb013cd230a2630dffd933ac0;hp=25c106f2eaa30e38d54ca6a28ee3601521d068d9;hpb=297b650058417845745a706a49bae154cba80fd6;p=lilypond.git

diff --git a/Documentation/contributor/regressions.itexi b/Documentation/contributor/regressions.itexi
index 25c106f2ea..365582e87f 100644
--- a/Documentation/contributor/regressions.itexi
+++ b/Documentation/contributor/regressions.itexi
@@ -4,9 +4,14 @@
 
 @menu
 * Introduction to regression tests::
-* Current regtest output::
-* Comparison regtest output::
+* Precompiled regression tests::
+* Compiling regression tests::
+* Regtest comparison::
+* Pixel-based regtest comparison::
+* Finding the cause of a regression::
+* Memory and coverage tests::
 * MusicXML tests::
+* Grand Regression Test Checking::
 @end menu
 
 
@@ -19,136 +24,567 @@ These regression tests comprise small LilyPond snippets that test
 the functionality of each part of LilyPond.
 
 Regression tests are added when new functionality is added to
-LilyPond.  They are also added when bugs are identified.  The
-snippet that causes the bug becomes a regression test to verify
-that the bug has been fixed.
+LilyPond.
+We do not yet have a policy on when it is appropriate to add or
+modify a regtest when bugs are fixed.  Individual developers
+should use their best judgement until this is clarified during the
+@ref{Grand Organization Project (GOP)}.
 
-The regression tests are automatically compiled using special @code{make}
-targets.  The output of the regression tests is also automatically
-checked to identify changes in LilyPond output.
+The regression tests are compiled using special @code{make}
+targets.  There are three primary uses for the regression
+tests.  First, successful completion of the regression tests means
+that LilyPond has been properly built.  Second, the output of the
+regression tests can be manually checked to ensure that
+the graphical output matches the description of the intended
+output.  Third, the regression test output from two different
+versions of LilyPond can be automatically compared to identify
+any differences.  These differences should then be manually
+checked to ensure that the differences are intended.
 
-The output of the regression tests is available on the website
-for every stable version of LilyPond.  This allows the comparison
-of different versions to see when bugs appeared.
+Regression tests (@qq{regtests}) are available in precompiled form
+as part of the documentation.  Regtests can also be compiled
+on any machine that has a properly configured LilyPond build
+system.
 
 
-@node Current regtest output
-@section Current regtest output
+@node Precompiled regression tests
+@section Precompiled regression tests
 
+@subheading Regression test output
 
-TODO: To be checked and completed -vv
+As part of the release process, the regression tests are run
+for every LilyPond release.  Full regression test output is
+available for every stable version and the most recent development
+version.
 
-Regression tests (@qq{regtests}) are available in two ways: either
-in a compiled form, for instance on the website, or as source code
-that needs to be compiled locally, using the most recent LilyPond
-binary as possible.  The latter is recommended, although more
-technically involved.
+Regression test output is available in HTML and PDF format.  Links
+to the regression test output are available at the developer's
+resources page for the version of interest.
 
+The latest stable version of the regtests is found at:
 
-@subheading Precompiled regtests
+@example
+@uref{http://lilypond.org/doc/stable/input/regression/collated-files.html}
+@end example
 
-The easiest way to see the @q{current} regtest output (meaning,
-the ouput of the latest stable or development version) is
-to look at the online compiled regtest page:
+The latest development version of the regtests is found at:
 
 @example
 @uref{http://lilypond.org/doc/latest/input/regression/collated-files.html}
 @end example
 
-However, depending on how many changes have been made to the code
-since the latest release, this page may not reflect the latest
-features, bugfixes... or new bugs that may have been introduced!
 
-Therefore, if you have an appropriate environment to build LilyPond
-yourself, it is recommended that you compile the software yourself.
+@subheading Regression test comparison
+
+Each time a new version is released, the regtests are
+compiled and the output is automatically compared with the
+output of the previous release.  The result of these
+comparisons is archived online:
+
+@example
+@uref{http://lilypond.org/test/}
+@end example
+
+Checking these pages is a very important task for the LilyPond project.
+You are invited to report anything that looks broken, or any case
+where the output quality is not on par with the previous release,
+as described in @rweb{Bug reports}.
+
+@warning{ The special regression test
+@file{test-output-distance.ly} will always show up as a
+regression.  This test changes each time it is run, and serves to
+verify that the regression tests have, in fact, run.}
+
+
+@subheading What to look for
+
+The test comparison shows all of the changes that occurred between
+the current release and the prior release.  Each test that has a
+significant (noticeable) difference in output is displayed, with
+the old version on the left and the new version on the right.
+
+Some of the small changes can be ignored (slightly different slur
+shapes, small variations in note spacing), but this is not always
+the case: sometimes even the smallest change means that something
+is wrong.  To help distinguish these cases, we use a bigger
+staff size when small differences matter.
+
+Staff size 30 generally means "pay extra attention to details".
+Staff size 40 (twice the default size) or more means
+that the regtest @strong{is} about the details.
+
+A staff size smaller than the default has no special meaning.
+
+Regression tests whose output is the same for both versions are
+not shown in the test comparison.
+
+@itemize
+@item
+Images: green blurs in the new version show the approximate
+location of elements in the old version.
+
+There are often minor adjustments in spacing which do not indicate
+any problem.
 
+@item
+Log files: show the difference in command-line output.
 
-@subheading Compiling regtests
+The main thing to examine is any change in page counts -- if a
+file used to fit on 1 page but now requires 4 or 5 pages,
+something is suspicious!
 
-The first step is to download the latest available source code,
-as explained in @ref{Working with source code}.  Then you will need
-to build the LilyPond binary: see
-@ref{Compiling LilyPond}.
+@item
+Profile files: give information about
+TODO?  I don't know what they're for.
+Apparently they give some information about CPU usage.  If you get
+many changes in cell counts, this probably means that you ran
+@code{make test-baseline} with a different number of CPU threads than
+@code{make check}.  Try redoing the tests from scratch with the same
+number of threads each time -- see @ref{Saving time with the -j option}.
 
-@noindent
-(Uninstalling the previous LilyPond version is not necessary, nor is
+@end itemize
+
+@warning{
+The automatic comparison of the regtests checks the LilyPond
+bounding boxes.  This means that Ghostscript changes and changes
+in lyrics or text are not found.
+}
+
+@node Compiling regression tests
+@section Compiling regression tests
+
+Developers may wish to see the output of the complete regression
+test suite for the current version of the source repository
+between releases.  Current source code is available; see
+@ref{Working with source code}.
+
+For regression testing @code{../configure} should be run with the
+@code{--disable-optimising} option.  Then you will need
+to build the LilyPond binary; see @ref{Compiling LilyPond}.
+
+Uninstalling the previous LilyPond version is not necessary, nor is
 running @code{make install}, since the tests will automatically be
 compiled with the LilyPond binary you have just built in your source
-directory.)
+directory.
 
-From this point, compiling the regtests is as simple as running
+From this point, the regtests are compiled with:
 
 @example
 make test
 @end example
 
-However, as there are many snippets to compile, if you have a multi-core
-machine it is highly recommended to use the @option{-j} option, as
-described in @ref{Saving time with the @option{-j} option}.  Another
-useful optimization is to set the @var{CPU_COUNT} variable; for a
-quad-core processor the complete command would look like
+If you have a multi-core machine you may want to use the @option{-j}
+option and @var{CPU_COUNT} variable, as
+described in @ref{Saving time with CPU_COUNT}.
+For a quad-core processor the complete command would be:
 
 @example
-make -j5 CPU_COUNT=4 test
+make -j5 CPU_COUNT=5 test
 @end example
 
-The regtest output will then be available in one of the
-@file{input/regression/out-*} directories, depending on the
-exact command you used.  See @ref{Testing LilyPond} for
-more information.
+The regtest output will then be available in
+@file{input/regression/out-test}.
+@file{input/regression/out-test/collated-examples.html}
+contains a listing of all the regression tests that were run,
+but none of the images are included.  Individual images are
+also available in this directory.
+
+The primary use of @samp{make@tie{}test} is to verify that the
+regression tests all run without error.  The regression test
+page that is part of the documentation is created only when the
+documentation is built, as described in @ref{Generating documentation}.
+Note that building the documentation requires more installed components
+than building the source code, as described in
+@ref{Requirements for building documentation}.
+
+
+@node Regtest comparison
+@section Regtest comparison
+
+Before modified code is committed to @code{master} (via @code{staging}),
+a regression test
+comparison must be completed to ensure that the changes have
+not caused problems with previously working code.  The comparison
+is made automatically upon compiling the regression test suite
+twice.
+
+@enumerate
 
+@item
+Run @code{make} with current git master without any of your changes.
 
-@node Comparison regtest output
-@section Comparison regtest output
+@item
+Before making changes to the code, establish a baseline for the comparison by
+going to the @file{$LILYPOND_GIT/build/} directory and running:
 
+@example
+make test-baseline
+@end example
 
-Regtests are an useful way to compare what has changed between two
-versions of LilyPond, or to verify on a fine-grained level if a
-particular change may have unwanted side-effects, such as introducing
-a bug or breaking existing features.
+@item
+Make your changes, or apply the patch(es) to consider.
 
-For such cases, LilyPond's build system provides an automated way of
-comparing regtests output.
+@item
+Compile the source with @samp{make} as usual.
 
+@item
+Check for unintentional changes to the regtests:
 
-@subheading Comparing regtests for two development releases
+@example
+make check
+@end example
 
-Each time a new development version is released, a set of regtests is
-compiled and compared with the previous release.  The result of these
-comparisons is archived online, and may be seen at the following address:
+After this has finished, a regression test comparison will be
+available (relative to the current @file{build/} directory) at:
 
 @example
-@uref{http://lilypond.org/test/}
+out/test-results/index.html
 @end example
 
-@noindent
-Checking these pages is a very important task for the LilyPond project.  
-You are invited to report anything that looks broken, or any case
-where the output quality is not on par with the previous release,
-either to the Bug Squad, following our guidelines for
-@rweb{Bug reports}, or directly in the bug tracker, as explained in
-@ref{Issues}.
+For each regression test that differs between the baseline and the
+changed code, a regression test entry will be displayed.  Ideally,
+the only changes would be the changes that you were working on.
+If regressions are introduced, they must be fixed before
+committing the code.
+
+@warning{
+The special regression test @file{test-output-distance.ly} will always
+show up as a regression.  This test changes each time it is run, and
+serves to verify that the regression tests have, in fact, run.}
+
+@item
+If you are happy with the results, then stop now.
+
+If you want to continue programming, then make any additional code
+changes, and continue.
+
+@item
+Compile the source with @samp{make} as usual.
+
+@item
+To re-check files that differed between the initial
+@samp{make@tie{}test-baseline} and your post-changes
+@samp{make@tie{}check}, run:
+
+@example
+make test-redo
+@end example
+
+This updates the regression list at @file{out/test-results/index.html}.
+It does @emph{not} redo @file{test-output-distance.ly}.
+
+@item
+When all regressions have been resolved, the output list will be empty.
+
+@item
+Once all regressions have been resolved, a final check should be completed
+by running:
+
+@example
+make test-clean
+make check
+@end example
+
+This cleans the results of the previous @samp{make@tie{}check}, then does the
+automatic regression comparison again.
+
+@end enumerate
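+
+The first comparison pass can be strung together in one shell session.
+The following is only a sketch: it assumes that @code{$LILYPOND_GIT}
+points to your source tree with an already configured @file{build/}
+directory, and @file{my-change.patch} is a hypothetical patch file
+standing in for your own changes.
+
+@example
+cd $LILYPOND_GIT/build/
+make                    # build unmodified git master
+make test-baseline      # record the baseline regtest output
+(cd $LILYPOND_GIT && git apply my-change.patch)  # hypothetical patch
+make                    # rebuild with the changes applied
+make check              # compare against the baseline
+# then open out/test-results/index.html in a browser
+@end example
+
+If you pass @option{-j} or @var{CPU_COUNT} to @code{make}, use the same
+values for @code{test-baseline} and @code{check}, since a different
+number of threads can show up as changes in the profile files.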
+
+@advanced{
+Once a test baseline has been established, there is no need to run it again
+unless git master changed. In other words, if you work with several branches
+and want to do regtests comparison for all of them, you can
+@code{make test-baseline} with git master, checkout some branch,
+@code{make} and @code{make check} it, then switch to another branch,
+@code{make test-clean}, @code{make} and @code{make check} it without doing
+@code{make test-baseline} again.}
+
+@node Pixel-based regtest comparison
+@section Pixel-based regtest comparison
+
+As an alternative to the @code{make test} method for regtest checking (which
+relies upon @code{.signature} files created by a LilyPond run and which describe
+the placing of grobs) there is a script which compares the output of two
+LilyPond versions pixel-by-pixel.  To use this, start by checking out the
+version of LilyPond you want to use as a baseline, and run @code{make}.  Then,
+do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -o
+@end example
+
+The @code{-j9} option tells the script to run 9 jobs in parallel when
+creating the images; change this to your own CPU count plus one.  @code{-o}
+means this is the "old" version.  This will create images of all the regtests in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/old-regtest-results/
+@end example
+
+Now checkout the version you want to compare with the baseline.  Run
+@code{make} again to recreate the LilyPond binary.  Then, do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -n
+@end example
+
+The @code{-n} option tells the script to make a "new" version of the
+images.  They are created in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/new-regtest-results/
+@end example
+
+Once the new images have been created, the script compares the old images with
+the new ones pixel-by-pixel and prints a list of the different images to the
+terminal, together with a count of how many differences were found.  The
+results of the checks are in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/regtest-diffs/
+@end example
+
+To check for differences, browse that directory with an image
+viewer.  Differences are shown in red.  Be aware that some images with complex
+fonts or spacing annotations always display a few minor differences.  These can
+safely be ignored.
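+
+Put together, the whole pixel-based comparison might look like the
+following sketch.  It assumes that @code{$LILYPOND_GIT} is your source
+tree and that @code{$LILYPOND_BUILD_DIR} is the directory where you
+normally run @code{make}; @code{BASELINE} and @code{CANDIDATE} are
+hypothetical placeholders for the two commits, tags or branches to compare.
+
+@example
+cd $LILYPOND_GIT
+git checkout BASELINE                    # placeholder for the old version
+(cd $LILYPOND_BUILD_DIR && make)
+(cd scripts/auxiliar/ && ./make-regtest-pngs.sh -j9 -o)
+git checkout CANDIDATE                   # placeholder for the new version
+(cd $LILYPOND_BUILD_DIR && make)
+(cd scripts/auxiliar/ && ./make-regtest-pngs.sh -j9 -n)
+@end example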
+
+
+@node Finding the cause of a regression
+@section Finding the cause of a regression
+
+Git has special functionality to help track down the exact
+commit which causes a problem.  See the git manual page for
+@code{git bisect}.  This is a job that non-programmers can do,
+although it requires familiarity with git, ability to compile
+LilyPond, and generally a fair amount of technical knowledge.  A
+brief summary is given below, but you may need to consult other
+documentation for in-depth explanations.
+
+Even if you are not familiar with git or are not able to compile
+LilyPond you can still help to narrow down the cause of a
+regression simply by downloading the binary releases of different
+LilyPond versions and testing them for the regression.  Knowing
+which version of LilyPond first exhibited the regression is
+helpful to a developer as it shortens the @code{git bisect}
+procedure.
 
+Once a problematic commit is identified, the programmers' job is
+much easier.  In fact, for most regression bugs, the majority of
+the time is spent simply finding the problematic commit.
 
-@subheading Comparing regtests when modifying the source code
+More information is in @ref{Regression tests}.
 
-When changing any piece of code, developers are asked to verify that the
-regtests still compile successfuly (i.e., not only without error, but
-with an output quality equivalent or superior).  This may be done as
-described in @ref{Testing LilyPond}.
+@subheading git bisect setup
+
+We need to set up the bisect for each problem we want to
+investigate.
+
+Suppose we have an input file which compiled in version 2.13.32,
+but fails in version 2.13.38 and above.
+
+@enumerate
+@item
+Begin the process:
+
+@example
+git bisect start
+@end example
+
+@item
+Give it the earliest known bad tag:
+
+@example
+git bisect bad release/2.13.38-1
+@end example
+
+(You can list the available tags with @code{git tag}.)
+
+@item
+Give it the latest known good tag:
+
+@example
+git bisect good release/2.13.32-1
+@end example
+
+You should now see something like:
+@example
+Bisecting: 195 revisions left to test after this (roughly 8 steps)
+[b17e2f3d7a5853a30f7d5a3cdc6b5079e77a3d2a] Web: Announcement
+update for the new @qq{LilyPond Report}.
+@end example
+
+@end enumerate
+
+@subheading git bisect actual
+
+@enumerate
+
+@item
+Compile the source:
+
+@example
+make
+@end example
+
+@item
+Test your input file:
+
+@example
+out/bin/lilypond test.ly
+@end example
+
+@item
+Test results?
+
+@itemize
+@item
+Does it crash, or is the output bad?  If so:
+
+@example
+git bisect bad
+@end example
+
+@item
+Does your input file produce good output?  If so:
+
+@example
+git bisect good
+@end example
+
+@end itemize
+
+@item
+Once the exact problem commit has been identified, git will inform
+you with a message like:
+
+@example
+6d28aebbaaab1be9961a00bf15a1ef93acb91e30 is the first bad commit
+%%% ... blah blah blah ...
+@end example
+
+If there is still a range of commits, then git will automatically
+select a new version for you to test.  Go to step #1.
+
+@end enumerate
+
+@subheading Recommendation: use two terminal windows
+
+@itemize
+@item
+One window is open to the @code{build/} directory, and alternates
+between these commands:
+
+@example
+make
+out/bin/lilypond test.ly
+@end example
+
+@item
+One window is open to the top source directory, and alternates
+between these commands:
+
+@example
+git bisect good
+git bisect bad
+@end example
+
+@end itemize
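+
+As an alternative to alternating between these commands by hand,
+@code{git bisect run} can drive the loop with a script.  The following
+is a sketch only, and it only detects an input file that fails to
+compile; judging whether the output looks bad still has to be done by
+eye.  The script name @file{bisect-test.sh} is hypothetical, and the
+script assumes that the build directory is @file{build/} and that
+@file{test.ly} has been placed there.
+
+@example
+#!/bin/sh
+# bisect-test.sh (hypothetical name); run from the top source directory
+cd build/ || exit 125     # 125 tells git to skip this commit
+make || exit 125          # a commit that does not build is skipped
+if out/bin/lilypond test.ly
+then
+  exit 0                  # the file compiled: mark the commit good
+else
+  exit 1                  # failure or crash: mark the commit bad
+fi
+@end example
+
+After @code{git bisect start} and marking one bad and one good tag as
+described above, start the loop from the top source directory with
+@code{git bisect run sh bisect-test.sh}.  When it has finished,
+@code{git bisect reset} returns to the original checkout.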
+
+
+@node Memory and coverage tests
+@section Memory and coverage tests
+
+In addition to the graphical output of the regression tests, it is
+possible to test memory usage and to determine how much of the source
+code has been exercised by the tests.
+
+@subheading Memory usage
+
+For tracking memory usage as part of this test, you will need
+GUILE CVS; especially the following patch:
+@smallexample
+@uref{http://www.lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}
+@end smallexample
+
+@subheading Code coverage
+
+For checking the coverage of the test suite, do the following:
+
+@example
+./scripts/auxiliar/build-coverage.sh
+@emph{# uncovered files, least covered first}
+./scripts/auxiliar/coverage.py  --summary out-cov/*.cc
+@emph{# consecutive uncovered lines, longest first}
+./scripts/auxiliar/coverage.py  --uncovered out-cov/*.cc
+@end example
 
 
 @node MusicXML tests
 @section MusicXML tests
 
 
-LilyPond comes with a fairly complete set of regtests for the
-@uref{http://www.musicxml.org/,MusicXML} language.  These tests may
-be seen online at the following address:
+LilyPond comes with a complete set of regtests for the
+@uref{http://www.musicxml.org/,MusicXML} language.  Originally
+developed to test @samp{musicxml2ly}, these regression tests
+can be used to test any MusicXML implementation.
+
+The MusicXML regression tests are found at
+@file{input/regression/musicxml/}.
+
+The output resulting from running these tests
+through @samp{musicxml2ly} followed by @samp{lilypond} is
+available in the LilyPond documentation:
 
 @example
 @uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
 @end example
 
-TBC
 
+@node Grand Regression Test Checking
+@section Grand Regression Test Checking
+
+@subheading What is this all about?
+
+Regression tests (usually abbreviated "regtests") are a collection
+of @file{.ly} files used to check whether LilyPond is working correctly.
+Example: before version 2.15.12 breve noteheads had incorrect width,
+which resulted in collisions with other objects.  After the issue was fixed,
+a small @file{.ly} file demonstrating the problem was added to the regression
+tests as proof that the fix works.  If someone accidentally breaks
+breve width again, we will notice this in the output of that regression test.
+
+@subheading How can I help?
+
+We ask you to help us by checking one or two regtests from time to time.
+You don't need programming skills to do this, or even LilyPond skills;
+basic music notation knowledge is enough, and checking one regtest takes
+less than a minute.  Simply go here:
+
+@example
+@uref{http://www.philholmes.net/lilypond/regtests/}
+@end example
+
+@subheading Some tips on checking regtests
+
+@subsubheading Description text
+
+The description should be clear even for a music beginner.
+If there are any special terms used in the description,
+they all should be explained in our @rglosnamed{Top, Music Glossary}
+or @rinternalsnamed{Top, Internals Reference}.
+Vague descriptions (like "behaves well", "looks reasonable") shouldn't be used.
+
+@ignore
+this may be useful for advanced regtest checking
+@subsubheading Is regtest straightforward and systematic?
+
+Unfortunately some regtests are written poorly.  A good regtest should be
+straightforward: it should be obvious what it checks and how.  Also, it
+usually shouldn't check everything at once.  For example it's a bad idea to test
+accidental placement by constructing one huge chord with many suspended notes
+and loads of accidentals.  It's better to divide such a problem into a series
+of clearly separated cases.
+@end ignore