X-Git-Url: https://git.donarmstrong.com/?a=blobdiff_plain;f=Documentation%2Fcontributor%2Fregressions.itexi;h=365582e87f7a8beef00a6e437f67d7b005fba98a;hb=a775c0535512573cb013cd230a2630dffd933ac0;hp=b23e6b1abb99c7e2c7f2ec355c498c938aff9576;hpb=d199c2786d16e1fc00bd17fd9b1a54a8312e2079;p=lilypond.git

diff --git a/Documentation/contributor/regressions.itexi b/Documentation/contributor/regressions.itexi
index b23e6b1abb..365582e87f 100644
--- a/Documentation/contributor/regressions.itexi
+++ b/Documentation/contributor/regressions.itexi
@@ -7,9 +7,11 @@
 * Precompiled regression tests::
 * Compiling regression tests::
 * Regtest comparison::
+* Pixel-based regtest comparison::
 * Finding the cause of a regression::
 * Memory and coverage tests::
 * MusicXML tests::
+* Grand Regression Test Checking::
 @end menu
 
 
@@ -98,8 +100,20 @@ verify that the regression tests have, in fact, run.}
 
 The test comparison shows all of the changes that occurred between
 the current release and the prior release. Each test that has a
-significant difference in output is displayed, with the old
-version on the left and the new version on the right.
+significant (noticeable) difference in output is displayed, with
+the old version on the left and the new version on the right.
+
+Some of the small changes can be ignored (slightly different slur
+shapes, small variations in note spacing), but this is not always
+the case: sometimes even the smallest change means that something
+is wrong. To help in distinguishing these cases, we use bigger
+staff size when small differences matter.
+
+Staff size 30 generally means "pay extra attention to details".
+Staff size 40 (two times bigger than default size) or more means
+that the regtest @strong{is} about the details.
+
+Staff size smaller than default doesn't mean anything.
 
 Regression tests whose output is the same for both versions are
 not shown in the test comparison.
@@ -122,6 +136,11 @@ something is suspicious!
 
 @item Profile files: give information about
 TODO? I don't know what they're for.
+Apparently they give some information about CPU usage. If you got
+tons of changes in cell counts, this probably means that you compiled
+@code{make test-baseline} with a different number of CPU threads than
+@code{make check}. Try redoing tests from scratch with the same
+number of threads each time -- see @ref{Saving time with the -j option}.
 
 @end itemize
 
@@ -182,7 +201,8 @@ than building the source code, as described in
 @node Regtest comparison
 @section Regtest comparison
 
-Before modified code is committed to master, a regression test
+Before modified code is committed to @code{master} (via @code{staging}),
+a regression test
 comparison must be completed to ensure that the changes have
 not caused problems with previously working code. The comparison
 is made automatically upon compiling the regression test suite
@@ -195,7 +215,7 @@ Run @code{make} with current git master without any of your changes.
 
 @item
 Before making changes to the code, establish a baseline for the comparison by
-going to the @file{lilypond-git/build/} directory and running:
+going to the @file{$LILYPOND_GIT/build/} directory and running:
 
 @example
 make test-baseline
@@ -279,6 +299,58 @@ and want to do regtests comparison for all of them, you can
 @code{make test-clean}, @code{make} and @code{make check} it without doing
 @code{make test-baseline} again.}
 
+@node Pixel-based regtest comparison
+@section Pixel-based regtest comparison
+
+As an alternative to the @code{make test} method for regtest checking (which
+relies upon @code{.signature} files created by a LilyPond run and which describe
+the placing of grobs) there is a script which compares the output of two
+LilyPond versions pixel-by-pixel. To use this, start by checking out the
+version of LilyPond you want to use as a baseline, and run @code{make}. Then,
+do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -o
+@end example
+
+The @code{-j9} option tells the script to use 9 CPUs to create the
+images - change this to your own CPU count+1. @code{-o} means this is the "old"
+version. This will create images of all the regtests in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/old-regtest-results/
+@end example
+
+Now check out the version you want to compare with the baseline. Run
+@code{make} again to recreate the LilyPond binary. Then, do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -n
+@end example
+
+The @code{-n} option tells the script to make a "new" version of the
+images. They are created in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/new-regtest-results/
+@end example
+
+Once the new images have been created, the script compares the old images with
+the new ones pixel-by-pixel and prints a list of the different images to the
+terminal, together with a count of how many differences were found. The
+results of the checks are in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/regtest-diffs/
+@end example
+
+To check for differences, browse that directory with an image
+viewer. Differences are shown in red. Be aware that some images with complex
+fonts or spacing annotations always display a few minor differences. These can
+safely be ignored.
+
 @node Finding the cause of a regression
 @section Finding the cause of a regression
 
@@ -470,3 +542,49 @@ available in the LilyPond documentation:
 @uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
 @end example
 
+
+@node Grand Regression Test Checking
+@section Grand Regression Test Checking
+
+@subheading What is this all about?
+
+Regression tests (usually abbreviated "regtests") are a collection
+of @file{.ly} files used to check whether LilyPond is working correctly.
+Example: before version 2.15.12 breve noteheads had incorrect width,
+which resulted in collisions with other objects. After the issue was fixed,
+a small @file{.ly} file demonstrating the problem was added to the regression
+tests as a proof that the fix works. If someone accidentally breaks
+breve width again, we will notice this in the output of that regression test.
+
+@subheading How can I help?
+
+We ask you to help us by checking one or two regtests from time to time.
+You don't need programming skills to do this, not even LilyPond skills -
+just basic music notation knowledge; checking one regtest takes less than
+a minute. Simply go here:
+
+@example
+@uref{http://www.philholmes.net/lilypond/regtests/}
+@end example
+
+@subheading Some tips on checking regtests
+
+@subsubheading Description text
+
+The description should be clear even for a music beginner.
+If there are any special terms used in the description,
+they all should be explained in our @rglosnamed{Top, Music Glossary}
+or @rinternalsnamed{Top, Internals Reference}.
+Vague descriptions (like "behaves well", "looks reasonable") shouldn't be used.
+
+@ignore
+this may be useful for advanced regtest checking
+@subsubheading Is the regtest straightforward and systematic?
+
+Unfortunately some regtests are written poorly. A good regtest should be
+straightforward: it should be obvious what it checks and how. Also, it
+usually shouldn't check everything at once. For example it's a bad idea to test
+accidental placement by constructing one huge chord with many suspended notes
+and loads of accidentals. It's better to divide such a problem into a series
+of clearly separated cases.
+@end ignore