X-Git-Url: https://git.donarmstrong.com/?a=blobdiff_plain;f=Documentation%2Fcontributor%2Fregressions.itexi;h=365582e87f7a8beef00a6e437f67d7b005fba98a;hb=ac569ff38c26fb3aaf1bc16d0e213589bdc4fa8f;hp=aa5aeaa4e6a583b7365f95a0a156d67864ea0c83;hpb=9d1520b21710bd22872010ae9aa4c4899014e9d4;p=lilypond.git

diff --git a/Documentation/contributor/regressions.itexi b/Documentation/contributor/regressions.itexi
index aa5aeaa4e6..365582e87f 100644
--- a/Documentation/contributor/regressions.itexi
+++ b/Documentation/contributor/regressions.itexi
@@ -7,6 +7,7 @@
 * Precompiled regression tests::
 * Compiling regression tests::
 * Regtest comparison::
+* Pixel-based regtest comparison::
 * Finding the cause of a regression::
 * Memory and coverage tests::
 * MusicXML tests::
@@ -99,8 +100,20 @@ verify that the regression tests have, in fact, run.}
 
 The test comparison shows all of the changes that occurred between
 the current release and the prior release. Each test that has a
-significant difference in output is displayed, with the old
-version on the left and the new version on the right.
+significant (noticeable) difference in output is displayed, with
+the old version on the left and the new version on the right.
+
+Some of the small changes can be ignored (slightly different slur
+shapes, small variations in note spacing), but this is not always
+the case: sometimes even the smallest change means that something
+is wrong. To help in distinguishing these cases, we use bigger
+staff size when small differences matter.
+
+Staff size 30 generally means "pay extra attention to details".
+Staff size 40 (two times bigger than default size) or more means
+that the regtest @strong{is} about the details.
+
+Staff size smaller than default doesn't mean anything.
 
 Regression tests whose output is the same for both versions are
 not shown in the test comparison.
@@ -123,6 +136,11 @@ something is suspicious!
 
 @item Profile files: give information about
 TODO? I don't know what they're for.
+Apparently they give some information about CPU usage. If you got
+tons of changes in cell counts, this probably means that you compiled
+@code{make test-baseline} with a different amount of CPU threads than
+@code{make check}. Try redoing tests from scratch with the same
+number of threads each time -- see @ref{Saving time with the -j option}.
 
 @end itemize
 
@@ -183,7 +201,8 @@ than building the source code, as described in
 @node Regtest comparison
 @section Regtest comparison
 
-Before modified code is committed to master, a regression test
+Before modified code is committed to @code{master} (via @code{staging}),
+a regression test
 comparison must be completed to ensure that the changes have
 not caused problems with previously working code. The comparison
 is made automatically upon compiling the regression test suite
@@ -196,7 +215,7 @@ Run @code{make} with current git master without any of your changes.
 
 @item
 Before making changes to the code, establish a baseline for the comparison by
-going to the @file{lilypond-git/build/} directory and running:
+going to the @file{$LILYPOND_GIT/build/} directory and running:
 
 @example
 make test-baseline
@@ -280,6 +299,58 @@ and want to do regtests comparison for all of them, you can
 @code{make test-clean}, @code{make} and @code{make check} it without doing
 @code{make test-baseline} again.}
 
+@node Pixel-based regtest comparison
+@section Pixel-based regtest comparison
+
+As an alternative to the @code{make test} method for regtest checking (which
+relies upon @code{.signature} files created by a LilyPond run and which describe
+the placing of grobs) there is a script which compares the output of two
+LilyPond versions pixel-by-pixel. To use this, start by checking out the
+version of LilyPond you want to use as a baseline, and run @code{make}. Then,
+do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -o
+@end example
+
+The @code{-j9} option tells the script to use 9 CPUs to create the
+images - change this to your own CPU count+1. @code{-o} means this is the "old"
+version. This will create images of all the regtests in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/old-regtest-results/
+@end example
+
+Now checkout the version you want to compare with the baseline. Run
+@code{make} again to recreate the LilyPond binary. Then, do the following:
+
+@example
+cd $LILYPOND_GIT/scripts/auxiliar/
+./make-regtest-pngs.sh -j9 -n
+@end example
+
+The @code{-n} option tells the script to make a "new" version of the
+images. They are created in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/new-regtest-results/
+@end example
+
+Once the new images have been created, the script compares the old images with
+the new ones pixel-by-pixel and prints a list of the different images to the
+terminal, together with a count of how many differences were found. The
+results of the checks are in
+
+@example
+$LILYPOND_BUILD_DIR/out-png-check/regtest-diffs/
+@end example
+
+To check for differences, browse that directory with an image
+viewer. Differences are shown in red. Be aware that some images with complex
+fonts or spacing annotations always display a few minor differences. These can
+safely be ignored.
+
 @node Finding the cause of a regression
 @section Finding the cause of a regression
 
@@ -485,13 +556,15 @@ a small @file{.ly} file demonstrating the problem was added to the regression
 tests as a proof that the fix works. If someone will accidentally break
 breve width again, we will notice this in the output of that regression test.
 
-We are asking you to help us by checking a regtest or two from time to time.
+@subheading How can I help?
+
+We ask you to help us by checking one or two regtests from time to time.
 You don't need programming skills to do this, not even LilyPond skills - just
 basic music notation knowledge; checking one regtest takes less than a minute.
 Simply go here:
 
 @example
-@uref{http://www.holmessoft.co.uk/homepage/private/regtests/}
+@uref{http://www.philholmes.net/lilypond/regtests/}
 @end example
 
 @subheading Some tips on checking regtests
@@ -505,6 +578,7 @@ or @rinternalsnamed{Top, Internals Reference}.
 Vague descriptions (like "behaves well", "looks reasonable") shouldn't be used.
 
 @ignore
+this may be useful for advanced regtest checking
 @subsubheading Is regtest straightforward and systematic?
 
 Unfortunately some regtests are written poorly. A good regtest should be