Documentation/contributor/regressions.itexi

   1 @c -*- coding: utf-8; mode: texinfo; -*-
   2 @node Regression tests
   3 @chapter Regression tests
   4
   5 @menu
   6 * Introduction to regression tests::
   7 * Precompiled regression tests::
   8 * Compiling regression tests::
   9 * Regtest comparison::
  10 * Finding the cause of a regression::
  11 * Memory and coverage tests::
  12 * MusicXML tests::
  13 @end menu
  14
  15
  16 @node Introduction to regression tests
  17 @section Introduction to regression tests
  18
  19 LilyPond has a complete suite of regression tests that are used
  20 to ensure that changes to the code do not break existing behavior.
  21 These regression tests comprise small LilyPond snippets that test
  22 the functionality of each part of LilyPond.
  23
  24 Regression tests are added when new functionality is added to
  25 LilyPond.
  26 We do not yet have a policy on when it is appropriate to add or
  27 modify a regtest when bugs are fixed.  Individual developers
  28 should use their best judgement until this is clarified during the
  29 @ref{Grand Organization Project (GOP)}.
  30
  31 The regression tests are compiled using special @code{make}
  32 targets.  There are three primary uses for the regression
  33 tests.  First, successful completion of the regression tests means
  34 that LilyPond has been properly built.  Second, the output of the
  35 regression tests can be manually checked to ensure that
  36 the graphical output matches the description of the intended
  37 output.  Third, the regression test output from two different
  38 versions of LilyPond can be automatically compared to identify
   39 any differences.  These differences should then be checked
   40 manually to ensure that they are intended.
  41
  42 Regression tests (@qq{regtests}) are available in precompiled form
  43 as part of the documentation.  Regtests can also be compiled
  44 on any machine that has a properly configured LilyPond build
  45 system.
  46
  47
  48 @node Precompiled regression tests
  49 @section Precompiled regression tests
  50
  51 @subheading Regression test output
  52
  53 As part of the release process, the regression tests are run
  54 for every LilyPond release.  Full regression test output is
  55 available for every stable version and the most recent development
  56 version.
  57
  58 Regression test output is available in HTML and PDF format.  Links
  59 to the regression test output are available at the developer's
  60 resources page for the version of interest.
  61
  62 The latest stable version of the regtests is found at:
  63
  64 @example
  65 @uref{http://lilypond.org/doc/stable/input/regression/collated-files.html}
  66 @end example
  67
  68 The latest development version of the regtests is found at:
  69
  70 @example
  71 @uref{http://lilypond.org/doc/latest/input/regression/collated-files.html}
  72 @end example
  73
  74
  75 @subheading Regression test comparison
  76
  77 Each time a new version is released, the regtests are
  78 compiled and the output is automatically compared with the
  79 output of the previous release.  The result of these
  80 comparisons is archived online:
  81
  82 @example
  83 @uref{http://lilypond.org/test/}
  84 @end example
  85
  86 Checking these pages is a very important task for the LilyPond project.
  87 You are invited to report anything that looks broken, or any case
  88 where the output quality is not on par with the previous release,
  89 as described in @rweb{Bug reports}.
  90
  91 @warning{ The special regression test
  92 @file{test-output-distance.ly} will always show up as a
  93 regression.  This test changes each time it is run, and serves to
  94 verify that the regression tests have, in fact, run.}
  95
  96
  97 @subheading What to look for
  98
  99 The test comparison shows all of the changes that occurred between
 100 the current release and the prior release.  Each test that has a
 101 significant difference in output is displayed, with the old
 102 version on the left and the new version on the right.
 103
 104 Regression tests whose output is the same for both versions are
 105 not shown in the test comparison.
 106
 107 @itemize
 108 @item
 109 Images: green blurs in the new version show the approximate
 110 location of elements in the old version.
 111
 112 There are often minor adjustments in spacing which do not indicate
 113 any problem.
 114
 115 @item
 116 Log files: show the difference in command-line output.
 117
  118 The main thing to examine is any change in page count -- if a
 119 file used to fit on 1 page but now requires 4 or 5 pages,
 120 something is suspicious!
 121
 122 @item
 123 Profile files: give information about
 124 TODO?  I don't know what they're for.
 125
 126 @end itemize
 127
 128 @warning{
 129 The automatic comparison of the regtests checks the LilyPond
 130 bounding boxes.  This means that Ghostscript changes and changes
 131 in lyrics or text are not found.
 132 }
 133
 134 @node Compiling regression tests
 135 @section Compiling regression tests
 136
 137 Developers may wish to see the output of the complete regression
 138 test suite for the current version of the source repository
  139 between releases.  First obtain the current source code; see
  140 @ref{Working with source code}.  Then build
  141 the LilyPond binary; see @ref{Compiling LilyPond}.
 142
 143 Uninstalling the previous LilyPond version is not necessary, nor is
 144 running @code{make install}, since the tests will automatically be
 145 compiled with the LilyPond binary you have just built in your source
 146 directory.
 147
 148 From this point, the regtests are compiled with:
 149
 150 @example
 151 make test
 152 @end example
 153
 154 If you have a multi-core machine you may want to use the @option{-j}
  155 option and @var{CPU_COUNT} variable, as
 156 described in @ref{Saving time with CPU_COUNT}.
 157 For a quad-core processor the complete command would be:
 158
 159 @example
 160 make -j5 CPU_COUNT=5 test
 161 @end example
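
The job count can also be derived from the machine instead of being
typed by hand.  The following is a minimal sketch, assuming the
@qq{number of cores plus one} pattern that the quad-core example above
suggests; the variable names @code{cores} and @code{njobs} are
illustrative only, and the command is printed rather than run so that
it can be inspected first.

@example
# Count the cores; nproc comes from GNU coreutils, so fall back to getconf.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
# Assumed rule of thumb: one more job than there are cores.
njobs=$((cores + 1))
echo "make -j$njobs CPU_COUNT=$njobs test"
@end example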
 162
 163 The regtest output will then be available in
 164 @file{input/regression/out-test}.
 165 @file{input/regression/out-test/collated-examples.html}
 166 contains a listing of all the regression tests that were run,
 167 but none of the images are included.  Individual images are
 168 also available in this directory.
 169
 170 The primary use of @samp{make@tie{}test} is to verify that the
 171 regression tests all run without error.  The regression test
 172 page that is part of the documentation is created only when the
 173 documentation is built, as described in @ref{Generating documentation}.
 174 Note that building the documentation requires more installed components
 175 than building the source code, as described in
 176 @ref{Requirements for building documentation}.
 177
 178
 179 @node Regtest comparison
 180 @section Regtest comparison
 181
 182 Before modified code is committed to master, a regression test
 183 comparison must be completed to ensure that the changes have
 184 not caused problems with previously working code.  The comparison
 185 is made automatically upon compiling the regression test suite
 186 twice.
 187
 188 @enumerate
 189
 190 @item
 191 Before making changes, a baseline should be established by
 192 running:
 193
 194 @example
 195 make test-baseline
 196 @end example
 197
 198 @item
 199 Make your changes, or apply the patch(es) to consider.
 200
 201 @item
 202 Compile the source with @samp{make} as usual.
 203
 204 @item
 205 Check for unintentional changes to the regtests:
 206
 207 @example
 208 make check
 209 @end example
 210
 211 After this has finished, a regression test comparison will be
 212 available at:
 213
 214 @example
 215 out/test-results/index.html
 216 @end example
 217
 218 For each regression test that differs between the baseline and the
  219 changed code, a regression test entry will be displayed.  Ideally,
 220 the only changes would be the changes that you were working on.
 221 If regressions are introduced, they must be fixed before
 222 committing the code.
 223
 224 @warning{
 225 The special regression test @file{test-output-distance.ly} will always
 226 show up as a regression.  This test changes each time it is run, and
 227 serves to verify that the regression tests have, in fact, run.}
 228
 229 @item
 230 If you are happy with the results, then stop now.
 231
 232 If you want to continue programming, then make any additional code
 233 changes, and continue.
 234
 235 @item
 236 Compile the source with @samp{make} as usual.
 237
 238 @item
 239 To re-check files that differed between the initial
 240 @samp{make@tie{}test-baseline} and your post-changes
 241 @samp{make@tie{}check}, run:
 242
 243 @example
 244 make test-redo
 245 @end example
 246
 247 This updates the regression list at @file{out/test-results/index.html}.
 248 It does @emph{not} redo @file{test-output-distance.ly}.
 249
 250 @item
 251 When all regressions have been resolved, the output list will be empty.
 252
 253 @item
 254 Once all regressions have been resolved, a final check should be completed
 255 by running:
 256
 257 @example
 258 make test-clean
 259 make check
 260 @end example
 261
 262 This cleans the results of the previous @samp{make@tie{}check}, then does the
 263 automatic regression comparison again.
 264
 265 @end enumerate
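
When reviewing a single patch, the first four steps above can be
collected into one shell function.  This is only a sketch under stated
assumptions: the name @code{check_patch} is hypothetical, the patch is
assumed to apply cleanly with @code{git apply}, and the build is
assumed to be configured in the top source directory so that
@code{make} and @code{git} can be run from the same place.

@example
# Hypothetical helper.  The parentheses run the body in a subshell, so
# "set -e" stops at the first failing step without closing your shell.
check_patch() (
  set -e
  make test-baseline   # record the output of the unmodified tree
  git apply "$1"       # apply the patch under review
  make                 # rebuild with the change
  make check           # compare the new output against the baseline
  echo "Comparison written to out/test-results/index.html"
)
@end example

It would be called as @code{check_patch my-fix.patch}, where
@file{my-fix.patch} is a placeholder name.  The comparison page still
has to be read by a person; the function only runs the commands.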
 266
 267
 268 @node Finding the cause of a regression
 269 @section Finding the cause of a regression
 270
  271 Git has special functionality to help track down the exact
 272 commit which causes a problem.  See the git manual page for
 273 @code{git bisect}.  This is a job that non-programmers can do,
 274 although it requires familiarity with git, ability to compile
 275 LilyPond, and generally a fair amount of technical knowledge.  A
 276 brief summary is given below, but you may need to consult other
 277 documentation for in-depth explanations.
 278
 279 Even if you are not familiar with git or are not able to compile
 280 LilyPond you can still help to narrow down the cause of a
 281 regression simply by downloading the binary releases of different
 282 LilyPond versions and testing them for the regression.  Knowing
 283 which version of LilyPond first exhibited the regression is
 284 helpful to a developer as it shortens the @code{git bisect}
 285 procedure.
 286
 287 Once a problematic commit is identified, the programmers' job is
 288 much easier.  In fact, for most regression bugs, the majority of
 289 the time is spent simply finding the problematic commit.
 290
 291 More information is in @ref{Regression tests}.
 292
 293 @subheading git bisect setup
 294
 295 We need to set up the bisect for each problem we want to
 296 investigate.
 297
 298 Suppose we have an input file which compiled in version 2.13.32,
 299 but fails in version 2.13.38 and above.
 300
 301 @enumerate
 302 @item
 303 Begin the process:
 304
 305 @example
 306 git bisect start
 307 @end example
 308
 309 @item
 310 Give it the earliest known bad tag:
 311
 312 @example
 313 git bisect bad release/2.13.38-1
 314 @end example
 315
  316 (You can list the available tags with @code{git tag}.)
 317
 318 @item
 319 Give it the latest known good tag:
 320
 321 @example
 322 git bisect good release/2.13.32-1
 323 @end example
 324
 325 You should now see something like:
 326 @example
 327 Bisecting: 195 revisions left to test after this (roughly 8 steps)
 328 [b17e2f3d7a5853a30f7d5a3cdc6b5079e77a3d2a] Web: Announcement
 329 update for the new @qq{LilyPond Report}.
 330 @end example
 331
 332 @end enumerate
 333
  334 @subheading git bisect procedure
 335
 336 @enumerate
 337
 338 @item
 339 Compile the source:
 340
 341 @example
 342 make
 343 @end example
 344
 345 @item
 346 Test your input file:
 347
 348 @example
 349 out/bin/lilypond test.ly
 350 @end example
 351
 352 @item
 353 Test results?
 354
 355 @itemize
 356 @item
 357 Does it crash, or is the output bad?  If so:
 358
 359 @example
 360 git bisect bad
 361 @end example
 362
 363 @item
 364 Does your input file produce good output?  If so:
 365
 366 @example
 367 git bisect good
 368 @end example
 369
 370 @end itemize
 371
 372 @item
 373 Once the exact problem commit has been identified, git will inform
 374 you with a message like:
 375
 376 @example
 377 6d28aebbaaab1be9961a00bf15a1ef93acb91e30 is the first bad commit
 378 %%% ... blah blah blah ...
 379 @end example
 380
 381 If there is still a range of commits, then git will automatically
  382 select a new version for you to test.  Return to step 1.
 383
 384 @end enumerate
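
If the regression shows up as a crash or an error, the compile and test
steps above can be handed to @code{git bisect run}, which calls a
script once per commit and reads its exit status.  The following is a
sketch, not a project-supplied tool: the file name
@file{bisect-test.sh} is hypothetical, and it assumes an in-tree build
so that @code{make} and @file{out/bin/lilypond} work from the directory
where the bisect is running.

@example
# Write the hypothetical helper script that git bisect run will call.
cat > bisect-test.sh <<'EOF'
#!/bin/sh
# Exit status 125 asks git bisect to skip a commit that does not build.
make || exit 125
# Zero marks the commit good; any other status from 1 to 127 marks it bad.
out/bin/lilypond test.ly
EOF
chmod +x bisect-test.sh
@end example

Once the good and bad tags have been given, @code{git bisect run
./bisect-test.sh} repeats the steps until the first bad commit is
reported, and @code{git bisect reset} then returns to the branch you
started from.  Output that compiles but looks wrong cannot be detected
this way and still has to be judged by eye.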
 385
 386 @subheading Recommendation: use two terminal windows
 387
 388 @itemize
 389 @item
 390 One window is open to the @code{build/} directory, and alternates
 391 between these commands:
 392
 393 @example
 394 make
 395 out/bin/lilypond test.ly
 396 @end example
 397
 398 @item
 399 One window is open to the top source directory, and alternates
 400 between these commands:
 401
 402 @example
 403 git bisect good
 404 git bisect bad
 405 @end example
 406
 407 @end itemize
 408
 409
 410 @node Memory and coverage tests
 411 @section Memory and coverage tests
 412
 413 In addition to the graphical output of the regression tests, it is
 414 possible to test memory usage and to determine how much of the source
 415 code has been exercised by the tests.
 416
 417 @subheading Memory usage
 418
 419 For tracking memory usage as part of this test, you will need
 420 GUILE CVS; especially the following patch:
 421 @uref{http://www.lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}.
 422
 423 @subheading Code coverage
 424
  425 To check the coverage of the test suite, run the following:
 426
 427 @example
 428 ./scripts/auxiliar/build-coverage.sh
 429 @emph{# uncovered files, least covered first}
 430 ./scripts/auxiliar/coverage.py  --summary out-cov/*.cc
 431 @emph{# consecutive uncovered lines, longest first}
 432 ./scripts/auxiliar/coverage.py  --uncovered out-cov/*.cc
 433 @end example
 434
 435
 436 @node MusicXML tests
 437 @section MusicXML tests
 438
 439
 440 LilyPond comes with a complete set of regtests for the
 441 @uref{http://www.musicxml.org/,MusicXML} language.  Originally
 442 developed to test @samp{musicxml2ly}, these regression tests
 443 can be used to test any MusicXML implementation.
 444
 445 The MusicXML regression tests are found at
 446 @file{input/regression/musicxml/}.
 447
 448 The output resulting from running these tests
  449 through @samp{musicxml2ly} followed by @samp{lilypond} is
 450 available in the LilyPond documentation:
 451
 452 @example
 453 @uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
 454 @end example
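
The same two-stage conversion can be tried locally.  The sketch below
makes several assumptions: the function name
@code{musicxml_roundtrip} is hypothetical, the test files are assumed
to carry the @file{.xml} extension, and both @code{musicxml2ly} and
@code{lilypond} are assumed to be on your @code{PATH}.

@example
# Hypothetical helper: convert each MusicXML test, then typeset the result.
# The parentheses run the body in a subshell, keeping f and ly local.
musicxml_roundtrip() (
  for f in "$1"/*.xml; do
    [ -e "$f" ] || continue            # directory had no .xml files
    ly=$(basename "$f" .xml).ly        # output name in the current directory
    musicxml2ly -o "$ly" "$f" && lilypond "$ly"
  done
)
@end example

It would be called as @code{musicxml_roundtrip
input/regression/musicxml} from the top source directory.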
 455