Documentation/contributor/regressions.itexi

@c -*- coding: utf-8; mode: texinfo; -*-
@node Regression tests
@chapter Regression tests

@menu
* Introduction to regression tests::
* Precompiled regression tests::
* Compiling regression tests::
* Regtest comparison::
* Finding the cause of a regression::
* Memory and coverage tests::
* MusicXML tests::
@end menu


@node Introduction to regression tests
@section Introduction to regression tests

LilyPond has a complete suite of regression tests that are used
to ensure that changes to the code do not break existing behavior.
These regression tests comprise small LilyPond snippets that test
the functionality of each part of LilyPond.

Regression tests are added when new functionality is added to
LilyPond.
We do not yet have a policy on when it is appropriate to add or
modify a regtest when bugs are fixed.  Individual developers
should use their best judgement until this is clarified during the
@ref{Grand Organization Project (GOP)}.

The regression tests are compiled using special @code{make}
targets.  There are three primary uses for the regression
tests.  First, successful completion of the regression tests means
that LilyPond has been properly built.  Second, the output of the
regression tests can be manually checked to ensure that
the graphical output matches the description of the intended
output.  Third, the regression test output from two different
versions of LilyPond can be automatically compared to identify
any differences.  These differences should then be manually
checked to ensure that they are intended.

Regression tests (@qq{regtests}) are available in precompiled form
as part of the documentation.  Regtests can also be compiled
on any machine that has a properly configured LilyPond build
system.


@node Precompiled regression tests
@section Precompiled regression tests

@subheading Regression test output

As part of the release process, the regression tests are run
for every LilyPond release.  Full regression test output is
available for every stable version and the most recent development
version.

Regression test output is available in HTML and PDF format.  Links
to the regression test output are available at the developer's
resources page for the version of interest.

The latest stable version of the regtests is found at:

@example
@uref{http://lilypond.org/doc/stable/input/regression/collated-files.html}
@end example

The latest development version of the regtests is found at:

@example
@uref{http://lilypond.org/doc/latest/input/regression/collated-files.html}
@end example


@subheading Regression test comparison

Each time a new version is released, the regtests are
compiled and the output is automatically compared with the
output of the previous release.  The result of these
comparisons is archived online:

@example
@uref{http://lilypond.org/test/}
@end example

Checking these pages is a very important task for the LilyPond project.
You are invited to report anything that looks broken, or any case
where the output quality is not on par with the previous release,
as described in @rweb{Bug reports}.

@warning{The special regression test
@file{test-output-distance.ly} will always show up as a
regression.  This test changes each time it is run, and serves to
verify that the regression tests have, in fact, run.}


@subheading What to look for

The test comparison shows all of the changes that occurred between
the current release and the prior release.  Each test that has a
significant difference in output is displayed, with the old
version on the left and the new version on the right.

Regression tests whose output is the same for both versions are
not shown in the test comparison.

@itemize
@item
Images: green blurs in the new version show the approximate
location of elements in the old version.

There are often minor adjustments in spacing which do not indicate
any problem.

@item
Log files: show the difference in command-line output.

The main thing to look for is a change in page count: if a
file used to fit on 1 page but now requires 4 or 5 pages,
something is suspicious!

@item
Profile files: their purpose is not yet documented here.
@c TODO: describe what the profile files contain.

@end itemize

@warning{
The automatic comparison of the regtests checks only the LilyPond
bounding boxes.  As a result, Ghostscript changes and changes
in lyrics or text are not detected.
}

@node Compiling regression tests
@section Compiling regression tests

Between releases, developers may wish to see the output of the
complete regression test suite for the current state of the source
repository.  Current source code is available; see
@ref{Working with source code}.

For regression testing, @code{../configure} should be run with the
@option{--disable-optimising} option.  Then you will need
to build the LilyPond binary; see @ref{Compiling LilyPond}.

Uninstalling the previous LilyPond version is not necessary, nor is
running @code{make install}, since the tests will automatically be
compiled with the LilyPond binary you have just built in your source
directory.

From this point, the regtests are compiled with:

@example
make test
@end example

If you have a multi-core machine, you may want to use the @option{-j}
option and the @var{CPU_COUNT} variable, as
described in @ref{Saving time with CPU_COUNT}.
For a quad-core processor the complete command would be:

@example
make -j5 CPU_COUNT=5 test
@end example
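
The quad-core example uses one more job than there are cores.  A
minimal sketch of that rule as a shell function follows.  The name
@code{regtest_command} is hypothetical, and the @qq{cores plus one}
rule is an assumption generalised from the example above, so check
@ref{Saving time with CPU_COUNT} before relying on it.

```shell
# regtest_command CORES
# Print the regtest command line for a machine with CORES cores.
# Assumption: jobs = cores + 1, generalised from the quad-core example.
regtest_command() {
    cores=$1
    jobs=$((cores + 1))
    printf 'make -j%s CPU_COUNT=%s test\n' "$jobs" "$jobs"
}

# Show the command for a quad-core machine.
regtest_command 4    # prints: make -j5 CPU_COUNT=5 test
```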

The regtest output will then be available in
@file{input/regression/out-test}.
@file{input/regression/out-test/collated-examples.html}
contains a listing of all the regression tests that were run,
but none of the images are included.  Individual images are
also available in this directory.

The primary use of @samp{make@tie{}test} is to verify that the
regression tests all run without error.  The regression test
page that is part of the documentation is created only when the
documentation is built, as described in @ref{Generating documentation}.
Note that building the documentation requires more installed components
than building the source code, as described in
@ref{Requirements for building documentation}.


@node Regtest comparison
@section Regtest comparison

Before modified code is committed to master, a regression test
comparison must be completed to ensure that the changes have
not caused problems with previously working code.  The comparison
is made automatically upon compiling the regression test suite
twice.

@enumerate

@item
Run @code{make} with current git master without any of your changes.

@item
Before making changes to the code, establish a baseline for the comparison by
going to the @file{lilypond-git/build/} directory and running:

@example
make test-baseline
@end example

@item
Make your changes, or apply the patch(es) under consideration.

@item
Compile the source with @samp{make} as usual.

@item
Check for unintentional changes to the regtests:

@example
make check
@end example

After this has finished, a regression test comparison will be
available (relative to the current @file{build/} directory) at:

@example
out/test-results/index.html
@end example

For each regression test that differs between the baseline and the
changed code, a regression test entry will be displayed.  Ideally,
the only changes would be the changes that you were working on.
If regressions are introduced, they must be fixed before
committing the code.

@warning{
The special regression test @file{test-output-distance.ly} will always
show up as a regression.  This test changes each time it is run, and
serves to verify that the regression tests have, in fact, run.}

@item
If you are happy with the results, then stop now.

If you want to continue programming, then make any additional code
changes, and continue.

@item
Compile the source with @samp{make} as usual.

@item
To re-check files that differed between the initial
@samp{make@tie{}test-baseline} and your post-changes
@samp{make@tie{}check}, run:

@example
make test-redo
@end example

This updates the regression list at @file{out/test-results/index.html}.
It does @emph{not} redo @file{test-output-distance.ly}.

@item
When all regressions have been resolved, the output list will be empty.

@item
Once all regressions have been resolved, a final check should be completed
by running:

@example
make test-clean
make check
@end example

This cleans the results of the previous @samp{make@tie{}check}, then does the
automatic regression comparison again.

@end enumerate

@advanced{
Once a test baseline has been established, there is no need to establish it
again unless git master has changed.  In other words, if you work with several
branches and want to do regtest comparisons for all of them, you can run
@code{make test-baseline} with git master, check out some branch, run
@code{make} and @code{make check} on it, then switch to another branch and run
@code{make test-clean}, @code{make} and @code{make check} on it without doing
@code{make test-baseline} again.}
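
The steps above can be collected into small shell helpers.  This is
only a sketch: the function names and the @code{DRY_RUN} variable are
hypothetical, and the functions assume that they are called from the
build directory.  Only the @code{make} targets themselves come from
the procedure described above.

```shell
# run COMMAND...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-2: build unmodified git master and record the baseline.
regtest_baseline() {
    run make && run make test-baseline
}

# Steps 4-5: rebuild after your changes and compare with the baseline.
regtest_check() {
    run make && run make check
}

# Final step: discard the previous comparison and compare once more.
regtest_final() {
    run make test-clean && run make check
}
```

With @code{DRY_RUN=1} set, each helper only prints the commands it
would run, which is a cheap way to check the sequence before starting
a long build.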


@node Finding the cause of a regression
@section Finding the cause of a regression

Git has special functionality to help track down the exact
commit which causes a problem.  See the git manual page for
@code{git bisect}.  This is a job that non-programmers can do,
although it requires familiarity with git, the ability to compile
LilyPond, and generally a fair amount of technical knowledge.  A
brief summary is given below, but you may need to consult other
documentation for in-depth explanations.

Even if you are not familiar with git or are not able to compile
LilyPond, you can still help to narrow down the cause of a
regression simply by downloading the binary releases of different
LilyPond versions and testing them for the regression.  Knowing
which version of LilyPond first exhibited the regression is
helpful to a developer as it shortens the @code{git bisect}
procedure.

Once a problematic commit is identified, the programmers' job is
much easier.  In fact, for most regression bugs, the majority of
the time is spent simply finding the problematic commit.

More information is in @ref{Regression tests}.

@subheading git bisect setup

We need to set up the bisect for each problem we want to
investigate.

Suppose we have an input file which compiled in version 2.13.32,
but fails in version 2.13.38 and above.

@enumerate
@item
Begin the process:

@example
git bisect start
@end example

@item
Give it the earliest known bad tag:

@example
git bisect bad release/2.13.38-1
@end example

(You can list the tags with @code{git tag}.)

@item
Give it the latest known good tag:

@example
git bisect good release/2.13.32-1
@end example

You should now see something like:
@example
Bisecting: 195 revisions left to test after this (roughly 8 steps)
[b17e2f3d7a5853a30f7d5a3cdc6b5079e77a3d2a] Web: Announcement
update for the new @qq{LilyPond Report}.
@end example

@end enumerate
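
The @qq{roughly 8 steps} estimate reflects the fact that each test
about halves the number of remaining candidates.  The sketch below
counts how many halvings are needed.  It is an approximation for
illustration, not git's own calculation, and the function name
@code{bisect_steps} is hypothetical.

```shell
# bisect_steps REVISIONS
# Print the smallest number of halvings that covers REVISIONS candidates,
# i.e. the smallest k such that 2^k >= REVISIONS.
bisect_steps() {
    revisions=$1
    steps=0
    span=1
    while [ "$span" -lt "$revisions" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 195    # prints 8, since 128 < 195 <= 256
```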

@subheading git bisect actual

@enumerate

@item
Compile the source:

@example
make
@end example

@item
Test your input file:

@example
out/bin/lilypond test.ly
@end example

@item
Test results?

@itemize
@item
Does it crash, or is the output bad?  If so:

@example
git bisect bad
@end example

@item
Does your input file produce good output?  If so:

@example
git bisect good
@end example

@end itemize

@item
Once the exact problem commit has been identified, git will inform
you with a message like:

@example
6d28aebbaaab1be9961a00bf15a1ef93acb91e30 is the first bad commit
%%% ... blah blah blah ...
@end example

If a range of commits still remains, git will instead automatically
select a new revision for you to test.  In that case, go back to
step 1.

@end enumerate
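
The manual loop above can also be automated with
@code{git bisect run}, which runs a command at each revision and
interprets its exit status: 0 marks the revision as good, 125 asks git
to skip it, and any other status from 1 to 127 marks it as bad.  The
following is a sketch only.  The function names are hypothetical, the
build is assumed to live in the directory from which the functions are
called, and the test relies on LilyPond returning a non-zero exit
status, so it detects crashes and errors but not output that merely
looks wrong.

```shell
# bisect_verdict BUILD_STATUS TEST_STATUS
# Map the exit statuses of the build and of the LilyPond run to the exit
# code that "git bisect run" expects: 0 = good, 125 = skip, 1 = bad.
bisect_verdict() {
    build_status=$1
    test_status=$2
    if [ "$build_status" -ne 0 ]; then
        echo 125    # revision cannot be built: ask git to skip it
    elif [ "$test_status" -ne 0 ]; then
        echo 1      # LilyPond failed on the input file: bad
    else
        echo 0      # input file compiled: good
    fi
}

# bisect_step: build the current revision and test the input file.
# Assumes the current directory is the build directory and that
# test.ly is located there.
bisect_step() {
    make >/dev/null 2>&1
    build_status=$?
    test_status=0
    if [ "$build_status" -eq 0 ]; then
        out/bin/lilypond test.ly >/dev/null 2>&1
        test_status=$?
    fi
    return "$(bisect_verdict "$build_status" "$test_status")"
}
```

To use it, a wrapper script that defines these functions, calls
@code{bisect_step} and exits with its return value would be passed to
@code{git bisect run} after the @code{git bisect start},
@code{git bisect bad} and @code{git bisect good} commands shown
earlier.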

@subheading Recommendation: use two terminal windows

@itemize
@item
One window is open to the @code{build/} directory, and alternates
between these commands:

@example
make
out/bin/lilypond test.ly
@end example

@item
One window is open to the top source directory, and alternates
between these commands:

@example
git bisect good
git bisect bad
@end example

@end itemize


@node Memory and coverage tests
@section Memory and coverage tests

In addition to the graphical output of the regression tests, it is
possible to test memory usage and to determine how much of the source
code has been exercised by the tests.

@subheading Memory usage

For tracking memory usage as part of this test, you will need
GUILE CVS and, in particular, the following patch:
@smallexample
@uref{http://www.lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}
@end smallexample

@subheading Code coverage

To check the coverage of the test suite, run the following:

@example
./scripts/auxiliar/build-coverage.sh
@emph{# uncovered files, least covered first}
./scripts/auxiliar/coverage.py --summary out-cov/*.cc
@emph{# consecutive uncovered lines, longest first}
./scripts/auxiliar/coverage.py --uncovered out-cov/*.cc
@end example


@node MusicXML tests
@section MusicXML tests


LilyPond comes with a complete set of regtests for the
@uref{http://www.musicxml.org/,MusicXML} language.  Originally
developed to test @samp{musicxml2ly}, these regression tests
can be used to test any MusicXML implementation.

The MusicXML regression tests are found at
@file{input/regression/musicxml/}.

The output resulting from running these tests
through @samp{musicxml2ly} followed by @samp{lilypond} is
available in the LilyPond documentation:

@example
@uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
@end example

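
As a sketch of the two-stage conversion mentioned above, the following
function prints the commands for one test file.  The function name and
the file name @file{example.xml} are hypothetical, and the sketch
assumes that @code{musicxml2ly} is called without @option{-o}, in which
case it is expected to write a @file{.ly} file named after the input
file in the current directory.

```shell
# musicxml_commands FILE.xml
# Print the two commands that turn one MusicXML test into LilyPond output.
# Assumption: without -o, musicxml2ly writes NAME.ly in the current directory.
musicxml_commands() {
    xml=$1
    base=$(basename "$xml" .xml)
    printf 'musicxml2ly %s\n' "$xml"
    printf 'lilypond %s.ly\n' "$base"
}

# example.xml is a placeholder, not the name of an actual test file.
musicxml_commands input/regression/musicxml/example.xml
```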