Documentation/contributor/regressions.itexi

   1 @c -*- coding: utf-8; mode: texinfo; -*-
   2 @node Regression tests
   3 @chapter Regression tests
   4
   5 @menu
   6 * Introduction to regression tests::
   7 * Precompiled regression tests::
   8 * Compiling regression tests::
   9 * Identifying code regressions::
  10 * Finding the cause of a regression::
  11 * Memory and coverage tests::
  12 * MusicXML tests::
  13 @end menu
  14
  15
  16 @node Introduction to regression tests
  17 @section Introduction to regression tests
  18
  19 LilyPond has a complete suite of regression tests that are used
  20 to ensure that changes to the code do not break existing behavior.
  21 These regression tests comprise small LilyPond snippets that test
  22 the functionality of each part of LilyPond.
  23
  24 Regression tests are added when new functionality is added to
  25 LilyPond.  They are also added when bugs are identified.  The
  26 snippet that causes the bug becomes a regression test to verify
  27 that the bug has been fixed.
  28
  29 The regression tests are compiled using special @code{make}
  30 targets.  There are three primary uses for the regression
  31 tests.  First, successful completion of the regression tests means
  32 that LilyPond has been properly built.  Second, the output of the
  33 regression tests can be manually checked to ensure that
  34 the graphical output matches the description of the intended
  35 output.  Third, the regression test output from two different
  36 versions of LilyPond can be automatically compared to identify
  37 any differences.  These differences should then be manually
  38 checked to ensure that the differences are intended.
  39
  40 Regression tests (@qq{regtests}) are available in precompiled form
  41 as part of the documentation.  Regtests can also be compiled
  42 on any machine that has a properly configured LilyPond build
  43 system.
  44
  45
  46 @node Precompiled regression tests
  47 @section Precompiled regression tests
  48
  49 @subheading Regression test output
  50
  51 As part of the release process, the regression tests are run
  52 for every LilyPond release.  Full regression test output is
  53 available for every stable version and the most recent development
  54 version.
  55
  56 Regression test output is available in HTML and PDF format.  Links
  57 to the regression test output are available at the developer's
  58 resources page for the version of interest.
  59
  60 The latest stable version of the regtests is found at:
  61
  62 @example
  63 @uref{http://lilypond.org/doc/stable/input/regression/collated-files.html}
  64 @end example
  65
  66 The latest development version of the regtests is found at:
  67
  68 @example
  69 @uref{http://lilypond.org/doc/latest/input/regression/collated-files.html}
  70 @end example
  71
  72
  73 @subheading Regression test comparison
  74
  75 Each time a new version is released, the regtests are
  76 compiled and the output is automatically compared with the
  77 output of the previous release.  The result of these
  78 comparisons is archived online:
  79
  80 @example
  81 @uref{http://lilypond.org/test/}
  82 @end example
  83
  84 Checking these pages is a very important task for the LilyPond project.
  85 You are invited to report anything that looks broken, or any case
  86 where the output quality is not on par with the previous release,
  87 as described in @rweb{Bug reports}.
  88
  89 @warning{ The special regression test
  90 @file{test-output-distance.ly} will always show up as a
  91 regression.  This test changes each time it is run, and serves to
  92 verify that the regression tests have, in fact, run.}
  93
  94
  95 @subheading What to look for
  96
  97 The test comparison shows all of the changes that occurred between
  98 the current release and the prior release.  Each test that has a
  99 significant difference in output is displayed, with the old
 100 version on the left and the new version on the right.
 101
 102 Regression tests whose output is the same for both versions are
 103 not shown in the test comparison.
 104
 105 @itemize
 106 @item
 107 Images: green blurs in the new version show the approximate
 108 location of elements in the old version.
 109
 110 There are often minor adjustments in spacing which do not indicate
 111 any problem.
 112
 113 @item
 114 Log files: show the difference in command-line output.
 115
 116 The main things to examine are changes in page counts -- if a
 117 file used to fit on 1 page but now requires 4 or 5 pages,
 118 something is suspicious!
 119
 120 @item
 121 Profile files: give information about
 122 TODO?  I don't know what they're for.
 123
 124 @end itemize
 125
 126 @warning{
 127 The automatic comparison of the regtests checks the LilyPond
 128 bounding boxes.  This means that Ghostscript changes and changes
 129 in lyrics or text are not found.
 130 }
 131
 132 @node Compiling regression tests
 133 @section Compiling regression tests
 134
 135 Developers may wish to see the output of the complete regression
 136 test suite for the current version of the source repository
 137 between releases.  First obtain the current source code; see
 138 @ref{Working with source code}.  Then build the LilyPond
 139 binary; see @ref{Compiling LilyPond}.
 140
 141 Uninstalling the previous LilyPond version is not necessary, nor is
 142 running @code{make install}, since the tests will automatically be
 143 compiled with the LilyPond binary you have just built in your source
 144 directory.
 145
 146 From this point, the regtests are compiled with:
 147
 148 @example
 149 make test
 150 @end example
 151
 152 If you have a multi-core machine you may want to use the @option{-j}
 153 option and the @var{CPU_COUNT} variable, as
 154 described in @ref{Saving time with CPU_COUNT}.
 155 For a quad-core processor the complete command would be:
 156
 157 @example
 158 make -j5 CPU_COUNT=5 test
 159 @end example
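
As a rough sketch, the job count can be derived from the number of
cores instead of being typed by hand.  The rule of one job more than
the number of cores is only inferred from the quad-core example
above, and the use of @code{getconf _NPROCESSORS_ONLN} is an
assumption about the host system; neither is part of the LilyPond
build system.  The variable names @code{cores} and @code{njobs} are
illustrative.

@example
# Assumption: getconf reports the number of online cores on this host.
cores=$(getconf _NPROCESSORS_ONLN)
# One job more than the core count, as in the quad-core example.
njobs=$((cores + 1))
# Print the resulting command line; run it from the build directory.
echo "make -j$njobs CPU_COUNT=$njobs test"
@end example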
 160
 161 The regtest output will then be available in
 162 @file{input/regression/out-test}.
 163 @file{input/regression/out-test/collated-examples.html}
 164 contains a listing of all the regression tests that were run,
 165 but none of the images are included.  Individual images are
 166 also available in this directory.
 167
 168 The primary use of @samp{make@tie{}test} is to verify that the
 169 regression tests all run without error.  The regression test
 170 page that is part of the documentation is created only when the
 171 documentation is built, as described in @ref{Generating documentation}.
 172 Note that building the documentation requires more installed components
 173 than building the source code, as described in
 174 @ref{Requirements for building documentation}.
 175
 176
 177 @node Identifying code regressions
 178 @section Identifying code regressions
 179
 180 Before modified code is committed to master, a regression test
 181 comparison must be completed to ensure that the changes have
 182 not caused problems with previously working code.  The comparison
 183 is made automatically upon compiling the regression test suite
 184 twice.
 185
 186 @enumerate
 187
 188 @item
 189 Before making changes, a baseline should be established by
 190 running:
 191
 192 @example
 193 make test-baseline
 194 @end example
 195
 196 @item
 197 Make your changes, or apply the patch(es) to consider.
 198
 199 @item
 200 Compile the source with @samp{make} as usual.
 201
 202 @item
 203 Check for unintentional changes to the regtests:
 204
 205 @example
 206 make check
 207 @end example
 208
 209 After this has finished, a regression test comparison will be
 210 available at:
 211
 212 @example
 213 out/test-results/index.html
 214 @end example
 215
 216 For each regression test that differs between the baseline and the
 217 changed code, a regression test entry will be displayed.  Ideally,
 218 the only changes would be the changes that you were working on.
 219 If regressions are introduced, they must be fixed before
 220 committing the code.
 221
 222 @warning{
 223 The special regression test @file{test-output-distance.ly} will always
 224 show up as a regression.  This test changes each time it is run, and
 225 serves to verify that the regression tests have, in fact, run.}
 226
 227 @item
 228 If you are happy with the results, then stop now.
 229
 230 If you want to continue programming, then make any additional code
 231 changes, and continue.
 232
 233 @item
 234 Compile the source with @samp{make} as usual.
 235
 236 @item
 237 To re-check files that differed between the initial
 238 @samp{make@tie{}test-baseline} and your post-changes
 239 @samp{make@tie{}check}, run:
 240
 241 @example
 242 make test-redo
 243 @end example
 244
 245 This updates the regression list at @file{out/test-results/index.html}.
 246 It does @emph{not} redo @file{test-output-distance.ly}.
 247
 248 @item
 249 When all regressions have been resolved, the output list will be empty.
 250
 251 @item
 252 Once all regressions have been resolved, a final check should be completed
 253 by running:
 254
 255 @example
 256 make test-clean
 257 make check
 258 @end example
 259
 260 This cleans the results of the previous @samp{make@tie{}check}, then does the
 261 automatic regression comparison again.
 262
 263 @end enumerate
 264
 265
 266 @node Finding the cause of a regression
 267 @section Finding the cause of a regression
 268
 269 @warning{This is not a @qq{simple} task; it requires a fair amount
 270 of technical knowledge.}
 271
 272 Git has special functionality to help track down the exact
 273 commit that causes a problem.  See the git manual page for
 274 @code{git bisect}.  This is a job that non-programmers can do,
 275 although it requires familiarity with git, ability to compile
 276 LilyPond, and generally a fair amount of technical knowledge.  A
 277 brief summary is given below, but you may need to consult other
 278 documentation for in-depth explanations.
 279
 280 Even if you are not familiar with git or are not able to compile
 281 LilyPond you can still help to narrow down the cause of a
 282 regression simply by downloading the binary releases of different
 283 LilyPond versions and testing them for the regression.  Knowing
 284 which version of LilyPond first exhibited the regression is
 285 helpful to a developer as it shortens the @code{git bisect}
 286 procedure.
 287
 288 Once a problematic commit is identified, the programmers' job is
 289 much easier.  In fact, for most regression bugs, the majority of
 290 the time is spent simply finding the problematic commit.
 291
 292 More information is in @ref{Regression tests}.
 293
 294 @subheading git bisect setup
 295
 296 We need to set up the bisect for each problem we want to
 297 investigate.
 298
 299 Suppose we have an input file which compiled in version 2.13.32,
 300 but fails in version 2.13.38 and above.
 301
 302 @enumerate
 303 @item
 304 Begin the process:
 305
 306 @example
 307 git bisect start
 308 @end example
 309
 310 @item
 311 Give it the earliest known bad tag:
 312
 313 @example
 314 git bisect bad release/2.13.38-1
 315 @end example
 316
 317 (You can list the available tags with @code{git tag}.)
 318
 319 @item
 320 Give it the latest known good tag:
 321
 322 @example
 323 git bisect good release/2.13.32-1
 324 @end example
 325
 326 You should now see something like:
 327 @example
 328 Bisecting: 195 revisions left to test after this (roughly 8 steps)
 329 [b17e2f3d7a5853a30f7d5a3cdc6b5079e77a3d2a] Web: Announcement
 330 update for the new @qq{LilyPond Report}.
 331 @end example
 332
 333 @end enumerate
 334
 335 @subheading git bisect actual
 336
 337 @enumerate
 338
 339 @item
 340 Compile the source:
 341
 342 @example
 343 make
 344 @end example
 345
 346 @item
 347 Test your input file:
 348
 349 @example
 350 out/bin/lilypond test.ly
 351 @end example
 352
 353 @item
 354 Test results?
 355
 356 @itemize
 357 @item
 358 Does it crash, or is the output bad?  If so:
 359
 360 @example
 361 git bisect bad
 362 @end example
 363
 364 @item
 365 Does your input file produce good output?  If so:
 366
 367 @example
 368 git bisect good
 369 @end example
 370
 371 @end itemize
 372
 373 @item
 374 Once the exact problem commit has been identified, git will inform
 375 you with a message like:
 376
 377 @example
 378 6d28aebbaaab1be9961a00bf15a1ef93acb91e30 is the first bad commit
 379 %%% ... blah blah blah ...
 380 @end example
 381
 382 If there is still a range of commits, then git will automatically
 383 select a new version for you to test.  Return to step 1.
 384
 385 @end enumerate
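
The build-and-test loop above can also be automated with
@code{git bisect run}, which calls a script at every step and reads
its exit status: 0 marks the commit as good, 125 skips it, and any
other status from 1 to 127 marks it as bad.  The following is only a
sketch under some assumptions: the script name
@file{bisect-test.sh} is hypothetical, the build is done in the
source tree, and a non-zero exit status from @code{lilypond} is
taken as the sign of the regression.  It cannot detect output that
compiles but looks wrong; that still needs the manual procedure.

@example
# Write a helper script (hypothetical name) for git bisect run.
cat > bisect-test.sh <<'EOF'
#!/bin/sh
# Skip commits that cannot be built (status 125).
make || exit 125
# Report any failure, including a crash, as bad (status 1).
out/bin/lilypond test.ly || exit 1
EOF
chmod +x bisect-test.sh
@end example

After @code{git bisect start} and the initial @code{git bisect bad}
and @code{git bisect good} commands, run
@code{git bisect run ./bisect-test.sh}.  When it has finished,
@code{git bisect reset} returns to the original branch.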
 386
 387 @subheading Recommendation: use two terminal windows
 388
 389 @itemize
 390 @item
 391 One window is open to the @code{build/} directory, and alternates
 392 between these commands:
 393
 394 @example
 395 make
 396 out/bin/lilypond test.ly
 397 @end example
 398
 399 @item
 400 One window is open to the top source directory, and alternates
 401 between these commands:
 402
 403 @example
 404 git bisect good
 405 git bisect bad
 406 @end example
 407
 408 @end itemize
 409
 410
 411 @node Memory and coverage tests
 412 @section Memory and coverage tests
 413
 414 In addition to the graphical output of the regression tests, it is
 415 possible to test memory usage and to determine how much of the source
 416 code has been exercised by the tests.
 417
 418 @subheading Memory usage
 419
 420 For tracking memory usage as part of this test, you will need
 421 GUILE CVS; especially the following patch:
 422 @uref{http://www.lilypond.org/vc/old/gub.darcs/patches/guile-1.9-gcstats.patch}.
 423
 424 @subheading Code coverage
 425
 426 To check the coverage of the test suite, run the following:
 427
 428 @example
 429 ./scripts/auxiliar/build-coverage.sh
 430 @emph{# uncovered files, least covered first}
 431 ./scripts/auxiliar/coverage.py  --summary out-cov/*.cc
 432 @emph{# consecutive uncovered lines, longest first}
 433 ./scripts/auxiliar/coverage.py  --uncovered out-cov/*.cc
 434 @end example
 435
 436
 437 @node MusicXML tests
 438 @section MusicXML tests
 439
 440
 441 LilyPond comes with a complete set of regtests for the
 442 @uref{http://www.musicxml.org/,MusicXML} language.  Originally
 443 developed to test @samp{musicxml2ly}, these regression tests
 444 can be used to test any MusicXML implementation.
 445
 446 The MusicXML regression tests are found at
 447 @file{input/regression/musicxml/}.
 448
 449 The output resulting from running these tests
 450 through @samp{musicxml2ly} followed by @samp{lilypond} is
 451 available in the LilyPond documentation:
 452
 453 @example
 454 @uref{http://lilypond.org/doc/latest/input/regression/musicxml/collated-files}
 455 @end example
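
To reproduce this for a single test file locally, the two commands
can be chained.  This is a minimal sketch: the helper name
@code{xml_regtest} and the output name @file{converted.ly} are
hypothetical, and both programs are assumed to be on the
@code{PATH}.

@example
# Hypothetical helper: convert one MusicXML file, then compile it.
xml_regtest() (
  musicxml2ly --output=converted.ly "$1" && lilypond converted.ly
)
# Usage, with FILE.xml standing for one file of the test suite:
#   xml_regtest input/regression/musicxml/FILE.xml
@end example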
 456