Documentation/tex/lilypond-overview.doc

   1 %-*-LaTeX-*-
   2
   3 \documentclass{article}
   4 \usepackage{a4}
   5 \def\postMudelaExample{\setlength{\parindent}{1em}}
   6 \title{LilyPond, a Music Typesetter}
   7 \author{HWN}
   8 \usepackage{musicnotes}
   9 \usepackage{graphics}
  10
  11
  12 \begin{document}
  13 \maketitle
  14
  15 % -*-LaTeX-*-
  16 \section{Introduction}
  17
  18 The Internet has become a popular medium for collaborative work on
  19 information.  Its success is partly due to its use of simple, text-based
  20 formats.  Examples of these formats are HTML and \LaTeX.  Anyone can
  21 produce or modify such files using nothing but a text editor; they are
  22 easily processed with run-of-the-mill text tools, and they can be
  23 integrated into other text-based formats.
  24
  25 Software for processing this information and presenting these formats
  26 in an elegant form is available freely (Netscape, \LaTeX, etc.).
  27 The ubiquity of the software and the simplicity of the formats have
  28 revolutionised the way people publish text-based information
  29 nowadays.
  30
  31 In the field of performed music, where the presentation takes the form
  32 of sheet music, such a revolution has not started yet.  Let us review
  33 some alternatives that have been available for transmitting sheet
  34 music until now:
  35 \begin{itemize}
  36 \item MIDI\cite{midi}.  This format was designed for interchanging performances
  37   of music; one should think of it as a glorified tape recorder
  38   format.  It needs dedicated editors, since it is binary.  It does
  39   not provide enough information for producing musical scores: some of
  40   the abstract musical content of what is performed is thrown away.
  41
  42 \item PostScript\cite{Postscript}. This format is a printer control
  43   language.  Printed musical scores can be transmitted in PostScript,
  44   but once a score is converted to PostScript, it is virtually
  45   impossible to modify the score in a meaningful way.
  46
  47 \item Formats for various notation programs.  Notation programs either
  48   work with binary formats (e.g., NIFF\cite{niff-web}), need specific
  49   platforms (e.g., Sibelius\cite{sibelius}, Score\cite{score}), are
  50   proprietary or non-portable tools themselves (idem), produce
  51   inadequate output (e.g., MUP\cite{mup}), are based on graphical
  52   content (e.g., MusixTeX\cite{musixtex1}), or limit themselves to
  53   specific subdomains (e.g., ABC\cite{abc2mtex}).
  54
  55 \item SMDL\cite{smdl-web}.  This is a very rich ASCII format that is
  56   designed for storing many types of music.  Unfortunately, no
  57   program is available that prints music from SMDL.
  58   Moreover, SMDL is so verbose that it is not suitable for human
  59   production.
  60
  61 \item TAB\cite{tablature-web}.  Tab (short for tablature) is a popular
  62   format for interchanging music over e-mail, but it can only be used
  63   for guitar music.
  64 \end{itemize}
  65
  66 In summary, sheet music is not published and edited on a wide scale
  67 across the Internet because no format for music
  68 interchange exists that is:
  69 \begin{itemize}
  70 \item open, i.e., with publicly available specifications.
  71 \item based on ASCII, and therefore suitable for human consumption and
  72   production.
  73 \item rich enough for producing publication quality sheet music from
  74   it.
  75 \item based on musical content (unlike, for example, PostScript), and
  76   therefore suitable for making modifications.
  77 \item accompanied by tools for processing it that are freely available
  78   across multiple platforms.
  79 \end{itemize}
  80
  81
  82 With the creation of LilyPond, we have tried to create both a
  83 convenient format for storing sheet music, and a portable,
  84 high-quality implementation of a compiler that compiles the input
  85 into a printable score.  You can find a small example of LilyPond
  86 input along with output in Figure~\ref{fig:intro-fig}.
  87 %
  88 \begin{figure}[htbp]
  89   \begin{center}
  90 \begin{mudela}[verbatim]
  91       \score {
  92         \notes
  93           \type GrandStaff <
  94              \transpose c'' { c4 c4 g4 g4 a4 a4 g2 }
  95              { \clef "bass"; c4 c'4
  96                \type Staff <e'2 {\stemdown c'4 c'4}> f'4 c'4 e'4 c'4 }
  97            >
  98            \paper {
  99              linewidth = -1.0\cm ;
 100            }
 101         }
 102 \end{mudela}
 103     \caption{A small example of LilyPond input}
 104     \label{fig:intro-fig}
 105   \end{center}
 106 \end{figure}
 107 %
 108
 109
 110 The input language encodes musical events (such as notes and rests) on
 111 the basis of their time-ordering.  For example, the grammar includes
 112 constructs that specify that notes start simultaneously and that notes
 113 are to be played in sequence.  In this encoding some context that is
 114 present in sheet music is lost.
 115
 116 The compiler reconstructs the notation from the encoded music.  Its
 117 operation comprises four different steps (see
 118 Figure~\ref{fig:intro-steps}).
 119
 120 \begin{description}
 121 \item[Parsing] During parsing, the input is converted into a syntax tree.
 122
 123 \item[Interpreting] In the \emph{interpreting} step, it is determined
 124   which symbols have to be printed.  Objects that correspond to
 125   notation (\emph{Graphical objects}) are created from the syntax tree
 126   in this phase. Generally speaking, for every symbol printed there is
 127   one graphical object.  These objects are incomplete: their position
 128   and their final shape are unknown.
 129
 130   The context that was lost by encoding the input in a language is
 131   reconstructed during this conversion.
 132 \item[Formatting] The next step is determining where symbols are to be
 133   placed; this is called \emph{formatting}.
 134 \item[Outputting]
 135   Finally, all Graphical objects are output as PostScript or \TeX\ code.
 136 \end{description}
 137
 138 \def\staffsym{\vbox to 16pt{
 139     \hbox{\vrule width 1cm depth .2pt height .2pt}\nointerlineskip
 140     \vfil
 141     \hbox{\vrule width 1cm depth .2pt height .2pt}\nointerlineskip
 142     \vfil
 143     \hbox{\vrule width 1cm depth .2pt height .2pt}\nointerlineskip
 144     \vfil
 145     \hbox{\vrule width 1cm depth .2pt height .2pt}\nointerlineskip
 146     \vfil
 147     \hbox{\vrule width 1cm depth .2pt height .2pt}\nointerlineskip
 148 }}
 149
 150 \def\vspacer{\vbox to 20pt{\vss}}
 151 \begin{figure}[h]
 152 \def\spacedhbox#1{\hbox{\ #1\ }}
 153 \begin{eqnarray*}
 154   {\spacedhbox{Input}\atop \hbox{\texttt{\{c8 c8\}}}} {\spacedhbox{Parsing}\atop\longrightarrow}
 155   {\spacedhbox{Syntax tree}\atop\spacedhbox{\textsf{Sequential(Note,Note)}}}
 156   {\spacedhbox{Interpreting}\atop\longrightarrow}\\
 157   \vspacer\\
 158   {\spacedhbox{Graphic objects}\atop\spacedhbox{\texttrebleclef \textquarterhead\texteighthflag\textquarterhead\texteighthflag \staffsym }}
 159   {\spacedhbox{Formatting}\atop\longrightarrow}
 160   {\spacedhbox{Formatted objects}\atop\hbox{
 161     \mudela{c''8 c''8}
 162     }}\\
 163 \vspacer\\
 164   {\spacedhbox{Outputting}\atop\longrightarrow}
 165   {\spacedhbox{PostScript code}\atop\hbox{\texttt{\%!PS-Adobe}\ldots}}
 166 \end{eqnarray*}
 167   \caption{Parsing, Interpreting, Formatting and Outputting}
 168     \label{fig:intro-steps}
 169 \end{figure}
 170
 171
 172 The second step, the interpretation phase of the compiler, can be
 173 manipulated as a separate entity: the interpretation process is
 174 composed of many separate modules, and the behaviour of the modules is
 175 parameterised.  By recombining these interpretation modules,
 176 and changing parameter settings, the same piece of music can be
 177 printed differently, as is shown in Figure~\ref{fig:intro-interpret}.
 178
 179 This makes it easy to extend the program. Moreover, this enables the
 180 same music to be printed in different versions, e.g., in a conductor's
 181 score and in extracted parts.
 182
 183
 184 \begin{figure}[h]
 185   \begin{center}
 186     \begin{mudela}
 187       \score {
 188         \notes
 189           \type GrandStaff <
 190              \transpose c'' { c4 c4 g4 g4 a4 a4 g2 }
 191              { \clef "bass"; c4 c'4
 192                \type Staff <e'2 {\stemdown c'4 c'4}> f'4 c'4 e'4 c'4 }
 193            >
 194            \paper {
 195              linewidth = -1.0\cm ;
 196              \translator {
 197                 \VoiceContext
 198                 \remove "Stem_engraver";
 199              }
 200            \translator {
 201              \StaffContext
 202                numberOfStaffLines = 3;
 203            }
 204           }
 205         }
 206     \end{mudela}
 207     \caption{The interpretation phase can be manipulated: the same
 208       music as in Figure~\ref{fig:intro-fig} is interpreted
 209       differently: three staff lines and no stems.}
 210     \label{fig:intro-interpret}
 211   \end{center}
 212 \end{figure}
 213
 214
 215
 216 \section{Preliminaries}
 217
 218 To understand the rest of the article, it is necessary to know
 219 something about music notation and music typography.  Since both
 220 communicate music, we will explain some characteristics of instruments
 221 and western music that motivate some notational constructs.
 222
 223 \subsection{Music}
 224
 225 Music notation is meant to be read by human performers.  They sing or
 226 play instruments that can produce sounds of different pitches.  These
 227 sounds are called \emph{notes}. Additionally, the sounds can be
 228 articulated in different ways, e.g., staccato (short and separated)
 229 or legato (fluently bound together).  The loudness of the notes can
 230 also be varied.  Changes in loudness are called \emph{dynamics}.
 231
 232 Silence is also an element of music.  The musical terminology for
 233 silence within music is \emph{rest}.
 234
 235 The basic unit of pitch is the \emph{octave}.  The octave corresponds
 236 to a frequency ratio of 1:2. For example, the pitch denoted by a'
 237 (frequency: 440 hertz) is one octave lower than a'' (frequency: 880
 238 hertz).  Most instruments have a limited \emph{pitch range}; for
 239 example, a trumpet has a range of about 2.5 octaves.  Not all
 240 instruments have ranges in the same register: a tuba also has a range
 241 of 2.5 octaves, but the range of the tuba is much lower.
 242
 243 Musicology has a confusing mix of relative and absolute measures for
 244 pitches: the term `octave' refers to both a difference between two
 245 pitches (the frequency ratio of 1:2), and to a range of pitches.  For
 246 example, the term `one-line octave' refers to the pitch range
 247 between 261.6 Hz and 523.3 Hz.
 248
 249
 250 The octave is divided into smaller pitch steps.  In modern western
 251 music, every octave is divided into twelve approximately equidistant
 252 pitch steps, and each step is called a \emph{semitone}.  Usually, the
 253 pitches in a musical piece come from a smaller subset of these twelve
 254 possible pitches.  This smaller subset along with the musical
 255 functions of the pitches is called the
 256 \emph{tonality}\footnote{Tonality also refers to the relations between
 257   and functions of certain pitches.  Since these do not have any
 258   impact on notation, we ignore this.} of the piece.
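
As a worked illustration (assuming exact equal temperament and the
reference pitch of 440 Hz for a' given above; the symbols $f$ and $n$
are introduced here only for this sketch), the frequency of the pitch
that lies $n$ semitones above a' is
\[
  f(n) = 440 \cdot 2^{n/12}\ \mathrm{Hz}.
\]
For instance, c' lies nine semitones below a', so
$f(-9) = 440 \cdot 2^{-9/12} \approx 261.6$ Hz, and c'' lies three
semitones above a', so $f(3) = 440 \cdot 2^{3/12} \approx 523.3$ Hz.
These are the bounds of the octave range quoted earlier, and their
ratio is exactly 2.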
 259
 260
 261 The standard tonality that forms the basis of music notation
 262 (the key of C major) is a set of seven pitches within every octave.
 263 Each of these seven is denoted by a name. In English, these names
 264 are (in rising pitch) c, d, e, f, g, a and b.  Pitches that
 265 are a semitone higher or lower than one of these seven can be
 266 represented by suffixing the name with `sharp' or `flat'
 267 respectively (this is called a \emph{chromatic alteration}).
 268
 269 A pitch therefore can be fully specified by a combination of the
 270 octave number, the note name and a chromatic alteration.
 271 Figure~\ref{fig:intro-pitches} shows the relation between names and
 272 frequencies.
 273
 274
 275
 276
 277 \begin{figure}[h]
 278   \begin{center}
 279     [to do]
 280   \end{center}
 281   \caption{Pitches in western music.  The octave number is denoted
 282     by a superscript.}
 283   \label{fig:intro-pitches}
 284 \end{figure}
 285
 286
 287 Many instruments can produce more than one note at the same time, e.g.,
 288 pianos and guitars.  When several notes are played simultaneously, they
 289 form a so-called \emph{chord}.
 290
 291 The unit of duration is the \emph{beat}. When playing, the tempo is
 292 determined by setting the number of beats per minute.  In western
 293 music, beats are often stressed in a regular pattern: for example,
 294 waltzes have a stress pattern that is strong-weak-weak, i.e., every
 295 note that starts on a `strong' beat is louder and has more pronounced
 296 articulation.  This stress pattern is called \emph{meter}.
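
As a small worked example (the symbols $T$ and $d$ and the tempo value
below are assumptions chosen purely for illustration), a tempo of $T$
beats per minute gives each beat a duration of
\[
  d = \frac{60}{T}\ \mathrm{seconds}.
\]
At $T = 180$, each beat lasts $60/180 = 1/3$ of a second, so one
strong-weak-weak group of three beats lasts exactly one second.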
 297
 298 \subsection{Music notation}
 299
 300 Music notation is a system that tries to represent musical ideas
 301 through printed symbols.  Music notation has no precise definition,
 302 but most conventions have been described in reference manuals on music
 303 notation\cite{read-notation}.
 304
 305 In music notation, sounds and silences are represented by symbols that
 306 are called note and rest respectively.\footnote{These names serve a
 307   double purpose: the same terms are used to denote the musical
 308   concepts.}  The shape of notes and rests indicates their duration
 309 (see Figure~\ref{fig:noteshapes}) relative to the whole note.
 310
 311 \begin{figure}[h]
 312   \begin{center}
 313 \begin{mudela}
 314   \score {
 315     \notes \transpose c''{ c\longa*1/4 c\breve*1/2 c1 c2 c4 c8 c16 c32 c64 }
 316     \paper {
 317      \translator {
 318        \StaffContext
 319        \remove "Staff_symbol_engraver";
 320         \remove "Time_signature_engraver";
 321         \remove "Bar_engraver";
 322         \remove "Clef_engraver";
 323  }
 324 linewidth = -1.;
 325     }
 326 }
 327 \end{mudela}
 328 \begin{mudela}
 329   \score {
 330     \notes \transpose c''{ r\longa*1/4 r\breve*1/2 r1 r2 r4 r8 r16 r32 r64 }
 331     \paper {
 332       \translator {
 333         \StaffContext
 334         \remove "Staff_symbol_engraver";
 335         \remove "Time_signature_engraver";
 336         \remove "Bar_engraver";
 337         \remove "Clef_engraver";
 338         }
 339       linewidth = -1.;
 340     }
 341 }
 342 \end{mudela}
 343     \caption{Note and rest shapes encode the length.  At the top notes
 344       are shown, at the bottom rests.  From left to right a quadruple
 345       note (\emph{longa}), double (\emph{breve}), whole, half,
 346       quarter, eighth, sixteenth, thirty-second and sixty-fourth. Each
 347       note has half of the duration of its predecessor.}
 348     \label{fig:noteshapes}
 349 \end{center}
 350 \end{figure}
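
The halving rule in Figure~\ref{fig:noteshapes} can be written as a
formula (the index $k$ and the symbol $D$ are introduced here only for
illustration).  Numbering the shapes from left to right with
$k = 0, 1, \ldots, 8$, the duration of shape $k$, measured in whole
notes, is
\[
  D(k) = 4 \cdot 2^{-k}.
\]
This gives $D(0) = 4$ for the longa, $D(2) = 1$ for the whole note,
$D(4) = 1/4$ for the quarter note and $D(8) = 1/64$ for the
sixty-fourth note.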
 351
 352 Notes are printed in a grid of horizontal lines called a \emph{staff} to
 353 denote their pitch: each line represents a pitch from the
 354 standard scale (c, d, e, f, g, a, b).  The reference point is the
 355 \emph{clef}, e.g., the treble clef marks the location of the $g^1$
 356 pitch.  The notes are printed in their time order, from left to right.
 357
 358
 359 \begin{figure}[h]
 360   \begin{center}
 361     \begin{mudela}
 362       \score { \notes {
 363       a4 b c d e f g a \clef bass;
 364       a4 b c d e f g a \clef alto;
 365       a4 b c d e f g a \clef treble;
 366       }
 367       \paper { linewidth = 15.\cm; }
 368       }
 369     \end{mudela}
 370     \caption{Pitches ranging from $a, b, c',\ldots a'$, in different
 371       clefs.  From left to right the bass, alto and treble clef are
 372       featured.}
 373     \label{fig:pitches}
 374   \end{center}
 375 \end{figure}
 376
 377 The chromatic alterations are indicated by printing a flat sign or a
 378 sharp sign in front of the note head.  If these chromatic alterations
 379 occur systematically (if they are part of the tonality of the piece),
 380 then this is indicated with a \emph{key signature}.  This is a list of
 381 sharp/flat signs which is printed next to the clef.
 382
 383 Articulation is notated by marking the note shapes: wedges, hats and
 384 dots all indicate specific articulations.  If the notes are to be
 385 bound fluently (legato), the note shapes are encompassed by a smooth
 386 curve called a \emph{slur}.
 387
 388 \begin{figure}[h]
 389   \begin{center}
 390     \begin{mudela}
 391       c'4-> c'4-. g'4 ( b'4  ) g''4
 392     \end{mudela}
 393     \caption{Some articulations.  From left to right: extra stress
 394       (\emph{marcato}), short (staccato), slurred notes (legato).}
 395     \label{fig:articulation}
 396   \end{center}
 397 \end{figure}
 398
 399
 400
 401 Dynamics are notated in two ways: absolute dynamics are indicated by
 402 letters: \textbf{f} (from Italian ``forte'') stands for loud,
 403 \textbf{p} (from Italian ``piano'') means soft.  Gradual changes in
 404 loudness are notated by (de)crescendos.  These are hairpin-like shapes
 405 below the staff.
 406
 407 \begin{figure}[h]
 408   \begin{center}
 409     \begin{mudela}
 410       g'4\pp \<  g'4 \! g'4 \ff \> g'4 g' \! g'\ppp
 411     \end{mudela}
 412     \caption{Dynamics: start very soft (pp), grow to loud (ff) and
 413       decrease to extremely soft (ppp)}
 414     \label{fig:dynamics}
 415   \end{center}
 416 \end{figure}
 417
 418
 419 The meter is indicated by bar lines: every start of the stress pattern
 420 is preceded by a vertical line, the \emph{bar line}.  The space
 421 between two bar lines is called a \emph{measure}.  It is therefore the unit of
 422 the rhythmic pattern.
 423
 424 The time signature also indicates what kind of rhythmic pattern is
 425 desired.  The time signature takes the form of two numbers stacked
 426 vertically. The top number is the number of beats in one measure, the
 427 bottom number is the duration (relative to the whole note) of the note
 428 that takes one beat.  For example, a 2/4 time signature means ``two beats
 429 per measure, and a quarter note takes one beat''.
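
Written out as a formula (the symbols $n$ and $m$ are introduced here
for illustration only), a time signature with top number $n$ and
bottom number $m$ gives every measure a total duration of
\[
  n \cdot \frac{1}{m}
\]
whole notes.  A 2/4 measure therefore lasts $2 \cdot 1/4 = 1/2$ of a
whole note, and a 3/4 measure lasts $3 \cdot 1/4 = 3/4$ of a whole
note.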
 430
 431 Chords are written by attaching multiple note heads to one stem.  When
 432 the composer wants to emphasize the horizontal relationships between
 433 notes, the simultaneous notes can be written as voices (where every
 434 note head has its own stem).  A small example is given in
 435 Figure~\ref{fig:simultaneous}.
 436
 437 \begin{figure}[h]
 438   \begin{center}
 439     \begin{mudela}
 440       \relative c'' {\time 2/4;  <c4 e> <d f>
 441                 \type Staff < \type Voice = VA{
 442                   \stemdown
 443                   c4 d
 444                   b16 b b b b b b b }
 445                 \type Voice = VB {
 446                   \stemup e4 f g8 g4 g8 } >
 447           }
 448     \end{mudela}
 449     \caption{Notes sounding together.  Chord notation (left, before
 450       the bar line) emphasizes vertical relations, voice notation
 451       emphasizes horizontal relations. Separate voices needn't have
 452       synchronous rhythms (third measure).
 453       }
 454     \label{fig:simultaneous}
 455   \end{center}
 456 \end{figure}
 457
 458 Separate voices do not have to share one rhythmic pattern (this is
 459 also demonstrated in Figure~\ref{fig:simultaneous}); they are in a sense % vague
 460 independent.  A different way to express this in notation is by
 461 printing each voice on a different staff.  This is customary when
 462 writing for piano (both left and right hand have a staff of their own)
 463 and for ensemble (every instrument has a staff of its own).
 464
 465
 466
 467 \subsection{Music typography}
 468
 469 Music typography is the art of placing symbols in an esthetically
 470 pleasing configuration.  Little is explicitly known about music
 471 typography.  There are only a few reference works
 472 available\cite{ross,wanske}.  Most of the knowledge of this art has
 473 been transmitted orally, and was subsequently lost.
 474
 475 The motivation behind choices in typography is to represent the idea
 476 as clearly as possible. Among other things, this results in the following
 477 guidelines:
 478 \begin{itemize}
 479 \item The printed score should use visual hints to accentuate the
 480   musical content.
 481 \item The printed score should not contain distracting elements, such
 482   as large empty regions or blotted regions.
 483 \end{itemize}
 484
 485 An example of the first guideline in action is the horizontal spacing.
 486 The amount of space following a note should reflect the duration of
 487 that note: short notes get a small amount of space, long notes larger
 488 amounts.  Such spacing constraints can be quite subtle, for the
 489 ``amount of space'' is only the impression that should be conveyed; there
 490 has to be some correction for optical illusions.  See
 491 Figure~\ref{fig:spacing}.
 492
 493 \begin{figure}[h]
 494   \begin{center}
 495     \begin{mudela}
 496       \relative c'' { \time 3/4; c16 c c c c8 c8 | f4 f, f'  }
 497     \end{mudela}
 498     \caption{Spacing conveys information about duration. Sixteenth
 499       notes at the left get less space than quarter notes in the
 500       middle. Spacing is ``visual'': there should be more space
 501       after the first note of the last measure, and less after the second.}
 502     \label{fig:spacing}
 503   \end{center}
 504 \end{figure}
 505
 506 Another clear example of music typography is the handling of
 507 collisions.  When chords or separate voices are printed, the notes
 508 that start at the same time should be printed aligned (i.e., with the
 509 same $x$ position).  If the pitches are close to each other, the note
 510 heads would collide. To prevent this, some notes (or note heads) have
 511 to be shifted horizontally.  An example of this is given in Figure~\ref{fig:collision}.
 512 \begin{figure}[h]
 513   \begin{center}
 514     \begin{mudela}
 515       c4
 516     \end{mudela}
 517     \caption{Collisions}
 518     \label{fig:collision}
 519   \end{center}
 520 \end{figure}
 521
 522 \bibliographystyle{hw-plain}
 523 \bibliography{engraving,boeken,colorado,computer-notation,other-packages}
 524
 525
 526
 527 \end{document}
 528
 529 The complexity of  music notation was tackled by adopting a modular
 530 design: both the formatting system (which encodes the esthetic rules of
 531 notation), and the interpretation system (which encodes the semantic
 532 rules) are highly modular.
 533
 534
 535 The difficulty in creating a format for music notation is rooted in
 536 the fact that music is multidimensional: each sound has its own
 537 duration, pitch, loudness and articulation. Additionally, multiple
 538 sounds may be played simultaneously.  Because of this, there is no
 539 obvious way to ``flatten'' music into a context-free language.
 540
 541 The difficulty in creating a printing engine is rooted in the fact
 542 that music notation is complicated: it is a very large graphical
 543 ``language'' with many arbitrary esthetic and semantic conventions.
 544 Building a system that formats full-fledged musical notation is a
 545 challenge in itself, regardless of whether it is part of a compiler or
 546 an editor.
 547
 548 The fact that music and its notation are of a different nature
 549 implies that the conversion between input and notation is non-trivial.
 550
 551 In LilyPond we solved the above problem in the following way:
 552