-## This will eventually do to wiki.debian.org/RegressionTestFramework
+## This will eventually go to wiki.debian.org/DebTestFramework
* '''Created''': <<Date(2010-10-07)>>
- * '''Contributors''': MichaelHanke
+ * '''Contributors''': MichaelHanke, YaroslavHalchenko
* '''Packages affected''':
* '''See also''':
== Summary ==
-This specification describes conventions and tools that allow Debian to
-distribute and run regression test batteries developed by upstream or
-Debian developers in a uniform fashion.
+This specification describes DebTest -- a framework with conventions and tools
+that allow Debian to distribute test batteries developed by upstream or Debian
+developers. DebTest aims to enable developers and users to perform extensive
+testing of a deployed Debian system, or of particular software of interest, in a
+uniform fashion.
== Rationale ==
Ideally software packaged for Debian comes with an exhaustive test suite that
-can be used to determine whether this software works as expected on the Debian
-platform. However, especially for complex software, these test suites are often
-resource hungry (CPU time, memory, diskspace, network bandwidth) and cannot be
-ran at package build time by buildds. Consequently, test suites are only
-utilized by the packager on a particular machine, before uploading a new version
-to the archive.
-
-However, Debian is an integrated system and packaged software is typically made
-to rely on functionality provided by other Debian packages (e.g. shared
-libraries) instead of shipping duplicates with different versions in every
-package -- for many good reasons. Unfortunately, there is also a downside to
-this: Debian packages often use 3rd-party tools with different versions than
-those tested by upstream, and moreover, the actual versions might change
-frequently between to subsequent uploads of a package. Currently a change in a
-dependency that introduces an incompatibility cannot be detected reliably
-(before users have filed a bug report) -- even if upstream provides a testsuite
-that would have caught the breakage. Although there are archive-wide QA efforts
-(e.g. constantly rebuilding all packages) these tests can only detect API/ABI
-breakage or functionality tested during build-time checks -- they are not
-exhaustive for the aforementioned reasons.
+can be used to determine whether this particular software works as expected on
+the Debian platform. However, especially for complex software, these test
+suites are often resource hungry (CPU time, memory, disk space, network
+bandwidth) and cannot be run at package build time by buildds. Consequently,
+test suites are typically utilized manually and only by the respective packager
+on a particular machine, before uploading a new version to the archive.
+
+However, Debian is an integrated system and packaged software typically relies
+on functionality provided by other Debian packages (e.g. shared libraries)
+instead of shipping duplicates with different versions in every package -- for
+many good reasons. Unfortunately, there is also a downside to this: Debian
+packages often use versions of 3rd-party tools that are different from those
+tested by upstream, and moreover, the actual versions of dependencies might
+change frequently between subsequent uploads of a dependent package. Currently
+a change in a dependency that introduces an incompatibility cannot be detected
+reliably even if upstream provides a test suite that would have caught the
+breakage. Therefore integration testing heavily relies on users to detect
+incorrect functioning and file bug reports. Although there are archive-wide QA
+efforts (e.g. constantly rebuilding all packages) these tests can only detect
+API/ABI breakage or functionality tested during build-time checks -- they are
+not exhaustive for the aforementioned reasons.
This is a proposal to, first of all, package upstream test suites in a way that
they can be used to run expensive archive-wide QA tests. However, this is also
-a proposal to establish means to test interactions of software from multiple
-Debian packages and test proper, continued, integration into the Debian system.
+a proposal to establish means to test interactions between software from
+multiple Debian packages to provide more thorough continued integration and
+regression testing for Debian systems.
== Use Cases ==
- * Bob is the maintainer for the boot process for Debian. In the etch cycle,
- he would like to work on getting the boot time down to two seconds from
- boot manager to GDM screen. He creates an entry for the specification in the
- wiki, discusses it at debconf, and starts writing out a braindump of it.
-
- * Arnaud is a student participating in Google's Summer of code and wants to
- make sure that his project fits the short description that has been given
- on the ideas page. He writes a detailed spec in the wiki. His mentor can
- then confirm that he's on good track. The specification is published on a
- mailing list and people's comments help improve it even further.
+ * Moritz is a member of the security team. Whenever he applies a patch to fix
+ a security issue he wants to make sure that the generic behavior of the software
+ remains unchanged. However, in general he only has access to test cases that
+ are included in the source package (if any). In the absence of proper tests
+ he can only either assume that it would work (bad by design), or rely on the
+ respective package maintainer to run the appropriate tests (introduces
+ delays). A packaged exhaustive regression test suite would allow Moritz to
+ perform comprehensive testing on his own and release the fixed package as
+ soon as the tests pass.
+
+ * Michael is a Debian package maintainer who takes care of three
+ packages each providing a data format conversion utility. While
+ all three tools have their merits there is also lots of
+ overlap. For example, given a particular data file they should all
+ generate identical output. With a DebTest framework, Michael can
+ write and package cross-package test suites to ensure that this
+ promise is fulfilled at any time. Moreover, Michael can also
+ develop/package "pipeline" tests that ensure proper functioning of
+ multi-stage/package processing pipelines (from raw data format
+ conversion to visualization), where some stages could be
+ (re)processed using alternative tools from different software
+ packages promising to provide the same functionality. By testing
+ a whole processing stream while changing the alternative
+ implementations, breakage of the compatibility compliance could be
+ detected.
+
+ * Yarik is a Debian maintainer of a package where upstream provides
+ a complete analysis pipeline which was used for an article
+ publication. Such analysis requires a relatively large array of
+ data and a range of tools from other packages to produce a
+ publication-ready summary of the results. Therefore such analysis
+ cannot be carried out at package build time. Upstream aims to
+ assure the reproducibility of the published results and encourages
+ Yarik to promise correct functioning of the research product on
+ Debian systems. Within the DebTest framework, Yarik can package
+ the upstream analysis pipeline along with the target results to assure
+ reproducibility of the scientific findings.
+
+ * Albert is a scientist using Debian for his research activities. The
+ developers of his favorite software tell him to rather use the GreenPants
+ distribution, because they cannot guarantee that their software works
+ properly on Debian. They reason that Debian has a different
+ version of a numerical library that hasn't been "tested" by the authors.
+ With packaged regression test suites Albert can install and run, at any given point,
+ a complete test of his Debian system to ensure that everything is working
+ properly given the exact set of base libraries installed at this very moment.
+ This includes the test suite of the authors of his favorite software, but
+ also all distribution test suites provided by Debian developers (see above).
+
+ * Sylvestre maintains a core computational library in Debian.
+ A new version (or other modification) of this library promises performance
+ advantages. Using DebTest he could not only verify the absence of
+ regressions but also obtain a direct performance comparison
+ against the previous version across a range of applications.
+
+ * Joerg maintains a repository of backports of Debian packages to be
+ installed in a stable environment. He wants to assure that
+ backporting of the packages has not caused a deviation in their
+ intended functioning. By using existing DebTest test suites he
+ could verify that backported versions of the packages do not break
+ the stability and function as promised within the stable
+ environment.
+
+ * Mark wants to create a Debian-derived distribution and needs to
+ modify a number of essential packages in order to achieve the desired
+ improvements. He hopes that these changes do not break other Debian
+ packages, but he is not really sure. A comprehensive test battery for the
+ whole Debian system would offer him a way to verify proper functioning
+ of his modified snapshot of Debian -- without having to manually replicate
+ the testing efforts done by thousands of Debian contributors.
+
+ * Linus is an upstream developer. He just loves the fact that he can tell any
+ of his Debian-based users to just 'apt-get install' something and send him
+ the output of a debtest command, whenever they claim that his software
+ doesn't work properly. It pleases him to see his carefully developed test
+ suite conveniently accessible to users.
+
+ * Finally, Lucas has access to a powerful computing facility and
+ likes to run all kinds of tests on all packages in the Debian archive.
+ A Debian-wide regression test framework would allow Lucas to execute
+ complex test collections (suites for individual packages,
+ interoperability tests, or comparative tests) in an automated fashion,
+ and file bug reports against the respective packages whenever a
+ malfunction is detected. Some of Lucas' friends are not brave enough to file
+ bugs, but still want to contribute. They simply run (selected) tests
+ on their local machines that in turn report results/logs to a Debian
+ dashboard server, where interested parties can get a weather report of
+ Debian's status.
== Scope ==
-This specification covers feature specifications for Debian. It is not meant as
-a more general specification format.
+This specification is applicable to all Debian packages, and Debian as a whole.
== Design ==
A specification should be built with the following considerations:
- * The person implementing it may not be the person writing it. It should be
+ * The person implementing it may not be the person writing it. The specification should be
* clear enough for someone to be able to read it and have a clear path
- * towards implementing it. If it doesn't, it needs more detail.
+ * towards implementing it. If it is not straightforward, it needs more detail.
- * That the use cases covered in the specification should be practical
+ * Use cases covered in the specification should be practical
* situations, not contrived issues.
* Limitations and issues discovered during the creation of a specification
Specific issues related to particular sections are described further below.
-=== Summary ===
-The summary should not attempt to say '''why''' the spec is being defined, just
-'''what''' is being specified.
+=== Core components ===
+
+ * Organization of the framework
+ - packages might register ways to run basic tests against installed
+ versions
+ register:
+ - executable?
+
+
+==== Packaged tests ====
+
+ * Metainformation:
+ * duration: ....
+ * resources:
+ * suites:
+
+ * Debug symbols: ....
+ * do not strip symbols from test binary
+
+ * Packages that register tests might provide a virtual package
+ 'test-<packagename>' to allow easy test discovery and retrieval via
+ debtest tools.
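
The metainformation fields listed above could be carried in a small record like the one sketched below. This is only an illustration, not part of the specification: the class name `PackagedTestInfo`, the field names, the units and the example package `foo` are all assumptions. Only the `test-<packagename>` naming scheme comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class PackagedTestInfo:
    """Hypothetical metadata record for one packaged test battery."""

    package: str                      # name of the package under test
    duration_s: float                 # estimated run time in seconds (assumed unit)
    resources: dict = field(default_factory=dict)  # e.g. {"memory_mb": 512}
    suites: tuple = ()                # named sub-suites offered by the package

    @property
    def virtual_package(self) -> str:
        # Naming scheme 'test-<packagename>' as proposed in the text.
        return f"test-{self.package}"


# Example record for an imaginary package "foo".
meta = PackagedTestInfo(
    package="foo",
    duration_s=1800,
    resources={"memory_mb": 2048, "disk_mb": 10000},
    suites=("quick", "full"),
)
print(meta.virtual_package)
```

Keeping the resource demands in a plain mapping leaves room for additional resource types later without changing the record layout.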
-=== Rationale ===
-This should be the description of '''why''' this spec is being defined.
+==== debtest tools ====
-=== Scope and Use Cases ===
+ * Invocation::
+ * single package tests
+ * all (with -f to force even if resources are not sufficient)
+ * tests of dependent packages (discovered via rdepends,
+ "rrecommends" and "rsuggests")
+ * given specific resource constraints, run only
+ the tests that fit within them
+ * Customization/Output::
+ plugins::
+ * job resources requirement adjustments
+ . manual customization
+ . request from dashboard for the system (or alike)
+ * executors
+ . local execution (monitor resources)
+ . submit to cluster/cloud
+ * output/reports
+ . some structured output
+ . interfaces to dashboards
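
The resource-based invocation mode and the force option outlined above might work roughly as in the following sketch. It is only an assumption about how a debtest tool could behave: the function `select_tests`, the resource keys and the test names are hypothetical, and no such tool exists yet.

```python
def select_tests(tests, available, force=False):
    """Return the names of tests whose resource demands fit `available`.

    tests:     mapping of test name -> dict of required resources
    available: dict of resources present on this machine
    force:     if True, select every test regardless of resources
               (mirrors the '-f' idea from the outline)
    """
    selected = []
    for name, required in tests.items():
        # A test fits if every demanded resource is available in
        # at least the requested amount; unknown resources count as 0.
        fits = all(available.get(key, 0) >= amount
                   for key, amount in required.items())
        if fits or force:
            selected.append(name)
    return selected


# Hypothetical test registry and machine description.
tests = {
    "foo-quick": {"memory_mb": 256},
    "foo-full": {"memory_mb": 8192, "disk_mb": 50000},
}
machine = {"memory_mb": 4096, "disk_mb": 100000}

print(select_tests(tests, machine))              # only the test that fits
print(select_tests(tests, machine, force=True))  # everything, as with -f
```

A plugin that adjusts job resource requirements, as listed under Customization, could then simply rewrite the `tests` mapping before selection.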
-While not always required, but in many cases they bring much better clarity to
-the scope and scale of the specification than could be obtained by talking in
-abstract terms.
+
+==== Maintainer helpers ====
+
+ Helpers:
+ - assess resources/performance:
+
+
+=== Supplementary infrastructure ===
+
+==== Dashboard server ====
=== Implementation Plan ===
The implementation is very dependent on the type of feature to be implemented.
Refer to the team leader for further suggestions and guidance on this topic.
+ * Implementation language:
+ - Python, unless someone takes on the burden of developing and
+ maintaining an alternative implementation for the upcoming years.
+
== Outstanding Issues ==
The specification process requires experienced people to drive it. More