templates, tune up of --help for configurerepo tool, etc

[neurodebian.git] / sandbox / proposal_regressiontestframwork.moin
diff --git a/sandbox/proposal_regressiontestframwork.moin b/sandbox/proposal_regressiontestframwork.moin

index f93569b9cc1b61dfeb8d1a674aeb0adf166d31d0..983a624c5d91550e3f7c12deaec304a560a8f373 100644 (file)
--- a/sandbox/proposal_regressiontestframwork.moin
+++ b/sandbox/proposal_regressiontestframwork.moin
@@ -1,72 +1,153 @@
-## This will eventually do to wiki.debian.org/RegressionTestFramework
+## This will eventually do to wiki.debian.org/DebTestFramework
  
   * '''Created''': <<Date(2010-10-07)>>
- * '''Contributors''': MichaelHanke
+ * '''Contributors''': MichaelHanke, YaroslavHalchenko
   * '''Packages affected''': 
   * '''See also''': 
  
  == Summary ==
  
-This specification describes conventions and tools that allow Debian to
-distribute and run regression test batteries developed by upstream or
-Debian developers in a uniform fashion.
+This specification describes DebTest -- a framework with conventions and tools
+that allow Debian to distribute test batteries developed by upstream or Debian
+developers.  DebTest aims to enable developers and users to perform extensive
+testing of a deployed Debian system or a particular software of interest in a
+uniform fashion.
  
  == Rationale ==
  
  Ideally software packaged for Debian comes with an exhaustive test suite that
-can be used to determine whether this software works as expected on the Debian
-platform. However, especially for complex software, these test suites are often
-resource hungry (CPU time, memory, diskspace, network bandwidth) and cannot be
-ran at package build time by buildds. Consequently, test suites are only
-utilized by the packager on a particular machine, before uploading a new version
-to the archive.
-
-However, Debian is an integrated system and packaged software is typically made
-to rely on functionality provided by other Debian packages (e.g. shared
-libraries) instead of shipping duplicates with different versions in every
-package -- for many good reasons. Unfortunately, there is also a downside to
-this: Debian packages often use 3rd-party tools with different versions than
-those tested by upstream, and moreover, the actual versions might change
-frequently between to subsequent uploads of a package.  Currently a change in a
-dependency that introduces an incompatibility cannot be detected reliably
-(before users have filed a bug report) -- even if upstream provides a testsuite
-that would have caught the breakage. Although there are archive-wide QA efforts
-(e.g. constantly rebuilding all packages) these tests can only detect API/ABI
-breakage or functionality tested during build-time checks -- they are not
-exhaustive for the aforementioned reasons.
+can be used to determine whether this particular software works as expected on
+the Debian platform. However, especially for complex software, these test
+suites are often resource hungry (CPU time, memory, disk space, network
+bandwidth) and cannot be ran at package build time by buildds. Consequently,
+test suites are typically utilized manually and only by the respective packager
+on a particular machine, before uploading a new version to the archive.
+
+However, Debian is an integrated system and packaged software typically relies
+on functionality provided by other Debian packages (e.g. shared libraries)
+instead of shipping duplicates with different versions in every package -- for
+many good reasons. Unfortunately, there is also a downside to this: Debian
+packages often use versions of 3rd-party tools that are different from those
+tested by upstream, and moreover, the actual versions of dependencies might
+change frequently between subsequent uploads of a dependent package.  Currently
+a change in a dependency that introduces an incompatibility cannot be detected
+reliably even if upstream provides a test suite that would have caught the
+breakage.  Therefore integration testing heavily relies on users to detect
+incorrect functioning and file bug reports. Although there are archive-wide QA
+efforts (e.g. constantly rebuilding all packages) these tests can only detect
+API/ABI breakage or functionality tested during build-time checks -- they are
+not exhaustive for the aforementioned reasons.
  
  This is a proposal to, first of all, package upstream test suites in a way that
  they can be used to run expensive archive-wide QA tests. However, this is also
-a proposal to establish means to test interactions of software from multiple
-Debian packages and test proper, continued, integration into the Debian system.
+a proposal to establish means to test interactions between software from
+multiple Debian packages to provide more thorough continued integration and
+regression testing for the Debian systems.
  
  == Use Cases ==
  
-  * Bob is the maintainer for the boot process for Debian. In the etch cycle,
-    he would like to work on getting the boot time down to two seconds from
-    boot manager to GDM screen. He creates an entry for the specification in the
-    wiki, discusses it at debconf, and starts writing out a braindump of it.
-
-  * Arnaud is a student participating in Google's Summer of code and wants to
-    make sure that his project fits the short description that has been given
-    on the ideas page. He writes a detailed spec in the wiki. His mentor can
-    then confirm that he's on good track. The specification is published on a
-    mailing list and people's comments help improve it even further.
+  * Moritz is a member of the security team. Whenever he applies a patch to fix
+    a security issue he wants to make sure that the generic behavior of the software
+    remains unchanged. However, in general he only has access to test cases that
+    are included in the source package (if any). In the absence of proper tests
+    he can only either assume that is would work (bad by design), or rely on the
+    respective package maintainer to run the appropriate tests (introduces
+    delays). A packaged exhaustive regression test suite would allow Moritz to
+    perform comprehensive testing on his own and release the fixed package as
+    soon as the tests pass.
+
+  * Michael is a Debian package maintainer that takes care of three
+    packages each providing a data format conversion utility. While
+    all three tools have their merits there is also lots of
+    overlap. For example, given a particular data file they should all
+    generate identical output. With a DebTest framework, Michael can
+    write and package cross-package test suites to ensure that this
+    promise is fulfilled at any time.  Moreover, Michael can also
+    develop/package "pipeline" tests that ensure proper functioning of
+    multi-stage/package processing pipelines (from raw data format
+    conversion to visualization), where some stages could be
+    (re)processed using alternative tools from different software
+    packages promising to provide the same functionality.  By testing
+    a whole processing stream while changing the alternative
+    implementations, breakage of the compatibility compliance could be
+    detected.
+
+  * Yarik is a Debian maintainer of a package where upstream provides
+    a complete analysis pipeline which was used for an article
+    publication.  Such analysis requires relatively large array of
+    data and a range of tools from other packages to acquire
+    publication-ready summary of the results. Therefor such analysis
+    cannot be carried out at package build time.  Upstream aims to
+    assure the reproducibility of the published results and encourages
+    Yarik to promise correct functioning of the research product on
+    Debian systems.  Within the DebTest framework, Yarik can package
+    upstream analysis pipeline along with the target results to assure
+    reproducibility of the scientific findings.
+
+  * Albert is a scientist using Debian for his research activities. The
+    developers of his favorite software tell him to rather use the GreenPants
+    distribution, because they cannot guarantee that their software works
+    properly on Debian. They reason that Debian has a different
+    version of a numerical library that hasn't been "tested" by the authors.
+    With packaged regression test suites Albert can install and run, at any given point,
+    a complete test of his Debian system to ensure that everything is working
+    properly given the exact set of base libraries installed at this very moment.
+    This includes the test suite of the authors of his favorite software, but
+    also all distribution test suites provided by Debian developers (see above).
+
+  * Sylvestre maintains a core computational library in Debian.
+    A new version (or other modification) of this library promises performance
+    advantages.  Using DebTest he could not only verify the absence of
+    regressions but also to obtain direct performance comparison
+    against the previous version across a range of applications.
+
+  * Joerg maintains a repository of backports of Debian packages to be
+    installed in a stable environment.  He wants to assure that
+    backporting of the packages has not caused a deviation in their
+    intended functioning.  By using existing DebTest tests suites he
+    could verify that backported versions of the packages do not break
+    the stability and function as promised within the stable
+    environment.
+
+  * Mark wants to create a Debian-derived distribution and needs to
+    modify a number of essential packages in order to achieve the desired
+    improvements. He hopes that these changes do not break other Debian
+    packages, but he is not really sure. A comprehensive test battery for the
+    whole Debian system would offer him a way to verify proper functioning
+    of his modified snapshot of Debian -- without having to manually replicate
+    the testing efforts done by thousands of Debian contributors.
+
+  * Linus is an upstream developer. He just loves the fact that he can tell any
+    of his Debian-based users to just 'apt-get install' something and send him
+    the output of a debtest command, whenever they claim that his software
+    doesn't work properly. It pleases him to see his carefully developed test
+    suite to be conveniently accessible for users.
+
+  * Finally, Lucas has access to a powerful computing facility and
+    likes to run all kinds of tests on all packages in the Debian archive.
+    A Debian-wide regression test framework would allow Lucas to execute
+    complex test collections (suites for individual packages,
+    interoperability tests, or comparative) in an automated fashion,
+    and file bug reports against the respective packages whenever a
+    malfunction is detected. Some of Lucas friends are not brave enough to file
+    bugs, but still want to contribute. They simply run (selected) tests
+    on their local machines that in turn report results/logs to a Debian
+    dashboard server, where interested parties can get a weather report of
+    Debian's status.
  
  == Scope ==
  
-This specification covers feature specifications for Debian. It is not meant as
-a more general specification format.
+This specification is applicable to all Debian packages, and Debian as a whole.
  
  == Design ==
  
  A specification should be built with the following considerations:
  
-  * The person implementing it may not be the person writing it. It should be
+  * The person implementing it may not be the person writing it. Specification should be
    * clear enough for someone to be able to read it and have a clear path
-  * towards implementing it. If it doesn't, it needs more detail.
+  * towards implementing it. If it is not straightforward, it needs more detail.
  
-  * That the use cases covered in the specification should be practical
+  * Use cases covered in the specification should be practical
    * situations, not contrived issues.
  
    * Limitations and issues discovered during the creation of a specification
@@ -78,20 +159,62 @@ A specification should be built with the following considerations:
  
  Specific issues related to particular sections are described further below.
  
-=== Summary ===
  
-The summary should not attempt to say '''why''' the spec is being defined, just
-'''what''' is being specified.
+=== Core components ===
+
+ * Organization of the framework
+   - packages might register ways to run basic tests against installed
+     versions
+   register:
+    - executable?
+
+
+==== Packaged tests ====
+
+ * Metainformation:
+   * duration: ....
+   * resources:
+   * suites:
+
+ * Debug symbols: ....
+   * do not strip symbols from test binary
+
+ * Packages that register tests might provide a virtual package
+   'test-<packagename>' to allow easy test discovery and retrival via
+   debtest tools.
  
-=== Rationale ===
  
-This should be the description of '''why''' this spec is being defined.
+==== debtest tools ====
  
-=== Scope and Use Cases ===
+ * Invocation::
+   * single package tests
+   * all (with -f to force even if resources are not sufficient)
+   * tests of dependent packages (discovered via rdepends,
+     "rrecommends" and "rsuggests")
+   * given specific resources demands, just run
+     the ones matching those
+ * Customization/Output::
+   plugins::
+     * job resources requirement adjustments
+          . manual customization
+       . request from dashboard for the system (or alike)
+        * executioners
+       . local execution (monitor resources)
+       . submit to cluster/cloud
+     * output/reports
+          . some structured output
+          . interfaces to dashboards
  
-While not always required, but in many cases they bring much better clarity to
-the scope and scale of the specification than could be obtained by talking in
-abstract terms.
+
+==== Maintainer helpers ====
+
+   Helpers:
+   - assess resources/performance:
+
+
+=== Supplementary infrastructure ===
+
+==== Dashboard server ====
  
  === Implementation Plan ===
  
@@ -111,6 +234,10 @@ as a reference.
  The implementation is very dependent on the type of feature to be implemented.
  Refer to the team leader for further suggestions and guidance on this topic.
  
+ * Implementation language:
+   - Python unless someone takes the burden to develop
+     and maintain for upcoming years.
+
  == Outstanding Issues ==
  
  The specification process requires experienced people to drive it. More