Steps towards regular automated regression testing
Automated testing with TestTool currently provides several sets of test cases, which are used for testing on Master Workspaces (MWSs) or Child Workspaces (CWSs). The test results on the MWS serve as a reference when testing a CWS derived from that MWS, so that it can be determined whether a bug found by an automated test was introduced on the CWS or already existed on the MWS. Our current Master Workspaces, be it SRC680 or OOF680, by no means pass all automated tests; there are always some tests that fail. For each such failed test a bug is filed in Issue Tracker. These are treated as normal bugs, so sooner or later they are fixed and the test works again.
Current pitfalls for regular regression testing
1. As some tests also fail on the Master Workspace, the tester has to check whether a failed test was newly introduced by the CWS or whether it already existed on the release milestone the CWS is based on. If automated testing was done for this milestone, or if at least an installation set for it is still available, this check can be done with acceptable effort. If neither is available, it is rather difficult to find out where the bug was introduced.
2. Different test environments sometimes report different results for some functional tests. This means that a failed test which did not fail on the corresponding MWS does not always indicate a new bug; a new test run might finish without this error.
3. If all existing test cases are run, the whole test cycle takes a few days - far too long to do on every CWS. Reduced sets exist, however, that run overnight.
What does this mean for the quality of OpenOffice.org?
The effect of the above pitfalls is that it is quite hard for a developer to do extended regression testing on his CWSs. If he feels that the changes he made on his CWS should be verified by automated testing, he will usually try to get hold of one of the automated testers - a limited resource, especially in the middle of an OOo release. Even the QA representative for the CWS will in many cases not be able to run more than a few basic automated tests, as the effort to run a larger set is too high.
So we have to rely on the SmokeTest, the normal testing done by developers and testers, and those automated tests the few automated testers are able to do, to ensure the quality of the CWSs. But there is room for more: if we were able to run more automated tests, we could find more bugs early, before they get into the codeline and the product, and thus create OOo releases of even better quality.
A possible way out
To enable automated tests on every CWS, we would have to eliminate the pitfalls above. A tester encountering a failed test during automated testing should not be required to check whether the bug was really introduced on the CWS, and he should be able to trust that a failed test means a bug exists, without having to rerun the test to be sure. Additionally, the tests should run in an acceptable time, i.e. a few hours at most.
The CWSs gh13 and gh14 are supposed to improve the speed of the TestTool and also its deterministic behaviour. With a small set of tests that always run safely and deterministically, and that finish within a few hours (more tests might be added later), we are left with the problem of having to check whether a failed test was introduced with the CWS or already existed on the MWS.
These checks - at least at the moment - take the most time for the automated testers. They could be eliminated if the MWS were always free of bugs that cause automated tests to fail: a failed test would then automatically mean that the corresponding bug was introduced with the new CWS, with no need to check the MWS.
But how do we ensure that the MWS stays free of bugs that cause the automated tests to fail? It looks like we would have to make the automated tests, or at least a certain set of them, mandatory for every CWS; otherwise, sooner or later, some CWS would again introduce bugs that cause automated tests to fail. So for every CWS the owner or the QA representative would have to run the automated tests. Ideally the test run would report directly to the EIS tool whether all tests succeeded, and approval of a CWS would then require the tests to have run successfully. That way we should be able to ensure that the automated tests don't get "corrupted" on the MWS. If, for whatever reason, a failed test were later detected on the MWS, this would be a P1 bug that would have to be fixed with the milestone before declaring it ready for use, or on the next milestone if a fix is not possible in acceptable time. A milestone containing such a bug should not be used for CWS creation, and CWSs already created on it should be resynced to the next milestone.
Automatic testing on the MWS for each milestone might delay the point at which the milestone is ready for use, compared to now - from a few hours for pure testing to whole days if real regressions have to be fixed. Is this acceptable?
Running automated tests that take several hours requires hardware to run them on. Do we need separate machines for the tests, or should they run on the machines where the developers do the builds? Is the additional machine time acceptable at all?
Is it possible to integrate such testing into tinderbox builds?
Which platforms are required for the automated tests? Does one platform suffice? Should the developer be free to determine the necessary platforms, e.g. normally run the tests on Windows and Linux, but declare one platform sufficient if he can safely say that his changes do not affect other platforms?
Changes in the UI might make it necessary to adapt the corresponding test cases. This requires work from the QA engineers responsible for creating automated tests, who might then become a bottleneck. This could be mitigated by informing the automated testers early during CWS development. Do we accept this potential increase in development time?
How does all this relate to unit tests? From time to time there are efforts to push the use of unit tests, or even make them mandatory, for the same reasons given here.
A vision of a future with regular regression testing
The following is a "vision" of an idealized future where we do more regression testing via automated tests:
In the past few months the quality of OOo has improved significantly. A set of automated tests is now run on each and every CWS before it gets integrated, so the quality of the CWSs has improved further, and with it that of the Master Workspace from which OOo is released.
Now, once the owner sets his CWS to the state "ready for QA", the QA representative (or the owner himself) triggers the run of the automated tests. This is quite easy for him, as it is a simple command in his command shell. The tests then run while the tester does the normal manual testing on the CWS, and they usually finish even before all normal tests are done, so they cost little or no additional time. When the run finishes, the test script reports its state to the EIS application, so the QA representative can easily see whether the tests were successful. Without successful automated tests, nearly (*) no CWS nowadays gets integrated into the MWS. (There are exceptions for special cases such as pure build fixes, CWS tooling changes, and the like.)
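The "simple command" above could be a small wrapper that runs the mandatory tests and reports the aggregate result. The following Python sketch only illustrates that gating logic; the test script names, the TestTool invocation, and the EIS reporting interface are all assumptions here (the real TestTool call and EIS API are not specified in this document), so they appear as labelled stubs.

```python
# Hypothetical sketch of a CWS gating wrapper. The script list, the
# TestTool call, and the EIS report are placeholders, not real APIs.

MANDATORY_TESTS = ["first.bas", "topten.bas"]  # assumed script names


def run_test(script):
    """Run one TestTool script and return True if it passed.
    A real wrapper might shell out to the TestTool binary here;
    this stub simply pretends the test passed."""
    return True  # stub result


def gate_cws(cws_name):
    """Run all mandatory tests for a CWS and report the result."""
    results = {script: run_test(script) for script in MANDATORY_TESTS}
    passed = all(results.values())
    # In the envisioned workflow this would be a call to the EIS
    # application; here we only print what would be reported.
    status = "all mandatory tests passed" if passed else "tests FAILED"
    print(f"EIS report for {cws_name}: {status}")
    return passed


if __name__ == "__main__":
    gate_cws("gh13")  # example CWS name taken from the text above
```

Approval of the CWS would then simply check the reported state in EIS rather than rely on the tester remembering to run each script by hand.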
The automated testers now spend much less effort checking failed tests: for at least the mandatory set of tests, the MWS is usually free of bugs found by automated tests, so for this part of the existing tests the checking effort disappears. This leaves them the time needed to adapt tests to UI changes in CWSs, and also a bit more time to develop new test cases. Now and then they add a test case to the mandatory set, so this set has grown slightly over the past weeks.
--Jj 13:19, 10 May 2007 (CEST)