
11 Maintenance


Maintenance is the most expensive phase of the software lifecycle, with typical estimates ranging from 60% to 80% of total cost [PRESSMAN]. Consequently, maintenance methodology has a major impact on software cost. "Bad fixes," in which errors are introduced while fixing reported problems, are a significant source of error [JONES]. Complexity analysis can guide maintenance activity to preserve (or improve) system quality, and specialized testing techniques help guard against the introduction of errors while avoiding redundant testing.

11.1 Effects of changes on complexity

Complexity tends to increase during maintenance, for the simple reason that both error correction and functional enhancement are much more frequently accomplished by adding code than by deleting it. Not only does overall system complexity increase, but the complexity of individual modules increases as well, because it is usually easier to "patch" the logic in an existing module than to introduce a new module into the system design.

11.1.1 Effect of changes on cyclomatic complexity

Cyclomatic complexity usually increases gradually during maintenance, since the increase in complexity is proportional to the complexity of the new code. For example, adding four decisions to a module increases its complexity by exactly four. Thus, although complexity can become excessive if not controlled, the effects of any particular modification on complexity are predictable.
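The predictability of this effect can be illustrated with a minimal sketch. It assumes the standard flow graph definition of cyclomatic complexity, v(G) = e - n + 2 for a connected graph with a single entry and a single exit, which is not restated in this section. The function name and the node labels are hypothetical and chosen only for illustration.

```python
def cyclomatic_complexity(edges):
    """Return e - n + 2 for a connected flow graph given as (source, target) edges.

    Assumes a single entry node and a single exit node.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical module before maintenance: straight-line code.
before = [("entry", "a"), ("a", "exit")]

# The same module after a patch wraps new code in one if-then decision at "a".
after = [("entry", "a"), ("a", "patch"), ("a", "exit"), ("patch", "exit")]

increase = cyclomatic_complexity(after) - cyclomatic_complexity(before)
```

In this sketch the patch adds one binary decision, and `increase` is 1, matching the one-for-one relationship described above.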

11.1.2 Effect of changes on essential complexity

Essential complexity can increase suddenly during maintenance, since adding a single statement can raise essential complexity from 1 to the cyclomatic complexity, making a perfectly structured module completely unstructured. Figure 11-1 illustrates this phenomenon. The first flow graph is perfectly structured, with an essential complexity of 1. The second flow graph, derived from the first by replacing a functional statement with a single "goto" statement, is completely unstructured, with an essential complexity of 12. The impact on essential complexity may not be obvious from inspection of the source code, or even to the developer making the change. It is therefore very important to measure essential complexity before accepting each modification, to guard against such catastrophic structural degradation.

Figure 11-1. Catastrophic structural degradation.

11.1.3 Incremental reengineering

An incremental reengineering strategy [WATSON1] provides greater benefits than merely monitoring the effects of individual modifications on complexity. A major problem with software is that it gets out of control. Generally, the level of reengineering effort increases from simple maintenance patches through targeted reverse engineering to complete redevelopment as software size increases and quality decreases, but only up to a point. There is a boundary beyond which quality is too poor for effective reverse engineering and size is too large for effective redevelopment, so the only approach left is to make the system worse by performing localized maintenance patches. Once that boundary is crossed, the system is out of control and becomes an ever-increasing liability. The incremental reengineering technique helps keep systems away from the boundary by improving software quality in the vicinity of routine maintenance modifications. The strategy is to improve the quality of poor software that interacts with software that must be modified during maintenance. The result is that software quality improves during maintenance rather than deteriorating.

11.2 Retesting at the path level

Although most well-organized testing is repeatable as discussed in section 11.4, it is sometimes expensive to perform complete regression testing. When a change to a module is localized, it may be possible to avoid testing the changed module from scratch. Any path that had been tested in the previous version and does not execute any of the changed software may be considered tested in the new version. After testing information for those preserved paths has been carried forward as if they had been executed through the new system, the standard structured testing techniques can be used to complete the basis. This technique is most effective when the change is to a rarely executed area of the module, so that most of the tested paths through the previous version can be preserved. The technique is not applicable when the changed software is always executed, for example when it is the module's initialization code, since in that case all paths must be retested.
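The selection of preserved paths can be sketched as follows. This is a minimal illustration, not a tool's actual interface: it assumes each previously tested path is recorded as a sequence of flow graph node labels and that the changed software is identified by a set of node labels. The function name and all labels are hypothetical.

```python
def preserved_paths(tested_paths, changed_nodes):
    """Return the previously tested paths that execute none of the changed nodes.

    Such paths may be carried forward as tested in the new version.
    """
    changed = set(changed_nodes)
    return [path for path in tested_paths if changed.isdisjoint(path)]


# Hypothetical paths tested in the previous version of a module.
tested = [
    ["entry", "a", "b", "exit"],
    ["entry", "a", "c", "exit"],
    ["entry", "d", "exit"],
]

# A localized change to the rarely executed node "c" preserves two paths.
carried_forward = preserved_paths(tested, {"c"})

# A change to code that is always executed preserves nothing.
nothing_preserved = preserved_paths(tested, {"entry"})
```

The paths in `carried_forward` would then be the starting point for completing the basis with the standard structured testing techniques. The second call shows the case in which the technique does not apply.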

11.3 Data complexity

The specified data complexity, sdv, of a module and a set of data elements is defined as the cyclomatic complexity of the reduced graph after applying the module design complexity reduction rules from section 7.4, except that the "black dot" nodes correspond to references to data elements in the specified set rather than module calls. As a special case, the sdv of a module with no references to data in the specified set is defined to be 0. Specified data complexity is really an infinite class of metrics rather than a single metric, since any set of data elements may be specified. Examples include a single element, all elements of a particular type, or all global elements [WATSON3].

The data-reduced graph contains all control structures that interact with references to specified data, and changes to that data may be tested by executing a basis set of paths through the reduced graph. Specified data complexity can therefore be used to predict the impact of changes to the specified data.

One particularly interesting data set consists of all references to dates [MCCABE6]. The "Year 2000 Problem" refers to the fact that a vast amount of software only stores the last two digits of the year field of date data. When this field changes from 99 to 00 in the year 2000, computations involving date comparison will fail. Correcting this problem is already (in 1996) becoming a major focus of software maintenance activity. Calculating the "date complexity" (specified data complexity with respect to date references) helps determine the scope of the problem, and the corresponding "date-reduced" flow graphs can be used to generate test cases when the corrections are implemented.

11.4 Reuse of testing information

Although the technique described in section 11.2 can occasionally be used to reduce the regression testing effort, it is best to rerun all of the old tests after making modifications to software. The outputs should be examined to make sure that correct functionality was preserved and that modified functionality conforms to the intended behavior. Then, the basis path coverage of the new system induced by those old tests should be examined and augmented as necessary. Although this may seem like a tremendous effort, it is mostly a matter of good organization and proper tool selection.

Graphical regression testing tools and embedded system simulators facilitate test execution and functional comparison in the most difficult environments. Most non-interactive applications software can be tested effectively at the integration level with simple regression suites and a text comparison utility for outputs, tied together with system command scripts. At the module level, stub and driver code should certainly not be discarded after one use -- minimal extra effort is required to store it for automated regression purposes, although it must be kept current as the software itself is modified.
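For non-interactive software, the combination of a regression suite, a text comparison utility, and command scripts can be approximated in a few lines. The following is a sketch under stated assumptions: the expected output of each test is assumed to have been stored as text, the program under test is assumed to write its results to standard output, and the function names are hypothetical.

```python
import difflib
import subprocess


def diff_outputs(expected, actual):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile="expected", tofile="actual", lineterm=""))


def run_case(command, expected):
    """Run one non-interactive command and diff its stdout against the stored output.

    `command` is an argument list such as ["prog", "input.txt"] (hypothetical).
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return diff_outputs(expected, result.stdout)
```

A driver script would call `run_case` once per stored test and report every case whose returned diff is non-empty.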

An automated coverage tool can determine the level of basis path coverage at each regression run, and indicate what further tests need to be added to reflect the changes to the software. The resulting new tests should of course be added to the automated regression suite rather than executed once and discarded. Some discipline is required to maintain regression suites in the face of limited schedules and budgets, but the payoff is well worth the effort.

When full regression testing is really impractical, there are various shortcuts that still give a reasonable level of testing. The first is to keep "minimized" regression suites, in which only key functional tests and tests that increased basis path coverage on the original system are executed. This technique preserves most but not all of the error detection ability of the complete set of original tests, as discussed in Appendix B, and may result in a regression suite of manageable size. A possible drawback is that some of the original tests that were eliminated due to not increasing coverage of the original system might have increased coverage of the new system, so extra test cases may be needed. Another technique is to save the old coverage information for modules that have not changed, and fully test only those modules that have changed. At the integration level, calls to changed modules from unchanged modules can be tested using the incremental integration method described in section 7.6.
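The construction of a minimized suite can be sketched as follows. As a simplifying assumption, the coverage contributed by each test is modeled here as a set of covered items reported by a coverage tool, standing in for basis path coverage, and tests are considered in their original order. The function name, test names, and item labels are hypothetical.

```python
def minimize_suite(tests, key_tests=()):
    """Keep key functional tests and tests that increased coverage when first run.

    `tests` is an ordered list of (name, covered_items) pairs recorded on the
    original system. Returns the names of the tests to keep, in order.
    """
    kept = []
    covered = set()
    for name, items in tests:
        new_items = set(items) - covered
        if new_items or name in key_tests:
            kept.append(name)
        covered |= set(items)
    return kept


# Hypothetical coverage data recorded on the original system.
original_tests = [
    ("t1", {"e1", "e2"}),
    ("t2", {"e2"}),
    ("t3", {"e2", "e3"}),
    ("t4", {"e1", "e3"}),
]

minimized = minimize_suite(original_tests, key_tests={"t2"})
```

In this sketch `t4` is dropped because it added no coverage on the original system. As noted above, such a test might still have increased coverage of the modified system, so the coverage of the minimized suite should be rechecked after each change.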

One final point about regression testing is that it is only as effective as the underlying behavior verification oracle. Too many otherwise comprehensive regression suites use unexamined (and therefore probably incorrect) output from the time when the suite was constructed as the standard for comparison. Although it may not be feasible to verify correctness for every test case in a large regression suite, it is often appropriate to have an ongoing project that gradually increases the percentage of verified regression outputs.


