Harman Patil (Editor)

Software aging


In software engineering, software aging is the tendency of software to fail, or to cause a system failure, after running continuously for a certain time. As the software gets older it becomes less robust and eventually stops functioning as it should, so rebooting or reinstalling it can be seen as a short-term fix. Software rejuvenation is a proactive fault management method for dealing with software aging. It can be classified as an environment diversity technique and is usually implemented through software rejuvenation agents (SRA).

Interest in the software aging phenomenon has increased in both academia and industry. The main focus has been to understand its effects through empirical observation and theoretical analysis.

"Programs, like people, get old. We can't prevent aging, but we can understand its causes, take steps to limit its effects, temporarily reverse some of the damage it has caused, and prepare for the day when the software is no longer viable."

Memory bloat and memory leaks, along with data corruption and unreleased file locks, are particular causes of software aging.

Software aging

Software failures are a more likely cause of unplanned system outages than hardware failures. This is because software exhibits an increasing failure rate over time due to data corruption, numerical error accumulation, and unbounded resource consumption. In both widely used and specialized software, rebooting is a common way to clear a problem, because aging stems from the complexity of software, which is never free of errors. It is almost impossible to fully verify that a piece of software is bug-free. Even high-profile software such as Windows and Mac OS X must receive continual updates to improve performance and fix bugs. Software development tends to be driven by the need to meet release deadlines rather than to ensure long-term reliability. Designing software that is immune to aging is difficult. Not all software ages at the same rate, because some users use the system more intensively than others.
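Numerical error accumulation, one of the causes named above, can be illustrated with a small sketch. The function names `naive_accumulate` and `drift` are hypothetical and chosen for this example only; the idea is that a long-running counter that keeps adding a floating-point step drifts further from the correctly rounded sum the longer it runs.

```python
import math


def naive_accumulate(step: float, count: int) -> float:
    """Add `step` to a running total `count` times, as a long-running counter might."""
    total = 0.0
    for _ in range(count):
        total += step  # each addition may introduce a small rounding error
    return total


def drift(step: float, count: int) -> float:
    """Absolute difference between the naive total and a correctly rounded sum."""
    exact = math.fsum([step] * count)  # fsum tracks partial sums to avoid error build-up
    return abs(naive_accumulate(step, count) - exact)


if __name__ == "__main__":
    # The drift tends to grow as the number of additions grows.
    for count in (1_000, 100_000, 1_000_000):
        print(count, drift(0.1, count))
```

Restarting the computation from a known-good value resets this drift, which is one reason a restart can temporarily clear an aging-related problem.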

Software rejuvenation

Because aging inevitably leads to failures in software systems, software rejuvenation can be employed proactively to prevent crashes or degradation. Rejuvenation removes accumulated error conditions and frees system resources. Ways to clean the internal state of the software include flushing operating system kernel tables, running garbage collection, and reinitializing internal data structures. A well-known example of rejuvenation is a system reboot.
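As a minimal sketch of cleaning internal state without a full reboot, the hypothetical `SessionStore` class below accumulates entries that are never evicted and offers a `rejuvenate` method that reinitializes its data structure and triggers a garbage collection pass. The class and its methods are assumptions made for illustration, not part of any particular system.

```python
import gc


class SessionStore:
    """Hypothetical long-running component whose internal state grows over time."""

    def __init__(self) -> None:
        self._sessions = {}
        self.handled = 0

    def handle(self, session_id: str, payload: str) -> None:
        # State accumulates because finished sessions are never evicted.
        entry = self._sessions.setdefault(session_id, {"events": []})
        entry["events"].append(payload)
        self.handled += 1

    def size(self) -> int:
        """Number of sessions currently held in memory."""
        return len(self._sessions)

    def rejuvenate(self) -> int:
        """Reinitialize internal data structures and run a garbage collection pass.

        Returns the number of session entries that were discarded.
        """
        discarded = len(self._sessions)
        self._sessions = {}  # drop all accumulated state
        gc.collect()         # reclaim anything that is only reachable through cycles
        return discarded
```

A caller might invoke `rejuvenate()` during a quiet period. Any state that must survive would need to be persisted first, which this sketch does not attempt.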

AT&T, a multinational telecommunications corporation, has implemented software rejuvenation in the real-time system that collects billing data in the United States for most telephone exchanges.

Other kinds of systems that have implemented software rejuvenation include:

  1. Transaction processing systems
  2. Web servers
  3. Spacecraft systems

Memory leaks

In most systems, programs can request temporary memory from the system when they need it. However, every system has a limited total amount of memory, and if one application uses a large share of it, less remains for other applications. In low-memory conditions, the system usually runs more slowly, applications become unresponsive, and those that unexpectedly request large amounts of memory may crash. Applications should "free" dynamically requested memory (return it to the system's pool) when they have finished using it, so that another application can use it when needed.

A memory leak happens when an application is allocated memory but does not free it after it has finished using it. This eventually causes the system to run out of memory. In Microsoft Windows, for example, the memory use of Windows Explorer plug-ins and long-lived processes such as services can degrade the reliability of the system to the point of making it unusable. A reboot might be needed to make the system work again.

Software rejuvenation helps with memory leaks because it forces all the memory used by an application to be released. The application can then be restarted, and it starts with a clean slate.
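One way to picture this is a supervisor that runs a leaky worker in a child process and starts a fresh one after a fixed number of requests, so the operating system reclaims everything the old worker held when it exits. This is only a sketch under stated assumptions: `WORKER_SOURCE`, `run_worker`, and `supervise` are hypothetical names, and the "requests" are simulated by a loop.

```python
import subprocess
import sys

# Hypothetical worker: leaks memory on every simulated request, then exits.
WORKER_SOURCE = """
import sys
leaked = []
for i in range(int(sys.argv[1])):
    leaked.append(bytearray(100_000))  # simulated leak, never released by the worker
print(len(leaked))
"""


def run_worker(max_requests: int) -> int:
    """Run one worker process to completion and return how many requests it handled."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE, str(max_requests)],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def supervise(total_requests: int, max_requests_per_worker: int) -> int:
    """Serve `total_requests` using fresh workers; return how many workers were started."""
    started = 0
    remaining = total_requests
    while remaining > 0:
        batch = min(remaining, max_requests_per_worker)
        remaining -= run_worker(batch)  # leaked memory is released when the worker exits
        started += 1
    return started
```

Each worker's leaked memory is bounded by `max_requests_per_worker`, at the cost of the startup time of a new process.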

Implementation

Two methods for implementing rejuvenation are:

  1. Time-based rejuvenation
  2. Prediction-based rejuvenation
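The two approaches can be contrasted with a small sketch. It assumes that time-based rejuvenation triggers after a fixed uptime, and that prediction-based rejuvenation extrapolates a resource trend to estimate when a limit will be reached. The least-squares fit and every name below (`time_based_due`, `predict_exhaustion`, `prediction_based_due`) are illustrative assumptions, not a method prescribed by the article.

```python
from typing import Optional, Sequence, Tuple


def time_based_due(uptime_s: float, interval_s: float) -> bool:
    """Time-based policy: rejuvenate once the process has run for a fixed interval."""
    return uptime_s >= interval_s


def predict_exhaustion(
    samples: Sequence[Tuple[float, float]], limit: float
) -> Optional[float]:
    """Fit a least-squares line to (time, usage) samples.

    Returns the time at which the fitted line reaches `limit`,
    or None if usage is not growing or there is too little data.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # no upward trend, so no exhaustion is predicted
    intercept = mean_u - slope * mean_t
    return (limit - intercept) / slope


def prediction_based_due(
    samples: Sequence[Tuple[float, float]],
    limit: float,
    now: float,
    safety_margin_s: float,
) -> bool:
    """Prediction-based policy: rejuvenate when exhaustion is predicted within the margin."""
    eta = predict_exhaustion(samples, limit)
    return eta is not None and eta - now <= safety_margin_s
```

With samples `[(0, 100), (10, 200), (20, 300)]` and a limit of 1000, the fitted line has slope 10 and intercept 100, so the limit is predicted to be reached at time 90.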

Memory bloating

Garbage collection is a form of automatic memory management whereby the system automatically recovers unused memory. For example, the .NET Framework manages the allocation and release of memory for software running under it. However, automatically tracking allocated objects takes time and is not perfect.

.NET-based web services manage several logical types of memory, such as the stack, the unmanaged heap, and the managed heap (free space). As physical memory fills up, the OS writes rarely used parts of it to disk so that it can reallocate the memory to another application, a process known as paging or swapping. If the paged-out memory is needed again, it must be reloaded from disk. If several applications are all making large demands, the OS can spend much of its time merely moving data between main memory and disk, a process known as disk thrashing. Because the garbage collector has to examine all of the allocations to decide which are in use, it may exacerbate this thrashing. As a result, extensive swapping can stretch garbage collection cycles from milliseconds to tens of seconds, which causes usability problems.
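The cost of examining live allocations can be observed in a rough way with Python's own collector, which differs from the .NET garbage collector but also has to traverse the container objects it tracks during a full collection. The helper `full_collection_seconds` is a hypothetical name, and the timings it returns depend heavily on the machine and interpreter, so no figures are given here.

```python
import gc
import time


def full_collection_seconds(live_objects: int) -> float:
    """Keep `live_objects` container objects alive and time one full collection."""
    keep_alive = [[i] for i in range(live_objects)]  # lists are tracked by the collector
    start = time.perf_counter()
    gc.collect()  # full collection: traverses all tracked objects
    elapsed = time.perf_counter() - start
    del keep_alive
    return elapsed


if __name__ == "__main__":
    # Collection time tends to grow with the number of live tracked objects.
    for n in (10_000, 100_000, 1_000_000):
        print(n, full_collection_seconds(n))
```

If part of that live data had been paged out to disk, each traversal would also have to wait for it to be reloaded, which is the mechanism behind the extended pauses described above.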
