Legacy code: a treasure to cherish, a pain to maintain

By Koos Huijssen

This blog is part of a series that is also available as a whitepaper.

In our practice as scientific software engineers at VORtech, we work for organizations that have computational software as an essential part of their intellectual capital. Countless man-hours have been invested to bring the software to its present state of maturity. The wisdom, know-how and research of the experts in the organization’s application domain are captured in it and deployed through it. As such, this so-called legacy code serves as an indispensable tool in the design or operational stages of the organization’s activities, or it is a key part of its product portfolio.

Legacy code

If successful, a computational software package can exist for many years and even decades. Over time, it evolves. New features and applications are added, prediction results are validated and improved, and problems are fixed. Experience with the software grows, as does the trust in its usability and reliability. The software package easily outlives the hardware it runs on. Often it also outlives the tenure of its developers and users within the organization. Thus, the package becomes a legacy treasure of great value.

At the same time, however, the code base grows and becomes more and more difficult to manage. The effort required to maintain and extend the software becomes substantial. This can reach the point where, despite the enormous investments already made, a from-scratch replacement seems the only way forward.

In this series of blogs we discuss how to deal with legacy code: how to keep it in a manageable state, and how to revive it if necessary. The series is based on our daily practice with legacy code over more than 25 years, as well as on the new insights and developments in software science that we have found useful.

Our view on legacy code

Our view on legacy code is characterized by three aspects:

Respect for the value. The intellectual value, the invested efforts and costs, the practical value for design and operation, the proven validity of the model predictions over the years, the user experience and trust: together they add up to the treasure of legacy code.
Understanding the history and the present. Feeling at home with the paradigms, languages and computing platforms of the past and the present is essential. This is required to understand how the legacy code has come to be, as well as to have a clear view of where it is (and should be) going. To name just a few examples of the old and the new:
- Paradigms: from procedural programming to object-oriented and functional programming. Each paradigm has its advantages and drawbacks, and various concepts can be ‘borrowed’ for improving the code.
- Languages: from Fortran, C, Delphi and Perl to C++, Java, Haskell and Python. In our opinion, an ‘old’ language is not necessarily an obsolete one. However, the threshold is higher for new generations of developers who are trained in the newer languages and in modern concepts of software science. Knowing both ‘old’ and ‘new’ lays the foundation for modernizing the legacy.
- Platforms:
  - From single-core CPUs to multi/many-core platforms with vectorization and pipelining optimizations, including GPUs and other accelerators.
  - From limited amounts of single-level memory that require economical and explicit data management to hierarchical cached memory. Or alternatively, to automatic memory management featuring garbage collection.
  - From cluster platforms requiring parallel computations featuring explicit data partitioning and data sharing protocols to cloud platforms running highly scalable and resilient stateless microservices.
  - Especially when the software is heavy on the computational load and when it is optimized for a specific platform, the migration to other platforms requires in-depth knowledge of both the ‘old’ and the ‘new’.
A clear vision of the shortcomings of aged software. Over time, the complexity of the software grows, and sub-optimal structures and patterns creep in. The aging of software can make the code increasingly hard to understand, extend and maintain. Especially for new developers, the learning curve becomes steeper and riddled with pitfalls. In the next blog of this series we list typical issues that are encountered with legacy code.

Further blogs in this series

In our second blog, we discuss

- typical undesirable features of legacy code that we have come across, and
- how these features have slipped into the code over time.

The last blog in this series will address

- what approaches can be taken to improve the sustainability of the code, and
- our general strategy to make the legacy treasure shine and prosper again.

If you want to be notified when new blogs in this series appear, or if you want to receive the full text as a single document, drop us an email at info@vortech.nl.

Can we help you?

If you are dealing with a situation around legacy code, feel free to contact us. We can assist you with consultancy, project management and hands-on work. Contact us through info@vortech.nl or +31(0)15 285 01 25 for an appointment with one of our experts.
