The Cost of Redundancy
Expensive resources spent on computer grids are often wasted. An example from the financial industry shows how repeated and redundant calculations can eat up all the available computing power. Theta Proxy, a simple yet generic solution, has the potential to save millions in server costs.
Nowadays much computation time is spent on recurring or similar tasks. Be it the reload button that you press twice just to make sure everything has loaded correctly, or the reboot cycle that repeats day after day with exactly the same result: whatever it is that your computer keeps you waiting for, the task has probably been solved before. Why is it that computers redo things rather than remembering and updating previous results?
Imagine for a moment that you came home from a shopping trip and realized that you had forgotten a carton of milk. Would you go back and just get the milk, or would you throw away all your shopping and start again? If you prefer the first solution, then you are probably not a programmer. Programmers, it seems, have a strong preference for the second choice.
In the early days of computer programming, serious effort was spent on making computations fast and efficient. Standard algorithms were developed for common tasks in linear algebra, data analysis, Fourier transformation and routing in various graph structures. These efforts peaked in the 1970s. Even today many of the most efficient programs are written in Fortran 77, a language standard that dates back to the late 1970s. Although new programming languages are constantly being hyped, the desire for fast algorithms is clearly in decline.
What does it take to make programs efficient?
First of all, the price of making algorithms fast is immense. While the task itself might be simple, efficiency is hard to achieve. Efficient algorithms have to keep track of intermediate results, predict whether they will be needed later and restore them when possible. Apparently, today’s programmers are too lazy to care: they just recompute everything and are done.
The biggest effort in the fight against redundant computation is currently concentrated on compiler optimization. Today’s compilers can detect algorithmic redundancies and provide a fix where coders were sloppy. In fact, they do an impressive job when it comes to simplifying terms like f(a,b) * f(a,b). As long as both occurrences of the subterm f(a,b) sit within the same piece of code, and f has no side effects, compilers can detect the duplicate request and evaluate it only once. However, as soon as the subterm repeats across a larger scope, compilers lack the creativity to store the result in a safe place and retrieve it as needed.
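A minimal sketch of what "storing the result in a safe place" can look like when the programmer does it by hand. It uses Python's standard functools.lru_cache as the store. The function f and the two report functions are hypothetical names chosen for illustration, and the approach assumes that f is pure, meaning its result depends only on its arguments.

```python
import functools

call_count = 0  # counts how often the body of f really runs


@functools.lru_cache(maxsize=None)
def f(a, b):
    """Hypothetical expensive, side-effect-free computation."""
    global call_count
    call_count += 1
    return a ** b + b ** a


def report_one(a, b):
    # Same subterm twice within one expression.
    return f(a, b) * f(a, b)


def report_two(a, b):
    # Same subterm again, but in a different scope.
    return f(a, b) + 1


r1 = report_one(2, 3)
r2 = report_two(2, 3)
print(r1, r2, call_count)
```

Although f(2, 3) is requested three times across two functions, its body runs only once; the other two requests are answered from the cache.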
There was one place where the costs of redundant computation could no longer be ignored: the internet. Since browsing the web became commonplace, web servers have been busy delivering more or less the same few pages to millions of visitors. To bring this problem down to a bearable level, web proxies were put into place. Their task is simple: keeping track of what users want, predicting what they will reuse and balancing the cost of storage against the probability of reuse.
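The trade-off between storage cost and probability of reuse can be illustrated with a small caching proxy. The sketch below assumes a least-recently-used (LRU) eviction rule as a stand-in for "predicting what they will reuse"; real web proxies use more elaborate policies. The names LruProxy and slow_origin are hypothetical.

```python
from collections import OrderedDict


class LruProxy:
    """Caches up to `capacity` responses and evicts the least recently used."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch          # function that contacts the origin server
        self.capacity = capacity    # storage budget, in number of entries
        self.store = OrderedDict()  # key order tracks recency of use
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            # Reuse the stored response and mark it as recently used.
            self.store.move_to_end(key)
            self.hits += 1
            return self.store[key]
        # Not stored: ask the origin, keep the answer, evict if over budget.
        self.misses += 1
        value = self.fetch(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # drop the least recently used entry
        return value


def slow_origin(url):
    """Hypothetical origin server; stands in for an expensive request."""
    return f"<html>content of {url}</html>"


proxy = LruProxy(slow_origin, capacity=2)
for url in ["/home", "/news", "/home", "/about", "/home", "/news"]:
    proxy.get(url)
print(proxy.hits, proxy.misses, list(proxy.store))
```

A larger capacity raises the hit rate at the price of more storage, which is exactly the balance a proxy has to strike.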
Aren’t there more places where proxies can cut the redundancy in computation?
The answer is yes. The Thetaris company pushes the boundary one step further. Rather than just storing the results of individual requests, the Theta Proxy software utilizes machine learning techniques to predict the results of similar requests. What sounds scary at first has a very practical application that pays off in the most literal sense.
Financial institutions have an insatiable appetite for huge computation clusters. All this computing power is spent on reevaluating the institution’s assets under certain hypothetical market conditions. Again, most of that power is wasted on redundancy.
As it turns out, financial valuation formulas are among the easiest things for a machine to learn. Why? The answer lies in the mathematical foundation of the financial formulas. They are based on the solution of so-called drift-diffusion equations. That means that their results are either trivial to compute or smooth. Hence, their solutions do not jump unexpectedly when the input parameters carry small approximation errors. This is an ideal playing field for the machine learning algorithm underlying the Theta Proxy.
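Why smoothness helps can be seen in a toy example. The sketch below is not the Theta Proxy algorithm, which the article does not describe; it swaps in plain linear interpolation as the simplest possible "learner". As the function to be learned it assumes the closed-form Black-Scholes price of a European call option, with strike, rate, volatility and maturity chosen arbitrarily for illustration.

```python
import math
from bisect import bisect_right


def black_scholes_call(spot, strike=100.0, rate=0.02, vol=0.2, maturity=1.0):
    """Black-Scholes call price; stands in for a slow valuation routine."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def cdf(x):
        # Standard normal cumulative distribution function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


class InterpolatedProxy:
    """'Trains' on a grid of inputs, then answers by linear interpolation."""

    def __init__(self, func, low, high, points):
        step = (high - low) / (points - 1)
        self.xs = [low + i * step for i in range(points)]
        self.ys = [func(x) for x in self.xs]  # the only expensive calls

    def __call__(self, x):
        # Find the grid interval containing x, clamped to the grid bounds.
        i = min(max(bisect_right(self.xs, x) - 1, 0), len(self.xs) - 2)
        x0, x1 = self.xs[i], self.xs[i + 1]
        w = (x - x0) / (x1 - x0)
        return (1.0 - w) * self.ys[i] + w * self.ys[i + 1]


proxy = InterpolatedProxy(black_scholes_call, low=50.0, high=150.0, points=101)

# Compare the proxy with the exact formula at spots that are not on the grid.
test_spots = [50.0 + 0.37 * k for k in range(270)]
max_error = max(abs(proxy(s) - black_scholes_call(s)) for s in test_spots)
print(f"largest absolute error: {max_error:.5f}")
```

With only 101 training evaluations the proxy reproduces the price to well under one cent over the whole range. A function with jumps would not be this forgiving.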
A typical application for banks arises from the requirements of the Basel II directive. This directive requires banks to evaluate around 300,000 potential scenarios for one typical investment type. With a single evaluation taking up to several seconds, the potential for improvement is enormous.
Theta Proxy XL is a simple Excel plugin that accelerates the computation of Excel functions by a factor of up to 1000. Even more surprising, using the plugin is essentially trivial: select the cell that computes slowly and press the training button. After training is finished, the sheet recalculates in a fraction of its original time.
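A back-of-the-envelope calculation puts the scenario count into perspective. The 3 seconds per evaluation and the 8-hour overnight window are assumptions chosen for illustration; the text only says "up to several seconds".

```python
import math

scenarios = 300_000            # scenario count quoted in the text
seconds_per_evaluation = 3.0   # assumption: "several seconds" taken as 3
overnight_window_hours = 8.0   # assumption: results needed by next morning

total_seconds = scenarios * seconds_per_evaluation
total_hours = total_seconds / 3600
total_days = total_hours / 24

# Number of cores needed to finish within the window, if work splits evenly.
cores_needed = math.ceil(total_hours / overnight_window_hours)

print(f"{total_hours:.0f} core-hours, i.e. {total_days:.1f} days on one core")
print(f"{cores_needed} cores to finish within {overnight_window_hours:.0f} hours")
```

Under these assumptions a single investment type already ties up 250 core-hours per run.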
“The efficiency of this simple plugin is astonishing”, says Mauricio I. González Evans, a global solution provider for the financial industry, “With only a few clicks my desktop turned into a high performance wonder machine.”
What is coming next?
It is evident that shorter innovation cycles and increased specialization in the software business have led to less efficient code. There simply is not enough time to develop efficient algorithms, and when time is not the limiting factor, the error rate is. Efficient code is always harder to debug and harder to verify.
Therefore we are left with only one choice: to come up with generic and widely applicable solutions that cut redundancies in a semi-automatic fashion. The success stories from Thetaris speak for themselves. Certainly there are more industrial applications to be found for Theta Proxy, the machine learning redundancy cutter.