4 July 2012 By Marc Henrard
In quantitative finance most of the modelling and development time is spent on models and algorithms to compute present values from a set of inputs. The art is to obtain efficiently the value of the financial instruments from a (relatively large) data set (often between 10 and 200 data points).
On the other side, when it comes to CPU time, most of it is spent on computing the derivatives of the value with respect to the inputs: the so-called greeks. A traditional way to compute them is to use a finite difference, or "bump and recompute" approach: each input is bumped by a small amount, and the value is recomputed. The derivative ratios are computed from the bumped values and the initial value.
This is a very simple approach: you only need the initial algorithm and a loop to bump all the inputs. Unfortunately, it is usually not an efficient algorithm. For each input you need to recompute the value. For a 50 inputs problem, the time required to compute the first order derivatives is 50+1 value time. Moreover, some numerical stability issues can lead to poor quality results.
In this blog post, I'll quickly describe the theory of Algorithmic Differentiation and how we implemented it at OpenGamma. In the second part, I'll describe the efficiency obtained in practice and then go one step further than the standard implementation in finance and describe how to cope with model calibration.
The theory of algorithmic differentiation is well known and has been used in computer science for a long time1. It is also called the art of differentiating programs. The idea is that the steps to compute the value are recorded and 'replayed' with the required modifications to obtain the derivatives. From a mathematical point of view, the technique uses the chain rule on a repeated basis.
Algorithmic differentiation comes in two flavours: the standard or forward mode, and the adjoint or reverse mode. The two differ in the way the differentiation algorithm works: from the inputs to the outputs for the forward mode, and from the outputs to the inputs for the reverse mode.
In the sequel I measure the efficiency of derivative computations methods by computing the ratio of the time required to compute the value and all its derivatives to the time required to compute the value only. If the time ratio is 1, it means that computing the derivatives is free (only a dream, not an expectation); for the finite difference, the ratio scales linearly with the number of inputs.
The time ratio of the standard mode is less than twice the number of inputs (see Griewank and Walther (2008) for the details):
The time ratio of the adjoint algorithm is less than four times the number of outputs:
In the problem we are facing in finance, the input is usually of dimension between 10 and 200 and the output is of dimension 1 (the present value). The choice between the two flavours and the benefits are clear: using the adjoint implementation, the computation time can be divided by ten or more.
One of the first systematic implementations of the technique in quantitative finance, which sparked more research and development, was Giles & Glasserman (2006). Algorithmic Differentiation was used to compute the greeks in the Libor Market Model with Monte Carlo. Monte Carlo simulations are usually relatively slow and the Libor Market Model implementation requires the computation of path dependent drift terms that make them even slower. Obtaining a more efficient implementation of greeks computation in that context is certainly a big advantage.
If the technique is so efficient, why is it not systematically implemented for the most time-consuming tasks?
One of the reasons is certainly that you cannot start from the top and decide to apply it to your more complex task. The technique requires the computation of the derivative of each line of code. It means that each method used should have its derivative version; one has to start from the very bottom. Each method used in the complex task, down to the simplest interpolations, should have a derivative version before even starting to look at the top task. If the library has not been designed with that aspect in mind, it is a long and time-consuming task to, a posteriori, redevelop the missing parts. On the other side, computing the derivative in an explicit way (by opposition to a finite difference way) is relatively natural for functions whose derivatives are used on a regular basis and important building blocks may already be present, even if not under the name of Algorithmic Differentiation.
The implementation of algorithmic differentiation can be done in several ways:
Our implementation is based on manually written code. There are several reasons for that. It gives us the flexibility to use financial domain specific knowledge to simplify the computations when possible. Understanding the intuition behind some variables can lead to significant efficiency gain. As I will show below, this can lead to results significantly faster than the theoretical upper bound and a direct implementation.
Our libraries are written in Java. This imposes some constraints on the implementation and in particular on overloading operators.The overloading approach was not a route we could take easily.
When we started to explicitly use the term Algorithmic Differentiation and put it in place for almost all of the pricing functions, we looked through the existing code. Most of the building blocks already had a derivative version. This was, in particular, the case for anything related to curve construction. The curves construction in a multi-curves environment with enough flexibility to use any liquid instrument naturally comes down to root finding exercises with a large number of unknowns (easily 50 to 70). To solve those, an efficient gradient implementation is required. Without the efficient gradient implementation, in one form or another, the results would be quite poor.
From there it is a matter of continuing adding the derivative versions for each new model. A priori this may seem like a lot of work, roughly doubling the amount of code. In practice, when the price code is cleanly written and fully tested, adding the differentiation version is relatively straightforward and quick, certainly a lot quicker than writing the original pricing code.
My next post will deal with the results obtained in practice and some extension to the standard implementation in finance.
For more information on algorithmic differentiation:
Adjoint Algorithmic Differentiation: Calibration and Implicit Function Theorem (PDF)
Webinar: Algorithmic Differentiation in Finance - Fast Greeks and Beyond
Giles, M. and Glasserman, P. (2006). Smoking adjoints: fast Monte Carlo Greeks. Risk, 2006, 19, 88-92
Griewank, A. and Walther, A. (2008). Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, second edition.
1As a reference on the subject, I recommend: Griewank and Walther (2008)