What was said last
My last post was about the theory of algorithmic differentiation and how we implemented it at OpenGamma.
This post is about the efficiency we obtained in practice for algorithmic differentiation computation of greeks. It also explains how we went one step further with respect to the standard implementation by analysing the model calibration question.
Efficiency in practice
The efficiency described previously is the theoretical efficiency; in practice the code may not be perfect, or some easier-to-maintain choices may be selected in some places. What is the impact of the practice on the theoretical figures mentioned above? Maybe surprisingly, the results are not only very good, but in a lot of cases well below the theoretical upper bound.
In this section we take three examples, in increasing level of sophistication. The numbers are obtained with the OpenGamma OG-Analytics library performing valuation of a large number of instruments.
One function used a lot in finance is the Black option price formula. The formula and its derivatives are implemented in the class BlackPriceFunction. For that function, the computation times for the price only and for the price together with its three derivatives (with respect to forward, strike and volatility) are (times for 1,000,000 prices):
| Computation | Time |
|---|---|
| Price | 170 ms |
| Price and 3 derivatives | 195 ms |
This is a ratio of 1.15 to 1 (only 15% more time for four numbers instead of one). To understand how such a result is possible, one has to remember that algorithm-specific information can be used. In this case the exercise boundary (often denoted d1) is an optimal boundary, in the sense that it maximises the price, and the derivative with respect to a variable at its optimum is 0. That information can be used to speed up the computation: there is no need to compute algorithmically a number we know is 0. With a similar implementation but without the algorithm-specific information, the time is 245 ms (a ratio of 1.45 to 1), so the optimisation saves roughly 30% of the pricing time.
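As an illustration of the pattern (a sketch in our own notation, not the actual BlackPriceFunction code), a Black call price returning the price and its three derivatives in a single call could look like this; the normal CDF approximation and the method name are ours:

```java
// Hypothetical sketch of a Black price with its three derivatives computed
// together; not the OpenGamma BlackPriceFunction implementation.
public final class BlackSketch {

  // Standard normal density.
  private static double pdf(double x) {
    return Math.exp(-0.5 * x * x) / Math.sqrt(2.0 * Math.PI);
  }

  // Standard normal CDF, Abramowitz-Stegun 26.2.17 approximation (|error| < 7.5e-8).
  private static double cdf(double x) {
    double t = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
    double poly = t * (0.319381530 + t * (-0.356563782
        + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
    double n = 1.0 - pdf(x) * poly;
    return x >= 0 ? n : 1.0 - n;
  }

  /**
   * Black call price and its derivatives with respect to forward, strike and
   * volatility. Returns {price, dPrice/dForward, dPrice/dStrike, dPrice/dVol}.
   * Because the derivative of the price with respect to the optimal boundary
   * d1 is 0 (equivalently, forward * pdf(d1) == strike * pdf(d2)), the three
   * derivatives collapse to the closed forms below at almost no extra cost.
   */
  public static double[] priceAdjoint(double forward, double strike, double vol,
      double t, double df) {
    double sqrtT = Math.sqrt(t);
    double d1 = (Math.log(forward / strike) + 0.5 * vol * vol * t) / (vol * sqrtT);
    double d2 = d1 - vol * sqrtT;
    double nd1 = cdf(d1);
    double nd2 = cdf(d2);
    double price = df * (forward * nd1 - strike * nd2);
    double dForward = df * nd1;                   // delta with respect to the forward
    double dStrike = -df * nd2;                   // dual delta
    double dVol = df * forward * pdf(d1) * sqrtT; // vega
    return new double[] {price, dForward, dStrike, dVol};
  }
}
```

The only extra work on top of the price is one density evaluation; the expensive CDF calls are shared, which is why the measured overhead is so small.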
The second example is an interest rate derivative. The price of a swaption (with physical settlement) is computed in the Hull-White model. The methods are implemented in the class SwaptionPhysicalFixedIborHullWhiteMethod. The derivatives are with respect to all the rates on the yield curves used to compute the cash-flows (2 curves, quarterly payments for 5 years = 40 sensitivities). The computation times are (times for 10,000 swaptions):
| Computation | Time |
|---|---|
| Present value | 90 ms |
| Present value and curve sensitivity (40 derivatives) | 320 ms |
Here, again, the ratio is below the theoretical upper bound: 3.5 to 1.
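The reverse-sweep pattern behind such curve sensitivities can be shown on a deliberately simple stand-in (the toy pricer below is ours, not the OpenGamma method): pricing a strip of cash-flows off zero-coupon rates, with all rate sensitivities recovered in a single backward pass rather than one repricing per rate:

```java
// Illustrative adjoint sweep for curve sensitivities; a toy stand-in for the
// pattern used in the real pricing methods, not the OpenGamma code.
public final class CurveSensitivitySketch {

  /**
   * Present value of cash-flows c[i] paid at times t[i], discounted off
   * continuously compounded zero rates r[i], together with the sensitivity of
   * the present value to every rate. Returns an array of length n + 1:
   * {pv, dPv/dr[0], ..., dPv/dr[n-1]}.
   */
  public static double[] pvAdjoint(double[] c, double[] t, double[] r) {
    int n = c.length;
    double[] df = new double[n];
    double pv = 0.0;
    // Forward sweep: the price itself.
    for (int i = 0; i < n; i++) {
      df[i] = Math.exp(-r[i] * t[i]);
      pv += c[i] * df[i];
    }
    // Reverse sweep: one pass yields all n sensitivities, instead of n
    // bump-and-reprice runs.
    double pvBar = 1.0; // adjoint of the output
    double[] out = new double[n + 1];
    out[0] = pv;
    for (int i = n - 1; i >= 0; i--) {
      double dfBar = c[i] * pvBar;        // dPv/dDf[i]
      out[i + 1] = -t[i] * df[i] * dfBar; // dPv/dr[i]
    }
    return out;
  }
}
```

The cost of the reverse sweep is a small multiple of the forward sweep regardless of n, which is what keeps the measured ratio well below the number of sensitivities.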
The last example relates to CMS spread options. The options are priced with a binormal model on swap rates with strike-dependent volatility, calibrated to CMS prices obtained by replication on the SABR smile. The methods are implemented in the class CapFloorCMSSpreadSABRBinormalMethod. In the example we use a 10Y-2Y spread. The times are (times for 100 spread options):
| Computation | Time |
|---|---|
| Present value | 40 ms |
| Curve sensitivity (80 derivatives) | 235 ms |
| SABR risks (4 derivatives) | 185 ms |
In this case the ratios are not as good (although the curve sensitivity is still around 10 times faster than by finite difference). One reason is that the sensitivities are computed by recomputing the numerical integral used in the replication.
One step further: calibration and implicit functions
One step that can be quite time consuming when pricing an exotic instrument is the model calibration. The process is as follows:
- the price of an exotic instrument is related to a specific basket of vanilla instruments;
- the price of those vanilla instruments is computed in a given base model;
- the complex model parameters are calibrated to fit the vanilla option prices from the base model. This step is usually done through a generic numerical equation solver; and
- the exotic instrument is priced with the calibrated complex model.
In the algorithmic differentiation process, we suppose that the pricing algorithms and their derivatives are implemented. We want to differentiate the exotic price with respect to the parameters of the base model. In the bump-and-recompute approach, this corresponds to computing the risks with model recalibration. As the calibration can represent an important fraction of the total computation time of the procedure described above, an alternative method is highly desirable. In general, the calibration process is not explicit but done through numerical equation solving. We have only one set of parameters of the calibrated model: the set that matches the prices from a specific set of curve and base model parameters. There is no explicit algorithm that provides the calibrated model parameters as a function of the base model, let alone its adjoint algorithmic differentiation. In a recent paper (Henrard (2011)) I described a way to combine the calibration procedure efficiently with the algorithmic differentiation through the implicit function theorem. With that approach the results are extremely good.
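To make the idea concrete, here is a deliberately tiny one-dimensional sketch (all the model functions are made up for the illustration, not taken from the paper): the complex model parameter is calibrated with a Newton solver, and the total sensitivity of the exotic price to the base parameter, including the calibration, is then obtained from the implicit function theorem with no recalibration:

```java
// Toy one-dimensional illustration of sensitivities through calibration via
// the implicit function theorem. All model functions are invented for the
// example; the pattern, not the models, is the point.
public final class ImplicitCalibrationSketch {

  // Base-model vanilla price as a function of the base parameter p.
  static double vanillaBase(double p) { return p; }

  // Complex-model vanilla price as a function of the complex parameter theta.
  static double vanillaComplex(double theta) { return theta * theta; }
  static double vanillaComplexDTheta(double theta) { return 2.0 * theta; }

  // Exotic price in the complex model and its derivative.
  static double exotic(double theta) { return theta * theta * theta; }
  static double exoticDTheta(double theta) { return 3.0 * theta * theta; }

  // Calibration: solve f(theta, p) = vanillaComplex(theta) - vanillaBase(p) = 0
  // with Newton's method, as a generic numerical solver would.
  static double calibrate(double p) {
    double theta = 1.0;
    for (int i = 0; i < 50; i++) {
      theta -= (vanillaComplex(theta) - vanillaBase(p)) / vanillaComplexDTheta(theta);
    }
    return theta;
  }

  /**
   * Exotic price and its total sensitivity to the base parameter p. The
   * implicit function theorem applied to f(theta(p), p) = 0 gives
   * dTheta/dp = -(df/dTheta)^{-1} (df/dp), so the calibrated parameter's
   * sensitivity comes from quantities already available, not a recalibration.
   */
  public static double[] priceAndSensitivity(double p) {
    double theta = calibrate(p);
    double dfDTheta = vanillaComplexDTheta(theta); // df/dTheta
    double dfDp = -1.0;                            // df/dp = -d vanillaBase/dp
    double dThetaDp = -dfDp / dfDTheta;
    return new double[] {exotic(theta), exoticDTheta(theta) * dThetaDp};
  }
}
```

In the multi-dimensional case dfDTheta becomes the Jacobian of the calibration equations, and the division becomes one linear solve, which is still far cheaper than re-running the calibration once per input.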
The most striking results of the paper are reproduced here. The prices of amortised swaptions are computed in a Libor Market Model calibrated to a set of SABR models for vanilla swaptions of different tenors. The sensitivities computed are those of the price with respect to the curves and to the SABR model parameters. The amortisation scheme of the swaption is annual, so for each year there is an underlying SABR model for the vanilla swaptions with its own set of parameters, and we would like to compute the sensitivity of the final price to those initial inputs, with the understanding that the sensitivity is the total sensitivity, including the calibration mechanism.
The table below is for a 10-year swaption with annual amortisation and a semi-annual floating leg in a multi-curve setting. There are around 40 interest rate sensitivities (2 curves for a 10-year instrument with 2 payments a year) and 30 SABR parameters (10 calibration instruments with 3 SABR parameters each). The results are (times in seconds for 250 swaptions; the "Risks" time is the additional time, on top of the value time, required to obtain that risk type):
| Risk type | Method | Value | Risks | Total |
|---|---|---|---|---|
| SABR | AAD and implicit function | 0.425 | 0.075 | 0.500 |
| Curve | AAD and implicit function | 0.425 | 0.315 | 0.740 |
| Curve and SABR | Finite difference | 0.425 | 72 x 0.425 | 31.025 |
| Curve and SABR | AAD and implicit function | 0.425 | 0.320 | 0.745 |
The same results can be computed for swaptions with different tenors. In Figure 1, we represent the results graphically for tenors between 2 and 30 years. In our implementation, the time ratio for all rate and parameter risks is below 2.5 in all cases (and often below 2), while in the finite difference case the ratio grows linearly with the tenor, ranging from 14 (2 years with 7 risks per year) to 210 (30 years with 7 risks per year). At the extreme, the computation time is divided by almost 100.
What about higher order?
Higher order derivatives can also be useful; unfortunately, algorithmic differentiation does not extend easily to them. One can take the derivative of the derivative, but consider an n-input, 1-output function with a cost of 1: the first order derivative is an n-input, n-output function with a cost of 4 (using AAD), and the second order an n-input, n x n-output function with a cost of 4 x n x 4. That is not faster than applying a finite difference mechanism to the first order derivatives, at a cost of n x 4 (one-sided, or 2 x n x 4 for a symmetric difference ratio).
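That finite-difference-on-the-gradient approach can be sketched as follows; the example function and bump size are ours, with the analytic gradient standing in for what an adjoint implementation would return:

```java
// Sketch: second order risks obtained by one-sided finite differences of a
// first order (adjoint-style) gradient. The function is illustrative only.
public final class HessianByGradientFd {

  // Example function f(x) = x0^2 * x1 + sin(x1), with its analytic gradient
  // (the stand-in for an adjoint first order implementation).
  static double[] gradient(double[] x) {
    return new double[] {2.0 * x[0] * x[1], x[0] * x[0] + Math.cos(x[1])};
  }

  /**
   * Hessian by bumping each input and differencing the gradient: n extra
   * gradient evaluations, i.e. roughly the n x 4 cost quoted above.
   */
  public static double[][] hessian(double[] x, double eps) {
    int n = x.length;
    double[] g0 = gradient(x);
    double[][] h = new double[n][n];
    for (int j = 0; j < n; j++) {
      double saved = x[j];
      x[j] = saved + eps;          // bump input j
      double[] gUp = gradient(x);
      x[j] = saved;                // restore it
      for (int i = 0; i < n; i++) {
        h[i][j] = (gUp[i] - g0[i]) / eps; // column j of the Hessian
      }
    }
    return h;
  }
}
```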
We have generally not implemented second order derivative mechanisms at OpenGamma. The exceptions are the Black and SABR formulas. In the SABR extrapolation, to obtain a probability distribution without a point mass, we extrapolate the price with a continuous second order derivative; for this we need an efficient second order implementation. When we compute the derivatives of the extrapolated price with respect to the inputs, we in effect need a third order derivative with respect to some parameters (two orders for the smooth extrapolation and one for the sensitivity). In that case we used explicit second order SABR derivatives and finite differences for the third order.
We have systematically implemented algorithmic differentiation (generally in adjoint mode) throughout the OG-Analytics library. We already had a good code base, including some derivative building blocks, when we decided to adopt adjoint algorithmic differentiation formally. We implemented it through manually written code, and the results are very good. We have also implemented implicit-function-based methods to cope with model calibration; there the results are even better.
More information on Algorithmic Differentiation
Giles, M. and Glasserman, P. (2006). Smoking adjoints: fast Monte Carlo Greeks. Risk, 19:88-92.
Griewank, A. and Walther, A. (2008). Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, second edition.
Henrard, M. (2011). Adjoint Algorithmic Differentiation: Calibration and Implicit Function Theorem. Journal of Computational Finance, to appear. Preprint version at http://ssrn.com/abstract=1896329.