# Credit Risk: Default and Loss Given Default from PRMIA

Great event from PRMIA on Tuesday evening of last week, entitled Credit Risk: The link between Loss Given Default and Default. The event was kicked off by Melissa Sexton of PRMIA, who introduced Jon Frye of the Federal Reserve Bank of Chicago. Jon seems to an acknowledged expert in the field of Loss Given Default (LGD) and credit risk modelling. I am sure that the slides will be up on the PRMIA event page above soon, but much of Jon’s presentation seems to be around the following working paper. So take a look at the paper (which is good in my view) but I will stick to an overview and in particular any anecdotal comments made by Jon and other panelists.

Jon is an excellent speaker, relaxed in manner, very knowledgeable about his subject, humourous but also sensibly reserved in coming up with immediate answers to audience questions. He started by saying that his talk was not going to be long on philosophy, but very pragmatic in nature. Before going into detail, he outlined that the area of credit risk can and will be improved, but that this improvement becomes easier as more data is collected, and inevitably that this data collection process may need to run for many years and decades yet before the data becomes statistically significant.

**Which Formula is Simpler?** Jon showed two formulas for estimating LGD, one a relatively complex looking formula (the Vasicek distribution mentioned his working paper) and the other a simple linear model of the a + b.x. Jon said that looking at the two formulas, then many would hope that the second formula might work best given its simplicity, but he wanted to convince us that the first formula was infact simpler than the second. He said that the second formula would need to be regressed on all loans to estimate its parameters, whereas the first formula depended on two parameters that most banks should have a fairly good handle on. The two parameters were Default Rate (DR) and Expected Loss (EL). The fact that these parameters were relatively well understood seemed to be the basis for saying the first formula was simpler, despite its relative mathematical complexity. This prompted an audience question on what is the difference between Probability of Default (PD) and Default Rate (DR). Apparently it turns out PD is the expected probability of default before default happens (so ex-ante) and DR is the the realised rate of default (so ex-post).

**Default and LGD over Time.** Jon showed a graph (by an academic called Altman) of DR and LGD over time. When the DR was high (lots of companies failing, in a likely economic downtown) the LGD was also perhaps understandably high (so high number of companies failing, in an economic background that is both part of the causes of the failures but also not helping the loss recovery process). When DR is low, then there is a disconnect between LGD and DR. Put another way, when the number of companies failing is low, the losses incurred by those companies that do default can be high or low, there is no discernable pattern. I guess I am not sure in part whether this disconnect is due to the smaller number of companies failing meaning the sample space is much smaller and hence the outcomes are more volatile (no averaging effect), or more likely that in healthy economic times the loss given a default is much more of random variable, dependent on the defaulting company specifics rather than on general economic background.

**Conclusions Beware: Data is Sparse.** Jon emphasised from the graph that the Altman data went back 28 years, of which 23 years were periods of low default, with 5 years of high default levels but only across 3 separate recessions. Therefore from a statistical point of view this is very little data, so makes drawing any firm statistical conclusions about default and levels of loss given default very difficult and error-prone.

**The Inherent Risk of LGD.** Jon here seemed to be focussed not on the probability of default, but rather on the conditional risk that once a default has occurred then how does LGD behave and what is the risk inherent from the different losses faced. He described how LGD affects i) Economic Capital – if LGD is more variable, then you need stronger capital reserves, ii) Risk and Reward – if a loan has more LGD risk, then the lender wants more reward, and iii) Pricing/Valuation – even if the expected LGD of two loans is equal, then different loans can still default under different conditions having different LGD levels.

**Models of LGD**.

Jon showed a chart will LGC plotted against DR for 6 models (two of which I think he was involved in). All six models were dependent on three parameters, PD, EL and correlation, plus all six models seemed to produce almost identical results when plotted on the chart. Jon mentioned that one of his models had been validated (successfully I think, but with a lot of noise in the data) against Moody’s loan data taken over the past 14 years. He added that he was surprised that all six models produced almost the same results, implying that either all models were converging around the correct solution or in total contrast that all six models were potentially subject to “group think” and were systematically all wrong in the ways the problem should be looked at.

Jon took one of his LGD models and compared it against the simple linear model, using simulated data. He showed a graph of some data points for what he called a “lucky bank” with the two models superimposed over the top. The lucky bit came in since this bank’s data points for DR against LGD showed lower DR than expected for a given LGD, and lower LGD for a given DR. On this specific case, Jon said that the simple linear model fits better than his non-linear one, but when done over many data sets his LGD model fitted better overall since it seemed to be less affected by random data.

There were then a few audience questions as Jon closed his talk, one leading Jon to remind everyone of the scarcity of data in LGD modelling. In another Jon seemed to imply that he would favor using his model (maybe understandably) in the Dodd-Frank Annual Stress Tests for banks, emphasising that models should be kept simple unless a more complex model can be justified statistically.

**Steve Bennet and the Data Scarcity Issue**

Following Jon’s talk, Steve Bennet of PECDC picked on Jon’s issue of scare data within LGD modelling. Steve is based in the US, working for his organisation PECDC which is a cross border initiative to collect LGD and EAD (exposure at default) data. The basic premise seems to be that in dealing with the scarce data problem, we do not have 100 years of data yet, so in the mean time lets pool data across member banks and hence build up a more statistically significant data set – put another way: let’s increase the width of the dataset if we can’t control the depth.

PECDC is a consortia of around 50 organisations that pool data relating to credit events. Steve said that capture data fields per default at four “snapshot” times: orgination, 1 year prior to default, at default and at resolution. He said that every bank that had joined the organisation had managed to improve its datasets. Following an audience question, he clarified that PECDC does not predict LGD with any of its own models, but rather provides the pooled data to enable the banks to model LGD better.

Steve said that LGD turns out to be very different for different sectors of the market, particularly between SMEs and large corporations (levels of LGD for large corporations being more stable globally and less subject to regional variations). But also there is great LGD variation across specialist sectors such as aircraft finance, shipping and project finance.

Steve ended by saying that PECDC was orginally formed in Europe, and was now attempting to get more US banks involved, with 3 US banks already involved and 7 waiting to join. There was an audience question relating to whether regulators allowed pooled data to be used under Basel IRB – apparently Nordic regulators allow this due to needing more data in a smaller market, European banks use the pooled data to validate their own data in IRB but in the US banks much use their own data at the moment.

**Til Schuermann**

Following Steve, Til Schuermann added his thoughts on LGD. He said that LGD has a time variation and is not random, being worse in recession when DR is high. His stylized argument to support this was that in recession there are lots of defaults, leading to lots of distressed assets and that following the laws of supply and demand, then assets used in recovery would be subject to lower prices. Til mentioned that there was a large effect in the timing of recovery, with recovery following default between 1 and 10 quarters later. He offered words of warning that not all defaults and not all collateral are created equal, emphasising that debt structures and industry stress matter.

**Summary**

The evening closed with a few audience questions and a general summation by the panelists of the main issues of their talks, primarily around models and modelling, the scarcity of data and how to be pragmatic in the application of this kind of credit analysis.