Saturday, February 4, 2012

A Valuable Lesson I Learned Doing a Credit Default Model

 By Kirk Harrington

As some of you know, I have worked on marketing and credit risk models. This past week, I was finishing up work on a credit default model (to produce predictions for the Allowance for Loan Loss provision) and learned a valuable lesson.

The lesson was this: I had prepared a report with average probabilities by risk group, below a given percentage and above it (basically low and high risk). I also split these into delinquency groups: one group was under 30 days delinquent and the other 30 or 60 days (I was modeling who reached 90+ days delinquent within a 12-month window).
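A minimal sketch of that kind of report, assuming hypothetical column names and a made-up probability threshold (the post doesn't give the actual cutoffs):

```python
import pandas as pd

# Hypothetical scored accounts: model probability plus delinquency group.
scored = pd.DataFrame({
    "pred_prob":  [0.01, 0.03, 0.08, 0.12, 0.02, 0.09],
    "delq_group": ["<30", "<30", "<30", "30-60", "30-60", "<30"],
})

# Split into high/low risk at an illustrative threshold.
threshold = 0.05
scored["risk_group"] = scored["pred_prob"].apply(
    lambda p: "high" if p >= threshold else "low"
)

# Average probability by delinquency group and risk group.
report = scored.groupby(["delq_group", "risk_group"])["pred_prob"].mean()
print(report)
```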

One thing that was odd about the average probabilities was that the 30-or-60 delinquency group's probability seemed very low. In fact, when I tallied the actual losses by account count and dollar amount, something seemed off: I was predicting a very low number of bad accounts for a group whose actuals were much higher.
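That predicted-versus-actual check can be sketched like this (all data and column names here are hypothetical, just to show the shape of the comparison):

```python
import pandas as pd

# Hypothetical account-level data: model probability, delinquency group,
# and the observed 12-month outcome (1 = reached 90+ days delinquent).
accounts = pd.DataFrame({
    "delq_group": ["<30", "<30", "<30", "30-60", "30-60"],
    "pred_prob":  [0.02, 0.05, 0.01, 0.04, 0.06],
    "went_bad":   [0,    0,    0,    1,    0],
})

# Compare expected defaults (sum of probabilities) to actual defaults
# within each delinquency group.
check = accounts.groupby("delq_group").agg(
    expected=("pred_prob", "sum"),
    actual=("went_bad", "sum"),
    n=("delq_group", "size"),
)
print(check)
```

In this toy data the 30-60 group's expected count (0.10) sits far below its actual count (1), the same mismatch the report surfaced.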

This led me to build a decision tree (I was using CHAID for this exercise) on each of the two delinquency groups. What I found was that 5 of the 8 predictors were being used in the <30 delinquency group and none of the 8 in the 30-or-60 group. This was a clue that these two populations were VERY different. Further, the average probabilities across nodes were sharply different (the probabilities of going bad were MUCH higher for the 30-or-60 delinquency group).

What was happening (and I verified this with other numbers) is that, node by node, these special delinquency groups' probabilities were being driven by the majority population. In most nodes, the proportions were about 99 to 1 (99% <30 delinquency, 1% in 30-to-60).
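A quick bit of arithmetic shows why a 99-to-1 node mix buries the minority group (the 1% and 40% bad rates below are illustrative, not the post's figures):

```python
# Sketch of how a 99:1 mix inside a tree node hides the minority group.
# Assume the <30 group in a node has a 1% bad rate, the 30-60 group 40%.
n_low, p_low = 990, 0.01     # 99% of the node, low-risk
n_high, p_high = 10, 0.40    # 1% of the node, high-risk

# The node's blended bad rate, which every account in the node inherits.
node_rate = (n_low * p_low + n_high * p_high) / (n_low + n_high)
print(node_rate)  # ~0.0139: the node average sits near the majority's 1%
```

So the 30-60 accounts in this node get scored at roughly 1.4% even though their true rate is 40%, exactly the understated probabilities the report showed.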

This was clearly an example of a population 'within' the majority fouling up the probabilities in my end report. To rectify this, I simply built a 'forced-split' tree, with the delinquency split at the top. The probabilities came out much more stable, and all my model diagnostic statistics (e.g., KS, Gini) improved as well.
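A sketch of the forced-split idea, using scikit-learn decision trees as a stand-in for CHAID (the synthetic data and all parameters below are assumptions): splitting on delinquency first is equivalent to fitting one tree per delinquency segment and scoring each account with its own segment's model.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic data: 8 predictors plus a delinquency flag (1 = 30-60 days).
n = 2000
X = rng.normal(size=(n, 8))
delq = (rng.random(n) < 0.05).astype(int)     # small 30-60 segment (~1:20)
p = np.where(delq == 1, 0.35, 0.02)           # very different true bad rates
y = (rng.random(n) < p).astype(int)

# "Forced split": fit a separate tree inside each delinquency segment
# instead of letting one pooled tree drown out the minority segment.
models = {}
for g in (0, 1):
    mask = delq == g
    m = DecisionTreeClassifier(max_depth=3, min_samples_leaf=25)
    m.fit(X[mask], y[mask])
    models[g] = m

# Score each account with its own segment's model.
probs = np.empty(n)
for g, m in models.items():
    mask = delq == g
    probs[mask] = m.predict_proba(X[mask])[:, 1]

print(probs[delq == 1].mean(), probs[delq == 0].mean())
```

With the segments modeled separately, the 30-60 accounts' average predicted probability tracks their own (much higher) bad rate instead of the majority's.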



