By Kirk Harrington
The dependent variable in a model is extremely important, yet its accuracy is sometimes overlooked, particularly when it involves timing and is meant to explain behavior. What is predicted is a key part of the model, and it drives every prediction that comes out of it. An incorrectly defined dependent variable can also distort the estimates for the independent variables that are fit against it.
Consider a dependent variable for a default model (logistic time series):
If your objective is to explain account behavior leading up to the end of the behavior cycle, that end point is the best timing for the dependent variable. In this example it could be the last episode of an account going into 90 days past due before it starts the default cycle of more than 90 days. This point varies by account, but the independent variables leading up to it are the most reflective of the account's behavior at that 'time'.
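As a rough sketch of how that end of behavior cycle point might be located, the Python below scans one account's delinquency history and returns the last period at or below 90 days past due before the final run above 90 days begins. The function name `end_of_behavior_cycle`, the `(period, days_past_due)` layout, and the sample histories are assumptions made for illustration, not part of any particular system.

```python
from typing import Optional, Sequence, Tuple


def end_of_behavior_cycle(
    history: Sequence[Tuple[int, int]], threshold: int = 90
) -> Optional[int]:
    """Return the last period at or below `threshold` days past due that
    immediately precedes the account's final run above `threshold`.

    `history` is assumed to be sorted by period. Returns None when the
    account does not end in the default cycle, or when there is no
    observation before the default cycle starts.
    """
    anchor = None
    in_default = False
    prev_period = None
    for period, days_past_due in history:
        if days_past_due > threshold:
            if not in_default:
                # First period of a new default run: anchor on the prior period.
                anchor = prev_period
                in_default = True
        else:
            # Account is at or below the threshold: any earlier run has cured.
            in_default = False
            anchor = None
        prev_period = period
    return anchor if in_default else None


# Hypothetical histories: (period, days past due)
straight_to_default = [(1, 0), (2, 30), (3, 60), (4, 90), (5, 120), (6, 150)]
cured_then_default = [(1, 0), (2, 120), (3, 30), (4, 90), (5, 120)]
never_default = [(1, 0), (2, 30)]

print(end_of_behavior_cycle(straight_to_default))
print(end_of_behavior_cycle(cured_then_default))
print(end_of_behavior_cycle(never_default))
```

Accounts that cure and never re-enter the default cycle return `None` here, which is one possible way to mark a non-event; your own definition may differ.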
If you push the dependent variable farther out, say to when an account falls off the books, the timing is in neither the account's control nor the control of the organization doing the charge-off, and any independent variables appended to predict it will have trouble settling into clear positive or negative correlations with the dependent variable.
After the point of 'no control' (in this case, when an account falls off the books during the default cycle), many things can happen. There can be a 'blitz' of charge-offs at the end of a given month, quarter, or year. A charge-off can take a long time to finally happen because of legal matters and processing. The organization may not have a set charge-off policy, so the timing can vary widely. 'Normalizing' the dependent variable around the end of behavior cycle point, rather than around when the account 'falls off the books', will improve your model's ability to capture account behavior before it reaches the point of 'no control'.
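To illustrate the difference in alignment, here is a minimal sketch with two made-up accounts that reach the end of their behavior cycle in the same period but are charged off at very different times. The account data, the field names, and the `observation_window` helper are all hypothetical.

```python
from typing import Dict, List


def observation_window(end_period: int, lookback: int) -> List[int]:
    """Periods whose independent variables feed the model, ending at end_period."""
    return list(range(end_period - lookback + 1, end_period + 1))


# Hypothetical accounts: same behavior end point, different charge-off timing.
accounts: Dict[str, Dict[str, int]] = {
    "A": {"anchor": 4, "charge_off": 7},
    "B": {"anchor": 4, "charge_off": 13},
}

lookback = 3  # assumed number of periods of history used per account

# Windows aligned to the end of behavior cycle point.
anchor_windows = {
    name: observation_window(acct["anchor"], lookback)
    for name, acct in accounts.items()
}
# Windows aligned to the charge-off ("falls off the books") date.
charge_off_windows = {
    name: observation_window(acct["charge_off"], lookback)
    for name, acct in accounts.items()
}
# Lag between the behavior end point and the charge-off, per account.
lags = {name: acct["charge_off"] - acct["anchor"] for name, acct in accounts.items()}

for name in accounts:
    print(name, anchor_windows[name], charge_off_windows[name], lags[name])
```

Anchored at the end of the behavior cycle, both accounts contribute the same pre-default periods. Anchored at charge-off, the windows sit entirely inside the default cycle and drift with whatever delayed the charge-off.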
~~~
Effective Analyst Thought:
If you are in a position to consult with a client as an analyst, don't just ask how they want something measured and accept the answer without question, especially if the person you are dealing with does not have a modeling background. Ask more questions, learn their data a bit more, and work with the client to determine the best measurement that is statistically sound. It is better to hash this out at the beginning than later, after the model is built and already in use.
