How could I visualise the importance of different inputs to the forecast for a black-box non-linear model?


Posted on 16 Aug 2022 under General Tech, Learning Aids/Tools.


Answers (1)

manpreet (Best Answer), 2 years ago

 

I am building an interactive forecast tool (in Python) as an aid to the forecasting that is done in my organisation. To date the forecast process has been largely human driven, with forecasters assimilating the data in their natural neural networks and using their learned gut feel to make predictions. From a long-term forecast verification and predictive modelling study I've done, I've found what you might expect: different forecasters exhibit different biases, the effects of some predictors seem to be overstated while other important ones seem to be ignored, and in general the forecast performance is mediocre compared with relatively simple empirical models.

The forecasts will continue to be manual, but I am trying to build a useful tool to give the forecasters a better quantification of the relative effects of predictors. There are also important effects, such as seasonal influences, that are often overlooked and that I would like the tool to highlight to the user. I am expecting a degree of backlash and scepticism about the modelling process from some of the more 'experienced' forecasters (many of whom have little formal knowledge of statistics), so the communication is at least as important as the model performance itself in terms of achieving a measurable improvement in forecast accuracy.

The models I'm developing have a strong auto-regressive (AR) component that is at times modified significantly by events. These events show up as measured values in some predictors that are close to zero during non-event times. This accords with the mental model that forecasters use. The key part is being able to demonstrate which of the 'event' measurements are most influential in driving the prediction away from the auto-regressive value for any given forecast. I imagine the process in this way: the forecaster divines their best-guess value, the model suggests a different one, and the forecaster asks why. The model replies something like "see here, this value of this predictor increases the forecast value in Summer. If it was Winter, it would move the other way. I know there are these other measurements, but they have much less effect than this one".

Now, imagine the model was a simple linear regression. One could display the relative 'effect' of each event-based predictor by multiplying its value by the model coefficient and showing the results as a simple bar chart. The bars from the different predictors add up to the total deviation from the AR value, and this succinctly and clearly shows which ones are, in this instance, having a strong influence.
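To make the linear case concrete, here is a minimal sketch of the coefficient-times-value calculation described above. The predictor names, coefficients and measurements are made up for illustration and are not taken from any real model.

```python
import numpy as np

def linear_contributions(coefs, event_values):
    """Per-predictor contribution to the deviation from the AR value.

    For a linear model each contribution is coefficient * value, and the
    contributions sum exactly to the total deviation.
    """
    coefs = np.asarray(coefs, dtype=float)
    event_values = np.asarray(event_values, dtype=float)
    return coefs * event_values

# Hypothetical fitted coefficients and one forecast's event measurements.
names = ["event_a", "event_b", "event_c"]
coefs = np.array([2.0, -0.5, 0.1])
x = np.array([1.5, 4.0, 0.0])

contrib = linear_contributions(coefs, x)
total_deviation = contrib.sum()

# Text stand-in for the bar chart: one signed bar per predictor.
for name, c in zip(names, contrib):
    print(f"{name:8s} {c:+.2f}")
print(f"{'total':8s} {total_deviation:+.2f}")
```

The `contrib` array is what would feed the bar chart (for example via matplotlib's `plt.bar(names, contrib)`), with the bars summing to `total_deviation`.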

The problem is that the process being forecast displays a high degree of non-linearity in the predictors, or at least, I have had much more success with black-box non-linear machine learning algorithms (random forest and GBM) than with GLMs for this data set. Ideally I would like to be able to seamlessly change the model working 'under the hood' without the user experience changing, so I need some generic, simple way of demonstrating the importance of the different measurements that does not rely on an algorithm-specific approach. My current plan is to quasi-linearise the effects: set all predictor values to zero except one, record the predicted deviation, repeat for every predictor, and display the results in the bar chart mentioned above. In the presence of strong non-linearity, this may not work so well. Are there any known approaches for achieving in a clear way what I am trying to do here?
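For reference, the one-predictor-at-a-time idea could be sketched roughly as follows. This is only an illustration under assumptions: `predict` stands for any callable that takes a 2D array and returns one prediction per row (the convention used by scikit-learn style models), and `toy_predict`, with its deliberate interaction term, is a hypothetical stand-in for the real random forest or GBM.

```python
import numpy as np

def one_at_a_time_effects(predict, x, event_idx, baseline=0.0):
    """Switch on each event predictor alone and record the deviation.

    Returns (effects, total, residual). `residual` is the part of the full
    deviation that the individual bars fail to explain; it is zero for an
    additive model and non-zero when predictors interact.
    """
    x = np.asarray(x, dtype=float)
    base = x.copy()
    base[event_idx] = baseline                 # all events "off"
    base_pred = float(predict(base[None, :])[0])

    effects = {}
    for j in event_idx:
        probe = base.copy()
        probe[j] = x[j]                        # only predictor j "on"
        effects[j] = float(predict(probe[None, :])[0]) - base_pred

    total = float(predict(x[None, :])[0]) - base_pred
    residual = total - sum(effects.values())
    return effects, total, residual

# Hypothetical model: column 0 is the AR term, columns 1 and 2 are events.
def toy_predict(X):
    X = np.asarray(X, dtype=float)
    return 0.8 * X[:, 0] + 2.0 * X[:, 1] + X[:, 1] * X[:, 2]

effects, total, residual = one_at_a_time_effects(
    toy_predict, [10.0, 1.0, 3.0], event_idx=[1, 2]
)
print(effects, total, residual)
```

In this toy case the second event predictor gets a bar of zero even though it accounts for most of the deviation through its interaction with the first, and the bars no longer add up to the total. That non-zero `residual` is exactly the weakness under strong non-linearity that I am worried about.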
