General Tech Learning Aids/Tools 3 years ago
manpreet
Best Answer
3 years ago
Problem:
We have a fairly large database that is built up by our own users. The data is entered by asking each user about 30 questions, each of which has around 12 possible answers (x, a, A, B, C, ..., H). The letters stand for values that we can interpret later.
I have already implemented some very basic predictors, such as a random forest, a small neural network, and a simple decision tree.
But all of these models use the full dataset to make one final prediction (which they already do fairly well).
What I want to create is a system that eliminates 7 to 10 of the possible answers a user can give to any question. This would reduce the amount of data we need to collect, store, or use to re-train future models.
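To make the goal concrete, here is a minimal sketch of one way the elimination could work, assuming each stored record is a dict mapping a question id to the chosen letter. The function name `likely_answers`, the `keep` parameter, and the toy rows are all hypothetical and only for illustration. It keeps the answers that were most frequent among past users who gave the same earlier answers and drops the rest.

```python
from collections import Counter

def likely_answers(rows, question, given, keep=3):
    """Return the `keep` most frequent answers to `question` among the
    rows that agree with every answer in `given` (question -> answer)."""
    # Keep only past users whose earlier answers match the current user's.
    matching = [r for r in rows
                if all(r.get(q) == a for q, a in given.items())]
    # Count how often each answer to the target question occurs among them.
    counts = Counter(r[question] for r in matching)
    return [answer for answer, _ in counts.most_common(keep)]

# Hypothetical toy data: one dict per user, question id -> answer letter.
rows = [
    {"q1": "A", "q2": "B"},
    {"q1": "A", "q2": "B"},
    {"q1": "A", "q2": "C"},
    {"q1": "B", "q2": "H"},
]

# A user who answered "A" to q1 would only be offered "B" and "C" for q2.
print(likely_answers(rows, "q2", {"q1": "A"}, keep=2))
```

With real data the matching subset shrinks quickly as more answers are fixed, so some fallback (for example, conditioning on only the most recent few answers) would probably be needed.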
I have already found several methods for deciding which variables are the most discriminative in the full dataset. But once a user starts answering the questions, I get lost on what to do: none of the models I have computes the next question given the information already collected.
It feels like I should use a Naive Bayes classifier, but I'm not sure. Another approach would be to recalculate the Gini or entropy value at every step. As far as I know, though, that recalculation can't take the previously given answers into account.
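For reference, a minimal sketch of what recalculating entropy at every step might look like if the data is first restricted to the rows that match the answers given so far, so that the information gain of each remaining question is computed conditionally. The names `entropy`, `next_question`, the `outcome` column, and the toy rows are assumptions made for illustration, not part of our actual system.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def next_question(rows, target, given):
    """Pick the unanswered question with the highest information gain
    about `target`, computed only on rows consistent with `given`."""
    matching = [r for r in rows
                if all(r[q] == a for q, a in given.items())]
    if not matching:
        return None, 0.0
    base = entropy([r[target] for r in matching])
    best_q, best_gain = None, -1.0
    for q in matching[0]:
        if q == target or q in given:
            continue
        # Split the matching rows by their answer to q.
        groups = {}
        for r in matching:
            groups.setdefault(r[q], []).append(r[target])
        remainder = sum(len(g) / len(matching) * entropy(g)
                        for g in groups.values())
        gain = base - remainder
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q, best_gain

# Hypothetical toy data: the useful next question depends on the q1 answer.
rows = [
    {"q1": "A", "q2": "A", "q3": "A", "outcome": "yes"},
    {"q1": "A", "q2": "A", "q3": "B", "outcome": "no"},
    {"q1": "A", "q2": "B", "q3": "A", "outcome": "yes"},
    {"q1": "A", "q2": "B", "q3": "B", "outcome": "no"},
    {"q1": "B", "q2": "A", "q3": "A", "outcome": "no"},
    {"q1": "B", "q2": "B", "q3": "A", "outcome": "yes"},
    {"q1": "B", "q2": "A", "q3": "B", "outcome": "no"},
    {"q1": "B", "q2": "B", "q3": "B", "outcome": "yes"},
]

print(next_question(rows, "outcome", {"q1": "A"}))
print(next_question(rows, "outcome", {"q1": "B"}))
```

In this toy data the best follow-up question differs depending on the first answer, which is the behaviour I am after. Whether filtering like this stays reliable when only a few rows match is something I have not verified.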