How to approach feature elimination?

General Tech Learning Aids/Tools 2 years ago

0 2 0 0 0 tuteeHUB earn credit +10 pts

5 Star Rating 1 Rating

Posted on 16 Aug 2022, this text provides information on Learning Aids/Tools related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.

Take Quiz To Earn Credits!

Turn Your Knowledge into Earnings.

tuteehub_quiz

Answers (2)

Post Answer
profilepic.png
manpreet Tuteehub forum best answer Best Answer 2 years ago

I have been working on a couple of dataset to build predictive models based on them. However I am left a bit bewildered when its coming to elimination of features.

The first one is the Boston Housing dataset and the second is Bigmart Sales dataset. I will focus my question around these two however I would also appreciate relatively generalized answers too.

Boston Housing : I have constructed a correlation coefficient matrix and eliminated the features which has an absolute correlation coefficient of less than 0.50 with respect to the target variable medv. That is leaving me with three features. However, I also do understand that a correlation matrix can be highly deceptive and does not capture non-linear relationships and as a matter of fact features such as crim, indus etc does have non-linear relationship with medv and intuitively it simply does not feel correct to discard them right away.

Bigmart Sales : There are around 30+ features that is created after OneHotEncoding in Python. I have given a go to backward elimination method while I was constructing a linear regression model but I am not exactly sure how to apply backward elimination when I was working on a Decision Tree model for this dataset (not sure if it can actually be applied to Decision Tree at all).

It would be of great help if I can get some idea on how to approach to feature elimination for the above two cases. Let me know if you need more info, I will gladly provide.

profilepic.png
manpreet 2 years ago

It's extremely general question. I don't think that it possible to answer to your question in StackOverFlow format.

For every ML / Statistical model you need different Feature Elimination / Feature Engineering approach:

  • Linear / Logistic / GLM models require removal of correlated features

  • For Neural Nets / Boosted trees removal of features will heart performance of the model

Even for one type of models there's no single best way of doing Feature Elimination

If you can add more specific information to your question it'll be possible to discuss it in details.


0 views   0 shares

No matter what stage you're at in your education or career, TuteeHub will help you reach the next level that you're aiming for. Simply,Choose a subject/topic and get started in self-paced practice sessions to improve your knowledge and scores.