Recursive Feature Elimination (RFE) in Python

This is the age of Artificial Intelligence and machine learning. Although we haven’t reached the point where we have sentient human-like computers (yet) so often featured in popular science fiction films and television programs, we have made significant strides in intelligent machines over the past few decades.

However, nothing happens in a vacuum. People often say that computers are smart, but computers are only as intelligent as they are programmed to be. It takes a lot of effort and different elements to create an intelligent machine, and we are about to explore one particularly important element.

Today, we are covering the process called Recursive Feature Elimination, or RFE for short. RFE deals with Machine Learning models and plays a vital role in improving the machine’s performance. This article hopes to demystify RFE and show its importance.

But first, we need to backtrack and go over some Machine Learning concepts to make a better case for RFE.

A Machine Learning Refresher

Industry leader IBM defines machine learning as “a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.”

Organizations employ machine learning in cases such as online customer service chatbots, speech recognition (e.g., Alexa, Siri), computer vision (self-driving cars, social media photo tagging), and recommendation engines (making purchasing suggestions to customers based on their buying history).

There are three primary situations where machine learning comes in handy:

In a situation involving repeated decisions or evaluations that you want to automate while getting consistent results.
In a situation where it’s either difficult or impossible to describe a detailed solution or the criteria used to make a decision.
In a situation where you have existing examples or labeled data that describe the case and can be mapped to the correct result.

Machine learning is only as good as its machine learning model, which leads us to our next definition.

What Is a Machine Learning Model?

Microsoft.com defines a Machine Learning model as “…a file that has been trained to recognize certain types of patterns.” Data scientists use data sets to train a model, giving it an algorithm to learn from the data provided.

Once you train the model, it can reason over data that it has never seen before and make predictions based on the information. For instance, if you wanted to design a facial recognition application, you could train the model by offering it a set of facial images, each one tagged with a particular emotion. You can then use the model to recognize anyone’s feelings or sentiments.

Machine Learning models work with features, and each feature represents a piece of data that is employed in analysis. Features are input variables: measurable properties that help the model achieve better pattern recognition. Using our example of facial recognition software, the salient features might include eye color, eyebrow position, ear shape, mouth shape, visible teeth, skin blemishes, forehead wrinkles, etc.

Unfortunately, in the world of Machine Learning, there’s such a thing as too much information. If the data scientist has too many features to work with, the surplus could adversely affect the model’s performance. Thus, the data scientist needs to eliminate the less relevant features. This issue leads us neatly to our next section!

What Is Recursive Feature Elimination?

Recursive Feature Elimination, or RFE Feature Selection, is a feature selection process that reduces a model’s complexity by choosing significant features and removing the weaker ones. The selection process eliminates these less relevant features one by one until it has achieved the optimum number needed to assure peak performance.

RFE ranks features using the fitted model’s “coef_” or “feature_importances_” attributes. It then recursively eliminates a small number of features per loop, refitting the model each time, which helps reduce the dependencies and collinearities that may exist among the features.

Recursive Feature Elimination narrows down the number of features, resulting in a corresponding increase in model efficiency.
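
To make the loop described above concrete, here is a minimal sketch of the idea written by hand. It assumes a binary classification problem on synthetic data, uses logistic regression as the ranking model, and drops one feature per loop based on the absolute value of its coefficient. The function name manual_rfe and the data set parameters are illustrative choices, not part of any library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def manual_rfe(X, y, n_features_to_keep):
    """Hypothetical helper: refit a model and drop the weakest feature each loop."""
    remaining = list(range(X.shape[1]))  # column indices still in play
    while len(remaining) > n_features_to_keep:
        model = LogisticRegression(max_iter=1000).fit(X[:, remaining], y)
        # Binary problem, so coef_ has a single row: one weight per remaining feature
        importance = np.abs(model.coef_).ravel()
        weakest = int(np.argmin(importance))  # position of the least important feature
        del remaining[weakest]
    return remaining


# Synthetic data: 8 features, of which 3 carry signal (assumed for illustration)
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
kept = manual_rfe(X, y, n_features_to_keep=3)
print("Column indices kept:", kept)
```

Because the model is refit after every removal, the importance of the surviving features is re-evaluated at each step rather than being judged once on the full feature set.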

Let’s apply this to a real-world decision-making scenario. You and your five friends are trying to decide whether to go out to eat or not. As everyone discusses the point at great length, certain factors come up for consideration, including:

Who is hungry enough to eat a full meal
How people’s available funds are holding up
How late people can stay up
What kind of food people want
The location and types of local eateries
How late people want to stay out
Who has a car

Now, consider the above items as “features” in the decision-making process. After spending way too much time debating these points, someone finally suggests that the group base their decision only on who is hungry and the locations and types of local eateries. Congratulations! You’ve recursively eliminated many features and have drastically reduced the amount of time needed to decide!

Machine learning data sets for regression or classification consist of rows and columns, resembling an Excel spreadsheet. Rows are often called “samples,” and columns are known as “features.” Feature selection in the machine learning context refers to techniques that pick a subset of the data set's most appropriate features (i.e., columns).
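
As a small illustration of the rows-and-columns picture, the sketch below builds a synthetic data set and keeps only a few of its columns. The sizes and the column indices chosen are arbitrary assumptions for the example; a real feature selection method would pick the columns for you.

```python
from sklearn.datasets import make_classification

# 100 samples (rows) and 10 features (columns); the sizes are arbitrary
X, y = make_classification(
    n_samples=100, n_features=10, n_informative=4, n_redundant=2, random_state=1
)
print(X.shape)  # (100, 10)

# Feature selection amounts to keeping a subset of the columns.
# These indices are picked by hand purely for illustration.
chosen_columns = [0, 3, 7]
X_subset = X[:, chosen_columns]
print(X_subset.shape)  # (100, 3)
```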

Fewer features take up less space and aren’t as complex, which helps Machine Learning algorithms run more efficiently and effectively. Conversely, irrelevant input features can slow down specific machine learning algorithms and produce an inferior predictive performance.

All About RFE With scikit-learn

Data scientists can implement RFE manually, but the process can be challenging for beginners. It’s also time-consuming, although the time used for RFE should be considered an investment that pays off in the long run.

Fortunately, the free scikit-learn Python machine learning library offers an implementation of Recursive Feature Elimination through its RFE class. Incidentally, scikit-learn is also called sklearn, so if you see the two terms, they mean the same thing.
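
A basic use of scikit-learn's RFE class might look like the sketch below. The synthetic data set, the decision tree used as the base estimator, and the choice to keep five features are all assumptions made for the sake of the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: 10 features, 5 informative (assumed for illustration)
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=2, random_state=1
)

# Wrap a base estimator and ask RFE to keep 5 features, removing 1 per loop
selector = RFE(
    estimator=DecisionTreeClassifier(random_state=1),
    n_features_to_select=5,
    step=1,
)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)  # (300, 10) -> (300, 5)
```

The reduced array can then be passed to any model in place of the original data.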

RFE can be used with the two types of predictive modeling problems listed below:

Classification: Classification predicts the class of selected data points. Classes are also known as targets, labels, or categories. Classification predictive modeling involves approximating a mapping function (f) from input variables (X) to discrete output variables (y).
Regression: Regression models supply a function describing the relationship between one (or more) independent variables and a response, dependent, or target variable.
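
The sketch below shows one way RFE could be applied to each kind of problem. It places RFE inside a scikit-learn Pipeline so that feature selection is repeated within each cross-validation fold. The synthetic data, the decision tree estimators, and the scoring metrics are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: predict a discrete class label
Xc, yc = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=2, random_state=1
)
clf_pipeline = Pipeline([
    ("select", RFE(DecisionTreeClassifier(random_state=1), n_features_to_select=5)),
    ("model", DecisionTreeClassifier(random_state=1)),
])
clf_scores = cross_val_score(clf_pipeline, Xc, yc, scoring="accuracy", cv=5)
print("Mean accuracy:", clf_scores.mean())

# Regression: predict a continuous target
Xr, yr = make_regression(
    n_samples=300, n_features=10, n_informative=5, random_state=1
)
reg_pipeline = Pipeline([
    ("select", RFE(DecisionTreeRegressor(random_state=1), n_features_to_select=5)),
    ("model", DecisionTreeRegressor(random_state=1)),
])
# scikit-learn reports errors as negative values so that higher is always better
reg_scores = cross_val_score(
    reg_pipeline, Xr, yr, scoring="neg_mean_absolute_error", cv=5
)
print("Mean negative MAE:", reg_scores.mean())
```

Keeping the selection step inside the pipeline avoids choosing features with knowledge of the held-out fold, which would make the scores look better than they should.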

Let’s Talk About RFE Hyperparameters

Here are some hyperparameters you should consider for fine-tuning the chosen RFE method for feature selection and how they affect model performance.

Explore the Number of Features: One of the essential hyperparameters is the number of features to select. That's why it's important to test different numbers of features and see which yields the best results. Watch for where the model’s performance peaks as the configured number of features changes.
Automatically Select Number of Features: You can also let RFE decide how many features to keep. You can accomplish this by performing a cross-validation evaluation of different numbers of features, as described in the previous item, and automatically choosing the number that produced the best mean score. Use the RFECV class to carry this out.
Which Features Were Selected? If you’re curious about which features were chosen and which were discarded, you can review the attributes of the fitted RFE object (or fitted RFECV object). The “support_” attribute holds True/False values showing which features were included, in order of column index. The “ranking_” attribute gives the relative ranking of the features in the same order, with selected features assigned rank 1.
Explore the Base Algorithm: RFE can wrap many different algorithms, as long as they expose coefficients or feature importances, and different algorithms can produce different results. Thus, you should experiment by changing the base algorithm and comparing the results. Choose from decision trees, random forests, or linear models, to name a few.
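
The four points above can be sketched in one short script. The data set, the candidate feature counts, and the base estimators below are assumptions chosen to keep the example small; in practice you would substitute your own data and models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=2, random_state=1
)

# 1. Explore the number of features: score a pipeline for several settings
mean_scores = {}
for n in (2, 4, 6, 8):
    pipeline = Pipeline([
        ("select", RFE(DecisionTreeClassifier(random_state=1), n_features_to_select=n)),
        ("model", DecisionTreeClassifier(random_state=1)),
    ])
    mean_scores[n] = cross_val_score(pipeline, X, y, scoring="accuracy", cv=5).mean()
    print(f"{n} features: mean accuracy {mean_scores[n]:.3f}")

# 2. Let RFECV choose the number of features by cross-validation
auto = RFECV(DecisionTreeClassifier(random_state=1), step=1, cv=5, scoring="accuracy")
auto.fit(X, y)
print("Number of features chosen by RFECV:", auto.n_features_)

# 3. Inspect which columns were kept and how each one ranked
for column, (kept, rank) in enumerate(zip(auto.support_, auto.ranking_)):
    print(f"column {column}: selected={kept}, rank={rank}")

# 4. Swap the base algorithm and compare which columns survive
base_models = {
    "decision tree": DecisionTreeClassifier(random_state=1),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=1),
    "logistic regression": LogisticRegression(max_iter=1000),
}
selected = {}
for name, estimator in base_models.items():
    rfe = RFE(estimator, n_features_to_select=5).fit(X, y)
    selected[name] = [i for i, kept in enumerate(rfe.support_) if kept]
    print(name, "kept columns", selected[name])
```

Different base algorithms may keep different columns, which is one reason to compare several before settling on a final feature set.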

Why Not Choose a Career in Machine Learning?

Artificial Intelligence and Machine Learning are fast-growing fields in today’s digital world. So, if you’re curious about a new career (or making a change from an old one!) and you want something exciting, challenging, and with great rewards and job security, consider Machine Learning.

Simplilearn offers a Caltech Post Graduate Program in AI and Machine Learning which will help you hone the right skills and make you job-ready.

Glassdoor reports that Machine Learning Engineers in the United States earn a yearly average of USD 131,001. Payscale.com shows that Machine Learning Engineers in India make an annual average of ₹ 732,099.

The Future of Jobs Report 2020 reported that the artificial intelligence field will create 12 million new jobs across 26 countries by 2025. However, this figure represents a net gain, since the report predicts that 85 million jobs will be displaced while 97 million new AI/ML-related jobs will be created.

This outlook is your opportunity to not only explore new career options but also protect yourself from possible AI-related job displacement. Let Simplilearn help prepare you for the brave new world of Artificial Intelligence and Machine Learning. Check out our courses today!
