From the course: Artificial Intelligence Foundations: Machine Learning

Demo: Performing feature engineering

- [Instructor] Let's look at handling missing data. In the Jupyter Notebook, we're using machine learning to impute, or estimate, the missing data. KNN, or K-nearest neighbors, is a machine learning algorithm that offers a quick and effective way to impute missing values when your data set is small. KNN identifies a sample with one or more missing values, then it identifies the K most similar samples in the training data that are complete, i.e., have no missing values in the relevant columns, and then replaces the missing value with a value derived from those neighbors. Let's use KNN to calculate or predict the missing values. I've navigated to the Jupyter Notebook, and the first thing I'm going to do is run all of the cells. I do that by clicking on Run and selecting Run All Cells. Now let's go down to the section where we are imputing the missing data. So let's scroll down. Here we are. The first thing we'll need to do is import the KNN imputer. Next, we'll create…
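
The transcript cuts off before the notebook code appears, so the following is only a minimal sketch of what KNN imputation can look like, assuming the "KNN imputer" mentioned is scikit-learn's `KNNImputer`. The toy array `X` and the choice of `n_neighbors=2` are illustrative assumptions, not the course's actual data or settings.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical toy data (not the course's data set); np.nan marks missing entries.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Assumed setting: use the 2 nearest neighbors. Distances are computed on the
# features both rows have present, and each missing entry is filled with the
# mean of that feature across the nearest rows that do have a value for it.
imputer = KNNImputer(n_neighbors=2)

# fit_transform learns from X and returns a copy with the gaps filled in.
X_imputed = imputer.fit_transform(X)

print(X_imputed)
```

In this toy case the missing value in the first row is filled with 4.0 (the mean of 3.0 and 5.0 from its two nearest rows), and the missing value in the third row with 5.5. On real data, the number of neighbors is a tuning choice, and features on very different scales are usually standardized first so that no single column dominates the distance.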
