
One-hot encoding is used to convert categorical variables into a format that can be readily used by machine learning algorithms.

The basic idea of one-hot encoding is to create new variables that take on values 0 and 1 to represent the original categorical values.

For example, the following image shows how we would perform one-hot encoding to convert a categorical variable that contains team names into new variables that contain only 0 and 1 values:

The following step-by-step example shows how to perform one-hot encoding for this exact dataset in Python.

Step 1: Create the Data

First, let's create the following pandas DataFrame:

import pandas as pd
df = pd.

Step 2: Perform One-Hot Encoding

Next, let's import the OneHotEncoder class from the sklearn library and use it to perform one-hot encoding on the 'team' variable in the pandas DataFrame:

from sklearn.preprocessing import OneHotEncoder
encoder = OneHotEncoder(handle_unknown='ignore')
#perform one-hot encoding on 'team' column
encoder_df = pd.
#merge one-hot encoded columns back with original DataFrame

Notice that three new columns were added to the DataFrame since the original 'team' column contained three unique values.

Note: You can find the complete documentation for OneHotEncoder in the scikit-learn documentation.

Step 3: Drop the Original Categorical Variable

Lastly, we can drop the original 'team' variable from the DataFrame since we no longer need it:

#drop 'team' column
final_df.
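
The whole workflow (create the data, one-hot encode the 'team' column, merge the result, drop the original column) can be sketched in one self-contained script. This is only an illustration under stated assumptions: the team names and the 'points' values are made up, and naming the new columns team_A, team_B and team_C from encoder.categories_ is a convenience chosen here, not necessarily what the original example does.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: the team names and points are assumptions for illustration
df = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "points": [25, 12, 15, 14, 19, 23],
})

# handle_unknown="ignore": categories not seen during fit are encoded as all zeros
encoder = OneHotEncoder(handle_unknown="ignore")

# fit_transform returns a sparse matrix by default, so convert it to a dense array
encoded = encoder.fit_transform(df[["team"]]).toarray()

# Name the new columns after the categories the encoder found (an assumed naming scheme)
encoder_df = pd.DataFrame(
    encoded,
    columns=[f"team_{c}" for c in encoder.categories_[0]],
    index=df.index,
)

# Merge the encoded columns back, then drop the original 'team' column
final_df = df.join(encoder_df).drop(columns="team")

print(final_df)
```

Because 'team' has three unique values in this made-up data, the result has three indicator columns, and each row contains exactly one 1 among them.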
