Data Preprocessing Interview
Question: How do you handle missing values in a dataset during preprocessing?
Answer: Missing values can be handled using various techniques depending on the context:
- Remove rows or columns with excessive missing data.
- Impute missing values using statistical methods such as mean, median, or mode.
- Use predictive modeling (for example, a regression or k-nearest-neighbors model trained on the other features) to estimate missing values.
- For time series data, use forward-fill or backward-fill methods.
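import pandas as pd

df = pd.read_csv('data.csv')
# Impute missing values in numeric columns with each column's mean
numeric_cols = df.select_dtypes(include='number').columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())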
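
The other strategies in the answer can be sketched in the same way. This is a minimal illustration, not a definitive recipe: the toy DataFrame, its column names (age, city, mostly_empty), and the 50% missing-data threshold are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 35.0],
    "city": ["Oslo", "Rome", None, "Rome"],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0],
})

# Drop columns where more than half of the values are missing
# (the 0.5 threshold is an arbitrary choice for this sketch).
df = df.loc[:, df.isna().mean() <= 0.5].copy()

# Median for a numeric column, mode (most frequent value) for a categorical one.
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna(df["city"].mode().iloc[0])

# Time series: carry the last observation forward, then back-fill
# any gap left at the start of the series.
ts = pd.Series(
    [np.nan, 1.0, np.nan, 3.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
ts_filled = ts.ffill().bfill()
```

The median is often preferred over the mean when a column is skewed or has outliers, and forward-fill is only appropriate when the rows are already sorted by time.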