When we run a regression, we hope to generalize the model fitted on our sample to the entire population. To do so, we have to meet several assumptions of the multiple linear regression model. If these assumptions are violated, we cannot generalize our conclusions to the target population, because the results might be biased or misleading. So what are the assumptions, and how do we check them?
Data Science
Handling Categorical Features with SciKitLearn
After dealing with missing data in your dataset, you will most likely face categorical features, which appear in numerous datasets. In the majority of cases, these features are non-numerical and thus need to be converted to numbers before machine learning algorithms can process them.
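As a rough illustration of that conversion, here is a minimal sketch using scikit-learn's `OneHotEncoder`. The `colors` column is a hypothetical toy example, not data from the post, and one-hot encoding is only one of several possible encodings.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column of non-numerical categories (an assumption for illustration).
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# handle_unknown="ignore" encodes categories unseen during fit as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")

# The default output is a sparse matrix; convert it to a dense array for display.
encoded = encoder.fit_transform(colors).toarray()

print(encoder.categories_[0])  # learned categories, one output column per category
print(encoded)                 # one row per sample, with a single 1 per row
```

Each distinct category becomes its own 0/1 column, so the model never sees an artificial ordering between categories.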
Handling missing data with SciKit SimpleImputer
When working on data science projects, you are very likely to encounter missing data in your columns. Dropping every row that contains a missing value is rarely ideal for a project. The other columns in such a row can still be critical at the data preparation stage, so it is wiser to infer or otherwise fill in the missing values in the dataset for a better outcome.
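A minimal sketch of filling in missing values with scikit-learn's `SimpleImputer` might look like the following. The age and salary figures are hypothetical, and the mean strategy is just one assumption; `"median"`, `"most_frequent"`, and `"constant"` are the other built-in strategies.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data: columns are age and salary, with np.nan marking missing entries.
data = np.array([
    [25.0, 50000.0],
    [np.nan, 62000.0],
    [35.0, np.nan],
    [30.0, 58000.0],
])

# Replace each missing value with the mean of the observed values in its column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imputer.fit_transform(data)

print(filled)  # same shape as the input, with no NaN left
```

All four rows are kept, so the observed values in the other columns remain available for later steps.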