What is underfitting and overfitting of the ML model, and how can it be prevented?

Learner CARES
4 min read · Feb 28, 2023


“The essence of machine learning is to generalise from examples.” — Tom M. Mitchell, Machine Learning


The goal of a machine learning algorithm is to minimize both bias and variance.

Bias refers to the error introduced by a model’s simplifying assumptions about the underlying relationship between the features and the target variable. A high-bias model has high error on both the training data and the test data: it is not complex enough to capture the underlying patterns in the data.

Variance, on the other hand, refers to the error introduced by a model’s sensitivity to fluctuations in the training data. A high-variance model has low training error but high test error: it has fit the noise in the training data rather than just the underlying signal.
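For squared-error loss, these two quantities appear explicitly in the standard bias–variance decomposition of the expected test error at a point x (here f is the true function, f̂ the learned model, and σ² the irreducible noise in the data):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  \;+\; \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  \;+\; \sigma^2
```

Underfitting corresponds to the bias² term dominating this sum; overfitting corresponds to the variance term dominating.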

Underfitting and overfitting are two common problems that can occur when training machine learning models.

Underfitting occurs when a model is too simple to capture the underlying patterns in the data. In other words, the model is not complex enough to fit the data well, resulting in poor performance on both the training and test data. Underfitting can occur when the model is not trained for long enough or when the model architecture is too simple.

On the other hand, overfitting occurs when a model is too complex and captures noise or irrelevant patterns in the data, rather than the underlying patterns. This results in excellent performance on the training data but poor performance on the test data. Overfitting can occur when a model is trained for too long or when the model architecture is too complex.
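To make this concrete, here is a minimal sketch (assuming only NumPy is available; the data and degrees are made up for the demo) that fits polynomials of three different degrees to noisy quadratic data. Degree 1 underfits, degree 2 matches the true signal, and degree 20 overfits:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a quadratic signal plus Gaussian noise.
x_train = np.linspace(-3, 3, 30)
y_train = x_train**2 + rng.normal(0, 1, size=x_train.size)
x_test = np.linspace(-3, 3, 100)
y_test = x_test**2 + rng.normal(0, 1, size=x_test.size)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

for degree in (1, 2, 20):
    train_mse, test_mse = fit_and_score(degree)
    print(f"degree {degree:2d}: train MSE {train_mse:7.3f}, test MSE {test_mse:7.3f}")
```

Typically the degree-1 model shows high error on both sets (underfitting), while the degree-20 model drives the training error below that of degree 2 without improving the test error (overfitting).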

Here are some of the techniques used to prevent underfitting and overfitting:

  1. Increasing model complexity: One way to prevent underfitting is to use a more complex model architecture. For example, if a linear regression model is underfitting the data, we can try a polynomial regression model with higher-degree terms. Similarly, in deep learning, we can add more layers or increase the number of neurons in each layer to make the model more complex.
  2. Increasing the number of training epochs: If the model is not trained for long enough, it may not have enough time to learn the underlying patterns in the data. Increasing the number of training epochs can help the model learn more complex patterns and prevent underfitting.
  3. Adding more training data: If the model is underfitting the data, we can try adding more training data to provide the model with more examples of the underlying patterns.
  4. Reducing regularization: Regularization is a technique used to prevent overfitting by adding a penalty term to the loss function. However, if the model is underfitting the data, we can try reducing the regularization to allow the model to fit the data more closely.
  5. Using a simpler model architecture: If the model is overfitting the data, we can try using a simpler model architecture to prevent the model from capturing noise or irrelevant patterns in the data.
  6. Reducing the number of training epochs: If the model is overfitting the data, we can try reducing the number of training epochs to prevent the model from fitting the noise in the data.
  7. Adding regularization: Regularization can help prevent overfitting by adding a penalty term to the loss function. Common regularization techniques include L1 regularization, L2 regularization, and dropout.
  8. Using early stopping: Early stopping is a technique used to prevent overfitting by stopping the training process when the performance on the validation set stops improving.
  9. Using cross-validation: Cross-validation is a technique used to estimate the performance of the model on unseen data by splitting the data into training and validation sets and testing the model on multiple splits.
  10. Using data augmentation: Data augmentation is a technique used to increase the size of the training dataset by applying transformations to the data, such as rotation, scaling, and flipping. This can help prevent overfitting by providing the model with more examples of the underlying patterns.
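Item 7 (and its counterpart, item 4) can be illustrated with a small NumPy-only sketch: the closed-form L2 (ridge) solution on deliberately over-flexible degree-10 polynomial features. The data and `lam` values here are made up for the demo. Raising the penalty shrinks the learned weights, restraining the model; lowering it lets the model fit the data more closely:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy quadratic data with an over-flexible degree-10 feature expansion.
x = np.linspace(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, size=x.size)
X = np.vander(x, N=11, increasing=True)  # columns: 1, x, x^2, ..., x^10

def ridge_weights(X, y, lam):
    """Closed-form L2-regularised least squares: (X'X + lam*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

for lam in (1e-4, 1.0, 100.0):
    w = ridge_weights(X, y, lam)
    print(f"lam = {lam:8.4f}  ->  weight norm = {np.linalg.norm(w):.3f}")
```

The weight norm decreases as `lam` grows: the penalty term trades a little training error for a smoother, less noise-sensitive model.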

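Early stopping (item 8) can be sketched in a few lines of framework-agnostic Python; `train_with_early_stopping` and `patience` are illustrative names for this demo, not from any particular library. Training halts once the validation loss has failed to improve for `patience` consecutive epochs:

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return the epoch at which training stops.

    val_losses stands in for the per-epoch validation loss a real
    training loop would compute; patience is the number of epochs
    without improvement to tolerate before stopping.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:          # validation loss improved
            best, wait = loss, 0
        else:                    # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch     # stop training here
    return len(val_losses) - 1   # ran out of epochs without triggering

# Validation loss improves, then plateaus and rises.
losses = [1.00, 0.80, 0.70, 0.75, 0.72, 0.74, 0.90]
print(train_with_early_stopping(losses, patience=3))  # stops at epoch 5
```

In practice the weights from the best-validation epoch are restored after stopping, which is what prevents the later, overfit epochs from reaching production.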
Many thanks for reading this post!🙏.

If you found this content helpful😊, please LIKE 👍, SHARE, and FOLLOW to stay updated on our future posts.

If you have a moment, I encourage you to check out my other kernels on Kaggle.


Written by Learner CARES

Data Scientist, Kaggle Expert (https://www.kaggle.com/itsmohammadshahid/code?scroll=true). Focusing on only one thing — To help people learn📚 🌱🎯️🏆
