In machine learning, every model that is built needs to be validated. What does validation mean here? Validating a model means checking whether it produces the desired output, in other words, whether the machine has actually learned from the data it was given.
WHAT IS CROSS-VALIDATION?
Cross-validation is a technique for evaluating statistical models. It estimates how well a model's results will generalize to an independent data set, that is, to data the model did not see during training. The technique is used mainly in predictive modeling, where the goal is to find out how accurately a model can predict.
The aim of the cross-validation technique is to set aside a set of data that can be used for testing the model. This set is known as the validation data set. A testing phase of this kind is not unique to machine learning and data science; it is carried out in almost every project. In machine learning, the validation data set is prepared and given as input to the trained model, and the model's output is then checked. Some companies even have dedicated teams whose only job is to find errors in models.
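The idea of repeatedly holding out part of the data for validation can be sketched in a few lines of Python. The article does not name a particular scheme, so this sketch assumes k-fold cross-validation, one common variant. The function names (k_fold_splits, cross_validate, fit_slope) and the toy data are hypothetical and chosen only for illustration.

```python
import random

def k_fold_splits(samples, k, seed=0):
    """Yield (training, validation) lists, one pair per fold."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled samples into k folds of nearly equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Every fold except the i-th one is used for training.
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, validation

def cross_validate(samples, k, fit, error):
    """Return the validation error averaged over the k folds."""
    errors = []
    for training, validation in k_fold_splits(samples, k):
        model = fit(training)                    # learn from the training part only
        errors.append(error(model, validation))  # test on the held-out part
    return sum(errors) / len(errors)

# Hypothetical toy model: a line through the origin fitted by least squares.
def fit_slope(training):
    slope = sum(x * y for x, y in training) / sum(x * x for x, _ in training)
    return lambda x: slope * x

def mean_squared_error(model, samples):
    return sum((y - model(x)) ** 2 for x, y in samples) / len(samples)

# Toy data following y = 2x exactly (an assumption made for this example).
data = [(x, 2 * x) for x in range(1, 11)]
average_error = cross_validate(data, k=5, fit=fit_slope, error=mean_squared_error)
print(average_error)
```

Each sample is used for validation exactly once and for training in the other folds, so the averaged error reflects how the model behaves on data it did not learn from.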
TRAINING DATA SET AND VALIDATION DATA SET
The training data set and the validation data set look the same to many people, but they serve different purposes. The training data set is the data from which the machine learns, whether the learning is supervised or unsupervised. The validation data set, as described above, is data held back from training and used for testing the model.
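A minimal sketch of separating the two data sets is given below. The 80/20 ratio, the function name train_validation_split, and the toy data are assumptions made for this example and do not come from the article.

```python
import random

def train_validation_split(samples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split it into two disjoint lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]  # held back, used only for testing
    training = shuffled[n_validation:]    # the data the model learns from
    return training, validation

# Hypothetical toy data: (input, target) pairs.
data = [(x, 2 * x + 1) for x in range(10)]
train_set, validation_set = train_validation_split(data)
print(len(train_set), len(validation_set))
```

The important property is that the two lists do not overlap: a sample the model has learned from is never used to test it.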
WHY DO WE USE THE CROSS-VALIDATION TECHNIQUE?
Cross-validation helps to reveal two common problems in a model, underfitting and overfitting, which are described in the sections that follow.
WHAT IS UNDERFITTING?
Underfitting is a condition that the cross-validation technique helps to detect. It occurs when a model is unable to capture enough of the patterns in the data. An underfitted model performs poorly on both sets, the training data set as well as the validation data set.
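Underfitting can be illustrated with a deliberately oversimplified model. In this sketch the data follow y = 3x, but the model ignores x and always predicts the mean of the training targets. The data, the model, and all names are hypothetical and serve only as an illustration.

```python
def mean_squared_error(targets, predictions):
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# Toy data following y = 3x: even x values for training, odd ones for validation.
train = [(x, 3 * x) for x in range(0, 20, 2)]
validation = [(x, 3 * x) for x in range(1, 20, 2)]

# Too-simple model: a constant that ignores the input entirely.
train_mean = sum(y for _, y in train) / len(train)

def constant_model(x):
    return train_mean

train_error = mean_squared_error(
    [y for _, y in train], [constant_model(x) for x, _ in train])
validation_error = mean_squared_error(
    [y for _, y in validation], [constant_model(x) for x, _ in validation])

# Both errors are large because the model cannot represent the linear pattern.
print(train_error, validation_error)
```

A high error on the training data itself is the sign that separates underfitting from overfitting.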
WHAT IS OVERFITTING?
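Overfitting refers to the condition in which a model performs well on the training data set but poorly on the validation data set. This usually means the model has memorized the training data, including its noise, instead of learning patterns that generalize to new data.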
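Overfitting can be sketched with a model that simply memorizes its training data. The example below assumes noisy toy data around y = 2x and uses a 1-nearest-neighbour lookup as the memorizing model. Both are hypothetical choices for illustration and are not taken from the article.

```python
import random

def mean_squared_error(targets, predictions):
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

rng = random.Random(0)

def make_samples(xs):
    # Targets follow y = 2x plus random noise (an assumption of this example).
    return [(x, 2 * x + rng.gauss(0, 5)) for x in xs]

train = make_samples(range(0, 20, 2))
validation = make_samples(range(1, 20, 2))

# Memorizing model: return the target of the closest training input,
# so every training point (noise included) is reproduced exactly.
def nearest_neighbour_model(x):
    _, nearest_y = min(train, key=lambda sample: abs(sample[0] - x))
    return nearest_y

train_error = mean_squared_error(
    [y for _, y in train], [nearest_neighbour_model(x) for x, _ in train])
validation_error = mean_squared_error(
    [y for _, y in validation], [nearest_neighbour_model(x) for x, _ in validation])

# The training error is zero, while the validation error is not.
print(train_error, validation_error)
```

The gap between the two errors is what a validation data set makes visible; looking at the training error alone would suggest a perfect model.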