What dataset training size should I use?

This article explains how to choose the training size when you train a machine learning model.

Updated over 6 months ago

Training size is the portion of your dataset used to train your model. The remaining portion is set aside to validate and test how well the model performs. By default, the G2M platform splits your dataset into two equal portions: 50% for training and 50% for testing.
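As a rough illustration of what such a split does, here is a minimal sketch in plain Python. It is not the G2M platform's actual implementation; the function name `split_rows` and its parameters `train_size` and `seed` are hypothetical and chosen only for this example.

```python
import random

def split_rows(rows, train_size=0.5, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    train_size is the fraction of rows used for training (assumed
    to be strictly between 0 and 1); the rest is held out for testing.
    """
    if not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be between 0 and 1")
    shuffled = list(rows)                   # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(len(shuffled) * train_size)   # number of training rows
    return shuffled[:cut], shuffled[cut:]

rows = list(range(1000))
train, test = split_rows(rows, train_size=0.5)
print(len(train), len(test))  # 500 500
```

Shuffling before splitting avoids putting, say, only the oldest records in the training portion when the data is stored in a meaningful order.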

If your dataset is small, you may want to increase the share devoted to training, as long as you keep at least a few hundred rows for testing. Conversely, if your dataset is large, you may want to reduce the training size to shorten your training time.
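The "keep at least a few hundred rows for testing" guideline can be turned into a quick calculation. The sketch below is only an illustration: the helper `max_train_size` is hypothetical, and the default of 300 test rows is an assumed stand-in for "a few hundred", not a value prescribed by the platform.

```python
def max_train_size(n_rows, min_test_rows=300):
    """Largest training fraction that still leaves min_test_rows for testing.

    min_test_rows=300 is an assumption standing in for "a few hundred".
    """
    if n_rows <= min_test_rows:
        raise ValueError("dataset too small to hold out min_test_rows")
    return (n_rows - min_test_rows) / n_rows

print(max_train_size(1000))    # 0.7
print(max_train_size(100000))  # 0.997
```

The result is an upper bound rather than a recommendation. With a large dataset you may still prefer a smaller training size to keep training time down.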

