How can Tensorflow be used to split the flower dataset into training and validation?
The TensorFlow flower dataset is a large dataset consisting of flower images. In this article, we are going to see how we can split the flower dataset into training and validation sets. For the purposes of this article, we will use tensorflow_datasets to load the dataset; it is a library of public datasets ready to use with TensorFlow in Python.
To import the flower dataset, we are going to use the tfds.load() method. It loads the named dataset, specified by the name argument, into a tf.data.Dataset. The name of the flower dataset is tf_flowers. In the call, we also split the dataset using the split argument, with training_set taking 70% of the data and the rest going to test_set.
Loading the Flower Dataset with Initial Splitting
By using tensorflow_datasets we can load some of the standard datasets for training and testing a model's architecture and performance. It has a load() function that accepts several arguments which come in handy.
Syntax:
tensorflow_datasets.load(name, split, batch_size, shuffle_files, with_info)
where,
- name - The name of the dataset you would like to load.
- split - Optional parameter with which you can define any initial splitting of the dataset.
- batch_size - Forms batches of the desired size.
- shuffle_files - Defaults to False; pass True if you want the input files shuffled while reading.
- with_info - Defaults to False; if set to True, the dataset's configuration (metadata) is also returned.
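For example, here is a hedged sketch of how the optional arguments can be combined (the batch size of 32 and shuffle_files=True are purely illustrative choices and are not used in the rest of this article):
Python3
import tensorflow_datasets as tfds

# Illustrative only: load tf_flowers shuffled and pre-batched,
# also returning the DatasetInfo metadata object.
batched_ds, ds_info = tfds.load(
    'tf_flowers',
    split='train',
    batch_size=32,        # example batch size, chosen arbitrarily
    shuffle_files=True,   # shuffle the source files while reading
    with_info=True,
)
With these arguments in mind, let's load the flower dataset with the initial 70/30 split requested directly through the split argument: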
Python3
import tensorflow_datasets as tfds

(training_set, test_set), info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)
Output:
Downloading and preparing dataset 218.21 MiB
(download: 218.21 MiB, generated: 221.83 MiB, total: 440.05 MiB)
to ~/tensorflow_datasets/tf_flowers/3.0.1...
Dl Completed...: 100%
5/5 [00:02<00:00, 2.43 file/s]
Dataset tf_flowers downloaded and prepared to
~/tensorflow_datasets/tf_flowers/3.0.1. Subsequent calls will reuse this data.
If we print the information returned for the dataset by TensorFlow, we get a tfds.core.DatasetInfo summary describing the dataset (its name, description, features, number of classes, and split sizes).
Python3
print(info)
Now, let's print the sizes of the training and test sets. The following piece of code does so:
Python3
print("Training Set Size: %d" % training_set.cardinality().numpy())
print("Test Set Size: %d" % test_set.cardinality().numpy())
Output:
Training Set Size: 2569
Test Set Size: 1101
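As a quick sanity check (a minimal sketch assuming the info, training_set, and test_set objects created above), these counts can be compared against the total number of examples recorded in the dataset metadata; note that tfds rounds percentage slices, so an off-by-one difference is possible:
Python3
total = info.splits['train'].num_examples  # total number of images in tf_flowers
print("Total images: %d" % total)
print("Expected training size (70%%): %d" % int(0.7 * total))
print("Expected test size (30%%): %d" % (total - int(0.7 * total)))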
Now, let's carve out a validation set as well. We will partition the dataset in a 70:15:15 fashion, with 70% going to the training set and the rest divided equally between the validation and test sets. We have already split the dataset at load time: the training set holds 70% of the data and the test set holds the remaining 30%. So we just need to split the current test set 50:50 into a validation set and a final test set.
Using the take() and skip() Methods for Further Splitting
We will be using the take() and skip() methods to split the dataset. The tf.data.Dataset.take() method keeps only the first n images or data entries, while the tf.data.Dataset.skip() method keeps everything after the first n entries.
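To make the semantics concrete, here is a tiny standalone sketch on a toy dataset of integers (unrelated to the flower data):
Python3
import tensorflow as tf

toy = tf.data.Dataset.range(10)
print(list(toy.take(3).as_numpy_iterator()))  # [0, 1, 2]
print(list(toy.skip(3).as_numpy_iterator()))  # [3, 4, 5, 6, 7, 8, 9]
Applying the same idea to our flower test set splits it in half: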
Python3
validation_size = int(0.5 * test_set.cardinality().numpy())
validation_set = test_set.take(validation_size)
test_set = test_set.skip(validation_size)
Now, let's print the new sizes.
Python3
print("Training Set Size: %d" % training_set.cardinality().numpy())
print("Validation Set Size: %d" % validation_set.cardinality().numpy())
print("Test Set Size: %d" % test_set.cardinality().numpy())
Output:
Training Set Size: 2569
Validation Set Size: 550
Test Set Size: 551
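Alternatively, the whole 70:15:15 partition can be requested in a single call at load time, since tfds accepts percentage slices for each split. This is a minimal sketch equivalent to the steps above; because each percentage slice is rounded independently, the resulting sizes may differ from the take()/skip() result by an image or so:
Python3
import tensorflow_datasets as tfds

(training_set, validation_set, test_set), info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:85%]', 'train[85%:]'],
    with_info=True,
    as_supervised=True,
)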