How to Implement a CNN with Deeplearning4j
Last Updated: 08 Jan, 2025
Convolutional Neural Networks (CNNs) excel at automatically learning and extracting features from images, and this article provides a comprehensive guide on how to implement a CNN using Deeplearning4j (DL4J).
Implementing CNN with Deeplearning4j involves setting up the environment, constructing the model, training it, evaluating performance, and deploying the solution. DL4J’s seamless integration with JVM environments and powerful features like GPU acceleration make it a top choice for building scalable AI applications in Java.
Implementing Convolutional Neural Network in Deeplearning4j
By following the steps in this guide, you can effectively leverage DL4J to create Convolutional Neural Networks for diverse image-processing tasks.
Step 1: Setting Up the Deeplearning4j Environment
To start working with Deeplearning4j (DL4J), the first step is to establish your development environment. Follow these steps for a seamless setup process:
- Set Up Java: Ensure that the Java Development Kit (JDK) is installed on your system. You can download the latest version from the Oracle website.
- Install Maven: DL4J is typically managed using Maven, so make sure Maven is installed. Instructions for configuring Maven can be found in its official documentation.
- Create a Maven Project: Create a new Maven project in your IDE, or generate one from the command line with Maven's quickstart archetype.
- Add Dependencies: Open the pom.xml file in your project directory and add the dependencies for DL4J, ND4J, and DataVec shown below. These libraries are essential for building and training neural networks with DL4J.
XML
<dependencies>
<dependency>
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-core</artifactId>
<version>1.0.0-beta7</version>
</dependency>
<dependency>
<groupId>org.nd4j</groupId>
<artifactId>nd4j-native</artifactId>
<version>1.0.0-beta7</version>
</dependency>
<dependency>
<groupId>org.datavec</groupId>
<artifactId>datavec-api</artifactId>
<version>1.0.0-beta7</version>
</dependency>
</dependencies>
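The nd4j-native artifact runs computations on the CPU. If you have a compatible NVIDIA GPU and CUDA installed, you can swap it for the CUDA backend instead; the snippet below assumes CUDA 10.2 and the same 1.0.0-beta7 release, so adjust the artifact to match your setup.
XML
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2</artifactId>
    <version>1.0.0-beta7</version>
</dependency>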
Step 2: Building a CNN Architecture in Deeplearning4j
Once your environment is configured, you can proceed to build a Convolutional Neural Network (CNN) model. Below is an example of a basic CNN architecture:
Java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CNNExample {
    public static void main(String[] args) {
        int numClasses = 10;   // number of output classes (e.g., 10 digit classes)
        int inputHeight = 28;  // input image height
        int inputWidth = 28;   // input image width
        int channels = 1;      // number of input channels (1 for grayscale)

        MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
                .updater(new Nesterovs(0.01, 0.9)) // SGD with Nesterov momentum
                .list()
                .layer(0, new ConvolutionLayer.Builder(5, 5)
                        .nIn(channels)
                        .nOut(20)
                        .activation(Activation.RELU)
                        .build())
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                .layer(2, new ConvolutionLayer.Builder(5, 5)
                        .nOut(50)
                        .activation(Activation.RELU)
                        .build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                .layer(4, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX)
                        .nOut(numClasses)
                        .build())
                .setInputType(InputType.convolutional(inputHeight, inputWidth, channels))
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(config);
        model.init();
    }
}
Running this program builds and initializes the network; it does not print anything on its own, so do not expect a Keras-style layer table as output. To inspect the architecture, DL4J can print a summary that lists each layer with its input and output sizes and parameter counts.
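A minimal sketch of how to print that summary, assuming the line is added at the end of the main method above, after model.init():
Java
// Print a layer-by-layer summary: layer types, activation shapes and parameter counts
System.out.println(model.summary());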
Step 3: Training the CNN Model
To train the CNN you need a labelled image dataset. The network above expects 28x28 grayscale images, so a directory of MNIST-style digit images with one sub-folder per class works well (for CIFAR-10 you would change the input size to 32x32 with 3 channels). Here is one way to load the data and train the model:
Java
import java.io.File;
import java.io.IOException;
import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public void trainModel(MultiLayerNetwork model) throws IOException, InterruptedException {
    // Load images from a directory with one sub-folder per class; the folder name becomes the label
    FileSplit fileSplit = new FileSplit(new File("path/to/dataset"), NativeImageLoader.ALLOWED_FORMATS);
    ImageRecordReader recordReader = new ImageRecordReader(28, 28, 1, new ParentPathLabelGenerator());
    recordReader.initialize(fileSplit);

    // Wrap the record reader in an iterator: batch size 64, label at index 1, 10 classes
    DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, 64, 1, 10);

    // Train the model for one pass over the data
    model.fit(dataSetIterator);
}
Unlike Keras, DL4J does not print per-epoch progress bars or accuracy logs by default, so the fit() call above runs quietly apart from backend and logging messages. To monitor training, attach a listener to the model and loop over epochs explicitly, as sketched below.
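A short sketch of one way to log progress and train for several epochs; it assumes the model and dataSetIterator variables from trainModel above, and the ScoreIterationListener prints the current loss score every 10 iterations:
Java
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;

// Print the training score (loss) every 10 iterations
model.setListeners(new ScoreIterationListener(10));

int numEpochs = 10;
for (int epoch = 0; epoch < numEpochs; epoch++) {
    dataSetIterator.reset();     // rewind the iterator before each pass
    model.fit(dataSetIterator);  // one full pass over the training data
}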
Step 4: Evaluating Model Performance
After training the model, you can assess how well it performs by testing it with a separate dataset. Below is a basic assessment procedure:
Java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public void evaluateModel(MultiLayerNetwork model, DataSetIterator testData) {
    // Run the model on the test set and collect classification metrics
    Evaluation evaluation = model.evaluate(testData);
    System.out.println(evaluation.stats());
}
The stats() call prints overall accuracy together with precision, recall and F1 scores and a confusion matrix for the test set. The exact figures depend entirely on your dataset, the train/test split and the training run, so treat any numbers you see in tutorials as illustrative rather than expected results.
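If you need individual metrics programmatically rather than the printed report, the Evaluation object exposes them directly; a brief sketch using the evaluation variable from evaluateModel above:
Java
double accuracy = evaluation.accuracy();    // overall accuracy on the test set
double precision = evaluation.precision();  // precision averaged across classes
double recall = evaluation.recall();        // recall averaged across classes
double f1 = evaluation.f1();                // F1 score averaged across classes
System.out.printf("Accuracy: %.4f, F1: %.4f%n", accuracy, f1);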
Step 5: Deploying the CNN Model
Once the model has been trained and evaluated, it can be saved to disk and later reloaded for inference. Save it with ModelSerializer:
Java
import java.io.IOException;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;

public void saveModel(MultiLayerNetwork model) throws IOException {
    // The boolean flag controls whether the updater state is saved (needed to resume training later)
    ModelSerializer.writeModel(model, "path/to/save/model.zip", true);
}
To load the saved model back and use it for predictions:
Java
MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork("path/to/save/model.zip");
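Restoring the network only gives you the model object; to classify a new image you still need to load and preprocess it the same way as the training data. A minimal sketch, assuming the 28x28 grayscale setup from earlier steps and pixel scaling to the [0, 1] range (the image path is a placeholder):
Java
import java.io.File;
import org.datavec.image.loader.NativeImageLoader;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.factory.Nd4j;

MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork("path/to/save/model.zip");

// Load a single image as a 1x1x28x28 tensor and scale pixel values to [0, 1]
NativeImageLoader loader = new NativeImageLoader(28, 28, 1);
INDArray image = loader.asMatrix(new File("path/to/image.png"));
new ImagePreProcessingScaler(0, 1).transform(image);

// Forward pass: the output row holds one probability per class
INDArray output = model.output(image);
int predictedClass = Nd4j.argMax(output, 1).getInt(0);
System.out.println("Predicted class: " + predictedClass);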
Fine-Tuning and Optimizing the CNN
To improve model performance:
- Data Augmentation: Rotate, flip, or resize images to enrich your dataset.
- Regularization: Use dropout or weight decay (L2) to prevent overfitting; see the configuration sketch after this list.
- Hyperparameter Tuning: Experiment with learning rates, batch sizes, and architecture changes.
- Transfer Learning: Fine-tune pre-trained models with your dataset.
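A brief sketch of where some of these knobs go on the network configuration from Step 2 (same imports as that example). The network is trimmed to two layers to keep it short, and the specific values for the L2 strength, dropout rate and learning rate are illustrative, not tuned recommendations:
Java
MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
        .seed(123)                            // fixed seed for reproducible runs
        .l2(1e-4)                             // weight decay (L2 regularization)
        .updater(new Nesterovs(0.005, 0.9))   // smaller learning rate than before
        .list()
        .layer(0, new ConvolutionLayer.Builder(5, 5)
                .nIn(1)
                .nOut(20)
                .activation(Activation.RELU)
                .dropOut(0.5)                 // dropout (0.5 retain probability)
                .build())
        .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .activation(Activation.SOFTMAX)
                .nOut(10)
                .build())
        .setInputType(InputType.convolutional(28, 28, 1))
        .build();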