How to Implement a CNN with Deeplearning4j

Last Updated : 08 Jan, 2025

Convolutional Neural Networks (CNNs) excel at automatically learning and extracting features from images, and this article provides a comprehensive guide on how to implement a CNN using Deeplearning4j (DL4J).

Implementing a CNN with Deeplearning4j involves setting up the environment, constructing the model, training it, evaluating performance, and deploying the solution. DL4J's seamless integration with JVM environments and features such as GPU acceleration make it a strong choice for building scalable AI applications in Java.

Implementing a Convolutional Neural Network in Deeplearning4j

By following the steps in this guide, you can effectively leverage DL4J to create Convolutional Neural Networks for diverse image-processing tasks.

Step 1: Setting Up the Deeplearning4j Environment

To start working with Deeplearning4j (DL4J), the first step is to establish your development environment. Follow these steps for a seamless setup process:

  • Set Up Java: Ensure that the Java Development Kit (JDK) is installed on your system. You can download the latest version from the Oracle website.
  • Install Maven: DL4J is typically managed using Maven, so make sure Maven is installed. Instructions for configuring Maven can be found in its official documentation.
  • Create a Maven Project:
    • Set up a new folder for your project.
    • Open a terminal and execute the following command to generate a Maven project:
      mvn archetype:generate -DgroupId=com.example.cnn -DartifactId=CNNExample -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
  • Add Dependencies:
    • Open the pom.xml file in your project directory.
    • Add the necessary dependencies for DL4J, ND4J, and DataVec to your pom.xml file. These libraries are essential for building and training neural networks with DL4J.
XML
<dependencies>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-core</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.datavec</groupId>
        <artifactId>datavec-api</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
</dependencies>
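
The nd4j-native backend runs training on the CPU. If you want the GPU acceleration mentioned above, you can swap it for one of ND4J's CUDA backends instead. The snippet below is one option for the 1.0.0-beta7 release and assumes CUDA 10.2 is installed on your machine; pick the artifact that matches your CUDA version.

XML
<!-- Optional: replaces nd4j-native to train on the GPU (assumes CUDA 10.2 is installed) -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2</artifactId>
    <version>1.0.0-beta7</version>
</dependency>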

Step 2: Building a CNN Architecture in Deeplearning4j

Once your environment is configured, you can proceed to build a Convolutional Neural Network (CNN) model. Below is an example of a basic CNN architecture:

Java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CNNExample {
    public static void main(String[] args) {
        int numClasses = 10;  // e.g., 10 digit classes for MNIST
        int inputHeight = 28; // input image height
        int inputWidth = 28;  // input image width
        int channels = 1;     // number of input channels (1 for grayscale)

        MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
                .updater(new Nesterovs(0.01, 0.9)) // SGD with Nesterov momentum
                .list()
                // First convolution: 5x5 kernels, 20 feature maps
                .layer(0, new ConvolutionLayer.Builder(5, 5)
                        .nIn(channels)
                        .nOut(20)
                        .activation(Activation.RELU)
                        .build())
                // 2x2 max pooling halves the spatial dimensions
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                // Second convolution: 5x5 kernels, 50 feature maps
                .layer(2, new ConvolutionLayer.Builder(5, 5)
                        .nOut(50)
                        .activation(Activation.RELU)
                        .build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                // Softmax output layer for classification
                .layer(4, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX)
                        .nOut(numClasses)
                        .build())
                // Lets DL4J infer nIn values and flatten between the conv and output layers
                .setInputType(InputType.convolutional(inputHeight, inputWidth, channels))
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(config);
        model.init();
    }
}

After model.init(), you can print a layer-by-layer summary of the network (layer types, input and output shapes, and parameter counts) with model.summary().
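
As a quick sanity check, the short sketch below assumes it is placed right after model.init() in the example above; it prints that summary and attaches a listener that logs the training score every 10 minibatches.

Java
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;

// Log the loss score every 10 minibatches during training
model.setListeners(new ScoreIterationListener(10));

// Print layer types, output shapes, and parameter counts
System.out.println(model.summary());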

Step 3: Training the CNN Model

To train the CNN you need a labeled image dataset, for example MNIST or CIFAR-10; just make sure the image height, width, and channel count match the values used in the network configuration (28x28 grayscale in the example above). The following method loads images from a directory with DataVec and trains the model:

Java
import java.io.File;
import java.io.IOException;

import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

public void trainModel(MultiLayerNetwork model) throws IOException, InterruptedException {
    int batchSize = 64;
    int numClasses = 10;

    // Load images from a directory tree where each subfolder name is a class label
    FileSplit fileSplit = new FileSplit(new File("path/to/dataset"), NativeImageLoader.ALLOWED_FORMATS);
    ImageRecordReader recordReader = new ImageRecordReader(28, 28, 1, new ParentPathLabelGenerator());
    recordReader.initialize(fileSplit);

    // Wrap the record reader in an iterator that produces minibatches of 64 images
    DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, batchSize, 1, numClasses);

    // Scale pixel values from [0, 255] to [0, 1]
    dataSetIterator.setPreProcessor(new ImagePreProcessingScaler(0, 1));

    // Train the model with one pass over the data
    model.fit(dataSetIterator);
}

DL4J does not print Keras-style progress bars; with the ScoreIterationListener attached as shown earlier, progress is reported as the loss score every few iterations. To train for several epochs, either call model.fit(dataSetIterator) once per epoch or use the fit(iterator, numEpochs) overload.
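
If you prefer explicit control over the training loop, a minimal sketch (continuing the trainModel method above, with an arbitrary epoch count) could look like this:

Java
int numEpochs = 10; // hypothetical value; tune for your dataset

for (int epoch = 0; epoch < numEpochs; epoch++) {
    model.fit(dataSetIterator);  // one full pass over the training data
    dataSetIterator.reset();     // rewind the iterator for the next epoch
    System.out.println("Finished epoch " + (epoch + 1) + ", last minibatch score = " + model.score());
}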

Step 4: Evaluating Model Performance

After training, you can assess how well the model generalizes by evaluating it on a separate test dataset. Below is a basic evaluation routine:

Java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public void evaluateModel(MultiLayerNetwork model, DataSetIterator testData) {
    // Runs the model over the test set and collects classification metrics
    Evaluation evaluation = model.evaluate(testData);

    // Prints accuracy, precision, recall, F1 score, and the confusion matrix
    System.out.println(evaluation.stats());
}

The stats() call prints an evaluation block with overall accuracy, precision, recall, and F1 score, a per-class breakdown, and the confusion matrix, so you can see at a glance which classes the network confuses most often.
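
To call evaluateModel you need a DataSetIterator over a held-out test set, built the same way as the training iterator. The sketch below reuses the imports from the training example and assumes a hypothetical path/to/testset directory laid out like the training data; it also shows how to read individual metrics from the Evaluation object.

Java
// Build a test iterator exactly like the training one (hypothetical test directory)
FileSplit testSplit = new FileSplit(new File("path/to/testset"), NativeImageLoader.ALLOWED_FORMATS);
ImageRecordReader testReader = new ImageRecordReader(28, 28, 1, new ParentPathLabelGenerator());
testReader.initialize(testSplit);
DataSetIterator testData = new RecordReaderDataSetIterator(testReader, 64, 1, 10);
testData.setPreProcessor(new ImagePreProcessingScaler(0, 1));

// Individual metrics are also available programmatically
Evaluation evaluation = model.evaluate(testData);
System.out.println("Accuracy:  " + evaluation.accuracy());
System.out.println("Precision: " + evaluation.precision());
System.out.println("Recall:    " + evaluation.recall());
System.out.println("F1 score:  " + evaluation.f1());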

Step 5: Deploying the CNN Model

Once the model has been trained and evaluated, it can be saved to disk and deployed for inference. Save the model with the code below:

Java
import java.io.IOException;

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;

public void saveModel(MultiLayerNetwork model) throws IOException {
    // The final flag also saves the updater state so training can be resumed later
    ModelSerializer.writeModel(model, "path/to/save/model.zip", true);
}

To load the saved model back for predictions:

Java
MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork("path/to/save/model.zip");
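
From there, a single image can be classified by loading it into an INDArray and calling model.output(). The sketch below is a fragment meant to live inside a method that declares IOException; it assumes a hypothetical path/to/test-image.png and the same 28x28 grayscale preprocessing used during training.

Java
import java.io.File;

import org.datavec.image.loader.NativeImageLoader;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.factory.Nd4j;

// Load and preprocess one image the same way the training data was processed
NativeImageLoader loader = new NativeImageLoader(28, 28, 1);
INDArray image = loader.asMatrix(new File("path/to/test-image.png"));
new ImagePreProcessingScaler(0, 1).transform(image);

// Forward pass: one softmax probability per class
INDArray probabilities = model.output(image);
int predictedClass = Nd4j.argMax(probabilities, 1).getInt(0);
System.out.println("Predicted class: " + predictedClass);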

Fine-Tuning and Optimizing the CNN

To improve model performance:

  1. Data Augmentation: Rotate, flip, or resize images to enrich your dataset.
  2. Regularization: Use dropout or L2 weight decay to prevent overfitting (a configuration sketch follows this list).
  3. Hyperparameter Tuning: Experiment with learning rates, batch sizes, and architecture changes.
  4. Transfer Learning: Fine-tune pre-trained models with your dataset.
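
As an illustration of points 2 and 3, the builder from Step 2 can be extended with L2 weight decay, dropout, and a different learning rate. The sketch below reuses the Step 2 imports plus org.deeplearning4j.nn.conf.layers.DenseLayer; the specific values are arbitrary starting points, not tuned recommendations.

Java
MultiLayerConfiguration regularizedConfig = new NeuralNetConfiguration.Builder()
        .seed(123)                          // fixed seed for reproducible experiments
        .l2(1e-4)                           // L2 weight decay applied to all layers
        .updater(new Nesterovs(0.005, 0.9)) // smaller learning rate than before
        .list()
        .layer(0, new ConvolutionLayer.Builder(5, 5)
                .nIn(1)
                .nOut(20)
                .activation(Activation.RELU)
                .build())
        .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
        .layer(2, new DenseLayer.Builder()
                .nOut(100)
                .activation(Activation.RELU)
                .dropOut(0.5)               // DL4J's dropOut value is the probability of keeping an activation
                .build())
        .layer(3, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .activation(Activation.SOFTMAX)
                .nOut(10)
                .build())
        .setInputType(InputType.convolutional(28, 28, 1))
        .build();

For data augmentation (point 1), DataVec provides ImageTransform implementations such as FlipImageTransform and RotateImageTransform in org.datavec.image.transform; an ImageTransform can be supplied to the ImageRecordReader so that images are augmented as they are loaded.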
