How to Implement a CNN with Deeplearning4j

Last Updated : 08 Jan, 2025

Convolutional Neural Networks (CNNs) excel at automatically learning and extracting features from images, and this article provides a comprehensive guide on how to implement a CNN using Deeplearning4j (DL4J).

Implementing a CNN with Deeplearning4j involves setting up the environment, constructing the model, training it, evaluating performance, and deploying the solution. DL4J's seamless integration with JVM environments and features such as GPU acceleration make it a strong choice for building scalable AI applications in Java.

Implementing a Convolutional Neural Network in Deeplearning4j

By following the steps in this guide, you can effectively leverage DL4J to create Convolutional Neural Networks for diverse image-processing tasks.

Step 1: Setting Up the Deeplearning4j Environment

To start working with Deeplearning4j (DL4J), the first step is to establish your development environment. Follow these steps for a seamless setup process:

  • Set Up Java: Ensure that the Java Development Kit (JDK) is installed on your system. You can download the latest version from the Oracle website.
  • Install Maven: DL4J is typically managed using Maven, so make sure Maven is installed. Instructions for configuring Maven can be found in its official documentation.
  • Create a Maven Project:
    • Set up a new folder for your project.
    • Open a terminal and execute the following command to generate a Maven project:
      mvn archetype:generate -DgroupId=com.example.cnn -DartifactId=CNNExample -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
  • Add Dependencies:
    • Open the pom.xml file in your project directory.
    • Add the necessary dependencies for DL4J, ND4J, and DataVec to your pom.xml file. These libraries are essential for building and training neural networks with DL4J.
XML
<dependencies>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-core</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.datavec</groupId>
        <artifactId>datavec-api</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
</dependencies>
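
The nd4j-native backend runs training on the CPU. If you want the GPU acceleration mentioned above, you can swap it for one of ND4J's CUDA backends instead. The snippet below is one option for the 1.0.0-beta7 release and assumes CUDA 10.2 is installed on your machine; pick the artifact that matches your CUDA version.

XML
<!-- Optional: replaces nd4j-native to train on the GPU (assumes CUDA 10.2 is installed) -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2</artifactId>
    <version>1.0.0-beta7</version>
</dependency>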

Step 2: Building a CNN Architecture in Deeplearning4j

Once your environment is configured, you can proceed to build a Convolutional Neural Network (CNN) model. Below is an example of a basic CNN architecture:

Java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CNNExample {
    public static void main(String[] args) {
        int numClasses = 10;  // e.g., 10 digit classes for MNIST
        int inputHeight = 28; // input image height
        int inputWidth = 28;  // input image width
        int channels = 1;     // number of input channels (1 for grayscale)

        MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
                .updater(new Nesterovs(0.01, 0.9)) // SGD with Nesterov momentum
                .list()
                // First convolution: 5x5 kernels, 20 feature maps
                .layer(0, new ConvolutionLayer.Builder(5, 5)
                        .nIn(channels)
                        .nOut(20)
                        .activation(Activation.RELU)
                        .build())
                // 2x2 max pooling halves the spatial dimensions
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                // Second convolution: 5x5 kernels, 50 feature maps
                .layer(2, new ConvolutionLayer.Builder(5, 5)
                        .nOut(50)
                        .activation(Activation.RELU)
                        .build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                // Softmax output layer for classification
                .layer(4, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX)
                        .nOut(numClasses)
                        .build())
                // Lets DL4J infer nIn values and flatten between the conv and output layers
                .setInputType(InputType.convolutional(inputHeight, inputWidth, channels))
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(config);
        model.init();
    }
}

After model.init(), you can print a layer-by-layer summary of the network (layer types, input and output shapes, and parameter counts) with model.summary().
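
As a quick sanity check, the short sketch below assumes it is placed right after model.init() in the example above; it prints that summary and attaches a listener that logs the training score every 10 minibatches.

Java
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;

// Log the loss score every 10 minibatches during training
model.setListeners(new ScoreIterationListener(10));

// Print layer types, output shapes, and parameter counts
System.out.println(model.summary());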

Step 3: Training the CNN Model

To train the CNN you need a labeled image dataset, for example MNIST or CIFAR-10; just make sure the image height, width, and channel count match the values used in the network configuration (28x28 grayscale in the example above). The following method loads images from a directory with DataVec and trains the model:

Java
import java.io.File;
import java.io.IOException;

import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

public void trainModel(MultiLayerNetwork model) throws IOException, InterruptedException {
    int batchSize = 64;
    int numClasses = 10;

    // Load images from a directory tree where each subfolder name is a class label
    FileSplit fileSplit = new FileSplit(new File("path/to/dataset"), NativeImageLoader.ALLOWED_FORMATS);
    ImageRecordReader recordReader = new ImageRecordReader(28, 28, 1, new ParentPathLabelGenerator());
    recordReader.initialize(fileSplit);

    // Wrap the record reader in an iterator that produces minibatches of 64 images
    DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, batchSize, 1, numClasses);

    // Scale pixel values from [0, 255] to [0, 1]
    dataSetIterator.setPreProcessor(new ImagePreProcessingScaler(0, 1));

    // Train the model with one pass over the data
    model.fit(dataSetIterator);
}

DL4J does not print Keras-style progress bars; with the ScoreIterationListener attached as shown earlier, progress is reported as the loss score every few iterations. To train for several epochs, either call model.fit(dataSetIterator) once per epoch or use the fit(iterator, numEpochs) overload.
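
If you prefer explicit control over the training loop, a minimal sketch (continuing the trainModel method above, with an arbitrary epoch count) could look like this:

Java
int numEpochs = 10; // hypothetical value; tune for your dataset

for (int epoch = 0; epoch < numEpochs; epoch++) {
    model.fit(dataSetIterator);  // one full pass over the training data
    dataSetIterator.reset();     // rewind the iterator for the next epoch
    System.out.println("Finished epoch " + (epoch + 1) + ", last minibatch score = " + model.score());
}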

Step 4: Evaluating Model Performance

After training, you can assess how well the model generalizes by evaluating it on a separate test dataset. Below is a basic evaluation routine:

Java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public void evaluateModel(MultiLayerNetwork model, DataSetIterator testData) {
    // Runs the model over the test set and collects classification metrics
    Evaluation evaluation = model.evaluate(testData);

    // Prints accuracy, precision, recall, F1 score, and the confusion matrix
    System.out.println(evaluation.stats());
}

The stats() call prints an evaluation block with overall accuracy, precision, recall, and F1 score, a per-class breakdown, and the confusion matrix, so you can see at a glance which classes the network confuses most often.
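
To call evaluateModel you need a DataSetIterator over a held-out test set, built the same way as the training iterator. The sketch below reuses the imports from the training example and assumes a hypothetical path/to/testset directory laid out like the training data; it also shows how to read individual metrics from the Evaluation object.

Java
// Build a test iterator exactly like the training one (hypothetical test directory)
FileSplit testSplit = new FileSplit(new File("path/to/testset"), NativeImageLoader.ALLOWED_FORMATS);
ImageRecordReader testReader = new ImageRecordReader(28, 28, 1, new ParentPathLabelGenerator());
testReader.initialize(testSplit);
DataSetIterator testData = new RecordReaderDataSetIterator(testReader, 64, 1, 10);
testData.setPreProcessor(new ImagePreProcessingScaler(0, 1));

// Individual metrics are also available programmatically
Evaluation evaluation = model.evaluate(testData);
System.out.println("Accuracy:  " + evaluation.accuracy());
System.out.println("Precision: " + evaluation.precision());
System.out.println("Recall:    " + evaluation.recall());
System.out.println("F1 score:  " + evaluation.f1());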

Step 5: Deploying the CNN Model

Once the model has been trained and evaluated, it can be saved to disk and deployed for inference. Save the model with the code below:

Java
import java.io.IOException;

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;

public void saveModel(MultiLayerNetwork model) throws IOException {
    // The final flag also saves the updater state so training can be resumed later
    ModelSerializer.writeModel(model, "path/to/save/model.zip", true);
}

To load the saved model back for predictions:

Java
MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork("path/to/save/model.zip");
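
From there, a single image can be classified by loading it into an INDArray and calling model.output(). The sketch below is a fragment meant to live inside a method that declares IOException; it assumes a hypothetical path/to/test-image.png and the same 28x28 grayscale preprocessing used during training.

Java
import java.io.File;

import org.datavec.image.loader.NativeImageLoader;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.factory.Nd4j;

// Load and preprocess one image the same way the training data was processed
NativeImageLoader loader = new NativeImageLoader(28, 28, 1);
INDArray image = loader.asMatrix(new File("path/to/test-image.png"));
new ImagePreProcessingScaler(0, 1).transform(image);

// Forward pass: one softmax probability per class
INDArray probabilities = model.output(image);
int predictedClass = Nd4j.argMax(probabilities, 1).getInt(0);
System.out.println("Predicted class: " + predictedClass);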

Fine-Tuning and Optimizing the CNN

To improve model performance:

  1. Data Augmentation: Rotate, flip, or resize images to enrich your dataset.
  2. Regularization: Use dropout or L2 weight decay to prevent overfitting (a configuration sketch follows this list).
  3. Hyperparameter Tuning: Experiment with learning rates, batch sizes, and architecture changes.
  4. Transfer Learning: Fine-tune pre-trained models with your dataset.
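
As an illustration of points 2 and 3, the builder from Step 2 can be extended with L2 weight decay, dropout, and a different learning rate. The sketch below reuses the Step 2 imports plus org.deeplearning4j.nn.conf.layers.DenseLayer; the specific values are arbitrary starting points, not tuned recommendations.

Java
MultiLayerConfiguration regularizedConfig = new NeuralNetConfiguration.Builder()
        .seed(123)                          // fixed seed for reproducible experiments
        .l2(1e-4)                           // L2 weight decay applied to all layers
        .updater(new Nesterovs(0.005, 0.9)) // smaller learning rate than before
        .list()
        .layer(0, new ConvolutionLayer.Builder(5, 5)
                .nIn(1)
                .nOut(20)
                .activation(Activation.RELU)
                .build())
        .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
        .layer(2, new DenseLayer.Builder()
                .nOut(100)
                .activation(Activation.RELU)
                .dropOut(0.5)               // DL4J's dropOut value is the probability of keeping an activation
                .build())
        .layer(3, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .activation(Activation.SOFTMAX)
                .nOut(10)
                .build())
        .setInputType(InputType.convolutional(28, 28, 1))
        .build();

For data augmentation (point 1), DataVec provides ImageTransform implementations such as FlipImageTransform and RotateImageTransform in org.datavec.image.transform; an ImageTransform can be supplied to the ImageRecordReader so that images are augmented as they are loaded.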
