A Deep Learning Framework For Smart Agriculture Real-Time Weed Classification Using Convolutional Neural Network
J. Smart Sens. Comput., 2025, 1, 25205
Department of Electronics and Telecommunication, Sinhgad Institute of Technology, Lonavala, 410401, Maharashtra, India
*Email: [email protected] (S. S. Salve)
Abstract
The agricultural sector, as the foundation of food supply and raw material production, contributes significantly to GDP
growth and the value chain. Effective weed elimination is therefore essential in modern agriculture, as the current world
scenario demands efficient and resourceful ways of cultivating and harvesting crops. The urgent need to eliminate
weeds arises from their tendency to extract the essential minerals and moisture that crops require for
proper growth. The main objective of this study is to acquire a live video feed as input and classify it into the
categories of crop, weed and none. Upon detection of a weed, a spraying mechanism releases a pre-determined
amount of herbicide onto it. A total of 5471 image samples were captured to train the CNN model. The prototype
presented in this paper uses a Convolutional Neural Network (CNN) for feature extraction and Fully Connected Layers
or Dense Layers (FCLs) for classification, with SoftMax as the output activation function; the ReLU activation is also
used to remove all negative (less significant) values. A comprehensive comparison was also made between the
CNN and YOLOv4 techniques, and the performance parameters of both were evaluated. The CNN technique achieved an accuracy
of 95.50% whereas YOLOv4 achieved 91.00%; the F1 scores were evaluated to be 96.25% and 91.96% respectively.
Compared to existing models, our prototype demonstrated higher accuracy and real-time adaptability in field conditions,
proving suitable for autonomous weed management systems. Unlike earlier systems that depended mostly on stored images
or fixed datasets, our approach stands out by using a live video feed to identify weeds in real time. It is built on a mobile
platform that can automatically spray herbicides, making precision farming possible without the need for constant human
supervision.
Keywords: Computer vision; Convolutional Neural Network; Deep learning; SoftMax; Max pooling; Weed detection; Image
pre-processing.
Received: 05 April 2025; Revised: 15 May 2025; Accepted: 27 May 2025; Published Online: 30 May 2025.
DOI: https://doi.org/10.64189/ssc.25205
© The Author(s) 2025
This article is licensed under Creative Commons Attribution Non Commercial 4.0 International (CC-BY-NC 4.0).
conventional farming practices. Modern technology has made it possible for autonomous machines to carry out agricultural tasks effectively. High-quality crops can be produced with little human labor when robotics and intelligent machinery are integrated into agriculture.[3,4]

A weed detection system uses machine learning algorithms to identify unwanted plants in an agricultural field. Farmers can thereby reduce their use of herbicides, which can be harmful to the environment and public health. Plans for targeted weed control can be created by utilizing the information on weed types that the detection system can supply.[5] A new technology that has the potential to completely transform agriculture is machine learning-based weed detection. The system's purpose is to gather live videos and photos of a field, apply machine learning techniques to them, and locate and identify the weeds so that farmers can take specific action to remove them. Numerous methods, such as object detection, feature extraction, segmentation, and classification, can be used to complete this process. We decided to use a live-feed CNN technique to address this problem, analyzing the input dataset to find the weeds.[5,6]

Weeds within rows might not be accurately removed by conventional machinery. Sunil G C et al. emphasized, while introducing their study, that herbicide sprayed uniformly across the field at a set rate, treating weeds and crops alike, is less feasible than site-specific herbicide application, as blanket applications may have a more negative impact on the ecosystem. As a result, applying a herbicide selectively to areas of concern may improve precision while lowering input costs and environmental problems.

Umamaheswari S et al.[7] noted that the field of robotic farming and precision agriculture needs to advance in response to current problems: the lack of agricultural labour and resources, the emergence of new crop diseases, and weeds. The issues of climate change and sustainable agriculture are intimately tied to the challenge of effective weed classification and detection. According to various resources and findings cited in that study, existing species may be exposed to new and hybrid weeds as a result of climate change. Because weeds can hinder the growth of farm crops, it is crucial to create new technologies that aid in identifying them. Identifying weeds can also help remove them, which lowers the need for pesticides and offers effective substitutes when the crops are harvested.

O. M. Olaniyi et al.[8] discussed the various ways of eliminating weeds: as people have become more aware and knowledgeable about weeds, experts have been looking for ways to eradicate this infamous pest with the least amount of harm to the plant. The three main strategies for controlling weeds are cultural, chemical, and automated approaches. Bush fallowing, mulching, fire clearance, early flooding, hand weeding, shifting crops, and maintaining a clean reaper are all components of the cultural approach to weed management. This approach has significant labour costs and drawbacks. Applying herbicides is thought to be a significant alternative to hand weeding. However, excessive herbicide use can result in harvest losses, harm to the environment, high production costs, and the development of herbicide resistance. Without ever reaching the weeds, some of these pesticides even wind up on the soil and food crops. Since spraying food crops is viewed as a risk to the safety of the food being consumed, a thorough weed control method is necessary.

On the other hand, as specified by P. Kavitha Reddy et al.,[5] deep learning techniques, particularly those that use neural networks, have become increasingly popular in recent years. These methods use big datasets of tagged images to train intricate neural network models. The neural network automatically collects pertinent information and classifies the input photos using iterative learning procedures. The YOLO algorithm is a well-known implementation of the convolutional neural network (CNN), which is the foundation of deep learning techniques in computer vision (CV).

In this paper, a low-cost, robust, live-video-based weed detection and elimination system with automated spraying is presented, using a Convolutional Neural Network (CNN) as the main computing algorithm, SoftMax and ReLU as activation functions, and Fully Connected Layers (FCLs) for classification, along with a detailed comparison of YOLOv4 with the proposed method.

The major reason why CNN was selected is its ability to focus on fine-grained feature learning, which is especially useful in identifying small or overlapping weed patterns. YOLOv4 was chosen for comparison due to its real-time detection speed. Other models like Faster R-CNN or ViT were not used due to higher computational demands unsuitable for edge deployment on a Raspberry Pi 4. The two activation functions SoftMax and ReLU were selected for their simplicity, speed, and established use in CNN architectures. Alternatives like Swish or Leaky ReLU can improve performance but require higher computational cost and tuning.

2. Materials and methods
This section describes the materials and design required for the successful development of the proposed system. A detailed overview of the components, the methodology utilized and other specifications is given here. The system prototype integrates the Internet of Things (IoT) with image processing, feature extraction, a deep learning algorithm for identification, and a precision spraying unit.

2.1 System overview
The proposed system is implemented using a Convolutional Neural Network to develop a robust, scalable and versatile weed detection system that produces accurate results in real time using a live video feed via a webcam. The input dataset then goes through various processes, and at the end the result is determined as one of three particular classes, i.e., i) weed, ii) crop, iii) none. The various processes include image acquisition, feature extraction, classification and training of the model.

A generalized block diagram is represented as Fig. 1, which gives an idea of the actual flow of the components within the proposed system and their particular tasks in the execution. The proposed prototype contains various components mounted on a robust wooden platform which are powered by a 12V DC adapter. The main microcontroller unit, i.e., the Raspberry Pi 4 Model B, is powered by a 5V USB-C type charger. The output can be observed on a desktop monitor via connection with an HDMI cable. Fig. 2 shows the stage-by-stage deployment and implementation of the CNN based weed detection system using Max Pooling, ReLU, Dropout, Fully Connected Layers (FCLs) and SoftMax for multiple stages of detection and processing of the input dataset.[9,10]

2.2 Working principle
2.2.1 Hardware
The "A Deep Learning Framework for Smart Agriculture: Real-Time Weed Classification Using CNN" prototype uses a robust and sturdy navigable platform that enables the system to be mounted on a hard-bound wooden base with a four-wheel chassis. The two forward wheels are attached to two 12V DC geared motors of 300 r.p.m. each, and the two rear wheels are attached as dummy wheels for support. 12V DC geared motors are used because they support a heavier load, in this case the wooden platform. These motors are then connected to an L293D module. The L293D is a motor driver module widely used in embedded systems to control the direction of DC motors and stepper motors. This module is capable of driving two DC motors independently in both forward and reverse direction. This adds precision and control to the whole system and grants mobility across the field. Both the L293D module and the DC motors are powered using a 12V DC power supply. The L293D is also interfaced with the Raspberry Pi 4 Model B as the master control unit.

A Bluetooth module, i.e., the HC-04, is also interfaced with the microcontroller for controlling the directions provided by the motor driver module. This Bluetooth module supports V2.0+EDR (Enhanced Data Rate) with up to 3 Mbps modulation along with a 2.4 GHz radio transceiver and baseband. A Python program compiled and executed by the microcontroller enables the user to connect to the Bluetooth module using the application "Serial Bluetooth Terminal", where the user can give commands in the form of numbers to specify movement in a specific direction (i.e., 1 = forward, 2 = reverse, 3 = left, 4 = right, 5 = terminate).
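The command handling described above can be pictured with a short Python sketch. This is a minimal illustration only: the BCM pin numbers for the L293D inputs, the serial device path and the baud rate are assumptions made for the example, not values taken from the prototype.

# Minimal sketch: drive the L293D from numeric commands received over the
# HC-04 serial link (1 = forward, 2 = reverse, 3 = left, 4 = right, 5 = terminate).
# Pin numbers, serial device and baud rate are illustrative assumptions.
import serial                 # pyserial
import RPi.GPIO as GPIO

IN1, IN2, IN3, IN4 = 17, 27, 23, 24     # L293D inputs (BCM numbering, assumed wiring)

GPIO.setmode(GPIO.BCM)
GPIO.setup([IN1, IN2, IN3, IN4], GPIO.OUT, initial=GPIO.LOW)

def drive(left_fwd, left_rev, right_fwd, right_rev):
    """Set the four L293D input pins in one call."""
    GPIO.output(IN1, left_fwd)
    GPIO.output(IN2, left_rev)
    GPIO.output(IN3, right_fwd)
    GPIO.output(IN4, right_rev)

ACTIONS = {
    "1": lambda: drive(1, 0, 1, 0),   # forward
    "2": lambda: drive(0, 1, 0, 1),   # reverse
    "3": lambda: drive(0, 0, 1, 0),   # left (one simple scheme: right motor only)
    "4": lambda: drive(1, 0, 0, 0),   # right (left motor only)
    "5": lambda: drive(0, 0, 0, 0),   # terminate: stop both motors
}

bt = serial.Serial("/dev/serial0", 9600, timeout=1)   # UART link to the HC-04 (assumed)
try:
    while True:
        cmd = bt.readline().decode("ascii", errors="ignore").strip()
        if cmd in ACTIONS:
            ACTIONS[cmd]()
        if cmd == "5":
            break
finally:
    GPIO.cleanup()
    bt.close()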
A single-channel relay module is also used and interfaced with the microcontroller in order to control the fluid pump inside the sprayer prototype. It is rated for switching up to 10A at 250V AC or 24V DC, and it is powered from the same 12V DC supply used for the motors and driver module. When the relay input (IN) is driven LOW, the relay coil energizes and switches the contact from the normally closed (NC) point to the normally open (NO) point. This action effectively turns the connected device (i.e., the fluid pump) on or off at specified intervals upon weed detection.

The camera module used in the developed system is the Xiaomi Mi HD USB 2.0 Web-Cam. It can capture live video feeds up to a resolution of 1280×720p HD at a frame rate of 30 FPS. With up to a 90° wide-angle field of view and no driver requirements, it is compatible with the Raspberry Pi 4 microcontroller. The OpenCV library helps the prototype capture and process the live feed of the input dataset for image pre-processing.
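Putting the webcam, the classifier and the relay together, the capture-classify-spray loop could look roughly like the following Python sketch. Only the use of OpenCV for capture and the active-LOW relay behaviour follow the description above; the model file name, network input size, class ordering, spray duration and relay pin are illustrative assumptions.

# Minimal sketch of the capture-classify-spray loop: grab frames from the USB
# webcam with OpenCV, classify each frame as crop / weed / none, and pull the
# relay input LOW to run the pump when a weed is detected.
import time
import cv2
import numpy as np
import RPi.GPIO as GPIO
from tensorflow.keras.models import load_model

RELAY_PIN = 21                        # relay IN pin (active LOW, assumed wiring)
CLASSES = ["crop", "weed", "none"]    # assumed output order of the trained model

GPIO.setmode(GPIO.BCM)
GPIO.setup(RELAY_PIN, GPIO.OUT, initial=GPIO.HIGH)    # HIGH = relay off

model = load_model("weed_cnn.h5")     # hypothetical file name for the trained CNN
cap = cv2.VideoCapture(0)             # the USB webcam
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Pre-process: resize to the network input size and scale to [0, 1].
        x = cv2.resize(frame, (128, 128)).astype("float32") / 255.0
        probs = model.predict(x[np.newaxis, ...], verbose=0)[0]
        label = CLASSES[int(np.argmax(probs))]
        if label == "weed":
            GPIO.output(RELAY_PIN, GPIO.LOW)     # energize relay -> pump on
            time.sleep(1.0)                      # spray a pre-determined burst
            GPIO.output(RELAY_PIN, GPIO.HIGH)    # pump off
finally:
    cap.release()
    GPIO.cleanup()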
The Raspberry Pi 4 Model B microcontroller acts as the heart and brains of the system. It is basically a card-sized mini-computer that operates using its own software, performing tasks that an actual desktop can perform independently, including browsing, media playback and major IoT development. It has 4 GB of LPDDR4-3200 SDRAM and a microSD card slot that holds the actual controller software. It provides four USB ports, two with USB 3.0 capability and the other two with USB 2.0 support. Two micro-HDMI slots are provided for interfacing with external display peripherals supporting resolutions up to 4K at 60 FPS. The power supply is provided via a 5V DC USB-C type connector, and the ambient operating temperature range is 0 °C to 50 °C.

Fig. 3 gives a glimpse of the proposed system that was built for the comprehensive study of both algorithms. The image gives a clear idea of all the particular components and their placement in the model.

2.3 Software
For the proposed model in this study, Debian GNU/Linux 10 (buster) has been installed on the Raspberry Pi 4 Model B as its operating system, and Python IDLE is used to script and execute the Python code for implementation and training of the CNN and YOLOv4 models.
The ReLU activation function has proven its efficiency and reliability even after consideration of its drawbacks.

Max Pooling and Dropout are also used: Max Pooling reduces the spatial dimensions of the image while preserving the essential features, and Dropout reduces overfitting by randomly setting 25% of the neurons to zero during the training procedure. A window of 2×2 size moves over the feature map, keeping only the maximum value from each window. This particularly contributes to reducing computational complexity.

Max Pooling in a CNN is basically a down-sampling technique which proves extremely beneficial in reducing the spatial features and dimensions of an input volume. It is non-linear in nature, which serves better efficiency and reduced computational power. It operates independently on each depth slice of the input image and resizes it spatially. It involves sliding a window, called a kernel, of size 2×2 across the input data and keeping only the maximum value from each frame. Fig. 7 shows the same using a set of sample values.

Fig. 7: Max pooling in CNN.

These maximum values then each constitute a single pixel in the newly pooled output. The 2×2 window that moves over the input image follows a particular stride of a certain number of pixels. When this process is repeated over the whole input, it produces an output image of almost half the original size in each dimension, effectively reducing the number of pixels by 75%.[15]
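As a concrete illustration of the operation described above, the following NumPy snippet applies 2×2 max pooling with stride 2 to a small sample feature map; it is a minimal sketch for illustration, not code from the prototype.

# 2x2 max pooling with stride 2 on a single-channel feature map (NumPy only).
import numpy as np

def max_pool_2x2(fmap):
    """Keep the maximum of every non-overlapping 2x2 block."""
    h, w = fmap.shape
    blocks = fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 6, 8]])
print(max_pool_2x2(fmap))
# [[6 4]
#  [7 9]]  -> a 4x4 map becomes 2x2, i.e. 75% fewer pixels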
While training a neural network, the model might learn not only the general pattern but also noise and specific ungeneralised details that do not transfer to unseen data. This overfitting might give higher accuracy while training the model on the dataset but will produce low accuracy in the testing procedures, thus leaving a large gap between the training and testing accuracy. The Dropout technique is effective in such cases: during the training process it randomly removes a small fraction of neurons in the network, in our case 25%, 50% and 80% for different layers, so the dropout rates were set at 0.25, 0.5 and 0.8 respectively.

In mathematical terms,[17] a mask is applied to a set of neurons according to the percentage of dropout applied during the training period. At each step a mask matrix is generated where each entry is a binary variable, i.e., 0 or 1, indicating whether the neuron is dropped or not.

y = W · (M ⊙ x)   (2)

where,
x = input to a layer
W = weight matrix for the particular layer
M = mask matrix
⊙ = element-wise product

With dropout, the mask matrix is applied such that each element of M is '0' with probability p and '1' with probability 1 – p. During testing the dropout is switched off, but the weights are scaled by 1 – p to account for the neurons that were dropped during the training process.
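The masking in Equation (2) and the test-time scaling can be illustrated with a few lines of NumPy. This is a minimal sketch under the formulation above; the layer sizes and random values are arbitrary.

# Dropout as a binary mask (Equation (2)): y = W · (M ⊙ x).
import numpy as np

rng = np.random.default_rng(0)
p = 0.25                       # dropout rate (first dropout layer in the text)

x = rng.normal(size=8)         # input to the layer
W = rng.normal(size=(4, 8))    # layer weight matrix

# Training: each mask entry is 0 with probability p, 1 with probability 1 - p.
M = (rng.random(x.shape) >= p).astype(x.dtype)
y_train = W @ (M * x)

# Testing: dropout is switched off and the weights are scaled by 1 - p.
y_test = (W * (1 - p)) @ x

print(y_train)
print(y_test)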
Layers are employed such that each layer detects more and more complex patterns in the input images; these higher-level features include shapes, edges and textures. Pooling of such layers helps the model recognize objects regardless of their position in an image, thus making the model translation-invariant. The first Dropout layer introduces early regularization, preventing co-adaptation of the neurons and encouraging more robust feature learning. The Flatten layer then converts the multi-dimensional feature maps into a 1-D vector for the transition into the dense layers (FCLs). The second Dropout layer again randomly drops units from the flattened output before its transition into the dense layers, giving more regularization to the layers that were prone to overfitting due to their large number of parameters.

2.6 Classification
After the successful extraction of features from the input
images, the model now flattens the feature maps into a 1D vector and feeds it to the Fully Connected Layer (FCL), as it accepts only one-dimensional input. For example:

MaxPooling2D output = (7, 7, 64)
Equivalent 1D vector output = (7×7×64) = (3136)

The dense layer comprises 1024 neurons and acts as a hidden layer processing the features extracted by the previous CNN layers. In the output layer of the FCL, three neurons are taken that denote the three possible classes. Here the SoftMax activation function is used, which converts the output into probabilities whose sum is 1. It is basically a mathematical function, used mainly in cases involving multiple classes, by which a vector of real numbers (logits) is converted into a probability distribution with values in the range of 0 to 1. In [18] Brahim Jabir et al. depicted and visualized how the hidden layers in a fully connected dense layer interact with one another. The CNN consisted of 3 convolutional layers with filter sizes (3×3), (3×3), and (5×5) respectively, followed by ReLU activations and MaxPooling. Mathematically, Eq. (3) shows the working of the SoftMax activation function for model prediction and detection:[19]

P_i = e^(Z_i) / Σ_{j=1..n} e^(Z_j)   (3)

where,
e^(Z_i) = exponential of input Z_i (raw score)
Σ_{j=1..n} e^(Z_j) = sum of exponentials of all inputs

Here, P_0 indicates the probability of crop, P_1 indicates the probability of weed and P_2 indicates the probability of none. The class with the highest probability is the model's prediction.
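For reference, the pieces named in this section (three convolutional layers with 3×3, 3×3 and 5×5 filters, ReLU, max pooling, dropout rates of 0.25/0.5/0.8, a Flatten layer, a 1024-neuron dense layer and a 3-way SoftMax output) could be assembled in Keras roughly as follows. The filter counts, input resolution and exact placement of the dropout layers are assumptions made for this sketch; the paper does not spell out the full layer-by-layer configuration here.

# Minimal Keras sketch of a CNN of the kind described in the text.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),            # first dropout: early regularization
    layers.Flatten(),                # multi-dimensional feature maps -> 1-D vector
    layers.Dropout(0.5),             # second dropout: before the dense layers
    layers.Dense(1024, activation="relu"),
    layers.Dropout(0.8),             # heaviest dropout on the largest layer
    layers.Dense(3, activation="softmax"),    # P0 = crop, P1 = weed, P2 = none
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()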
In YOLOv4, by contrast, the classification technique is directly included in the object detection process. Originally this method is ideal for real-time object detection, but in this paper we have proposed a different approach that utilizes the CNN for real-time detection and training. The YOLOv4 algorithm performs localization, by detecting the position of an object, and classification, by identifying the object type, in a single forward pass through the neural network. Out of the three major components of the YOLOv4 network (i.e., backbone, neck, head), the head network is responsible for classification and final detection. It basically applies anchor boxes on the feature maps and generates the output with the probabilities of the particular classes.[20,21]

The process starts with an input image of size 416×416, where multiple detection heads of different scales are used. The feature maps are of sizes 13×13, 26×26 and 52×52. If S = grid size, B = number of anchor boxes per grid cell and C = number of classes, then the output tensor shape for each scale is:

S × S × B × (5 + C)   (4)

where,
5 = 4 bounding box coordinates (t_x, t_y, t_w, t_h) + 1 objectness score
C = class probabilities

Equation (4) gives the output tensor of YOLOv4. The bounding box offsets are then predicted relative to the anchor boxes. If t_x, t_y are the predicted offsets for the box center and t_w, t_h are the predicted offsets for the width and height, the actual box predictions are calculated as:

b_x = σ(t_x) + c_x   (5)
b_y = σ(t_y) + c_y   (6)
b_w = p_w · e^(t_w)   (7)
b_h = p_h · e^(t_h)   (8)

where,
σ = sigmoid function
(c_x, c_y) = top-left coordinate of the grid cell
(p_w, p_h) = width and height of the anchor box

When we then step into the probability distribution analysis over all the classes, the SoftMax activation function is used here as well; Equation (3) shows the SoftMax implementation for the CNN as well as for YOLOv4. But when we go with a sigmoid for independent, per-class (multi-label) classification, the equation looks like:

P(c_i | object) = σ(t_ci)   (9)
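A small NumPy sketch of the decoding in Equations (5)-(9) is given below. The grid cell, anchor dimensions and raw network outputs are made-up example values, and the box centre is expressed in grid-cell units exactly as in Equations (5) and (6).

# Decode one YOLO-style prediction using Equations (5)-(9).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

cx, cy = 4, 7                    # top-left corner of the responsible grid cell
pw, ph = 0.20, 0.35              # anchor box width/height (example values)

tx, ty, tw, th = 0.3, -0.1, 0.25, -0.4    # raw box offsets from the head
t_obj = 1.2                               # raw objectness score
t_cls = np.array([-0.5, 2.0, -1.0])       # raw scores for crop, weed, none

bx = sigmoid(tx) + cx            # Eq. (5), in grid-cell units
by = sigmoid(ty) + cy            # Eq. (6)
bw = pw * np.exp(tw)             # Eq. (7)
bh = ph * np.exp(th)             # Eq. (8)

objectness = sigmoid(t_obj)
class_probs = sigmoid(t_cls)     # Eq. (9): per-class sigmoid

print(f"box: centre=({bx:.3f}, {by:.3f}) size=({bw:.3f}, {bh:.3f})")
print("objectness:", round(float(objectness), 3))
print("P(class | object):", np.round(class_probs, 3))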
instances where the model might be having trouble.[24]

The model is tested offline as well as in real time in the field using the Raspberry Pi 4 Model B. In this stage, the model is fed live video input, and the accuracy of the weed detection and localization is monitored. To make sure that the system functions well in real-world situations, its response speed, effectiveness, and dependability are tracked. The model is connected to an automated weed-removal sprayer that performs reliably and accurately. Additionally, field testing offers insightful input for retraining or additional model refinement to increase resilience.

Equation (12) shows how the precision of a model is calculated on the basis of True Positives and False Positives. As here we seek to determine the actual correctness of the model, this metric only considers positive classification scenarios where the prediction is right. But it also shows a major drawback by not considering the negatives at all, which might cause the model to miss certain correct predictions (i.e., low recall).[26]

3.1.3 Recall
Within this performance evaluation parameter, we check
how many cases the model actually classified as positive out of all the truly positive ones. Recall ranges from 0 to 1 and basically measures the model's real ability to capture all the relevant instances of the positive class.

Recall = TP / (TP + FN)   (13)

Equation (13) answers the question, "Out of all the actual weeds, how many did our model find?" If a model has higher recall, then we can safely say that it is classifying most of the positive cases correctly, hence most weeds in the field of crops are being successfully detected. But if recall alone is pushed too high, the model may simply classify every object as a weed, making its recall 100% while reducing the precision of its classification, which amounts to a failure of the model.[27]

3.1.4 F1 score
This parameter is based solely on the precision and recall of the particular model, as it is the harmonic mean of the two. It ranges between 0 (worst) and 1 (best). This metric gives us the trade-off between the precision and recall of a particular model. As the harmonic mean punishes extreme values more strongly, it is preferred over the arithmetic mean. As a result, both precision and recall have to be above the mark in order to achieve a reasonably high F1 score.

F1 score = 2 × (Precision · Recall) / (Precision + Recall)   (14)

Equation (14) shows how the F1 score is calculated mathematically using the precision and recall metric values. It is especially useful in cases where a particular model has an imbalanced dataset or where the model needs a proper balance between precision and recall.[28]
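The four metrics used in this section reduce to a few lines of Python. The sketch below is only a convenience helper, not code from the prototype; the example counts are the Total rows of Tables 1 and 2 that follow, and the printed percentages reproduce the values reported below up to rounding.

# Compute the evaluation metrics of Equations (11)-(14) from raw field-trial counts.
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)             # Eq. (11)
    precision = tp / (tp + fp)                             # Eq. (12)
    recall = tp / (tp + fn)                                # Eq. (13)
    f1 = 2 * precision * recall / (precision + recall)     # Eq. (14)
    return accuracy, precision, recall, f1

# Totals from Tables 1 and 2: (TP, TN, FP, FN)
for name, counts in {"YOLOv4": (103, 79, 6, 12), "CNN": (116, 75, 4, 5)}.items():
    acc, prec, rec, f1 = evaluate(*counts)
    print(f"{name}: accuracy={acc:.2%} precision={prec:.2%} "
          f"recall={rec:.2%} F1={f1:.2%}")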
3.2 Experimental analysis
3.2.1 Metric values
The proposed prototype in this paper is trained and developed using a standard self-developed dataset. The prototype was implemented using both the CNN and the YOLOv4 deep learning algorithms, and after a successful testing phase the results were concluded and compiled according to the performance evaluation metrics defined above.

The results of each technique have been thoroughly evaluated, ensuring untampered standards and accurate real-world simulation. Table 1 summarizes the result metrics of the YOLOv4 technique that was implemented on the very same setup for a thorough comparison.

Table 1: Results during field testing using YOLOv4 (TN/TP = true cases, FN/FP = false cases).
Field Trial | TN | TP | FN | FP | %Error | %Success
1  |  8 |  9 | 3 | 0 | 15 |  85
2  |  6 | 11 | 3 | 0 | 15 |  85
3  |  3 | 16 | 0 | 1 |  5 |  95
4  | 10 | 10 | 0 | 0 |  0 | 100
5  |  7 | 11 | 2 | 0 | 10 |  90
6  | 11 |  9 | 0 | 0 |  0 | 100
7  |  8 | 11 | 1 | 0 |  5 |  95
8  |  6 |  9 | 3 | 2 | 25 |  75
9  | 14 |  6 | 0 | 0 |  0 | 100
10 |  6 | 11 | 0 | 3 | 15 |  85
Total | 79 | 103 | 12 | 6 | - | -
Average | - | - | - | - | 9.0 | 91.0

Using Equation (11) we can calculate the value of accuracy as follows:

Accuracy = (103 + 79) / (103 + 79 + 6 + 12) = 91.0%

Similarly, using Equations (12) and (13), the precision and recall are calculated:

Precision = 103 / (103 + 6) = 94.49%
Recall = 103 / (103 + 12) = 89.56%

Now, Equation (14) is used to calculate the F1 score for this technique:

F1 score = (2 × 0.9449 × 0.8956) / (0.9449 + 0.8956) = 91.96%
Table 2 summarizes the result metrics of the CNN technique that was implemented on the very same setup for a thorough comparison.

Table 2: Results during field testing using CNN (TN/TP = true cases, FN/FP = false cases).
Field Trial | TN | TP | FN | FP | %Error | %Success
1  |  6 | 11 | 2 | 1 | 15 |  85
2  |  6 | 12 | 2 | 0 | 10 |  90
3  |  6 | 12 | 0 | 2 | 10 |  90
4  |  3 | 16 | 0 | 1 |  5 |  95
5  |  7 | 12 | 1 | 0 |  5 |  95
6  | 11 |  9 | 0 | 0 |  0 | 100
7  |  8 | 12 | 0 | 0 |  0 | 100
8  |  4 | 16 | 0 | 0 |  0 | 100
9  | 14 |  6 | 0 | 0 |  0 | 100
10 | 10 | 10 | 0 | 0 |  0 | 100
Total | 75 | 116 | 5 | 4 | - | -
Average | - | - | - | - | 4.5 | 95.5

For the CNN technique, Equation (11) gives:

Accuracy = (116 + 75) / (116 + 75 + 4 + 5) = 95.50%

Similarly, using Equations (12) and (13), the precision and recall are calculated:

Precision = 116 / (116 + 4) = 96.66%
Recall = 116 / (116 + 5) = 95.86%

Now, Equation (14) is used to calculate the F1 score for this technique:

F1 score = (2 × 0.9666 × 0.9586) / (0.9666 + 0.9586) = 96.25%

3.2.2 Confusion matrix
From the confusion matrices, it can be clearly observed that the CNN technique has outperformed the YOLOv4 algorithm and proven its proficiency in accurate object detection and recognition. In field testing, YOLOv4 recorded fewer true positive cases (TP = 103) and greater false negative (FN = 12) and false positive (FP = 6) counts, resulting in lower precision and recall compared to CNN.[29]

3.3 Discussion
The prototype in this paper has been developed and implemented using the CNN classification and YOLOv4 supervised algorithms for a comparison-based study, and the CNN
proved to be the more precise and accurate of the two, with an accuracy of 95.50% and precision and recall of 96.66% and 95.86% respectively. This surpasses the scores of the YOLOv4 technique for weed detection. Although it cannot beat the speed and agility of YOLOv4, when it comes to accurate classification and comprehensive detection the CNN proves its worth by securing an F1 score of 96.25%. The CNN classifier model is suitable for general field assessment, such as identifying whether weeds are present in an image. However, for the practical application of targeted and precision spraying, the YOLOv4 object detector is essential due to its ability to localize weeds within the image. YOLOv4 achieved an average inference speed of 30 FPS (frames per second), making it suitable for real-time applications, whereas the custom CNN model averaged around 5 FPS, making it more suitable for offline analysis. Although the future research scope for this particular field of study is broad and insightful, this paper successfully highlights certain aspects that can significantly improve the performance of a large-scale weed detection and elimination system. Despite certain limitations encountered during the implementation of the study, such as artificial lighting conditions and shadow overlay, the authors have managed to demonstrate the proficiency of the suggested method of implementation for future work to come.

Conflict of Interest
There is no conflict of interest.

Supporting Information
Not applicable.

Use of artificial intelligence (AI)-assisted technology for manuscript preparation
The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.

References
[1] C. S. G. Sunil, Y. Zhang, C. Koparan, M. R. Ahmed, K. Howatt, X. Sun, Weed and crop species classification using computer vision and deep learning technologies in greenhouse conditions, Journal of Agriculture and Food Research, 2022, 9, 100325, doi: 10.1016/j.jafr.2022.100325.
[2] B. Turan, I. Kadioglu, A. Basturk, B. Sin, A. Sadeghpour, Deep learning for image-based detection of weeds from emergence to maturity in wheat fields, Smart Agricultural Technology, 2024, 09, 100552, doi: 10.1016/j.atech.2024.100552.
[3] A. Upadhyay, G. C. Sunil, Y. Zhang, C. Koparan, X. Sun, Development and evaluation of a machine vision and deep learning-based smart sprayer system for site-specific weed management in row crops: An edge computing approach, Computers and Electronics in Agriculture, 2024, 216, 108495, doi: 10.1016/j.jafr.2024.101331.
[4] S. Zahoor, S. A. Sof, Weed identification in crop field using CNN, Journal of University of Shanghai for Science and Technology, 2021, 23, 15-21, doi: 10.3390/smartcities3030039.
[5] P. K. Reddy, R. A. Reddy, M. A. Reddy, K. Sai Teja, K. Rohith, K. Rahul, Detection of weeds by using machine learning, Proceedings of the International Conference on Emerging Trends in Engineering and Technology, 2023, 882-892.
[6] W. -H. Su, Advanced machine learning in point spectroscopy, RGB- and Hyperspectral-imaging for automatic discriminations of crops and weeds: a review, Sensors, 2021, 21, 4707, doi: 10.3390/smartcities3030039.
[7] S. Umamaheswari, A. R. Arjun, M. D. Meganathan, Weed detection in farm crops using parallel image processing, 2018 Conference on Information and Communication Technology (CICT), Jabalpur, India, 2018, 1-4, doi: 10.1109/INFOCOMTECH.2018.8722369.
[8] O. M. Olaniyi, E. Daniya, J. G. Kolo, J. A. Bala, A. E. Olanrewaju, A computer vision-based weed control system for low-land rice precision farming, International Journal of Advances in Applied Sciences, 2020, 9, 51-61, doi: 10.11591/ijaas.v9.i1.pp51-61.
[9] M. D. Bah, A. Hafiane, R. Canals, Deep learning with unsupervised data labeling for weed detection in line crops in UAV images, Remote Sensing, 2018, 10, 1690, doi: 10.3390/rs10111690.
[10] V. Partel, S. C. Kakaria, Y. Ampatzidis, Development and evaluation of a low-cost and smart technology for precision weed management utilizing artificial intelligence, Computers and Electronics in Agriculture, 2019, 157, 339-350, doi: 10.1016/j.compag.2018.12.048.
[11] Y. Wang, H. Liu, D. Wang, D. Liu, Image processing in fault identification for power equipment based on improved super green algorithm, Computers & Electrical Engineering, 2020, 87, 106753, doi: 10.1016/j.compeleceng.2020.106753.
[12] J. Zhang, Weed recognition method based on hybrid CNN-transformer model, Frontiers in Computing and Intelligent Systems, 2023, 4, 72-77, doi: 10.54097/fcis.v4i2.10209.
[13] L. Moldvai, P. Ákos Mesterházi, G. Teschner, A. Nyéki, Weed detection and classification with computer vision using a limited image dataset, Computers and Electronics in Agriculture, 2024, 214, 108301, doi: 10.3390/app14114839.
[14] H. Jiang, C. Zhang, Y. Qiao, Z. Zhang, W. Zhang, C. Song, CNN feature-based graph convolutional network for weed and crop recognition in smart farming, Computers and Electronics in Agriculture, 2020, 174, 105450, doi: 10.1016/j.compag.2020.105450.
[15] M. A. Haq, CNN based automated weed detection system using UAV imagery, Computer Systems Science and Engineering, 2022, 42, 837-849.
[16] P. K. Reddy, R. A. Reddy, M. A. Reddy, K. S. Teja, K. Rohith, K. Rahul, Detection of weeds by using machine learning, Proceedings of International Conference on Emerging Trends in Engineering, B. Raj et al., Eds., Springer, 2023, 882–892, doi: 10.2991/978-94-6463-252-1_89.
[17] L. Wan, M. Zeiler, S. Zhang, Y. L. Cun, R. Fergus, Regularization of neural networks using DropConnect, Proceedings of the 30th International Conference on Machine Learning, 2013, 28, 1058-1066.
[18] B. Jabir, L. Rabhi, N. Falih, RNN- and CNN-based weed detection for crop improvement: An overview, Foods and Raw Materials, 2021, 9, 387–396, doi: 10.21603/2308-4057-2021-2-387-396.
[19] Y. Tang, Deep learning using linear support vector machines, arXiv preprint arXiv:1306.0239, 2015.
[20] A. Bochkovskiy, C. -Y. Wang, H. -Y. M. Liao, YOLOv4: Optimal speed and accuracy of object detection, arXiv preprint arXiv:2004.10934, 2020.
[21] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: unified, real-time object detection, Proceedings CVPR'16, 2016, 779-788, doi: 10.48550/arXiv.1506.02640.
[22] J. Redmon, A. Farhadi, YOLOv3: An incremental improvement, arXiv preprint arXiv:1804.02767, 2018.
[23] O. L. Garcia-Navarrete, A. Correa-Guimaraes, Application of convolutional neural networks in weed detection and identification: A systematic review, Computers and Electronics in Agriculture, 2024, 216, 108520.
[24] M. Ofori, O. El-Gayar, An approach for weed detection using CNNs and transfer learning, Proceedings of the 54th Hawaii International Conference on System Sciences, 2021, 888-895.
[25] R. Sapkota, J. Stenger, M. Ostlie, P. Flores, Towards reducing chemical usage for weed control in agriculture using UAS imagery analysis and computer vision techniques, Scientific Reports, 2020, 13, 6548, doi: 10.1038/s41598-023-33042-0.
[26] B. B. Sapkota, C. Hu, M. V. Bagavathiannan, Evaluating cross-applicability of weed detection models across different crops in similar production environments, Frontiers in Plant Science, 2022, 13, doi: 10.3389/fpls.2022.837726.
[27] O. E. Apolo-Apolo, M. Fontanelli, C. Frasconi, M. Raffaelli, A. Peruzzi, M. P. Ruiz, Evaluation of YOLO object detectors for weed detection in different turfgrass scenarios, Applied Sciences, 2023, 13, 8502, doi: 10.3390/app13148502.
[28] M. A. Saqib, M. Aqib, M. N. Tahir, Y. Hafeez, Towards deep learning based smart farming for intelligent weeds management in crops, Frontiers in Plant Science, 2023, 14, doi: 10.3389/fpls.2023.1211235.
[29] V. S. Babu, N. Venkatram, Weed detection and localization in soybean crops using YOLOv4 deep learning model, Traitement du Signal, 2023, 41, 1019-1025, doi: 10.18280/ts.410242.

Publisher Note: The views, statements, and data in all publications solely belong to the authors and contributors. GR Scholastic is not responsible for any injury resulting from the ideas, methods, or products mentioned. GR Scholastic remains neutral regarding jurisdictional claims in published maps and institutional affiliations.

Open Access
This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which permits the non-commercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as appropriate credit to the original author(s) and the source is given by providing a link to the Creative Commons licence and changes need to be indicated if there are any. The images or other third-party material in this article are included in the article's Creative Commons licence, unless