Sat - 40.Pdf - Agricultural Product Price and Crop Cultivation Prediction Based On SMLT

The document discusses using machine learning algorithms to predict crop yields using agricultural data. Specifically, it analyzes using supervised machine learning techniques like k-nearest neighbor (kNN), Naive Bayes (NB), decision tree (DT), support vector machine (SVM), and random forest (RF) on agricultural datasets containing variables like soil texture, nutrients, temperature, and rainfall to select important features that help predict suitable crops for different land regions. The goal is to improve crop yield prediction models by selecting key data points from the agricultural datasets.

Uploaded by

Vj Kumar

ABSTRACT

Agriculture accounts for a major share of national economies across the world.
However, due to a lack of ecosystem control methods, the majority of agricultural
land is still underdeveloped. As a result, crop output is not improving, which
affects the farm economy. Forecasting plant yields can therefore help to raise
agricultural output. To address this issue, agricultural industries can use
machine learning algorithms to estimate the harvest from a given dataset. The
supervised machine learning technique (SMLT) analyses a dataset to capture
several pieces of information, such as variable identification, missing value
treatment, univariate analysis, bivariate and multivariate analysis, and so on. A
comparison of machine learning algorithms was performed to see which one was
most accurate in forecasting the best crop. The algorithms are evaluated using
precision, recall, F1 score, sensitivity, specificity, and entropy. The results
show that the recommended machine learning technique has the highest accuracy.
TABLE OF CONTENTS

CHAPTER NO   TITLE

             Abstract
             List of Tables
             List of Figures
1            Introduction
2            Literature Survey
3            Methodology
             3.1 Machine Learning
             3.2 Preparing the Data set
             3.3 List of Modules
4            Result and Discussion
5            Conclusion and Future Work
             References
             Appendices
             A. Source Code
             B. Screen Shots
             C. Publication with Plagiarism Report
List of Tables

TABLE NO   TABLE NAME

1          Data set Attributes
List of Figures

FIGURE NO   FIGURE NAME

1           Process of Machine Learning
2           Block Diagram
3           Data set1
4           Data set2
5           Yield of Cost Production
6           Cost Production of Crop
7           Decision Tree
8           Random Forest
9           Naive Bayes
10          Support Vector
CHAPTER 1
INTRODUCTION

Agricultural research has boosted the world economy and offers enormous benefits
to society as a whole. Crop evaluation in agriculture remains challenging, despite
recent improvements in the availability of technical resources and techniques.
Precision farming and agri-technology, sometimes referred to as virtual farming,
are current research areas that use data-intensive approaches to boost
agricultural productivity while lowering environmental impact. Accurate crop
prediction depends on ecological and soil factors. Agricultural production is
dependent on a number of factors and has been the subject of decades of research.
The majority of existing algorithms for crop yield estimation involve machine
learning (ML), but very little has been done to forecast crops for a territory
solely from soil, climate, and environmental data. Crop cultivation is influenced
by a variety of factors: soil texture, nutrients (N, P, and K), micronutrients
(Fe, B, and Mn), temperature, and rainfall are all factors to consider. Because
these characteristics vary by zone, resulting in a large data set, it is necessary
to pick the key features that aid in identifying suitable crops for particular
land regions. Feature selection (FS) approaches are used to carry out this
procedure.

Prediction relies heavily on machine learning methods. FS approaches [1]–[6] are
used to reduce overfitting problems and to identify the important characteristics
in the data set for the prediction procedure, resulting in improved ML
performance. Filter [7], wrapper [8], and embedded [9] are the three types of FS
approach. Filter techniques are unaffected by the classifier's performance.
Wrapper approaches, on the other hand, pick features depending on the success of
the classifier. Because it combines both filter and wrapper approaches, the
embedded approach is analogous to the latter. This investigation focuses on
wrapper FS approaches. To predict an appropriate crop and measure the efficacy of
the FS process, the chosen attributes are fed into the k-nearest neighbor (kNN),
Naïve Bayes (NB), decision tree (DT), support vector machine (SVM), and random
forest (RF) classifiers. The purpose of this research is to improve the
underlying crop model by selecting key data points.
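
The wrapper idea described above can be illustrated with a minimal sketch. It assumes greedy forward selection scored by the leave-one-out accuracy of a small kNN classifier; the report does not state which wrapper variant or scoring scheme was used, so both are assumptions. The feature names and all values are invented toy data.

```python
def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training rows nearest to x."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of kNN using only the given feature indices."""
    proj = [[row[f] for f in features] for row in X]
    hits = 0
    for i, x in enumerate(proj):
        rest_X = proj[:i] + proj[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, x, k) == y[i]
    return hits / len(proj)


def forward_select(X, y, k=3):
    """Greedy wrapper selection: keep adding the feature that raises accuracy most."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scores = {f: loo_accuracy(X, y, selected + [f], k) for f in remaining}
        feat = max(scores, key=scores.get)
        if scores[feat] <= best:
            break  # no remaining feature improves the classifier
        selected.append(feat)
        remaining.remove(feat)
        best = scores[feat]
    return selected, best


# Hypothetical toy data: only the third column separates the two crops.
FEATURES = ["plot_code", "temperature", "rainfall"]
X = [
    [5, 30, 200], [1, 22, 210], [9, 28, 220], [3, 25, 190],
    [4, 29, 60], [8, 23, 50], [2, 27, 70], [6, 24, 55],
]
y = ["rice"] * 4 + ["millet"] * 4

selected, best = forward_select(X, y)
print([FEATURES[i] for i in selected], best)
```

Because the classifier itself scores each candidate subset, the selection is tied to that classifier; a filter method would instead rank features without training any model.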
CHAPTER 2

LITERATURE SURVEY

Title : Forecasting crop yield using remotely sensed vegetation indices and
crop phenology metrics

Author : Douglas K. Bolton, Mark A. Friedl

Year : 2013

We used data from NASA's Moderate Resolution Imaging Spectroradiometer
(MODIS) to develop empirical models predicting maize and soybean yield in the
Central United States. Results show that the MODIS two-band Enhanced
Vegetation Index (EVI2) provides a better basis for predicting maize yields
than the widely used NDVI.

Title : Crop Yield Assessment from Remote Sensing

Author : Paul C. Doraiswamy, Sophie Moulin, Paul W. Cook, and Alan Stern

Satellite remotely sensed data provide a real-time assessment of the magnitude
and variation of crop condition parameters. This study investigates the use of
these parameters as an input to a crop growth model. A radiative transfer model,
SAIL, provided the link between the satellite data and crop model.

Title : A generalized regression-based model for forecasting winter wheat yields in
Kansas and Ukraine using MODIS data

Author : Becker-Reshef, E. Vermote, M. Lindeman, C. Justice

Year : 2010

Wheat is one of the key cereal crops grown worldwide. The forecasts of production
in Kansas closely matched the USDA/NASS reported numbers with a 7% error.
The same regression model forecast winter wheat production in Ukraine within
10% of the official reported numbers six weeks prior to harvest.

Title : Plant Yield Prediction Model Using Firefly based Feature Selection with
Modified Fuzzy Cognitive Maps
Author : D. Sabareeswaran and R. Gunasundari

In this paper, agricultural productivity is enhanced on the basis of plant yield
prediction. The predicted outcome is transmitted to farmers through smartphones,
which helps in identifying plant growth and improving harvesting. The
experimental results compare the effectiveness of the proposed technique with
that of other prediction techniques.

Title : Crop yield forecasting on the Canadian Prairies using MODIS NDVI data

Author : M.S. Mkhabela, P. Bullock, S. Raj, S. Wang, Y. Yang

Year : 2010

Moderate resolution imaging spectroradiometer (MODIS) could be used for crop
yield prediction on the Canadian Prairies. MODIS 10-day composite NDVI data for
the years 2000–2006 were obtained from the Canada Centre for Remote Sensing.
New regression models were developed to forecast crop grain yield. Results
showed that MODIS-NDVI data can be used effectively to predict crop yield on the
Canadian Prairies. Power function models developed for each crop accounted for
48 to 90% of the grain yield variability for barley, canola, field peas and spring
wheat.
CHAPTER 3

METHODOLOGY

3.1 MACHINE LEARNING

Machine learning aims to predict the future from past data. Machine learning (ML)
is a type of artificial intelligence (AI) that gives computers the ability to learn
without being explicitly programmed. It focuses on the development of computer
programs that can change when exposed to new data. This section covers the basics
of machine learning and the implementation of a simple machine learning algorithm
using Python. The process of training and prediction involves the use of
specialized algorithms: the training data is fed to an algorithm, and the
algorithm uses this training data to make predictions on new test data. Machine
learning can be roughly separated into three categories: supervised learning,
unsupervised learning, and reinforcement learning. In supervised learning, the
program is given both the input data and the corresponding labels, so the data
has to be labeled by a human being beforehand. In unsupervised learning, no
labels are provided to the learning algorithm, which has to figure out the
clustering of the input data by itself. Finally, reinforcement learning
dynamically interacts with its environment and receives positive or negative
feedback to improve its performance.

Data scientists use many different kinds of machine learning algorithms to
discover patterns in data that lead to actionable insights. At a high level,
these different algorithms can be classified into two groups based on the way
they “learn” about data to make predictions: supervised and unsupervised
learning. Classification is the process of predicting the class of given data
points. Classes are sometimes called targets, labels, or categories.
Classification predictive modeling is the task of approximating a mapping
function from input variables (X) to discrete output variables (y). In machine
learning and statistics, classification is a supervised learning approach in
which the computer program learns from the data input given to it and then uses
this learning to classify new observations. The data set may simply be bi-class
(like identifying whether the person is male or female or whether the mail is
spam or non-spam) or it may be multi-class too. Some examples of classification
problems are speech recognition, handwriting recognition, biometric
identification, document classification, etc.

The majority of practical machine learning uses supervised learning. Supervised
learning is where you have input variables (X) and an output variable (y) and use
an algorithm to learn the mapping function from the input to the output,
y = f(X). The goal is to approximate the mapping function so well that when you
have new input data (X) you can predict the output variables (y) for that data.
Supervised machine learning techniques include logistic regression, multi-class
classification, decision trees, support vector machines, etc. Supervised learning
requires that the data used to train the algorithm is already labeled with
correct answers. Supervised learning problems can be further grouped into
regression and classification problems. Both problems have as their goal the
construction of a succinct model that can predict the value of the dependent
attribute from the attribute variables. The difference between the two tasks is
that the dependent attribute is numerical for regression and categorical for
classification. A classification model attempts to draw some conclusion from
observed values. Given one or more inputs, a classification model will try to
predict the value of one or more outcomes. A classification problem is when the
output variable is a category, such as “red” or “blue”.
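
As an illustration of learning a mapping y = f(X) from labeled data and then classifying a new observation, here is a minimal sketch of a Gaussian Naive Bayes classifier, one of the classifiers named in the introduction. It is not the report's implementation; the feature meanings (rainfall, temperature) and all numbers are invented for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for every class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for observation x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)


# Hypothetical labeled training data: [rainfall_mm, temperature_c] -> crop.
X_train = [[200, 30], [220, 28], [210, 32], [60, 18], [50, 20], [70, 16]]
y_train = ["rice", "rice", "rice", "wheat", "wheat", "wheat"]

model = fit_gaussian_nb(X_train, y_train)
print(predict(model, [205, 29]))
print(predict(model, [55, 19]))
```

The output variable here is a category (a crop name), which is what makes this a classification problem rather than a regression problem.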

3.2 PREPARING THE DATASET

The demo dataset is supplied to the machine learning model, and the model is
trained on the basis of this data set. Every new detail filled in on the
application form acts as test data. After testing, the model makes its prediction
based on the inference it draws from the training data. Satellite imagery
(remote sensing data) has been widely used for predicting crop yield. This data
is collected using sensors mounted on satellites or planes, which detect the
energy (electromagnetic waves) reflected or diffracted from the surface of the
earth. Remote sensing data has many energy bands to offer, but only a few of them
have been used for crop yield prediction. Some researchers have generated
relevant features from the bands that are typically ignored and have improved
results that way. With this kind of data, most people rarely explore the
high-order moments of the features. Based on these datasets, people have used
algorithms such as regression models, random forest, and nearest neighbor.

Table 1 shows details of the datasets:

Variable                             Description
Crop                                 Crop name
State Name                           Indian state name
District Name                        District name list of each state
Cost of Cultivation (₹/Hectare) C2   Cultivation amount for C2 Scheme
Cost of Production (₹/Quintal) C2    Production amount for A2+FL Scheme
Yield (Quintal/Hectare)              Yield of crop
Crop year                            Crop year list
District Name                        District name for each state
Area                                 Total area of each place
Rainfall                             Water availability of each crop
Average humidity                     Directly influences the water relations of
                                     the plant and indirectly affects leaf growth
Mean Temperature                     Climate of each crop
Cost Production of per yield crop    Cost of crop yield

Table 1: Dataset Attributes
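
To make the attribute list concrete, the following sketch parses a few rows into typed records using only the Python standard library. The column names follow a subset of the attribute table, but the sample rows and every value in them are invented for illustration; the report's actual files and loading code are not shown in this section.

```python
import csv
import io

# Hypothetical sample rows; the values are invented for illustration only.
SAMPLE_CSV = """Crop,State Name,District Name,Crop year,Area,Rainfall,Average humidity,Mean Temperature,Yield (Quintal/Hectare)
Rice,Tamil Nadu,Thanjavur,2010,1200,950.5,72,28.4,35.2
Maize,Karnataka,Davanagere,2010,800,640.0,61,26.1,
"""

NUMERIC_COLUMNS = [
    "Area",
    "Rainfall",
    "Average humidity",
    "Mean Temperature",
    "Yield (Quintal/Hectare)",
]


def load_records(text):
    """Parse CSV text into dicts, converting numeric columns to float or None."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in NUMERIC_COLUMNS:
            value = row[col].strip()
            row[col] = float(value) if value else None  # empty cell -> missing
        row["Crop year"] = int(row["Crop year"])
        records.append(row)
    return records


records = load_records(SAMPLE_CSV)
print(records[0]["Crop"], records[0]["Rainfall"])
```

Representing an empty cell as `None` keeps missing values distinguishable from a genuine zero, which matters for the pre-processing step.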

3.3 List of Modules:

• Data Pre-processing
• Data Analysis and Visualization
• Comparing algorithms with prediction in the form of best accuracy result
• Deployment Using GUI
[Figure 2: Block Diagram of the system architecture]
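
The module list includes comparing algorithms by accuracy, and the abstract names precision, recall, F1 score, sensitivity, specificity, and entropy as evaluation measures. As a rough sketch of how such measures are computed for a two-class case, using the standard textbook definitions, consider the following. The label lists are invented, and the report does not say how it handles the multi-class case.

```python
import math
from collections import Counter


def binary_metrics(y_true, y_pred, positive):
    """Compute confusion-matrix metrics, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


# Hypothetical true and predicted crop labels.
y_true = ["rice", "rice", "rice", "wheat", "wheat", "wheat", "wheat", "rice"]
y_pred = ["rice", "rice", "wheat", "wheat", "wheat", "rice", "wheat", "rice"]

metrics = binary_metrics(y_true, y_pred, positive="rice")
print(metrics)
print(entropy(y_true))
```

Running the same function on the predictions of each candidate classifier gives comparable numbers from which the best-performing algorithm can be chosen.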

3.3.1 Data Pre-processing

Validation techniques in machine learning are used to get the error rate of
the Machine Learning (ML) model, which can be considered as close to the true
error rate of the dataset. If the data volume is large enough to be representative
of the population, you may not need the validation techniques. However, in
real-world scenarios we work with samples of data that may not be truly
representative of the population of the given dataset. Pre-processing therefore
involves finding missing values and duplicate values and describing the data type
of each variable, whether it is a float or an integer. The validation set is the
sample of data used to provide an unbiased evaluation of a model fit on the
training dataset while tuning model hyperparameters.
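
The checks mentioned above (missing values, duplicate values, and data types) can be sketched in a few lines of plain Python. The record layout and values below are hypothetical, and the report's own pre-processing code may use different tools.

```python
def summarize(records):
    """Report missing values per column, duplicate rows, and observed value types."""
    columns = list(records[0].keys())
    # Count cells that hold None in each column.
    missing = {c: sum(r[c] is None for r in records) for c in columns}
    # A row is a duplicate if an identical row appeared earlier.
    seen, duplicates = set(), 0
    for r in records:
        key = tuple(r[c] for c in columns)
        if key in seen:
            duplicates += 1
        seen.add(key)
    # Collect the Python type names of the non-missing values.
    types = {
        c: sorted({type(r[c]).__name__ for r in records if r[c] is not None})
        for c in columns
    }
    return missing, duplicates, types


# Hypothetical records: one missing rainfall value and one repeated row.
sample = [
    {"crop": "Rice", "rainfall": 950.5, "crop_year": 2010},
    {"crop": "Maize", "rainfall": None, "crop_year": 2010},
    {"crop": "Rice", "rainfall": 950.5, "crop_year": 2010},
]

missing, duplicates, types = summarize(sample)
print(missing)
print(duplicates)
print(types)
```

A summary like this shows which columns need imputation or removal before any model is trained.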

The evaluation becomes more biased as skill on the validation dataset is
incorporated into the model configuration. The validation set is used to evaluate a
given model, but this is for frequent evaluation. It as machine learning engineers
