Integrated Project: Understanding Maji Ndogo's agriculture
© ExploreAI Academy
In this coding challenge, we will apply all of the skills we learned in Pandas.
⚠ Note that this code challenge is graded and will contribute to your overall marks for this module.
Submit this notebook for grading. Note that the names of the functions are different in this notebook, so transfer the code from your working notebook into this submission notebook carefully.
Instructions
• Do not add or remove cells in this notebook. Do not edit or remove the ### START FUNCTION or
### END FUNCTION comments. Do not add any code outside of the functions you are required to
edit. Doing any of this will lead to a mark of 0%!
• Answer the questions according to the specifications provided.
• Use the given cell in each question to see if your function matches the expected outputs.
• Do not hard-code answers to the questions.
• The use of StackOverflow, Google, and other online tools is permitted. However, copying a fellow
student's code is not permissible and is considered a breach of the Honour code. Doing this will
result in a mark of 0%.
Introduction
Hey there, I'm glad you're on board for the Maji Ndogo project AGAIN! Let me walk you through what
we're up against and how we'll tackle it.
As you know, we're in an ambitious project aimed at automating farming in Maji Ndogo, a place with
diverse and challenging agricultural landscapes. Before we dive into the 'how' of farming, we need to
figure out the 'where' and 'what'. It's not just about deploying technology; it's about making informed
decisions on where to plant specific crops, considering factors like rainfall, soil type, climate, and many
others.
Our analysis is the groundwork for this entire automation project. We have an array of variables like soil
fertility, climate conditions, and geographical data. By understanding these elements, we can recommend
the best locations for different crops. It's a bit like solving a complex puzzle – each piece of data is crucial
to seeing the bigger picture.
We'll start by importing our dataset into a DataFrame. It is currently in an SQLite database, and split into
tables. Unlike Power BI or SQL, data analysis in Pandas typically happens in a single table, so we will have to
brush up on those dusty SQL skills to get the data imported. Expect a bit of a mess in the data – it's part of the
challenge. We need to clean it up and maybe reshape it to make sense of it. It's like sorting out the tools
and materials we need and getting rid of what we don't.
Here's where the real fun begins. We'll dive deep into the data, looking for patterns and correlations. Each
clue in the data leads us closer to understanding the best farming practices for Maji Ndogo. I'll be relying
on your skills and insights. We'll be working through these steps together, discussing our findings and
strategies.
Let's gear up and get ready to make a real difference in Maji Ndogo. Ready to get started? Let's dive into
our data and see what stories it has to tell us.
Sanaa.
Data dictionary
1. Geographic features
• Field_ID: A unique identifier for each field (BigInt).
• Elevation: The elevation of the field above sea level in metres (Float).
• Latitude: Geographical latitude of the field in degrees (Float).
• Longitude: Geographical longitude of the field in degrees (Float).
• Location: Province the field is in (Text).
• Slope: The slope of the land in the field (Float).
2. Weather features
• Field_ID: Corresponding field identifier (BigInt).
• Rainfall: Amount of rainfall in the area in mm (Float).
• Min_temperature_C: Average minimum temperature recorded in Celsius (Float).
• Max_temperature_C: Average maximum temperature recorded in Celsius (Float).
• Ave_temps: Average temperature in Celsius (Float).
3. Soil and crop features
• Field_ID: Corresponding field identifier (BigInt).
• Soil_fertility: A measure of soil fertility where 0 is infertile soil, and 1 is very fertile soil (Float).
• Soil_type: Type of soil present in the field (Text).
• pH: pH level of the soil, which is a measure of how acidic/basic the soil is (Float).
4. Farm management features
• Field_ID: Corresponding field identifier (BigInt).
• Pollution_level: Level of pollution in the area where 0 is unpolluted and 1 is very polluted (Float).
• Plot_size: Size of the plot in the field (Ha) (Float).
• Chosen_crop: Type of crop chosen for cultivation (Text).
• Annual_yield: Annual yield from the field (Float). This is the total output of the field. The field size and
type of crop will affect the Annual_yield.
• Standard_yield: Standardised yield expected from the field, normalised per crop (Float). This is
independent of field size or crop type. Multiplying this number by the field size and the average crop
yield gives the Annual_yield.
Average yield (tons/Ha) per crop type:
• Coffee: 1.5
• Wheat: 3
• Rice: 4.5
• Maize: 5.5
• Tea: 1.2
• Potato: 20
• Banana: 30
• Cassava: 13
Alright, let's walk through the process of importing our SQL data from multiple tables into a single
DataFrame. This is a crucial step as it sets the foundation for all our subsequent analyses.
We're dealing with an SQLite database, Maji_Ndogo_farm_survey_small.db , which contains multiple tables.
We'll need to join these tables on a common key to create a comprehensive dataset for our analysis. The
common key in our case is Field_ID .
Here’s how we can do it:
In [ ]: import pandas as pd # Importing the Pandas package with an alias, pd
from sqlalchemy import create_engine, text # Importing the SQL interface. If this fails, run `pip install sqlalchemy` first.

# Create an engine for the database
engine = create_engine('sqlite:///Maji_Ndogo_farm_survey_small.db') # Make sure to have the database file in the same directory as this notebook.
Next up, we test if the connection works by printing out all of the table names in the database.
In [ ]:
with engine.connect() as connection:
    result = connection.execute(text("SELECT name FROM sqlite_master WHERE type='table';"))
    for row in result:
        print(row)
Expected output:
('geographic_features',)
('weather_features',)
('soil_and_crop_features',)
('farm_management_features',)
At this point, we have two choices:
1. Either we import each table into a DataFrame, for example, df_geographic , then merge them
together.
2. Use one SQL query and read it into a single DataFrame.
While both are equally viable, let's try to use a single SQL query to keep things simple.
Next, we'll write an SQL query to join our tables. Combine all of the tables into a single query, using
Field_ID .
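If you get stuck, one possible shape for such a query is sketched below. This is a minimal sketch that assumes the four table names printed above, joined on Field_ID ; note that SELECT * will carry duplicate Field_ID columns along, which we deal with shortly.

# A minimal sketch of a joined query - adapt the join style to your needs.
sql_query = """
SELECT *
FROM geographic_features g
JOIN weather_features w ON g.Field_ID = w.Field_ID
JOIN soil_and_crop_features s ON g.Field_ID = s.Field_ID
JOIN farm_management_features f ON g.Field_ID = f.Field_ID;
"""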
In [ ]: sql_query = """
-- Insert your query here
"""
With our engine and query ready, we'll use Pandas to execute the query. The pd.read_sql_query
function fetches the data and loads it into a DataFrame – essentially transferring our data from the
database into a familiar Pandas structure. If you use one query, you will import it all into
MD_agric_df .
In [ ]: # Create a connection object
with engine.connect() as connection:
    # Use Pandas to execute the query and store the result in a DataFrame
    MD_agric_df = pd.read_sql_query(text(sql_query), connection)
Check the DataFrame to see if it loaded correctly.
In [ ]: MD_agric_df
Note that there are a couple of Field_ID columns in our DataFrame that we need to remove since
we're not interested in particular farms for now.
In [ ]: # Now, drop all columns named 'Field_ID'.
MD_agric_df.drop(columns = 'Field_ID', inplace = True)
Data cleanup
I noticed some errors in the data. Here's what I picked up:
1. Some column names have been swapped. Please make sure each column carries its correct name.
2. Some of the crop types contain spelling errors.
3. The Elevation column contains some negative values, which are not plausible, so change these to
positive values.
Use your Pandas skills to clean up the data.
In [ ]: # Insert your code here
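For illustration, the cleanup could look something like the sketch below. The specific swapped columns and misspellings shown here are hypothetical placeholders; inspect the data to find the actual issues.

# Hypothetical examples only - inspect the data to find the real issues.
# 1. Swap back two column names (this pair is a placeholder):
MD_agric_df = MD_agric_df.rename(columns={'Min_temperature_C': 'Max_temperature_C',
                                          'Max_temperature_C': 'Min_temperature_C'})
# 2. Correct misspelled crop types (these misspellings are placeholders):
MD_agric_df['Crop_type'] = MD_agric_df['Crop_type'].replace({'wheatn': 'wheat', 'teaa': 'tea'})
# 3. Make the negative Elevation values positive:
MD_agric_df['Elevation'] = MD_agric_df['Elevation'].abs()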
Final data checkup
Compare your answers to the expected output to make sure your data is corrected.
In [ ]: len(MD_agric_df['Crop_type'].unique())
Expected output: 8
In [ ]: MD_agric_df['Elevation'].min()
Expected output: 35.910797
In [ ]: MD_agric_df['Annual_yield'].dtype
Expected output: dtype('float64')
Analysis
Challenge 1: Uncovering crop preferences
Now that we have our data ready, let's delve into understanding where different crops are grown in Maji
Ndogo. Our initial step is to focus on tea, a key crop in Maji Ndogo. We need to determine the optimal
conditions for its growth. By analyzing data related to elevation, rainfall, and soil type specifically for tea
plantations, we'll start to paint a picture of where our farming systems could thrive.
Task: Create a function that includes only tea fields and returns a tuple with the mean Rainfall and the
mean Elevation . The function takes the full DataFrame and a string value for the crop type to filter
by as input, and outputs a tuple of (rainfall, elevation).
In [ ]: ### START FUNCTION
def explore_crop_distribution(df,crop_filter):
# Insert your code here
### END FUNCTION
Input:
In [ ]: explore_crop_distribution(MD_agric_df, "tea")
Expected output: (1534.5079956188388, 775.208667535597)
In [ ]: explore_crop_distribution(MD_agric_df, "wheat")
Expected output: (1010.2859910581222, 595.8384148002981)
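One possible approach is sketched below (a minimal sketch; your graded answer still belongs in the function cell above).

def explore_crop_distribution(df, crop_filter):
    # Keep only the rows for the requested crop type
    crop_df = df[df['Crop_type'] == crop_filter]
    # Return the means in the order (Rainfall, Elevation), matching the expected output
    return (crop_df['Rainfall'].mean(), crop_df['Elevation'].mean())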
Repeat this for a couple of crops to get a feeling for where crops are planted in Maji Ndogo.
Challenge 2: Finding fertile grounds
With insights into tea cultivation, let's broaden our horizons. Fertile soil is the bedrock of successful
farming. By grouping our data by location and soil type, we'll pinpoint where the most fertile soils in Maji
Ndogo are. These fertile zones could be prime candidates for diverse crop cultivation, maximising our
yield.
We’ll group our data by soil type to see where the most fertile grounds are. This information will be vital
for deciding where to deploy our farming technology.
Task: Create a function that groups the data by Soil_type , and returns the mean Soil_fertility per soil type.
In [ ]: ### START FUNCTION
def analyse_soil_fertility(df):
# Insert your code here
### END FUNCTION
Input:
In [ ]: analyse_soil_fertility(MD_agric_df)
Expected output:
Soil_type
Loamy       0.585868
Peaty       0.604882
Rocky       0.582368
Sandy       0.595669
Silt        0.652654
Volcanic    0.648894
Name: Soil_fertility, dtype: float64
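A minimal sketch of one way to do this:

def analyse_soil_fertility(df):
    # Group by soil type and return the mean fertility of each group
    return df.groupby('Soil_type')['Soil_fertility'].mean()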
Try digging into the data a bit more by aggregating other columns to identify some more patterns.
Challenge 3: Climate and geography analysis
Now, let's delve into how climate and geography influence farming. By understanding the relationship
between factors like elevation, temperature, and rainfall with crop yields, we can identify the most
suitable areas for different crops. This analysis is key to ensuring our automated systems are deployed in
locations that will maximise their effectiveness.
Task: Create a function that takes in a DataFrame and a column name, groups the data by that
column, aggregates by the means of Elevation , Min_temperature_C ,
Max_temperature_C , and Rainfall , and outputs a DataFrame. Please ensure that the order of the
columns matches the expected output.
In [ ]:
### START FUNCTION
def climate_geography_influence(df,column):
# Insert your code here
### END FUNCTION
Input:
In [ ]: climate_geography_influence(MD_agric_df, 'Crop_type')
Expected output:

Crop_type    Elevation    Min_temperature_C    Max_temperature_C    Rainfall
banana       487.973572   -5.354344            31.988152            1659.905687
cassava      682.903008   -3.992113            30.902381            1210.543006
coffee       647.047734   -4.028007            30.855189            1527.265074
maize        680.596982   -4.497995            30.576692            681.010276
potato       696.313917   -4.375334            30.300608            660.289064
rice         352.858053   -6.610566            32.727170            1632.382642
tea          775.208668   -2.862651            29.950383            1534.507996
wheat        595.838415   -4.968107            30.973845            1010.285991
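A minimal sketch of one way to build this:

def climate_geography_influence(df, column):
    # Group by the given column and take the mean of the four
    # climate/geography columns, in the order shown above
    return df.groupby(column)[['Elevation', 'Min_temperature_C',
                               'Max_temperature_C', 'Rainfall']].mean()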
Challenge 4: Advanced sorting techniques
Quite often it is better to improve the things you're good at than the things you're bad at. So
the question is, which crop is the top performer in Maji Ndogo, and under what conditions does it
perform well?
To answer this, we need to:
1. Filter all the fields with an above-average Standard_yield (do you have flashbacks to SQL
subqueries right now?).
2. Then group by <?> crop type, using count() .
3. Sort the values to get the top crop type on top.
4. Retrieve the name of the top index. See the hint below on how to do this.
Task: Create a function that takes a DataFrame as input, filters, groups and sorts, and outputs a string
value of a crop type.
Hint: Once we have grouped by a column, we can access the labels of that "index column" using
.index . For example:
In [ ]: grouped_df = MD_agric_df.groupby("Soil_type").mean(numeric_only = True).sort_values(by="Elevation")
print(grouped_df.index[0])
grouped_df
In [ ]: ### START FUNCTION
def find_ideal_fields(df):
# Insert your code here
### END FUNCTION
Input:
In [ ]: type(find_ideal_fields(MD_agric_df))
Expected output: str
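A minimal sketch of the filter-group-sort pattern described above (the column counted inside count() is an arbitrary choice):

def find_ideal_fields(df):
    # 1. Keep only fields with an above-average Standard_yield
    above_avg = df[df['Standard_yield'] > df['Standard_yield'].mean()]
    # 2. Count how many of these fields each crop type has
    counts = above_avg.groupby('Crop_type')['Standard_yield'].count()
    # 3. Sort so the most frequent crop type comes first, then
    # 4. return its index label as a string
    return counts.sort_values(ascending=False).index[0]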
Challenge 5: Advanced filtering techniques
Now that we know <?> is our most successful crop, we can look at what makes it successful.
Create a function that takes a DataFrame and a crop type as input, and filters the DataFrame using
the following conditions:
1. Filter by crop type.
2. Select only rows that have above average Standard_yield .
3. Select only rows that have Ave_temps >= 12 but <= 15.
4. Have a Pollution_level lower than 0.0001.
In [ ]: ### START FUNCTION
def find_good_conditions(df, crop_type):
# Insert your code here
### END FUNCTION
Input:
In [ ]: find_good_conditions(MD_agric_df, "tea").shape
Expected output: (14, 17)
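One possible filtering sketch is shown below. Note that whether "above average" means the mean over the whole DataFrame or the per-crop mean is an assumption here; check your result against the expected shape.

def find_good_conditions(df, crop_type):
    # Combine all four conditions with boolean indexing. The mean is taken
    # over the whole DataFrame here - an assumption worth verifying.
    return df[(df['Crop_type'] == crop_type)
              & (df['Standard_yield'] > df['Standard_yield'].mean())
              & (df['Ave_temps'] >= 12)
              & (df['Ave_temps'] <= 15)
              & (df['Pollution_level'] < 0.0001)]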
Extra Pandas "nuggets"
We have not even scratched the surface of Pandas or our dataset. If you remember back to your days
with Chidi, it took a while before we could unlock the secrets the survey data had. So, scratch around a
bit.
On the Pandas front, it's the same. Pandas is a very powerful data analysis tool that takes a while to
master. Many of the more advanced methods like window functions, dynamically retrieving or changing
data, vectorisation, or processing big data with Pandas are all more advanced topics we encounter in the
workplace.
But here are two tiny 'nuggets' to upskill in Pandas.
df.query()
Oh, you're going to love this one... df.query() was designed to filter data, but using SQL-like syntax.
For example:
In [ ]: MD_agric_df.query('Standard_yield > 0.5 and Soil_type == "Loamy"')
Isn't that much easier to read and understand than the one below?
In [ ]: MD_agric_df[(MD_agric_df['Standard_yield'] > 0.5) & (MD_agric_df['Soil_type'] == 'Loamy')]
The nice thing is, we can filter with in and not in expressions, and also pass in variables using
@variable_name .
In [ ]: soil_types = ['Loamy', 'Sandy', 'Silt']
MD_agric_df.query('Soil_type in @soil_types')
Plotting data with Pandas
Sometimes we quickly want to see a basic visualisation of our data. We can use df.plot(kind='bar')
to make a bar plot, df.plot(kind='hist', bins = 10) to see the distribution of a data column, or
df.plot(kind='scatter', x='Column_on_x', y='Column_on_y') to understand the relationship
between variables.
In [ ]: MD_agric_df.groupby('Crop_type')['Standard_yield'].mean().plot(kind='bar')
In [ ]: MD_agric_df['Standard_yield'].plot(kind='hist', bins =20)
In [ ]: MD_agric_df.plot(kind='scatter', x = "Pollution_level", y = "Standard_yield")
We can use these plots to get a quick feel for the data, but we can't really customise them much. For that,
we need some better tools.