
Report Traffic Prediction

Uploaded by

Niyati Kumari

Traffic Flow Prediction Using Machine Learning: A Comparative
Study of Time-Series Models on NYC Bike Data
A Project Report
In the partial fulfilment of the award of the degree of

B.Tech
Under

Academy of Skill Development

Submitted by:
NIYATI KUMARI

NATIONAL INSTITUTE OF TECHNOLOGY, PATNA


CERTIFICATE FROM THE MENTOR

This is to certify that NIYATI KUMARI has completed the project titled TRAFFIC
FLOW PREDICTION USING MACHINE LEARNING: A COMPARATIVE STUDY OF TIME-SERIES
MODELS ON NYC BIKE DATA under my supervision during the period from June to
July, in partial fulfilment of the requirements for the award of the B.Tech
degree, and submitted to the Department of Computer Science, National Institute
of Technology, Patna.

DATE: Signature of the Mentor


ACKNOWLEDGMENT

I take this opportunity to express my deep gratitude and sincerest thanks to my
project mentor, Mr. Mahendra Datta, for giving the most valuable suggestions,
helpful guidance, and encouragement in the execution of this project work.

I would like to give a special mention to my colleagues. Last but not least, I am
grateful to all the faculty members of the Academy of Skill Development for their
support.
(Note: All entries of the proforma of approval should be filled up with appropriate and
complete information. A proforma of approval that is incomplete in any respect will be
summarily rejected.)

1. Name of the Student With Group:

i. NIYATI KUMARI

Title of the Project : TRAFFIC FLOW PREDICTION USING MACHINE LEARNING: A
COMPARATIVE STUDY OF TIME-SERIES MODELS ON NYC BIKE DATA

2. Name and Address of the Guide : Mr. Mahendra Datta SR.

3. Educational Qualification of the Guide : Ph.D.* M.Tech* B.E.*/B.Tech* MCA* M.Sc*

4. Working and Teaching experience of the Guide : …….Years


5. Software used in the Project:
a. Google Colab
b. Python
c. Jupyter Notebook
d. TensorFlow

Signature of the Guide


Date:

Name: Mr. Mahendra Datta
Subject Matter Expert

Signature, Designation, Stamp of the Project Proposal Evaluator
SELF-CERTIFICATE

This is to certify that the dissertation/project proposal entitled “Traffic Flow
Prediction Using Machine Learning: A Comparative Study of Time-Series Models
on NYC Bike Data” is done by us and is an authentic work in Information
Technology carried out under the guidance of Mr. Mahendra Datta. The matter
embodied in this project work has not been submitted earlier for the award of
any certificate, to the best of our knowledge and belief.

Name of the Students:


1. Niyati Kumari

Signature of the students:

1. Niyati Kumari
CERTIFICATE BY GUIDE

This is to certify that this project entitled “Traffic Flow Prediction Using
Machine Learning: A Comparative Study of Time-Series Models on NYC Bike
Data”, submitted in partial fulfillment of the certificate of Bachelor of Computer
Application through the Academy of Skill Development, done by the
Group Members:
1. Niyati Kumari
is an authentic work carried out under my guidance, to the best of our knowledge
and belief.

1. Niyati Kumari

Signature of the students Signature of the Guide


Date: 09/04/2025 Date:
CERTIFICATE OF APPROVAL

This is to certify that this proposal of Minor project, entitled “Traffic Flow
Prediction Using Machine Learning: A Comparative Study of Time-Series Models
on NYC Bike Data”, is a record of bona fide work carried out by: 1. Niyati
Kumari under my supervision and guidance through the Academy of Skill
Development. In my opinion, the report in its present form is in partial
fulfillment of all the requirements, as specified by the National Institute of
Technology, Patna, as per the regulations of the Academy of Skill Development.
In fact, it has attained the standard necessary for submission. To the best of
my knowledge, the results embodied in this report are original in nature and
worthy of incorporation in the present version of the report for Bachelor of
Technology.

Guide/Supervisor
Mr. Mahendra Datta
Subject Matter Expert & Technical Head (Python)
Academy of Skill Development (An ISO 9001:2008 Certified)
Module-132, SDF Building
Salt Lake Sector-V, Kolkata - 700 091

External Examiner(s) Head of the Department

A REPORT ON
“Traffic Flow Prediction Using Machine Learning: A Comparative
Study of Time-Series Models on NYC Bike Data”
1. INTRODUCTION
As cities evolve, the need for efficient and intelligent transportation
systems becomes critical. Data-driven approaches provide a means to optimize
mobility, reduce congestion, and enhance commuter experiences. This study
focuses on traffic prediction within New York City’s bike-sharing system using
Citi Bike trip data.
Accurate traffic forecasting plays a crucial role in urban mobility
planning, as it enables city administrators and transportation agencies to allocate
resources effectively, optimize bike station availability, and improve overall
service efficiency. In this work, we leverage historical trip data to analyse
patterns in bike usage and develop a predictive model capable of estimating
future demand.
The study begins with data preprocessing, where raw Citi Bike trip
records are cleaned, transformed, and feature-engineered to extract temporal
trends (such as hourly and daily variations). The dataset is then aggregated to
derive meaningful insights into traffic fluctuations. A machine learning-based
approach, specifically Gradient Boosting Regression, is employed to forecast
trip demand. The model is evaluated using key performance metrics, ensuring
its reliability in predicting traffic patterns.
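
The aggregation step described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual preprocessing code. The timestamp format and the sample values are assumptions made for the example; real Citi Bike exports may name and format the start-time field differently.

```python
from collections import Counter
from datetime import datetime


def hourly_trip_counts(start_times):
    """Count trips per calendar hour and attach simple temporal features."""
    counts = Counter()
    for ts in start_times:
        # Assumed timestamp layout; adjust the format string to the real data.
        started = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        counts[started.replace(minute=0, second=0)] += 1

    rows = []
    for hour_start in sorted(counts):
        rows.append({
            "hour_start": hour_start,
            "hour": hour_start.hour,              # hour of day, 0-23
            "day_of_week": hour_start.weekday(),  # Monday = 0
            "trip_count": counts[hour_start],
        })
    return rows


# Hypothetical trip start times, invented for illustration only.
sample_trips = [
    "2024-05-06 08:05:12",
    "2024-05-06 08:47:30",
    "2024-05-06 09:10:00",
    "2024-05-07 08:15:45",
]
hourly_rows = hourly_trip_counts(sample_trips)
```

Hours with no trips do not appear in the output of this sketch; a full pipeline would probably need to fill them with a count of zero before modelling.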
In this study, historical trip data is analysed using machine learning and
deep learning models to forecast transportation demand. The dataset undergoes
preprocessing to extract meaningful features such as hourly trends and trip
counts. Several predictive models, including Linear Regression, Random Forest,
Gradient Boosting, and ARIMA, are employed to understand the temporal
patterns in ridership data. Additionally, deep learning techniques using PyTorch
are explored to enhance forecasting accuracy.
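
The report does not list the exact input features given to these models. One common way to expose temporal patterns to regression models is to use lagged trip counts as features and to split the data chronologically, so that the test period lies strictly after the training period. The sketch below assumes this setup; the function names, the lag choices (1 hour and 24 hours) and the synthetic series are illustrative only.

```python
def make_lag_features(series, lags=(1, 24)):
    """Build (features, targets) where each feature is an earlier value of the series."""
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


def chronological_split(features, targets, train_fraction=0.8):
    """Split without shuffling so the test set lies after the training set in time."""
    cut = int(len(targets) * train_fraction)
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])


# Synthetic hourly counts: three identical days (not real Citi Bike data).
hourly_counts = [10 + (h % 24) for h in range(72)]
X, y = make_lag_features(hourly_counts)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

A random shuffle would leak future information into training, which is why the split here keeps the original time order.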
This report details the data preprocessing steps, model selection process,
evaluation metrics, and insights drawn from predictive analytics. The findings
aim to contribute to the development of a more efficient and intelligent
transportation system, facilitating optimal scheduling and reduced operational
costs in bike-sharing mobility networks.
By implementing an intelligent prediction system, this research
contributes to the development of SMART (Sustainable, Multi-modal,
Adaptive, Resilient, and Technological) transportation solutions. The
findings can aid in better infrastructure planning, optimized bike availability,
and enhanced decision-making for future mobility strategies.

2. MOTIVATION OF THE TITLE


Traffic congestion is a growing concern in urban areas, affecting daily
commutes, logistics, and city planning. Accurate traffic prediction is crucial for
optimizing transportation systems, reducing delays, and improving mobility. In
this study, we explore the effectiveness of various machine learning models for
predicting traffic flow using the NYC Bike dataset.
We compare traditional and advanced machine learning techniques,
including Linear Regression, Random Forest, Gradient Boosting, LSTM, GRU,
and ARIMA, to determine their performance in forecasting traffic patterns. Our
analysis focuses on time-series modeling, where we plot trip count against
time to visualize trends and variations. The results indicate that
Gradient Boosting outperforms other models in predicting traffic accurately.
This study highlights the importance of choosing the right predictive
model for traffic forecasting, showcasing how machine learning can enhance
urban mobility solutions. Our findings can aid city planners, policymakers, and
transportation engineers in making data-driven decisions to improve traffic
management and infrastructure planning.
This study investigates the effectiveness of machine learning models in
predicting traffic flow, using real-world data from the New York City Citi
Bike system. As an alternative mode of transport, bike-sharing services provide
an eco-friendly solution to urban mobility, but their efficiency depends on
demand forecasting and fleet distribution. Without accurate predictions,
stations may suffer from bike shortages or excesses, causing inconvenience to
users. Hence, our research focuses on developing a robust predictive model
that can estimate future traffic trends based on historical data.
Our analysis follows a time-series modelling approach, where we plot
trip count against time to visualize fluctuations in bike demand. By capturing
daily, weekly, and seasonal trends, we aim to identify the most effective
method for traffic forecasting. Among the models we tested, Gradient Boosting
achieves the lowest MAE and RMSE, demonstrating superior accuracy in predicting
bike usage patterns.
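
A comparison of this kind can be set up along the following lines. This is a hedged sketch that assumes scikit-learn and NumPy are available. It uses a synthetic hourly series, default hyperparameters and only two of the models, so it does not reproduce the report's experiments or its numbers.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
hours = np.arange(24 * 28)  # four weeks of hourly time steps

# Synthetic demand with a daily cycle plus noise (not Citi Bike data).
trips = 100 + 60 * np.sin(2 * np.pi * (hours % 24) / 24) + rng.normal(0, 5, hours.size)

# Assumed features: hour of day and day of week.
X = np.column_stack([hours % 24, (hours // 24) % 7])

split = int(0.8 * hours.size)  # chronological split, no shuffling
X_train, X_test = X[:split], X[split:]
y_train, y_test = trips[:split], trips[split:]

models = {
    "Linear Regression": LinearRegression(),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "MAE": float(mean_absolute_error(y_test, pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }
```

The same loop extends to Random Forest or other regressors by adding entries to `models`. Sequence models such as LSTM, GRU and ARIMA need their own training code and are left out of this sketch.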
Ultimately, this research provides actionable insights for city planners,
policymakers, and transportation engineers. By integrating machine learning
into urban mobility strategies, we can enhance the efficiency, reliability, and
sustainability of modern transportation networks.

3. RESEARCH PAPERS
● Research on highway traffic flow prediction model and decision-making
method. Yuyu Zhu, QingE Wu & Na Xiao

● Machine Learning Models for Real-Time Traffic Prediction: A Case
Study in Urban Traffic Management. VanSang Ha, Hien Nguyen Thi Bao

● Road traffic can be predicted by machine learning equally effectively as
by complex microscopic model. Andrzej Sroczyński & Andrzej Czyżewski

● Urban traffic flow prediction techniques: A review. Boris Medina-Salgado,
Eddy Sánchez-DelaCruz, Pilar Pozos-Parra, Javier E. Sierra

4. DRAWBACK
The studies listed in Section 3 show the following drawbacks:
● Computational inefficiencies made it difficult to provide instant
decision support.
● They focused only on highway traffic, missing real-world urban traffic
control challenges (e.g., signals, intersections).
● They did not integrate predictive analytics into operational traffic
management.
● They struggled with scalability: the models required high-power GPUs,
making them impractical.
● They did not test responsiveness, stability, or interpretability in
real-world urban traffic.
● Many deep learning models, such as Long Short-Term Memory
(LSTM) and Gated Recurrent Units (GRU), require large datasets and
extensive training, making them computationally expensive for real-
time applications.
● Iterative algorithms used in training these models introduce delays in
real-time predictions, reducing their applicability in fast-changing traffic
environments.
● The scalability of these models remains a major issue, as many require
high-performance GPUs and cloud computing resources, making
them impractical for widespread deployment in real-time traffic
management systems.
● Sensor-based data collection methods often suffer from noisy, missing,
or unreliable data, affecting model accuracy.
● Many datasets used in research studies are sourced from a single city,
highway, or intersection, making them less applicable to other
locations with different traffic dynamics.
● Data scarcity is a critical issue, as real-time collection requires robust
infrastructure, which is not always available.
● Models trained on synthetic datasets may perform well in simulations but
fail when deployed in real-world scenarios.
● Deep learning models such as Convolutional Neural Networks (CNNs),
LSTMs, and GRUs require substantial computational resources,
making real-time deployment challenging.
● Traditional models such as Linear Regression and ARIMA offer clear,
interpretable relationships, whereas deep learning models function as
black boxes, making it difficult to understand their decision-making
process.
● Traffic engineers and policymakers often require explainable models to
justify decisions on infrastructure development and policy change.
5. PROBLEM STATEMENT
Urban transportation systems face significant challenges in accurately
predicting traffic demand, which is critical for optimizing mobility, reducing
congestion, and improving overall transportation efficiency.
Another key issue is computational inefficiency, as many deep learning-
based traffic prediction models require extensive training time and high-power
GPUs. Iterative algorithms used in these models are often slow, causing delays
in real-time predictions, which reduces their practicality in fast-changing urban
environments. Additionally, the high computational cost makes widespread
deployment difficult, especially in resource-constrained settings. This limitation
prevents many transportation agencies from adopting state-of-the-art predictive
models on a larger scale.
Another significant limitation of current traffic prediction models is their
focus on highway traffic, while overlooking the complexities of urban
environments. Highways have relatively straightforward traffic dynamics
compared to cities, which involve numerous intersections, traffic signals,
pedestrian activity, and multi-modal transportation. Ignoring these urban-
specific challenges results in models that fail to provide actionable insights for
city planners and traffic engineers. Furthermore, most traffic prediction
frameworks do not integrate predictive analytics into operational traffic
management systems, limiting their real-world applicability.
The lack of model interpretability is another issue that affects the
practical adoption of deep learning models in traffic prediction. Many deep
learning techniques, such as Long Short-Term Memory (LSTM) networks and
Gated Recurrent Units (GRUs), function as black boxes, making it difficult to
understand the reasoning behind their predictions. This lack of transparency
poses a challenge for policymakers and traffic engineers, who require
explainable models to justify infrastructure development and policy decisions.
In contrast, traditional statistical models like Linear Regression and ARIMA
provide clear, interpretable relationships but often lack the predictive accuracy
required for modern traffic forecasting.
Moreover, data scarcity remains a critical issue, as real-time collection
requires robust infrastructure, which is not always available in all regions. Even
when synthetic datasets are used for model training, these models often fail to
perform well when deployed in real-world settings.

6. DIAGRAM

Model evaluation results (values rounded to four decimal places):

Model               MAE        RMSE
Linear Regression   58.1262    74.4236
Random Forest       50.6650    68.5596
Gradient Boosting   46.5784    62.1953
LSTM                71.3077    83.6843
GRU                 75.7796    87.2328
ARIMA               754.0035   917.4004

After optimization:
Gradient Boosting MAE: 0.1106
Gradient Boosting RMSE: 0.1494
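
For reference, the two metrics reported above can be computed as in the sketch below, which also ranks the six models by their reported RMSE. The figures are copied from the results in this section and rounded to four decimals. The report does not state whether the post-optimization values are on the same scale as the others, so they are left out of the ranking.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the average squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Reported results from this section, rounded to four decimal places.
reported = {
    "Linear Regression": {"MAE": 58.1262, "RMSE": 74.4236},
    "Random Forest": {"MAE": 50.6650, "RMSE": 68.5596},
    "Gradient Boosting": {"MAE": 46.5784, "RMSE": 62.1953},
    "LSTM": {"MAE": 71.3077, "RMSE": 83.6843},
    "GRU": {"MAE": 75.7796, "RMSE": 87.2328},
    "ARIMA": {"MAE": 754.0035, "RMSE": 917.4004},
}

# Lower RMSE is better, so the first entry is the best-performing model.
ranking = sorted(reported, key=lambda name: reported[name]["RMSE"])
```

RMSE penalizes large errors more heavily than MAE, so it is never smaller than MAE on the same predictions. Every row of the reported results is consistent with this.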
