Let's Start With Data Science
Prerequisite: Python
While Jupyter runs code in many programming languages, Python is a requirement (Python 3.3 or
greater, or Python 2.7) for installing the Jupyter Notebook.
We recommend using the Anaconda distribution to install Python and Jupyter. We’ll go through its
installation in the next section.
IPython 1.x, which included the parts that later became Jupyter, was the last version to support
Python 3.2 and 2.6.
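A quick way to confirm that your interpreter meets the version requirement stated above is sketched below. The helper name meets_jupyter_requirement is hypothetical and only used for illustration; it is not part of Jupyter or Anaconda.

```python
import sys


def meets_jupyter_requirement(version_info=sys.version_info):
    """Return True for Python 2.7, or for Python 3.3 and newer.

    Hypothetical helper for illustration; it only encodes the
    requirement quoted in the text above.
    """
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return (major, minor) >= (3, 3)


# Report the running interpreter's version and whether it qualifies.
status = "meets" if meets_jupyter_requirement() else "does not meet"
print("Python", sys.version.split()[0], status, "the requirement")
```

Run this with the same Python you plan to install Jupyter into, since a machine can have several interpreters installed side by side.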
As an existing Python user, you may wish to install Jupyter using Python’s package manager, pip,
instead of Anaconda.
First, ensure that you have the latest pip; older versions may have trouble with some dependencies:
pip3 install --upgrade pip
Then install the Jupyter Notebook:
pip3 install jupyter
Anaconda
Basic Python concepts to go through:
● https://www.geeksforgeeks.org/python-programming-language/
● https://www.programiz.com/python-programmin
● https://towardsdatascience.com/a-beginners-guide-to-data-analysis-in-python-188706df5447
Pandas
● Matplotlib - Matplotlib is the most popular library for data exploration and
visualization in the Python ecosystem. Many other plotting tools, such as
seaborn and the plotting functions in pandas, are built on top of it.
Matplotlib
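As a minimal sketch of what working with Matplotlib looks like, the example below draws one line plot. The data, labels, and the output file name first_plot.png are made up for illustration.

```python
import matplotlib.pyplot as plt

# Made-up sample data: the squares of 0..9.
x = list(range(10))
y = [value ** 2 for value in x]

# Create a figure with a single set of axes and draw a line on it.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A first Matplotlib line plot")
ax.legend()

# Write the figure to an image file (file name chosen for this example).
fig.savefig("first_plot.png")
```

In a Jupyter notebook the figure is normally displayed under the cell, so the savefig call is only needed when you want to keep the image as a file.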
● Scikit-learn - Scikit-learn (imported as sklearn) is the Swiss Army knife of
data science libraries: a single, general-purpose toolkit that covers most
everyday modeling tasks. In simple words, it is used for building machine
learning models.
Scikit-learn is probably the most useful library for machine learning in Python.
The sklearn library contains a lot of efficient tools for machine learning and
statistical modeling including classification, regression, clustering, and
dimensionality reduction.
● Logistic Regression
● Decision Tree
● Random Forest Classifier
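The three classifiers listed above can be tried side by side in a few lines. The sketch below is one possible setup, not a recommended workflow: the iris dataset bundled with scikit-learn, the 75/25 split, and the parameter values are all choices made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows to measure accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The three models named in the list above, with example settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest Classifier": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
}

# Every scikit-learn classifier shares the same fit/score interface.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.2f}")
```

Because all three models expose the same fit and score methods, swapping one algorithm for another usually means changing a single line.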