Data Science Notes 1
Data Science is an interdisciplinary field that extracts insights from structured and unstructured data
using scientific methods, algorithms, and systems. It combines statistics, mathematics, programming,
and domain expertise to analyze complex data.
Key Components:
Statistics & Probability: Used for data analysis and hypothesis testing.
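As a rough illustration of hypothesis testing, the sketch below computes Welch's t-statistic for the difference between two sample means using only the Python standard library. The function name and the sample values are invented for this example and are not part of the notes. A large absolute value of t suggests the two means differ by more than chance alone would explain.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Return Welch's t-statistic for the difference between two sample means."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical measurements, invented purely for illustration.
control = [12.1, 11.8, 12.4, 12.0, 11.9]
treatment = [12.9, 13.1, 12.7, 13.0, 12.8]

t = welch_t_statistic(treatment, control)
print(f"t = {t:.2f}")
```

Turning t into a p-value needs the t-distribution, which the standard library does not provide, so this sketch stops at the statistic itself.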
Applications:
Business Analytics
Healthcare Predictions
Fraud Detection
Recommendation Systems
Autonomous Systems
Before analysis, raw data needs to be cleaned and processed to ensure accuracy and reliability.
1. Data Collection: Gathering structured and unstructured data from various sources.
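A minimal sketch of the cleaning idea described above, assuming records arrive as dictionaries with hypothetical "name" and "age" fields (both field names and the sample rows are made up for illustration). It drops rows with missing or unparseable values, normalizes text, and removes duplicates.

```python
def clean_records(records):
    """Drop incomplete rows, normalize fields, and remove duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        name = record.get("name")
        age = record.get("age")
        # Skip rows where a required value is missing or blank.
        if name is None or age is None or str(name).strip() == "":
            continue
        # Skip rows whose age cannot be read as an integer.
        try:
            age = int(age)
        except (TypeError, ValueError):
            continue
        # Normalize whitespace and capitalization so duplicates match.
        name = str(name).strip().title()
        key = (name, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned


# Hypothetical raw input, invented for illustration.
raw = [
    {"name": " alice ", "age": "34"},
    {"name": "Alice", "age": 34},
    {"name": "bob", "age": None},
    {"name": "carol", "age": "unknown"},
    {"name": "", "age": 20},
    {"name": "dave", "age": 41},
]
print(clean_records(raw))
```

Dropping bad rows is only one possible policy. Depending on the data, filling in missing values or flagging rows for review may be more appropriate.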
Tools Used:
Machine Learning (ML) is a subset of AI that enables computers to learn patterns from data without
being explicitly programmed.
2. Unsupervised Learning: Finds hidden patterns in unlabeled data, for example through clustering or principal component analysis (PCA).
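To make the clustering example concrete, here is a small sketch of k-means (Lloyd's algorithm) on one-dimensional data, written in plain Python. The function name, the starting centroids, and the sample values are assumptions chosen for illustration. The algorithm alternates between assigning each value to its nearest centroid and moving each centroid to the mean of its assigned values.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values with Lloyd's algorithm from the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(value - centroids[i]),
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
data = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
final_centroids, groups = k_means_1d(data, centroids=[0.0, 5.0])
print(final_centroids)
print(groups)
```

No labels are supplied anywhere: the two groups emerge from the distances alone, which is what makes this unsupervised. Results in general depend on the starting centroids, so practical implementations usually try several initializations.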
Common Algorithms:
Data visualization uses graphical representations to make trends, patterns, and insights in data
easier to understand.
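As a small illustration, the sketch below draws a text-based bar chart of category frequencies using only the standard library. The function name and the sample categories are made up for this example. In practice a plotting library would normally be used, but the idea is the same: map each count to a visual length so differences can be seen at a glance.

```python
from collections import Counter


def text_bar_chart(values, width=20):
    """Return a text bar chart of how often each value occurs."""
    counts = Counter(values)
    if not counts:
        return ""
    largest = max(counts.values())
    lines = []
    for label in sorted(counts):
        # Scale each bar relative to the most frequent value.
        bar_length = max(1, round(counts[label] / largest * width))
        lines.append(f"{label:<8} {'#' * bar_length} {counts[label]}")
    return "\n".join(lines)


# Hypothetical categorical data, invented for illustration.
regions = ["north", "south", "north", "east", "north", "south"]
chart = text_bar_chart(regions, width=6)
print(chart)
```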
Types of Visualizations:
Best Practices:
Big Data refers to extremely large datasets that require specialized tools for storage, processing, and
analysis.
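One idea behind such tools is processing data in pieces rather than loading everything at once. The sketch below shows that idea on a very small scale: it computes a mean by reading fixed-size chunks from any iterable. The function names and chunk sizes are assumptions for illustration, and this is not how any particular Big Data platform is implemented.

```python
from itertools import islice


def read_in_chunks(iterable, chunk_size):
    """Yield successive lists of at most chunk_size items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(values, chunk_size=1000):
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in read_in_chunks(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no values to average")
    return total / count


# range() produces values lazily, so the full sequence is never stored.
print(streaming_mean(range(1, 1_000_001), chunk_size=10_000))
```

Because each chunk contributes only a running total and a count, the same pattern extends to chunks handled on different machines and combined afterwards.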
Technologies Used:
Cloud Platforms: AWS, Azure, Google Cloud for scalable storage and processing.
Applications:
Predictive Analytics
Personalized Marketing