
Data Scientist Job Description

This data scientist job description outlines the key skills and qualifications needed for the role. It includes proficiency with machine learning algorithms like regression, classification, clustering and neural networks. Programming languages highlighted are Python, R, Scala and SQL for databases. Statistical analysis skills and experience with libraries such as Scikit-Learn, TensorFlow and PyTorch are emphasized. It also lists experience desired with cloud platforms, data visualization tools, and big data technologies like Spark, Hadoop and Hive.

Uploaded by

Ganesh Gore
Copyright
© All Rights Reserved

Data Scientist Job Description

1) Machine Learning Skills:
a) Regression, Classification, Decision Trees (Bagging, Boosting, Random Forest, etc.), Support Vector Machines, Artificial Neural Networks, Clustering, Dimensionality Reduction, etc.
b) k-Nearest Neighbors, Naive Bayes, SVM, Decision Forests, XGBoost
c) Clustering, Dimensionality Reduction, Outlier Detection, Gradient Descent

2) Programming Languages:
a) Python, R, Alteryx, Julia, Scala, MATLAB, Spark

3) Libraries:
a) scikit-learn, pandas, PyTorch, TensorFlow, SciPy, OpenCV, Keras, spaCy, Matplotlib

4) Databases:
a) RDBMS, NoSQL, MongoDB, SQL, Cassandra/PostgreSQL/Neo4j, Oracle, etc.

5) Statistics:
a) Statistical tests, distributions, regression, maximum likelihood estimators

6) Cloud:
a) Microsoft Azure Databricks, Azure Data Factory, CI/CD pipelines
b) BigQuery, AWS/Azure/GCP

7) Visualization:
a) Tableau, Power BI, Excel, Visual Basic

8) Time-series data and mixed models


Data wrangling
Application design, coding practices, and technical documentation.
Developing and prototyping analytics tools using Flask/Python and React, Angular, or a similar framework would be a huge plus.
Hadoop
Extraction from structured/unstructured text (knowledge or statistics based).
Neural networks and deep learning frameworks
Deep learning approaches to NLP: word/paragraph embedding representation learning, text/sentiment classification, word2vec

Machine Learning Engineer
1) Statistical machine learning, linear algebra, and deep learning for computer vision
2) C/C++, Python
3) PyTorch, TensorFlow
4) MLOps for model tracking and model serving
5) Linux
6) Docker, Podman, Kubernetes, Spark, Argo CD
7) Presto, Apache Hive, and Apache Iceberg
8) Deep learning architectures, CNN/sequence models
9) MapReduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.
10) Deep learning and transfer learning techniques
11) Data structures, data modeling, and software architecture
12) Math, probability, statistics, and algorithms
13) Regularization techniques: LASSO, Ridge, and Elastic Net
14) Time series modelling
