Document (1)
Linux, Shells, Commands, and Navigation, Common Text Editors, Administering Linux,
Introduction to Users and Groups, Linux shell scripting, shell computing, Introduction
to enterprise computing, Remote access.
Python Programming: Python basics, If, If-else, Nested if-else, Looping, For, While,
Nested loops, Control Structure, Break, Continue, Pass, Strings and Tuples,
Accessing Strings, Basic Operations, String slices, Working with Lists, Accessing list,
Operations, Function and Methods, Files, Pickling, Modules, Dictionaries, Dictionary
Comprehension, Functions and Functional Programming, Declaring and calling
Functions, Declare, assign and retrieve values from Lists, Introducing Tuples,
Accessing tuples, Visualizing using Matplotlib, Seaborn, OOPs concept, Class and
object, Attributes, Inheritance, Overloading, Overriding, Data hiding, Generators,
Decorators, Operations Exception, Exception Handling, except clause, Try-finally
clause, User Defined Exceptions, Data wrangling, Data cleaning, Load images and
audio files using Python libraries (Pillow/scikit-learn), Creation of Python virtual
environment, Logging in Python.
R Programming: Reading and Getting Data into R, Exporting Data from R, Data
Objects-Data Types & Data Structure. Viewing Named Objects, Structure of Data
Items, Manipulating and Processing Data in R (Creating, Accessing, sorting data
frames, Extracting, Combining, Merging, reshaping data frames), Control Structures,
Functions in R (numeric, character, statistical), working with objects, Viewing Objects
within Objects, Constructing Data Objects, Packages – tidyverse, dplyr, tidyr, etc.,
Queuing Theory, Case Study
Database Concepts (File System and DBMS), OLAP vs OLTP, Database Storage
Structures (Tablespace, Control files, Data files), Structured and Unstructured data,
SQL Commands (DDL, DML & DCL), Stored functions and procedures in SQL,
Conditional Constructs in SQL, data collection, Designing Database schema, Normal
Forms and ER Diagram, Relational Database modelling, Stored Procedures, Triggers.
Tools and methods for gathering data in a systematic fashion, Data Warehousing
concept, NoSQL, Data Models - XML, working with MongoDB, Cassandra: overview,
comparison with MongoDB, working with Cassandra, Connecting DBs with Python,
Introduction to Data Driven Decisions, Enterprise Data Management, data preparation
and cleaning techniques
Understanding Data Lakes – concepts, architecture and components, Data Lake vs.
Data Warehouse vs. Lakehouse, data storage management, processing and
transformation, workflow orchestration, analytics in Data Lake, case study using Delta
Lake with analytics and AI.
Introduction to Big Data: Big Data Beyond The Hype, Big Data Skills And Sources
Of Big Data, Big Data Adoption, Research And Changing Nature Of Data Repositories,
Data Sharing And Reuse Practices And Their Implications For Repository Data
Curation.
Introduction to Hive: Programming with Hive: Data warehouse system for Hadoop,
Optimizing with Combiners and Partitioners, Bucketing, more common algorithms:
sorting, indexing and searching, Relational manipulation: map-side and reduce-side
joins, evolution, purpose and use, Case Studies on Ingestion and warehousing
HBase: Overview, comparison and architecture, Java client API, CRUD operations and
security
Apache Spark: Overview, APIs for large-scale data processing, Linking with Spark,
Initializing Spark, Resilient Distributed Datasets (RDDs), External Datasets, RDD Operations,
Passing Functions to Spark, Job optimization, Working with Key-Value Pairs, Shuffle
operations, RDD Persistence, Removing Data, Shared Variables, EDA using PySpark,
Deploying to a Cluster, Spark Streaming, Spark MLlib and ML APIs, Spark DataFrames/Spark
SQL, Integration of Spark and Kafka, Setting up Kafka Producer and Consumer, Kafka
Connect API, MapReduce, Connecting DBs with Spark
Machine Learning:
Deep Learning:
Generative AI: