Big Data on Kubernetes

This is the code repository for Big Data on Kubernetes, published by Packt.

A practical guide to building efficient and scalable data solutions

What is this book about?

With step-by-step instructions and examples, this book will teach you the skills needed to build and deploy complex data pipelines on Kubernetes, resulting in efficient and scalable big data solutions.

This book covers the following exciting features:

Install and use Docker to run containers and build concise images
Gain a deep understanding of Kubernetes architecture and its components
Deploy and manage Kubernetes clusters on different cloud platforms
Implement and manage data pipelines using Apache Spark and Apache Airflow
Deploy and configure Apache Kafka for real-time data ingestion and processing
Build and orchestrate a complete big data pipeline using open-source tools
Deploy Generative AI applications on a Kubernetes-based architecture

If you feel this book is for you, get your copy today!

Instructions and Navigations

All of the code is organized into folders. For example, Chapter01.

The code will look like the following:

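import pandas as pd

# Pima Indians diabetes dataset (CSV file without a header row)
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'

# header=None makes pandas label the columns 0, 1, 2, ...
df = pd.read_csv(url, header=None)

# Add a column holding twice the value of column 5
df["newcolumn"] = df[5].apply(lambda x: x*2)

print(df.columns)
print(df.head())
print(df.shape)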

Who this book is for: If you are a data engineer, BI analyst, data team leader, data architect, or tech manager with a basic understanding of big data technologies, then this book is for you. Familiarity with the basics of Python programming, SQL queries, and YAML is required to understand the topics discussed in this book.

With the following software and hardware list, you can run all code files present in the book (Chapters 1-11).

Software and Hardware List

Chapter	Software required	OS required
1-11	Python>=3.9	Windows, macOS, or Linux
1-11	Docker, the latest version available	Linux
1-11	Docker Desktop, the latest version available	Windows or macOS
1-11	kubectl	Windows, macOS, or Linux
1-11	AWS CLI	Windows, macOS, or Linux
1-11	eksctl	Windows, macOS, or Linux
1-11	DBeaver Community Edition	Windows, macOS, or Linux
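
Before starting, it can help to confirm that the command-line tools from the table are installed. The following is a minimal sketch, not part of the book's code: the `check_environment` helper is hypothetical, and the executable names (`docker`, `kubectl`, `aws`, `eksctl`) are assumed to be the commands these tools install. DBeaver is a desktop application and is not checked here.

```python
import shutil
import sys

# Assumption: these are the command names the tools in the table install.
REQUIRED_TOOLS = ["docker", "kubectl", "aws", "eksctl"]
MIN_PYTHON = (3, 9)


def check_environment(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    label = f"python>={min_python[0]}.{min_python[1]}"
    report = {label: sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

A tool reported as missing may still be installed but absent from the PATH of the current shell, so treat the output as a first hint rather than a definitive check.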

Related products

Data Engineering with Databricks Cookbook
Data Engineering with Scala and Spark

Get to Know the Author

Neylson Crepalde is a Generative AI Strategist at AWS. Prior to that, he was CTO at A3Data, a consulting company focused on data, analytics, and artificial intelligence. He holds a PhD in Economic Sociology, a master's degree in Sociology of Culture, an MBA in Cultural Management, and a bachelor's degree in Orchestra Conducting. He has been working with data since 2015. He is committed to sharing knowledge with people of every professional level and helping data teams achieve their best. He holds several AWS certifications and is also Spark, Neo4j, and Airflow certified. Neylson has been teaching in colleges and MBA programs for more than 10 years, and he gives regular talks and lectures on data architecture, AI strategy, data governance, and network science.
