Python Programming for Data Science
Module-4 (continued)
1. Overview of Series and DataFrame
Series: A one-dimensional labeled array capable of
holding any data type. It has an index that provides labels
for each element.
import pandas as pd
# Creating a Series
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s)
DataFrame: A two-dimensional labeled data structure
with columns that can hold different data types. It is
similar to a SQL table or a spreadsheet.
# Creating a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
}, index=['row1', 'row2', 'row3'])
print(df)
2. Functionalities on Indexes: Hierarchical Indexing
Hierarchical indexing allows you to have multiple levels of
indexing on a DataFrame or Series, providing a way to work
with higher-dimensional data in lower-dimensional structures.
# Creating a DataFrame with hierarchical indexing
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('letter', 'number'))
df_hierarchical = pd.DataFrame({'value': [1, 2, 3, 4]}, index=index)
print(df_hierarchical)
3. Operations Between Data Structures: Merging, Joining,
and Concatenating
Merging: Similar to SQL joins, you can merge
DataFrames based on a common key.
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value1': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['B', 'A', 'D'], 'value2': [4, 5, 6]})
merged_df = pd.merge(df1, df2, on='key', how='inner')
print(merged_df)
Joining: Used to combine two DataFrames based on their
indexes.
df1 = pd.DataFrame({'value1': [1, 2]}, index=['A', 'B'])
df2 = pd.DataFrame({'value2': [3, 4]}, index=['B', 'C'])
joined_df = df1.join(df2, how='outer')
print(joined_df)
Concatenating: Stacks DataFrames along a particular axis (rows or columns).
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})
concatenated_df = pd.concat([df1, df2], axis=0)
print(concatenated_df)
Pandas Data Structures: Series and DataFrame
1. Series:
A Series is a one-dimensional labeled array that can hold any
data type (integer, float, string, Python objects, etc.). It is
similar to a single column in a spreadsheet or a database table.
Key Features:
- Has a label-based index to access data.
- Has a single dtype; an object-dtype Series can still hold mixed Python objects.
- Can be created from lists, dictionaries, numpy arrays, or scalar values.
Creation Examples:
import pandas as pd
# From a list
s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
# From a dictionary
s = pd.Series({'a': 10, 'b': 20, 'c': 30})
# From a scalar
s = pd.Series(5, index=['a', 'b', 'c'])
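NumPy arrays (listed above) work the same way; a small sketch:
import numpy as np
# From a numpy array; the Series adopts the array's dtype (float64 here)
s = pd.Series(np.array([1.5, 2.5, 3.5]), index=['a', 'b', 'c'])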
2. DataFrame:
A DataFrame is a two-dimensional labeled data structure with
rows and columns, similar to a spreadsheet or SQL table.
Key Features:
- Heterogeneous data: columns can contain different data types.
- Row and column indexes for easy data manipulation.
- Supports various file formats for input and output, e.g., CSV, Excel, JSON.
Creation Examples:
# From a dictionary of lists
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['NY', 'LA', 'SF']}
df = pd.DataFrame(data)
# From a list of dictionaries
data = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
df = pd.DataFrame(data)
Functionalities on Indexes
Pandas provides extensive functionalities for handling and
manipulating indexes in both Series and DataFrame:
1. Indexing Basics
Explicit Indexing: Using labels.
s['a']     # Access a Series value by its label
df.loc[0]  # Access a row by label (0 is a label in the default RangeIndex)
Implicit Indexing: Using integer positions.
s.iloc[0]   # Access a Series value by position (plain s[0] is deprecated on a labeled index)
df.iloc[0]  # Access a row by position
2. Setting and Resetting Index
Set a new column as the index:
df.set_index('Name', inplace=True)
Reset the index to default:
df.reset_index(inplace=True)
3. Multi-Indexing (Hierarchical Indexing)
Create a MultiIndex:
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('Group', 'Subgroup'))
df = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)
Access specific levels:
df.loc['A'] # Access all subgroups under 'A'
4. Index Alignment
Index alignment occurs automatically in operations:
s1 = pd.Series([1, 2], index=['a', 'b'])
s2 = pd.Series([3, 4], index=['b', 'c'])
result = s1 + s2 # Automatic alignment
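print(result)  # 'a' and 'c' exist in only one Series, so they become NaN; 'b' is 2 + 3 = 5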
5. Modifying Indexes
Rename index labels, or conform the data to a new index:
df.rename(index={0: 'zero', 1: 'one'}, inplace=True)
df.reindex([0, 2, 4], fill_value=0)  # returns a new object; missing labels are filled with 0
6. Sorting Index
Sort by index values:
df.sort_index(axis=0, ascending=True)
7. Handling Duplicate Indexes
Check for duplicates:
df.index.is_unique
Drop duplicates:
df = df.loc[~df.index.duplicated(keep='first')]
4. Function Application and Mapping: Applying Functions
for Data Transformation
You can apply functions to DataFrames and Series easily using
apply() and map().
Using apply(): To apply a function along an axis of the
DataFrame.
df = pd.DataFrame({'A': [1, 2, 3]})
df['B'] = df['A'].apply(lambda x: x ** 2)
print(df)
Using map(): For transforming values of a Series.
s = pd.Series([1, 2, 3])
s_mapped = s.map(lambda x: x * 2)
print(s_mapped)
Exploring Hierarchical Indexing
Hierarchical indexing (also called multi-level indexing) in
pandas allows you to work with data that has multiple
levels of row or column labels. This is especially useful
when dealing with higher-dimensional data in a tabular
format.
Key Features:
1. Creation: Use MultiIndex objects to create hierarchical
indexing.
2. Selection: Access subsets of data using .loc[], .iloc[], or
slicing.
3. Reorganization: Change the structure with methods like stack() and unstack() (demonstrated after the example below).
Example:
import pandas as pd
# Creating a DataFrame with a MultiIndex
index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Letter', 'Number']
)
data = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)
print(data)
# Accessing specific data
print(data.loc['A']) # All rows with 'A' as Letter
print(data.loc[('A', 1)]) # Specific entry
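The reorganization methods noted above reshape a hierarchical index; a brief sketch using the DataFrame just created:
# unstack() pivots the 'Number' level into columns; stack() reverses it
wide = data.unstack('Number')
print(wide)
print(wide.stack())  # back to the original long form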
Operations Between Data Structures
Pandas provides versatile tools for merging, joining, and
concatenating data structures. These operations are
essential for combining datasets.
1. Merging: Combines DataFrames based on keys or indices.
- pd.merge(): Performs SQL-like joins.
2. Joining: Merges datasets on their indices.
- DataFrame.join(): Convenient for joining on the index.
3. Concatenation: Stacks datasets along a specific axis.
- pd.concat(): Combines datasets along rows or columns.
4. Combining: Aligns datasets and performs element-wise operations.
- combine_first(): Fills missing data with another dataset (see the sketch after the example below).
Example:
# Merging
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value1': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['A', 'B', 'D'], 'value2': [4, 5, 6]})
merged = pd.merge(df1, df2, on='key', how='outer')
# Concatenation (non-matching columns are filled with NaN)
concat_df = pd.concat([df1, df2], axis=0, ignore_index=True)
print("Merged DataFrame:")
print(merged)
print("\nConcatenated DataFrame:")
print(concat_df)
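combine_first(), listed above, patches missing values in one DataFrame with values from another; a minimal sketch with illustrative frames:
a = pd.DataFrame({'x': [1, None], 'y': [3, 4]})
b = pd.DataFrame({'x': [10, 20], 'y': [30, 40]})
print(a.combine_first(b))  # the NaN in 'a' is filled from 'b'; existing values in 'a' win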
Function Application and Mapping
Functions can be applied across pandas objects to transform
or manipulate data efficiently. These include element-wise
transformations, row/column-wise operations, and
user-defined functions.
1. Element-wise:
- applymap(): Applies a function element-wise to a DataFrame (renamed DataFrame.map() in pandas 2.1+).
- map(): Applies a function to Series elements.
2. Row/Column-wise:
- apply(): Applies a function along a specified axis.
Example:
# Applying functions
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# Element-wise
squared = df.applymap(lambda x: x ** 2)  # DataFrame.map() in pandas 2.1+
# Row-wise
row_sum = df.apply(lambda row: row.sum(), axis=1)
print("Squared DataFrame:")
print(squared)
print("\nRow Sums:")
print(row_sum)
Module-5
Introduction to Pandas I/O Tools
Pandas provides powerful tools for data input and output
(I/O) to read from and write to various file formats. These
tools allow seamless interaction with diverse data storage
formats, facilitating robust data analysis.
Reading CSV and Textual Files
Key Functions:
pd.read_csv(): Reads CSV files.
pd.read_table(): Reads tabular data from delimited text
files.
DataFrame.to_csv(): Writes a DataFrame to a CSV file.
Example:
import pandas as pd
# Reading a CSV file
df = pd.read_csv('data.csv')
# Writing to a CSV file
df.to_csv('output.csv', index=False)
To run this, first create data.csv in the working directory.
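pd.read_table(), listed above, reads delimiter-separated text; a short sketch, assuming a hypothetical tab-separated file data.txt:
# read_table defaults to tab as the separator
df_txt = pd.read_table('data.txt')
# Equivalent call with read_csv and an explicit separator
df_txt = pd.read_csv('data.txt', sep='\t')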
Reading/Writing HTML Files
Key Functions:
pd.read_html(): Parses tables from HTML documents.
DataFrame.to_html(): Exports a DataFrame to an HTML
table.
Example:
# Reading tables from an HTML file
html_tables = pd.read_html('https://example.com/table-page.html')
# Writing a DataFrame to an HTML file
df.to_html('output.html')
Reading Data from XML Files
Key Functions:
pd.read_xml(): Reads data from XML files (pandas 1.3+; needs an XML parser such as lxml).
DataFrame.to_xml(): Exports a DataFrame to an XML file.
Example:
# Reading data from an XML file
xml_data = pd.read_xml('data.xml')
# Writing a DataFrame to an XML file
df.to_xml('output.xml')
Reading Data from Excel Files
Key Functions:
pd.read_excel(): Reads Excel files (.xls, .xlsx).
DataFrame.to_excel(): Writes a DataFrame to an Excel
file.
Example:
# Reading an Excel file
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
# Writing to an Excel file
df.to_excel('output.xlsx', index=False)
Reading JSON Data
Key Functions:
pd.read_json(): Reads JSON data.
DataFrame.to_json(): Exports a DataFrame to JSON
format.
Example:
# Reading JSON data
df = pd.read_json('data.json')
# Writing a DataFrame to a JSON file
df.to_json('output.json', orient='records')
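For newline-delimited JSON (one object per line), read_json accepts lines=True; a sketch assuming a hypothetical data.jsonl file:
# Each line of data.jsonl is one JSON record
df_lines = pd.read_json('data.jsonl', lines=True)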
Pickle Serialization
Pickle is a binary format for serializing and de-serializing
Python objects, allowing you to save and load Pandas
DataFrames efficiently. Only load pickle files from trusted
sources, since unpickling can execute arbitrary code.
Key Functions:
DataFrame.to_pickle(): Saves a DataFrame to a pickle file.
pd.read_pickle(): Loads a DataFrame from a pickle file.
Example:
# Writing to a pickle file
df.to_pickle('data.pkl')
# Reading from a pickle file
df = pd.read_pickle('data.pkl')
Summary Table of I/O Functions
File Format   Reading Function    Writing Function
CSV/Text      pd.read_csv()       DataFrame.to_csv()
HTML          pd.read_html()      DataFrame.to_html()
XML           pd.read_xml()       DataFrame.to_xml()
Excel         pd.read_excel()     DataFrame.to_excel()
JSON          pd.read_json()      DataFrame.to_json()
Pickle        pd.read_pickle()    DataFrame.to_pickle()
Pandas Data Manipulation: Data Preparation
Data preparation is a critical step in the data analysis
workflow. It involves cleaning, transforming, and
preprocessing the data to ensure it is ready for analysis.
Pandas offers a comprehensive set of tools to handle
common data preparation tasks.
1. Handling Missing Data
Techniques:
Detect missing data with isnull() and notnull().
Remove missing data with dropna().
Fill missing values with fillna() or interpolate().
Examples:
import pandas as pd
import numpy as np
# Example DataFrame with missing values
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, 3, 4],
    'C': ['a', 'b', 'c', np.nan]
})
# Detect missing values
print(df.isnull())
# Drop rows with missing values
cleaned_df = df.dropna()
# Fill missing values
filled_df = df.fillna({'A': 0, 'B': df['B'].mean(), 'C': 'unknown'})
print(filled_df)
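interpolate(), mentioned above, estimates missing numeric values from their neighbors; a brief sketch on column 'A' of the DataFrame above:
# Linear interpolation: the NaN in 'A' becomes 3.0 (midway between 2 and 4)
print(df['A'].interpolate())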
2. Removing Duplicates
Duplicates can distort analyses and should be removed or
handled appropriately.
Key Functions:
duplicated(): Identifies duplicate rows.
drop_duplicates(): Removes duplicate rows.
Example:
# Example DataFrame with duplicates
df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': [3, 3, 4, 5]})
# Detect duplicates
print(df.duplicated())
# Drop duplicate rows
unique_df = df.drop_duplicates()
print(unique_df)
3. Data Transformation
Scaling and Normalization:
Normalize values to a range [0, 1] or standardize them to z-scores (a standardization sketch follows the example below).
String Operations:
Clean or preprocess textual data using .str methods.
Example:
# Scaling numeric data
df = pd.DataFrame({'A': [10, 20, 30], 'B': [100, 200, 300]})
df['A_scaled'] = (df['A'] - df['A'].min()) / (df['A'].max() - df['A'].min())
# String manipulation
text_df = pd.DataFrame({'C': [' hello ', 'WORLD!', 'pandas ']})
text_df['C_cleaned'] = text_df['C'].str.strip().str.lower()
print(text_df)
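Standardization rescales values to mean 0 and unit standard deviation; a minimal sketch on column 'A' of the numeric DataFrame above:
# Z-score standardization: subtract the mean, divide by the standard deviation
df['A_standardized'] = (df['A'] - df['A'].mean()) / df['A'].std()
print(df[['A', 'A_standardized']])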
4. Handling Outliers
Outliers can skew results and may need removal or
transformation.
Techniques:
Identify outliers using statistical methods (e.g., the IQR rule or z-scores; a z-score sketch follows the example below).
Replace or cap extreme values.
Example:
# Example DataFrame with outliers
df = pd.DataFrame({'A': [1, 2, 3, 1000]})
# Replace outliers with median
q1 = df['A'].quantile(0.25)
q3 = df['A'].quantile(0.75)
iqr = q3 - q1
outlier_threshold = 1.5 * iqr
lower = q1 - outlier_threshold
upper = q3 + outlier_threshold
df['A'] = df['A'].mask((df['A'] < lower) | (df['A'] > upper), df['A'].median())
print(df)
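As an alternative to the IQR rule, z-scores measure distance from the mean in standard deviations; a minimal sketch (3 is the conventional cutoff; a looser one is used here because the sample is tiny):
# Recreate the sample with an extreme value
df = pd.DataFrame({'A': [1, 2, 3, 1000]})
z = (df['A'] - df['A'].mean()) / df['A'].std()
print(df['A'][z.abs() > 1.4])  # flags 1000; with 4 points the sample z-score cannot exceed 1.5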
5. Feature Encoding
Key Methods:
Convert categorical variables using one-hot encoding
(pd.get_dummies()).
Map or replace values (map(), replace()).
Example:
# One-hot encoding
df = pd.DataFrame({'Category': ['A', 'B', 'C']})
encoded_df = pd.get_dummies(df, columns=['Category'])
# Mapping values
df['Category_mapped'] = df['Category'].map({'A': 1, 'B': 2, 'C': 3})
print(df)
6. Data Integration
Combine or reshape datasets for analysis.
Techniques:
Merge datasets with pd.merge().
Reshape data using melt() or pivot() (a pivot() sketch follows the example below).
Example:
# Reshaping data with melt
df = pd.DataFrame({'ID': [1, 2], 'Math': [90, 80], 'Science': [85, 95]})
melted = pd.melt(df, id_vars='ID', var_name='Subject', value_name='Score')
print(melted)
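pivot(), mentioned above, is roughly the inverse of melt(); a sketch that spreads the melted frame back into one column per subject:
# Rows keyed by ID, one column per Subject
wide = melted.pivot(index='ID', columns='Subject', values='Score')
print(wide)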
7. Data Filtering and Subsetting
Extract meaningful subsets of data.
Key Methods:
Use boolean indexing for conditional filtering.
Use .query() for concise filtering.
Example:
# Conditional filtering
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [10, 20, 30, 40]})
filtered_df = df[df['A'] > 2]
# Using query
filtered_df_query = df.query('A > 2')
print(filtered_df_query)
Concatenating Data: Combining Datasets
Concatenation is the process of combining datasets along
rows or columns. Pandas provides the pd.concat()
function for this purpose.
1. Concatenating Along Rows
Combines datasets vertically, stacking rows.
Example:
import pandas as pd
# Example DataFrames
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
# Concatenating along rows
result = pd.concat([df1, df2], axis=0, ignore_index=True)
print(result)
2. Concatenating Along Columns
Combines datasets horizontally, aligning by index.
Example:
# Example DataFrames
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'B': [3, 4]})
# Concatenating along columns
result = pd.concat([df1, df2], axis=1)
print(result)
3. Concatenation with Keys
Adds hierarchical indexing to differentiate datasets.
Example:
result = pd.concat([df1, df2], keys=['First', 'Second'])
print(result)
4. Concatenation with Different Indices
By default (join='outer'), index labels missing from one input produce NaN; join='inner' keeps only the labels shared by all inputs (see the sketch after the example below).
Example:
df1 = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
df2 = pd.DataFrame({'B': [3, 4]}, index=[1, 2])
# Concatenate with outer join
result = pd.concat([df1, df2], axis=1, join='outer')
print(result)
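A sketch of the inner variant, using the same frames:
# Only index label 1 appears in both df1 and df2, so only that row is kept
inner_result = pd.concat([df1, df2], axis=1, join='inner')
print(inner_result)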
Data Transformation: Sorting, Filtering, and
Replacing Values
1. Sorting Data
Sorting allows you to reorder rows or columns based on
values or indices.
Key Functions:
sort_values(): Sorts by column values.
sort_index(): Sorts by index.
Example:
# Sorting by column values
df = pd.DataFrame({'A': [3, 1, 2], 'B': [6, 4, 5]})
sorted_df = df.sort_values(by='A')
# Sorting by index
sorted_index_df = df.sort_index()
print(sorted_df)
print(sorted_index_df)
2. Filtering Data
Filters rows based on conditions.
Techniques:
Boolean indexing.
.query() for concise filtering.
Example:
# Filtering rows
filtered_df = df[df['A'] > 1]
# Using query
filtered_query_df = df.query('A > 1')
print(filtered_query_df)
3. Replacing Values
Replace values to standardize or clean data.
Key Functions:
replace(): Replace specific values.
fillna(): Replace missing values.
Example:
# Replacing specific values
df = pd.DataFrame({'A': [1, 2, -999], 'B': [3, -999, 5]})
cleaned_df = df.replace(-999, np.nan)
# Replacing multiple values
multi_replaced_df = df.replace({-999: np.nan, 1: 10})
print(multi_replaced_df)
4. Applying Custom Transformations
Apply custom transformations row-wise, column-wise, or
element-wise using .apply() or .applymap().
Example:
# Applying a function to a column
df['A_transformed'] = df['A'].apply(lambda x: x * 2)
# Applying a function to all elements
df_transformed = df.applymap(lambda x: x * 2 if pd.notnull(x) else x)  # DataFrame.map() in pandas 2.1+
print(df_transformed)
Summary Table
Operation               Function            Description
Concatenation           pd.concat()         Combines datasets along rows or columns.
Sorting                 sort_values()       Sorts rows based on column values.
Sorting by Index        sort_index()        Sorts rows or columns by index.
Filtering               Boolean indexing    Filters rows based on conditions.
Replacing Values        replace()           Replaces specific values with new ones.
Filling Missing Values  fillna()            Replaces missing data with specified values.
Discretization and Binning: Grouping Continuous
Data
Discretization involves converting continuous data into
discrete intervals, or "bins." This is useful for simplifying
data or preparing it for certain types of analysis (e.g.,
histograms, categorical modeling).
1. Creating Bins
Use pd.cut() or pd.qcut() to create bins:
pd.cut(): Bins data into equal-width intervals.
pd.qcut(): Bins data into equal-frequency intervals.
Example:
import pandas as pd
import numpy as np
# Example DataFrame
data = pd.DataFrame({'Value': [0.5, 1.5, 2.5, 3.5, 4.5]})
# Equal-width binning
data['Equal_Width_Bin'] = pd.cut(data['Value'], bins=3, labels=['Low', 'Medium', 'High'])
# Equal-frequency binning
data['Equal_Frequency_Bin'] = pd.qcut(data['Value'], q=3, labels=['Low', 'Medium', 'High'])
print(data)
2. Custom Bins
You can define custom bin edges and labels.
Example:
# Custom binning
bin_edges = [0, 2, 4, 5]
bin_labels = ['Low', 'Medium', 'High']
data['Custom_Bin'] = pd.cut(data['Value'], bins=bin_edges, labels=bin_labels)
print(data)
Permutation: Reordering Data
Permutation involves shuffling or reordering data. This is
particularly useful for testing and sampling.
1. Randomly Permuting Rows
Use sample() to randomly shuffle rows.
Example:
# Example DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
# Shuffle rows
shuffled_df = df.sample(frac=1, random_state=42).reset_index(drop=True)
print(shuffled_df)
2. Permuting Columns
You can reorder columns using np.random.permutation().
Example:
# Shuffle columns
shuffled_columns = df[np.random.permutation(df.columns)]
print(shuffled_columns)
String Manipulation: Text Data Operations
Pandas provides a variety of tools to manipulate text data
efficiently, accessible through the .str accessor.
1. Cleaning Strings
.strip(): Removes leading/trailing whitespace.
.lower(), .upper(): Changes case.
.replace(): Replaces substrings.
Example:
# Example DataFrame
text_data = pd.DataFrame({'Text': [' Hello ', 'World!', ' pandas ']})
# Clean and standardize text
text_data['Cleaned_Text'] = text_data['Text'].str.strip().str.lower()
print(text_data)
2. Splitting and Joining
.split(): Splits strings into lists.
.join(): Joins lists into strings.
Example:
# Splitting strings
text_data['Split_Text'] = text_data['Cleaned_Text'].str.split()
# Joining strings (equivalently: text_data['Split_Text'].str.join('-'))
text_data['Joined_Text'] = text_data['Split_Text'].apply(lambda x: '-'.join(x))
print(text_data)
3. Finding and Extracting Patterns
.contains(): Checks if a pattern exists.
.extract(): Extracts matching substrings.
Example:
# Finding patterns
text_data['Has_World'] = text_data['Cleaned_Text'].str.contains('world')
# Extracting patterns
text_data['First_Word'] = text_data['Cleaned_Text'].str.extract(r'(\w+)')
print(text_data)
4. Replacing Patterns
.replace(): Replaces substrings using regex.
Example:
# Replace patterns
text_data['Replaced_Text'] = text_data['Cleaned_Text'].str.replace('world', 'Earth', regex=True)
print(text_data)
Summary Table
Operation               Function                          Description
Discretization          pd.cut(), pd.qcut()               Group continuous data into bins.
Random Row Permutation  sample(frac=1)                    Shuffle rows of a DataFrame.
Column Permutation      np.random.permutation(columns)    Shuffle column order.
Cleaning Strings        .str.strip(), .str.lower()        Clean and standardize text data.
Splitting/Joining Text  .str.split(), .str.join()         Split strings or join lists into strings.
Pattern Matching        .str.contains(), .str.extract()   Find or extract substrings matching a pattern.
Replacing Patterns      .str.replace()                    Replace substrings or patterns in text data.
Data Aggregation: Aggregating Data
Data aggregation involves computing summary statistics
or applying functions to grouped data. Pandas provides
robust methods to aggregate data efficiently.
1. Aggregation with groupby
The groupby method allows grouping data by one or more
columns and applying aggregation functions.
Example:
import pandas as pd
# Example DataFrame
data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C'],
    'Values': [10, 20, 30, 40, 50]
})
# Group by 'Category' and calculate sum
grouped = data.groupby('Category').agg({'Values': 'sum'})
print(grouped)
2. Built-in Aggregation Functions
Common aggregation functions include:
sum(), mean(), max(), min(), count(), std()
Example:
# Multiple aggregations
grouped = data.groupby('Category').agg({'Values': ['sum', 'mean']})
print(grouped)
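Named aggregation (available since pandas 0.25) labels the output columns directly instead of producing a column MultiIndex; a brief sketch:
# Each keyword names an output column: (source column, aggregation)
named = data.groupby('Category').agg(total=('Values', 'sum'), average=('Values', 'mean'))
print(named)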
3. Custom Aggregations
You can use custom functions with agg().
Example:
# Custom aggregation: Calculate range (max - min)
custom_agg = data.groupby('Category').agg({'Values': lambda x: x.max() - x.min()})
print(custom_agg)
4. Aggregation with Multiple Columns
You can aggregate different columns with different
functions.
Example:
data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C'],
    'Values1': [10, 20, 30, 40, 50],
    'Values2': [5, 10, 15, 20, 25]
})
# Apply different aggregations
grouped = data.groupby('Category').agg({
    'Values1': 'sum',
    'Values2': 'mean'
})
print(grouped)
Group Iteration: Iterating Over Grouped Data
Iterating over grouped data allows you to process or
analyze each group independently.
1. Iterating Through Groups
Use groupby to create groups and iterate using a loop.
Example:
# Example DataFrame
data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C'],
    'Values': [10, 20, 30, 40, 50]
})
# Iterating through groups
for name, group in data.groupby('Category'):
    print(f"Group: {name}")
    print(group)
2. Applying Functions to Groups
Apply functions directly to groups using .apply().
Example:
# Function to find range within groups
def range_func(group):
    return group['Values'].max() - group['Values'].min()
group_ranges = data.groupby('Category').apply(range_func)
print(group_ranges)
3. Accessing Specific Groups
Retrieve a specific group using get_group().
Example:
# Get specific group
group_b = data.groupby('Category').get_group('B')
print(group_b)
Summary Table
Operation                     Function                         Description
Grouping Data                 groupby()                        Groups data by one or more keys.
Built-in Aggregations         sum(), mean(), max(), count()    Aggregates data using common functions.
Custom Aggregations           agg() with a custom function     Apply custom operations to grouped data.
Iterating Through Groups      for name, group in groupby()     Iterates over each group in a groupby object.
Accessing Specific Groups     get_group()                      Access a specific group by name.
Applying Functions to Groups  apply()                          Applies a function to each group.