
Column Summation of Uneven Sized Lists in Python
What is Column Summation
Column summation refers to the process of calculating the sum of values within each column of a dataset or a matrix. In the context of data analysis or numerical computations, column summation is a common operation used to summarize and analyze data along the vertical axis.
For example, consider a dataset represented as a table with rows and columns. Each column corresponds to a variable or a feature, and each row represents an observation or a data point. Column summation involves adding up the values within each column to obtain a single sum for each variable. Let's illustrate this with an example.
Suppose we have the following dataset representing the heights in centimeters of five individuals across three different measurements.
| | Measurement 1 | Measurement 2 | Measurement 3 |
|---|---|---|---|
| 0 | 170 | 175 | 180 |
| 1 | 165 | 168 | 172 |
| 2 | 180 | 182 | 178 |
| 3 | 172 | 169 | 171 |
| 4 | 175 | 176 | 174 |
To calculate the column summation, we have to add up the values within each column.
Column Summation
Measurement 1: 862
Measurement 2: 870
Measurement 3: 875
In this case, the column summation provides the total height for each measurement, giving us an overview of the cumulative values within each variable.
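As a minimal sketch, the sums above can be reproduced in plain Python. The variable names `heights` and `column_sums` are chosen here for illustration only.

```python
# Heights in centimeters: one row per individual, one column per measurement
heights = [
    [170, 175, 180],
    [165, 168, 172],
    [180, 182, 178],
    [172, 169, 171],
    [175, 176, 174],
]

# zip(*heights) regroups the rows into columns, which are then summed
column_sums = [sum(column) for column in zip(*heights)]
print(column_sums)  # [862, 870, 875]
```

This works because every row has the same length. `zip` stops at the shortest input, so it would silently drop values if the rows were uneven.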
When the lists are of uneven size, some columns have fewer values than others, so the missing positions must be handled before or during the summation. Here are three methods to accomplish this in Python.
Using Loops With Padded Lists
In this approach, we can loop through the lists and sum the values for each column, considering that the lists might have different lengths. We need to pad the shorter lists with zeros to make them equal in length.
Example
In this example, we first find the maximum length among all the lists using `max(len(lst) for lst in lists)`. Then, we pad each list with zeros to match the maximum length using a list comprehension. After padding, we can use `zip(*padded_lists)` to transpose the lists, and finally, we calculate the column sum using another list comprehension.
```python
lists = [
    [1, 2, 3],
    [4, 5],
    [6, 7, 8, 9]
]
# Length of the longest list
max_length = max(len(lst) for lst in lists)
# Pad every shorter list with zeros on the right
padded_lists = [lst + [0] * (max_length - len(lst)) for lst in lists]
# Transpose with zip and sum each column
column_sum = [sum(col) for col in zip(*padded_lists)]
print("The column summation of uneven sized lists:", column_sum)
```
Output
The column summation of uneven sized lists: [11, 14, 11, 9]
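If building padded copies of the lists is undesirable, an explicit loop can accumulate the totals directly. The following is a sketch only; the function name `column_sums` is a hypothetical helper introduced here, not part of any library.

```python
def column_sums(lists):
    """Hypothetical helper: sum the columns of uneven-sized lists without padding."""
    totals = []
    for lst in lists:
        for index, value in enumerate(lst):
            if index < len(totals):
                # This column already has a running total
                totals[index] += value
            else:
                # First value seen for this column
                totals.append(value)
    return totals


print(column_sums([[1, 2, 3], [4, 5], [6, 7, 8, 9]]))  # [11, 14, 11, 9]
```

The result matches the padded version because a missing value contributes nothing to its column, which is the same as adding zero.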
Using the itertools.zip_longest() Function
The `zip_longest` function from the `itertools` module zips lists of different lengths and fills the missing positions with a specified default value, which is 0 in this case.
Example
In this example, `zip_longest(*lists, fillvalue=0)` groups the elements by column position, substituting 0 wherever a list has run out. A list comprehension then sums each column, as in the previous approach.
```python
from itertools import zip_longest

lists = [
    [1, 2, 3],
    [4, 5],
    [6, 7, 8, 9, 10]
]
# Zip by column position, filling gaps with 0, then sum each column
column_sum = [sum(col) for col in zip_longest(*lists, fillvalue=0)]
print("The column summation of uneven sized lists:", column_sum)
```
Output
The column summation of uneven sized lists: [11, 14, 11, 9, 10]
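To see what is being summed, the short sketch below prints the tuples that `zip_longest` produces for the same input. The variable name `columns` is used here for illustration.

```python
from itertools import zip_longest

lists = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10]]

# Each tuple holds one column; positions past the end of a list are filled with 0
columns = list(zip_longest(*lists, fillvalue=0))
print(columns)
# [(1, 4, 6), (2, 5, 7), (3, 0, 8), (0, 0, 9), (0, 0, 10)]
```

The `fillvalue=0` argument matters: the default fill value is `None`, and calling `sum` on a tuple that contains `None` raises a `TypeError`.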
Using NumPy
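NumPy arrays must be rectangular, so NumPy cannot sum uneven-sized lists by column directly. Converting the lists with `np.array(lists, dtype=object)` produces a one-dimensional array whose elements are Python lists, and summing it along axis 0 concatenates the lists instead of adding them column by column. The lists therefore need to be copied into a zero-filled two-dimensional array first.
Example
In this example, we use `np.zeros` to create a zero-filled array with one row per list and as many columns as the longest list, and copy each list into the start of its row. Then `arr.sum(axis=0)` adds the values down the rows, which gives one sum per column.
```python
import numpy as np

lists = [
    [1, 2, 3],
    [4, 5],
    [6, 7, 8, 9]
]
max_length = max(len(lst) for lst in lists)
# Zero-filled array: one row per list, one column per position
arr = np.zeros((len(lists), max_length), dtype=int)
for i, lst in enumerate(lists):
    arr[i, :len(lst)] = lst
# Summing along axis 0 collapses the rows and leaves one total per column
column_sum = arr.sum(axis=0)
print("The column summation of uneven sized lists:", column_sum.tolist())
```
Output
The column summation of uneven sized lists: [11, 14, 11, 9]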
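The sketch below first checks how NumPy stores uneven-sized lists given `dtype=object`, and then shows one possible vectorized alternative that avoids a rectangular array altogether. Using `np.bincount` with `weights` for this purpose is a technique swapped in here for illustration, and the names `ragged`, `cols`, and `vals` are assumptions.

```python
import numpy as np

lists = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# An object array of uneven lists is 1-D: three elements, each a Python list
ragged = np.array(lists, dtype=object)
print(ragged.shape)  # (3,)

# Column index of every value, and the values themselves, flattened
cols = np.concatenate([np.arange(len(lst)) for lst in lists])
vals = np.concatenate([np.asarray(lst) for lst in lists])

# bincount adds up the weights that share the same column index
column_sum = np.bincount(cols, weights=vals).astype(int)
print(column_sum.tolist())  # [11, 14, 11, 9]
```

Note that `np.bincount` returns floating-point totals when `weights` is given, so the cast back to `int` is only appropriate for integer data.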