What is the fastest way to drop consecutive duplicates from a List[int] column?
Last Updated : 18 Jan, 2025
Dropping consecutive duplicates involves removing repeated elements that appear next to each other in a list, keeping only the first occurrence of each sequence.
For example, given the list [1, 2, 2, 3, 3, 3, 4], the result should be [1, 2, 3, 4] because the duplicates immediately following the same value are dropped. Let's discuss various methods to do this in Python.
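To make the definition concrete, here is a minimal sketch contrasting consecutive de-duplication with removing every duplicate. The helper name drop_consecutive and the sample list are made up for illustration only.

```python
def drop_consecutive(values):
    """Keep one element from each run of equal neighbours (illustrative helper)."""
    result = []
    for x in values:
        # Keep x only if nothing has been kept yet or it differs from the last kept value.
        if not result or result[-1] != x:
            result.append(x)
    return result

# Hypothetical sample: the value 1 appears in two separate runs.
data = [1, 1, 2, 2, 1, 1]

consecutive = drop_consecutive(data)       # one element per run
global_unique = list(dict.fromkeys(data))  # first occurrence of each value only

print(consecutive)    # [1, 2, 1]
print(global_unique)  # [1, 2]
```

Note that the second run of 1 survives in the first result, because only neighbouring repeats are dropped.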
Using List Comprehension with zip()
This method uses a list comprehension with zip() to compare each element with the one before it and includes the element in the output only if it differs.
Python
# Input list
a = [1, 2, 2, 3, 3, 3, 4]
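# Remove consecutive duplicates: keep the first element, then every
# element that differs from its predecessor (a[:1] is safe for an empty list)
res = a[:1] + [x for prev, x in zip(a, a[1:]) if prev != x]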
print(res)
Explanation:
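- We iterate over the list with zip(), pairing each element with the next one.
- The second element of each pair is retained if it differs from the first; the first element of the list is always kept.
- The list is traversed in a single pass, but the slice a[1:] creates a shallow copy, so this method suits small to medium-sized lists.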
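The pairing idea can also be written without copying the list. The sketch below is one possible variant, not part of the original method: it swaps the slice for itertools.islice and wraps the logic in a hypothetical helper named drop_consecutive_zip.

```python
from itertools import islice

def drop_consecutive_zip(values):
    """Return a new list with consecutive duplicates removed (illustrative helper)."""
    # values[:1] is empty for an empty list, so no IndexError is raised.
    head = values[:1]
    # islice(values, 1, None) iterates from the second element without copying the list.
    tail = [x for prev, x in zip(values, islice(values, 1, None)) if prev != x]
    return head + tail

print(drop_consecutive_zip([1, 2, 2, 3, 3, 3, 4]))  # [1, 2, 3, 4]
print(drop_consecutive_zip([]))                     # []
```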
Let's explore some more methods and see how we can drop consecutive duplicates from a List[int] column.
Using groupby()
The groupby() function from itertools groups consecutive identical elements, and we take only one element from each group.
Python
from itertools import groupby
# Input list
a = [1, 2, 2, 3, 3, 3, 4]
# Remove consecutive duplicates
res = [key for key, _ in groupby(a)]
print(res)
Explanation:
- groupby() function clusters identical elements that are adjacent.
- By taking the key of each group, we eliminate consecutive duplicates.
- This method makes a single pass over the input in O(n) time and works lazily on any iterable, which makes it a good fit for large datasets.
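Since the question is about a List[int] column, the same idea can be applied row by row. The sketch below assumes the column is modelled as a plain Python list of lists; the variable names and sample rows are hypothetical.

```python
from itertools import groupby

# Hypothetical "column": each row holds one List[int] value.
column = [
    [1, 2, 2, 3, 3, 3, 4],
    [5, 5, 5],
    [],
    [7, 8, 7, 7],
]

# Apply the groupby() technique to every row independently.
deduped = [[key for key, _ in groupby(row)] for row in column]

for row in deduped:
    print(row)
```

Empty rows pass through unchanged, and a value that reappears after a different one (such as 7 in the last row) is kept.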
Using reduce()
The reduce() function allows us to iteratively apply a function across the elements of a list. In this method, we use reduce() to construct a result list by comparing each element with the last one appended.
Python
from functools import reduce
# Input list
a = [1, 2, 2, 3, 3, 3, 4]
# Remove consecutive duplicates
res = reduce(lambda acc, x: acc + [x] if not acc or acc[-1] != x else acc, a, [])
print(res)
Explanation:
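- reduce() function accumulates elements in a list, starting from the empty list passed as the initial value.
- If the current element is the same as the last one in the accumulator, it is skipped.
- Because acc + [x] builds a new list each time an element is kept, this method can take quadratic time on long lists with few duplicates, so it is generally slower than the other approaches.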
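If reduce() is still preferred, one way to avoid rebuilding the accumulator is to append to it in place. This is a sketch of that variation, using a hypothetical named function keep_if_new in place of the lambda.

```python
from functools import reduce

def keep_if_new(acc, x):
    """Append x to acc in place unless it repeats the last kept element."""
    if not acc or acc[-1] != x:
        acc.append(x)  # mutate the accumulator instead of copying it
    return acc

a = [1, 2, 2, 3, 3, 3, 4]
res = reduce(keep_if_new, a, [])
print(res)  # [1, 2, 3, 4]
```

Mutating the accumulator is safe here only because a fresh empty list is passed as the initial value on every call.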
Using For Loop
A simple for loop can also achieve this by iterating through the list and appending elements only if they differ from the previous one.
Python
a = [1, 2, 2, 3, 3, 3, 4]
# Initialize result list
res = []
for x in a:
    # Append only when the result is empty or the last kept element differs
    if not res or res[-1] != x:
        res.append(x)
print(res)
Explanation:
- The loop maintains a result list that stores non-duplicate elements.
- If the current element is the same as the last one in the result, it is skipped.
- This method is easy to understand, runs in O(n) time, and needs no imports, so it works well for lists of any size.
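Which method is fastest depends on the data and the Python version, so it is worth measuring. The following is a rough benchmarking sketch using the standard timeit module. The function names, the data size, and the value range are arbitrary choices for illustration, and the reduce() version is left out for brevity.

```python
import random
import timeit
from itertools import groupby

def with_zip(a):
    return a[:1] + [x for prev, x in zip(a, a[1:]) if prev != x]

def with_groupby(a):
    return [key for key, _ in groupby(a)]

def with_loop(a):
    res = []
    for x in a:
        if not res or res[-1] != x:
            res.append(x)
    return res

# Hypothetical test data: 100,000 integers from a small range so that runs are common.
random.seed(0)
data = [random.randint(0, 3) for _ in range(100_000)]

candidates = {"zip": with_zip, "groupby": with_groupby, "for loop": with_loop}

# All methods must agree before their timings are worth comparing.
expected = with_loop(data)
assert all(func(data) == expected for func in candidates.values())

for name, func in candidates.items():
    seconds = timeit.timeit(lambda: func(data), number=10)
    print(f"{name:>8}: {seconds:.4f} s for 10 runs")
```

The printed timings will vary between machines, so treat them as a relative comparison rather than fixed numbers.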