NumPy - Unique Elements



What are Unique Elements?

Unique elements refer to the distinct values in a set or collection, where each value appears only once.

In other words, if a value appears multiple times, it is counted only once as a unique element. For example, in the list [1, 2, 2, 3, 4], the unique elements are 1, 2, 3, and 4.

The NumPy unique() Function

The unique() function in NumPy returns the sorted unique elements of an array. It removes any duplicate values, keeping only distinct ones.

You can also get additional information, such as the indices of the unique values or their counts. Following is the basic syntax of the unique() function in NumPy −

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
  • ar: The input array from which unique elements are to be found.
  • return_index: If True, also return the indices of the first occurrences of the unique values in the original array.
  • return_inverse: If True, also return the indices to reconstruct the original array from the unique array.
  • return_counts: If True, also return the number of times each unique value comes up in the original array.
  • axis: If specified, the axis along which to operate; the function then returns the unique subarrays (for example, whole rows or whole columns) along that axis. If None (default), the unique values are found in the flattened array.
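
When several of the optional flags are enabled at once, the function returns a tuple whose items follow the order of the parameters in the signature. The following is a minimal sketch of that behavior; the array values and variable names are chosen only for illustration −

```python
import numpy as np

# Illustrative input (not from the examples in this tutorial)
values = np.array([3, 1, 3, 2, 1])

# With all three flags set, the result is a 4-tuple in a fixed order:
# unique values, first-occurrence indices, inverse indices, counts
unique_vals, first_idx, inverse, counts = np.unique(
    values, return_index=True, return_inverse=True, return_counts=True
)

print(unique_vals)  # [1 2 3]
print(first_idx)    # [1 3 0]
print(inverse)      # [2 0 2 1 0]
print(counts)       # [2 1 2]
```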

Finding Unique Elements in a 1D Array

To find the unique elements in a one-dimensional array, you can simply pass the array to the numpy.unique() function.

Example

In this example, the numpy.unique() function finds and returns the unique elements in the array data −

import numpy as np

# Create a one-dimensional array
data = np.array([1, 2, 2, 3, 4, 4, 4, 5])

# Find unique elements
unique_elements = np.unique(data)

print("Unique elements:", unique_elements)

Following is the output obtained −

Unique elements: [1 2 3 4 5]

Returning Additional Information

The numpy.unique() function can also return additional information about the unique elements, such as the indices of their first occurrences, the indices to reconstruct the original array, and the counts of each unique value.

This is controlled by the return_index, return_inverse, and return_counts parameters, respectively.

Returning Indices of First Occurrences

To get the indices of the first occurrences of the unique values in the original array, set the return_index parameter to True −

import numpy as np

# Create an array
data = np.array([1, 2, 2, 3, 4, 4, 4, 5])

# Find unique elements and their indices
unique_elements, indices = np.unique(data, return_index=True)

print("Unique elements:", unique_elements)
print("Indices of first occurrences:", indices)

This will produce the following result −

Unique elements: [1 2 3 4 5]
Indices of first occurrences: [0 1 3 4 7]
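
One common use of these indices is to list the unique values in their order of first appearance instead of in sorted order. The sketch below shows one way to do that, using an illustrative array; it is one possible approach, not a dedicated NumPy feature −

```python
import numpy as np

# Illustrative input whose values are not already sorted
data = np.array([4, 1, 4, 3, 1, 2])

# Only the first-occurrence indices are needed here
_, first_idx = np.unique(data, return_index=True)

# Sorting the indices restores the order in which values first appear
in_order = data[np.sort(first_idx)]

print("Sorted unique values:", np.unique(data))   # [1 2 3 4]
print("Order of first appearance:", in_order)     # [4 1 3 2]
```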

Returning Indices to Reconstruct the Original Array

To get the indices that can be used to reconstruct the original array from the unique array, set the return_inverse parameter to True −

import numpy as np

# Create an array
data = np.array([1, 2, 2, 3, 4, 4, 4, 5])

# Find unique elements and inverse indices
unique_elements, inverse_indices = np.unique(data, return_inverse=True)

print("Unique elements:", unique_elements)
print("Inverse indices:", inverse_indices)

Following is the output of the above code −

Unique elements: [1 2 3 4 5]
Inverse indices: [0 1 1 2 3 3 3 4]
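
To make the meaning of the inverse indices concrete, the following short sketch indexes the unique array with them and checks that the result matches the original one-dimensional input −

```python
import numpy as np

data = np.array([1, 2, 2, 3, 4, 4, 4, 5])
unique_elements, inverse_indices = np.unique(data, return_inverse=True)

# Each inverse index selects the unique value that stood at that position
reconstructed = unique_elements[inverse_indices]

print("Reconstructed:", reconstructed)                         # [1 2 2 3 4 4 4 5]
print("Matches original:", np.array_equal(reconstructed, data))  # True
```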

Returning Counts of Unique Elements

To get the counts of each unique value in the original array, set the return_counts parameter to True −

import numpy as np

# Create an array
data = np.array([1, 2, 2, 3, 4, 4, 4, 5])

# Find unique elements and their counts
unique_elements, counts = np.unique(data, return_counts=True)

print("Unique elements:", unique_elements)
print("Counts of unique elements:", counts)

The output obtained is as shown below −

Unique elements: [1 2 3 4 5]
Counts of unique elements: [1 2 1 3 1]
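
The two returned arrays line up position by position, so they can be combined in simple ways. As a rough sketch, the code below builds a frequency table and picks the most frequent value; the names frequency and most_common are illustrative only −

```python
import numpy as np

data = np.array([1, 2, 2, 3, 4, 4, 4, 5])
unique_elements, counts = np.unique(data, return_counts=True)

# Pair each unique value with its count as plain Python integers
frequency = {int(v): int(c) for v, c in zip(unique_elements, counts)}

# The position of the largest count gives the most frequent value
most_common = unique_elements[np.argmax(counts)]

print("Frequency table:", frequency)   # {1: 1, 2: 2, 3: 1, 4: 3, 5: 1}
print("Most common value:", most_common)  # 4
```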

Unique Elements in a Multi-Dimensional Array

The numpy.unique() function can also be used to find unique elements in multi-dimensional arrays. By default, the function flattens the array and then finds the unique elements.

However, you can specify the axis along which to find the unique values using the axis parameter.

Default Behavior (Flattened Array)

Let us see an example of finding unique elements in a 2D array without specifying an axis −

import numpy as np

# Create a 2D array
data_2d = np.array([[1, 2, 2], [3, 4, 4], [4, 5, 5]])

# Find unique elements
unique_elements = np.unique(data_2d)

print("Unique elements:", unique_elements)

Here, the function flattens the 2D array and then finds the unique elements as shown in the output below −

Unique elements: [1 2 3 4 5]
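
As a quick check of the flattening behavior described above, the following sketch compares the default result with the result of flattening the array by hand using ravel() −

```python
import numpy as np

data_2d = np.array([[1, 2, 2], [3, 4, 4], [4, 5, 5]])

default_result = np.unique(data_2d)
flattened_result = np.unique(data_2d.ravel())

# Both calls give the same one-dimensional array
print(np.array_equal(default_result, flattened_result))  # True
print(default_result.ndim)                               # 1
```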

Finding Unique Elements along a Specific Axis

You can also find unique slices along a specific axis. With axis=0, the function compares entire rows and returns the unique rows; with axis=1, it compares entire columns and returns the unique columns −

import numpy as np

# Create a 2D array
data_2d = np.array([[1, 2, 2], [3, 4, 4], [4, 5, 5]])

# Find unique rows (compare slices along axis 0)
unique_elements_axis_0 = np.unique(data_2d, axis=0)

# Find unique columns (compare slices along axis 1)
unique_elements_axis_1 = np.unique(data_2d, axis=1)

print("Unique elements along axis 0:\n", unique_elements_axis_0)
print("Unique elements along axis 1:\n", unique_elements_axis_1)

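All three rows are distinct, so the axis 0 result matches the input, while the second and third columns are identical, so the axis 1 result keeps only one of them −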

Unique elements along axis 0:
 [[1 2 2]
 [3 4 4]
 [4 5 5]]
Unique elements along axis 1:
 [[1 2]
 [3 4]
 [4 5]]
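
The array above has no repeated rows, so the effect of axis=0 is easier to see on data that does. The following sketch uses an illustrative array with duplicate rows and also requests the count of each unique row −

```python
import numpy as np

# Illustrative array in which two rows appear twice
rows = np.array([[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]])

# Compare whole rows and count how often each one occurs
unique_rows, row_counts = np.unique(rows, axis=0, return_counts=True)

print(unique_rows)
# [[1 2]
#  [3 4]
#  [5 6]]
print(row_counts)  # [2 2 1]
```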

Unique Elements in Structured Arrays

NumPy also supports structured arrays, where each element can be a combination of multiple fields. The unique() function treats each record as a single value, so two elements count as duplicates only when all of their fields match.

Example

In this example, the function finds unique elements in the structured array by considering all fields −

import numpy as np

# Create a structured array
data_structured = np.array([(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c')],
                           dtype=[('num', 'i4'), ('char', 'U1')])

# Find unique elements considering all fields
unique_elements_structured = np.unique(data_structured)

print("Unique elements in structured array:", unique_elements_structured)

We get the output as shown below −

Unique elements in structured array: [(1, 'a') (2, 'b') (3, 'c')]
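
The unique() function has no parameter for selecting fields, so uniqueness based on a single field has to be handled by indexing that field first. The sketch below shows one possible way to do this; the records array and the variable names are illustrative assumptions −

```python
import numpy as np

# Illustrative records: the value 2 appears with two different characters
records = np.array([(1, 'a'), (2, 'b'), (2, 'c'), (3, 'c')],
                   dtype=[('num', 'i4'), ('char', 'U1')])

# Whole-record comparison: all four records differ in at least one field
unique_records = np.unique(records)

# Single-field comparison: select the field, then call unique()
unique_nums = np.unique(records['num'])

# Keep the first record seen for each distinct 'num' value
_, first_idx = np.unique(records['num'], return_index=True)
first_per_num = records[first_idx]

print(len(unique_records))  # 4
print(unique_nums)          # [1 2 3]
print(first_per_num)        # [(1, 'a') (2, 'b') (3, 'c')]
```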