NumPy - Sorting Arrays



Sorting Arrays in NumPy

In NumPy, sorting refers to arranging the elements of an array in a specific order. NumPy's sorting functions arrange elements in ascending order; descending order is obtained by reversing the sorted result.

NumPy provides several functions to perform sorting operations, which can be applied to both one-dimensional and multi-dimensional arrays. They are as follows −

  • The sort() Function
  • The partition() Function
  • The argsort() Function
  • The lexsort() Function

Using np.sort() Function

The np.sort() function returns a sorted copy of an array and leaves the original array unchanged. To sort the original array itself, use the sort() method of the "ndarray" object instead.

Sorting can be done along a specified axis, or if no axis is specified, the function defaults to sorting along the last axis. Following is the syntax −

numpy.sort(a, axis=-1, kind=None, order=None)

Where,

  • a: It is the array to be sorted.
  • axis: It is the axis along which to sort. Default is -1, which means sorting along the last axis.
  • kind: It is the sorting algorithm to use. Options include 'quicksort', 'mergesort', 'heapsort', and 'stable'. The default is 'quicksort'.
  • order: It is used when sorting a structured array to define which fields to compare.

Example

In the following example, we are using the np.sort() function to sort the given array in ascending order −

import numpy as np

arr = np.array([3, 1, 2, 5, 4])
sorted_arr = np.sort(arr)

print("Original Array:", arr)
print("Sorted Array:", sorted_arr)

Following is the output obtained −

Original Array: [3 1 2 5 4]
Sorted Array: [1 2 3 4 5]
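
The np.sort() function has no option for descending order, so a descending result is usually built from the ascending one. The following is a minimal sketch of two common approaches; the variable names are only illustrative −

```python
import numpy as np

arr = np.array([3, 1, 2, 5, 4])

# np.sort() sorts ascending; reversing the result gives descending order.
desc_reversed = np.sort(arr)[::-1]

# Alternative for numeric data: negate, sort, then negate back.
desc_negated = -np.sort(-arr)

print("Descending (reversed):", desc_reversed)
print("Descending (negated):", desc_negated)
```

Both lines print [5 4 3 2 1]. The reversing approach works for any sortable dtype, while the negation approach applies only to numeric data.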

In-Place Sorting in NumPy

In-place sorting performs the sorting operation directly on the original array, rearranging its elements in ascending order without creating a separate sorted copy.

In NumPy, we can perform in-place sorting using the sort() method of the ndarray object. Following is the syntax −

ndarray.sort(axis=-1, kind=None, order=None)

Example

In this example, we are using the arr.sort() method to sort the given array in place, modifying the original array −

import numpy as np

arr = np.array([3, 1, 2, 5, 4])
arr.sort()
print("In-Place Sorted Array:", arr)

This will produce the following result −

In-Place Sorted Array: [1 2 3 4 5]
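
Because the sort() method modifies the array in place, it returns None rather than the sorted array. Assigning its result to a variable is a common mistake. The sketch below illustrates the difference, using illustrative variable names −

```python
import numpy as np

arr = np.array([3, 1, 2, 5, 4])

# arr.sort() sorts arr in place and returns None.
result = arr.sort()
print("Return value of arr.sort():", result)
print("Array after arr.sort():", arr)

# To keep the original order, sort a copy with np.sort() instead.
original = np.array([3, 1, 2, 5, 4])
sorted_copy = np.sort(original)
print("Original is unchanged:", original)
print("Sorted copy:", sorted_copy)
```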

Sorting Along Specific Axes

NumPy allows sorting elements along a specific axis of a multi-dimensional array. This lets you organize the data in a way that respects the structure of the array, whether that involves sorting rows, columns, or higher-dimensional slices.

We can sort elements along a specific axis in NumPy using the axis parameter of the np.sort() function −

  • Axis 0: It runs vertically, down the rows of a 2D array. Sorting along axis 0 sorts each column independently.
  • Axis 1: It runs horizontally, across the columns of a 2D array. Sorting along axis 1 sorts each row independently.
  • Higher Dimensions: In arrays with more than two dimensions, axes 2, 3, etc., work the same way: sorting along an axis independently orders each one-dimensional slice taken along that axis.

Example

In the example below, we are sorting a 2D NumPy array along two different axes: axis 0 (columns) and axis 1 (rows) −

import numpy as np

arr = np.array([[3, 2, 1], [6, 5, 4]])
sorted_arr_axis0 = np.sort(arr, axis=0)
sorted_arr_axis1 = np.sort(arr, axis=1)

print("Original Array:\n", arr)
print("Sorted Along Axis 0:\n", sorted_arr_axis0)
print("Sorted Along Axis 1:\n", sorted_arr_axis1)

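Following is the output of the above code. Each column of this array is already in ascending order, so sorting along axis 0 leaves it unchanged −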

Original Array:
[[3 2 1]
 [6 5 4]]
Sorted Along Axis 0:
[[3 2 1]
 [6 5 4]]
Sorted Along Axis 1:
[[1 2 3]
 [4 5 6]]
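
To make the effect of each axis easier to see, here is a small sketch with an array whose rows and columns are both unsorted. It also passes axis=None, which flattens the array before sorting. The array values are chosen only for illustration −

```python
import numpy as np

arr = np.array([[9, 2, 7],
                [1, 8, 3]])

by_columns = np.sort(arr, axis=0)     # each column sorted top to bottom
by_rows = np.sort(arr, axis=1)        # each row sorted left to right
flattened = np.sort(arr, axis=None)   # array flattened, then sorted

print("Sorted Along Axis 0:")
print(by_columns)    # [[1 2 3]
                     #  [9 8 7]]
print("Sorted Along Axis 1:")
print(by_rows)       # [[2 7 9]
                     #  [1 3 8]]
print("Sorted With axis=None:", flattened)   # [1 2 3 7 8 9]
```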

Partial Sorting Using partition() Function

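The np.partition() function in NumPy returns a rearranged copy of an array in which the element at a specified index (called the "kth" index) is in the position it would occupy in a fully sorted array. All smaller elements are moved before it, and all elements equal to or greater than it are moved after it. The order of the elements within each of these two groups is undefined.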

This function is useful when you need to find the k-th smallest or largest element in an array without fully sorting it. Following is the syntax −

numpy.partition(a, kth, axis=-1, kind='introselect', order=None)

Where,

  • a: It is the array you want to partition.
  • kth: It is the index of the element around which we need to partition the array. It can be an integer or a sequence of integers.
  • axis: It is the axis along which to partition the array. By default, it is set to -1, meaning the last axis.
  • kind: It is the selection algorithm to use. The default is 'introselect', which is a hybrid of quickselect and median of medians.
  • order: It is used with structured arrays to specify which fields to compare.

Example

In this example, the array is partitioned such that the element at index 2 is positioned in a way that all elements before it are smaller or equal, and all elements after it are larger or equal −

import numpy as np

arr = np.array([3, 1, 2, 5, 4])
partitioned_arr = np.partition(arr, 2)

print("Partitioned Array:", partitioned_arr)

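The output obtained is as shown below. Only the element at index 2 is guaranteed to be in its sorted position; the order of the elements on either side of it is not guaranteed −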

Partitioned Array: [1 2 3 5 4]
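
A common use of partitioning is to pick out the k smallest or k largest values without sorting the whole array. The following is a minimal sketch of that idea; the array and the value of k are illustrative assumptions −

```python
import numpy as np

arr = np.array([7, 2, 9, 4, 1, 8, 3])
k = 3

# After partitioning at index k-1, the first k positions hold the k smallest values.
smallest_k = np.partition(arr, k - 1)[:k]

# Partitioning at index -k places the k largest values in the last k positions.
largest_k = np.partition(arr, -k)[-k:]

# The order inside each group is not guaranteed, so sort the small slices if needed.
print("3 smallest:", np.sort(smallest_k))   # [1 2 3]
print("3 largest:", np.sort(largest_k))     # [7 8 9]
```

Only the short slices are fully sorted here, which is typically cheaper than sorting the entire array when k is small.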

Indirect Sorting Using argsort() Function

The np.argsort() function in NumPy is used to obtain the indices that would sort an array. Instead of returning the sorted array itself, np.argsort() function returns an array of indices that represents the order in which elements should be arranged to achieve a sorted array.

This function is useful when you need to sort one array based on the sorted order of another. Following is the syntax −

numpy.argsort(a, axis=-1, kind=None, order=None)

Where,

  • a: It is the array you want to sort.
  • axis: It is the axis along which to sort. By default, it is set to -1, meaning the last axis.
  • kind: It is the sorting algorithm to use. Options include 'quicksort', 'mergesort', 'heapsort', and 'stable'. The default is 'quicksort'.
  • order: It is used with structured arrays to specify which fields to sort by.

Example

In the following example, we are using np.argsort() function to obtain the indices that would sort the array "arr". We then use these indices to rearrange the original array into its sorted order −

import numpy as np

arr = np.array([3, 1, 2, 5, 4])
sorted_indices = np.argsort(arr)

print("Indices that would sort the array:", sorted_indices)
print("Sorted Array Using Indices:", arr[sorted_indices])

After executing the above code, we get the following output −

Indices that would sort the array: [1 2 0 4 3]
Sorted Array Using Indices: [1 2 3 4 5]
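
The indices returned by np.argsort() can also reorder a second array, which is how one array is sorted based on another. The sketch below uses two hypothetical arrays, "names" and "scores", and passes kind='stable' so that equal scores keep their original relative order −

```python
import numpy as np

names = np.array(['John', 'Alice', 'Bob', 'Dana'])
scores = np.array([82, 95, 82, 70])

# Indices that sort the scores; 'stable' keeps tied scores in input order.
order = np.argsort(scores, kind='stable')

print("Sort order:", order)                 # [3 0 2 1]
print("Names by score:", names[order])      # ['Dana' 'John' 'Bob' 'Alice']
print("Scores sorted:", scores[order])      # [70 82 82 95]
```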

Sorting Structured Arrays

Structured arrays in NumPy allow you to create arrays where each element has multiple fields, each with its own data type. This is similar to a database table or a record in a traditional programming language, where each entry can hold multiple types of data.

You can sort structured arrays based on one or more fields. This is useful when you want to order the records according to specific criteria. To achieve this, you can use the np.sort() function in NumPy that accepts an order parameter to specify which field(s) to sort by.

Example

In the example below, we are sorting the structured array "arr" by the 'age' field −

import numpy as np

arr = np.array([('John', 25), ('Alice', 30), ('Bob', 22)],
               dtype=[('name', 'U10'), ('age', 'i4')])
sorted_arr = np.sort(arr, order='age')

print("Sorted Structured Array:\n", sorted_arr)

The result produced is as follows −

Sorted Structured Array:
[('Bob', 22) ('John', 25) ('Alice', 30)]
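
The order parameter also accepts a list of field names, in which case later fields break ties in earlier ones. The following sketch sorts by 'age' and then by 'name'; the records are made up for illustration −

```python
import numpy as np

people = np.array([('John', 25), ('Alice', 30), ('Bob', 25), ('Cara', 22)],
                  dtype=[('name', 'U10'), ('age', 'i4')])

# Sort by age first; records with the same age are then ordered by name.
by_age_then_name = np.sort(people, order=['age', 'name'])

print("Sorted by age, then name:")
print(by_age_then_name)
# [('Cara', 22) ('Bob', 25) ('John', 25) ('Alice', 30)]
```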

Lexicographical Sorting Using lexsort() Function

The np.lexsort() function performs an indirect sort by using a sequence of keys. It takes a sequence of fields or columns and returns an array of indices that would sort the input arrays based on these keys. Following is the syntax −

numpy.lexsort(keys, axis=-1)

Where,

  • keys: It is a sequence of arrays of the same shape (or a 2D array whose rows are the keys), where each array is a key to sort by. The last key in the sequence is the primary key, the second-to-last key is the secondary key, and so on.
  • axis: It is the axis along which to sort. By default, it is set to -1, meaning the last axis.

Example

In this example, the np.lexsort() function sorts first by name, and then by age where names are the same. Because "names" is the last key passed, it acts as the primary key −

import numpy as np

names = np.array(['John', 'Alice', 'Bob'])
ages = np.array([25, 30, 22])
sorted_indices = np.lexsort((ages, names))

print("Indices for Lexicographical Sort:", sorted_indices)
print("Sorted Names and Ages:", names[sorted_indices], ages[sorted_indices])

We get the output as shown below −

Indices for Lexicographical Sort: [1 2 0]
Sorted Names and Ages: ['Alice' 'Bob' 'John'] [30 22 25]
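
The role of the secondary key only becomes visible when the primary key contains ties. The sketch below uses illustrative data with repeated names and shows how swapping the key order changes the result −

```python
import numpy as np

names = np.array(['Bob', 'Alice', 'Bob', 'Alice'])
ages = np.array([30, 25, 22, 41])

# Last key is primary: sort by name, break ties by age.
by_name_then_age = np.lexsort((ages, names))
print("By name, then age:", by_name_then_age)    # [1 3 2 0]
print(names[by_name_then_age], ages[by_name_then_age])

# Swapping the keys makes age the primary key instead.
by_age_then_name = np.lexsort((names, ages))
print("By age, then name:", by_age_then_name)    # [2 1 0 3]
print(names[by_age_then_name], ages[by_age_then_name])
```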