NumPy Cheatsheet



This NumPy cheatsheet is a quick reference to the library's fundamental topics. NumPy is a popular Python library for numerical computing, covering array operations, linear algebra, statistics, and more. Work through the cheatsheet to learn the fundamentals and use NumPy more productively.

1. Introduction to NumPy

NumPy, short for "Numerical Python", is a Python library for working with arrays and performing mathematical operations on them. NumPy provides a wide range of features, some of which are listed below −

  • High-performance numerical operations.
  • Support for complex numbers and matrices.
  • Integration with popular libraries, such as Pandas and Matplotlib.

2. Installing NumPy

To install NumPy on the system, use the following command −

pip install numpy

3. Importing NumPy

To import the NumPy library, use the following line of code −

import numpy as np

4. Creating NumPy Arrays

To create a NumPy array in Python, pass a list (or nested lists) to the array() function, which returns an ndarray object.

# 1D array
np.array([1, 2, 3])   

# 2D array     
np.array([[1, 2], [3, 4]]) 
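Besides array(), NumPy offers helper functions that build common arrays directly. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
ones = np.ones(4)                 # 1D array of four 1.0 values
steps = np.arange(0, 10, 2)       # [0 2 4 6 8], the stop value is excluded
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values, both ends included

print(zeros.shape)
print(ones)
print(steps)
print(grid)
```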

5. Array Data Types

NumPy data types (dtype) specify the kind of data that a NumPy array holds, and all elements of an array share one dtype. NumPy has various data types, such as int, float, bool, etc.

import numpy as np

# Specify data type
arr = np.array([1, 2, 3], dtype=np.float32)  

print(arr.dtype)
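An existing array can also be converted to another data type with astype(), which returns a new array. A short sketch with illustrative variable names:

```python
import numpy as np

floats = np.array([1.7, 2.2, 3.9])          # default dtype is float64
ints = floats.astype(np.int32)              # fractional part is dropped
flags = np.array([0, 1, 2]).astype(bool)    # zero -> False, non-zero -> True

print(floats.dtype)
print(ints)
print(flags)
```

Note that converting floats to integers truncates rather than rounds, so 3.9 becomes 3.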

6. Array Shape and Size

The shape attribute gives the length of an array along each dimension, size gives the total number of elements in the array, and ndim gives the number of dimensions.

# Returns shape of array
arr.shape 

# Returns number of elements   
arr.size

# Returns number of dimensions     
arr.ndim     
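As a concrete illustration, the three attributes can be read from a small 2D array. The array below is an assumed example, not part of the earlier code.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix.shape)  # (2, 3): 2 rows and 3 columns
print(matrix.size)   # 6 elements in total
print(matrix.ndim)   # 2 dimensions
```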

7. Reshaping and Flattening Arrays

In NumPy, reshape() changes the shape of an array without changing its data; the new shape must hold the same number of elements. Flattening converts a multi-dimensional array into a one-dimensional array.

arr = np.arange(6)    # 6 elements, so a 2x3 shape is valid
arr.reshape((2, 3))   # Reshape to 2x3
arr.flatten()         # Flatten to 1D (returns a copy)
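When only one dimension is known, reshape() accepts -1 for the other and infers it from the number of elements. The sketch below also shows that flatten() returns an independent copy; the names are illustrative.

```python
import numpy as np

arr6 = np.arange(6)               # [0 1 2 3 4 5]
two_rows = arr6.reshape(2, -1)    # -1 lets NumPy infer 3 columns

flat_copy = two_rows.flatten()    # a new 1D array with its own data
flat_copy[0] = 99                 # does not change two_rows

print(two_rows)
print(flat_copy)
```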

8. Indexing and Slicing Arrays

In NumPy, indexing and slicing are used to access and manipulate elements within the array.

Indexing

An index allows the user to access a particular element of a NumPy array. Indices start at 0.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])

# access the element at row 0 and column 1
element = arr[0, 1]  
print(element)

Slicing

In NumPy, slicing allows the user to access a range of elements. The syntax of slicing is "array[start:end:step]", where the element at index end is excluded.

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

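# elements at indices 2, 3 and 4 (the end index 5 is excluded)
sliced = arr[2:5]  # named "sliced" to avoid shadowing Python's built-in slice
print(sliced)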
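The step and negative indices of the slicing syntax, as well as slicing along two dimensions, can be sketched as follows. The arrays are assumed examples.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8])

every_other = values[::2]        # step of 2: [1 3 5 7]
last_three = values[-3:]         # negative start counts from the end: [6 7 8]
reversed_values = values[::-1]   # negative step reverses the array

table = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
block = table[:2, 1:]            # rows 0-1 and columns 1-2

print(every_other, last_three, reversed_values)
print(block)
```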

9. Boolean Indexing

Boolean indexing in NumPy allows users to select elements from an array based on a condition. Instead of integer indices, the array is indexed with an array of 'True' and 'False' values, and only the elements at 'True' positions are returned.

import numpy as np
# Create a NumPy array
arr = np.array([10, 20, 30, 40, 50])
# condition
bool_array = arr > 25 
# Use the boolean array to index the original array
result = arr[bool_array] 
print(result)  
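Several conditions can be combined with the element-wise operators & (and), | (or), and ~ (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A minimal sketch with an assumed array:

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

in_range = scores[(scores > 15) & (scores < 45)]   # both conditions hold
extremes = scores[(scores < 15) | (scores > 45)]   # either condition holds
not_large = scores[~(scores > 25)]                 # condition is negated

print(in_range)
print(extremes)
print(not_large)
```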

10. Specific Indexing

Specific indexing (often called fancy indexing) accesses a chosen group of elements in an array. There are two ways to select specific indexes −

i. Integer Array Indexing

import numpy as np

# Create a NumPy array
arr = np.array([11, 21, 31, 41, 51, 61])

# Use an array of indices to access specific elements
indexes = np.array([0, 2, 4])
res = arr[indexes] 

print(res)  

ii. Multi-dimensional Indexing

import numpy as np

# multi-dimensional array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# first and third rows
row_indices = np.array([0, 2])
# second and third columns  
col_indices = np.array([1, 2])  

# Use np.ix_ to create a grid for the specified rows and columns
res = arr[np.ix_(row_indices, col_indices)]
print(res)

11. Basic Array Operations

Arithmetic operators on NumPy arrays work element by element. Here, arr1 and arr2 are arrays of the same (or broadcast-compatible) shape.

arr1 + arr2    # Addition
arr1 - arr2    # Subtraction
arr1 * arr2    # Multiplication
arr1 / arr2    # Division
arr1 ** 2      # Squaring
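A self-contained sketch of these operators, including the case where a scalar or a smaller array is broadcast across a larger one. The arrays a, b, and rows are assumed examples.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = a + b        # element-wise addition
ratio = b / a        # true division always produces floats
scaled = a * 2       # the scalar is applied to every element

rows = np.array([[1, 2, 3], [4, 5, 6]])
shifted = rows + a   # a is added to each row of the 2D array

print(total, ratio, scaled)
print(shifted)
```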

12. Universal Functions

In NumPy, universal functions, or "ufuncs" for short, are functions that operate element by element on an n-dimensional array (ndarray).

import numpy as np  
arr1 = np.array([3, 4, 6, 3])  
arr2 = np.array([1, 5, 4, 1])  
  
# ufunc (add) to add the two arrays element-wise
res = np.add(arr1, arr2)  
print(res) 

13. Aggregation Functions

In NumPy, an aggregate function is a mathematical operation that reduces the elements of an array to a single result, such as a sum or a mean.

np.sum(arr)     # Sum
np.mean(arr)    # Mean
np.max(arr)     # Maximum
np.min(arr)     # Minimum
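For multi-dimensional arrays, the aggregation functions accept an axis argument to reduce along rows or columns instead of over the whole array. A short sketch with an assumed 2D array:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(matrix)                # all elements
col_sums = np.sum(matrix, axis=0)     # collapse the rows: one sum per column
row_means = np.mean(matrix, axis=1)   # collapse the columns: one mean per row

print(total)
print(col_sums)
print(row_means)
```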

14. Generating Random Numbers

NumPy generates pseudo-random numbers drawn from several distributions using the numpy.random module.

import numpy as np
# Random numbers in [0, 1), i.e. 1 is excluded
x = np.random.rand(3, 3)   

# Standard normal distribution
y = np.random.randn(3, 3)  

# Random integers from 1 to 9 (the upper bound 10 is excluded)
z = np.random.randint(1, 10, (2, 2))  

print(x)
print(y)
print(z)
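Newer NumPy versions also provide a Generator object, created with np.random.default_rng(), which the NumPy documentation recommends for new code. The sketch below mirrors the three calls above; the seed value 42 is an arbitrary choice made here for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeded so the results are repeatable

uniform = rng.random((3, 3))                  # floats in [0, 1)
normal = rng.standard_normal((3, 3))          # standard normal distribution
integers = rng.integers(1, 10, size=(2, 2))   # integers from 1 to 9

print(uniform)
print(normal)
print(integers)
```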

15. Splitting Arrays

In NumPy, an array can be split into two or more sub-arrays using the numpy.split function, which requires an equal division, or the numpy.array_split function, which also allows sub-arrays of unequal size.

import numpy as np
arr = np.array([11, 12, 13, 14, 15, 16])
res_arr = np.array_split(arr, 3)
print(res_arr)
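The difference between the two functions shows up when the array length is not divisible by the number of parts. A minimal sketch, using an assumed 7-element array:

```python
import numpy as np

seven = np.arange(7)   # [0 1 2 3 4 5 6]

# array_split tolerates an uneven division
parts = np.array_split(seven, 3)
print([p.tolist() for p in parts])

# split raises ValueError when the division is not equal
try:
    np.split(seven, 3)
    split_failed = False
except ValueError:
    split_failed = True
print("np.split failed:", split_failed)

# split also accepts a list of indices at which to cut
at_indices = np.split(seven, [2, 5])
print([p.tolist() for p in at_indices])
```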

16. Copy vs View in NumPy

In NumPy, understanding the difference between an array copy and an array view is important for effective memory management and data manipulation.

i. NumPy − copy

A copy of an array is a new array that is created with its own data. The changes made to the copy will not affect the original array.

import numpy as np

# Create an original array
arr = np.array([1, 2, 3, 4, 5])

# copy of the original array
copy_array = np.copy(arr)

# Modification on copied array
copy_array[0] = 10

print("Original array:", arr) 
print("The result of copy array:", copy_array)     

ii. NumPy − view

A view of an array is a new array object that looks at the same data as the original array. The changes made to the view will affect the original array.

import numpy as np

# Create an original array
arr = np.array([1, 2, 3, 4, 5])

# view of the original array
view_array = arr[1:4]  

# Modification on view array
view_array[0] = 20

print("Original array:", arr)  
print("The result of view array:", view_array)         
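To check whether an array is a view or a copy, one option is np.shares_memory() together with the base attribute. The following is a sketch with illustrative variable names:

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])
view = original[1:4]      # basic slicing returns a view
copy = original.copy()    # copy() returns independent data

view_shares = np.shares_memory(original, view)
copy_shares = np.shares_memory(original, copy)

print(view_shares)             # the view uses the original's memory
print(copy_shares)             # the copy does not
print(view.base is original)   # a view records the array it looks at
print(copy.base is None)       # a copy owns its data
```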

17. Linear Algebra

NumPy supports linear algebra operations such as matrix multiplication, inversion, determinants, and solving linear equations, mainly through the numpy.linalg module. Here, A and B are two-dimensional arrays (matrices).

np.dot(A, B)      # Dot product (matrix product for 2D arrays)
np.linalg.inv(A)  # Inverse of a matrix
np.linalg.det(A)  # Determinant
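Solving a system of linear equations can be sketched with np.linalg.solve(). The matrix A and vector b below are assumed values for the system 2x + y = 3 and x + 3y = 5.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)    # solves A @ x = b without forming the inverse
A_inv = np.linalg.inv(A)
det = np.linalg.det(A)

print(x)
print(A @ A_inv)   # approximately the identity matrix
print(det)
```

Using solve() is generally preferred over multiplying by the inverse, since it avoids computing the inverse explicitly.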

18. Statistical Functions

In NumPy, statistical functions are built-in functions that compute statistics such as the mean, median, standard deviation, variance, minimum, maximum, and percentiles directly on a NumPy array.

# Standard deviation
np.std(arr)  

# Variance   
np.var(arr)    

# 40th percentile 
np.percentile(arr, 40)  
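A worked sketch on an assumed five-element array. By default, std() and var() compute the population statistics; passing ddof=1 gives the sample versions.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

median = np.median(values)
std_pop = np.std(values)               # population standard deviation
var_pop = np.var(values)               # population variance
var_sample = np.var(values, ddof=1)    # sample variance (divides by n - 1)
p40 = np.percentile(values, 40)        # linear interpolation by default

print(median, std_pop, var_pop, var_sample, p40)
```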

19. Sorting and Searching

In NumPy, sorting arranges the elements of an array in order; np.sort() returns a sorted copy in ascending order. Searching finds the indices of the elements in an array that match a certain condition.

# sort the array
np.sort(arr) 

# return the indices of sorted elements   
np.argsort(arr) 

# indices of elements greater than 5
np.where(arr > 5)  
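The three calls above can be traced on a small assumed array. Since np.sort() has no descending option, one common approach is to reverse the ascending result.

```python
import numpy as np

data = np.array([7, 2, 9, 4, 6])

ascending = np.sort(data)           # sorted copy, data is unchanged
descending = np.sort(data)[::-1]    # reverse the ascending result
order = np.argsort(data)            # indices that would sort data
positions = np.where(data > 5)[0]   # where() returns a tuple, one array per dimension

print(ascending)
print(descending)
print(order)
print(positions)
```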

20. Filtering Arrays

In NumPy, filtering an array selects the elements that meet a specific condition, using a boolean condition as the index.

arr[arr > 5] 

21. Handling Missing Data

Handling missing data in NumPy refers to strategies for managing absent data points in a dataset, which are usually represented as NaN (not a number) in floating-point arrays.

# Check for NaN values
np.isnan(arr)   

# Replace NaN with zero
np.nan_to_num(arr)  
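Beyond replacing NaN with zero, other possible strategies are dropping the missing values or filling them with a statistic that ignores NaN. The sketch below uses an assumed array of readings and np.nanmean(); filling with the mean is one choice among several.

```python
import numpy as np

readings = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

mask = np.isnan(readings)                    # True where a value is missing
clean = readings[~mask]                      # drop the missing values
mean_ignoring_nan = np.nanmean(readings)     # mean of the non-NaN values
filled = np.where(mask, mean_ignoring_nan, readings)   # fill gaps with that mean

print(clean)
print(mean_ignoring_nan)
print(filled)
```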

22. Working with Structured Arrays

In NumPy, a structured array has a data type (dtype) that specifies the names and types of its fields, so each element behaves like a record.

dtype = [('name', 'S10'), ('age', 'i4')]
data = np.array([('Mark', 25), ('Jobin', 30)], dtype = dtype)
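Fields of a structured array are accessed by name, and a field can be used in a condition to filter records. This sketch assumes the Unicode string type 'U10' instead of the byte string type 'S10', so that names compare directly with Python strings.

```python
import numpy as np

# 'U10' (Unicode, up to 10 characters) is an assumption made for this example
people_dtype = [('name', 'U10'), ('age', 'i4')]
people = np.array([('Mark', 25), ('Jobin', 30)], dtype=people_dtype)

names = people['name']               # all values of the name field
older = people[people['age'] > 26]   # records whose age exceeds 26

print(names)
print(older)
```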

23. Memory Layout and Optimization

In NumPy, memory layout refers to the arrangement of data elements in memory. Here are some of the key aspects of the memory optimization process −

  • dtype Choice: Use the smallest data type that gives enough precision, e.g., numpy.float16 instead of numpy.float64 when the reduced precision is acceptable.
  • Contiguous Storage: Keep arrays contiguous in memory so that elements can be accessed efficiently.
  • Viewing Arrays: Use views instead of copies to save memory.

arr.nbytes    # Memory usage in bytes
arr.strides   # Number of bytes to step in each dimension
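The effect of the dtype choice and the meaning of strides can be sketched as follows. The array sizes are arbitrary choices for illustration.

```python
import numpy as np

big = np.ones(1000, dtype=np.float64)
small = big.astype(np.float32)     # half the bytes per element

print(big.nbytes)     # 1000 elements * 8 bytes
print(small.nbytes)   # 1000 elements * 4 bytes

grid = np.zeros((3, 4), dtype=np.int64)
print(grid.strides)     # bytes to move to the next row, then the next column
print(grid.T.strides)   # the transpose is a view with swapped strides
print(grid.flags['C_CONTIGUOUS'], grid.T.flags['C_CONTIGUOUS'])
```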

24. Using NumPy with Pandas

In Python, NumPy and Pandas are two popular libraries that are often used together for data manipulation and analysis.

import numpy as np
import pandas as pd

# creating a numpy array
data = np.array([[11, 12, 13], [14, 15, 16], [17, 18, 19]])

# creating a data frame from the numpy array
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

print("DataFrame from NumPy Array:")
print(df)

25. Saving and Loading Arrays

NumPy saves and loads arrays with two functions, save() and load(). The code below demonstrates how to save an array to a file and load it back.

# Save to .npy file
np.save('data.npy', arr)

# Load from .npy file  
np.load('data.npy')       
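A round trip can be sketched as below. The temporary directory and file names are assumptions made to keep the example self-contained, and np.savez() is shown as one way to store several arrays in a single .npz file.

```python
import os
import tempfile

import numpy as np

arr_a = np.arange(5)
arr_b = np.linspace(0.0, 1.0, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Single array in a .npy file
    single_path = os.path.join(tmp, "data.npy")
    np.save(single_path, arr_a)
    restored = np.load(single_path)

    # Several named arrays in one .npz file
    bundle_path = os.path.join(tmp, "bundle.npz")
    np.savez(bundle_path, a=arr_a, b=arr_b)
    with np.load(bundle_path) as bundle:
        restored_b = bundle["b"]

print(restored)
print(restored_b)
```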