NumPy is a powerful library for numerical computing in Python. It provides support for arrays, matrices, and many mathematical functions. Here’s a detailed guide from basics to advanced topics in NumPy, along with examples, best practices, and standard coding structures.
Basics of NumPy
1. Installation
To use NumPy, you need to install it first. You can install it using pip:
pip install numpy
Importing NumPy
Import NumPy as follows:
import numpy as np
2. NumPy Arrays
Creating Arrays
You can create a NumPy array using the np.array function:
import numpy as np
# Creating a 1D array
arr1 = np.array([1, 2, 3, 4, 5])
# Creating a 2D array
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print("1D Array:", arr1)
print("2D Array:\n", arr2)
Output:
1D Array: [1 2 3 4 5]
2D Array:
[[1 2 3]
[4 5 6]]
Array Attributes
NumPy arrays have several attributes that give information about the array.
print("Shape of arr1:", arr1.shape)
print("Shape of arr2:", arr2.shape)
print("Number of dimensions of arr2:", arr2.ndim)
print("Data type of arr2:", arr2.dtype)
print("Size of arr2:", arr2.size)
Output:
Shape of arr1: (5,)
Shape of arr2: (2, 3)
Number of dimensions of arr2: 2
Data type of arr2: int64
Size of arr2: 6
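The data type reported above is whatever NumPy picked by default, and the default integer type can differ by platform and NumPy version. As a small sketch (the variable names small and wide are illustrative, not from the original), passing dtype to np.array fixes the element type, and the itemsize and nbytes attributes show how much memory that choice costs:

```python
import numpy as np

# Same values stored with two explicit element types (names are illustrative)
small = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int8)
wide = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

# itemsize is bytes per element; nbytes is itemsize * size
print("int8 itemsize:", small.itemsize, "total bytes:", small.nbytes)
print("float64 itemsize:", wide.itemsize, "total bytes:", wide.nbytes)
```

Both arrays have the same shape and size; only the per-element storage differs.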
3. Array Initialization
NumPy provides several functions to create arrays:
# Array of zeros
zeros = np.zeros((2, 3))
# Array of ones
ones = np.ones((2, 3))
# Array with random values drawn uniformly from [0, 1)
rand = np.random.rand(2, 3)
# Array with a range of values
range_arr = np.arange(10)
# Array with 10 values spaced evenly on a log scale, from 10^1 to 10^2
logspace_arr = np.logspace(1, 2, 10)
print("Zeros:\n", zeros)
print("Ones:\n", ones)
print("Random values:\n", rand)
print("Range array:\n", range_arr)
print("Logspace array:\n", logspace_arr)
Output:
Zeros:
[[0. 0. 0.]
[0. 0. 0.]]
Ones:
[[1. 1. 1.]
[1. 1. 1.]]
Random values:
[[0.14279797 0.1727033 0.73141483]
[0.58674656 0.38317635 0.20128951]]
Range array:
[0 1 2 3 4 5 6 7 8 9]
Logspace array:
[ 10. 12.91549665 16.68100537 21.5443469 27.82559402
35.93813664 46.41588834 59.94842503 77.42636827 100. ]
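A few other constructors follow the same pattern. The sketch below is one possible selection, not an exhaustive list: np.linspace for evenly spaced values on a linear scale, np.full for a constant fill, np.eye for an identity matrix, and np.random.default_rng for reproducible random numbers. The seed value 42 and the variable names are arbitrary choices made for this example.

```python
import numpy as np

# 5 evenly spaced values on a linear scale, both endpoints included
linspace_arr = np.linspace(0, 1, 5)

# Array filled with a constant value
full_arr = np.full((2, 3), 7)

# 3x3 identity matrix
identity = np.eye(3)

# Seeded generator; the seed (42) is an arbitrary choice for this sketch
rng = np.random.default_rng(42)
rand_arr = rng.random((2, 3))  # uniform values in [0, 1)

print("Linspace array:", linspace_arr)
print("Full array:\n", full_arr)
print("Identity:\n", identity)
print("Seeded random values:\n", rand_arr)
```

Re-running the script with the same seed should reproduce the same random values, which helps when debugging or writing tests.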
4. Basic Operations
Arithmetic Operations
Arithmetic operations can be performed on NumPy arrays element-wise.
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])
# Addition
add = arr1 + arr2
# Subtraction
sub = arr1 - arr2
# Multiplication
mul = arr1 * arr2
# Division
div = arr1 / arr2
print("Addition:", add)
print("Subtraction:", sub)
print("Multiplication:", mul)
print("Division:", div)
Output:
Addition: [11 22 33 44 55]
Subtraction: [ -9 -18 -27 -36 -45]
Multiplication: [ 10 40 90 160 250]
Division: [0.1 0.1 0.1 0.1 0.1]
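Because these operators work element-wise, the * operator on 2D arrays is not matrix multiplication. The following sketch, using two small illustrative matrices a and b that do not appear in the original, contrasts * with the @ operator:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Element-wise: multiplies entries at matching positions
elementwise = a * b

# Matrix product: rows of a combined with columns of b
matrix_product = a @ b

# True division of integer arrays produces floats
quotient = a / b

print("Element-wise:\n", elementwise)
print("Matrix product:\n", matrix_product)
print("Quotient dtype:", quotient.dtype)
```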
Statistical Operations
NumPy provides various functions to perform statistical operations.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Minimum and Maximum
min_val = np.min(arr)
max_val = np.max(arr)
# Mean
mean_val = np.mean(arr)
# Standard Deviation
std_val = np.std(arr)
# Sum
sum_val = np.sum(arr)
print("Min:", min_val)
print("Max:", max_val)
print("Mean:", mean_val)
print("Standard Deviation:", std_val)
print("Sum:", sum_val)
Output:
Min: 1
Max: 10
Mean: 5.5
Standard Deviation: 2.8722813232690143
Sum: 55
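Two parameters are worth knowing for these functions. The axis argument applies the operation along one dimension of a multi-dimensional array, and the ddof argument of np.std switches between the population formula (the default, dividing by N) and the sample formula (dividing by N - 1). A minimal sketch, with illustrative variable names:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]])

col_means = np.mean(data, axis=0)  # one mean per column
row_sums = np.sum(data, axis=1)    # one sum per row

arr = np.arange(1, 11)
pop_std = np.std(arr)              # divides by N (the default)
sample_std = np.std(arr, ddof=1)   # divides by N - 1

print("Column means:", col_means)
print("Row sums:", row_sums)
print("Population std:", pop_std)
print("Sample std:", sample_std)
```

The population value matches the standard deviation printed above; the sample value is slightly larger.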
Advanced NumPy
1. Broadcasting
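Broadcasting allows NumPy to perform element-wise operations on arrays of different but compatible shapes. Shapes are compared from the last axis backwards; two axes are compatible when they are equal or when one of them has size 1, in which case it is stretched to match the other without copying data.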
arr1 = np.array([1, 2, 3])
arr2 = np.array([[1], [2], [3]])
# Broadcasting example
result = arr1 + arr2
print("Broadcasting Result:\n", result)
Output:
Broadcasting Result:
[[2 3 4]
[3 4 5]
[4 5 6]]
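Not every pair of shapes can be combined this way. The sketch below repeats the example with the shapes spelled out, then tries a pair that cannot be broadcast; the names row, col, and mismatch are illustrative.

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
col = np.array([[1], [2], [3]])  # shape (3, 1)

# (3,) is treated as (1, 3); the size-1 axes are stretched, giving (3, 3)
result_shape = (row + col).shape
print("Result shape:", result_shape)

# Shapes (3,) and (2,) disagree on the last axis and neither size is 1
mismatch = np.array([10, 20])
try:
    row + mismatch
    compatible = True
except ValueError as exc:
    compatible = False
    print("Cannot broadcast:", exc)
```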
2. Reshaping Arrays
You can change the shape of an array using the reshape function.
arr = np.arange(1, 13)
# Reshape to 3x4 array
reshaped_arr = arr.reshape((3, 4))
print("Original Array:", arr)
print("Reshaped Array:\n", reshaped_arr)
Output:
Original Array: [ 1 2 3 4 5 6 7 8 9 10 11 12]
Reshaped Array:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
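Two details of reshape are easy to miss. Passing -1 for one dimension asks NumPy to infer it from the total number of elements, and for a contiguous array such as this one the result is a view onto the same data rather than a copy. A short sketch under those assumptions:

```python
import numpy as np

arr = np.arange(1, 13)

# -1 is inferred: 12 elements / 4 columns = 3 rows
inferred = arr.reshape(-1, 4)
print("Inferred shape:", inferred.shape)

# The reshaped array shares memory with arr here,
# so writing through it also changes arr
inferred[0, 0] = 100
print("First element of arr:", arr[0])

# 12 elements cannot be arranged into 5 equal rows
try:
    arr.reshape(5, -1)
    ok = True
except ValueError:
    ok = False
print("Reshape to 5 rows succeeded:", ok)
```

When an independent array is needed, calling .copy() on the reshaped result avoids the shared-memory behavior.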
3. Stacking and Splitting Arrays
np.hstack and np.vstack join arrays side by side or on top of each other; np.hsplit and np.vsplit cut an array into equal parts along the same directions.
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
# Horizontal Stack
hstacked = np.hstack((arr1, arr2))
# Vertical Stack
vstacked = np.vstack((arr1, arr2))
# Split arrays
hsplit_arr = np.hsplit(hstacked, 2)
vsplit_arr = np.vsplit(vstacked, 2)
print("Horizontally Stacked:\n", hstacked)
print("Vertically Stacked:\n", vstacked)
print("Horizontally Split:", hsplit_arr)
print("Vertically Split:", vsplit_arr)
Output:
Horizontally Stacked:
[[1 2 5 6]
[3 4 7 8]]
Vertically Stacked:
[[1 2]
[3 4]
[5 6]
[7 8]]
Horizontally Split: [array([[1, 2],
[3, 4]]), array([[5, 6],
[7, 8]])]
Vertically Split: [array([[1, 2],
[3, 4]]), array([[5, 6],
[7, 8]])]
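The stacking helpers are special cases of np.concatenate with an explicit axis, and np.array_split handles the case where the array does not divide evenly. The following sketch illustrates both; the variable names are chosen for this example only.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# axis=0 joins along rows (like vstack), axis=1 along columns (like hstack)
along_rows = np.concatenate((arr1, arr2), axis=0)
along_cols = np.concatenate((arr1, arr2), axis=1)

# 7 elements into 3 parts: array_split allows uneven pieces
parts = np.array_split(np.arange(7), 3)

print("Along rows:\n", along_rows)
print("Along columns:\n", along_cols)
print("Uneven split:", [p.tolist() for p in parts])
```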
4. Linear Algebra Operations
NumPy provides linear algebra operations through the numpy.linalg module, including matrix inversion and eigenvalue decomposition.
from numpy.linalg import inv, eig
matrix = np.array([[1, 2], [3, 4]])
# Matrix Inverse
inv_matrix = inv(matrix)
# Eigenvalues and Eigenvectors
eigenvalues, eigenvectors = eig(matrix)
print("Original Matrix:\n", matrix)
print("Inverse Matrix:\n", inv_matrix)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)
Output:
Original Matrix:
[[1 2]
[3 4]]
Inverse Matrix:
[[-2. 1. ]
[ 1.5 -0.5]]
Eigenvalues: [-0.37228132 5.37228132]
Eigenvectors:
[[-0.82456484 -0.41597356]
[ 0.56576746 -0.90937671]]
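Results like these are easy to sanity-check. The sketch below verifies the inverse and the eigenpairs numerically, and shows np.linalg.solve as an alternative to forming the inverse when the goal is to solve a linear system. The right-hand side b is an illustrative value, not part of the original example.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])
inv_matrix = np.linalg.inv(matrix)
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# A @ A^-1 should equal the identity, up to floating-point rounding
inverse_ok = np.allclose(matrix @ inv_matrix, np.eye(2))

# Each column v of eigenvectors should satisfy A @ v = lambda * v
eigen_ok = np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues)

# Solve A x = b directly, without computing the inverse
b = np.array([5, 11])  # illustrative right-hand side
x = np.linalg.solve(matrix, b)

print("Inverse check:", inverse_ok)
print("Eigen check:", eigen_ok)
print("Solution x:", x)
```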
Best Practices
- Use Vectorized Operations: Avoid explicit Python loops over array elements; use NumPy’s built-in functions and vectorized operations for better performance.
- Memory Management: Be mindful of array sizes and data types to optimize memory usage.
- Avoid Copying Data: Use views instead of copies whenever possible to save memory and improve performance.
- Use Broadcasting: Leverage broadcasting to perform operations on arrays of different shapes without needing to replicate data.
- Follow PEP 8: Write clean and readable code by following the Python PEP 8 style guide.
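Several of these practices can be seen in a few lines. The sketch below is only an illustration with made-up variable names: it compares a Python loop with the equivalent vectorized expression, checks whether a slice is a view or a copy using np.shares_memory, and shows the memory effect of a smaller dtype.

```python
import numpy as np

values = np.arange(1_000)

# Loop version: one Python-level operation per element
loop_total = 0
for v in values:
    loop_total += v * v

# Vectorized version: the same sum of squares in a single expression
vector_total = np.sum(values * values)
print("Totals match:", int(loop_total) == int(vector_total))

# Basic slicing returns a view; .copy() makes independent data
view = values[:10]
copied = values[:10].copy()
print("View shares memory:", np.shares_memory(values, view))
print("Copy shares memory:", np.shares_memory(values, copied))

# A smaller dtype holds the same values (all below 32768) in less memory
compact = values.astype(np.int16)
print("Bytes:", values.nbytes, "->", compact.nbytes)
```

The loop and the vectorized expression give the same total; the vectorized form is usually much faster on large arrays because the work happens in compiled code.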
Complete Example
Let’s walk through a complete example that utilizes various NumPy features to solve a problem:
Problem: Calculate the pairwise distances between points in a 2D space.
import numpy as np
# Define points in 2D space
points = np.array([[1, 2], [3, 4], [5, 6]])
# Calculate pairwise distances
# Distance between two points: sqrt((x2 - x1)^2 + (y2 - y1)^2)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
squared_diff = diff ** 2
distances = np.sqrt(squared_diff.sum(axis=2))
print("Points:\n", points)
print("Pairwise Distances:\n", distances)
Output:
Points:
[[1 2]
[3 4]
[5 6]]
Pairwise Distances:
[[0. 2.82842712 5.65685425]
[2.82842712 0. 2.82842712]
[5.65685425 2.82842712 0. ]]
Explanation
- Define Points: We define a 2D array points where each row is a point in 2D space.
- Calculate Differences: We calculate the differences between each pair of points using broadcasting.
- Square Differences: We square the differences to prepare for distance calculation.
- Sum and Square Root: We sum the squared differences along the appropriate axis and take the square root to get the distances.
This example demonstrates the power and efficiency of NumPy in handling numerical computations.
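As a cross-check, the square, sum, and square-root steps can be collapsed into one call to np.linalg.norm over the last axis. This is an alternative sketch, not the method used above, and it adds two sanity checks that any distance matrix should pass: symmetry and a zero diagonal.

```python
import numpy as np

points = np.array([[1, 2], [3, 4], [5, 6]])

# Same broadcasting step: shapes (3, 1, 2) and (1, 3, 2) give (3, 3, 2)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]

# Euclidean norm over the coordinate axis replaces square, sum, and sqrt
distances = np.linalg.norm(diff, axis=2)

print("Difference shape:", diff.shape)
print("Symmetric:", np.allclose(distances, distances.T))
print("Zero diagonal:", np.allclose(np.diag(distances), 0))
```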