NumPy - Data Science

2. The Shape and Reshaping of a NumPy Array

Once you have created your ndarray, the next thing you would want to do is check the number of axes, shape, and the size of the ndarray.

a. Dimensions of NumPy array

You can determine the number of dimensions, or axes, of a NumPy array using the ndim attribute:

Array : 
 [[ 5 10 15]
 [20 25 20]]
Dimensions : 
 2

This array has two dimensions (axes): the first holds the 2 rows and the second holds the 3 columns.
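
The code that produced the output above is not shown, so here is a minimal sketch of what it might look like. The array values are copied from the printed output, and the variable name a is an assumption.

```python
import numpy as np

# Values copied from the printed output above; the name `a` is assumed.
a = np.array([[5, 10, 15],
              [20, 25, 20]])

print('Array :', '\n', a)
# ndim is an attribute (no parentheses) holding the number of axes.
print('Dimensions :', '\n', a.ndim)
```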

b. Shape of NumPy array

The shape attribute of a NumPy array is a tuple giving the number of elements along each dimension. You can index this tuple to get the length of an individual dimension:

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
print('Array :','\n',a)
print('Shape :','\n',a.shape)
print('Rows = ',a.shape[0])
print('Columns = ',a.shape[1])
Array : 
 [[1 2 3]
 [4 5 6]]
Shape : 
 (2, 3)
Rows =  2
Columns =  3

c. Size of NumPy array

You can determine how many values the array holds using the size attribute. It is the product of the lengths of all dimensions, so for a 2-D ndarray it equals the number of rows multiplied by the number of columns:

Size of array : 6
Manual determination of size of array : 6
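
The output above could come from code along these lines. This is only a sketch: the array a is an assumed 2 x 3 example, and b is a hypothetical 3-D array added to show that size is the product of every entry in shape, not just rows times columns.

```python
import numpy as np

# Assumed 2 x 3 example array (any array with 6 elements gives the same size).
a = np.array([[5, 10, 15],
              [20, 25, 20]])

print('Size of array :', a.size)
print('Manual determination of size of array :', a.shape[0] * a.shape[1])

# Hypothetical 3-D array: size is the product of all lengths in shape.
b = np.zeros((2, 3, 4))
print('3-D size matches product of shape :', b.size == np.prod(b.shape))
```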

d. Reshaping a NumPy array

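Reshaping an ndarray can be done using the np.reshape() function (or the equivalent reshape() method of the array). It changes the shape of the ndarray without changing the data within it, so the new shape must hold the same number of elements: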

array([[ 3,  6],
       [ 9, 12]])

Here, I reshaped the ndarray from a 1-D to a 2-D ndarray.
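
The call that produced this result is not shown. A minimal sketch, assuming the 1-D input held the four values seen in the output, might be:

```python
import numpy as np

# Assumed 1-D input, inferred from the values in the output above.
a = np.array([3, 6, 9, 12])

# Function form; a.reshape(2, 2) is the equivalent method form.
b = np.reshape(a, (2, 2))
print(repr(b))

# The element count must match: 4 values cannot fill a 3 x 2 shape.
try:
    np.reshape(a, (3, 2))
except ValueError as err:
    print('Reshape failed :', err)
```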

While reshaping, if you are unsure about the length of one of the axes, just pass -1 for it. NumPy calculates that length from the array size and the other lengths. Only one axis can be given as -1:

Three rows : 
 [[ 3  6]
 [ 9 12]
 [18 24]]
Three columns : 
 [[ 3  6  9]
 [12 18 24]] 
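
As a rough sketch of how the output above might be produced (the six input values are inferred from that output, and the variable names are assumptions):

```python
import numpy as np

# Assumed 1-D input, inferred from the values in the output above.
a = np.array([3, 6, 9, 12, 18, 24])

b = a.reshape(3, -1)   # 3 rows; NumPy infers 2 columns
c = a.reshape(-1, 3)   # 3 columns; NumPy infers 2 rows

print('Three rows :', '\n', b)
print('Three columns :', '\n', c)

# Passing -1 for more than one axis is ambiguous and raises ValueError.
try:
    a.reshape(-1, -1)
except ValueError as err:
    print('Reshape failed :', err)
```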

e. Flattening a NumPy array

When you have a multidimensional array and want to collapse it into a one-dimensional array, you can use either the flatten() method or the ravel() method:

a = np.ones((2,2))
b = a.flatten()
c = a.ravel()
print('Original shape :', a.shape)
print('Array :','\n', a)
print('Shape after flatten :',b.shape)
print('Array :','\n', b)
print('Shape after ravel :',c.shape)
print('Array :','\n', c)
Original shape : (2, 2)
Array : 
 [[1. 1.]
 [1. 1.]]
Shape after flatten : (4,)
Array : 
 [1. 1. 1. 1.]
Shape after ravel : (4,)
Array : 
 [1. 1. 1. 1.]

An important difference between flatten() and ravel() is that the former always returns a copy of the original array, while the latter returns a view of the original array whenever possible (and a copy only when the memory layout makes a view impossible). This means that changes made to the array returned from ravel() will usually also be reflected in the original array, which is never the case with flatten().

b[0] = 0
print(a)
[[1. 1.]
 [1. 1.]]

The change made to b, the array returned by flatten(), was not reflected in the original array.

But when the same change is made to c, the array returned by ravel(), the changed value is also reflected in the original ndarray.
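
The ravel() case is not shown in code, so here is a minimal sketch of it. It assumes the same 2 x 2 array of ones used in the flatten() example and the same variable names.

```python
import numpy as np

a = np.ones((2, 2))
c = a.ravel()   # a view here, because a is stored contiguously

c[0] = 0        # write through the view
print(a)        # the first element of a has changed as well
```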

What is happening here is that flatten() creates a deep copy of the ndarray while ravel() creates a shallow copy, which NumPy calls a view.

A deep copy means that a completely new data buffer is allocated in memory and the ndarray returned by flatten() points to this new location. Therefore, any changes made to it will not be reflected in the original ndarray.

A shallow copy (view), on the other hand, is a new ndarray object that refers to the original memory location. The object returned by ravel() points to the same data as the original ndarray, so any changes made through it are also reflected in the original ndarray. When the data is not laid out contiguously, ravel() cannot return a view and falls back to making a copy.
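
One way to check whether two arrays share the same data is np.shares_memory(). The sketch below uses a hypothetical 2 x 3 array and also covers a case where ravel() has to copy: a transposed array is not C-contiguous, so its flattened form cannot be expressed as a view.

```python
import numpy as np

# Hypothetical example array.
a = np.arange(6).reshape(2, 3)

b = a.flatten()    # always a copy
c = a.ravel()      # a view, since a is C-contiguous
d = a.T.ravel()    # a.T is not C-contiguous, so ravel() must copy

print('flatten shares memory :', np.shares_memory(a, b))
print('ravel shares memory :', np.shares_memory(a, c))
print('ravel of transpose shares memory :', np.shares_memory(a, d))
```

Because ravel() can silently return a copy, it is safer not to rely on it for modifying the original array in place.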

f. Transpose of a NumPy array

Another useful reshaping method in NumPy is transpose(). For a 2-D array it swaps rows and columns, so the element at row i, column j moves to row j, column i:

a = np.array([[1,2,3],
              [4,5,6]])
b = np.transpose(a)
print('Original','\n','Shape',a.shape,'\n',a)
print('Transposed:','\n','Shape',b.shape,'\n',b)
Original 
 Shape (2, 3) 
 [[1 2 3]
 [4 5 6]]
Transposed: 
 Shape (3, 2) 
 [[1 4]
 [2 5]
 [3 6]]

On transposing a 2 x 3 array, we got a 3 x 2 array. The transpose is used throughout linear algebra, for example when working with symmetric matrices.
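
As a small supplementary sketch, the .T attribute gives the same result as np.transpose() for this array, and the index swap can be checked directly. The array is the one from the example above.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# a.T is shorthand for np.transpose(a).
print(np.array_equal(a.T, np.transpose(a)))
print(a.T.shape)

# The element at row 0, column 2 of a sits at row 2, column 0 of a.T.
print(a[0, 2] == a.T[2, 0])
```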