Reshaping and Resizing

When working with data in array form, it is often useful to rearrange arrays and alter the way they are interpreted. For example, an $N \times N$ matrix could be rearranged into a vector of length $N^2$, or a set of one-dimensional arrays could be concatenated or stacked next to each other to form a matrix.

Summary of NumPy Functions for Manipulating the Dimensions and the Shape of Arrays

| Function/Method | Description |
| --- | --- |
| np.reshape, np.ndarray.reshape | Reshapes an N-dimensional array. The total number of elements must remain the same. |
| np.ndarray.flatten | Creates a copy of an N-dimensional array and reinterprets it as a one-dimensional array (i.e., all dimensions are collapsed into one). |
| np.ravel, np.ndarray.ravel | Creates a view (if possible, otherwise a copy) of an N-dimensional array in which it is interpreted as a one-dimensional array. |
| np.squeeze | Removes axes with length 1. |
| np.expand_dims, np.newaxis | Adds a new axis/dimension of length 1 to an array, where np.newaxis is used with array indexing. |
| np.transpose, np.ndarray.transpose, np.ndarray.T | Transposes the array. The transpose operation corresponds to reversing (or, more generally, permuting) the axes of the array. |
| np.hstack | Stacks a list of arrays horizontally (along axis 1): for example, given a list of column vectors, appends the columns to form a matrix. |
| np.vstack | Stacks a list of arrays vertically (along axis 0): for example, given a list of row vectors, appends the rows to form a matrix. |
| np.dstack | Stacks arrays depth-wise (along axis 2). |
| np.concatenate | Creates a new array by appending arrays after each other, along a given axis. |
| np.resize | Resizes an array. Creates a new copy of the original array, with the requested size. If necessary, the original array is repeated to fill up the new array. |
| np.append | Appends an element to an array. Creates a new copy of the array. |
| np.insert | Inserts a new element at a given position. Creates a new copy of the array. |
| np.delete | Deletes an element at a given position. Creates a new copy of the array. |
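
Several functions in the summary (np.squeeze, np.transpose, and np.resize) are not demonstrated in the examples that follow. The short sketch below illustrates them; the array `a` and the other variable names are made up for this illustration only.

```python
import numpy as np

a = np.arange(6).reshape(1, 2, 3)        # shape (1, 2, 3)

squeezed = np.squeeze(a)                 # drops the length-1 axis -> shape (2, 3)
transposed = squeezed.T                  # reverses the axes -> shape (3, 2)
permuted = np.transpose(a, (2, 0, 1))    # explicit axis permutation -> shape (3, 1, 2)

# np.resize returns a new array and repeats the input data
# as often as needed to fill the requested shape.
resized = np.resize(np.array([1, 2, 3]), (2, 4))

print(squeezed.shape, transposed.shape, permuted.shape)
print(resized)
```

Here `resized` contains the values 1, 2, 3 repeated in row-major order until all eight positions are filled.
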
  • Reshaping an array does not require modifying the underlying array data; it only changes how the data is interpreted, by redefining the array’s strides attribute.
import numpy as np
data = np.array([[10, 3], [5, 8]])
data
array([[10,  3],
       [ 5,  8]])
data.strides
(16, 8)
x = np.reshape(data, (1, 4))
x
array([[10,  3,  5,  8]])
x.strides
(32, 8)
  • Note that reshaping an array produces a view of the array whenever the memory layout permits it, as it does here. If an independent copy of the array is needed, the view has to be copied explicitly (e.g., using np.copy).
x[0, 1] = -100
x
array([[  10, -100,    5,    8]])
data
array([[  10, -100],
       [   5,    8]])
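
Whether two arrays share the same underlying buffer can be checked with np.shares_memory. The following is a minimal sketch, using hypothetical variable names, that contrasts a reshaped view with an explicit copy. It also shows a case where reshape cannot return a view: flattening a transposed array, which cannot be expressed by changing strides alone.

```python
import numpy as np

data = np.array([[10, 3], [5, 8]])

view = data.reshape(1, 4)                 # reinterprets the same buffer
independent = data.reshape(1, 4).copy()   # owns a separate buffer

independent[0, 1] = -100                  # does not affect data

shares_view = np.shares_memory(data, view)          # True
shares_copy = np.shares_memory(data, independent)   # False

# The transpose is a non-contiguous view; flattening it in row-major
# order needs the elements in a new order, so NumPy makes a copy.
reshaped_t = data.T.reshape(4)
shares_t = np.shares_memory(data, reshaped_t)       # False

print(shares_view, shares_copy, shares_t)
```
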
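  • The function np.ravel (and the corresponding ndarray method) is a special case of reshape that collapses all dimensions of an array and returns a flattened one-dimensional array whose length equals the total number of elements in the original array. The ndarray method flatten, used below, performs the same operation but always returns a copy instead of a view.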
data.flatten()
array([  10, -100,    5,    8])
data.flatten().shape
(4,)
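
The practical difference between ravel and flatten shows up when the result is modified. A small sketch, with variable names chosen only for this illustration:

```python
import numpy as np

data = np.array([[10, -100], [5, 8]])

raveled = data.ravel()       # a view here, because data is C-contiguous
flattened = data.flatten()   # always a copy

raveled[0] = 0               # writes through to data
flattened[1] = 0             # leaves data unchanged

print(data)
```

After these assignments, `data[0, 0]` is 0 while `data[0, 1]` is still -100.
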
  • While np.ravel and np.ndarray.flatten collapse the axes of an array into a one-dimensional array, it is also possible to introduce new axes into an array, either by using np.reshape or, when adding new axes of length 1, by using indexing notation with np.newaxis at the position of the new axis.
data
array([[  10, -100],
       [   5,    8]])
column = data[:, np.newaxis]
column
array([[[  10, -100]],

       [[   5,    8]]])
row = data[np.newaxis, :]
row
array([[[  10, -100],
        [   5,    8]]])
data.shape, column.shape, row.shape
((2, 2), (2, 1, 2), (1, 2, 2))
  • The function np.expand_dims can also be used to add new dimensions to an array. In the preceding example, the expression data[:, np.newaxis] is equivalent to np.expand_dims(data, axis=1), and data[np.newaxis, :] is equivalent to np.expand_dims(data, axis=0). Here the axis argument specifies the position, relative to the existing axes, at which the new axis is inserted.
np.expand_dims(data, axis=0).shape
(1, 2, 2)
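
The correspondence between np.newaxis indexing and np.expand_dims can be verified directly. The sketch below reuses the names column and row from the example above; the boolean result names are hypothetical.

```python
import numpy as np

data = np.array([[10, -100], [5, 8]])

column = data[:, np.newaxis]   # new axis at position 1 -> shape (2, 1, 2)
row = data[np.newaxis, :]      # new axis at position 0 -> shape (1, 2, 2)

# np.array_equal compares both shape and values.
same_column = np.array_equal(column, np.expand_dims(data, axis=1))
same_row = np.array_equal(row, np.expand_dims(data, axis=0))

# np.newaxis is simply an alias for None.
is_none = np.newaxis is None

print(same_column, same_row, is_none)
```
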
  • In addition to reshaping and selecting subarrays, it is often necessary to merge arrays into bigger arrays, for example, when joining separately computed or measured data series into a higher-dimensional array, such as a matrix. For this task, NumPy provides the functions np.vstack, for vertical stacking of, for example, rows into a matrix, and np.hstack for horizontal stacking of, for example, columns into a matrix. The function np.concatenate provides similar functionality, but it takes a keyword argument axis that specifies the axis along which the arrays are to be concatenated.
data = np.arange(5)
data
array([0, 1, 2, 3, 4])
np.vstack((data, data, data))
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
  • If we instead want to stack the arrays horizontally, to obtain a matrix where the arrays are the column vectors, we might first attempt something similar using np.hstack:
np.hstack((data, data, data))
array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4])

However, although this does stack the arrays horizontally, it is not in the way intended here: one-dimensional arrays have only a single axis, so np.hstack joins them end to end. To make np.hstack treat the input arrays as columns and stack them accordingly, we need to make the input arrays two-dimensional arrays of shape (5, 1) rather than one-dimensional arrays of shape (5,), by inserting a new axis using indexing with np.newaxis:

data = data[:, np.newaxis]
np.hstack((data, data, data))
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])

The behavior of the functions for horizontal and vertical stacking, as well as concatenating arrays using np.concatenate, is clearest when the stacked arrays have the same number of dimensions as the final array and when the input arrays are stacked along an axis for which they have length 1.
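
As an illustration of this point, the sketch below builds explicit row and column versions of the one-dimensional array and joins them with np.concatenate along the axis of length 1. The variable names are hypothetical. np.column_stack, which is not discussed above, is included as one possible shortcut for stacking one-dimensional arrays as columns.

```python
import numpy as np

data = np.arange(5)
row = data[np.newaxis, :]      # shape (1, 5)
column = data[:, np.newaxis]   # shape (5, 1)

# Stack along the axis on which the inputs have length 1.
rows = np.concatenate((row, row, row), axis=0)               # shape (3, 5)
columns = np.concatenate((column, column, column), axis=1)   # shape (5, 3)

# These match the vstack/hstack results.
same_rows = np.array_equal(rows, np.vstack((data, data, data)))
same_columns = np.array_equal(columns, np.hstack((column, column, column)))

# np.column_stack turns 1-D inputs into columns without manual reshaping.
stacked = np.column_stack((data, data, data))

print(rows.shape, columns.shape, stacked.shape)
```
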


NOTE:

  • The number of elements in a NumPy array cannot be changed once the array has been created. To insert, append, or remove elements, for example, using the functions np.append, np.insert, and np.delete, a new array must be created and the data copied to it.
  • It may sometimes be tempting to use these functions to grow or shrink a NumPy array, but because of the overhead of creating new arrays and copying the data, it is usually a good idea to preallocate arrays with a size such that they do not need to be resized later.
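
The two approaches can be compared side by side. This is only a sketch: the size `n` and the squared values are arbitrary choices for illustration, and no timings are claimed. It also shows that np.insert and np.delete leave their input untouched.

```python
import numpy as np

n = 5

# Growing with np.append: each call allocates a new array and copies the old data.
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i ** 2)

# Preallocating: a single allocation, followed by in-place assignment.
prealloc = np.empty(n, dtype=np.int64)
for i in range(n):
    prealloc[i] = i ** 2

# np.insert and np.delete also return new arrays; original is not modified.
original = np.arange(5)
inserted = np.insert(original, 1, 99)   # insert 99 before index 1
deleted = np.delete(original, 0)        # drop the element at index 0

print(grown, prealloc)
print(original, inserted, deleted)
```
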

%load_ext watermark
%watermark --iversion -g -m -v -u -d
numpy 1.16.3
last updated: 2019-05-16 

CPython 3.6.7
IPython 7.5.0

compiler   : GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)
system     : Darwin
release    : 18.2.0
machine    : x86_64
processor  : i386
CPU cores  : 8
interpreter: 64bit
Git hash   : 6fea3e461997774d354d33506e67cdcac1f92e66