Indexing and Slicing

  • Elements and subarrays of NumPy arrays are accessed using the standard square bracket notation that is also used with Python lists.
  • Within the square brackets, different index formats are used for different types of element selection.
  • In general, the expression within the brackets is a tuple, where each item specifies which elements to select along the corresponding axis (dimension) of the array.
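
As a minimal sketch of the tuple idea (the array name a and its 3x4 shape are chosen here only for illustration), the following shows that a comma-separated index behaves the same as an explicit tuple with one entry per axis:

```python
import numpy as np

# Hypothetical 3x4 example array, used only for illustration.
a = np.arange(12).reshape(3, 4)

# Python packs a comma-separated index into a tuple,
# so these two expressions select the same element.
elem_commas = a[1, 2]
elem_tuple = a[(1, 2)]

# Each tuple item addresses one axis: all rows, column 2.
col = a[:, 2]
col_explicit = a[(slice(None), 2)]

print(elem_commas, elem_tuple)
print(col, col_explicit)
```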

Examples of Array Indexing and Slicing Expressions

Expression    Description
a[m]          Select the element at index m, where m is an integer (counting starts from 0).
a[-m]         Select the mth element from the end of the array, where m is a positive integer. The last element is addressed as -1, the second-to-last element as -2, and so on.
a[m:n]        Select elements with index starting at m and ending at n-1 (m and n are integers).
a[:] or a[0:] Select all elements along the given axis.
a[:n]         Select elements starting with index 0 and going up to index n-1 (n is an integer).
a[m:]         Select elements starting with index m (integer) and going up to and including the last element in the array.
a[m:-1]       Select elements starting with index m and going up to, but not including, the last element.
a[m:n:p]      Select elements with index m through n (exclusive), with increment p.
a[::-1]       Select all the elements, in reverse order.
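
The slice forms can be checked against a small array. This is only a sketch: the array a and the values of m, n, and p are arbitrary choices for illustration. Note that a negative stop index is exclusive like any other stop index, so a[0:-1] leaves out the last element.

```python
import numpy as np

a = np.arange(10)     # hypothetical example array: 0, 1, ..., 9
m, n, p = 2, 8, 3     # illustrative values

first = a[m]          # element at index 2
from_end = a[-m]      # second element from the end
middle = a[m:n]       # indices 2 through 7
everything = a[:]     # all ten elements
head = a[:n]          # indices 0 through 7
tail = a[m:]          # index 2 through the last element
stepped = a[m:n:p]    # indices 2 and 5
reversed_a = a[::-1]  # all elements, last to first

# The stop index is exclusive, so -1 excludes the last element.
without_last = a[0:-1]

print(middle, stepped, without_last)
```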

One-Dimensional Arrays

Along a single axis, integers are used to select single elements, and slices are used to select ranges and sequences of elements. Positive integers index elements from the beginning of the array (the index starts at 0), and negative integers index elements from the end of the array, where the last element is indexed with -1, the second-to-last element with -2, and so on.

import numpy as np
data = np.arange(8)
data
array([0, 1, 2, 3, 4, 5, 6, 7])
data[0] # First element
0
data[-1] # last element
7
data[4] # fifth element, at index 4
4
data[1:-1] # from the second element to the second-to-last element
array([1, 2, 3, 4, 5, 6])
data[1:-1:2] # from the second to the second-to-last element, selecting every second element
array([1, 3, 5])
data[:5] # select first five
array([0, 1, 2, 3, 4])
data[-5:] # last five elements
array([3, 4, 5, 6, 7])
data[::-2] # reverse the array and select only every second value
array([7, 5, 3, 1])

Multidimensional Arrays

With multidimensional arrays, element selections like those introduced in the previous section can be applied on each axis (dimension). The result is a reduced array in which each element matches the given selection rules.

f = lambda m, n: n + 10 * m
data = np.fromfunction(function=f, shape=(6, 6), dtype=np.int32)
data
array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]], dtype=int32)
data[:, 1] # second column
array([ 1, 11, 21, 31, 41, 51], dtype=int32)
data[1, :] # second row
array([10, 11, 12, 13, 14, 15], dtype=int32)
  • By applying a slice on each of the array axes, we can extract subarrays:
data[:3, :3] # upper left diagonal block matrix
array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22]], dtype=int32)
data[3:, :3] # lower left off-diagonal block matrix
array([[30, 31, 32],
       [40, 41, 42],
       [50, 51, 52]], dtype=int32)
  • With element spacing other than 1, subarrays made up of nonconsecutive elements can be extracted:
data[::2, ::2] # every second element starting from 0, 0
array([[ 0,  2,  4],
       [20, 22, 24],
       [40, 42, 44]], dtype=int32)
data[1::2, 1::3] # every second row and every third column, starting from 1, 1
array([[11, 14],
       [31, 34],
       [51, 54]], dtype=int32)

This ability to extract subsets of data from a multidimensional array is a simple but very powerful feature.
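
How much the result is reduced depends on the kind of index used on each axis. The sketch below rebuilds the same 6x6 layout as above and compares shapes; the variable names row_1d, row_2d, and scalar are introduced only for illustration.

```python
import numpy as np

# Same layout as the data array above: value = 10 * row + column.
data = np.fromfunction(lambda m, n: n + 10 * m, shape=(6, 6), dtype=np.int32)

row_1d = data[1, :]    # an integer index drops that axis -> shape (6,)
row_2d = data[1:2, :]  # a length-one slice keeps the axis -> shape (1, 6)
scalar = data[1, 2]    # an integer on every axis -> a single element

print(row_1d.shape, row_2d.shape, scalar)
```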

Copies and Views of Objects

  • Subarrays that are extracted from arrays using slice operations are alternative views of the same underlying array data. This means that they are arrays that refer to the same data in memory as the original array, but with a different strides configuration.
  • When elements in a view are assigned new values, the values of the original array are therefore also updated:
data = np.fromfunction(function=f, shape=(6, 6), dtype=np.int32)
data
array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]], dtype=int32)
x = data[1:5, 1:5]
x
array([[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34],
       [41, 42, 43, 44]], dtype=int32)
x[:, :] = 100
data
array([[  0,   1,   2,   3,   4,   5],
       [ 10, 100, 100, 100, 100,  15],
       [ 20, 100, 100, 100, 100,  25],
       [ 30, 100, 100, 100, 100,  35],
       [ 40, 100, 100, 100, 100,  45],
       [ 50,  51,  52,  53,  54,  55]], dtype=int32)
  • Here, assigning new values to the elements of x, which was created from data by slicing, also modifies the values in data (since both arrays refer to the same data in memory).
  • The fact that extracting subarrays results in views rather than new independent arrays eliminates the need for copying data and improves performance.
  • When a copy rather than a view is needed, the view can be copied explicitly by using the copy method of the ndarray instance.
y = x[1:3, 1:3].copy()
y
array([[100, 100],
       [100, 100]], dtype=int32)
y[:, :] = 1 # does not affect x since y is a copy of the view x[1:3, 1:3]
y
array([[1, 1],
       [1, 1]], dtype=int32)
x
array([[100, 100, 100, 100],
       [100, 100, 100, 100],
       [100, 100, 100, 100],
       [100, 100, 100, 100]], dtype=int32)
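
One way to check whether two arrays refer to the same memory is np.shares_memory, and the strides attribute shows how a view steps through the shared buffer. The following is a minimal sketch; the array contents and the names view and copy are chosen here for illustration.

```python
import numpy as np

data = np.arange(36, dtype=np.int32).reshape(6, 6)  # illustrative 6x6 array

view = data[::2, ::2]  # slicing returns a view on the same buffer
copy = view.copy()     # explicit, independent copy

shares_view = np.shares_memory(data, view)  # True
shares_copy = np.shares_memory(data, copy)  # False

# The view covers the same bytes with larger steps (in bytes) per axis.
print(data.strides, view.strides)

view[0, 0] = -1  # writes through to data[0, 0]
copy[1, 1] = -2  # leaves data[2, 2] untouched
print(data[0, 0], data[2, 2])
```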

Fancy Indexing and Boolean-Valued Index

NumPy provides another convenient method to index arrays, called fancy indexing.

  • With fancy indexing, an array can be indexed with another NumPy array, a Python list, or a sequence of integers, whose values select elements in the indexed array.
  • Fancy indexing requires that the elements in the array or list used for indexing are integers.
data = np.linspace(0, 1, 11)
data
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
data[np.array([0, 2, 4])]
array([0. , 0.2, 0.4])
data[[0, 2, 4]]
array([0. , 0.2, 0.4])
  • Another variant of indexing NumPy arrays is to use Boolean-valued index arrays. In this case, each element of the index array indicates whether or not to select the element of the indexed array at the corresponding position. This index method is handy for filtering elements from an array:
data > 0.6
array([False, False, False, False, False, False,  True,  True,  True,
        True,  True])
data[data > 0.6]
array([0.6, 0.7, 0.8, 0.9, 1. ])
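
A Boolean index array can be related to fancy indexing through the positions at which it is True. The sketch below uses integer data, chosen here to keep the comparison exact, and np.flatnonzero to obtain those positions; the variable names are illustrative.

```python
import numpy as np

data = np.arange(10)              # integer example data
mask = data > 5                   # Boolean array with one entry per element
positions = np.flatnonzero(mask)  # integer indices where mask is True

by_mask = data[mask]        # Boolean-valued indexing
by_index = data[positions]  # equivalent fancy indexing

# Conditions are combined element-wise with & and |. The parentheses are
# required because these operators bind more tightly than comparisons.
even_and_large = data[(data > 2) & (data % 2 == 0)]

print(positions, by_mask, even_and_large)
```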

NOTE:

Unlike arrays created by using slices, the arrays returned using fancy indexing and Boolean-valued indexing are not views but rather new independent arrays. Nonetheless, it is possible to assign values to elements selected using fancy indexing:


data = np.arange(10)
data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
indices = [3, 5, 7]
x = data[indices]
x
array([3, 5, 7])
x[0] = -1 # this does not affect data
data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
data[indices] = -1 # this affects data
data
array([ 0,  1,  2, -1,  4, -1,  6, -1,  8,  9])
data = np.arange(10)
x = data[data > 5]
x
array([6, 7, 8, 9])
x[0] = -1 # this does not affect data
data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
data[data > 5] = -1 # this alters data
data
array([ 0,  1,  2,  3,  4,  5, -1, -1, -1, -1])
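
The difference between the two assignments can be sketched as follows. With chained indexing, data[indices] first produces a new array and the assignment changes only that temporary copy, whereas data[indices] = -1 is a single item assignment on data itself. The variable names snapshot and independent are introduced only for illustration.

```python
import numpy as np

data = np.arange(10)
indices = [3, 5, 7]

# Chained indexing: data[indices] builds a new array first, and the
# assignment then modifies only that temporary copy.
data[indices][0] = -1
snapshot = data.copy()  # data is still 0, 1, ..., 9 at this point

# Direct assignment: a single item assignment that writes into data.
data[indices] = -1

# The array returned by fancy indexing does not share memory with data.
independent = not np.shares_memory(data, data[indices])

print(snapshot)
print(data)
```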
%load_ext watermark
%watermark --iversion -g -m -v -u -d
numpy 1.16.3
last updated: 2019-05-16 

CPython 3.6.7
IPython 7.5.0

compiler   : GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)
system     : Darwin
release    : 18.2.0
machine    : x86_64
processor  : i386
CPU cores  : 8
interpreter: 64bit
Git hash   : d5ca5803791f464b37b10773989e09aa9c5c781b