Python NumPy Tutorial

NumPy is a Python package for numerical and scientific computing built around an n-dimensional array. It is one of the most popular packages used for data analysis in Python. NumPy arrays not only provide rich functionality but are also efficient in the way they store data, and computations on them are very fast, which is why they are often preferred over a normal Python list.

This Python NumPy tutorial makes two basic assumptions:

  • You have Python already installed.
  • You know how to program in any programming language.

Installing Numpy

Installing Using Anaconda

Anaconda already comes with NumPy, but if something went wrong with the installation you can still install NumPy using the command below.

conda install -c anaconda numpy

Installing Using PIP

pip install numpy
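
Whichever installer you use, a quick way to confirm that the installation worked is to import the package and print its version. The snippet below is a minimal check; the version string you see depends on your own installation.

```python
import numpy as np

# If the import succeeds, NumPy is installed for this interpreter.
installed_version = np.__version__
print(installed_version)
```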

Learning Numpy

The Numpy Basics

To use NumPy we need to import it first, as with any other package.

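import numpy as np
x = [1,2,3,4]        # a normal Python list
a = np.array(x)      # a NumPy array built from the list
print(type(a))       # <class 'numpy.ndarray'>
print(type(x))       # <class 'list'>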

In the code above we imported numpy as np (the most commonly used alias for NumPy). We then created a normal Python list so that we could later compare it with a NumPy array. To create the NumPy array, we passed the list to the np.array function, which returns a NumPy array built from the given list.

Comparison between numpy array and a normal Python list

As the type calls above show, a normal Python list and a NumPy array have different types (list and numpy.ndarray). Moreover, for the same number of elements of the same type, a NumPy array typically occupies less memory than the Python list.
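
The memory claim can be checked with a rough sketch like the one below. The element count of 1000, the int64 element type and the variable names are choices made for this illustration, and sys.getsizeof only gives an approximate figure for the list, so treat the numbers as indicative.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# A list stores pointers to separate int objects, so count both
# the list itself and the objects it points to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A NumPy array stores the raw 8-byte integers in one contiguous buffer.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```

On a typical CPython build the list figure comes out several times larger than the array figure.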

numpy.arange()

The arange function creates a NumPy array of values between start and stop, spaced by an optional step. It accepts three parameters: the start value, the stop value, and the step interval. The step is optional and defaults to 1. The values begin at start and go up to stop but do not include stop itself, so think of stop as an exclusive upper bound.

a = np.arange(1, 10)
print(a)
b = np.arange(1, 10, .2)
print(b)

When we execute the code above, print(a) gives:

[1 2 3 4 5 6 7 8 9]

As we didn’t specify the step interval it defaulted to 1, but in the case of b we set the step interval to .2, for which the printed result is

[1.  1.2 1.4 1.6 1.8 2.  2.2 2.4 2.6 2.8 3.  3.2 3.4 3.6 3.8 4.  4.2 4.4
 4.6 4.8 5.  5.2 5.4 5.6 5.8 6.  6.2 6.4 6.6 6.8 7.  7.2 7.4 7.6 7.8 8.
 8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]
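
To make the role of each argument explicit, here is a small additional sketch with values chosen for this illustration. It shows that stop is excluded and that a single argument is treated as stop, with start defaulting to 0.

```python
import numpy as np

odds = np.arange(1, 10, 2)    # start=1, stop=10 (excluded), step=2
print(odds)                   # [1 3 5 7 9]

evens = np.arange(0, 10, 2)   # 10 itself is never produced
print(evens)                  # [0 2 4 6 8]

first_five = np.arange(5)     # only stop given: start defaults to 0
print(first_five)             # [0 1 2 3 4]
```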

We will continue using the NumPy array a as we move ahead in the article.

Finding Minimum value in a NumPy array

Using the min method we can find the minimum value in the NumPy array.

a.min()

The above returns 1, which is the minimum value in our NumPy array a.

Finding Maximum value in a NumPy array

Using the max method we can find the maximum value in the NumPy array.

a.max()

The above returns 9, which is the maximum value in our NumPy array a. (The array b, created with a step of .2, would instead give approximately 9.8.)

Finding the sum of values in a NumPy array

Using the sum method we can find the sum of all values in the NumPy array.

a.sum()

The above returns 45, which is the sum of all values in our NumPy array a.
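
Putting the three methods together on the two arrays from the arange section gives a quick sanity check. This is only a sketch; the float array b is rounded before printing because its values are built from steps of .2 and carry small floating point errors.

```python
import numpy as np

a = np.arange(1, 10)        # integers 1..9
b = np.arange(1, 10, .2)    # floats 1.0, 1.2, ..., 9.8

print(a.min(), a.max(), a.sum())   # 1 9 45

# b holds floating point values, so print rounded results.
print(round(float(b.min()), 2), round(float(b.max()), 2))   # 1.0 9.8
```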

NumPy.linspace method

The arange function discussed above returns all the values in an interval at a specified step. The linspace function instead returns the number of values you ask for within an interval, and these values are evenly spaced. It accepts three arguments: the start value, the stop value, and the number of values you want in that interval. Unlike arange, the stop value is included in the result by default.

For example, let's say you want 30 values in the interval from 1 to 10.

np.linspace(1, 10, 30)

The output of the above code would be:

array([ 1.        ,  1.31034483,  1.62068966,  1.93103448,  2.24137931,
        2.55172414,  2.86206897,  3.17241379,  3.48275862,  3.79310345,
        4.10344828,  4.4137931 ,  4.72413793,  5.03448276,  5.34482759,
        5.65517241,  5.96551724,  6.27586207,  6.5862069 ,  6.89655172,
        7.20689655,  7.51724138,  7.82758621,  8.13793103,  8.44827586,
        8.75862069,  9.06896552,  9.37931034,  9.68965517, 10.        ])

Which represents 30 evenly spaced values from 1 to 10.
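
The spacing that linspace picks can be checked directly. The sketch below uses the optional retstep argument of np.linspace to also return the step, which for num points with the endpoint included is (stop - start) / (num - 1).

```python
import numpy as np

values, step = np.linspace(1, 10, 30, retstep=True)

print(len(values))            # 30
print(values[0], values[-1])  # 1.0 10.0
print(step)                   # 9 / 29, roughly 0.3103
```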

Finding the Shape of a NumPy Array

The shape of an array gives the number of elements along each dimension; for a two-dimensional array that is the number of rows and columns. Let us first create a multidimensional array.

a = np.array([[1,2],[2,3],[4,5]])

Shape is a property of a NumPy array. To get the shape you simply write:

a.shape

This returns (3, 2), where 3 is the number of rows and 2 is the number of columns of the NumPy array a.

Reshaping the NumPy array

You can reshape a NumPy array to any other shape as long as it is compatible. For example, our array a has 3 rows and 2 columns, meaning 6 elements. You could reshape it to (1,6), (2,3), (3,2) or (6,1).

Our Original Numpy array ‘a’ with shape (3,2)

array([[1, 2],
       [2, 3],
       [4, 5]])

Changing it to shape (2,3)

a.reshape(2,3)

Results in

array([[1, 2, 2],
       [3, 4, 5]])

As you can see, the result has a different shape: there are now 2 rows and 3 columns. Note that reshape returns a new array object; a itself still has shape (3,2) unless you assign the result back to it. As long as the number of rows * the number of columns gives the same number of elements, the shape is compatible and reshaping can be done. Trying to reshape to an incompatible shape results in an error.
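
The reshape rules can be tried out with a short sketch like this one. The names b, c and reshape_error are made up for the illustration, and the -1 placeholder, which lets NumPy infer one dimension, goes slightly beyond what the text covers.

```python
import numpy as np

a = np.array([[1, 2], [2, 3], [4, 5]])

b = a.reshape(2, 3)
print(a.shape, b.shape)        # (3, 2) (2, 3): a keeps its shape

c = a.reshape(-1, 6)           # -1 asks NumPy to infer that dimension
print(c.shape)                 # (1, 6)

try:
    a.reshape(4, 2)            # 8 slots for 6 elements
except ValueError as err:
    reshape_error = str(err)
    print("cannot reshape:", reshape_error)
```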

Creating a Zeros array

To create a NumPy array with all values 0, we can use the NumPy function zeros. Its main argument is a tuple specifying the dimensions of the array you wish to create.

np.zeros((3,2))

Which results in

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

Creating a Ones Numpy array

Similarly to zeros, to create a NumPy array with all values set to one you would use the ones function. It also takes the dimensions as a tuple.

np.ones((3,2))

Which results in

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

Creating a Numpy array with values set to a specified number

Similar to ones and zeros, you can create a NumPy array with all values set to a number of your choice. For this, use the full function. It accepts a tuple that specifies the shape of the array and, as its second argument, the number the array should be filled with.

np.full((3,3),5)

The above returns a NumPy array with the shape (3,3) in which all values are 5.

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])
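
One detail visible in the outputs above is that zeros and ones print values like 0. and 1. while full prints a plain 5. That comes from the element type. In the sketch below, zeros and ones default to float64, full takes its type from the fill value, and the optional dtype argument overrides either; the variable names are made up for this example.

```python
import numpy as np

zeros = np.zeros((3, 2))
ones = np.ones((3, 2))
fives = np.full((3, 3), 5)

print(zeros.dtype, ones.dtype)   # float64 float64
print(fives.dtype)               # an integer type, e.g. int64

# Request a specific element type with the dtype argument.
int_zeros = np.zeros((3, 2), dtype=int)
float_fives = np.full((3, 3), 5, dtype=float)
print(int_zeros.dtype.kind, float_fives.dtype)   # i float64
```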

Creating an Identity Matrix

The identity matrix has every diagonal element equal to 1 and every other element equal to 0. It is always a square matrix, that is, the number of rows and columns are the same. Another important property is that any matrix of compatible size multiplied by the identity matrix results in the matrix itself.

NumPy has a function called eye which, in its simplest form, takes a single number that specifies the number of rows and columns.

identity = np.eye(3)

Which results in

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Sum of two NumPy arrays

To calculate the sum of two NumPy arrays, in the simplest case both should have the same shape. We will continue using the identity matrix we created above and create a new matrix a with the shape (3,3).

a = np.full((3,3),5)
print(identity + a)

The above piece of code results in

[[6. 5. 5.]
 [5. 6. 5.]
 [5. 5. 6.]]

Multiplication of NumPy arrays

We won’t discuss how matrix multiplication works but rather how to multiply two matrices using NumPy. We compute the matrix product of the two matrices with the dot method.

a.dot(identity)

We already know that any matrix multiplied by an identity matrix results in itself. This is confirmed here, as the dot product gives back the values of a.

Result of the dot product.

array([[5., 5., 5.],
       [5., 5., 5.],
       [5., 5., 5.]])
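
It is easy to confuse the dot product with the * operator, so the sketch below puts them side by side. The names matrix_product and elementwise are made up for this example; the @ operator mentioned in the comment is the standard Python operator for matrix multiplication that NumPy arrays support.

```python
import numpy as np

a = np.full((3, 3), 5)
identity = np.eye(3)

matrix_product = a.dot(identity)   # same as a @ identity
elementwise = a * identity         # multiplies matching positions only

print(matrix_product)   # all 5s: a is unchanged by the identity
print(elementwise)      # 5s on the diagonal, 0s elsewhere
```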

Calculating the Square Root of the matrix

You can calculate the square root of every element of a matrix using the sqrt function. It takes the NumPy array whose square roots you want. Note that this is an element-wise square root, not a matrix square root.

d = np.full((2,2),9)
np.sqrt(d)

The above results in

array([[3., 3.],
       [3., 3.]])

We passed it a NumPy matrix d filled with the value 9, whose square root is 3.
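
To see that sqrt works on each element separately, here is a small sketch with a matrix whose entries differ. The matrix m is made up for this example.

```python
import numpy as np

m = np.array([[1, 4], [9, 16]])
roots = np.sqrt(m)

print(roots)
# [[1. 2.]
#  [3. 4.]]

# Squaring each element gives back the original values.
print(roots * roots)
```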

Stacking Numpy matrices

Stacking here refers to arranging numpy matrices together either in a vertical way or horizontal way

Vertical Stacking using vstack

The vstack function takes a sequence (a list or tuple) of the NumPy arrays you wish to stack vertically. The order in which they appear in the sequence is maintained in the stacked result.

e = np.vstack((a,identity))

The above results in

array([[5., 5., 5.],
       [5., 5., 5.],
       [5., 5., 5.],
       [1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

As you can clearly see, our matrix a is stacked on top of the identity matrix, because a comes before identity in the sequence we passed to vstack.

Horizontal Stacking using hstack

The hstack function takes a sequence (a list or tuple) of the NumPy arrays you wish to stack horizontally. The order in which they appear in the sequence is maintained in the stacked result.

np.hstack((a, identity))

The above results in

array([[5., 5., 5., 1., 0., 0.],
       [5., 5., 5., 0., 1., 0.],
       [5., 5., 5., 0., 0., 1.]])
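
Stacking only works when the arrays line up along the other axis. The sketch below illustrates this with an extra one-row array, row, which is made up for this example along with the names tall and hstack_failed.

```python
import numpy as np

a = np.full((3, 3), 5)
identity = np.eye(3)
row = np.array([[7, 8, 9]])             # shape (1, 3)

print(np.vstack((a, identity)).shape)   # (6, 3)
print(np.hstack((a, identity)).shape)   # (3, 6)

# vstack only needs the column counts to agree.
tall = np.vstack((a, row))
print(tall.shape)                       # (4, 3)

try:
    np.hstack((a, row))                 # 3 rows next to 1 row
except ValueError:
    hstack_failed = True
    print("hstack needs the same number of rows")
```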

Splitting the numpy array

If you wish to split an array you can use the split function. Similar to stacking, it has the variants vsplit and hsplit. You pass the matrix you wish to split and a number that specifies how many parts to split it into. This number must divide the array into parts of equal shape, otherwise an error is raised.

Suppose you wish to vertically split the NumPy array e (which was created with vstack) into two parts.

np.vsplit(e,2)

The above code results in

[array([[5., 5., 5.],
        [5., 5., 5.],
        [5., 5., 5.]]),
 array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])]

It split e into two equal parts. The first array is the matrix ‘a’ and the second is our identity matrix, as e was created by vertically stacking those two.
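
For completeness, here is a sketch of the horizontal counterpart and of the error case mentioned earlier. The arrays are rebuilt inside the snippet so that it runs on its own, and the names wide, left, right and uneven_split_failed are made up for this example.

```python
import numpy as np

wide = np.hstack((np.full((3, 3), 5), np.eye(3)))   # shape (3, 6)

left, right = np.hsplit(wide, 2)    # two blocks of shape (3, 3)
print(left.shape, right.shape)

e = np.vstack((np.full((3, 3), 5), np.eye(3)))      # shape (6, 3)
try:
    np.vsplit(e, 4)                 # 6 rows cannot form 4 equal parts
except ValueError:
    uneven_split_failed = True
    print("6 rows cannot be split into 4 equal parts")
```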

Creating a Numpy Matrix with random values

To create a NumPy array with random values you can use the np.random.random function. It takes a tuple specifying the shape of the matrix to be created and fills it with floats drawn uniformly from the interval [0, 1).

a = np.random.random((3,3))

Every time you run the above code it produces a different matrix of the shape (3,3).
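
If you need the same random matrix on every run, for example in a test, one option is a seeded generator. The sketch below uses np.random.default_rng, which is a different interface from the np.random.random call shown above; the seed 42 and the variable names are arbitrary choices for this example.

```python
import numpy as np

rng_one = np.random.default_rng(42)   # 42 is an arbitrary seed
rng_two = np.random.default_rng(42)

first = rng_one.random((3, 3))
second = rng_two.random((3, 3))

print(np.array_equal(first, second))          # True: same seed, same values
print(first.min() >= 0.0, first.max() < 1.0)  # True True
```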

The No of Elements in the Numpy array

The size of the numpy array is the number of elements inside the array.

a.size

The above results in 9, as in the above example we created the matrix a of shape (3,3), which has 9 elements.

Getting the Memory size of Numpy array

Let’s first get the size of an individual element in the matrix a

a.itemsize

This results in 8, meaning 8 bytes. You can check the type of the elements to confirm this.

a.dtype

This results in dtype('float64'), and a float64 does take 8 bytes.

To get the total bytes occupied by this array:

a.nbytes

This results in 72 bytes, as the number of elements in the array a is 9 and each takes 8 bytes.
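
The relationship between the three attributes can be written out as a quick check. The float32 conversion at the end is an extra illustration, not part of the walkthrough above, showing how the element type drives the total.

```python
import numpy as np

a = np.random.random((3, 3))

print(a.size, a.itemsize, a.nbytes)      # 9 8 72
print(a.nbytes == a.size * a.itemsize)   # True

# The same shape with a smaller element type needs fewer bytes.
small = a.astype(np.float32)
print(small.itemsize, small.nbytes)      # 4 36
```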

Getting Number of Dimensions in the array

To get the number of dimensions in an array we use the ndim attribute.

a.ndim

It results in 2, as there are just two dimensions in this array:

  • Rows
  • Columns
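
To see how ndim relates to shape, the sketch below compares arrays with one, two and three dimensions. The array names are made up for this example.

```python
import numpy as np

vector = np.arange(6)                  # shape (6,)
matrix = vector.reshape(2, 3)          # shape (2, 3)
cube = np.zeros((2, 3, 4))             # shape (2, 3, 4)

for arr in (vector, matrix, cube):
    # ndim always equals the length of the shape tuple
    print(arr.ndim, arr.shape, len(arr.shape))
```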

Slicing of the Numpy Array

Slicing a NumPy array is similar to slicing a Python list. Let's first create a new multidimensional array.

e = np.array([[1,2,3],[4,5,6],[7,8,9]])

To get all the rows but leave out the last column you would write

x = e[:,:-1]

Which would result in

array([[1, 2],
       [4, 5],
       [7, 8]])

To select just the last column we would write

x = e[:,-1]

Which would result in

array([3, 6, 9])

Similarly for rows

x = e[:-1,:]  # all rows except last
x = e[-1]     # last row

As you can see, there is nothing special about slicing a NumPy array: each dimension uses the same start:stop notation as slicing a normal Python list.
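
The row slices can be checked with a short sketch. The last few lines also illustrate one difference from lists that is easy to miss: a basic NumPy slice is a view on the same data, so writing through it changes the original array. The name last_column is made up for this example.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(e[:-1, :])    # rows 0 and 1: [[1 2 3], [4 5 6]]
print(e[-1])        # last row: [7 8 9]
print(e[1, 2])      # single element, row 1, column 2: 6

# Unlike a list slice, a basic NumPy slice is a view on the same data.
last_column = e[:, -1]
last_column[0] = 30
print(e[0])         # [ 1  2 30]
```

If you need an independent copy instead of a view, call the copy method on the slice.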
