# Numpy and Scipy

## Overview

Teaching: 40 min

Exercises: 10 min

Questions

- How do I deal with tabular scientific data?

Objectives

- Import the numpy library.
- Understand the NDArray object.
- Get some basic information about numpy and scipy objects and methods.

## Numpy is the main Python library for scientific computation

- Numpy provides a new data type, the `array`.
- `arrays` are multi-dimensional collections of data of the same intrinsic type (int, float, etc.).
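To make the "same intrinsic type" point concrete, here is a minimal sketch. The variable names are illustrative only. It shows that numpy picks a single type for the whole array, and that a nested list produces a two-dimensional array.

```python
import numpy

# Integers only: numpy picks an integer type for the whole array.
ints = numpy.array([1, 2, 3])
print(ints.dtype.kind)        # 'i' stands for a signed integer type

# One float among the values: every element is stored as a float.
mixed = numpy.array([1, 2.5, 3])
print(mixed, mixed.dtype)

# A nested list gives a two-dimensional array.
grid = numpy.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim, grid.shape)
```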

## Import numpy before using it

- `numpy` is **not** built in, but is often installed by default.
- Use `import numpy` to import the entire package.
- Use `from numpy import ...` to import some functions.
- Use `import numpy as np` to use the most common alias.

```
import numpy as np
import numpy
from numpy import cos
print(numpy.cos, np.cos, cos)
```

```
<ufunc 'cos'> <ufunc 'cos'> <ufunc 'cos'>
```

## Use `numpy.zeros` to create empty arrays

```
f10 = numpy.zeros(10)
i10 = numpy.zeros(10, dtype=int)
print("default array of zeros: ", f10)
print("integer array of zeros: ", i10)
```

```
default array of zeros: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
integer array of zeros: [0 0 0 0 0 0 0 0 0 0]
```
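`numpy.zeros` also accepts a tuple as the shape, which is one way to create a multi-dimensional array of zeros. A small sketch, with illustrative variable names:

```python
import numpy

# A tuple as the first argument gives the shape: 2 rows, 3 columns.
grid = numpy.zeros((2, 3))
print(grid)
print(grid.shape, grid.dtype)

# dtype works the same way as in the one-dimensional case.
counts = numpy.zeros((2, 3), dtype=int)
print(counts)
```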

## Use `numpy.ones` to create an array of ones.

```
print("Using numpy.ones : ", numpy.ones(10))
print("is the same thing as: ", numpy.zeros(10)+1)
```

```
Using numpy.ones : [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
is the same thing as: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
```

## Using `numpy.arange` to generate sets of numbers

- `arange` takes from one to three arguments. By default `arange` will generate numbers starting from 0 with a step of 1.
- `arange(N)` generates numbers from 0..N-1
- `arange(M,N)` generates numbers from M..N-1
- `arange(M,N,P)` generates numbers from M..N-1 including only every Pth number.

```
numpy.arange(10)
```

```
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

- generate an array of numbers from 1 to 9 (the stop value, 10, is not included)

```
numpy.arange(1,10)
```

```
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

- generate an array of odd numbers from 1 to 10

```
numpy.arange(1,10,2)
```

```
array([1, 3, 5, 7, 9])
```

- **incorrectly** generate an array of odd numbers from 1 to 10, backwards

```
numpy.arange(1,10,-2)
```

```
array([], dtype=int64)
```

- generate an array of even numbers from 10 to 2, backwards

```
numpy.arange(10,1,-2)
```

```
array([10, 8, 6, 4, 2])
```
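One way to see why a negative step with `start < stop` gives an empty array: for integer arguments, the number of values `arange` produces is `ceil((stop - start) / step)`, or zero when that is negative. The sketch below checks this against the examples above. `expected_length` is a hypothetical helper written for this illustration, not a numpy function.

```python
import math
import numpy

def expected_length(start, stop, step):
    """Hypothetical helper: how many values arange(start, stop, step) yields."""
    return max(0, math.ceil((stop - start) / step))

for args in [(1, 10, 2), (1, 10, -2), (10, 1, -2)]:
    values = numpy.arange(*args)
    print(args, "->", values, "length", expected_length(*args))
    # The predicted length matches what arange actually returns.
    assert len(values) == expected_length(*args)
```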

## Numpy arrays have a `shape`

- Numpy arrays have a `shape` parameter associated with them
- You can change the shape with the `reshape` method

```
a = numpy.arange(10)
print("a's shape is ",a.shape)
b=a.reshape(5,2)
print("b's shape is ",b.shape)
```

```
a's shape is (10,)
b's shape is (5, 2)
```
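A short sketch of two details of `reshape` that are easy to miss: the new shape must hold the same number of elements, and one dimension can be given as `-1` so that numpy works it out. Variable names are illustrative.

```python
import numpy

a = numpy.arange(12)

b = a.reshape(3, 4)      # 3 rows x 4 columns = 12 elements
c = a.reshape(2, -1)     # -1 asks numpy to work out the missing dimension
print(b.shape, c.shape)
print(a.shape)           # the original array keeps its shape

# The total number of elements must stay the same.
try:
    a.reshape(5, 2)      # 10 slots cannot hold 12 elements
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```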

## Numpy arrays can be treated like single numbers in arithmetic

- Arithmetic using numpy arrays is *element-by-element*
- Matrix operations are possible with functions or methods.
- The size and shape of the arrays should match.

```
a = numpy.arange(5)
b = numpy.arange(5)
print("a=",a)
print("b=",b)
print("a*b=",a*b)
print("a+b=",a+b)
```

```
a= [0 1 2 3 4]
b= [0 1 2 3 4]
a*b= [ 0 1 4 9 16]
a+b= [0 2 4 6 8]
```

```
c = numpy.ones((5,2))
d = numpy.ones((5,2)) + 100
c+d
```

```
array([[102., 102.],
       [102., 102.],
       [102., 102.],
       [102., 102.],
       [102., 102.]])
```

```
e = c.reshape(2,5)
c+e #c and e have different shapes
```

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-46-0e32881b9afe> in <module>()
1 e = c.reshape(2,5)
----> 2 c+e #c and e have different shapes
ValueError: operands could not be broadcast together with shapes (5,2) (2,5)
---------------------------------------------------------------------------
```
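The shapes (5,2) and (2,5) cannot be added element by element, but they are compatible for a matrix product. The sketch below builds `e` directly with `numpy.ones` rather than by reshaping `c`. It shows the matrix product with `numpy.dot` and the `@` operator, and shows that transposing `e` makes element-by-element addition possible again.

```python
import numpy

c = numpy.ones((5, 2))
e = numpy.ones((2, 5))

# The matrix product is defined because the inner dimensions (2 and 2) agree.
product = numpy.dot(c, e)
print(product.shape)

# The @ operator is another spelling of the same matrix product.
print(numpy.array_equal(product, c @ e))

# Transposing e gives it shape (5, 2), so it can be added to c.
total = c + e.T
print(total.shape)
```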

## The Numpy library has many functions that work on `arrays`

- Aggregation functions like `sum`, `mean`, `size`

```
a=numpy.arange(5)
print("a = ", a)
```

```
a = [0 1 2 3 4]
```

- Add all of the elements of the array together.

```
print("sum(a) = ", a.sum())
```

```
sum(a) = 10
```

- Calculate the average value of the elements in the array.

```
print("mean(a) = ", a.mean())
```

```
mean(a) = 2.0
```

- Calculate something called `std` of the array.

```
print("std(a) = ", a.std()) #what is this?
```

```
std(a) = 1.4142135623730951
```
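`std` is the standard deviation: the square root of the mean squared distance of the elements from their mean. By default numpy divides by the number of elements (the `ddof` parameter defaults to 0). A minimal sketch that reproduces the value by hand, with illustrative variable names:

```python
import numpy

a = numpy.arange(5)

# Standard deviation: square root of the mean squared distance from the mean.
deviations = a - a.mean()
manual_std = numpy.sqrt(numpy.mean(deviations ** 2))

print("manual  :", manual_std)
print("a.std() :", a.std())
```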

- Calculate the `sin` of each element in the array

```
print("np.sin(a) = ", np.sin(a))
```

```
np.sin(a) = [ 0. 0.84147098 0.90929743 0.14112001 -0.7568025 ]
```

## Check the `numpy` help and webpage for more functions

https://docs.scipy.org/doc/numpy/reference/routines.html

## Use the `axis` keyword to use the function over a subset of the data.

- Many functions take the `axis` keyword to perform the aggregation along that dimension

```
a = numpy.arange(10).reshape(5,2)
print("a=",a)
print("mean(a)=" ,numpy.mean(a))
print("mean(a,0)=",numpy.mean(a,axis=0))
print("mean(a,1)=",numpy.mean(a,axis=1))
```

```
a= [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
mean(a)= 4.5
mean(a,0)= [4. 5.]
mean(a,1)= [0.5 2.5 4.5 6.5 8.5]
```
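A way to remember which axis is which: the axis you name is the one that disappears from the result. The sketch below uses `sum` on the same 5 x 2 array and prints the shape of each result. Variable names are illustrative.

```python
import numpy

a = numpy.arange(10).reshape(5, 2)

col_sums = a.sum(axis=0)   # collapse the 5 rows: one value per column
row_sums = a.sum(axis=1)   # collapse the 2 columns: one value per row

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)
```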

## Use square brackets to access elements in the array

- A single integer in square brackets returns one element
- Ranges of data can be accessed with slices

```
a=numpy.arange(10)
```

- Access the element at index 5 (the sixth element, because indexing starts at 0)

```
a[5]
```

```
5
```

- Access elements 5 through 9 (the stop index, 10, is not included)

```
a[5:10]
```

```
array([5, 6, 7, 8, 9])
```

- Access elements from 5 to the end of the array

```
a[5:] #No second number means "rest of the array"
```

```
array([5, 6, 7, 8, 9])
```

- Access all elements from the start of the array up to, but not including, index 5 (the first five elements).

```
a[:5] #No first number means "from the start of the array"
```

```
array([0, 1, 2, 3, 4])
```

- Access every 2nd element from the 5th to the 10th

```
a[5:10:2] #A third number means "every Nth element"
```

```
array([5, 7, 9])
```

- Access every -2nd element from the 5th to the 10th. (**incorrect**)

```
a[5:10:-2] #negative numbers mean "count backwards"
```

```
array([], dtype=int64)
```

- Access every -2nd element from the 10th to the 5th. (**correct**)

```
a[10:5:-2] #but you need to start and stop in the same order
```

```
array([9, 7])
```
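A few more indexing forms follow the same rules. This is a sketch with illustrative variable names. Negative indices count from the end, omitted bounds combine with a negative step to reverse the array, and a two-dimensional array takes one index or slice per dimension, separated by a comma.

```python
import numpy

a = numpy.arange(10)
print(a[-1])        # a negative index counts from the end
print(a[::-1])      # the whole array, reversed
print(a[9:4:-2])    # from index 9 down to (not including) index 4

grid = numpy.arange(10).reshape(5, 2)
print(grid[1, 0])   # row 1, column 0
print(grid[:, 1])   # every row, column 1
```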

## Challenge 1

There is an `arange` function and a `linspace` function that take similar arguments. Explain the difference. For example, what do the following two statements print?

`print(numpy.arange(1.,9,3))`

`print(numpy.linspace(1.,9,3))`

## Solution

`arange` takes the arguments start, stop, step, and generates numbers from start to stop (excluding stop), stepping by step each time. `linspace` takes the arguments start, stop, number, and generates number evenly spaced values from start to stop (including stop).

`print(numpy.arange(1.,9,3))`

`print(numpy.linspace(1.,9,3))`

`[1. 4. 7.]`

`[1. 5. 9.]`
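The same contrast on the interval from 0 to 1, as a small sketch with illustrative variable names. `arange` is told the step and leaves out the stop value, while `linspace` is told how many values to produce and includes the stop value. The `retstep=True` argument asks `linspace` to also return the spacing it chose.

```python
import numpy

by_step = numpy.arange(0.0, 1.0, 0.25)    # step of 0.25, stop excluded
by_count = numpy.linspace(0.0, 1.0, 5)    # 5 values, stop included

print(by_step)
print(by_count)

# linspace can also report the spacing between its values.
values, spacing = numpy.linspace(1.0, 9.0, 3, retstep=True)
print(values, spacing)
```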

## Challenge 2

Generate a 10 x 3 array of random numbers (using `numpy.random.rand`). From each row, find the minimum absolute value. Make use of `numpy.abs` and `numpy.min`. The result should be a one-dimensional array.

## Solution

The important part of the solution is passing the `axis` keyword to the min function. Taking the minimum along `axis=1` collapses the three columns and leaves one value per row, so the result has 10 elements:

`a = numpy.random.rand(30).reshape(10,3)`

`print("a is ", a)`

`print()`

`print("min(a) along each row is ", numpy.min(numpy.abs(a), axis=1))`

## Use the `scipy` library for common scientific and numerical methods

- `scipy` contains functions to generate random numbers, calculate Fourier transforms, integrate
- Check the `scipy` website for more help: https://docs.scipy.org/doc/scipy/reference/

## Example : integrate y=x^2 from 0 to 10

```
import scipy.integrate

x = numpy.arange(11)
y = x**2
# by default, trapz assumes the samples are spaced 1 apart (x = 0, 1, ..., N)
# note: newer SciPy releases name this function scipy.integrate.trapezoid
print("integral of x^2 from 0 to 10", scipy.integrate.trapz(y))  # This value should be 10**3/3 = 333.333
```

```
integral of x^2 from 0 to 10 335.0
```

- Numerical integration can be imprecise with a coarse grid, so try a finer grid (this time, incorrectly, because the spacing of the `x` values is not passed to `trapz`)

```
x = numpy.linspace(0,10,1000) # finer grid
y=x**2
print("integral of x^2 from 0 to 10", scipy.integrate.trapz(y) )#This value should be 10**3/3 = 333.333
```

```
integral of x^2 from 0 to 10 33300.01668335002
```

- Passing the `x` values to `trapz` allows it to integrate correctly

```
print("integral of x^2 from 0 to 10", scipy.integrate.trapz(y,x) )#This value should be 10**3/3 = 333.333
```

```
integral of x^2 from 0 to 10 333.333500333834
```
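To see what the trapezoid rule does with the `x` values, here is a minimal hand-written version using only numpy. `trapezoid_rule` is a hypothetical helper for illustration, not the scipy implementation. It sums the area of one trapezoid per pair of neighbouring samples, which is why the spacing of `x` matters.

```python
import numpy

def trapezoid_rule(y, x):
    """Hypothetical helper: sum the trapezoid areas between neighbouring samples."""
    widths = numpy.diff(x)               # spacing between consecutive x values
    heights = 0.5 * (y[1:] + y[:-1])     # average of each pair of y values
    return numpy.sum(widths * heights)

x_coarse = numpy.arange(11)
coarse = trapezoid_rule(x_coarse ** 2, x_coarse)

x_fine = numpy.linspace(0, 10, 1000)
fine = trapezoid_rule(x_fine ** 2, x_fine)

print("coarse grid:", coarse)
print("fine grid  :", fine)
```

With the spacing taken from `x`, the finer grid moves the estimate closer to the exact value of 10**3/3.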

We’ll come back to `scipy.optimize` later.

## Key Points

- Use the numpy library to get basic statistics out of tabular data.
- Print numpy arrays.
- Use mean, sum, std to get summary statistics.
- Add numpy arrays together.
- Study the scipy website.
- Use scipy to integrate tabular data.