Introduction to Python - Numpy
4. Overview of Numpy
Numpy is a package that provides additional functionality for working with arrays, which is often useful in data science.
Typically Numpy is imported as np.
np.array() will cast a list (or other collection) to a numpy array.
You can slice an array in the same way you can slice a list.
import numpy as np
a = np.array([0, 1, 2, 3, 4, 5, 6])
print('A is of type:', type(a))
print('Print the entire array:', a)
print('Print the first value:', a[0])
print('Print the first three values:', a[0:3])
print('Print from the third value to the end of the array:', a[2:])
print('Print the last value of a numpy array:', a[-1])
print('Print everything except the last two values:', a[:-2])
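As a small sketch of the point about lists and arrays, the following compares the same slice taken from a list and from the array built out of it. The variable names here are illustrative only and do not appear elsewhere in this section.

```python
import numpy as np

values = [0, 1, 2, 3, 4, 5, 6]      # a plain Python list
arr = np.array(values)              # the same data as a numpy array
from_tuple = np.array((0, 1, 2))    # other collections, such as tuples, also work

# The same slice syntax selects the same elements in both cases.
list_slice = values[1:4]
array_slice = arr[1:4]
print(list_slice)                          # still a list
print(array_slice)                         # a numpy array
print(array_slice.tolist() == list_slice)  # same elements either way
```

The slice of a list is a list and the slice of an array is an array, but both contain the same elements.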
4.1. Arrays and Functions
A really powerful aspect of arrays is the capability to do calculations over entire arrays at once.
Numpy provides a large number of mathematical functions, which are listed in its documentation.
Often it is possible to do a calculation either directly with an operator or via an np function, as shown below.
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6])
b1=10*a
b2=np.multiply(10,a)
c1=a+b1
c2=np.add(a,b1) #This is an alternate way of adding
d=np.log(a)
e=np.sqrt(a)
f=a**2 #This squares the value.
print('Square of complex and real inputs:', np.square([-1j, 1])) #np.square also accepts complex numbers
print('Print the entire array a:', a)
print('Print the entire array b1:', b1)
print('Print the entire array b2:', b2)
print('Print the entire array c1:', c1)
print('Print the entire array c2:', c2)
print('Print the entire array d:', d)
print('Print the entire array e:', e)
print('Print the entire array f:', f)
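To make the equivalence between operators and np functions concrete, here is a minimal sketch that checks a few pairs against each other. It assumes the same array a as above; the result names (same_product and so on) are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

# Operator form and function form give the same elementwise result.
same_product = np.array_equal(10 * a, np.multiply(10, a))
same_sum = np.array_equal(a + a, np.add(a, a))
same_square = np.array_equal(a ** 2, np.square(a))
print(same_product, same_sum, same_square)

# The calculation applies to every element without an explicit loop.
squares_loop = [x ** 2 for x in a.tolist()]
print(squares_loop == (a ** 2).tolist())
```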
4.2. Creating and Manipulating Numpy Arrays
The arange function will generate an array of evenly spaced values.
reshape changes the structure of the array to n rows and m columns: a=a.reshape(n, m)
- ones will create an array with all ones.
- zeros will create an array with all zeros.
- full will create an array filled with a given value.
Reshaping can get an array into the appropriate structure, but make sure that the number of elements fits the requested dimensions (n times m must equal the array's size).
import numpy as np
a = np.arange(15)
print(a)
a2 = np.arange( 0, 15, 1 ) #Alternate specification with np.arange(start, stop, step)
print(a2)
a=a.reshape(3, 5)
print(a)
b= np.ones(shape=(3, 5), dtype=float)
print(b)
c= np.zeros(shape=(3, 5), dtype=int)
print(c)
d= np.full((3, 5), 4, dtype=int)
print(d)
e= np.arange( 0, 1.5, .1 ).reshape(3,5) #String together creations and reshaping. Also can use decimals.
print(e)
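The warning about sizes can be illustrated with a short sketch: a 15-element array reshapes to 3 by 5, but asking for 4 by 4 raises a ValueError. The use of -1 to let numpy infer one dimension is an extra that the text above does not cover.

```python
import numpy as np

a = np.arange(15)            # 15 elements
ok = a.reshape(3, 5)         # 3 * 5 == 15, so this works
print(ok.shape)

try:
    a.reshape(4, 4)          # 4 * 4 == 16, which does not match 15 elements
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('Reshape failed:', err)

# Passing -1 asks numpy to infer that dimension from the array size.
inferred = a.reshape(3, -1)
print(inferred.shape)
```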
4.3. Generating Random Numpy Data
Randomly generated data is often useful, and we will be using it to demonstrate some initial techniques.
Often you want random but repeatable results so that, for example, a test could have a consistent average on a random array. For this we need to set a seed. You only have to do this once.
np.random.seed([2335])
a = np.random.uniform(50, 150, 10) #Generate 10 values from a uniform distribution between 50 and 150
b = np.random.standard_normal(10) #Generate 10 values from a normal distribution with mean 0 and standard deviation 1
print(a)
print(b)
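What "repeatable" means can be seen in the sketch below, which sets the same seed twice and compares the two draws. Reseeding here is only for demonstration; in normal use the seed is set once, as noted above.

```python
import numpy as np

np.random.seed(2335)
first = np.random.uniform(50, 150, 10)

np.random.seed(2335)                     # reset to the same seed
second = np.random.uniform(50, 150, 10)

print(np.array_equal(first, second))     # the two draws match
print(first.min() >= 50, first.max() < 150)
```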
4.4. Combining Numpy Arrays
- concatenate will string a list of numpy arrays together end to end: np.concatenate([a,b])
- vstack will stack numpy arrays vertically, as rows.
- hstack will stack numpy arrays horizontally, side by side.
a = np.arange(5)
b=np.concatenate([a,a])
c=np.vstack([a,a])
d=np.hstack([c,c])
print('a:',a,'\nb:',b,'\nc:',c,'\nd:',d)
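The difference between these functions is easiest to see in the shapes of the results. The following sketch repeats the arrays from the example above and prints each shape.

```python
import numpy as np

a = np.arange(5)
b = np.concatenate([a, a])   # joins end to end along the existing axis
c = np.vstack([a, a])        # stacks the arrays as rows
d = np.hstack([c, c])        # joins side by side, column-wise

for name, arr in [('a', a), ('b', b), ('c', c), ('d', d)]:
    print(name, arr.shape)
```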
4.5. Slicing Single Dimension Numpy Arrays
Slicing an array can take up to three numbers, a[start:stop:step], but not all are required.
Defaults: start = 0, stop = the end of the array, and step = 1.
To print the entire array, leave start/stop/step blank: a[::]
e= np.arange( 0, 15, 1 )
print(e)
#[start:end:step]
print("This is the start, end, and step:",e[2:9:3])
print("Print every other:",e[::2])
print("Print starting at 2 and ending at 9, default step 1:",e[2:9])
print("Print all:",e[::])
print("Print all:",e[:])
print("Print all:",e)
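As a quick check of the defaults, this sketch compares slices that omit a value with slices that spell the default out explicitly. It assumes a positive step throughout.

```python
import numpy as np

e = np.arange(0, 15, 1)

# Omitted values fall back to the defaults, so these pairs are equivalent.
print(np.array_equal(e[2:9], e[2:9:1]))        # step defaults to 1
print(np.array_equal(e[:9:3], e[0:9:3]))       # start defaults to 0
print(np.array_equal(e[2::3], e[2:len(e):3]))  # stop defaults to the end
print(e[2:9:3].tolist())
```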
4.6. Numpy Arrays From External Datasets
We can take a list built from an external dataset and convert it to a numpy array.
#First let's download some data.
!wget https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/iris.csv
import csv
import numpy as np

data = []
with open('iris.csv', newline='') as f:
    csv_file_object = csv.reader(f, delimiter=',')
    header = next(csv_file_object)  # Skip the header row
    for row in csv_file_object:
        data.append(row)  # Add each row to the data list
data = np.array(data)
print(data)
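One thing worth noticing is that values read by csv.reader arrive as strings, so the resulting array has a string type until it is converted. The sketch below shows this with a tiny in-memory CSV; the sample text and its column names are made up for illustration and are not the iris data.

```python
import csv
import io
import numpy as np

# Hypothetical stand-in for a CSV file; not the real iris data.
sample = io.StringIO("length,width,label\n1.5,0.2,a\n4.7,1.4,b\n")

reader = csv.reader(sample, delimiter=',')
header = next(reader)              # first row holds the column names
rows = [row for row in reader]
table = np.array(rows)

print(header)
print(table.shape)                 # (rows, columns)
print(table.dtype.kind)            # 'U' means unicode strings
numeric = table[:, 0:2].astype(np.float64)
print(numeric.dtype)
```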
4.7. Slicing 2 Dimensional Numpy Arrays
We can slice arrays with array[row, column], where row and column each take the form start:stop:step, just as in one-dimensional arrays.
We can specify the type with .astype(np.float64). Older code uses the alias np.float_, which was removed in NumPy 2.0.
For a full list of Numpy types, see the Numpy documentation.
If we create a one dimensional array from a 2 dimensional numpy array, it will also be a numpy array of the same type.
#We can slice the array several different ways and generate new variables.
irisdata=data[0::,0:4:].astype(np.float64) #This will select only the first 4 columns and change the type to float
irisdata=data[:,0:4].astype(np.float64) #Equivalent, shorter form of the line above
iristype=data[0::,4:5:] # This will select only the type.
print(irisdata,'\n',iristype)
#This selects column 0 (the first column) and assigns it to a new variable.
newvariable=irisdata[::,0:1:]
#This will sum up column 0
final=irisdata[::,0:1:].sum()
print(type(newvariable))
#print(newvariable)
print(final)
#This will take the mean of column 0
print('mean:', irisdata[::,0:1:].mean())
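The sketch below may help clarify how row and column slices behave, using a small made-up array in place of the iris values. Note the difference between slicing a column with 0:1, which keeps two dimensions, and indexing it with 0, which gives a one dimensional array. The array m and the axis argument to sum are illustrative additions.

```python
import numpy as np

# Hypothetical 3x4 array standing in for the numeric iris columns.
m = np.arange(12, dtype=np.float64).reshape(3, 4)

col_2d = m[:, 0:1]     # a slice on the column axis keeps two dimensions
col_1d = m[:, 0]       # a single index drops that axis
print(col_2d.shape, col_1d.shape)
print(col_1d.dtype == m.dtype)     # the element type is unchanged

print(m[:, 0].sum())               # sum of column 0
print(m.sum(axis=0))               # sum of every column at once
print(m[::2, 1:3])                 # every other row, columns 1 and 2
```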
4.8. CREDITS
Copyright AnalyticsDojo 2016. This work is licensed under the Creative Commons Attribution 4.0 International license agreement.