AnalyticsDojo

Introduction to Python - Datastructures

rpi.analyticsdojo.com

3. Introduction Datastructures (Varibles, Lists, Dictionaries, and Sets)#

Common to R and Python

  • Variables

  • Opearations on Numeric and String Variables

  • Lists

Python Only

  • Dictionaries

  • Sets

3.1. Variables#

  • Single value

  • Strings, Integer, Floats and boolean are the most common types of variables.

  • Remember, under the covers they are all objects.

  • Multiple variables can be output with the print() statement.

  • \t can be used to add a tab while \n can input a new line.

a = '#pythonrules' # string
b = 30              # integer
c = True            # boolean

#This prints (1) only the variables, (2) with labels, (3) including tabs, and (4) with new lines.
print('1:', a,  b, c)
print('2:','String:', a, 'Integer:', b, 'Boolean:', c)
print('3:','String:', a, '\tInteger:', b, '\tBoolean:', c)
print('4a:','String:', a, '\n4b: Integer:', b, '\n4c: Boolean:', c)
print(a+str(b))
1: #pythonrules 30 True
2: String: #pythonrules Integer: 30 Boolean: True
3: String: #pythonrules 	Integer: 30 	Boolean: True
4a: String: #pythonrules 
4b: Integer: 30 
4c: Boolean: True
#pythonrules30

3.2. Variable Type (continued)#

  • In Python when we write b = 30 this means the value of 30 is assigned to the b object.

  • Python is a dynamically typed.

  • Unlike some languages, we don’t have to declare the type of a variable before using it.

  • Variable type can also change with the reassignment of a variable.

a = 1
print ('The value of a is ', a,  'and type ', type(a) )

a = 2.5
print ('Now the value of a is ', a,  'and type ', type(a) )

a = 'hello there'
print ('Now the value of a is ', a,  'and of type ', type(a) )
The value of a is  1 and type  <class 'int'>
Now the value of a is  2.5 and type  <class 'float'>
Now the value of a is  hello there and of type  <class 'str'>

3.3. Variable Type (continued)#

  • Variables themselves do not have a fixed type.

  • It is only the values that they refer to that have an associated type.

  • This means that the type referred to by a variable can change as more statements are interpreted.

  • If we combine types incorrectly we get an error.

#We can't add 5 to a 
b = 'string variable'
c=b+5
c
'string variable5'

3.4. The type Function#

  • We can query the type of a value using the type function.

  • Variables can be reassigned to a different type.

  • There are integer, floating point, and complex number numeric types.

  • Boolean is a special type of integer.

a = 1
type(a)
int
a = 'hello'
type(a)
str
a=2.5
type(a)
float
a=True
type(a)
bool

3.5. Converting Values Between Types#

  • We can convert values between different types.

  • To convert to string use the str() function.

  • To convert to floating-point use the float() function.

  • To convert to an integer use the int() function.

  • To convert to a boolean use the bool() function.

a = 1
print(a, type(a))

a = str(a)
print (a, type(a))

a = float(a)
print (a, type(a))

a = int(a)
print (a, type(a))
1 <class 'int'>
1 <class 'str'>
1.0 <class 'float'>
1 <class 'int'>

3.6. Converting Values Between Types (Continued)#

  • To convert to a boolean use the bool() function.

  • bool can work with a String type that is True or False

  • bool can work with an integer type that is 1 for True or 0 for False

b = 'True'
print (b, type(b))

b = bool(b)
print (b, type(b))

c = 1
c= bool(c)
print (c, type(c))

d = 0
d= bool(d)
print (d, type(d))
True <class 'str'>
True <class 'bool'>
True <class 'bool'>
False <class 'bool'>

3.7. Null Values#

  • Sometimes we represent “no data” or “not applicable”.

  • In Python we use the special value None.

  • This corresponds to NA in R of Null in Java/SQL.

  • When we print the value None is printed.

  • If we enter the variable, no result is printed out.

a = None
print(a)
None
#Notice nothing is printed.
a

3.8. Operations on Numeric Variables#

  • Python can be used as a basic calculator.

  • Check out this associated tutorial.

print('Addition:', 53 + 5)
print('Multiplication:', 53 * 5)
print('Subtraction:', 53 - 5)
print('Division', 53 / 5 )
print('Floor Division (discards the fractional part)', 53 // 5 )
print('Floor Division (returns the remainder)', 53 % 5 )
print('Exponents:', 5 ** 2 )
Addition: 58
Multiplication: 265
Subtraction: 48
Division 10.6
Floor Division (discards the fractional part) 10
Floor Division (returns the remainder) 3
Exponents: 25

3.9. Operations on String Variables#

  • Just as we can do numeric operations, we can also do operations on strings.

  • Concatentate Strings

  • A backslash is used as an escape variable.

  • More info on this tutorial.

a='Start'
b='End'
tab='\t'
newline='\n'
c='can\'t'  #Note that we have to use the Escape character '\' to inclue a apostrophe '  in the key.
cb="can't"
continueline = 'This is the first line. \
This is the second line, but we have included a line continuation character: \\'
#Note that to print the continueline character we have to list 2 (\\)
#Note that to print the continueline character we have to list 2 (\\)

contin2= """
This is the second line, but we have included a line continuation character: 
#Note that to print the continueline character we have to list 2 
#Note that to print the continueline character we have to list 2


"""



print('Concatenation:', a+b )
print('Tab:', a+tab+b )
print('Newline:', a+newline+b )
print('Apostrophe:', c )
print('Apostrophe:', cb )
print('Continue line:', continueline )
print(contin2)
Concatenation: StartEnd
Tab: Start	End
Newline: Start
End
Apostrophe: can't
Apostrophe: can't
Continue line: This is the first line. This is the second line, but we have included a line continuation character: \

This is the second line, but we have included a line continuation character: 
#Note that to print the continueline character we have to list 2 
#Note that to print the continueline character we have to list 2

3.10. Calling Functions on Variables#

  • We can call functions in a conventional way using round brackets

  • Python has a wide variety of built in functions,

a=abs(-98.45)
print('abs() takes the absolute value:', a )
a=round(a)
print('round() rounds to nearest integer:', a )
character=chr(a)
print('chr(98) returns the string representing a character whose Unicode code point is associated with the integer:',character) 

3.11. Exercise - Operations on Variables#

  1. What happens when you multiply a number times a boolean? What is the resulting type?

  2. What happens when you try to multiply an integer value times a null?

  3. Take 5 to the power of 4.

3.12. Lists#

  • Lists can be used to contain a sequence of values of any type.

  • You can do operations on lists.

  • The list values start at 0 and that the first value of a list can be printed using a[0]

  • Lists can be sliced or indexed using the start and end value a[start:end]

  • Lists are mutable datastructures, meaning that they can be changed (added to).

#Set the value of the list
a = [1, 2, 'three', 'four', 5.0]

print('Print the entire array:', a)
print('Print the first value:', a[0])
print('Print the first three value:', a[0:3])
print('Print from second value till end  of list:', a[2:])
print('Print the last value of a list:', a[-1])
print('Print up till the 2nd to last value:', a[:-2]) 
type(a)
Print the entire array: [1, 2, 'three', 'four', 5.0]
Print the first value: 1
Print the first three value: [1, 2, 'three']
Print from second value till end  of list: ['three', 'four', 5.0]
Print the last value of a list: 5.0
Print up till the 2nd to last value: [1, 2, 'three']
list

3.13. Lists#

  • Lists can be nested, where there are lists of lists.

  • The elements of a nested list is specified after the first list when slicing c[0][0]

a = [1, 2, 'three', 'four', 5.0]
b = [6, 'seven', 8, 'nine']
c = [a, b]

print('This is a list with 2 lists in it:', c)
print('This is the first list:', c[0])
print('This is the first element of the second list:', c[1][0])
This is a list with 2 lists in it: [[1, 2, 'three', 'four', 5.0], [6, 'seven', 8, 'nine']]
This is the first list: [1, 2, 'three', 'four', 5.0]
This is the first element of the second list: 6

3.14. Lists#

  • Lists can added to with the append method or your can directly assign location in list.

  • You can identify the length of a list with len(a)

  • More fuctions on lists include pop() insert() etc.

  • If you set a lista = listb this list will not be a copy but instead be the same list, where if you modify one it will modify both.

  • To create a copy of a list, you can use lista=listb[:]

b = [6, 'seven', 8, 'nine']
b.append(10)
print('We added 10 to b:', b)
print('the length of b is now:', len(b))

b[len(b):] = ['Eleven',12]
print('We added 11 to b:', b)
We added 10 to b: [6, 'seven', 8, 'nine', 10]
the length of b is now: 5
We added 11 to b: [6, 'seven', 8, 'nine', 10, 'Eleven', 12]

3.15. List#

  • If you set a lista = listb this list will not be a copy but instead be the same list, where if you modify one it will modify both.

  • To create a copy of a list, you can use lista=listb[:] or lista=listb.copy()

listb=[1,2,3,4]
listb1=[1,2,3,4]
listb2=[1,2,3,4]
#This assigns one variable to another, linking them
lista=listb
#This creates a copy
lista1=listb1[:]
lista2=listb2.copy() # This does the same thing.
#This deletes the third item in the array.
lista.pop(3)
lista1.pop(3)
lista2.pop(3)
#Notice how when we pop lista, listb is also impacted.
print(lista, listb)
#Notice how when we use a copy, listb1 is not impacted. 
print(lista1, listb1)
print(lista2, listb2)
[1, 2, 3] [1, 2, 3]
[1, 2, 3] [1, 2, 3, 4]
[1, 2, 3] [1, 2, 3, 4]

3.16. Exercise-Lists#

Hint: This list of functions on lists is useful.

  1. Create a list elists1 with the following values (1,2,3,4,5).

  2. Create a new list elists2 by first creating a copy of elist1 and then reversing the order.

  3. Create a new list elists3 by first creating a copy of elist1 and then adding 7 8 9 to the end. (Hint: Search for a different function if appending doesn’t work.)

  4. Create a new list elists4 by first creating a copy of elist3 and then insert 6 between 5 and 7.

3.17. Sets#

  • Lists can contain duplicate values.

  • A set, in contrast, contains no duplicates.

  • Sets can be created from lists using the set() function.

  • Alternatively we can write a set literal using the { and } brackets.

#This creates a set from a list. 
X = set([1, 2, 3, 3, 4])

print(X, type(X))
{1, 2, 3, 4} <class 'set'>
X = {1, 2, 3, 4, 4}
print(X, type(X))
{1, 2, 3, 4} <class 'set'>

3.18. Sets are Mutable#

  • Sets are mutable like lists (meaning we can change them)

  • Duplicates are automatically removed

X = {1, 2, 3, 4}
X.add(0)
X.add(5)
print(X)
X.add(5)
print(X)
{0, 1, 2, 3, 4, 5}
{0, 1, 2, 3, 4, 5}

3.19. Sets are Unordered#

  • Sets do not have an order.

  • Therefore we cannot index or slice them.

X[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-19c40ecbd036> in <module>()
----> 1 X[0]

TypeError: 'set' object does not support indexing

3.20. Operations on Sets#

  • Union: \(X \cup Y\) combines two sets

X = {1, 2, 3, 4}
Y = {4, 5, 6}
X.union(Y)
{1, 2, 3, 4, 5, 6}

3.21. Operations on Sets#

  • Intersection: \(X \cap Y\):

X = {1, 2, 3, 4}
Y = {3, 4, 5}
X.intersection(Y)
{3, 4}

3.22. Operations on Sets#

  • Difference \(X - Y\):

X = {1, 2, 3, 4}
Y = {3, 4, 5}
X - Y
{1, 2}

3.23. Dictionaries#

  • You can think of dictionaries as arrays that help you assocaite a key with a value.

  • Dictionaries can be specified with {key: value, key: value}

  • Dictionaries can be specified with dict([(‘key’, value), (‘key’, value)])

  • Key’s and values can be either string or numeric.

  • Dictionaries are mutable, (can be changed) adict['g'] = 41

adict1 = {'a' : 0, 'b' : 1, 'c' : 2}
adict2 = dict([(1, 'a'), (2, 'b'), (3, 'c')])
print(adict1,adict2, '\n', type(adict1),type(adict2), '\n',adict1['b'],adict2[2])
{'c': 2, 'b': 1, 'a': 0} {1: 'a', 2: 'b', 3: 'c'} 
 <class 'dict'> <class 'dict'> 
 1 b
adict2['g']=1234
adict2['g']
1234

3.24. Exercise-Sets/Dictionary#

  1. Create a set eset1 with the following values (1,2,3,4,5).

  2. Create a new set eset2 the following values (1,3,6).

  3. Create a new set eset3 that is eset1-eset2.

  4. Create a new set eset4 that is the union of eset1+eset2.

  5. Create a new set eset5 that includes values that are in both eset1 and eset2 (intersection).

  6. Create a new dict edict1 with the following keys and associated values: st1=45; st2=32; st3=40; st4=31.

  7. Create a new variable edict2 from edict 1 where the key is st3.

3.25. CREDITS#

Copyright AnalyticsDojo 2020 This work is licensed under the Creative Commons Attribution 4.0 International license agreement.

This work has been adopted from the origional version: Copyright Steve Phelps 2014