By the end of this lesson, you should be able to:
Welcome to the last lesson of this course. The first course on programming is always an amazing journey where you learn how to write instructions to use a computer for problem solving. This last lesson of the course bridges the contents covered in CP104 with the future course on data structures (CP164).
Data Structures is a dedicated course which is mainly offered to computer science students. In this introductory course on programming, we are interested in understanding the basic usage of the data structures in the Python programming language. There are four built-in data structures in Python: Lists, Tuples, Dictionaries, and Sets.
You have already seen lists, which are a common data structure available in many languages. A list may have a different name in other programming languages (such as arrays) and may have slightly varying properties, but the main purpose is similar.
A data structure is a storage system designed for data organization and management, and it facilitates efficient access to and manipulation of the data. In this lesson, we will look at the other three data structures, namely Tuples, Dictionaries and Sets.
You will note that the properties of each data structure differentiate it from the others, and hence each has a unique application area.
Let's briefly recall the main properties of a list:
Values of any data type (including other lists) can be elements of a list.
The elements of a list are ordered in a sequence, and they can be accessed using forward indexes and reverse indexes.
Let's discuss the other three data structures in this lesson.
Tuples are a sequence and are quite similar to lists. A tuple is immutable, and this is the main difference between a list and a tuple. Because the two are otherwise so similar, tuples are written with parentheses instead of square brackets to tell them apart.
The following two lines of code show how to create a list and a tuple.
my_list = [1, 2, 3, 4] # List
my_tuple = (1, 2, 3, 4) # Tuple
In fact, all the operations that can be applied to lists, except those that manipulate the elements, can be used on tuples as well. Indexes can be used for accessing elements, slicing is available to extract a sub-tuple, and built-in functions can also be used for finding the length, the maximum, and so on. Methods that are used for changing the contents of a list are not available for tuples, because of immutability.
List methods for adding elements, deleting elements or changing the order of the elements are not available for tuples.
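As a minimal sketch of the point above, the following lines show which operations a tuple supports and what happens when code tries to change one. The tuple name and its contents are illustrative assumptions and are not part of the course listings.

```python
# Hypothetical example data; the names are illustrative only.
fruits = ("apple", "banana", "cherry")

# Read-only operations work exactly as they do for lists.
first = fruits[0]          # indexing
last_two = fruits[1:]      # slicing returns a new tuple
count = len(fruits)        # built-in function for the length
largest = max(fruits)      # alphabetical maximum for strings

# Tuples have no mutating methods such as append, remove or sort.
has_append = hasattr(fruits, "append")

# Assigning to an index is rejected with a TypeError.
try:
    fruits[0] = "avocado"
    assignment_error = None
except TypeError as error:
    assignment_error = type(error).__name__

print(first, last_two, count, largest)
print(has_append, assignment_error)
```

The try block is only there so the sketch runs to the end; in a real program the TypeError would normally signal a bug to be fixed.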
It may seem by now that tuples are useless when the list is available in Python, but that is not true. Tuples offer the following advantages over a list.
list
.
Suppose you are building an application for user registration, and as part of the process the user needs to provide her country of origin. There are a finite number of countries, and once you have the 'list' of countries available, your code has no reason to modify the names. In fact, you would like to ensure that the 'list' of names is never modified by a bug (logical error) in your source code. In such a case, using a tuple is better, as the long list of countries will be processed faster using a tuple.
As a second example, let's imagine we are creating an application to calculate the student's grade in the courses he has enrolled in. We enter the marks in each assessment tool like quizzes, assignments, midterm and the final exam. The application is supposed to find the grade in multiple courses, therefore we allow the user to enter the weightage for the assessment tools for each of the courses.
Technical Note:
Unlike lists, you must use a comma when creating a tuple containing a single element; otherwise it will be a case of simple assignment. The comma after 'item 1' is missing in the second line of code below, whereas it is included in the third line of code.
my_list = ['item 1'] # Creates a list of one element
my_tuple = ('item 1') # Creates a string
my_tuple = ('item 1',) # Creates a tuple with one element
Prog 12-01 shown in Code Listing 1 below creates a simple tuple with seven elements and demonstrates the use of indexes and slicing.
# Program Name: Prog 12-01
# This program is part of Lesson 12 of the course

def main():
    thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
    str = thistuple[0]
    print(str)
    print(thistuple[2:5])
    print(thistuple[2:])
    print(thistuple[2:5000])
    print(thistuple[-15:-1])

main()
Let's go through the code line by line:
Line 5 | A tuple is created with seven elements. We know it is a tuple and not a list, as parentheses and not square brackets are used.
Lines 6, 7 | The first element of the tuple is extracted using indexing, and it is assigned to a variable named str, which is then printed on the console. We used parentheses while creating the tuple, but note that square brackets are used for slicing and specifying indexes, like in the case of lists.
Line 8 | Three elements of the tuple (those at indexes 2, 3, and 4) are extracted as a sub-tuple and printed on the console. Parentheses are used in the console to denote that it is a tuple and not a list.
Line 9 | All elements of the tuple from index 2 onwards are printed on the console.
Line 10 | The ending index is far bigger than the length of the tuple, but just like with lists, this is acceptable syntax. It is only acceptable when slicing, however. It will generate an index out of range error if you try something like the following: print(thistuple[5000])
Line 11 | Like in the case of lists, reverse indexing is also available for tuples. The starting index -15 lies before the beginning of the tuple, so the slice starts at the first element and stops just before the last one.
The output for the Code Listing 1 is shown in Console 1.
apple
('cherry', 'orange', 'kiwi')
('cherry', 'orange', 'kiwi', 'melon', 'mango')
('cherry', 'orange', 'kiwi', 'melon', 'mango')
('apple', 'banana', 'cherry', 'orange', 'kiwi', 'melon')
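The difference between slicing and indexing past the end of a tuple can be checked with a short sketch like the one below. It reuses the tuple from Prog 12-01; the other variable names are illustrative assumptions.

```python
thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")

# Slicing clamps out-of-range bounds to the ends of the tuple.
clamped = thistuple[2:5000]
empty = thistuple[5000:]      # nothing lies at or beyond index 5000

# Indexing a single position outside the tuple raises IndexError.
try:
    value = thistuple[5000]
except IndexError as error:
    value = None
    message = str(error)

print(clamped)
print(empty)
print(message)
```

A slice that falls entirely outside the tuple simply returns an empty tuple, whereas a single out-of-range index stops the program unless the error is handled.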
Python provides many functionalities to effectively use tuples in your programs. You have already seen that there are a lot of similarities in the use of tuples and lists. Python allows the conversion of lists into tuples and vice versa. Like lists, tuples can be nested.
Technical Note:
Tuples are immutable but lists are mutable. What if a tuple contains a list as an element, like below?
my_tuple = (1, 'Waterloo Region', ['WLU', 'UW'])
The tuple itself remains immutable, which means the following line of code, which tries to add an element to the tuple, will result in an error (tuples have no append method):
my_tuple.append("Ontario")
The list containing the names of higher education institutions in the Waterloo region is inside the tuple, but the list remains mutable. The following line of code is legal in Python, and will result in adding a third element to the list.
my_tuple[2].append('Conestoga')
Printing the tuple will result in the following being printed on the console:
(1, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'])
Code Listing 2 shows Prog 12-02, which creates a tuple and modifies it by converting it into a list before printing the tuple. You will see a lot of similarities with the code examples in Lessons 8 and 11.
# Program Name: Prog 12-02
# This program is part of Lesson 12 of the course

def main():
    my_tuple = (3, 'Waterloo Region', ['WLU', 'UW'])
    my_tuple[2].append('Conestoga')

    my_list = list(my_tuple)
    my_list.append("Ontario")

    my_tuple = tuple(my_list)

    print("***** Data Log *****")
    print(my_list)
    print(my_tuple)
    print("____________________\n")

    for item in my_tuple:
        if not(isinstance(item, list)):
            print(f"{item}")
        else:
            print("Higher Education Institutions")
            for subitem in item:
                print(f"\t{subitem}")

main()
Let's go over the code.
Line 5 | Creates a tuple named my_tuple, which contains 3 elements: the integer 3, the string 'Waterloo Region', and the list ['WLU', 'UW'].
Line 6 | my_tuple[2] is used to access the list inside the tuple. The list is mutable, so the append method is called on the list to insert Conestoga at the end of the list. my_tuple after line 6 looks like: (3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'])
Line 8 | We want to add 'Ontario' as the fourth element of the tuple. But the tuple is immutable, so we convert the tuple into a list named my_list. The built-in list() function is used for the conversion.
Line 9 | 'Ontario' is appended to the end of the list. The contents of my_list are: [3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario']
Line 11 | The built-in tuple() function is used to convert the list into a tuple again.
Lines 13 - 16 | my_list and my_tuple are both printed on the console. The contents are the same for both of them, but square brackets are used for the list and parentheses are used for the tuple.
Line 18 | A for loop is used to iterate over the tuple. The following elements will be assigned to the target variable named item, one in each iteration: 3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], and 'Ontario'.
Lines 19, 20 | The data type of the item is checked using the isinstance function. The isinstance function requires two arguments. The first is a variable and the second is the class to compare against. If the variable is an instance of the class given as the second argument, the function returns True. Otherwise it returns False. In this case, we check whether the item in the current iteration is a list or not. If it is not a list, then its value is printed in line 20. We do so, as we want to treat the list at index 2 differently.
Lines 21 - 24 | If the item is a list (which will be the case in the third iteration of the loop), Higher Education Institutions is printed on the console and then a for loop is used to iterate over the list. Each element of the list is printed separately on an individual line.
Console 2 shows the output for Code Listing 2.
***** Data Log *****
[3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario']
(3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario')
____________________
3
Waterloo Region
Higher Education Institutions
	WLU
	UW
	Conestoga
Ontario
We know a dictionary as a reference book that can be used to find the meaning of words. Each word exists only once in the dictionary, and its meaning, example usage and synonyms are listed against it. Some dictionaries also provide translations of words from one language to another. Words are alphabetically sorted to assist the user in their search.
A dictionary is also a data structure in the Python programming language. It is so named because of its similarities to the dictionary we already know. A dictionary has the following properties:
A dictionary has no positional index: its values are looked up by key. Therefore, it cannot be accessed using forward or backward indexes. (Since Python 3.7 a dictionary does preserve the order in which mappings were inserted, but it is still not a sequence.)
Key-value pairs are also called mappings. Within a mapping, the key is separated from its value using a colon. The following lines of code generate some dictionaries.
Explanation: dict_1 has three mappings. All the keys are strings. Two values are strings and one is an integer. Curly brackets are used to denote the start and end of a dictionary. A colon is used to separate a key from its value, whereas a comma is used to separate mappings. The dictionary is defined on multiple lines to make it easier to read. It can be defined on a single line as well.
Explanation: dict_2 has three mappings. Keys and values can be of different data types, but keys must be immutable. An integer, string, float, and tuple are all fine, but you cannot use a list, which is mutable, as a key. There is no such restriction for values. dict_2 uses an integer, a float, and a tuple as keys and has a list, a tuple and a float as values.
Explanation: dict_3 is the same as the dictionary named dict_2 above, but it is defined on a single line.
Explanation: dict_4 is a valid empty dictionary.
Explanation: dict_5 is invalid as it tries to use a list as a key.
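The five explanations above can be matched to code along the following lines. This is only a sketch: the concrete keys and values are assumptions chosen to fit each description, and the try block around dict_5 is added so that the sketch runs to the end.

```python
# dict_1: three mappings, string keys, two string values and one integer.
dict_1 = {
    "team": "Toronto Raptors",
    "rank": "winners",
    "year": 2019
}

# dict_2: an integer, a float and a tuple as keys;
# a list, a tuple and a float as values (illustrative data).
dict_2 = {
    1: ["a", "b"],
    2.5: (10, 20),
    (3, 4): 7.5
}

# dict_3: the same mappings as dict_2, written on a single line.
dict_3 = {1: ["a", "b"], 2.5: (10, 20), (3, 4): 7.5}

# dict_4: an empty dictionary.
dict_4 = {}

# dict_5: a list is mutable, so using it as a key raises a TypeError.
try:
    dict_5 = {["x", "y"]: "invalid"}
except TypeError as error:
    dict_5 = None
    print(f"dict_5 could not be created: {error}")

print(dict_2 == dict_3)
print(len(dict_4))
```

Writing a dictionary on one line or on several lines makes no difference to Python, which is why dict_2 and dict_3 compare as equal.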
Code Listing 3 shows Prog 12-03, which demonstrates the basic operations of dictionaries. The program is not intended to perform anything meaningful.
# Program Name: Prog 12-03
# This program is part of Lesson 12 of the course

def main():
    a_dictionary = {
        "team": "Toronto Raptors",
        "rank": "winners",
        "year": 2019
    }

    rank = a_dictionary["rank"]
    print(f"Value '{rank}' has the datatype {type(rank)}")

    year = a_dictionary.get("year")
    print(f"Value '{year}' has the datatype {type(year)}")

    k = "year"
    v = 19
    a_dictionary[k] = v

    if ("country" not in a_dictionary):
        a_dictionary["country"] = "Canada"

    a_copy = a_dictionary.copy()

    v = a_copy.pop("team")
    print(f"{'team'}: {v}")

    k, v = a_copy.popitem()
    print("{}: {}".format(k, v))

    a_copy.clear()

    print()
    print(f"a_copy:\t\t{a_copy}")
    print(f"a_dictionary:\t{a_dictionary}")

    a_temp = {"rank": "unknown", "year": 20, "league": "NBA"}
    a_dictionary.update(a_temp)
    print("updating ... ")
    print(f"a_dictionary:\t{a_dictionary}")

main()
Line 5 | A dictionary named a_dictionary is created with three mappings.
Lines 11, 12 | The value for the key rank is retrieved and stored in a variable named rank. rank and its datatype are printed on the console.
Lines 14, 15 | The get method is an alternate way to retrieve a value given a specific key. The get method is called on the dictionary, and the key is passed as an argument. The method returns the value stored against that key. The value is assigned to a variable named year. year and its datatype are printed on line 15.
Lines 17 - 19 | The year is mentioned as 2019 in the dictionary, but we want to change it to 19. This is possible, as dictionaries are mutable. Two variables called k and v are initialized with the values year and 19 respectively. The value of k is used to update 2019 to 19. Again, the syntax is similar to that of lists. The difference is that in the case of lists we mention the index inside the square brackets, whereas a key is used for dictionaries.
Lines 21, 22 | The in and not in operators can be used to check the presence of a certain key in a dictionary. Line 22 will be executed, as no key named country exists in the dictionary. A new mapping with the key country and value Canada is added to a_dictionary.
Line 24 | A copy of a_dictionary is created with the name a_copy. The copy method creates a shallow copy: a_copy is a new dictionary object holding the same mappings, so adding or removing mappings in a_copy does not affect a_dictionary. This is different from copying by reference, where both names refer to the same object. (See section 8.4 of Lesson 8 for copying by reference)
Lines 26, 27 | The pop method is used to remove the key team and its value from a_copy. The value is returned by the pop method and assigned to the variable named v. It is printed on the console.
Lines 29, 30 | The popitem method removes the most recently entered key value pair from the dictionary and returns it. Two variables k and v are used in line 29, and they will be assigned the key and the value respectively.
Line 32 | The clear method is used to remove all the key value pairs from the dictionary. a_copy becomes an empty dictionary.
Lines 35, 36 | a_copy and a_dictionary are both printed on the console.
Line 38 | Another dictionary named a_temp is created with three mappings. Two of the keys in this new dictionary are the same as in a_dictionary, whereas there is one unique key named league as well.
Line 39 | a_dictionary is updated using a_temp. Common keys will be updated with new values from a_temp, whereas the unique mapping will be copied into a_dictionary too.
Line 41 | a_dictionary is printed on the console.
Technical Note:
In earlier versions of Python (before Python 3.7), the popitem method returned an arbitrary mapping from the dictionary rather than the one entered most recently.
Console 3 shows the output of Code Listing 3.
Value 'winners' has the datatype <class 'str'>
Value '2019' has the datatype <class 'int'>
team: Toronto Raptors
country: Canada
a_copy: {}
a_dictionary: {'team': 'Toronto Raptors', 'rank': 'winners', 'year': 19, 'country': 'Canada'}
updating ...
a_dictionary: {'team': 'Toronto Raptors', 'rank': 'unknown', 'year': 20, 'country': 'Canada', 'league': 'NBA'}
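Two details of Prog 12-03 are easier to see in isolation: how square brackets and get differ when a key is missing, and what a shallow copy does and does not share. The sketch below is illustrative only; the roster dictionary and its contents are hypothetical and do not appear in the course listings.

```python
a_dictionary = {"team": "Toronto Raptors", "rank": "winners", "year": 2019}

# Square brackets raise a KeyError when the key is missing.
try:
    country = a_dictionary["country"]
except KeyError:
    country = None

# get returns None, or a supplied default, instead of raising an error.
missing = a_dictionary.get("country")
with_default = a_dictionary.get("country", "unknown")

# copy() builds a new dictionary, so removing a key from the copy
# leaves the original untouched.
a_copy = a_dictionary.copy()
a_copy.pop("team")

# The copy is shallow: a mutable value is shared by both dictionaries.
roster = {"players": ["A"]}            # hypothetical data
roster_copy = roster.copy()
roster_copy["players"].append("B")     # also visible through roster

print(country, missing, with_default)
print("team" in a_dictionary, "team" in a_copy)
print(roster)
```

Checking with in before using square brackets, as Prog 12-03 does, is another way to avoid the KeyError.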
for loops are specialized loops that make iterating over a dictionary very easy. We can iterate over the keys in the dictionary, the values in the dictionary, or the key value pairs in the dictionary.
The values method can be used to access a sequence of all the values in the dictionary. The items method, on the other hand, returns a sequence of all key value pairs in the dictionary.
Let's use these methods to iterate over a dictionary three times. See Prog 12-04 in Code Listing 4.
# Program Name: Prog 12-04
# This program is part of Lesson 12 of the course

def main():
    a_dictionary = {
        "team": "Toronto Raptors",
        "rank": "winners",
        "year": 2019
    }

    print("Loop 1")
    for k in a_dictionary:
        print(f"{k}")

    print("\nLoop 2")
    for v in a_dictionary.values():
        print(f"{v}")

    print("\nLoop 3")
    for k, v in a_dictionary.items():
        print(f"{{{k}, {v}}}")

main()
The first loop iterates over the keys in the dictionary. One key is assigned to the variable named k in each iteration, and it is printed on the console. One value from the dictionary is assigned to the target variable in each iteration of the second for loop in the program. All the values are printed on the console by this loop.
In the third for loop, the items method is used to retrieve all the key value pairs from the dictionary. One mapping is assigned to the variables k and v in each iteration and then printed on the console. The output of the program is shown in Console 4.
Loop 1
team
rank
year
Loop 2
Toronto Raptors
winners
2019
Loop 3
{team, Toronto Raptors}
{rank, winners}
{year, 2019}
Sets in the Python language are similar to mathematical sets. They are specifically designed to facilitate mathematical set operations. Sets have the following properties.
Every element in a set is unique.
There is no order among the elements of a set. Forward or backward indexing cannot be used for retrieving values.
Only immutable values can be elements of a set.
The set in itself is mutable though.
The following lines of code can be used to create sets:
Explanation: A set is created with three elements.
Explanation: An empty set is created using the built-in set function.
Explanation: A set containing an integer, a string, and a tuple is created. All three elements are immutable, so they can be in a set.
Explanation: A TypeError will be raised when the line runs, as we are trying to include a list as an element of a set. The list, being mutable, is not a valid element for a set.
Explanation: A set is created with three elements. As every element must be unique in a set, the duplicate values will be ignored.
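The five explanations above correspond to code along these lines. The element values and the names set_1 to set_5 are assumptions made for illustration, and the try block is added only so the sketch runs to the end.

```python
# A set with three elements.
set_1 = {"apple", "banana", "cherry"}

# An empty set needs set(); a bare {} would create an empty dictionary.
set_2 = set()

# An integer, a string and a tuple are all immutable, so this is valid.
set_3 = {1, "two", (3, 4)}

# A list is mutable, so it is rejected at run time with a TypeError.
try:
    set_4 = {1, "two", [3, 4]}
except TypeError as error:
    set_4 = None
    print(f"set_4 could not be created: {error}")

# Five values are written, but duplicates are dropped.
set_5 = {"apple", "banana", "cherry", "apple", "banana"}

print(len(set_1), len(set_2), len(set_3), len(set_5))
print(type({}))
```

The last line prints the type of a bare pair of curly brackets, which confirms that it is a dictionary and not a set.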
A dictionary and a set both use curly brackets. They can be differentiated in the console output, as a dictionary contains key value pairs separated by a colon, whereas a set has single elements separated by commas. The only problem may occur when there is an empty set or an empty dictionary. Therefore, the Python programming language displays them in the following manner to assist in recognition.
{} # Curly brackets mean an empty dictionary is displayed
set() # The word “set” followed by a pair of parentheses is used to show an empty set.
Prog 12-05 shown in Code Listing 5 performs some basic set operations.
# Program Name: Prog 12-05
# This program is part of Lesson 12 of the course

def main():
    recipe1 = {"apple", "banana", "cherry"}
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.update(["orange", "mango", "grapes"])
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.update(["apple"])
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    if ("banana" in recipe1):
        recipe1.remove("banana")
    recipe1.discard("guava")
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    item = recipe1.pop()
    print(f"Removed: {item}")
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.clear()
    print(f"{len(recipe1)} Ingredients: {recipe1}")

main()
The number of elements in the set (number of ingredients) and the elements themselves are printed in lines 6, 9, 12, 17, 21 and 24. The console output will help you understand the functionality of the various methods used in the code. The line by line code explanation is given below:
Line 5 | A set is created with three elements. The elements in a set are unordered, and the order may vary from one execution of the program to another. Console 5 and 6 show two executions of the same program. Note that the order of elements is different in the first line of the outputs.
Line 8 | The update method is called to add three more elements to the set. The update method is passed a single list. All the elements in the list are extracted and added to the set named recipe1. recipe1 has 6 elements now.
Line 11 | The update method is called in an attempt to add a seventh element. A list containing a single element is passed for inclusion in the set, but as the element already exists in the set, it is ignored. recipe1 still has only 6 elements.
Lines 14, 15 | banana is removed from the set in line 15. The remove method raises an error (a KeyError) if the element you are trying to remove does not exist in the set. We ensure in line 14 that the element exists in the set before attempting to remove it.
Line 16 | The discard method is used to remove guava. We did not check whether guava exists or not before trying to discard it, as the discard method does not raise an error if the element does not exist. guava is not present in the set, so there is no change.
Lines 19, 20 | The pop method is used to remove an element from the set. The elements in the set are not ordered, therefore an arbitrary element will be removed. See the outputs of this program in Console 5 and 6. orange was popped in the first execution of the program and cherry was popped in the second execution.
Line 23 | The clear method removes all the elements from a set.
Console 5 and 6 show output for two executions of the Prog 12-05.
3 Ingredients: {'banana', 'apple', 'cherry'}
6 Ingredients: {'banana', 'orange', 'mango', 'grapes', 'cherry', 'apple'}
6 Ingredients: {'banana', 'orange', 'mango', 'grapes', 'cherry', 'apple'}
5 Ingredients: {'orange', 'mango', 'grapes', 'cherry', 'apple'}
Removed: orange
4 Ingredients: {'mango', 'grapes', 'cherry', 'apple'}
0 Ingredients: set()
3 Ingredients: {'cherry', 'banana', 'apple'}
6 Ingredients: {'cherry', 'mango', 'banana', 'apple', 'orange', 'grapes'}
6 Ingredients: {'cherry', 'mango', 'banana', 'apple', 'orange', 'grapes'}
5 Ingredients: {'cherry', 'mango', 'apple', 'orange', 'grapes'}
Removed: cherry
4 Ingredients: {'mango', 'apple', 'orange', 'grapes'}
0 Ingredients: set()
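The difference between remove and discard described for lines 14 to 16 can be observed directly with a sketch like the following. The set name and its contents are illustrative assumptions.

```python
recipe = {"apple", "cherry"}

# discard is silent when the element is absent.
recipe.discard("guava")
size_after_discard = len(recipe)

# remove raises a KeyError when the element is absent.
try:
    recipe.remove("guava")
    remove_error = None
except KeyError as error:
    remove_error = type(error).__name__

# pop removes and returns some element; which one is not specified.
popped = recipe.pop()

print(size_after_discard, remove_error)
print(popped, len(recipe))
```

Because pop gives no guarantee about which element it returns, code should not rely on a particular value coming out of it.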
Basic set operations from mathematics can be easily applied to a set data structure in Python. Prog 12-06 shows the usage of the intersection, union, difference and symmetric difference methods.
# Program Name: Prog 12-06
# This program is part of Lesson 12 of the course

def main():
    course_cs104 = set(['Yousuf', 'Maria', 'Haaniya', 'Ajeet'])
    course_cs164 = set(['Eva', 'Haaniya', 'Alicia', 'Yousuf'])

    print('Students enrolled in CS104:\t', end = ' ')
    for name in course_cs104:
        print(name, end = ', ')
    print("\n")

    print('Students enrolled in CS164:\t', end = ' ')
    for name in course_cs164:
        print(name, end = ', ')
    print("\n")

    print('Students enrolled in both:\t', end = ' ')
    print(course_cs104.intersection(course_cs164))

    print('\nStudents enrolled in either:\t', end = ' ')
    print(course_cs104.union(course_cs164))

    print('\nStudents in 104 but not 164:\t', end = ' ')
    print(course_cs104.difference(course_cs164))

    print('\nStudents in 164 but not 104:\t', end = ' ')
    print(course_cs164.difference(course_cs104))

    print('\nStudents in only one course:\t', end = ' ')
    print(course_cs104.symmetric_difference(course_cs164))

main()
Prog 12-06 starts by creating two sets. The sets named course_cs104 and course_cs164 contain the names of the students enrolled in the respective courses. Two for loops are used to print the elements of each set.
The intersection method is called in line 19 to find the names of the students who are present in both the courses. The union method in line 22 returns the names of all the students present in either or both of the sets. The names of the students present in both the sets are listed just once.
The difference method is used twice. First, to find the students present in only CS 104 (line 25), and second, to find the students in only CS 164 (line 28). The symmetric_difference method returns the names of the students who are present in only one of the two sets. All of the methods described above return the result as a set. This is evident from the curly brackets in the output.
Console 7 shows the output of Code Listing 6.
Students enrolled in CS104: Maria, Ajeet, Yousuf, Haaniya,
Students enrolled in CS164: Yousuf, Haaniya, Alicia, Eva,
Students enrolled in both: {'Yousuf', 'Haaniya'}
Students enrolled in either: {'Alicia', 'Maria', 'Ajeet', 'Yousuf', 'Haaniya', 'Eva'}
Students in 104 but not 164: {'Maria', 'Ajeet'}
Students in 164 but not 104: {'Alicia', 'Eva'}
Students in only one course: {'Alicia', 'Maria', 'Ajeet', 'Eva'}
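The same four results can also be obtained with Python's set operators, which some readers find closer to the mathematical notation. The sketch below reuses the two sets from Prog 12-06; the result variable names are illustrative assumptions.

```python
course_cs104 = {'Yousuf', 'Maria', 'Haaniya', 'Ajeet'}
course_cs164 = {'Eva', 'Haaniya', 'Alicia', 'Yousuf'}

both = course_cs104 & course_cs164        # same as intersection
either = course_cs104 | course_cs164      # same as union
only_104 = course_cs104 - course_cs164    # same as difference
only_one = course_cs104 ^ course_cs164    # same as symmetric_difference

# sorted() returns a list, which gives a stable order for printing.
print(sorted(both))
print(sorted(either))
print(sorted(only_104))
print(sorted(only_one))
```

One difference worth knowing: the methods accept any iterable, such as a list, as their argument, whereas the operators require both sides to be sets.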
In this lesson, we learnt the difference between Tuples, Dictionaries and Sets in Python and how we can use them.
This brings us to the end of this first course on programming. I hope all the students made a lot of progress and developed an interest in Python. Python is one of the most used languages in software development and is frequently used for data analytics, scientific computing, IoT (Internet of Things), and web development. This course is designed to equip you with the basics of computer programming that you can use to learn more about Python.
You are encouraged to consult the syllabus for clarity about the deadlines for preparation activities, and assignments. Please regularly visit MyLearningSpace for announcements, course material, and discussion boards. The schedule for the exam will be announced separately.