
Tasks

  1. Read and complete the activities at Zybook, Z12: Tuples, Dictionaries and Sets.
  2. Complete reading this lesson, which involves the following:
    1. Data Structures
    2. Tuples
    3. Dictionary
    4. Sets
  3. Complete quiz 11 to verify your reading and understanding.

Learning Outcomes

By the end of this lesson, you should be able to:

  1. Differentiate between tuples, dictionaries, and sets.
  2. Use tuples, dictionaries, and sets in Python programs.
  3. Discuss how to choose a data structure to solve a given problem.

Key Terms/Concepts

Introduction

Welcome to the last lesson of this course. The first course on programming is always an amazing journey where you learn how to write instructions to use a computer for problem solving. This last lesson of the course bridges the contents covered in CP104 with the future course on data structures (CP164).

Data Structures is a dedicated course that is mainly offered to computer science students. In this introductory course on programming, we are interested in understanding the basic usage of data structures in the Python programming language. There are four built-in data structures in Python: Lists, Tuples, Dictionaries, and Sets.

You have already seen the list, which is a common data structure available in many languages. It may have a different name in other programming languages (such as array) and slightly different properties, but its main purpose is similar.

A data structure is a storage system designed for data organization and management, and it facilitates efficient access to and manipulation of the data. In this lesson, we will look at three data structures, namely Tuples, Dictionaries and Sets. You will note that the properties of each data structure differentiate it from the others, and hence each has its own application area.

Let's briefly recall the main properties of a list:
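As a quick refresher, the short sketch below illustrates three list properties covered in earlier lessons: a list keeps its elements in order, it can be changed in place, and it may hold duplicate values. The variable name cities and its contents are only an illustration.

```python
# A list is ordered, mutable, and allows duplicate elements.
cities = ["Waterloo", "Kitchener", "Waterloo"]  # duplicates are allowed

cities.append("Cambridge")   # mutable: elements can be added
cities[0] = "Guelph"         # mutable: elements can be replaced by index

print(cities)        # the remaining elements keep their positions
print(cities[1])     # elements are accessed by index
```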

Let's discuss the other three data structures in this lesson.  

Tuples

Tuples are sequences and are quite similar to lists. The main difference is that a tuple is immutable: once created, its elements cannot be changed. To tell the two apart, tuples are written with parentheses instead of square brackets. The following two lines of code show how to create a list and a tuple.


my_list = [1, 2, 3, 4]  # List
my_tuple = (1, 2, 3, 4) # Tuple

In fact, all the operations that can be applied to lists, except those that modify the elements, can be used on tuples as well. Indexes can be used for accessing elements, slicing is available to extract a sub-tuple, and built-in functions such as len and max can be used to find the length, the maximum, and so on. Methods that change the contents of a list are not available for tuples, because tuples are immutable.
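The following minimal sketch illustrates the point. The tuple contents are arbitrary; the read-only operations succeed, while the attempt to assign to an index is rejected with a TypeError.

```python
my_tuple = (4, 1, 3, 2)

# Read-only operations work exactly as they do for lists.
print(len(my_tuple))       # 4
print(max(my_tuple))       # 4
print(my_tuple[1:3])       # (1, 3)
print(my_tuple.index(3))   # 2, the position of the value 3
print(my_tuple.count(1))   # 1, the value 1 appears once

# Any attempt to change an element raises a TypeError.
try:
    my_tuple[0] = 99
    changed = True
except TypeError as error:
    changed = False
    print(f"Not allowed: {error}")
```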

List methods for adding elements, deleting elements, or changing the order of the elements cannot be used on tuples. By now it may seem that tuples are pointless when lists are available in Python, but that is not true. Tuples offer the following advantages over a list.

12.2.1 Application of tuples

Suppose you are making an application for user registration, and as part of the process the user needs to provide her country of origin. There is a finite number of countries, and once you have the 'list' of countries available, your code has no reason to modify the names. In fact, you will want to ensure that the 'list' of names is never modified by a bug (logical error) in your source code. In such a case, a tuple is the better choice: it protects the names from accidental changes, and the long sequence of countries will be processed faster as a tuple.

As a second example, let's imagine we are creating an application to calculate the student's grade in the courses he has enrolled in. We enter the marks in each assessment tool like quizzes, assignments, midterm and the final exam. The application is supposed to find the grade in multiple courses, therefore we allow the user to enter the weightage for the assessment tools for each of the courses.
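One possible way to use a tuple in the grade example is sketched below. It assumes that the weightage for a course stays fixed once it has been entered, so it is stored in a tuple, while the marks are kept in a list because they may still be corrected. The function name weighted_grade, the weights, and the marks are all made up for illustration.

```python
def weighted_grade(marks, weights):
    """Return the final percentage given marks (out of 100) and weights (in %)."""
    total = 0.0
    for i in range(len(marks)):
        total += marks[i] * weights[i] / 100
    return total

# Order: quizzes, assignments, midterm, final exam (hypothetical values).
weights = (10, 30, 20, 40)   # tuple: should not change after it is set
marks = [80, 90, 70, 60]     # list: marks may be corrected later

print(f"Final grade: {weighted_grade(marks, weights):.1f}%")   # Final grade: 73.0%
```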

Technical Note:

Unlike lists, you must use a comma when creating a tuple containing a single element. Otherwise, the parentheses are treated as ordinary grouping and the variable is simply assigned the element itself. The comma after 'item 1' is missing in the second line of code below, whereas it is included in the third line of code.


my_list = ['item 1']    # Creates a list of one element
my_tuple = ('item 1')   # Creates a string 
my_tuple = ('item 1',)  # Creates a tuple with one element
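The built-in type function can confirm the difference. The variable names below are chosen only for this small check.

```python
not_a_tuple = ('item 1')    # parentheses alone only group the expression
one_tuple = ('item 1',)     # the trailing comma makes it a tuple

print(type(not_a_tuple))    # <class 'str'>
print(type(one_tuple))      # <class 'tuple'>
print(len(one_tuple))       # 1
```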

Prog 12-01 shown in Code Listing 1 below creates a simple tuple with seven elements and demonstrates the use of indexes and slicing.


# Program Name: Prog 12-01
# This program is part of Lesson 12 of the course

def main():
    thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
    str = thistuple[0]
    print(str)
    print(thistuple[2:5])
    print(thistuple[2:])
    print(thistuple[2:5000])
    print(thistuple[-15:-1])

main()

Code Listing 1: Prog 12-01

Let's go through the code line by line:

Line 5 A tuple is created with seven elements. We know it is a tuple and not a list, as parentheses and not square brackets are used.
Lines 6, 7 The first element of the tuple is extracted using indexing, and it is assigned to a variable named str, which is then printed on the console. We used parentheses while creating the tuple, but note that square brackets are used for slicing and specifying indexes, as in the case of lists. (str is also the name of Python's built-in string type, so in your own programs a different variable name is a safer choice.)
Line 8 Three elements of the tuple (those at indexes 2, 3, and 4) are extracted as a sub-tuple and printed on the console. Parentheses are used in the console to denote that it is a tuple and not a list.
Line 9 All elements of the tuple starting from element at index 2 onwards are printed on the console.
Line 10 The ending index is far larger than the length of the tuple, but just as with lists, this is acceptable syntax. It is only acceptable when slicing, however. Accessing a single element with an out-of-range index raises an IndexError, as in the following:
print(thistuple[5000])
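Line 11 As in the case of lists, negative indexing is also available for tuples. The starting index -15 lies before the beginning of the tuple, so the slice starts at the first element and stops just before the last one.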

The output of Code Listing 1 is shown in Console 1.


apple
('cherry', 'orange', 'kiwi')
('cherry', 'orange', 'kiwi', 'melon', 'mango')
('cherry', 'orange', 'kiwi', 'melon', 'mango')
('apple', 'banana', 'cherry', 'orange', 'kiwi', 'melon')

Console 1: Output of Code Listing 1.
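The difference between slicing and indexing with an out-of-range position, mentioned in the walkthrough of Prog 12-01, can be checked with a short experiment such as the one below. The tuple named fruits is a shortened, illustrative version of the one in the listing.

```python
fruits = ("apple", "banana", "cherry")

print(fruits[1:5000])      # ('banana', 'cherry'): the slice stops at the end
print(fruits[-15:-1])      # ('apple', 'banana'): the start is adjusted as well

try:
    print(fruits[5000])    # a single index must refer to an existing element
    index_ok = True
except IndexError as error:
    index_ok = False
    print(f"IndexError: {error}")
```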

Python provides many features for using tuples effectively in your programs. You have already seen that tuples and lists are used in very similar ways. Python allows the conversion of lists into tuples and vice versa. Like lists, tuples can be nested.

Technical Note:

Tuples are immutable but lists are mutable.

What if a tuple contains a list as an element, as shown below?


my_tuple = (1, 'Waterloo Region', ['WLU', 'UW'])

The tuple remains immutable, which means the following line of code, which tries to add an element to the tuple, will result in an error (an AttributeError, because tuples have no append method):


my_tuple.append("Ontario")

The list containing the names of higher education institutions in the Waterloo region is inside the tuple, but the list remains mutable. The following line of code is legal in Python, and will result in adding a third element to the list.


my_tuple[2].append('Conestoga') 

Printing the tuple will now display the following on the console:


(1, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'])

Code Listing 2 shows Prog 12-02, which creates a tuple, modifies it by converting it into a list and back again, and then prints the result. You will see a lot of similarities with the code examples in Lessons 8 and 11.


# Program Name: Prog 12-02
# This program is part of Lesson 12 of the course

def main():
    my_tuple = (3, 'Waterloo Region', ['WLU', 'UW'])
    my_tuple[2].append('Conestoga')

    my_list = list(my_tuple)
    my_list.append("Ontario")

    my_tuple = tuple(my_list)

    print("***** Data Log *****")
    print(my_list)
    print(my_tuple)
    print("____________________\n")

    for item in my_tuple:
        if not(isinstance(item, list)):
            print(f"{item}")
        else:
            print("Higher Education Institutions")
            for subitem in item:
                print(f"\t{subitem}")

main()

Code Listing 2: Prog 12-02

Let's go over the code.

Line 5 Creates a tuple named my_tuple, which contains 3 elements:
  • An integer at index 0
  • A string called Waterloo Region at index 1
  • A list at index 2. The list has two elements. Both are strings.
Line 6 my_tuple[2] is used to access the list inside the tuple. The list is mutable, so the append method is called on the list to insert Conestoga at the end of the list. my_tuple after line 6 looks like:
(3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'])
Line 8 We want to add 'Ontario' as the fourth element of the tuple. But the tuple is immutable, so we convert the tuple into a list named my_list. The built-in list() function is used for the conversion.
Line 9 'Ontario' is appended to the end of the list. The contents of my_list are:
[3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario']
Line 11 The built-in tuple() function is used to convert the list into a tuple again.
Lines 13 - 16 my_list and my_tuple are both printed on the console. The contents are the same for both of them, but square brackets are used for the list and parentheses are used for the tuple.
Line 18 A for loop is used to iterate over the tuple. The following elements will be assigned to the target variable named item in each iteration:
  • iteration 1: item = 3
  • iteration 2: item = 'Waterloo Region'
  • iteration 3: item = ['WLU', 'UW', 'Conestoga']
  • iteration 4: item = 'Ontario'
Lines 19, 20 The data type of the item is checked using the built-in isinstance function.
The isinstance function requires two arguments. The first is a variable and the second is the class to compare against. If the variable is an instance of the class given as the second argument, the function returns True. Otherwise it returns False.
In this case, we check whether the item in the current iteration is a list or not. If it is not a list, then its value is printed in line 20. We do so because we want to treat the list at index 2 differently.
Lines 21 - 24 If the item is a list (which will be in the case of the third iteration of the loop), Higher Education Institutions is printed on the console and then a for loop is used to iterate over the list. Each element of the list is printed separately on an individual line.

Console 2 shows the output for Code Listing 2.


***** Data Log *****
[3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario']
(3, 'Waterloo Region', ['WLU', 'UW', 'Conestoga'], 'Ontario')
____________________

3
Waterloo Region
Higher Education Institutions
    WLU
    UW
    Conestoga
Ontario

Console 2: Output of Code Listing 2.

Dictionaries

We know a dictionary as a reference book that can be used to find the meaning of words. Each word exists only once in the dictionary, and its meaning, example usage and synonyms are listed against it. Some dictionaries also provide translations of words from one language to another. Words are sorted alphabetically to assist the user in their search.

A dictionary is also a data structure in the Python programming language. It is named because of its similarities to the dictionary we already know. A dictionary has the following properties:

Key-value pairs are also called mappings. Within a pair, the key is separated from its value by a colon, and the pairs are separated from each other by commas. The following lines of code create some dictionaries.


dict_1 =  {   
                "team": "Toronto Raptors", 
                "rank": "winners",
                "year": 2019
          }

Explanation: dict_1 has three mappings. All the keys are strings. Two values are strings and one is an integer. Curly brackets are used to denote the start and end of a dictionary. A colon is used to separate a key from its value, whereas a comma is used to differentiate between mappings. The dictionary is defined on multiple lines to make it easier to read. It can be defined on a single line as well.

dict_2 =  {   
                 1: [3, 4], 
                 4.5: (5, 6),
                 (1, 2): 7.7
          }

Explanation: dict_2 has three mappings. Keys and values can be of different data types, but keys must be immutable. An integer, string, float, or tuple is fine, but you cannot use a list, which is mutable, as a key. There is no such restriction for values. dict_2 uses an integer, a float, and a tuple as keys and has a list, a tuple and a float as values.

dict_3 =  { 1: [3, 4], 4.5: (5, 6), (1, 2): 7.7 }

Explanation: dict_3 is the same as the dictionary named dict_2 above, but it is defined on a single line.

dict_4 = {}

Explanation: dict_4 is a valid empty dictionary.

dict_5 = {"one": 1, [1, 2]: "two"}

Explanation: dict_5 is invalid as it tries to use a list as a key. Python raises a TypeError when this line is executed.
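The restriction on keys can be observed with a small experiment. The sketch below wraps the invalid definition in try/except so that the program keeps running; the name dict_6 is introduced here only for illustration and shows that a tuple with the same contents is accepted.

```python
try:
    dict_5 = {"one": 1, [1, 2]: "two"}   # a list cannot be used as a key
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# A tuple with the same contents is immutable, so it is accepted.
dict_6 = {"one": 1, (1, 2): "two"}
print(dict_6[(1, 2)])   # two
```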

12.3.1 Basic Dictionary Manipulation

Code Listing 3 shows Prog 12-03, which demonstrates the basic operations of dictionaries. The program is not intended to perform anything meaningful.


# Program Name: Prog 12-03
# This program is part of Lesson 12 of the course

def main():
    a_dictionary = {
        "team": "Toronto Raptors",
        "rank": "winners",
        "year": 2019
    }

    rank = a_dictionary["rank"]
    print(f"Value '{rank}' has the datatype {type(rank)}")

    year = a_dictionary.get("year")
    print(f"Value '{year}' has the datatype {type(year)}")

    k = "year"
    v = 19
    a_dictionary[k] = v

    if ("country" not in a_dictionary):
        a_dictionary["country"] = "Canada"

    a_copy = a_dictionary.copy()

    v = a_copy.pop("team")
    print(f"{'team'}: {v}")

    k, v = a_copy.popitem()
    print("{}: {}".format(k, v))

    a_copy.clear()

    print()
    print(f"a_copy:\t\t{a_copy}")
    print(f"a_dictionary:\t{a_dictionary}")

    a_temp = {"rank": "unknown", "year": 20, "league": "NBA"}
    a_dictionary.update(a_temp)
    print("updating ... ")
    print(f"a_dictionary:\t{a_dictionary}")

main()

Code Listing 3: Prog 12-03
Line 5 A dictionary named a_dictionary is created with three mappings.
Lines 11, 12 The value for the key rank is retrieved and stored in a variable named rank.
  • Square brackets are used to specify the key. The syntax is similar to the lists and tuples.
  • The variable named rank and the key "rank" are two different things, so both can exist with the same name.
The value assigned to rank and its data type are printed on the console.
Lines 14, 15 The get method is an alternative way to retrieve a value given a specific key. The get method is called on the dictionary, and the key is passed as an argument. The method returns the value stored against that key. The value is assigned to a variable named year. The year and its data type are printed on line 15.
Lines 17 - 19 The year is given as 2019 in the dictionary, but we want to change it to 19. This is possible, as dictionaries are mutable. Two variables called k and v are initialized with the values 'year' and 19 respectively. The value of k is used as the key to update 2019 to 19. Again, the syntax is similar to that of lists. The difference is that for lists we put the index inside the square brackets, whereas a key is used for dictionaries.
Lines 21, 22 The in and not in operators can be used to check for the presence of a certain key in a dictionary. Line 22 will be executed, as no key named country exists in the dictionary. A new mapping with the key country and the value Canada is added to a_dictionary.
Line 24 A copy of a_dictionary is created with the name a_copy. The copy method creates a shallow copy: a_copy is a new, separate dictionary, so adding or removing mappings in a_copy does not affect a_dictionary. This is different from copying by reference, where both names refer to the same object (see section 8.4 of Lesson 8). Note that values which are themselves mutable objects, such as lists, are still shared between the two dictionaries.
Lines 26, 27 The pop method is used to remove the key team and its value from a_copy. The pop method returns the removed value, which is assigned to the variable named v and printed on the console.
Lines 29, 30 The popitem method removes the most recently added key-value pair from the dictionary and returns it. Two variables k and v are used in line 29, and they are assigned the key and the value respectively. Both are printed in line 30.
Line 32 The clear method is used to remove all the key value pairs from the dictionary. a_copy becomes an empty dictionary.
Lines 35, 36 a_copy and a_dictionary are both printed on the console.
Line 38 Another dictionary named a_temp is created with three mappings. Two of the keys in this new dictionary are the same as in a_dictionary, whereas there is one unique key named league as well.
Line 39 a_dictionary is updated using a_temp. Common keys are updated with the new values from a_temp, whereas the unique mapping is copied into a_dictionary too.
Line 41 a_dictionary is printed on the console.
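The behaviour of the copy method compared with plain assignment can be explored with a short experiment. The sketch below uses made-up data; the names original, alias and shallow are illustrative and do not appear in Prog 12-03.

```python
original = {"region": "Waterloo", "cities": ["Kitchener", "Waterloo"]}

alias = original            # plain assignment: a second name for the same dictionary
shallow = original.copy()   # copy method: a new dictionary object

shallow["region"] = "Peel"              # changes only the copy
shallow["cities"].append("Cambridge")   # the inner list is shared by both

print(alias is original)     # True
print(shallow is original)   # False
print(original["region"])    # Waterloo
print(original["cities"])    # ['Kitchener', 'Waterloo', 'Cambridge']
```

The last line shows why the copy is called shallow: the dictionary itself is duplicated, but a mutable value stored inside it is not.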

Technical Note:

In versions of Python before 3.7, the popitem method returned an arbitrary mapping from the dictionary rather than the one entered most recently.

Console 3 shows the output of Code Listing 3.


Value 'winners' has the datatype <class 'str'>
Value '2019' has the datatype <class 'int'>
team: Toronto Raptors
country: Canada

a_copy: {}
a_dictionary:   {'team': 'Toronto Raptors', 'rank': 'winners', 'year': 19, 'country': 'Canada'}
updating ... 
a_dictionary:   {'team': 'Toronto Raptors', 'rank': 'unknown', 'year': 20, 'country': 'Canada', 'league': 'NBA'}

Console 3: Output of Code Listing 3.
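Prog 12-03 retrieves values both with square brackets and with the get method. The two behave differently when the key is missing, which the following sketch illustrates. The key "coach" is deliberately absent from the dictionary and is used only for this demonstration.

```python
a_dictionary = {"team": "Toronto Raptors", "rank": "winners", "year": 2019}

print(a_dictionary.get("coach"))             # None: the key does not exist
print(a_dictionary.get("coach", "unknown"))  # unknown: an optional default value

try:
    coach = a_dictionary["coach"]            # square brackets raise a KeyError
except KeyError as error:
    coach = None
    print(f"KeyError: {error}")
```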

12.3.2 Loops for Dictionaries

A for loop makes iterating over a dictionary very easy. We can iterate over the keys in the dictionary, the values in the dictionary, or the key-value pairs in the dictionary. Iterating over the dictionary itself yields its keys. The values method gives access to a sequence of all the values in the dictionary. The items method, on the other hand, returns a sequence of all the key-value pairs in the dictionary.

Let's use these techniques to iterate over a dictionary three times. See Prog 12-04 in Code Listing 4.


# Program Name: Prog 12-04
# This program is part of Lesson 12 of the course

def main():
    a_dictionary = {
        "team": "Toronto Raptors",
        "rank": "winners",
        "year": 2019
    }
    print("Loop 1")
    for k in a_dictionary:
        print(f"{k}")

    print("\nLoop 2")
    for v in a_dictionary.values():
        print(f"{v}")

    print("\nLoop 3")
    for k, v in a_dictionary.items():
        print(f"{{{k}, {v}}}")

main()

Code Listing 4: Prog 12-04

The first loop iterates over the keys in the dictionary. One key is assigned to the variable named k in each iteration, and it is printed on the console. One value from the dictionary is assigned to the target variable in each iteration of the second for loop in the program. All the values are printed on the console by this loop.

In the third for loop, the items method is used to retrieve all the key-value pairs from the dictionary. In each iteration, the key and the value of one mapping are assigned to the variables k and v respectively and then printed on the console. The output of the program is shown in Console 4.


Loop 1
team
rank
year

Loop 2
Toronto Raptors
winners
2019

Loop 3
{team, Toronto Raptors}
{rank, winners}
{year, 2019}

Console 4: Output of Code Listing 4.

Sets

Sets in the Python language are similar to mathematical sets. They are specifically designed to facilitate mathematical set operations. Sets have the following properties.

The following lines of code can be used to create sets:


this_set = {"Kitchener", "Waterloo", "Cambridge"}

Explanation: A set is created with three elements.

this_set = set()

Explanation: An empty set is created using the built-in set function.

this_set = {3, "Waterloo Region", (519, 226, 548)}

Explanation: A set containing an integer, a string, and a tuple is created. All three elements are immutable so they can be in a set.

this_set = {3, "Waterloo Region", [519, 226, 548]}

Explanation: An error (a TypeError) will be reported when this line is executed, as we are trying to include a list as an element of a set. The list, being mutable, is not a valid element for a set.

this_set = {"Kitchener", "Waterloo", "Cambridge", "Cambridge"}

Explanation: A set is created with three elements. As every element must be unique in a set, the duplicate values will be ignored.
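The last two cases can be verified with a few lines of code. In this sketch, the invalid set is wrapped in try/except so that the error message can be printed; the names bad_set and set_error are used only for this check.

```python
this_set = {"Kitchener", "Waterloo", "Cambridge", "Cambridge"}
print(len(this_set))    # 3: the duplicate "Cambridge" is stored only once

try:
    bad_set = {3, "Waterloo Region", [519, 226, 548]}   # a list is not allowed
except TypeError as error:
    set_error = str(error)
    print(f"TypeError: {set_error}")
```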

A dictionary and a set both use curly brackets. They can be differentiated in the console output, as a dictionary contains key-value pairs separated by a colon, whereas a set has single elements separated by commas. The only ambiguity arises with an empty set or an empty dictionary. Therefore, Python displays them in the following manner to help tell them apart.


{}      # Curly brackets alone mean that an empty dictionary is displayed.
set()   # The word "set" followed by a pair of parentheses is used to show an empty set.
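The same distinction applies when creating empty collections in code: empty curly brackets always create a dictionary, so an empty set has to be created with set(). The variable names in this brief check are illustrative.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))   # <class 'dict'>
print(type(empty_set))    # <class 'set'>
print(empty_dict)         # {}
print(empty_set)          # set()
```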

12.4.1 Basic Set Manipulation

Prog 12-05 shown in Code Listing 5 performs some basic set operations.


# Program Name: Prog 12-05
# This program is part of Lesson 12 of the course

def main():
    recipe1 = {"apple", "banana", "cherry"}
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.update(["orange", "mango", "grapes"])
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.update(["apple"])
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    if ("banana" in recipe1):
        recipe1.remove("banana")
    recipe1.discard("guava")
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    item = recipe1.pop()
    print(f"Removed: {item}")
    print(f"{len(recipe1)} Ingredients: {recipe1}")

    recipe1.clear()
    print(f"{len(recipe1)} Ingredients: {recipe1}")

main()

Code Listing 5: Prog 12-05

The number of elements in the set (number of ingredients) and the elements themselves are printed in lines 6, 9, 12, 17, 21 and 24. The console output will help you understand the functionality of the various methods used in the code. The line-by-line code explanation is given below:

Line 5 A set is created with three elements. The elements in a set are unordered, and the order may vary from one execution of the program to another. Consoles 5 and 6 show two executions of the same program.

Note the order of elements is different in the first line of the outputs.
Line 8 The update method is called to add three more elements to the set. The update method is passed a single list. All the elements in the list are extracted and added to the set named recipe1. recipe1 has 6 elements now.
Line 11 The update method is called to add a seventh element. A list containing a single element was passed for inclusion in the set, but as the element already exists in the set, it is ignored. recipe1 still has only 6 elements.
Lines 14, 15 banana is removed from the set in line 15. The remove method raises an error (a KeyError) if the element you are trying to remove does not exist in the set. We ensure in line 14 that the element exists in the set before attempting to remove it.
Line 16 The discard method is used to remove guava. We did not check whether guava exists or not before trying to discard it, as the discard method does not raise an error if the element does not exist. guava is not present in the set, so there is no change.
Lines 19, 20 The pop method is used to remove an element from the set. The elements in the set are not ordered, therefore an arbitrary element will be removed. See the outputs of this program in Consoles 5 and 6. orange was popped in the first execution of the program and cherry was popped in the second execution.
Line 23 The clear method removes all the elements from a set.
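The difference between remove and discard described for lines 14 to 16 can be isolated in a few lines. This is a small sketch with a made-up set named recipe; guava is never added to it.

```python
recipe = {"apple", "cherry"}

recipe.discard("guava")      # no error, and the set is unchanged

try:
    recipe.remove("guava")   # raises a KeyError because guava is absent
    removed = True
except KeyError:
    removed = False
    print("guava is not in the set")

print(len(recipe))           # 2
```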

Consoles 5 and 6 show the output for two executions of Prog 12-05.


3 Ingredients: {'banana', 'apple', 'cherry'}
6 Ingredients: {'banana', 'orange', 'mango', 'grapes', 'cherry', 'apple'}
6 Ingredients: {'banana', 'orange', 'mango', 'grapes', 'cherry', 'apple'}
5 Ingredients: {'orange', 'mango', 'grapes', 'cherry', 'apple'}
Removed: orange
4 Ingredients: {'mango', 'grapes', 'cherry', 'apple'}
0 Ingredients: set()

Console 5: Output for first execution of Code Listing 5

3 Ingredients: {'cherry', 'banana', 'apple'}
6 Ingredients: {'cherry', 'mango', 'banana', 'apple', 'orange', 'grapes'}
6 Ingredients: {'cherry', 'mango', 'banana', 'apple', 'orange', 'grapes'}
5 Ingredients: {'cherry', 'mango', 'apple', 'orange', 'grapes'}
Removed: cherry
4 Ingredients: {'mango', 'apple', 'orange', 'grapes'}
0 Ingredients: set()

Console 6: Output for second execution of Code Listing 5

12.4.2 Basic Mathematical Operations

Basic set operations from mathematics can be easily applied to a set data structure in Python. Prog 12-06 shows the usage of intersection, union, difference and symmetric difference methods.


# Program Name: Prog 12-06
# This program is part of Lesson 12 of the course

def main():
    course_cs104 = set(['Yousuf', 'Maria', 'Haaniya', 'Ajeet'])
    course_cs164 = set(['Eva', 'Haaniya', 'Alicia', 'Yousuf'])

    print('Students enrolled in CS104:\t', end = ' ')
    for name in course_cs104:
        print(name, end = ', ')

    print("\n")
    print('Students enrolled in CS164:\t', end = ' ')
    for name in course_cs164:
        print(name, end = ', ')

    print("\n")
    print('Students enrolled in both:\t', end = ' ')
    print(course_cs104.intersection(course_cs164))

    print('\nStudents enrolled in either:\t', end = ' ')
    print(course_cs104.union(course_cs164))

    print('\nStudents in 104 but not 164:\t', end = ' ')
    print(course_cs104.difference(course_cs164))

    print('\nStudents in 164 but not 104:\t', end = ' ')
    print(course_cs164.difference(course_cs104))

    print('\nStudents in only one course:\t', end = ' ')
    print(course_cs104.symmetric_difference(course_cs164))

main()

Code Listing 6: Prog 12-06

Prog 12-06 starts by creating two sets. The sets named course_cs104 and course_cs164 contain the names of the students enrolled in the respective courses. Two for loops are used to print the elements of each set.

The intersection method is called in line 19 to find the names of the students who are present in both the courses. The union method in line 22 returns the names of all the students present in either or both of the sets. The names of the students present in both the sets are listed just once.  

The difference method is used twice. First, to find the students present in only CS 104 (line 25), and second, to find the students in only CS 164 (line 28). The symmetric_difference method returns the names of the students from both the sets such that they are present in only one of the two sets. All of the above described methods return the result as a set. It is evident from the output as curly brackets are used in the output.
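Python also offers operators that correspond to these four methods when both operands are sets. The sketch below repeats the calculations from Prog 12-06 using the operators; sorted is used only so that the printed order is the same on every run, and the result variable names are illustrative.

```python
course_cs104 = {'Yousuf', 'Maria', 'Haaniya', 'Ajeet'}
course_cs164 = {'Eva', 'Haaniya', 'Alicia', 'Yousuf'}

both = course_cs104 & course_cs164       # same as intersection
either = course_cs104 | course_cs164     # same as union
only_104 = course_cs104 - course_cs164   # same as difference
only_one = course_cs104 ^ course_cs164   # same as symmetric_difference

print(sorted(both))       # ['Haaniya', 'Yousuf']
print(sorted(either))     # ['Ajeet', 'Alicia', 'Eva', 'Haaniya', 'Maria', 'Yousuf']
print(sorted(only_104))   # ['Ajeet', 'Maria']
print(sorted(only_one))   # ['Ajeet', 'Alicia', 'Eva', 'Maria']
```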

Console 7 shows the output of Code Listing 6.


Students enrolled in CS104:  Maria, Ajeet, Yousuf, Haaniya, 

Students enrolled in CS164:  Yousuf, Haaniya, Alicia, Eva, 

Students enrolled in both:   {'Yousuf', 'Haaniya'}

Students enrolled in either:     {'Alicia', 'Maria', 'Ajeet', 'Yousuf', 'Haaniya', 'Eva'}

Students in 104 but not 164:     {'Maria', 'Ajeet'}

Students in 164 but not 104:     {'Alicia', 'Eva'}

Students in only one course:     {'Alicia', 'Maria', 'Ajeet', 'Eva'}

Console 7: Output of Code Listing 6.

Conclusion

In this lesson, we learnt the difference between Tuples, Dictionaries and Sets in Python and how we can use them. This brings us to the end of this first course on programming. I hope all the students made a lot of progress and developed an interest in Python. Python is one of the most used languages in software development and is frequently used for data analytics, scientific computing, IoT (Internet of Things), and web development. This course is designed to equip you with the basics of computer programming that you can use to learn more about Python.

You are encouraged to consult the syllabus for clarity about the deadlines for preparation activities, and assignments. Please regularly visit MyLearningSpace for announcements, course material, and discussion boards. The schedule for the exam will be announced separately.