Lesson 5: The List

Topics

In this lesson, we are going to introduce the List ADT. The topic includes:

  1. List ADT
  2. Use of the List ADT
  3. Using Python lists to implement List ADT

Required Tasks

  1. The List ADT
  2. Using the List
  3. Implementation of the List ADT

Required Tasks

  1. Complete the List tasks of Lab 3.
  2. Complete and submit Assignment 4.

Learning Outcomes

By the end of this lesson, you should be able to

Introduction

So far we have learned about ADTs in general. We also talked about algorithm analysis and learned how to estimate running time of an algorithm as a function on input size.

In this lesson, we are going to learn in more detail about one of the important ADTs named List. We talk about how the methods of this ADT can be implemented and discuss efficiency of some methods. We will discuss Array data structures and explain how we can implement the List ADT using Python lists.

List ADT

A List is an ADT that contains an ordered collection of data items. "Ordered" means that each value in the List has a position, not that the values are sorted. Later we will also talk about the Sorted List ADT.

In a List ADT values can be added to and removed from any position in the list. Therefore, the List ADT provides random access to data. Values in a List ADT (like all other ADTs discussed in this course) must have the same type. They could be integer, string, a class (like Student, Food, etc.).

List ADT Methods

When talking about an ADT, we need to specify its methods. The basic List methods are:

initialize()
creates an empty List
len()
returns the number of values in the List
insert(i, value)
adds value to the List at position i
remove(key)
removes and returns the value in the List that matches key
find(key)
returns the value in the List that matches key without removing it
is_empty()
determines whether the List is empty
count(key)
returns the number of values in the List that match key
min, max()
returns the minimum or maximum values respectively from the List
clear()
removes all values from the List
index(key)
returns the index of the first value in the list that matches key
del(i)
returns and removes the value at index i
append(value)
adds value to the end of the List

List ADT Operators

In Python, when methods with certain names are defined and implemented for a class, they can be used as operators on the objects of that class. Here for the List ADT, which is implemented as a class in Python, we can define the following operators:

key in
returns True if a value matching key is in the List, False otherwise. This operator is provided by implementing the __contains__ magic method.
l[i]
gets the element at position i in the list or sets the element at position i. This operator is provided by implementing __getitem__ and __setitem__ magic methods respectively

Keys versus Values

The section above refers to keys and that lists can be searched for keys. A key is simply an object in which only the subset of all possible attributes in that object have meaningful values assigned. Those attributes must include the attributes used in the comparison operators.

In other words, a key is a value that has just enough information to allow us to search for the full set of matching data in a data structure. A student ID number is sufficient for us to search a list of students for a matching student. A movie title and year is sufficient for us to search a list of movies for a matching movie that contains the rest of the information we require. Note that in both these cases we still need our search attributes stored in an object of the correct type: i.e. a Student object that holds only a student ID number, and a Movie object that contains only a title and year. Because data structures must be generic and work with objects of any type (albeit any particular List must hold values of only one type), we cannot design our data structures to search for only the string ID part of a Student object, or only the string title and integer year of a Movie object. Thus our key values must still be the proper types of objects, just with some attributes left empty. These 'extra' attributes are left empty because it would be pointless for us to search for a movie if we already have all of the possible information it could have. Why search for it, then?

The following is an example of how we can define a Movie object as a key to be searched for in a List.

Define a new Movie object, assigning values to only the title and year attributes:

    
key_movie = Movie('Dark City', 1998, None, None, None)

  

The resulting movie object looks like this:

Title:    Dark City
Year:     1998
Director: None
Rating:   None
Genres:   None

The important thing to recognize here is that this is sufficient information to use to find a matching Movie object in a List. The Movie comparison functions use only the title and year attributes to compare two movie objects. We can now use this to search for the full movie information (assuming Dark City is actually stored in the list):

    
n = movies.index(key_movie)

if n > -1:
    movie = movie_list[n]
    print(movie)

  

and print the following:

Title:    Dark City
Year:     1998
Director: Alex Proyas
Rating:   7.8
Genres:   science fiction

The distinction between a key and a value in terms of having different amounts of information is only applicable to objects with multiple attributes. A simple integer has only one possible value - we can certainly search a List for it, but we can't extract any extra information about it from the List.

Food keys are similar in the only the name and origin attributes of a Food object are necessary to create a key value, because only those attributes are used to compare Food objects.

Array-Based List

The rear of the List is the right-most element in the List's underlying array. The front of the List is always position 0. The size of the List at any given time is simply the size of the Python list in the List. A List is empty if its underlying list is empty.

Implementing the Array-Based List

The following code shows how to implement part of the array-based List in Python. The rest is left to you as an exercise for labs and assignments.

Initializing an Array-Based List

List __init__
    
from copy import deepcopy

class List:

    def __init__(self):
        """
        -------------------------------------------------------
        Initializes an empty list.
        Use: target = List()
        -------------------------------------------------------
        Returns:
            a new List object (List)
        -------------------------------------------------------
        """
        self._values = []

  

This is the same as the initializations for a Stack or Queue. The is_empty and __len__ methods are also the same.

The count Method

The count method returns the number of times a key appears in the List. We use this as an example of the kind of approach we want you to take when writing array-based ADTs in this course.

List count
    
    def count(self, key):
        """
        -------------------------------------------------------
        Finds the number of times key appears in list.
        Use: n = source.count(key)
        -------------------------------------------------------
        Parameters:
            key - a partial data element (?)
        Returns:
            number - number of times key appears in list (int)
        -------------------------------------------------------
        """
        n = len(self._values)
        i = 0
        number = 0

        while i < n:
            if self._values[i] == key:
                number += 1
            i += 1
        return number

  

Note that we could use a for loop here because we are always going to process the entire list. We're using a while loop for easy comparison to a similar method in the next Lesson.

The _linear_search Method

A linear search is an algorithm that searches for the first occurrence of a key in a linear list of values. It is a very important concept in algorithm design, and you must completely master it in order to do well in this course.

The algorithm is simple:

There are no special cases - if, for example, the list is empty, the loop simply never executes.

This search is called a linear search because it works in a linear fashion - moving from list element to list element from left to right.

The linear search method is a private helper method. A helper method is an method that is not called from outside the class it is defined in, but rather, is used by other methods within that class. This is particularly useful when the actions it performs are used by multiple other functions. The linear search in particular is used by the remove, __contains__, and index methods to find the index of a key in the List. The methods then do different things with that information.

List _linear_search
    
    def _linear_search(self, key):
        """
        -------------------------------------------------------
        Searches for the first occurrence of key in the list.
        Private helper method - used only by other ADT methods.
        Use: i = self._linear_search(key)
        -------------------------------------------------------
        Parameters:
            key - a partial data element (?)
        Returns:
            i - the index of key in the list, -1 if key is not found (int)
        -------------------------------------------------------
        """

        # your code here

        return i

  

Writing the linear search is left as an exercise to you. As noted above, since you may not be processing the entire list (i.e. stop looping when you find what you are looking for), you must use a while loop in this code.

Implementing the in Operator

The in operator tells us whether a value is contained in a List or not, and is used as:

    
if key_movie in movie_list:

  

The special Python name of the method for this operator is __contains__:

List __contains__
    
    def __contains__(self, key):
        """
        -------------------------------------------------------
        Determines if the list contains key.
        Use: b = key in lst
        -------------------------------------------------------
        Parameters:
            key - a partial data element (?)
        Returns:
            True if the list contains key, False otherwise. (boolean)
        -------------------------------------------------------
        """
        # search self._values for key
        return ...

  

Hint: this method must calls the _linear_search helper method. Again, the point of a helper method is to provide common code used by multiple other methods, and searching for a key is exactly what the _linear_search does, and what this method must accomplish as well.

Implementing the [] Operator

Implementing the Python __getitem__ method allows us to use the square bracket / index syntax to retrieve a value from a List, like this:

    
movie = movie_list[n]

  

The method is:

List __getitem__
    
    def __getitem__(self, i):
        """
        -------------------------------------------------------
        Returns a copy of the i-th element of the list.
        Use: value = lst[i]
        -------------------------------------------------------
        Parameters:
            i - index of the element to access (int)
        Returns:
            value - the i-th element of list (?)
        -------------------------------------------------------
        """
        assert self._is_valid_index(i), "Invalid index value"

        value = deepcopy(self._values[i])
        return value

  

This method allows only valid indexes to be used. The helper method _valid_index returns False if i is an invalid index. (Note that in Python a valid index has both a negative as well as a positive range. _valid_index must account for that.)

Implementing the Python __setitem__ method allows us to use the square bracket / index syntax to assign a value to a List, like this:

    
movie_list[n] = movie

  

The method is:

List __setitem__
    
    def __setitem__(self, i, value):
        """
        -------------------------------------------------------
        The i-th element of list contains a copy of value. The
        existing value at i is overwritten.
        Use: lst[i] = value
        -------------------------------------------------------
        Parameters:
            i - index of the element to access (int)
            value - a data value (?)
        Returns:
            None
        -------------------------------------------------------
        """
        assert self._valid_index(i), "Invalid index value"

        self._values[i] = deepcopy(value)
        return

  

As above, only valid indexes are allowed.

Other methods and operators will be discussed and implemented in labs and assignments.

Summary

In this lesson, we introduced the List ADT and discussed different approaches to implement the List ADT using arrays. We discussed how a List can be implemented using Python lists. Later we in the course we will talk about another implementation of the List ADT using a linked structure.

References

  1. CS1027b: Computer Science Fundamentals II, Department of Computer Science at Western University, Notes on the List ADT.
  2. Design and Analysis of Data Structures, Niema Moshiri and Liz Izhikevich, 2018.