In this lesson, we are going to introduce the List ADT. The topic includes:
By the end of this lesson, you should be able to
So far we have learned about ADTs in general. We also talked about algorithm analysis and learned how to estimate running time of an algorithm as a function on input size.
In this lesson, we are going to learn in more detail about one of the important ADTs named List. We talk about how the methods of this ADT can be implemented and discuss efficiency of some methods. We will discuss Array data structures and explain how we can implement the List ADT using Python lists.
A List is an ADT that contains an ordered collection of data items. "Ordered" means that each value in the List has a position, not that the values are sorted. Later we will also talk about the Sorted List ADT.
In a List ADT values can be added to and removed from any position in the list. Therefore, the List ADT provides random access to data. Values in a List ADT (like all other ADTs discussed in this course) must have the same type. They could be integer, string, a class (like Student, Food, etc.).
When talking about an ADT, we need to specify its methods. The basic List methods are:
initialize()
len()
insert(i, value)
value
to the List at position i
remove(key)
key
find(key)
key
without
removing it
is_empty()
count(key)
key
min
, max()
clear()
index(key)
key
del(i)
append(value)
value
to the end of the List
In Python, when methods with certain names are defined and implemented for a class, they can be used as operators on the objects of that class. Here for the List ADT, which is implemented as a class in Python, we can define the following operators:
key in
True
if a value matching key
is in
the List, False
otherwise. This operator is provided by
implementing the __contains__
magic method.
l[i]
i
in the list or sets the
element at position i. This operator is provided by implementing __getitem__
and __setitem__
magic methods respectively
The section above refers to keys and that lists can be searched for keys. A key is simply an object in which only the subset of all possible attributes in that object have meaningful values assigned. Those attributes must include the attributes used in the comparison operators.
In other words, a key is a value that has just enough information to
allow us to search for the full set of matching data in a data
structure. A student ID number is sufficient for us to search a list of
students for a matching student. A movie title and year is sufficient
for us to search a list of movies for a matching movie that contains the
rest of the information we require. Note that in both these cases we
still need our search attributes stored in an object of the correct
type: i.e. a Student
object that holds only a student ID
number, and a Movie
object that contains only a title and
year. Because data structures must be generic and work with objects of
any type (albeit any particular List must hold values of only one type),
we cannot design our data structures to search for only the string ID
part of a Student
object, or only the string title and
integer year of a Movie
object. Thus our key values must
still be the proper types of objects, just with some attributes left
empty. These 'extra' attributes are left empty because it would be
pointless for us to search for a movie if we already have all of the
possible information it could have. Why search for it, then?
The following is an example of how we can define a Movie
object as a key to be searched for in a List.
Define a new Movie
object, assigning values to only the
title and year attributes:
key_movie = Movie('Dark City', 1998, None, None, None)
The resulting movie object looks like this:
Title: Dark City Year: 1998 Director: None Rating: None Genres: None
The important thing to recognize here is that this is sufficient
information to use to find a matching Movie
object in a
List. The Movie
comparison functions use only the title and
year attributes to compare two movie objects. We can now use this to
search for the full movie information (assuming Dark City
is
actually stored in the list):
n = movies.index(key_movie)
if n > -1:
movie = movie_list[n]
print(movie)
and print the following:
Title: Dark City Year: 1998 Director: Alex Proyas Rating: 7.8 Genres: science fiction
The distinction between a key and a value in terms of having different amounts of information is only applicable to objects with multiple attributes. A simple integer has only one possible value - we can certainly search a List for it, but we can't extract any extra information about it from the List.
Food
keys are similar in the only the name
and
origin
attributes of a Food
object are
necessary to create a key value, because only those attributes are used
to compare Food
objects.
The rear of the List is the right-most element in the List's underlying array. The front of the List is always position 0. The size of the List at any given time is simply the size of the Python list in the List. A List is empty if its underlying list is empty.
The following code shows how to implement part of the array-based List in Python. The rest is left to you as an exercise for labs and assignments.
__init__
from copy import deepcopy
class List:
def __init__(self):
"""
-------------------------------------------------------
Initializes an empty list.
Use: target = List()
-------------------------------------------------------
Returns:
a new List object (List)
-------------------------------------------------------
"""
self._values = []
This is the same as the initializations for a Stack or Queue. The is_empty
and __len__
methods are also the same.
count
Method
The count
method returns the number of times a key appears
in the List. We use this as an example of the kind of approach we want
you to take when writing array-based ADTs in this course.
count
def count(self, key):
"""
-------------------------------------------------------
Finds the number of times key appears in list.
Use: n = source.count(key)
-------------------------------------------------------
Parameters:
key - a partial data element (?)
Returns:
number - number of times key appears in list (int)
-------------------------------------------------------
"""
n = len(self._values)
i = 0
number = 0
while i < n:
if self._values[i] == key:
number += 1
i += 1
return number
Note that we could use a for
loop here because we are always
going to process the entire list. We're using a while
loop
for easy comparison to a similar method in the next Lesson.
_linear_search
Method
A linear search is an algorithm that searches for the first occurrence of a key in a linear list of values. It is a very important concept in algorithm design, and you must completely master it in order to do well in this course.
The algorithm is simple:
i
) that tracks
which value in the list is currently being examined
i
to the key
i
has not reached the end of the list,
and if the value at i
matches key, stop looping and
return i
i
i
has reached
the last valid index
-1
There are no special cases - if, for example, the list is empty, the loop simply never executes.
This search is called a linear search because it works in a linear fashion - moving from list element to list element from left to right.
The linear search method is a private helper method. A helper
method is an method that is not called from outside the class it is
defined in, but rather, is used by other methods within that class. This
is particularly useful when the actions it performs are used by multiple
other functions. The linear search in particular is used by the remove
,
__contains__
, and index
methods to find the
index of a key in the List. The methods then do different things with
that information.
_linear_search
def _linear_search(self, key):
"""
-------------------------------------------------------
Searches for the first occurrence of key in the list.
Private helper method - used only by other ADT methods.
Use: i = self._linear_search(key)
-------------------------------------------------------
Parameters:
key - a partial data element (?)
Returns:
i - the index of key in the list, -1 if key is not found (int)
-------------------------------------------------------
"""
# your code here
return i
Writing the linear search is left as an exercise to you. As noted above,
since you may not be processing the entire list (i.e. stop
looping when you find what you are looking for), you must use
a while
loop in this code.
in
Operator
The in operator tells us whether a value is contained in a List or not, and is used as:
if key_movie in movie_list:
The special Python name of the method for this operator is __contains__
:
__contains__
def __contains__(self, key):
"""
-------------------------------------------------------
Determines if the list contains key.
Use: b = key in lst
-------------------------------------------------------
Parameters:
key - a partial data element (?)
Returns:
True if the list contains key, False otherwise. (boolean)
-------------------------------------------------------
"""
# search self._values for key
return ...
Hint: this method must calls the _linear_search
helper
method. Again, the point of a helper method is to provide common code
used by multiple other methods, and searching for a key is exactly what
the _linear_search
does, and what this method must
accomplish as well.
Implementing the Python __getitem__
method allows us to use
the square bracket / index syntax to retrieve a value from a List, like
this:
movie = movie_list[n]
The method is:
__getitem__
def __getitem__(self, i):
"""
-------------------------------------------------------
Returns a copy of the i-th element of the list.
Use: value = lst[i]
-------------------------------------------------------
Parameters:
i - index of the element to access (int)
Returns:
value - the i-th element of list (?)
-------------------------------------------------------
"""
assert self._is_valid_index(i), "Invalid index value"
value = deepcopy(self._values[i])
return value
This method allows only valid indexes to be used. The helper
method _is_valid_index
returns False
if i
is an invalid index. (Note that in Python a valid index has both a
negative as well as a positive range. _is_valid_index
must
account for that.)
Implementing the Python __setitem__
method allows us to use
the square bracket / index syntax to assign a value to a List, like
this:
movie_list[n] = movie
The method is:
__setitem__
def __setitem__(self, i, value):
"""
-------------------------------------------------------
The i-th element of list contains a copy of value. The
existing value at i is overwritten.
Use: lst[i] = value
-------------------------------------------------------
Parameters:
i - index of the element to access (int)
value - a data value (?)
Returns:
None
-------------------------------------------------------
"""
assert self._is_valid_index(i), "Invalid index value"
self._values[i] = deepcopy(value)
return
As above, only valid indexes are allowed.
Other methods and operators will be discussed and implemented in labs and assignments.
In this lesson, we introduced the List ADT and discussed different approaches to implement the List ADT using arrays. We discussed how a List can be implemented using Python lists. Later we in the course we will talk about another implementation of the List ADT using a linked structure.