This Lesson introduces our next ADT, the Priority Queue. The topics include:
By the end of this lesson, you should be able to
In this lesson, we are going to introduce the Priority Queue ADT. First, we discuss the concept of a Priority Queue, how it works, what its interface is, and how it could be used for solving a problem. Then we get into the code and see how we can develop and use a Priority Queue in Python.
A priority queue is a collection of prioritized values. (It was discussed briefly in an earlier lecture.)
An emergency waiting room is a real-world example of a priority queue.
Patients go through triage (the process of
determining the priority of patients' treatments based on the severity
of their condition
) before being seen by a doctor. Those with the
worst conditions get highest priority. If two or more patients have
equally bad conditions the patients are seen in the order in which they
arrived at the emergency waiting room.
The following diagram shows patients queueing for an emergency ward. The order they enter the priority queue is irrelevant, and how they are organized once in the waiting area is unimportant, what counts is the order in which they are served.
initialize()
insert(value)
value
to the Priority Queue
remove()
peek()
is_empty()
True
if the Priority Queue is empty, False
otherwise
is_full()
True
if the Queue is full, False
otherwise
len()
Any implementation of a Priority Queue must define these methods.
Conceptually, how the contents of a Priority Queue are stored is not important. However, as programmers and software engineers, at some point we have to make a choice of implementation strategy. The fact that values must be removed from a priority queue in priority order puts constraints on how we can store Priority Queue contents. (We can't always rely on leprechauns doing the work for us!)
There are two ways to store the contents of a Priority Queue: sorted and unsorted.
A linked structure is convenient for this implementation.
An array-based structure is convenient for this implementation.
Note that the description of the unsorted contents approach is rather
naive. If we are only inserting and removing items from a Priority Queue
we understand that we have to look for the item with the highest
priority and then close up the empty space just made in the storage
list. This is no more complex than inserting items in priority order in
the sorted contents approach. However, what happens in a situation where
peek is used far more often than remove? Must we
constantly search for the highest priority item whenever a user wants to
know what it is? This doesn't seem very efficient. We will suggest an
answer to this question when we exam the array-base Priority Queue in
detail.
In the CP164 approach to Priority Queues, priority is not determined by
some magic value separate from the data, priority is determined by
the data itself. In short, whatever comes first has highest priority.
A metaphor that may be helpful to remember is the phrase, We're
Number One!
Being Number One is impressive; shouting, We're
Number Forty-Five!
impresses no one.
Thus, for integers, the lower the integer, the higher the priority. 1 has a higher priority than 3, which has a higher priority than 8, which has a higher priority then 53, and so on.
For all data, a simple < (less than) comparison is sufficient to
determine what has higher priority. For integers, 1 < 3. For the Food
class, as an example, a Food
object with the name "Butter
Chicken" has a higher priority than a Food
object with the
name "Lasagna" because if we compare those two Food
objects, the string "Butter Chicken" is less than "Lasagna". (And that's
just because a string starting with "B" comes before a string starting
with "L".) We don't have to do anything special just because the data is
a Food
object. (Note that we are ignoring the rest of the
attributes of the Food
class for simplicity's sake.)
The emphasis here is that a simple < comparison between data is sufficient. There are no special or external values associated with the data that determine the priority. Just compare the data.
The general notes about array-based implementations for Stacks and Queues also apply to a Priority Queue.
As noted in the discussion of the Priority Queue ADT, an array-based
approach to implementing a Priority Queue is best used for an unsorted
contents approach. (Note: although it is true that Python has built-in
sorting functionality for its list objects, we cannot assume
therefore that something magical is happening in the background. Even
though we didn't write the code and see the work going on under the
hood, that work goes on nonetheless. Therefore, simply saying, Meh,
let Python take care of it
, is not an acceptable answer.) We will look
at the basic idea behind using an array-based solution, then see if we
can improve on it.
The following code shows how to implement part of the array-based Priority Queue in Python. The rest is left to you as an exercise for labs and assignments.
__init__
from copy import deepcopy
class Priority_Queue:
def __init__(self):
"""
-------------------------------------------------------
Initializes an empty priority queue.
Use: pq = Priority_Queue()
-------------------------------------------------------
Returns:
a new Priority_Queue object (Priority_Queue)
-------------------------------------------------------
"""
self._values = []
self._first = None
return
So far, the Priority Queue initialization is very simple: the Priority
Queue data is held in a Python list attribute named _values
,
and that attribute is initialized as an empty list.
insert
def insert(self, value):
"""
-------------------------------------------------------
A copy of value is appended to the end of the priority queue
Python list, and _first is updated as appropriate to the index of
value with the highest priority.
Use: pq.insert(value)
-------------------------------------------------------
Parameters:
value - a data element (?)
Returns:
None
-------------------------------------------------------
"""
self._values.append(deepcopy(value))
# Determine whether new value has highest priority.
# your code here
return
insert
simply uses the Python list method append
to add value
to the end of the _values
list.
If the new values has a higher priority than the current value indexed
by _first
, then _first
must be updated to
index this new value. Since the new value was appended to the list, _first
must be updated to the length of _values
-1.
_first
tracks the index of the value in _values
with the highest priority. It is initialized to None
when
the Priority Queue is first constructed in order to make clear there are
no values in the Priority Queue at all, and thus no value of highest
priority. _first
may be updated whenever a new value is
appended to _values
, and must be updated whenever the
highest priority value is removed from _values
.
To emphasize: _first
is not the data with the
highest priority. The _first
attribute is just an index,
i.e. it is always an integer that tells us where in _values
the data with the highest priority lives.
As an example, Here is some simple string data stored in a Priority Queue. The top row contains the string, the second row contains the indexes:
_values | wombat | chicken | zebra | ptarmigan |
---|---|---|---|---|
indexes | 0 | 1 | 2 | 3 |
_first | 1 |
_first
is 1 because that is the index of the string with
the highest priority - "chicken" is the minimum of the string objects in
_values
.
We now append "hippo" to the end of _values
:
_values | wombat | chicken | zebra | ptarmigan | hippo |
---|---|---|---|---|---|
indexes | 0 | 1 | 2 | 3 | 4 |
_first | 1 |
_first
is unchanged because "chicken" still has a higher
priority than "hippo" (again, alphabetically "chicken" comes before
"hippo".)
We now append "aardvark" to the end of _values
:
_values | wombat | chicken | zebra | ptarmigan | hippo | aardvark |
---|---|---|---|---|---|---|
indexes | 0 | 1 | 2 | 3 | 4 | 5 |
_first | 5 |
_first
is updated to 5, because "aardvark" has a higher
priority than "chicken", i.e. "aardvark" < "chicken".
Note that we did not have to compare "aardvark" to all of the strings in
_values
, we only had to compare it to "chicken", i.e. the
data at index _first
.
Just to be 100% clear, assuming that _first
is not None
(as it is when the Priority Queue is empty), we reference the data with
the highest priority in the Priority Queue with self._values[self._first]
remove
def remove(self):
"""
-------------------------------------------------------
Removes and returns the highest priority value from the priority queue.
Use: value = pq.remove()
-------------------------------------------------------
Returns:
value - the highest priority value in the priority queue -
the value is removed from the priority queue. (?)
-------------------------------------------------------
"""
assert len(self._values) > 0, \
"Cannot remove from an empty priority queue"
# your code here
return value
To remove the item with the highest priority from the _values
list, use the Python list method pop
to remove the value at
_first
. Then search the remaining list for the value with
the highest priority, and update _first
to index it.
When data is removed from the Priority Queue you must update the value
of _first
by finding the index of the minimum
data in _values
. That index becomes the new value of _first
.
Finding the minimum value in a Python list is a CP104 task and we will
not go into greater detail on that.
This diagram displays the _values
and _first
of the array-based Priority Queue. Use and to see how it works:
Why go to the trouble of tracking the index of the highest priority
value with _first
?
When peeking at a Priority Queue, the method should return to the user a
copy of the value with the highest priority. This implies, that as with
the remove
method, the highest priority value must first be
found. A naive algorithm would look for the highest priority
value every time it was requested. With remove
there is no
choice but to do so, for once a value is removed from a Priority Queue,
at some point there must be a search for the next highest priority
value. However, this is clearly very inefficient if every time peek
is executed on a Priority Queue the highest priority value must be
searched for. To use the hospital metaphor again, it is as though a
doctor asks the Emergency Room nurse to tell them who the next patient
would be, and the nurse, having a terrible memory, has to triage all the
patients in the room every time the doctor asks. The equivalent of using
_first
is instead having the nurse note who the next
patient would be so that she can tell the doctor immediately without
having to perform the triage all over again. And when a new patient
arrives, the nurse has to compare their condition only against the
current highest priority patient to determine if the new patient has a
higher priority.
This non-naive algorithm leads to:
peek
method. When the Priority
Queue is asked for its highest priority item it can return a copy of
it immediately without having to search for it.
insert
method. The method
now has to determine if the item being inserted has a higher priority
than the item which currently has the highest priority, and keep track
of the appropriate one. But it only has to do a single comparison.
remove
method that is not much different in
complexity. remove
can immediately return the item with
the highest priority, but must then look in the remainder of _values
for the next item to be be tagged as the highest priority (_first
)
item.
Implementing this improved Priority Queue is an exercise for you.
In this lesson, we introduced the Priority Queue ADT and discussed different approaches to implement the Priority Queue ADT using arrays. We discussed how a Priority Queue can be implemented using Python lists and how Priority Queues can be used to solve problems. Later we in the course we will talk about another implementation of the Priority Queue ADT using a linked structure.