Lesson 4: The Priority Queue

Topics

This Lesson introduces our next ADT, the Priority Queue. The topics include:

  1. The Priority Queue ADT
  2. Using the Priority Queue
  3. Implementation of the Priority Queue ADT
  4. Problem solving and code development with the Priority Queue

Required Tasks

  1. Complete the tasks of Lab 3.
  2. Complete and submit Assignment 4.

Learning Outcomes

By the end of this lesson, you should be able to

Introduction

In this lesson, we are going to introduce the Priority Queue ADT. First, we discuss the concept of a Priority Queue, how it works, what its interface is, and how it could be used for solving a problem. Then we get into the code and see how we can develop and use a Priority Queue in Python.

The Priority Queue ADT

A priority queue is a collection of prioritized values. (It was discussed briefly in an earlier lecture.)

An emergency waiting room is a real-world example of a priority queue. Patients go through triage (the process of determining the priority of patients' treatments based on the severity of their condition) before being seen by a doctor. Those with the worst conditions get highest priority. If two or more patients have equally bad conditions the patients are seen in the order in which they arrived at the emergency waiting room.

The following diagram shows patients queueing for an emergency ward. The order they enter the priority queue is irrelevant, and how they are organized once in the waiting area is unimportant, what counts is the order in which they are served.

A Priority Queue in Action
(My thanks to the Wong-Baker Faces Foundation for the inspiration.)

Priority Queue Methods

initialize()
creates an empty Priority Queue
insert(value)
adds a copy of value to the Priority Queue
remove()
removes and returns the value with the highest priority value in the Priority Queue, if one exists. Elements of equal priority are removed in the order in which they are added to the Priority Queue
peek()
returns a copy of the value with the highest priority value in the Priority Queue without removing it, if one exists
is_empty()
returns True if the Priority Queue is empty, False otherwise
is_full()
returns True if the Queue is full, False otherwise
len()
returns the number of values in the Priority Queue

Any implementation of a Priority Queue must define these methods.

Priority Queue Applications

Simulation
Priority Queues, like regular FIFO Queues mentioned before, are often used to simulate real-life situations that involve some kind of line-ups in which priority is an important consideration.
Searching and Traversing
Like stacks, and queues, we can use queues to search another data structure. However, a priority queue can sometimes help find better solutions than a simple stack or queue.

Priority Queue Implementations

Priority Queue Implementation Choices

Conceptually, how the contents of a Priority Queue are stored is not important. However, as programmers and software engineers, at some point we have to make a choice of implementation strategy. The fact that values must be removed from a priority queue in priority order puts constraints on how we can store Priority Queue contents. (We can't always rely on leprechauns doing the work for us!)

There are two ways to store the contents of a Priority Queue: sorted and unsorted.

Sorted Contents

Advantages
Removal is simple. Items are sorted by priority so simply remove the item at the front of the storage list.
Disadvantages
Insertion is difficult. Items must be put into their proper place in the sorted list when added to the queue.

A linked structure is convenient for this implementation.

Unsorted Contents

Advantage
Insertion is simple. Just add the item to the end of the storage list.
Disadvantage
Removal is difficult. It is necessary to find the item with the highest priority before it can be removed.

An array-based structure is convenient for this implementation.

Note that the description of the unsorted contents approach is rather naive. If we are only inserting and removing items from a Priority Queue we understand that we have to look for the item with the highest priority and then close up the empty space just made in the storage list. This is no more complex than inserting items in priority order in the sorted contents approach. However, what happens in a situation where peek is used far more often than remove? Must we constantly search for the highest priority item whenever a user wants to know what it is? This doesn't seem very efficient. We will suggest an answer to this question when we exam the array-base Priority Queue in detail.

Handling Priority

In the CP164 approach to Priority Queues, priority is not determined by some magic value separate from the data, priority is determined by the data itself. In short, whatever comes first has highest priority. A metaphor that may be helpful to remember is the phrase, We're Number One! Being Number One is impressive; shouting, We're Number Forty-Five! impresses no one.

Thus, for integers, the lower the integer, the higher the priority. 1 has a higher priority than 3, which has a higher priority than 8, which has a higher priority then 53, and so on.

For all data, a simple < (less than) comparison is sufficient to determine what has higher priority. For integers, 1 < 3. For the Food class, as an example, a Food object with the name "Butter Chicken" has a higher priority than a Food object with the name "Lasagna" because if we compare those two Food objects, the string "Butter Chicken" is less than "Lasagna". (And that's just because a string starting with "B" comes before a string starting with "L".) We don't have to do anything special just because the data is a Food object. (Note that we are ignoring the rest of the attributes of the Food class for simplicity's sake.)

The emphasis here is that a simple < comparison between data is sufficient. There are no special or external values associated with the data that determine the priority. Just compare the data.

Array-Based Priority Queue

The general notes about array-based implementations for Stacks and Queues also apply to a Priority Queue.

As noted in the discussion of the Priority Queue ADT, an array-based approach to implementing a Priority Queue is best used for an unsorted contents approach. (Note: although it is true that Python has built-in sorting functionality for its list objects, we cannot assume therefore that something magical is happening in the background. Even though we didn't write the code and see the work going on under the hood, that work goes on nonetheless. Therefore, simply saying, Meh, let Python take care of it, is not an acceptable answer.) We will look at the basic idea behind using an array-based solution, then see if we can improve on it.

Implementing the Array-Based Priority Queue

The following code shows how to implement part of the array-based Priority Queue in Python. The rest is left to you as an exercise for labs and assignments.

Initializing an Array-Based Queue

Priority Queue __init__
    
from copy import deepcopy

class Priority_Queue:

    def __init__(self):
        """
        -------------------------------------------------------
        Initializes an empty priority queue.
        Use: pq = Priority_Queue()
        -------------------------------------------------------
        Returns:
            a new Priority_Queue object (Priority_Queue)
        -------------------------------------------------------
        """
        self._values = []
        self._first = None
        return

  

So far, the Priority Queue initialization is very simple: the Priority Queue data is held in a Python list attribute named _values, and that attribute is initialized as an empty list.

Inserting Items Into an Array-Based Queue

Priority Queue insert
    
    def insert(self, value):
        """
        -------------------------------------------------------
        A copy of value is appended to the end of the priority queue
        Python list, and _first is updated as appropriate to the index of
        value with the highest priority.
        Use: pq.insert(value)
        -------------------------------------------------------
        Parameters:
            value - a data element (?)
        Returns:
            None
        -------------------------------------------------------
        """
        self._values.append(deepcopy(value))

        # Determine whether new value has highest priority.

        # your code here

        return

  

insert simply uses the Python list method append to add value to the end of the _values list. If the new values has a higher priority than the current value indexed by _first, then _first must be updated to index this new value. Since the new value was appended to the list, _first must be updated to the length of _values-1.

_first tracks the index of the value in _values with the highest priority. It is initialized to None when the Priority Queue is first constructed in order to make clear there are no values in the Priority Queue at all, and thus no value of highest priority. _first may be updated whenever a new value is appended to _values, and must be updated whenever the highest priority value is removed from _values.

To emphasize: _first is not the data with the highest priority. The _first attribute is just an index, i.e. it is always an integer that tells us where in _values the data with the highest priority lives.

As an example, Here is some simple string data stored in a Priority Queue. The top row contains the string, the second row contains the indexes:

_values wombat chicken zebra ptarmigan
indexes 0 1 2 3
_first 1

_first is 1 because that is the index of the string with the highest priority - "chicken" is the minimum of the string objects in _values.

We now append "hippo" to the end of _values:

_values wombat chicken zebra ptarmigan hippo
indexes 0 1 2 3 4
_first 1

_first is unchanged because "chicken" still has a higher priority than "hippo" (again, alphabetically "chicken" comes before "hippo".)

We now append "aardvark" to the end of _values:

_values wombat chicken zebra ptarmigan hippo aardvark
indexes 0 1 2 3 4 5
_first 5

_first is updated to 5, because "aardvark" has a higher priority than "chicken", i.e. "aardvark" < "chicken".

Note that we did not have to compare "aardvark" to all of the strings in _values, we only had to compare it to "chicken", i.e. the data at index _first.

Just to be 100% clear, assuming that _first is not None (as it is when the Priority Queue is empty), we reference the data with the highest priority in the Priority Queue with self._values[self._first]

Removing Items from an Array-Based Priority Queue

Priority Queue remove
    
    def remove(self):
        """
        -------------------------------------------------------
        Removes and returns the highest priority value from the priority queue.
        Use: value = pq.remove()
        -------------------------------------------------------
        Returns:
            value - the highest priority value in the priority queue -
                the value is removed from the priority queue. (?)
        -------------------------------------------------------
        """
        assert len(self._values) > 0, \
            "Cannot remove from an empty priority queue"

        # your code here

        return value

  

To remove the item with the highest priority from the _values list, use the Python list method pop to remove the value at _first. Then search the remaining list for the value with the highest priority, and update _first to index it.

When data is removed from the Priority Queue you must update the value of _first by finding the index of the minimum data in _values. That index becomes the new value of _first. Finding the minimum value in a Python list is a CP104 task and we will not go into greater detail on that.

This diagram displays the _values and _first of the array-based Priority Queue. Use and to see how it works:

The Priority Queue

Peeking at a Priority Queue

Why go to the trouble of tracking the index of the highest priority value with _first?

When peeking at a Priority Queue, the method should return to the user a copy of the value with the highest priority. This implies, that as with the remove method, the highest priority value must first be found. A naive algorithm would look for the highest priority value every time it was requested. With remove there is no choice but to do so, for once a value is removed from a Priority Queue, at some point there must be a search for the next highest priority value. However, this is clearly very inefficient if every time peek is executed on a Priority Queue the highest priority value must be searched for. To use the hospital metaphor again, it is as though a doctor asks the Emergency Room nurse to tell them who the next patient would be, and the nurse, having a terrible memory, has to triage all the patients in the room every time the doctor asks. The equivalent of using _first is instead having the nurse note who the next patient would be so that she can tell the doctor immediately without having to perform the triage all over again. And when a new patient arrives, the nurse has to compare their condition only against the current highest priority patient to determine if the new patient has a higher priority.

This non-naive algorithm leads to:

Implementing this improved Priority Queue is an exercise for you.

Summary

In this lesson, we introduced the Priority Queue ADT and discussed different approaches to implement the Priority Queue ADT using arrays. We discussed how a Priority Queue can be implemented using Python lists and how Priority Queues can be used to solve problems. Later we in the course we will talk about another implementation of the Priority Queue ADT using a linked structure.

References

  1. Data Structures and Algorithm Analysis in C++, Mark Allen Weiss. 2006.