By the end of this lesson, you should be able to:
In this Lesson, we discuss in detail the linked Priority Queue data structure and more approaches to manipulating linked data structure nodes.
Becoming familiar with linked data structures can be difficult. This document takes you through the implementation of the Priority Queue insertion method as an example of how to think about a linked data structure.
Unlike the array-based Priority Queue, the linked Priority Queue stores its values in priority order. Keeping track of the node with the highest priority value as was done in the array-based Priority Queue does not work well in the linked Priority Queue, because what we would really need to track is the node previous to the node with the highest priority value in order to be able to remove that node and update the links properly. Storing the values in priority order means that the node with the highest priority is always at the front, and therefore easily removed, just like in the linked Queue. However, this means that inserting a new value into a linked Priority Queue means that we must insert it into the correct sorted position in the linked list. To insert a new node, then, we must first determine where it goes, then connect that new node with two other nodes - the preceding node, and the node that follows.
_PQ_Node
Class
This a simple singly-linked list node that we have seen many times.
class _PQ_Node:
def __init__(self, value, _next):
"""
-------------------------------------------------------
Initializes a Priority Queue node that contains a copy of value
and a link to the next node in the Priority Queue
Use: node = _PQ_Node(value, _next)
-------------------------------------------------------
Parameters:
value - value value for node (?)
_next - another Priority Queue node (_PQ_Node)
Returns:
a new Priority_Queue object (_PQ_Node)
-------------------------------------------------------
"""
self._value = copy.deepcopy(value)
self._next = _next
Priority_Queue
Class
__init__
Method
The Priority_Queue
class constructor is also quite simple
and very similar to the linked Queue class definition:
class Priority_Queue:
def __init__(self):
"""
-------------------------------------------------------
Initializes an empty Priority Queue.
Use: pq = Priority_Queue()
-------------------------------------------------------
Returns:
a new Priority_Queue object (Priority_Queue)
-------------------------------------------------------
"""
self._front = None
self._rear = None
self._count = 0
The attributes in the Priority Queue class constructor should be familiar:
_front
None
as the Priority
Queue starts out with no data.
_rear
None
as the Priority
Queue starts out with no data.
_count
When we work with a linked Priority Queue these are the only attributes
available to us. The _front
attribute is key to our
manipulation of the linked data structure in the priority queue. _front
tells us where the Priority Queue data is stored and how it is arranged.
Defining the insert
method deserves a section of its own,
and is covered next.
insert
The Priority Queue insertion method (insert
) adds one data
element (value
) to the Priority Queue. It must place value
in the proper spot in the linked data structure. The data structure is
ordered from highest priority to lowest priority, and by order of
insertion within priority (i.e. insertion is stable).
The first step is writing insert
is to clearly understand
its purpose and parameters. The method comments and skeleton are:
def insert(self, value):
"""
-------------------------------------------------------
A copy of value is inserted into the Priority Queue.
Values are stored in priority order.
Use: pq.insert(value)
-------------------------------------------------------
Parameters:
value - a data element (?)
Returns:
None
-------------------------------------------------------
"""
...
return
We now know: we have to work with a Priority Queue; a single piece of data; and the method does not have to return anything. Thus all changes to the Priority Queue are done internally.
We don't have to detail the insertion process in order to understand what a typical Priority Queue linked data structure looks like. We know that in the linked version the Priority Queue stores its data in priority order. In a simple example, we have the integer values, 1, 3, 5, and 7, stored in the Priority Queue. The smaller the value the higher the priority. The resulting linked data structure can be shown like this:
Given a new value we have to insert it into its proper position in the linked data structure. Depending on the priority of the data its new node could be at the front, rear, or interior of the linked data structure. We may (or may not) have to deal with each of these situations separately.
Assume for a moment then that we wish to insert the value 6 into the typical list above. This new node must go in between and be linked to the nodes containing 5 and 7.
To insert this new node, then, we must do the following:
Figure 2 shows how we can traverse the linked structure to find where
the new node should go. This is similar to the linked
traversal discussed earlier, but with the addition of an extra
link, prev
to be discussed shortly:
Each of these steps can be dealt with separately. First, we must find
where to put the new node. In general, a new value may end up as the
only node in the Priority Queue, it may end up at the beginning of the
Priority Queue, it may end up at the end of the Priority Queue, or it
may end up elsewhere in the Priority Queue. In general, if one of the
data structure links (in this case, _front
or _rear
)
have to be updated, it is a special case. Dealing with the special cases
first (only, first, or last node) usually involves simpler code than the
code necessary to place a new node in the middle of a Priority Queue.
Given that, we start with the special cases.
if self._front is None:
node = _PQ_Node(value, None)
self._front = node
self._rear = node
self._count += 1
The Priority Queue is empty when self._front
is None
,
self._rear
is None
, or self._count
is zero. Pick one of these approaches and use it consistently. In this
example we see if self._front
is None
. We end
up with a new node pointed to by both the front and rear links. Checking
for an empty Priority Queue is normally done first because that gets rid
of any worries in the rest of the cases as to whether there are nodes in
the Priority Queue.
Our new value has a higher priority than any other value in the Priority Queue, and thus must be added to the front of the Priority Queue.
elif value < self._front._value:
self._front = _PQ_Node(value, self._front)
self._count += 1
We know that there is at least one value in the Priority Queue, so we are safe to compare our new value against the value at the front of the Priority Queue. The new node created becomes the new first node of the Priority Queue.
Our new value has a lower priority than any other value in the Priority Queue, and thus must be added to the rear of the Priority Queue.
elif value >= self._rear._value:
node = _PQ_Node(value, None)
self._rear._next = node
self._rear = node
self._count += 1
Again, we know that there is at least one value in the Priority Queue,
so we are safe to compare our new value against the value at the rear of
the Priority Queue. The new node created becomes the new rear node of
the Priority Queue. First, we have to update the _next
link
of the old rear node, then update the rear node itself.
Clearly we must move through the list either iteratively (looping) or recursively to find where the new node belongs. We will use an iterative algorithm in this example similar to that of the linked traversal discussed earlier.
To write the loop we need to know where the loop starts and when it should stop. We know the loop must start at the front of the list because that is the only link to the list that we have. We will therefore set an initial condition based upon the front of the list:
curr
curr = self._front
We use the variable name curr
(current) for two
reasons - first, because it is a commonly used name in dealing with
linked data structures, and second because it is a good name to use when
referencing the 'current' node in the linked data structure that we are
looking at.
There may be other initial conditions, but we will see what they are later. Now let's look at the looping conditions. Experience in working with both array and linked structures tell us that we should stop moving through the list when we reach the end. We can now write:
curr = self._front
while curr is not None:
We also know that we need to stop when we reach the correct position in the list. We reach that position when our new value is less than the current value in the list. i.e. we keep looping as long as our new value is greater than or equal to the value in the current node. (We put the 'or equal to' in so that we have a stable insert.) The loop with the comparison now looks like:
curr = self._front
while value >= curr._value:
Now we have to put these two looping conditions together. Does the order
matter? Sometimes, no. Here, the order is very important. If curr
is None
then it cannot have a _value
element,
and any attempt to reference that element causes the program to die.
Thus it is key to check whether curr
actually points to a
node before we attempt to compare a value in that node. Our loop
becomes:
curr = self._front
while curr is not None and value >= curr._value:
To complete the loop we must update the value of curr
within the body of the loop or the loop never moves past the first node
(if there is one). curr
must be updated to point to the
next node in the linked data structure. The next node is pointed to by curr
's
_next
attribute:
curr
curr = self._front
while curr is not None and value >= curr._value:
curr = curr._next
The loop got us to the point that our curr
link references
the node containing the value that should follow our new value.
We can now create our new node:
node = _PQ_Node(value, curr)
So far, so good, but now we have a problem - we cannot connect the node
that contains the value 5 to the new node because we have lost track of
where that previous node is. We need an extra variable in our loop to
keep track of this previous node - a good (and obvious!) name for this
previous node is prev
. It must be initialized before the
loop start. We initialize it to None
because there is no
node previous to the front of the linked data structure. Because it is
designed to keep track of the node we just looked at, prev
should be updated to the old value of curr
on each
execution of the loop. Thus:
prev
to the Loop
prev = None
curr = self._front
while curr is not None and value >= curr._value:
prev = curr
curr = curr._next
This not only gets us to the correct spot in the list, but it allows us to update both links to the new node. The following code updates these links:
prev = None
curr = self._front
while curr is not None and value >= curr._value:
prev = curr
curr = curr._next
node = _PQ_Node(value, curr)
prev._next = node
self._count += 1
Why not use a for
loop or a loop condition based upon the
size of the linked data structure to loop through the nodes? After all,
_count
tells us how many nodes are in the linked data
structure.
for
allows for only one condition and loops all the
way to the end of the linked data structure. Given that we know there
are two reasons to stop looping, the for
loop is not much
use.
curr
and prev
anyways!
Having an extra variable is pointless when curr
can tell
us when we reach the end of the linked data structure just as well.
Strictly speaking, this loop does not need the case curr is not
None
, because we've already dealt with the case where our new
value is added to the end of the list. The loop therefore should never
reach None
.
Note that the Priority Queue count is updated under every condition, so let's move that outside of the condition structure altogether.
In the end we have the following method:
insert
Method
def insert(self, value):
"""
-------------------------------------------------------
A copy of value is inserted into the Priority Queue.
Values are stored in priority order.
Use: pq.insert(value)
-------------------------------------------------------
Parameters:
value - a data element (?)
Returns:
None
-------------------------------------------------------
"""
if self._front is None:
# Priority Queue is empty
node = _PQ_Node(value, None)
self._front = node
self._rear = node
elif value < self._front._value:
# New value has highest priority
node = _PQ_Node(value, self._front)
self._front = node
elif value >= self._rear._value:
# New values has lowest priority
node = _PQ_Node(value, None)
self._rear._next = node
self._rear = node
else:
# Find the proper position for value.
prev = None
curr = self._front
while value >= curr._value:
prev = curr
curr = curr._next
# Create the new node and link it to curr.
node = _PQ_Node(value, curr)
# The previous node is linked to the new node.
prev._next = node
# Increment the Priority Queue size.
self._count += 1
return
A similar, build the method piece-by-piece approach, can be used for any method for any of our data structures.
remove
Implementing the remove
method is left to you as an
exercise.
In this Lesson, we introduced the linked Priority Queue data structure and went into great detail of the thinking that goes into doing a sorted insertion into a linked data structure.