In this lesson, we are going to introduce linked-list data structure. Topics include:
By the end of this lesson, you should be able to:
In this lesson, we discuss linked-data structures where the data elements are placed in different places in memory and are linked together. Specifically, we are going to introduce and discuss in detail the linked Stack data structure.
So far we have worked with array-based data structures. Our Stack, Queue, and List ADTs have been implemented with the arrays or Python lists that allows us to access individual elements with an index value. Linked data structures are based upon nodes. A node typically consists of two parts:
Figure 1 shows a node made up of the elements:
_value
_next
_next
part of the previous node contains the address
of the next node.
Figure 2 shows a series of nodes linked together. The elements of the Figure are:
v
_value
, or data, part of a node, as noted in Figure
1.
n
_next
, or link, part of a node, as noted in Figure 1.
a
(a)
_front
, or _top
, or
something else appropriate to the data structure. We have to know
where the first node of the data structure resides or we will lose
track of it.
None
.
If there is no data, there is no node.
None
Depending on the number of links in a node and the way nodes point to each other, we may have a linear linked structure or a non-linear linked data structure. In this course, we will be talking about linear linked data structures such as linked lists and non-linear linked data structures such as trees.
In the next section we will look specifically at the linked implementation of the Stack.
A linked Stack is a linear linked data structure, as each node in a Stack links to one other node. We must keep track of the first node in a Stack, and call it the top node.
Given a node:
how can we construct a Stack from a linked series of these nodes?
Figure 2 displays the workings of the linked Stack. Use and to see how it works:
_Stack_Node
Class
A Stack node needs nothing beyond the basic _value
and _next
attributes already discussed. There is no use for an empty node. The _Stack_Node
class defines this node and a constructor to create a node. There are no
other methods in the _Stack_Node
class as the Stack
class takes care of manipulating the nodes. Thus the _Stack_Node
class code is very simple:
_Stack_Node
Class
class _Stack_Node:
def __init__(self, value, next_):
"""
-------------------------------------------------------
Initializes a stack node that contains a copy of value
and a link to the next node in the stack.
Use: node = _Stack_Node(value, next_)
-------------------------------------------------------
Parameters:
value - value for node (?)
next_ - another Stack node (_Stack_Node)
Returns:
a new _Stack_Node object (_Stack_Node)
-------------------------------------------------------
"""
self._value = deepcopy(value)
self._next = next_
Note that the name of the _Stack_Node
class is prefixed
with an underscore (_
). This means that the entire class is
private. Its definition must be included in the same source code file as
the Stack
class. _Stack_Node
should never be
referenced by an import
statement because nothing other
than Stack
should ever access it.
Values are deepcopied into a node for exactly the same reasons they are copied into the array-based Stack list, i.e. to isolate the values from the outside world.
The parameter next_
is named with a trailing _
(underscore) because next
is a Python keyword, and we
shouldn't use Python keywords as attribute or variable names.
A linked Stack is initialized with:
__init__
Method
class Stack:
def __init__(self):
"""
-------------------------------------------------------
Initializes an empty stack. Values are stored in a
linked structure.
Use: stack = Stack()
-------------------------------------------------------
Returns:
a new Stack object (Stack)
-------------------------------------------------------
"""
self._top = None
The Stack initialization is simple: the private attribute _top
links to the first _Stack_Node
in the stack. Because the
stack is empty, it contains no _Stack_Node
objects, thus _top
is initialized to None
. The real work comes in pushing a
node onto or popping a node from a stack. (Defining the is_empty
method is trivial and left to you as an exercise.)
push
Method
Pushing a value onto a linked Stack means creating a new node, adding the value to it, and making this new node the top of the Stack.
push
Method
def push(self, value):
"""
-------------------------------------------------------
Pushes a copy of value onto the top of the stack.
Use: stack.push(value)
-------------------------------------------------------
Parameters:
value - value to be added to stack (?)
Returns:
None
-------------------------------------------------------
"""
self._top = _Stack_Node(value, self._top)
return
push
creates a new _Stack_Node
object and adds
a value to it. The _next
link of the _Stack_Node
object is set to point to the 'old' top of the stack. When a Stack is
empty, the value of _top
is None
, otherwise it
links to whatever _Stack_Node
is currently the top node.
The 'new' top of the stack is linked to this new node. This is analogous
to updating a sum
variable by incrementing its previous
value with sum = sum + n
, which we have done
many times.
Note that there are no special cases to consider when pushing a new value onto the stack. The new node always becomes the top node whether the stack is empty or not.
pop
Method
Popping a value from a linked stack means extracting the value from the
top node then removing that node. The stack's _top
pointer
must now point to the next node in the stack. This is easy: get the
current top node's link to its next node by referring to its _next
attribute, then making that node the new top. This is the reverse of push
.
pop
Method
def pop(self):
"""
-------------------------------------------------------
Pops and returns the top of stack. The value is removed
from the stack. Attempting to pop from an empty stack
throws an exception.
Use: value = stack.pop()
-------------------------------------------------------
Returns:
value - the value at the top of stack (?)
-------------------------------------------------------
"""
assert self._top is not None, "Cannot pop from an empty stack"
value = self._top._value
self._top = self._top._next
return value
combine
Method
You may have been asked to write methods or functions to combine two
Stacks into one. The function version stack_combine
uses
the ADT methods to combine the Stacks. Because the ADT methods look
identical from the outside for both the array-based and linked
implementations of the Stack, that function can use the linked Stack
representation without change - you would simply import
the
linked implementation of the Stack rather than the array-based
implementation.
The method that is defined as part of the Stack class extends
the Stack ADT. Although the method signatures (the def
lines of the methods) are identical, the implementation details of
between the two Stack implementations (array-based and linked) are
radically different. However, their general algorithms is not. It
requires looping through the two source Stacks moving data to the target
Stack until one or the other of the source Stacks is exhausted. Then the
algorithm loops through the remaining source Stack (if any) moving data
to the target Stack until the remaining source Stack is exhausted:
combine
def combine(self, source1, source2):
"""
-------------------------------------------------------
Combines two source Stacks into the current target Stack.
When finished, the contents of source1 and source2 are
interlaced into target and source1 and source2 are empty.
Use: target.combine(source1, source2)
-------------------------------------------------------
Parameters:
source1 - an array-based Stack (Stack)
source2 - an array-based Stack (Stack)
Returns:
None
-------------------------------------------------------
"""
while len(source1._values) > 0 and len(source2._values) > 0:
self._values.append(source1._values.pop())
self._values.append(source2._values.pop())
while len(source1._values) > 0:
self._values.append(source1._values.pop())
while len(source2._values) > 0:
self._values.append(source2._values.pop())
return
The key things to understand about this algorithm:
_values
directly and
does not call any high level methods such as is_empty
.
deepcopy
is not
required because the data only lives in one Stack at a time.
combine
Method
The linked version of the algorithm is implemented as:
combine
def combine(self, source1, source2):
"""
-------------------------------------------------------
Combines two source Stacks into the current target Stack.
When finished, the contents of source1 and source2 are interlaced
into target and source1 and source2 are empty.
Use: target.combine(source1, source2)
-------------------------------------------------------
Parameters:
source1 - a linked Stack (Stack)
source2 - a linked Stack (Stack)
Returns:
None
-------------------------------------------------------
"""
while source1._top is not None and source2._top is not None:
self._move_top_to_top(source1)
self._move_top_to_top(source2)
while source1._top is not None:
self._move_top_to_top(source1)
while source2._top is not None:
self._move_top_to_top(source2)
return
The key things to understand about this algorithm:
_top
directly and does
not call any high level methods such as is_empty
.
_move_top_to_top
.
Note that we are going to the trouble of moving the top nodes of each source Stack to the target Stack one at a time, because we are alternating the contents of the source Stacks into the target Stack.
How, then, does _move_top_to_top
actually shift nodes from
one Stack to another? It performs the equivalent of self.push(source.pop())
,
but without copying that data that calling these methods would involve.
_move_top_to_top
def _move_top_to_top(self, source):
"""
-------------------------------------------------------
Moves the top node from the source Stack to the target Stack.
The target Stack contains the old top node of the source Stack.
The source Stack top is updated. Equivalent of
self.push(source.pop()), but moves nodes, not data.
Use: target._move_top(source)
-------------------------------------------------------
Parameters:
source - a linked Stack (Stack)
Returns:
None
-------------------------------------------------------
"""
assert source._top is not None, "Cannot move the top of an empty Stack"
temp = source._top
source._top = source._top._next
temp._next = self._top
self._top = temp
return
Note that the reference to target is generic and refers to how
a Stack variable might be referenced in a program - within the actual
helper method self
is the target Stack.
The assert
statement makes sure we don't attempt to move a
node from an empty Stack, which makes no sense in any case.
The key difference between the two approaches is that in the array-based
approach the actual node data is moved/copied from one memory location
to another. In the linked version it is the node connections that
are changed and the data is left where it is in memory. This is a
significant improvement over the work done by the array-based
implementation, not in terms of Big-O, which is the same, but in terms
of work done because copying data is more more expensive than changing
addresses. Remember that the _top
attribute simply stores
the address of the first node of a Stack. This address is just an
integer, and updating the value of an integer is significantly less work
than copying a chunk of Student, or Food, or Movie data.
How, then, do we move nodes? What does this look like? The actual work
is done by the following lines in the _move_top_to_top
above contains:
temp = source._top
source._top = source._top._next
temp._next = self._top
self._top = temp
Use and to see how this works:
While moving a Stack top, both temp._next
and source._top
refer to the same node. Remember that we can have multiple references to
an object in memory.
The order in which these lines are executed is key to moving a node
correctly. We must make sure that at all times the node we are moving
has a reference, and that the next nodes are referenced as well. Without
the temp
variable we would lose access to one of these
nodes.
Once the top is moved, we no longer care what happens to temp
.
In the combine algorithm, it is used repeatedly to move nodes, and when
the method ends, it disappears, leaving the combined Stack behind.
Moving nodes between linked data structures is a critical concept in understanding the efficient use of linked data structures. We will look at moving nodes in other data structures in later Lessons.
is
and None
None
The Stack
class definition use None
to mark
the end of the Stack, and our sample code looks for this None
to determine when the end of a Stack is reached. The code uses the is
operator rather than ==
to compare to None
.
Why?
The is
operator is Python's way of allowing you to ask the
question, Are these two objects the same object?
, which is not
the same as asking whether the contents of two objects are equal. It is
particularly important to use when working with None
. Here
are some examples:
None
Examples
f1 = Food('Ravioli', 7, False, 246)
f2 = Food('Ravioli', 7, False, 246)
f3 = f1
x = None
y = None
print(f"f1 == f2: {f1 == f2}")
print(f"f1 is f2: {f1 is f2}")
print(f"f1 is None: {f1 is None}")
print(f"f1 is f3: {f1 is f3}")
print(f"x is y: {x is y}")
print(f"f1 == None: {f1 == None}")
and the results are:
f1 == f2: True f1 is f2: False f1 is None: False f1 is f3: True x is y: True Traceback (most recent call last): File "test.py", line 38, in <module> print("f1 == None: {}".format(f1 == None)) File "Food.py", line 105, in __eq__ result = (self.name.lower(), self.origin) == ( target.name.lower(), target.origin) AttributeError: 'NoneType' object has no attribute 'name'
The first thing to understand is that there is one and only one None
object. Every reference to None
is to the same None
.
Thus None
is a very useful placeholder that we will use a
lot in linked data structures. It is a way of clearly
illustrating that whatever we are referring to has no value.
Code Listing 1 illustrates that:
==
) between objects is determined by the
definition of __eq__
.
is
are not the same thing.
None
is None
no matter what variable name
it is assigned to.==
to compare to None
doesn't work
with complex classes. (It can work with simple things like numbers and
strings.)
The rule that arises out of this: always use the is
operator when comparing against None
. Even if equivalency
works with simple items, consistency is almost always a better approach.
In this Lesson, we introduced linked data structures including nodes and
how nodes contain data and are linked to other nodes to form a data
structure. We discussed the linked Stack in particular, and provided
code to demonstrate how a linked data structure can be initialized and
updated. We examined how to move nodes between Stacks without copying
data. We look at the role of None
in linked data
structures.