Depth-First vs Breadth-First Tree Traversal Using A While-Loop In Lucee CFML 5.3.3.62
As part of my post yesterday on how I sometimes replace recursive algorithms with loop-based algorithms in Lucee CFML, I mentioned that the algorithm incidentally shifted from a depth-first strategy to a breadth-first strategy. In a follow-up conversation that I had about the post with fellow InVisioneer, Kevin Conway, he mentioned that, if I wanted to, I could maintain a depth-first strategy by changing the way that I am queuing-up values for subsequent inspection. Given that this fact was not obvious to me, I wanted to quickly look at how the queuing approach changes the order of traversal. To explore this, I'm going to walk the nodes of an XML document in Lucee CFML 5.3.3.62.
In the following demo, I have a function, walkTree()
, which takes an XML document and a traversal strategy: depth-first
or breadth-first
. The walkTree()
function then iterates over the nodes in the XML document using the given strategy and outputting the name of each XML node as it is visited.
Just as with yesterday's post, the traversal is powered by an Array of values and a while
loop that keeps pull values off the front of the array until all the values have been exhausted. What sets this demo apart is how new values are added to the array as they are discovered. When using a depth-first strategy, new values are prepended to the array; and, when using a breadth-first strategy, new values are appended to the array:
<cfscript>
// Setup our Tree for demonstration purposes. Each node in this tree is named based
// on its depth and order within the tree structure.
xmlDom = xmlParse("
<a1>
<b1>
<c1>
<d1 />
</c1>
<c2>
<d2 />
<d3 />
<d4 />
</c2>
<c3 />
</b1>
<b2>
<c4 />
<c5>
<d5>
<e1 />
</d5>
</c5>
<c6>
<d6>
<e2 />
</d6>
</c6>
</b2>
<b3>
<c7 />
</b3>
</a1>
");
dump( xmlDom );
walkTree( xmlDom, "depth-first" );
walkTree( xmlDom, "breadth-first" );
// ---------------------------------------------------------------------------- //
// ---------------------------------------------------------------------------- //
/**
* I traverse the given XML tree using the given traversal strategy, outputting the
* name of each node as it is visited.
*
* @xmlDom I am the XML tree being traversed.
* @strategy I am the traversal strategy to use.
*/
public void function walkTree(
required xml xmlDom,
required string strategy
) {
echo( "<h2> Traversal Strategy: #strategy# </h2>" );
echo( "<p> Visited: " );
// Starting at the root node of the XML Tree, the traversal of the tree will be
// powered by this queue of nodes to visit. As we examine a given node, the
// children of said node will be added to this queue for subsequent inspection.
// The manner in which they are added will determine if the traversal is depth-
// first or breadth-first.
var nodesToVisit = [ xmlDom.xmlRoot ];
// Continue pulling nodes out of the queue until we have visited the entire tree.
while ( nodesToVisit.isDefined( 1 ) ) {
// SHIFT the next node off of the queue.
var node = nodesToVisit.first();
nodesToVisit.deleteAt( 1 );
// Output the node name of the current node.
echo( "#node.xmlName# " );
// If we're using a breadth-first strategy, all newly-discovered nodes need
// to be APPENDED to the END of the queue so that each layer of nodes can be
// fully visited before we get to the nodes at a lower depth.
if ( strategy == "breadth-first" ) {
nodesToVisit.append( node.xmlChildren, true );
// If we're using a depth-first strategy, all newly-discovered nodes need to
// be PREPENDED to the FRONT of the queue so that the first child of each
// node becomes the next node to be explored.
} else if ( strategy == "depth-first" ) {
nodesToVisit.prepend( node.xmlChildren.reverse(), true );
} else {
throw( type = "UnsupportedStrategy" );
}
}
echo( "</p>" );
}
</cfscript>
As you can see from the logic within the while
loop, if we're using a depth-first strategy, we use nodesToVisit.prepend()
to track new values; and, when we're using a breadth-first strategy, we use nodesToVisit.append()
to track new values.
NOTE: I have to use
.reverse()
on thexmlChildren
when using.prepend()
because of the way.prepend()
is implemented under the hood. I could have used.merge()
; but, I wanted to stick with.append()
and.prepend()
as I felt like they better illustrate the different strategies.
Now, when we run the ColdFusion code above, we get the following output:
As you can see, when we use the .prepend()
function to add new XML nodes to the front of the queue, we end up with a depth-first traversal of the XML document. And, when we use the .append()
function to add new XML nodes to the end of the queue, we end up with a breadth-first traversal.
At least for me, it wasn't immediately obvious as to why this works. As such, I think it is helpful to see the evolution of the internal nodesToExplore
queue based on the different tree traversal strategies.
Here's what the queue looks like over time with a .prepend()
driven depth-first strategy:
- [ a1 ]
- [ b1, b2, b3 ]
- [ c1, c2, c3, b2, b3 ]
- [ d1, c2, c3, b2, b3 ]
- [ c2, c3, b2, b3 ]
- [ d2, d3, d4, c3, b2, b3 ]
- [ d3, d4, c3, b2, b3 ]
- [ d4, c3, b2, b3 ]
- [ c3, b2, b3 ]
- [ b2, b3 ]
- [ c4, c5, c6, b3 ]
- [ c5, c6, b3 ]
- [ d5, c6, b3 ]
- [ e1, c6, b3 ]
- [ c6, b3 ]
- [ d6, b3 ]
- [ e2, b3 ]
- [ b3 ]
- [ c7 ]
- [ ]
And, here's what the queue looks like over time with an .append()
driven breadth-first strategy:
- [ a1 ]
- [ b1, b2, b3 ]
- [ b2, b3, c1, c2, c3 ]
- [ b3, c1, c2, c3, c4, c5, c6 ]
- [ c1, c2, c3, c4, c5, c6, c7 ]
- [ c2, c3, c4, c5, c6, c7, d1 ]
- [ c3, c4, c5, c6, c7, d1, d2, d3, d4 ]
- [ c4, c5, c6, c7, d1, d2, d3, d4 ]
- [ c5, c6, c7, d1, d2, d3, d4 ]
- [ c6, c7, d1, d2, d3, d4, d5 ]
- [ c7, d1, d2, d3, d4, d5, d6 ]
- [ d1, d2, d3, d4, d5, d6 ]
- [ d2, d3, d4, d5, d6 ]
- [ d3, d4, d5, d6 ]
- [ d4, d5, d6 ]
- [ d5, d6 ]
- [ d6, e1 ]
- [ e1, e2 ]
- [ e2 ]
- [ ]
Seeing the "state machine" of the queue over time helps me understand why it ends up working the way that it does.
I don't do a lot of Tree traversal in my day-to-day ColdFusion programming. But, the Tree traversal wasn't really the point of this post - it was only a context in which we could explore the way that different queue mechanics affect our ability to emulate different styles of recursive data consumption without a deep call stack. And this will definitely be helpful in my Lucee CFML application programming.
Want to use code from this post? Check out the license.
Reader Comments
@All,
On a related note, I just published an article about using a Closure to separate the recursive logic from the consumption logic:
www.bennadel.com/blog/3755-using-a-closure-to-encapsulate-depth-first-tree-traversal-in-lucee-cfml-5-3-3-62.htm
In my earlier examples, the recursion and the consumption were all tightly commingled / coupled together. But, by passing-in a Closure, we can create a reusable algorithm with very little performance overhead.