ColdFusion 10 - XmlSearch() And XmlTransform() Now Support XPath 2.0

By Ben Nadel

Published 2012-02-28 in ColdFusion — Comments (5)

In today's world, we don't often work with XML; the majority of data exchange is done using JavaScript Object Notation (JSON). Even APIs that support both XML and JSON seem to be dropping XML support in their roadmap (I know this from personal experience). That said, XML is still a data type that will inevitably be a part of our lives for some time. That's why it's actually kind of exciting that ColdFusion 10 now supports XPath 2.0 in the xmlSearch() and xmlTransform() functions.

NOTE: At the time of this writing, ColdFusion 10 was in public beta.

I don't pretend to be an expert on XPath or XSLT (Extensible Stylesheet Language Transformations); so, rather than try to explain the differences between the versions of XPath, I figured I would just demonstrate some of the functionality that is now available in ColdFusion 10. In the following code, I'm creating a simple XML document and then using xmlSearch() to gather various nodes. I try to explain what's going on in the comments.

  
	<!---
		Create an XML document on which to test new XPath 2.0
		functionality support.
	--->
	<cfxml variable="bookData">

		<books>
			<book id="101" rating="4.5">
				<title>Muscle: Confessions of an Unlikely Bodybuilder</title>
				<author>Samuel W. Fussell</author>
				<published>August 1, 1992</published>
				<isbn>0380717638</isbn>
			</book>
			<book id="201" rating="4">
				<title>The Fountainhead</title>
				<author>Ayn Rand</author>
				<published>November 1, 1994</published>
				<isbn>0452273331</isbn>
			</book>
			<book id="301" rating="4.5">
				<title>It Was On Fire When I Lay Down On It</title>
				<author>Robert Fulghum</author>
				<isbn>0804105820</isbn>
			</book>
		</books>

	</cfxml>

	<!--- Groovy - now let's execute some XML Path queries. --->
	<cfscript>

		// Get all of the ratings that are greater than 4.0.
		results = xmlSearch(
			bookData,
			"//book/@rating[ number( . ) > 4.0 ]"
		);

		// Get the average rating of the reviews.
		results = xmlSearch(
			bookData,
			"avg( //book/@rating )"
		);

		// Get a compound result of the Title and Author nodes. Notice
		// that we can now create divergent results in the SAME path.
		// We don't need to create two completely different paths.
		results = xmlSearch(
			bookData,
			"//book/( title, author )"
		);

		// Get all of the book's children EXCEPT for the ISBN number.
		// XPath 2.0 introduces some interesting operators like "except",
		// "every", "some", etc.
		results = xmlSearch(
			bookData,
			"//book/( * except isbn )"
		);

		// XPath 2.0 now uses sequences instead of node-sets, which allow
		// for more interesting data combinations. This only gets the
		// nodes from one collection that are NOT in the other collection.
		// We're using inline branching and merging!
		results = xmlSearch(
			bookData,
			"//book/( (title, published) except (isbn, published) )"
		);

		// Get all of the ISBN numbers that use a 10-digit ISBN. XPath
		// 2.0 now supports regular expression functions like matches(),
		// replace(), and tokenize() -- though it is quirky and a
		// bit limited in patterns.
		results = xmlSearch(
			bookData,
			"//book/isbn[ matches( text(), '^\d{10}$' ) ]"
		);

		// Iterate over one collection and map it onto the resultant
		// collection. We can now iterate inline within a path.
		results = xmlSearch(
			bookData,
			"for $b in (//book) return ( $b/published )"
		);

		// We can now pass params into our xmlSearch() calls. Notice
		// that the key, "title", is quoted - that is because XPath is
		// case-sensitive.
		results = xmlSearch(
			bookData,
			"//book/title[ . = $title ]",
			{
				"title": "The Fountainhead"
			}
		);

		// Get the given book, no matter what the casing. FINALLY, we
		// can do case-insensitive searching in XML :)
		results = xmlSearch(
			bookData,
			"//book[ upper-case( title ) = 'THE FOUNTAINHEAD' ]"
		);

		// Debug the results.
		writeDump( results );

	</cfscript>


From what I've read about the functionality in XPath 2.0, the biggest upgrades seem to be the use of sequences over node-sets and the use of inline path branching and logic. At a very practical level, XPath 2.0 simply supports more functions like lower-case() and upper-case() for case-insensitive matching - something many people have asked for in previous versions of ColdFusion.
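As a rough sketch of that case-insensitive matching, the following normalizes both sides of the comparison with lower-case(), so neither the stored title nor the incoming value needs a particular casing. The trimmed-down document and the "needle" param name are assumptions made for this illustration; only xmlParse(), xmlSearch(), and lower-case() are relied on here, and I haven't tested it outside of the ColdFusion 10 behavior described above.

```cfml
<cfscript>

	// Hypothetical, trimmed-down document used only for this sketch.
	sampleData = xmlParse(
		"<books><book><title>The Fountainhead</title></book></books>"
	);

	// Lower-case both the node value and the passed-in param so that
	// the comparison ignores casing on either side. The key "needle"
	// is an assumed name; it is quoted to preserve its casing.
	results = xmlSearch(
		sampleData,
		"//book[ lower-case( title ) = lower-case( $needle ) ]",
		{
			"needle": "tHe FoUnTaInHeAd"
		}
	);

	// Debug the results.
	writeDump( results );

</cfscript>
```

Passing the search value in as a param, rather than concatenating it into the path, also avoids having to worry about quotes inside the value.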

Oh, and XPath 2.0 now supports Regular Expression matching as well - yeah boyyyyyy!

Well, that's probably about as much excitement as I can squeeze out of searching XML documents in ColdFusion 10. That is, of course, until you realize that ColdFusion 10 can now parse HTML... but more to come on that shortly.


Short link: https://bennadel.com/2340

Reader Comments

Ben Nadel Feb 28, 2012 at 9:58 AM

15,983 Comments

@All,

And here's part of why XML is getting more exciting in ColdFusion 10 - we can now "easily" convert dirty HTML into valid XML documents:

www.bennadel.com/blog/2341-ColdFusion-10-Parsing-Dirty-HTML-Into-Valid-XML-Documents.htm

Due to the JAR files that now ship with ColdFusion 10 (i.e., TagSoup), we now have built-in Java classes that facilitate this kind of parsing.

Steve Bryant Feb 28, 2012 at 9:59 AM

59 Comments

It all looked good until you added the part about regular expression support. That really put it over the edge to greatness!

Ben Nadel Feb 28, 2012 at 10:04 AM

15,983 Comments

@Steve,

Heck yeah! Regular expressions are always groovy :) Unfortunately, it looks like the "\b" word-boundary construct is not supported, which I only realized because it was the first thing I tried. They have a slightly different notation for some things, which I haven't gone through yet.

http://www.w3.org/TR/xmlschema-2/#regexs

But, good to know that it's there.

Steve Bryant Feb 28, 2012 at 4:05 PM

59 Comments

That is a disappointing omission. Still, I guess some regular expression support is a major improvement over no regular expression support at all.

ahmed Dec 4, 2013 at 7:53 AM

1 Comments

Please provide an explanation of XML transform in ColdFusion.

