Converting A Full CSS Selector To XPath Using ColdFusion
Now that we have a ColdFusion user-defined function that converts a single element CSS selector to XPath, we can build on that foundation to convert a full CSS selector to XPath. Really, this is a rather small jump; all we have to do is handle the element delimiters and our previous UDF will take care of the heavy lifting. When it comes to descendant selection in CSS, I am only going to support two different kinds at this time:
- space = Any descendant selector
- > = Direct descendant selector (child)
I know that CSS can handle more than that (depending on the browser), but since we are keeping things simple for now, I am only going to think about these two common types. In terms of XPath syntax, these two relationships are quite easy to map:
- space ==> // (any descendant)
- > ==> / (direct descendant)
So, keeping in mind that I have already defined the CSSElementSelectorToXPath() UDF, I am now defining CSSSelectorToXPath(), which builds on top of it to convert a full CSS selector to an XPath selector:
<cffunction
name="CSSSelectorToXPath"
access="public"
returntype="string"
output="false"
hint="I convert a full CSS selector to XPath (ex. div.header p span).">
<!--- Define arguments. --->
<cfargument
name="Selector"
type="string"
required="true"
hint="I am the full CSS selector."
/>
<!--- Define the local scope. --->
<cfset var LOCAL = {} />
<!--- Remove all extra white space. --->
<cfset LOCAL.Selector = Trim(
REReplace(
ARGUMENTS.Selector,
"\s+",
" ",
"all"
)
) />
<!---
We are going to handle three different kinds of selection
delimiters:
[ ] = descendant
[>] = child
[,] = OR'ing two full selectors together.
Because we have three delimiters that mean different
things, we cannot treat this as a list. Rather, what we
need to do is capture all elements of the selector.
--->
<cfset LOCAL.SelectorParts = REMatch(
"(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)",
LOCAL.Selector
) />
<!--- Create an array of XPath selection parts. --->
<cfset LOCAL.XPathParts = [] />
<!---
Start off by adding an "anywhere" selector to the
XPath parts. This is because our CSS selector might
match anywhere within the XHTML document.
--->
<cfset LOCAL.XPathParts[ 1 ] = "//" />
<!---
Now, let's loop over the parts of the CSS selector and
convert those to their XPath equivalent.
--->
<cfloop
index="LOCAL.SelectorPart"
array="#LOCAL.SelectorParts#">
<!--- Trim this selection part. --->
<cfset LOCAL.SelectorPart = Trim( LOCAL.SelectorPart ) />
<!---
Check to see if we have a direct descendant
delimiter. If so, we simply need to add a slash
to the XPath parts.
--->
<cfif (LOCAL.SelectorPart EQ ">")>
<!--- Add child tag XPath selector. --->
<cfset ArrayAppend(
LOCAL.XPathParts,
"/"
) />
<cfelseif (LOCAL.SelectorPart EQ "")>
<!--- Add descendant XPath selector. --->
<cfset ArrayAppend(
LOCAL.XPathParts,
"//"
) />
<cfelseif (LOCAL.SelectorPart EQ ",")>
<!---
Add OR XPath selector. Because we are beginning a
new selector, prepend the "anywhere" selector.
--->
<cfset ArrayAppend(
LOCAL.XPathParts,
"|//"
) />
<cfelse>
<!---
We have an actual element selector. Convert
this to XPath syntax and add it to the XPath
parts array.
--->
<cfset ArrayAppend(
LOCAL.XPathParts,
CSSElementSelectorToXPath( LOCAL.SelectorPart )
) />
</cfif>
</cfloop>
<!---
Now that we have our XPath parts array, all we need to
do is join it to form our full XPath selection query.
--->
<cfreturn ArrayToList( LOCAL.XPathParts, "" ) />
</cffunction>
As you can see, there is not much going on here: we are basically replacing the delimiters using the above rules and passing off the element translation to our previous UDF. Because CSS selectors don't have an initial context, I am prepending "//" to the final XPath selection. This will allow our XPath selection to make its first match anywhere within the given XHTML document.
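To see the delimiter handling in isolation, here is a minimal sketch of what the REMatch() call captures for one of the selectors used in this post. It uses the same pattern as the UDF; the variable names (pattern, parts, part) are just illustrative and are not part of the UDF itself.

```cfml
<!--- Same pattern that CSSSelectorToXPath() uses to split a selector. --->
<cfset pattern = "(\s*>\s*)|(\s*,\s*)|(\s+)|([^\s,>]+)" />

<!--- Capture every delimiter and every element selector, in order. --->
<cfset parts = REMatch( pattern, "div p.stanza > strong" ) />

<!--- Wrap each captured part in brackets so the white space is visible. --->
<cfoutput>
<cfloop index="part" array="#parts#">
[#part#]<br />
</cfloop>
</cfoutput>
```

This should output five parts: [div], [ ], [p.stanza], [ > ], and [strong]. Once each part is trimmed inside the loop, the white-space-only part becomes an empty string (the "//" branch) and the " > " part becomes ">" (the "/" branch).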
To test this, I set up the following code:
<cfoutput>
div<br />
#CSSSelectorToXPath( "div" )#<br />
<br />
div p<br />
#CSSSelectorToXPath( "div p" )#<br />
<br />
div p strong<br />
#CSSSelectorToXPath( "div p strong" )#<br />
<br />
##data-form label<br />
#CSSSelectorToXPath( "##data-form label" )#<br />
<br />
div > p<br />
#CSSSelectorToXPath( "div > p" )#<br />
<br />
div p.stanza > strong<br />
#CSSSelectorToXPath( "div p.stanza > strong" )#<br />
</cfoutput>
And, when we run the above test code, we get the following output:
div
//div

div p
//div//p

div p strong
//div//p//strong

#data-form label
//*[ @id = "data-form" ) ]//label

div > p
//div/p

div p.stanza > strong
//div//p[ contains( @class, "stanza" ) ]/strong
The full CSS selectors are getting converted to proper XPath syntax. So far, so good; now on to the next step.
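As a rough sketch of where this is heading, here is one of the generated XPath strings run against a small XML fragment with ColdFusion's XmlSearch(). The XPath is hard-coded (copied from the output above) so that the example does not depend on either UDF. The sample markup and the variable names (doc, nodes) are made up for illustration, and the sketch assumes the document has no default namespace.

```cfml
<!--- Build a small, made-up XML document to search. --->
<cfxml variable="doc">
<div>
<p class="stanza">Roses are <strong>red</strong></p>
<p>No emphasis here</p>
</div>
</cfxml>

<!---
XPath copied from the "div p.stanza > strong" output above.
Single quotes are used for the ColdFusion string so that the
double quotes inside the XPath do not need to be escaped.
--->
<cfset nodes = XmlSearch(
doc,
'//div//p[ contains( @class, "stanza" ) ]/strong'
) />

<!--- Output the number of matches and the text of the first one. --->
<cfoutput>
#ArrayLen( nodes )#<br />
#nodes[ 1 ].XmlText#<br />
</cfoutput>
```

With this sample document, the search should find a single strong element whose text is "red"; the second paragraph is skipped because it has no "stanza" class.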
Reader Comments
I would be remiss in my duties if I didn't point out that you should talk like a pirate and use → instead of ==> .
@Rick,
Ha ha, I actually know what you're referring to :)
NOTE: I have updated the UDF above to handle the "or" selector (,).