Using jSoup To Inject Section-Title Anchors In ColdFusion 2021
Now that I have jSoup running in my ColdFusion blogging platform, my brain is starting to think about what kind of fun stuff I can do on-the-fly to augment and enhance my HTML. And, other than cleaning up and sanitizing the content, the first thing that occurred to me was that I could inject dynamic section-title anchors that would allow readers to provide fragment-based links to specific sections of an article.
The desire here is to search each blog article for header tags: h2 - h6; and then, prepend an anchor tag (<a>) that would link the reader directly to that header tag. Each anchor / fragment / id attribute is supposed to be unique within the context of a single page. However, since I'm injecting the anchors dynamically - and since I want them to be consistent over time - things get a little dicey. I'm going to use the text of the title tag to generate the id attribute and just hope that it is sufficiently unique.
ASIDE: "Hope" is not a plan. This is true. But, the stakes here are very low. If I happen to have two titles on the page with the same content that lead to the same id attribute, it's not the end of the world. That's more of a poor-content authoring problem than a technical one.
All I have to do to get this to work is add a new method to my BlogPostNormalizer.cfc ColdFusion component from my previous post:
component
	accessors = true
	output = false
	hint = "I provide some helper methods to clean and normalize the pre-rendering of blog post content."
	{

	// Define properties for dependency-injection.
	property jSoupJavaLoader;
	property utilities;

	// ---
	// PUBLIC METHODS.
	// ---

	/**
	* I apply some pre-render normalization to the given blog post / comment content.
	*/
	public string function normalizeContent(
		required string content,
		boolean stripRenderedProtocol = false
		) {

		// The jSoup library allows us to parse, traverse, and mutate HTML on the
		// ColdFusion server using a familiar jQuery-inspired syntax.
		var dom = jSoupJavaLoader
			.create( "org.jsoup.Jsoup" )
			.parse( content )
			.body()
		;

		// .... truncated code .... //

		cleanUpHeaderLinks( dom );

		// .... truncated code .... //

	}

	// ---
	// PRIVATE METHODS.
	// ---

	/**
	* I prepend link-anchors to the title tags so that a user can provide a URL that
	* scrolls the user directly to the title.
	*/
	private void function cleanUpHeaderLinks( required any dom ) {

		for ( var node in dom.select( "h2, h3, h4, h5, h6" ) ) {

			// Since the links are being generated dynamically - and need to be unique
			// within the page content - we're going to HOPE that using the title text to
			// generate the slug will be sufficient. However, it's not guaranteed. That
			// said, it's also not the end of the world if this fails.
			// --
			// NOTE: The .text() method will give us the NORMALIZED, COMBINED text of all
			// the elements within this title. As such, we don't have to worry about any
			// embedded formatting tags or other links - it will all be concatenated.
			var title = node.text();
			var slug = utilities.generateSlug( title );

			var anchor = node
				.prepend( "<a></a>" )
				.selectFirst( "a" )
				.attr( "aria-hidden", "true" )
				.attr( "id", slug )
				.attr( "href", "###slug#" )
				.attr( "title", "Link directly to this section: #title#" )
				.addClass( "m-title-anchor" )
			;

		}

	}

}
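The loop above leans on the title text being unique within a page. If duplicate titles ever did become a real problem, one possible safeguard (not something this post implements) would be to track the slugs already used on the page and suffix the repeats. Here is a minimal sketch; makeUniqueSlug() and seenSlugs are hypothetical names introduced only for illustration:

```cfml
<cfscript>

	/**
	* HYPOTHETICAL helper (not part of BlogPostNormalizer.cfc): I return the given
	* slug the first time it is seen; and then, slug-2, slug-3, etc. for repeats.
	*/
	function makeUniqueSlug(
		required string slug,
		required struct seenSlugs
		) {

		// Structs are passed by reference, so this count persists across calls.
		var occurrences = ( seenSlugs.keyExists( slug ) ? seenSlugs[ slug ] : 0 ) + 1;

		seenSlugs[ slug ] = occurrences;

		return( ( occurrences == 1 ) ? slug : "#slug#-#occurrences#" );

	}

	// One tracking struct per rendered page.
	seenSlugs = {};

	writeOutput( makeUniqueSlug( "setup", seenSlugs ) & "<br>" ); // setup
	writeOutput( makeUniqueSlug( "setup", seenSlugs ) & "<br>" ); // setup-2
	writeOutput( makeUniqueSlug( "usage", seenSlugs ) & "<br>" ); // usage

</cfscript>
```

The trade-off is that a suffixed slug depends on the order of the titles, so adding an earlier duplicate later would shift the numbering; that works against the goal of keeping the anchors consistent over time. A suffixed value could also, in principle, collide with a title that naturally reduces to the same slug.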
Working with the jSoup API is so nice and easy - so reminiscent of our jQuery days when one could dance around the DOM with abandon, injecting elements, traversing up and down the node-tree, and updating attributes in passing. In this case, I'm iterating over the titles, prepending an anchor element, selecting the anchor, and then updating it - all in a single statement!
The "slug" that is being generated is just a reduction of the given text value into something that is URL-friendly. This is the same method that I use to generate the URLs for my blog entries:
component {

	/**
	* I generate a normalized slug for the given text value.
	*/
	public string function generateSlug( required string value ) {

		var slug = value
			.trim()
			.lcase()
			// Strip out quotes.
			.reReplace( "['""]+", "", "all" )
			// Strip out brackets.
			.reReplace( "[()[\]<>]", "", "all" )
			// Strip out punctuation followed by spaces.
			.reReplace( "[:.] ", " ", "all" )
			// Replace any non-url-friendly characters.
			.reReplace( "[^a-z0-9-]+", "-", "all" )
			// Replace repeated dashes.
			.reReplace( "-{2,}", "-", "all" )
			// Strip off any leading or trailing dashes.
			.reReplace( "^-+|-+$", "", "all" )
		;

		return( slug );

	}

}
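To make the reduction concrete, here is a small usage sketch. It assumes the component above is saved as Utilities.cfc alongside the calling template; that file name is an assumption for illustration, as the post only shows the component being injected as the "utilities" property:

```cfml
<cfscript>

	// ASSUMPTION: the component above lives in Utilities.cfc next to this template.
	utilities = new Utilities();

	// Spaces collapse into dashes and the text is lower-cased.
	writeOutput( utilities.generateSlug( "Epilogue on Accessibility and Anchor Links" ) & "<br>" );
	// epilogue-on-accessibility-and-anchor-links

	// Quotes and brackets are stripped; a colon followed by a space becomes a space.
	writeOutput( utilities.generateSlug( "What's New: Part 2 (Draft)" ) & "<br>" );
	// whats-new-part-2-draft

</cfscript>
```

Note that two different titles can reduce to the same slug (for example, titles that differ only in punctuation or letter casing), which is the uniqueness caveat mentioned earlier.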
In the jSoup workflow above, you may notice that my anchor tag has no text. This is my attempt to prevent the "utility text" of this functionality from showing up in the search content on Google. Instead, I'm using the CSS content property for the .m-title-anchor class to inject the anchor text (#):
.m-title-anchor {
	display: inline-block ;
	margin-right: 8px ;

	&:before {
		content: "#" ;
	}
}
To be honest, I have no idea if Google does or does not include CSS content text in its indexing consideration. But, I felt like using content more closely expressed my intent of removing this text from the main page content.
Anyway, with this pre-render processing of my blog content, all my embedded title tags now have anchors.
I love how easy this was to do with jSoup! What a great API; and, it plays so nicely with ColdFusion.
Epilogue on Accessibility and Anchor Links
In my first iteration of this feature, I'm using aria-hidden to explicitly hide these anchor links from screen readers and assistive technology. Not because I don't think they would be useful; but, because I'm not sure how to make them both accessible and look good.
Amber Wilson has a great write-up on what accessible anchor links might look like on her blog. But, her approach - and other similar approaches - start to mess with the block-level elements and end up rendering the UI (User Interface) widgetry in an order that is not reflected in the DOM; and, it's just not a skill-level that I am comfortable with ... yet. Hopefully in the future, I'll have more comfort with this style of approach.
Reader Comments
I'm sure you know this, but for other readers, you don't need an anchor tag to link to headings on a page. Any element with an id attribute can be linked to via the hash in the URL.
So you could generate the id attribute on the heading tags themselves. This can open up the options. If you want an anchor tag to be able to link to the element, you can put it anywhere (with any text) and not have to worry about how it affects your heading tags, or you can just leave it off. In many cases the usefulness of linking to a heading is more an internal benefit to the application than needing it in the UI for users.
@Dan,
That's a great point. In fact, when I'm reading an article and I want to link to a section, but there is no obvious link CTA (call to action), I'll inspect the source and half-the-time find an id on the section or title that I'm looking at. And then, just as you're saying, I can generate a link with a fragment on my own.
@All,
A minor update on accessibility. After running the Lighthouse utility again on my pages, I was getting dinged for having a focusable item inside an aria-hidden="true" element. To remedy this, I added: ... to the injected anchor tag. The downside is that people won't be able to access it via the Tab key. However, since it's not really related to the article itself, but is more of a utility, I think that's OK.