Using jSoup To Fix Post-Marriage Name Changes In ColdFusion 2021
At the beginning of this year, I started using jSoup on my ColdFusion blog. This opened up all kinds of possibilities from extracting Open Graph / Twitter Card data to generating blog post previews to injecting anchor links on my section titles. And now, this morning, I realized that I could use it to fix post-marriage name changes; at least, until I update the underlying content.
As you may know, I co-host a podcast called Working Code. On each episode write-up, I list out the names and links for all of our co-hosts (and the occasional guest). For the first 80+ episodes, Carol has been "Carol Hamilton". But now, in a post-nuptial era, Carol has become "Carol Weiler" - both here and around the web.
A side effect of this is that her vanity LinkedIn URL has changed. And - shame on LinkedIn for not handling this more gracefully - it means that all of her LinkedIn links on my 90 episode blog-posts have broken.
So now, here comes ColdFusion 2021 and jSoup to the rescue. As part of my "content normalization" process, I can paper over this change using some just-in-time DOM (Document Object Model) manipulation. Now, as I'm preparing my blog-post content for caching, I run it through the following "Carol Link Fixer" method. Please note that this ColdFusion component has been heavily truncated for the demo:
component {

	/**
	* I fix Carol's LinkedIn URLs and name (after her name change), returning the
	* updated markup.
	*/
	private string function cleanUpCarolLinks( required string content ) {

		// The jSoup library allows us to parse, traverse, and mutate HTML on the
		// ColdFusion server using a familiar, luxurious jQuery-inspired syntax.
		var dom = jSoupJavaLoader
			.create( "org.jsoup.Jsoup" )
			.parse( content )
			.body()
		;

		// Find all the embedded anchor tags in the content that currently point to
		// Carol's old LinkedIn profile (using a partial match on the LinkedIn slug in
		// order to make the selector a tiny bit more flexible).
		for ( var node in dom.select( "a[href*='carol-hamilton-5a869257']" ) ) {

			// Update the link to point to her new LinkedIn profile.
			node.attr( "href", "https://www.linkedin.com/in/carol-weiler-5a869257/" );

			// This is likely not going to be true in all cases; but, in many cases, the
			// link to Carol's LinkedIn profile is preceded by her name (such as in the
			// list of co-hosts on the Working Code podcast). In such a scenario, let's
			// also try to back-up in the DOM tree and update her name as well.
			var labelNodes = node
				.parent()
				.select( "strong:contains(Carol Hamilton)" )
			;

			if ( labelNodes.len() ) {

				labelNodes[ 1 ].text( "Carol Weiler" );

			}

		}

		// Serialize the mutated DOM so that the calling code receives the fixes
		// (without this, the changes would be discarded when the function exits).
		return dom.html();

	}

}
As you can see, I start by locating anchor links that point to her old LinkedIn URL. Then, I update the href attribute for those links. And then, I try to find her name and update that as well. And now, when I render my blog content, Carol Hamilton has magically become Carol Weiler.
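To make the before-and-after a bit more concrete, here's a minimal, self-contained sketch of the same technique run against a tiny piece of sample markup. Note that the sample HTML is hypothetical (it only approximates what a co-host list item might look like); and, I'm assuming that the jSoup JAR file is already on the ColdFusion class path so that createObject() can locate it without the JavaLoader used above:

```cfml
<cfscript>

	// ASSUMPTION: the jSoup JAR is on the ColdFusion class path. The demo above
	// uses a JavaLoader instance (jSoupJavaLoader) instead.
	jsoup = createObject( "java", "org.jsoup.Jsoup" );

	// HYPOTHETICAL sample markup, loosely shaped like a co-host list item.
	sampleHtml = '
		<ul>
			<li>
				<strong>Carol Hamilton</strong> -
				<a href="https://www.linkedin.com/in/carol-hamilton-5a869257/">LinkedIn</a>
			</li>
		</ul>
	';

	dom = jsoup.parse( sampleHtml ).body();

	// Partial-match on the old LinkedIn slug, just like the method above.
	for ( node in dom.select( "a[href*='carol-hamilton-5a869257']" ) ) {

		node.attr( "href", "https://www.linkedin.com/in/carol-weiler-5a869257/" );

		// Step up to the parent element and look for the bolded name.
		labelNodes = node
			.parent()
			.select( "strong:contains(Carol Hamilton)" )
		;

		// Using the native Java methods on the jSoup Elements collection.
		if ( labelNodes.size() ) {

			labelNodes.first().text( "Carol Weiler" );

		}

	}

	// Inspect the two values that should have changed.
	writeOutput( encodeForHtml( dom.select( "a" ).first().attr( "href" ) ) & "<br />" );
	writeOutput( encodeForHtml( dom.select( "strong" ).first().text() ) );

</cfscript>
```

With this sample input, the first line of output should be the new "carol-weiler" URL and the second should be "Carol Weiler". If the strong tag weren't a descendant of the link's parent element, only the href would be updated and the name would be left as-is.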
I believe it was Archimedes who said, "Give me a programming language strong enough and I will change the world". It turns out, the combination of ColdFusion and jSoup may have been just the combination of strength and flexibility that he was talking about. It seems, with ColdFusion, the only limit is my imagination.
Reader Comments
Why not search/replace the underlying data? Or is this more for the code kata?
@Chris,
So, I plan to do that as well at some point. But, something like that can "go wrong"; whereas, since this is being done on-the-fly, there's no persistence to it.
But, there's another wrinkle to my thinking. I actually author all my posts in a .md (markdown) file - like an actual file just saved to my computer - before I publish them in my content management system. So now, I have an emotional problem -- do I go back and update the .md in addition to changing the data in the database? It's weird having two sources-of-truth; so, it's probably better if I just get over it. But, it's a hard hurdle.
I could always update the DB first, get that done with, and then go back and update the files using some find-replace functionality.
Who knew that this was as much an emotional problem as it was a technical problem 🤣 🤣 🤣
@Ben
Haha, you make great points. Keeping the .md files updated feels like more of a type-a problem. I can relate 100% btw. While I'd love for my markdown and db to agree, I tend to shift the source of truth to the DB once converted. The MD files then become a relic.
@Chris,
That's probably the smarter approach :D I don't know why I feel so attached to the files.