The 12th Annual Regular Expression Day - June 1st 2019
Oh snap, can you feel it? That magic in the air?! Those groovy vibes?! That indescribable feeling of joy and wonderment currently coursing through your veins?! It's June 1st, which means it's Regular Expression Day 2019!!! Woot woot! That one day of the year in which we take pause and give thanks for the powerful pattern-matching functionality that our programming languages place at our fingertips.
NOTE: If you are already completely lost and have no idea what I'm talking about, please check out my video and slide presentation - Regular Expressions, Extraordinary Power. They will blow your mind and change your world forever.
Every year, I hope to be more prepared for this day. Create something clever; a game; a puzzle; some sort of competition. But, for the last few years, I just haven't been able to get it done. But, that doesn't mean I can't spread some joy.
So, this year, I'm giving away one $100 Amazon.com gift card and four $50 Amazon.com gift cards. To be entered for a chance to win, all you have to do is drop a comment below describing a problem you solved with Regular Expressions. Winners will be randomly selected end-of-day Monday.
Happy Regular Expression day! May this day be sweet and free of catastrophic back-tracking!
Reader Comments
Of late I've been doing more work with Docker, which has necessitated writing shell scripts and learning more about the command line (full endorsement of Julia Evans's Bite Size Command Line zine - very accessible).
One of the really powerful commands is sed, which can utilize regular expressions (https://www.gnu.org/software/sed/manual/sed.html#sed-regular-expressions). So, for example, in one instance I use it to set a variable for stepping up and over to a different directory, adjacent to where the script is located.
Regex for the win!
This is a little bit more meta, but once I had to resolve the following problem:
How do you know if two different regular expressions will match the same set of entries?
For example:
The problem is resolved by converting each regex to a finite-state machine (FSM), computing the intersection of the two by generating a third FSM, and then converting that back to a regex.
I ended up using an open-source library for it: https://github.com/qntm/greenery
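As a rough illustration of the idea, here is a minimal Python sketch. It does not use greenery and is not a description of that library's internals. It assumes the two patterns have already been converted by hand into deterministic finite-state machines, and it checks equivalence by walking both machines in lockstep (a product construction) and looking for a reachable pair of states where only one side accepts. The dictionary layout, the function name accepts_same_language, and the three sample machines are all made up for this example.

```python
from collections import deque


def accepts_same_language(dfa_a, dfa_b, alphabet):
    """Return True if two DFAs accept exactly the same strings."""
    start = (dfa_a["start"], dfa_b["start"])
    seen = {start}
    queue = deque([start])
    while queue:
        state_a, state_b = queue.popleft()
        # A reachable pair where only one side accepts is a counterexample.
        if (state_a in dfa_a["accept"]) != (state_b in dfa_b["accept"]):
            return False
        for symbol in alphabet:
            # A missing transition yields None, which acts as a dead state.
            next_pair = (
                dfa_a["moves"].get((state_a, symbol)),
                dfa_b["moves"].get((state_b, symbol)),
            )
            if next_pair not in seen:
                seen.add(next_pair)
                queue.append(next_pair)
    return True


# Hand-built machine for a*b.
a_star_b = {"start": 0, "accept": {1}, "moves": {(0, "a"): 0, (0, "b"): 1}}

# Hand-built machine for b|a+b, which describes the same strings as a*b.
a_star_b_alt = {
    "start": 0,
    "accept": {2},
    "moves": {(0, "b"): 2, (0, "a"): 1, (1, "a"): 1, (1, "b"): 2},
}

# Hand-built machine for a+b, which rejects the lone string "b".
a_plus_b = {
    "start": 0,
    "accept": {2},
    "moves": {(0, "a"): 1, (1, "a"): 1, (1, "b"): 2},
}

print(accepts_same_language(a_star_b, a_star_b_alt, "ab"))
print(accepts_same_language(a_star_b, a_plus_b, "ab"))
```

The first call reports that the two machines agree; the second finds the mismatch on the input "b". The hard part that a library takes care of is the step this sketch skips: turning an arbitrary regex into such a machine in the first place.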
Not sure if this will qualify as a proper answer, but... I haven't had my "unforgettable regex moment". It's more of a long-term relationship: don't worry, I'll explain.
See, most regex I've done is - to put it bluntly - uninteresting.
Most patterns I come up with are well-trodden paths to solutions that aren't revolutionary in any way. They aren't some flash-in-the-pan, fancy approach. They aren't some new sugar for my syntactical sweet tooth. Rarely are they anything you would draw inspiration from, or that spurs you on as a dev. Were programming analogous to fiction, regex would not be found in that section of the library with titles like "A Night to Surrender" or "40 Nights at Sea".
Rather, regex is unyielding, inflexible, and often stubbornly unforgiving. Yet, it's in those qualities that I find a deeper love for regex: reliable predictability!
Libraries come and go, that form data you're parsing is crazy, and your code style will change over time; yet that regex you wrote remains an obelisk of utility. It just works the same way every time. Consistency is its draw, and it doesn't disappoint.
So, I don't particularly remember any regex I've written. It's all unremarkably the unsung hero of my career, but the hero nonetheless. For that, I owe regex a debt of gratitude.
Having graduated as a mathematician, I decided to pursue a career in programming some years ago.
One of the first assignments I got was to implement a regular expression for parsing social media URLs of several platforms.
Not having been familiar with regular expressions before, I was - and certainly still am - amazed by their power!
I believe it was one of my first ever commits in which I composed this regex for parsing the Twitter variant:
^(?:http(?:s)?://)?(?:[w]{0,3}[.]?)?(twitter.com|@)?(?:/)?([A-Za-z0-9_]*)?(?:/)?(.+)?$
When processing the value "https://www.twitter.com/BenNadel", for example, the username would be extracted by the regex and the value would be modified to "@BenNadel".
Beautiful.
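For anyone who wants to poke at that pattern, here is a small Python sketch that applies it as described above. The pattern string is copied from the comment unchanged; the function name to_handle and the choice of Python's re module are assumptions made for illustration, since the original language isn't stated.

```python
import re

# Pattern copied verbatim from the comment above.
TWITTER_URL = re.compile(
    r"^(?:http(?:s)?://)?(?:[w]{0,3}[.]?)?(twitter.com|@)?(?:/)?([A-Za-z0-9_]*)?(?:/)?(.+)?$"
)


def to_handle(value):
    """Return '@username' for a Twitter URL or handle, or None if no name is found."""
    match = TWITTER_URL.match(value)
    # Group 2 holds the username; it is empty when nothing name-like was found.
    if not match or not match.group(2):
        return None
    return "@" + match.group(2)


print(to_handle("https://www.twitter.com/BenNadel"))
print(to_handle("@BenNadel"))
```

Both calls produce "@BenNadel". One thing worth knowing if you reuse the pattern: the dot in twitter.com is unescaped, so it matches any character rather than only a literal period.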
On a recent project, I needed to provide "masked" versions of a user's email address and phone number for display. I was doing this in a ColdFusion app but I achieved my results by dropping down to Java and using replaceAll() on the string.
For the email, I needed to match/replace everything but the first character on the left side of the "@".
For the phone number, I needed to match/replace everything but the last 4 digits in the value.
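The actual patterns aren't shown in the comment, so here is one way the two masks might be written, sketched in Python with re.sub rather than Java's replaceAll(). Both patterns rely on lookarounds, which Java's regex engine also supports. The function names mask_email and mask_phone, the "*" mask character, and the sample values are all made up for this example.

```python
import re


def mask_email(email):
    """Mask every character before the '@' except the first one."""
    # (?<=.)        at least one character precedes this one (skips the first char)
    # [^@]          the character being masked
    # (?=[^@]*@)    an '@' still lies ahead, so we are left of it
    return re.sub(r"(?<=.)[^@](?=[^@]*@)", "*", email)


def mask_phone(phone):
    """Mask every digit except the last four, leaving punctuation alone."""
    # A digit is masked only when at least four more digits follow it.
    return re.sub(r"\d(?=(?:\D*\d){4})", "*", phone)


print(mask_email("jane.doe@example.com"))  # j*******@example.com
print(mask_phone("212-555-0199"))          # ***-***-0199
```

Using lookarounds means the replacement is a single literal character and the masked string keeps its original length, which may or may not be what you want for display purposes.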
@Matthew,
Very cool -- I wish my command-line skills were much better. I've never even heard of sed. I can basically do pwd and cd and ls :D Awesome stuff!
@Hector,
What you just said, hurt my brain :P
@William,
100% on point! I agree. RegEx doesn't have to be crazy because it's that tool that you know you can reliably reach for. It "just works". The vast majority of the patterns I write are simple, like (\w+): (\d+) and that kind of stuff. But, it just makes life so much easier.
@Martijn,
It's always a balance, too. Between things like readability and using non-capturing groups, like your (?:xxx) usage. One of the things I've come to love, though I don't reach for it all that often, is the verbose flag (?x). This allows you to break your patterns up over spaces / line-breaks in order to make them easier to understand.
@Tony,
Yoooo, I love dropping down into the Java layer for the more powerful RegEx. It has so much delicious juice down there. You may find a project of mine interesting, JRegEx. It's a ColdFusion component that wraps around some of the Java methods to make them a bit easier to use: www.bennadel.com/blog/3322-jregex---a-coldfusion-wrapper-around-java-s-regular-expression-patterns.htm
Such good stuff!
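To make the verbose-flag point from the reply to Martijn concrete, here is a small sketch of the simple key/value pattern from the reply to William, written with (?x). It uses Python, whose re module supports the same inline flag; the name parse_pair and the sample input are invented for the example.

```python
import re

# The same idea as (\w+): (\d+), spread over several lines with comments.
KEY_VALUE = re.compile(r"""(?x)
    (\w+)   # key: one or more word characters
    :[ ]    # a colon, then one space (written as [ ] because bare spaces are ignored)
    (\d+)   # value: one or more digits
""")


def parse_pair(text):
    """Return (key, value) from text like 'retries: 3', or None if nothing matches."""
    match = KEY_VALUE.search(text)
    return (match.group(1), int(match.group(2))) if match else None


print(parse_pair("retries: 3"))
```

The one gotcha with verbose mode is that literal whitespace in the pattern is ignored, so a space you actually want to match has to be escaped or placed inside a character class, as with [ ] here.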
@Matthew, @Hector, @William, @Martijn, @Tony,
Thank you all for participating! This year's turn-out wasn't so great :( which is my fault for not really doing anything all that exciting.
But, the upside to it is that you're all winners! Please email me at ben@bennadel.com and I'll get your Amazon gift certificates out to you. Thank you all :D
@Ben,
Booo for low turnout! Woohoo for winning!
With regard to the command line, I am a complete beginner, but I can't say enough good things about this: https://gumroad.com/l/bite-size-command-line - It's extremely practical and makes those incomprehensible man pages actually accessible. If you want something completely different for your next "A Book Apart", or just want to explore the command line, I'd recommend it. Also, in terms of helpfulness, here's another cool reference: https://tldr.sh/
Keep up the awesome blogging! And thanks!
@Ben,
Awesome! I'll be sure to check out JRegEx. Thanks.
For a cool online regex testing utility, check out http://gskinner.com/RegExr/ .
I've gotten a great deal of use out of it throughout the years :)
@William,
Yooo, that's really cool! I'm not sure I've seen that one before.
@Matthew,
This looks awesome. Julia Evans is a wonderful teacher with her illustrations. I will definitely be checking out this book! Awesome suggestions.