Boyan's Tremendous Idea For Downloading Code Snippets

By Ben Nadel

Published 2007-02-09 in ColdFusion — Comments (8)

If you view my site on a regular basis, you have probably noticed that above each code snippet there is a link labeled "Launch code in new window." This simply pops up the code in a new window with some color coding and a cut-n-paste box. The other day, Boyan Kostadinov had a great idea: what about another link for letting the user download the snippet as a text file? This could, of course, be accomplished via cut-n-paste... but why go through the extra steps? I thought this was a tremendously awesome idea and implemented it this morning.

Ironically enough, creating the code file was a cinch. The hard part was getting the stupid buttons above the code snippets to look right. Damn Firefox vs. Internet Explorer box model! You might have to refresh a few times to get the updated CSS file. You will see that the two links sit right next to each other. I am giving up for now on having them line up exactly right. It looks better in Firefox than IE (big surprise there).

Anyway, Boyan was interested in seeing how I accomplished this, so I am posting the code below. The page takes a blog entry ID and the character offset of the given code snippet (within the blog content). Then, using the offset, it finds the matching code chunk in the blog content, cleans it up, and spits it into a file (I have changed some variables so that you don't have to worry about my particular framework):

<!---
	Query for the blog entry with the target code snippet
	that we want to download.
--->
<cfquery name="qBlog" datasource="...">
	SELECT
		b.id,
		b.name,
		b.content,
		b.date_posted,
		b.time_posted
	FROM
		blog_entry b
	WHERE
		b.id = <cfqueryparam
					value="#id#"
					cfsqltype="CF_SQL_INTEGER"
					/>
</cfquery>


<!---
	This is the variable that we are going to store the
	clean code snippet in (the one that has no HTML or escaped
	characters in it, but just clean text).
--->
<cfset strCleanCode = "" />


<!---
	Check to see if we found the blog entry. If we did, we can
	grab the code. If we did not, then something went wrong.
--->
<cfif qBlog.RecordCount>

	<!---
		Now, let's try and find the matching code snippet using
		a java pattern matcher. Since we know the offset of the
		given code snippet (passed in the URL), we can loop
		through all the code snippets of this entry until we
		find the one with the proper offset. Let's create a
		pattern for matching the code divs.
	--->
	<cfset objPattern = CreateObject(
		"java",
		"java.util.regex.Pattern"
		) />

	<!---
		Compile the pattern (to get the pattern object
		instance). Remember our code snippet frames might be
		standard (no scrolling) or they might be of a fixed
		height. We have to account for both versions in
		our pattern.
	--->
	<cfset objPattern = objPattern.Compile(
		"(<div class=""code[^""]*?"">)([\w\W]+?)(</div>)"
		) />

	<!---
		Now that we have the compiled pattern, let's get a
		pattern matcher that will help us iterate over the
		matches in the target blog content.
	--->
	<cfset objMatcher = objPattern.Matcher( qBlog.content ) />

	<!---
		Loop over the content for every instance of the code
		divs (pattern). Now, we only want to loop until we have
		found the code offset that we were passed in the URL.
	--->
	<cfloop condition="objMatcher.Find()">

		<!--- Check to see if this is the proper offset. --->
		<cfif (objMatcher.Start() EQ REQUEST.Attributes.start)>

			<!---
				We have found the code snippet! Now, extract
				the code snippet. Based on our regular
				expression pattern, the code of our snippet
				should be contained in the second group.
			--->
			<cfset strCleanCode = objMatcher.Group( 2 ) />

			<!---
				Since we found our code snippet in this
				iteration, we can break out of the loop. We
				don't care about any further code snippets.
			--->
			<cfbreak />

		</cfif>

	</cfloop>


	<!---
		ASSERT: At this point, we have scoured our
		target blog post looking for the desired code
		snippet. If we found it, the code for it will be
		in the strCleanCode variable. If we did not
		find it, then that variable, strCleanCode,
		will still be blank (default value).
	--->


	<!---
		Check to see if we found our code snippet. If we did,
		our clean code variable will have some sort of length.
	--->
	<cfif Len( strCleanCode )>

		<!---
			We have successfully found our code snippet. Right
			now, it is still in blog-format. This means that it
			has a bunch of list items in it. We need to clean all
			that junk out so that the code snippet is clean.
		--->

		<!--- Let's start by removing the UL tags. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"</?ul>",
			""
			) />

		<!--- Remove all non-breaking spaces. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"(&nbsp;|&##160;)",
			" "
			) />

		<!---
			Remove all closing LI tags and replace with
			a single line break.
		--->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"</li>",
			"#Chr( 13 )##Chr( 10 )#"
			) />

		<!---
			Now, loop over each LI tab type and replace with
			the appropriate tabbing. Remember, each tab class
			such as "tab1", "tab2", "tab3", etc... stands for
			a set number of preceding tabs.
		--->
		<cfloop index="intTab" from="1" to="10" step="1">

			<cfset strCleanCode = strCleanCode.ReplaceAll(
				"<li class=""tab#intTab#"">",
				RepeatString( Chr( 9 ), intTab )
				) />

		</cfloop>

		<!--- Replace out any LIs that are not tabbed. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"<li>",
			""
			) />

		<!--- Replace out any breaks tags and trim it. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"<br( ?/)?>",
			""
			).Trim()
			/>

		<!--- Replace back in proper brackets. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"&lt;",
			"<"
			) />

		<!--- Replace back in proper brackets. --->
		<cfset strCleanCode = strCleanCode.ReplaceAll(
			"&gt;",
			">"
			) />

	</cfif>

</cfif>


<!---
	ASSERT: At this point, we either found a blog entry or we
	did not. If we did find a blog entry, we either found the
	desired code snippet or we did not. If we found the code
	snippet, we have cleaned it up.

	BOTTOM LINE: At this point, we either have a clean code
	snippet or we do not. If we do have a clean code snippet,
	then it has been stored in strCleanCode. If we did not
	find a clean code snippet, then strCleanCode is empty.
--->


<!--- Check to see if we have any code at this time. --->
<cfif Len( strCleanCode )>


	<!---
		We did find the code snippet. Let's create a file
		data variable so that we can add some more meta
		information to the file.

		Remember that when putting in file comments, we cannot
		put in straight-up ColdFusion comments; otherwise they
		will get consumed on the server. Instead, we are going to
		have to put in escaped comments and then replace later.
	--->
	<cfsavecontent variable="strFileData">
		<cfoutput>

			[--- ------------------------------------------ ----

				Blog Entry:
				#qBlog.name#

				Author:
				Ben Nadel / Kinky Solutions

				Link:
				http://www.bennadel.com/index.cfm?dax=blog:#qBlog.id#.view

				Date Posted:
				#DateFormat( qBlog.date_posted, "mmm d, yyyy" )# at #TimeFormat( qBlog.time_posted, "h:mm TT" )#

			---- ------------------------------------------ ---]

		</cfoutput>
	</cfsavecontent>


	<!---
		Now that we have the file header, let's clean it up
		(all the extra tabs and what not). Let's start out by
		stripping out all tabs that precede the first and
		last line.
	--->
	<cfset strFileData = strFileData.Trim().ReplaceAll(
		"(?m)(?:\t+)([\[\-])",
		"$1"
		) />

	<!---
		Now, strip out all other tabs except for the first tab,
		which should be the first tab of any line (as we are
		performing a multi-line regular expression).
	--->
	<cfset strFileData = strFileData.ReplaceAll(
		"(?m)(\t)(?:\t*)",
		"$1"
		) />


	<!---
		Now, replace in the correct brackets. These are the
		brackets that we put in as escaped ColdFusion code.
	--->
	<cfset strFileData = strFileData.ReplaceFirst(
		"^\[",
		"<"
		) />

	<cfset strFileData = strFileData.ReplaceFirst(
		"\]$",
		">"
		) />


	<!---
		Now, all that's left is streaming the data to the user
		and prompting them for download. Add a few line breaks
		to the file data and then append the actual code snippet.
	--->
	<cfset strFileData = (
		strFileData &
		RepeatString(
			(Chr( 13 ) & Chr( 10 )),
			3
			) &
		strCleanCode
		) />


	<!---
		When streaming the snippet to the user, set the header
		to have a file name and act as an attachment.
	--->
	<cfheader
		name="content-disposition"
		value="attachment; filename=blog_code_#qBlog.id#_#REQUEST.Attributes.start#.txt"
		/>

	<!--- Stream the file data as a binary text object. --->
	<cfcontent
		type="text/plain"
		variable="#ToBinary( ToBase64( strFileData ))#"
		/>

<cfelse>


	<!---
		Either we could not find the given blog entry or the
		code snippet offset was incorrect. Either way, we have
		no code to give back to the user. Just give them a
		text error message. We can just send this one as an
		inline data file (no need to prompt the download of
		an error message).
	--->
	<cfheader
		name="content-disposition"
		value="inline; filename=no_code_found.txt"
		/>

	<!--- Stream error message as binary text object. --->
	<cfcontent
		type="text/plain"
		variable="#ToBinary( ToBase64( 'Code snippet could not be found.' ))#"
		/>


</cfif>

So anyway, there it is.
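
For anyone who wants to poke at the matching and cleanup steps outside ColdFusion, here is a minimal sketch of the same java.util.regex and String calls as plain Java. The class name, method names, and sample markup are made up for illustration, and it assumes each "tabN" class maps to N tab characters, as the code comments describe. It is not the code running on the site.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SnippetExtractor {

    // Same pattern as the post: opening code div, lazy body, closing div.
    private static final Pattern CODE_DIV =
        Pattern.compile("(<div class=\"code[^\"]*?\">)([\\w\\W]+?)(</div>)");

    // Return the body of the code div whose match starts at the given
    // zero-based offset, or "" when no code div starts there.
    public static String findSnippet(String content, int start) {
        Matcher matcher = CODE_DIV.matcher(content);
        while (matcher.find()) {
            if (matcher.start() == start) {
                return matcher.group(2);
            }
        }
        return "";
    }

    // Turn the blog's list markup back into plain text.
    public static String clean(String html) {
        String text = html
            .replaceAll("</?ul>", "")
            .replaceAll("(&nbsp;|&#160;)", " ")
            .replaceAll("</li>", "\r\n");

        // Assumption: class "tabN" stands for N leading tab characters.
        StringBuilder tabs = new StringBuilder();
        for (int tab = 1; tab <= 10; tab++) {
            tabs.append('\t');
            text = text.replaceAll("<li class=\"tab" + tab + "\">", tabs.toString());
        }

        return text
            .replaceAll("<li>", "")
            .replaceAll("<br( ?/)?>", "")
            .trim()
            .replaceAll("&lt;", "<")
            .replaceAll("&gt;", ">");
    }

    public static void main(String[] args) {
        // Made-up blog content with two code divs.
        String content =
            "<p>Intro</p>"
            + "<div class=\"code\"><ul><li>&lt;cfset x = 1 /&gt;</li></ul></div>"
            + "<div class=\"code fixed\"><ul><li>&lt;cfif x&gt;</li>"
            + "<li class=\"tab1\">yes</li><li>&lt;/cfif&gt;</li></ul></div>";

        // The offset would normally arrive in the URL.
        int offset = content.indexOf("<div class=\"code fixed\"");
        System.out.println(clean(findSnippet(content, offset)));
    }
}
```

An offset that does not line up with the start of a code div simply yields an empty string, which is the case the error branch in the ColdFusion code handles.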

You might look at the code above and question the CFSaveContent stuff. You will notice that after I do the CFSaveContent, I am stripping out lots of the leading tabs. You might think to yourself, "You could just save all that effort by not tabbing in the code." Yeah, I could do that, but I think that makes my code look crappy with poor tabbing... sorry, mini-rant there :)
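
The two tab-stripping passes and the bracket swap can be tried in isolation the same way. This is a plain-Java sketch with a hypothetical helper name and a made-up header; only the four patterns are taken from the ColdFusion code. One thing it makes visible: the (?m) flag only changes how ^ and $ behave, so it has no effect on the first two patterns, which contain neither anchor.

```java
class HeaderCleaner {

    // Apply the same four replacements the post runs on the saved header.
    public static String clean(String header) {
        return header.trim()
            // Drop any tabs sitting directly in front of "[" or "-".
            .replaceAll("(?m)(?:\\t+)([\\[\\-])", "$1")
            // Collapse every remaining run of tabs down to a single tab.
            .replaceAll("(?m)(\\t)(?:\\t*)", "$1")
            // Swap the placeholder brackets at the very start and end.
            .replaceFirst("^\\[", "<")
            .replaceFirst("\\]$", ">");
    }

    public static void main(String[] args) {
        // Made-up header, indented the way CFSaveContent would capture it.
        String header =
            "\t\t\t[--- ---\r\n"
            + "\r\n"
            + "\t\t\t\tBlog Entry:\r\n"
            + "\t\t\t\tTitle\r\n"
            + "\r\n"
            + "\t\t\t---- ---]\r\n"
            + "\t\t";
        System.out.println(clean(header));
    }
}
```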

Short link: https://bennadel.com/520

Reader Comments

Boyan Kostadinov Feb 9, 2007 at 10:00 AM

Ben,

you are the man! That is awesome! I love reading your blog and this is going to make it so easy to download your snippets. Just one tiny little thing - can you name the file to be downloaded with the name of the post? An example would be "520-Boyan-s-Tremendous-Idea-For-Downloading-Code-Snippets.txt" for this post? I know I might be asking too much, he he. I was thinking about your blog this morning and all the cool stuff you do on it and I'm inspired to write my own custom blog application.

Javier Julio Feb 9, 2007 at 10:05 AM

A very nice feature guys. Great recommendation Boyan. Thanks too for sharing Ben how you accomplished this!

Ben Nadel Feb 9, 2007 at 10:08 AM

@Boyan,

I have updated the CFHeader tag:

That's probably not readable, but it's basically stripping out all the non-word characters and replacing with "-". Also, I only grab the left 50 characters as I don't know what the file name limitations are.
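
The updated tag itself did not survive in this copy of the comment. As a rough illustration of the transformation described (non-word characters replaced with "-", then the left 50 characters kept), here is a plain-Java sketch. The helper name, the ID prefix (borrowed from Boyan's example file name), collapsing runs of non-word characters into one dash, and truncating before the extension is added are all assumptions, not the code actually used on the site.

```java
class DownloadName {

    // Build a download file name from a post ID and title.
    public static String forPost(int id, String title) {
        // Replace each run of non-word characters with a single dash.
        String slug = title.replaceAll("\\W+", "-");

        // Keep at most the left 50 characters of the slug.
        if (slug.length() > 50) {
            slug = slug.substring(0, 50);
        }

        return id + "-" + slug + ".txt";
    }

    public static void main(String[] args) {
        System.out.println(
            forPost(520, "Boyan's Tremendous Idea For Downloading Code Snippets"));
    }
}
```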

@Javi,

Thanks dude. Always happy to make it easier.

Boyan Kostadinov Feb 9, 2007 at 10:16 AM

Ben,

you continue to amaze me. Thank you for the fast reply and for making that little change so quick. Let me know anytime I could be of service. I would be glad to help you out.

Ben Nadel Feb 9, 2007 at 10:19 AM

No problem dude. You help by making kick-ass suggestions ;)

Dan G. Switzer, II Feb 9, 2007 at 11:42 AM

@Ben:

If your blog entries are stored as valid XHTML, you could always just convert the blog entry into an XML DOM to grab the source code as well, instead of parsing the string to find the source code block.

Ben Nadel Feb 9, 2007 at 1:06 PM

Dan,

Very interesting.... I didn't know that XHTML would parse properly as it has markup mixed in within text (ex. a STRONG tag within a paragraph content). Although, I guess the surrounding text would be stored as text nodes.

I will do some experimenting. Thanks for the tip.

Ben Nadel Feb 12, 2007 at 3:35 PM

Dan,

I just tried parsing some XHTML into XML. It doesn't quite work as I would expect. For one, it needs a root node, which I do not have. Of course, I can wrap the whole content in a DIV or something before I parse it.

The thing that gets me, though, is that when it parses, it pulls styled elements (ex. STRONG, EM) out as child elements and keeps the sibling text as the XmlText value, but that value does not include the styled text.

However, calling ToString() on the resultant XML object DOES merge the two back together again successfully. Very interesting. There is probably stuff going on here that I am just not understanding.
