Ask Ben: Counting Spaces In A Given String

By Ben Nadel

Published 2006-08-23 in Ask Ben, ColdFusion — Comments (11)

How can I get the number of spaces in a string?

This seemingly simple problem does not have the simplest answer. I wish there were some sort of ValueCount() method in ColdFusion, but right now, I think that only applies to lists (i.e., ListValueCount()). Luckily, for your particular problem, there is a mostly simple solution. Since you are looking for just spaces, we can strip out everything that is NOT a space and then take the length of the resulting string:

<cfset intLength = Len(
	REReplace(
		"You are simply a vision in that dress!",
		"[^ ]+",
		"",
		"ALL"
		)
	) />
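
A related way to get the same number, offered only as a minimal sketch: remove the spaces instead, and measure how much shorter the string becomes. The variable names strValue and intSpaces are illustrative and not part of the original answer.

```cfml
<!--- The same sample sentence used above. --->
<cfset strValue = "You are simply a vision in that dress!" />

<!--- Remove every space; the difference in length is the space count. --->
<cfset intSpaces = (
	Len( strValue ) -
	Len( Replace( strValue, " ", "", "ALL" ) )
	) />

<!--- Output the number of spaces (7 for this sentence). --->
<cfoutput>#intSpaces#</cfoutput>
```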

This really only works when you are looking for single characters. If you want to search for all instances of a word, then things get a bit hairy. The easy solution is simply to keep searching the string until you cannot find any more instances:

<!--- The test value. --->
<cfset strTest = "You are the best and the most beautiful person." />

<!--- The target instance. --->
<cfset strTarget = "the" />

<!--- The instance counter. --->
<cfset intCount = 0 />

<!--- Get the initial position. --->
<cfset intPosition = Find( strTarget, strTest, 1 ) />

<!--- Keep searching till no more instances are found. --->
<cfloop condition="intPosition">

	<!--- Increment instance counter. --->
	<cfset intCount = (intCount + 1)>

	<!--- Get the next position. --->
	<cfset intPosition = Find(
		strTarget,
		strTest,
		(intPosition + Len( strTarget ))
		) />

</cfloop>

<!--- Output the number of target instances. --->
<cfoutput>#intCount#</cfoutput>

Each time we find an instance, we increment the counter and then start the search again just after that instance. Not the greatest solution, but it works.
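
For what it's worth, ColdFusion 8 and later ship a built-in REMatch() function (it did not exist when this post was written), which returns an array of every match of a regular expression. Assuming such a version, a hedged sketch of the same count might look like the following. Note that the target is treated as a regular expression here, so any special characters in it would need escaping.

```cfml
<!--- The same test value and target used in the loop above. --->
<cfset strTest = "You are the best and the most beautiful person." />
<cfset strTarget = "the" />

<!--- REMatch() returns an array of matches; its length is the count. --->
<cfset intCount = ArrayLen( REMatch( strTarget, strTest ) ) />

<!--- Output the number of target instances. --->
<cfoutput>#intCount#</cfoutput>
```

Like the loop, this counts non-overlapping, case-sensitive matches; REMatchNoCase() is the case-insensitive counterpart.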

Short link: https://bennadel.com/216

Reader Comments

Rick O Aug 23, 2006 at 11:10 PM

Why not leverage Java?

intCount=ArrayLen(strTest.split(strTarget.replaceAll("\W","\$1")))

Rick O Aug 23, 2006 at 11:12 PM

Erm, make that:

intCount=DecrementValue(ArrayLen(strTest.split(strTarget.replaceAll("\W","\$1"))))

Silly off-by-one error.

Rick O Aug 23, 2006 at 11:37 PM

Okay, last try, I promise.

<cfset strTest = "You are \the\ best (and) the [most] beautiful girl.">
<cfset strTarget = "\">
<cfset newTest=Chr(1) & strTest & Chr(1)>
<cfset intCount=DecrementValue(ArrayLen(newTest.split(strTarget.replaceAll("(\W)","\\$1"))))>
<cfoutput>#intCount#</cfoutput>

Trond Ulseth Aug 24, 2006 at 4:30 AM

Here's my simple take on it:

Trond Ulseth Aug 24, 2006 at 5:13 AM

This could be done for phrases as well:

Obviously it would return wrong results if the phrase is at the beginning or the end of the string. This can easily be fixed by prepending and appending some rubbish phrases to the string.

ps - Ben, those spam-fighting math equations are hard on me early in the morning ;)

Ben Nadel Aug 24, 2006 at 1:07 PM

Rick, Trond,

Excellent suggestions all around. As we can see, there are a number of solutions to this problem, but still, I think this would be an easy method for CF to build in, right?

Trond, good call on replacing the phrase with the "delimiter". That never even occurred to me. The only red flag I could see is that you might use a delimiter character that is already in the string (and therefore would throw off the count). This of course can be offset by using extremely rare characters or even by replacing that character out before replacing out the target phrase.

Good stuff all around. Also sorry about the math, but it keeps the SPAM out :)

Steven Levithan Feb 2, 2007 at 2:08 PM

We can take this further...

<cfset intLen = listLen(reReplaceNoCase(strTarget, "(?:(?!test)[\S\s])+", ",", "ALL")) />
Test, tester, and retest count as one match each, testtest counts as two matches.

<cfset intLen = listLen(reReplaceNoCase(strTarget, "(?:(?!\btest\b)[\S\s])+", ",", "ALL")) />
Test counts as one match, tester, retest, and testtest do not count as matches.

<cfset intLen = listLen(reReplaceNoCase(strTarget, "\b(?:(?!test)[\S\s])+\b", ",", "ALL")) />
Test, tester, retest, and testtest count as one match each.

Or, using my reMatch() UDF (http://badassery.blogspot.com/2007/01/coldfusion-regex-support-udfs-rematch.html), the regexes become even simpler...

<cfset intLen = arrayLen(reMatchNoCase("test", strTarget, 1, "ALL")) />
Test, tester, and retest count as one match each, testtest counts as two matches.

<cfset intLen = arrayLen(reMatchNoCase("\btest\b", strTarget, 1, "ALL")) />
Test counts as one match, tester, retest, and testtest do not count as matches.

<cfset intLen = arrayLen(reMatchNoCase("\b\w*?test\w*\b", strTarget, 1, "ALL")) />
Test, tester, retest, and testtest count as one match each.

Steven Levithan Feb 2, 2007 at 2:16 PM

Note that I'm not familiar with using the underlying Java regex methods such as split(). I'm sure that at least my first three, non-reMatch()-based examples could be written more elegantly using the Java core. Goddamn CF7's lame regex support and available functions...

Ben Nadel Feb 2, 2007 at 2:27 PM

Yeah, Java's regex stuff is really cool and very powerful. It can handle most of the regular expression stuff that straight-up CFMX method calls cannot handle. I use them all the time. I find that they are also a good bit faster.
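
As a rough sketch of what calling the Java regex classes from CFML can look like, the snippet below counts whole-word matches with java.util.regex.Pattern and Matcher. The test string and the pattern are illustrative assumptions, not taken from the comments above.

```cfml
<!--- Illustrative test value; not taken from the comments above. --->
<cfset strTest = "You are the best and the most beautiful person." />

<!--- Compile a Java regex that matches "the" as a whole word. --->
<cfset objPattern = CreateObject( "java", "java.util.regex.Pattern" ).compile(
	JavaCast( "string", "\bthe\b" )
	) />

<!--- Get a matcher for the test string. --->
<cfset objMatcher = objPattern.matcher( JavaCast( "string", strTest ) ) />

<!--- The instance counter. --->
<cfset intCount = 0 />

<!--- Each successful find() advances the matcher to the next match. --->
<cfloop condition="objMatcher.find()">
	<cfset intCount = (intCount + 1) />
</cfloop>

<!--- Output the number of matches. --->
<cfoutput>#intCount#</cfoutput>
```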

Kyle Feb 2, 2011 at 9:24 PM

Thanks Ben! I used this to find the first space after the midway point in a document, so that I could split it into columns of nearly equal length. I seem to end up at your blog posts more often than Adobe LiveDocs...

-Kyle

Thijs Mar 27, 2012 at 10:24 AM

Excellent post, Ben, very simple solution. As Kyle already mentioned, in general your blog is way more useful and interesting than the stuff on Adobe's website. You could lose some of the comments in the scripts though, but hey, that's just my opinion.
