Array Iteration Is Much Faster That Struct Iteration (Thanks Eric Stevens)
Last week, I claimed that Struct iteration would be faster than array iteration in ColdFusion. This wasn't totally out of left field - what I have read about the two different objects and their underlying Java classes (FastHashTable and ArrayCollection) would have suggested that look ups on a Struct were faster than look ups on an array. After I made this claim, Eric Stevens of the B & E Blog did some performance testing and found that I was totally wrong. His test was quite thorough and clear, but it was so shocking that I had to see it for myself:
<!--- Create an array to populate. --->
<cfset arrData = [] />
<!--- Create a struct to populate. --->
<cfset objData = {} />
<!--- Now, populate both using 1-based index. --->
<cfloop
index="intIndex"
from="1"
to="100000"
step="1">
<!--- Add to array. --->
<cfset arrData[ intIndex ] = true />
<!--- Add to struct. --->
<cfset objData[ intIndex ] = true />
</cfloop>
<!--- Iterate over array. --->
<cftimer
type="outline"
label="Array Iteration">
<!--- Use index loop. --->
<cfloop
index="intIndex"
from="1"
to="#ArrayLen( arrData )#"
step="1">
<!--- Loop up value using Array index. --->
<cfset blnValue = arrData[ intIndex ] />
</cfloop>
Done.
</cftimer>
<!--- Iterate over struct. --->
<cftimer
type="outline"
label="Struct Iteration">
<!--- Use index loop. --->
<cfloop
index="intIndex"
from="1"
to="#StructCount( objData )#"
step="1">
<!--- Loop up value using struct key. --->
<cfset blnValue = objData[ intIndex ] />
</cfloop>
Done.
</cftimer>
Just as with Eric's findings, I found that the Array iteration was about 2-3 times faster than the Struct iteration, even when we were iterating over explicit keys (ie. not doing a Collection loop). Even after seeing this, something about it feels hard to believe. This is some good information to know.
What was also very interesting to see was that the List loop was almost as fast as an Array loop (one that uses the List attribute of the CFLoop, not one that uses ListGetAt()). This is something that I never really thought about before; I had just assumed everything with lists was slower than non-lists, but I guess ColdFusion is really being smart about it. Whether it is converting it to an array behind the scenes, or keeping track of the offset (as Eric theorizes), it is good to see that it is so fast.
Want to use code from this post? Check out the license.
Reader Comments
Missed you at WebManiacs Ben. It was a pretty good time.
@Andy,
Sorry man. I heard it was a lot of fun. Hopefully one day I will get down there. Going to CFUnited soon, at least.
That's good to know about lists not being slow after all. I like using lists if all I need is a shallow 1D string of data. CF has lots of list functions and because of that it's more flexible than using a 1D array. A few times I've used 2 or 3 lists to create pseudo 2D or 3D arrays just so I can use the list functions, but if any of the lists get out of synch with each other (e.g. become different lengths) then it's as good as dead.
But CF8 and 8.01 introduced faster ways to create arrays which had previously been one of the grievances I had with using arrays in CF.
@Gary,
For small sets of data, there is really no difference. The research above is really all theoretical and would really only come into use on extremely high traffic sites or algorithms that work with a tremendous amount of data.
For your average stuff, lists are just fine.
I've always been impressed by the speed and variety of list functions. Another intangible to consider is the fact that list looping is SO EASY to construct and understand. They all have their place, but I'm glad that we don't have to shy away from this one.
@Dan,
True. And remember that we are talking about minor optimizations. Certainly, you are never going to have a system where List looping and manipulation is your bottleneck :)
I've always detested lists. I'm highly allergic to the idea that they can be broken by inserting data that contains a delimiter character, and the ListAppend function won't complain. They're not bad when dealing with, for example, a list of only numbers, and they're not too bad for dealing with really small sets of data which you control completely. But as soon as you take user input and pass it through a list, you're playing with fire.
It should be noted that ListGetAt(), ListFind(), and most other such functions are really terrible performers in general unless dealing with very small sets of data. <cfloop list> performs quite respectably because it never really has to search the list.
Even though <cfloop collection> under-performs compared to <cfloop list>, structs are a much more natural fit for managing the sort of complex data we'll encounter day-to-day. I almost always use a struct or an array for juggling data in my scripts.