Updating My Mental Model For Shared Array / Struct Iteration In ColdFusion
In the very early days of ColdFusion, it was considered a "Best Practice" to synchronize access to all shared data structures. Meaning, to single-thread the access in order to avoid race conditions. Over time, this practice became much more nuanced; and, more of the native data types within ColdFusion became synchronized by default. Meaning, they were being made thread-safe internally. I believe that my mental model for iteration over shared data structures is outdated. As such, I wanted to run some parallel iterations in Adobe ColdFusion 2021 and Lucee CFML 5.3.8.201.
To be clear, it was not so long ago (2018) that iteration over shared data structures caused a problem: in Lucee CFML 5.2, iteration over shared arrays caused deadlocks in my application. But, as Zac Spitzer pointed out in the comments to that post, much has been done since then to make data access safer.
To explore the state of the modern ColdFusion engines, I'll be using parallel array iteration to simulate concurrent access of the same data. Essentially, I'll be trying to iterate over a shared value from different threads within the same request. I hope that this sufficiently exercises possible iteration contention.
CAUTION: Please note that I am iterating over read-only data. Attempting to iterate over data - either a Struct or an Array - that is mutated during iteration is still unsafe. With a Struct, it can cause "Concurrent Modification" errors; and, with an Array, it can cause "Undefined Value" errors. My interest in this post is only to iterate over shared, read-only data.
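For the mutating case that this post does not test, one common guard is a pair of named locks: a read-only lock around iteration and an exclusive lock around writes. The following is only a minimal sketch of that idea. The lock name "sharedDataLock" and the appendValue() and sumValues() helpers are hypothetical names made up for illustration, and I have not stress-tested this under load:

```cfml
<cfscript>

	// ASSUMPTION: "sharedDataLock" is a hypothetical lock name chosen for this sketch.
	sharedData = [ 1, 2, 3 ];

	/**
	* I append a value under an EXCLUSIVE lock so that no reader is mid-iteration
	* while the array is being mutated. (Hypothetical helper.)
	*/
	public void function appendValue( required numeric newValue ) {

		lock name="sharedDataLock" type="exclusive" timeout="5" {

			sharedData.append( newValue );

		}

	}

	/**
	* I iterate under a READONLY lock. Many readers can hold this lock at the same
	* time; they only wait while a writer holds the exclusive lock. (Hypothetical
	* helper.)
	*/
	public numeric function sumValues() {

		var total = 0;

		lock name="sharedDataLock" type="readonly" timeout="5" {

			for ( var value in sharedData ) {

				total += value;

			}

		}

		return( total );

	}

	appendValue( 4 );
	writeOutput( sumValues() );

</cfscript>
```

The important detail is that both locks share the same name; a read-only lock under one name does nothing to hold off an exclusive lock under another.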
Iterating Over Shared Arrays In ColdFusion
In the following ColdFusion demo, I have an Array of 100 items. I'm then attempting to spawn 100 threads (in blocks of 20) to all iterate over the same shared array at the same time and log their iteration indices to the console:
<cfscript>

	// Build the shared data-array that we're going to iterate-over from multiple,
	// concurrent threads (simulating multiple concurrent requests). We're going to use
	// parallel array iteration to simulate the shared-load on the data array.
	sharedData = buildArray( 100 );
	simulatedRequests = buildArray( 100 );

	simulatedRequests.each(
		( requestID ) => {

			for ( var value in sharedData ) {

				consoleOutput( "#requestID#: #value#" );
				// Briefly sleep this parallel iterator to make sure that other iterators
				// have time to start iterating over the same array at the same time.
				sleep( 20 );

			}

		},
		// Parallel iteration of the ITERATORS array to simulate concurrent requests that
		// are all trying to iterate over the same SHARED DATA array at the same time.
		true,
		// Maximum number of parallel threads.
		20
	);

	consoleOutput( "Done" );
	writeOutput( "Done" );

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I construct a reflected array of the given size.
	*/
	public array function buildArray( required numeric size ) {

		var result = [];

		for ( var i = 1 ; i <= size ; i++ ) {

			result[ i ] = i;

		}

		return( result );

	}

	/**
	* I write the given value to the console.
	*/
	public void function consoleOutput( required string value ) {

		cfdump( var = value, output = "console" );

	}

</cfscript>
This script executes without error or deadlocks in both Adobe ColdFusion 2021 and Lucee CFML 5.3.8.201. The CommandBox logs look like this at the end:
.... (truncated for post) ....
[INFO ] string 93: 99
[INFO ] string 90: 100
[INFO ] string 87: 100
[INFO ] string 94: 100
[INFO ] string 100: 99
[INFO ] string 99: 98
[INFO ] string 92: 100
[INFO ] string 97: 99
[INFO ] string 96: 99
[INFO ] string 95: 100
[INFO ] string 93: 100
[INFO ] string 99: 99
[INFO ] string 100: 100
[INFO ] string 97: 100
[INFO ] string 96: 100
[INFO ] string 99: 100
[INFO ] string Done
As you can see, each iterator (parallel thread) was clearly iterating over the shared data array at the same time.
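The console output shows the interleaving, but it doesn't prove that every iterator saw every value. As a rough sanity check, here is a minimal sketch (not part of my original test) in which each parallel iterator returns the sum of the values it visited. It assumes the engine supports the parallel and max-threads arguments on the array .map() member method in the same way it does for .each(); the "sums" and "mismatches" variable names are made up for illustration:

```cfml
<cfscript>

	// Same shape of data as the demo above: a shared array of 1..100 and 100
	// simulated requests.
	sharedData = [];
	simulatedRequests = [];

	for ( i = 1 ; i <= 100 ; i++ ) {

		sharedData.append( i );
		simulatedRequests.append( i );

	}

	// ASSUMPTION: .map() accepts the same ( callback, parallel, maxThreads ) arguments
	// as .each(). Each parallel iterator returns the sum of the values it visited.
	sums = simulatedRequests.map(
		( requestID ) => {

			var total = 0;

			for ( var value in sharedData ) {

				total += value;

			}

			return( total );

		},
		true,
		20
	);

	// If an iterator skipped or repeated a value, its sum will differ from
	// 1 + 2 + ... + 100 = 5050.
	mismatches = sums.filter(
		( total ) => {

			return( total != 5050 );

		}
	);

	writeOutput( "Iterators with an unexpected sum: #mismatches.len()#" );

</cfscript>
```

I removed the sleep() from this version, so there will be less overlap between the iterators than in the logging demo; add it back in if you want to force more contention.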
Iterating Over Shared Structs In ColdFusion
In the following ColdFusion demo, I have a Struct of 100 keys. I'm then attempting to spawn 100 threads (in blocks of 20) to all iterate over the same shared struct at the same time and log their iteration keys to the console:
<cfscript>

	// Build the shared data-struct that we're going to iterate-over from multiple,
	// concurrent threads (simulating multiple concurrent requests). We're going to use
	// parallel array iteration to simulate the shared-load on the data struct.
	sharedData = buildStruct( 100 );
	simulatedRequests = buildArray( 100 );

	simulatedRequests.each(
		( requestID ) => {

			for ( var key in sharedData ) {

				consoleOutput( "#requestID#: #sharedData[ key ]#" );
				// Briefly sleep this parallel iterator to make sure that other iterators
				// have time to start iterating over the same struct at the same time.
				sleep( 20 );

			}

		},
		// Parallel iteration of the ITERATORS array to simulate concurrent requests that
		// are all trying to iterate over the same SHARED DATA struct at the same time.
		true,
		// Maximum number of parallel threads.
		20
	);

	consoleOutput( "Done" );
	writeOutput( "Done" );

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I construct a reflected array of the given size.
	*/
	public array function buildArray( required numeric size ) {

		var result = [];

		for ( var i = 1 ; i <= size ; i++ ) {

			result[ i ] = i;

		}

		return( result );

	}

	/**
	* I construct a reflected struct of the given size.
	*/
	public struct function buildStruct( required numeric size ) {

		var result = [:];

		for ( var i = 1 ; i <= size ; i++ ) {

			result[ i ] = i;

		}

		return( result );

	}

	/**
	* I write the given value to the console.
	*/
	public void function consoleOutput( required string value ) {

		cfdump( var = value, output = "console" );

	}

</cfscript>
This script executes without error or deadlocks in both Adobe ColdFusion 2021 and Lucee CFML 5.3.8.201. The CommandBox logs look like this at the end:
.... (truncated for post) ....
[INFO ] string 97: 99
[INFO ] string 86: 100
[INFO ] string 94: 99
[INFO ] string 96: 99
[INFO ] string 95: 100
[INFO ] string 92: 100
[INFO ] string 91: 100
[INFO ] string 88: 100
[INFO ] string 100: 98
[INFO ] string 90: 100
[INFO ] string 99: 99
[INFO ] string 93: 100
[INFO ] string 98: 100
[INFO ] string 97: 100
[INFO ] string 96: 100
[INFO ] string 94: 100
[INFO ] string 100: 99
[INFO ] string 99: 100
[INFO ] string 100: 100
[INFO ] string Done
As you can see, each iterator (parallel thread) was clearly iterating over the shared data struct at the same time.
This Exploration Only Applies to Native ColdFusion Data Types
It seems that with the most modern ColdFusion engines, it is completely safe to iterate over shared, read-only data. That said, this only demonstrates this behavior for native ColdFusion Arrays and Structs - this is not necessarily indicative of all "Struct like" or "Array like" values that come back from APIs. It's possible that an API will return an Array or Struct that is backed by a Java object that is not thread safe.
Furthermore, when creating an Array with arrayNew(), there is a 3rd-argument that allows non-synchronized arrays to be created. I attempted to demonstrate iteration contention with these non-synchronized arrays; however, I was unable to do so successfully. That said, I still wouldn't trust an explicitly non-synchronized array in a shared context.
This Exploration Only Applies to Iteration
In this demo, I'm specifically looking at iteration over data, not access. Meaning, I didn't look at getting-and-setting values in a shared data structure. As far as I know, straight-up access has been thread-safe in ColdFusion for a long time. At least for Structs - I'm less aware of shared access for Arrays.
Reader Comments
Super interesting. Had no idea read only access to shared objects would be an issue in the async context. I'm so confused about how a synchronized object solves the problem. And what would be the use case for non-synchronized arrays? This has my head spinning...which I appreciate!
@Chris,
I think struct iteration has always been safe. And, I think the fact that array iteration used to cause contention in Lucee was maybe a low-level bug. I'm with you, though, I don't really understand how Synchronized objects work internally. It's just magic 🤩
I think the case for non-synchronized objects is just performance. Locking has some overhead to it in and of itself. So, if you can work locally with a non-synchronized array, it might be faster. Though, this will likely not be an issue in the vast, vast majority of cases.
Here's a post from Adobe on the new non-synchronized arrays:
https://coldfusion.adobe.com/2019/03/look-unsynchronised-arrays-cfml/
I haven't read it too closely, but I came across it the other day when looking into this topic.
Are you hitting the page request (the .cfm file) twice, immediately to get two threads going simultaneously?
I'm not quite grokking how you're getting the two threads, each running parallel operations of the struct and array.
@Will,
I am running the simulatedRequests.each() method in parallel:
Array.each( operator [, parallel [, maxThreads]] )
So, I'm calling it like so:
simulatedRequests.each( operator, true, 20 )
... to simulate parallel threads all trying to access sharedData.
@All,
In this demo, I'm in a read-only mode. But, if I needed to mutate the shared data some time, then I'd have to put a readonly lock around the data for the reads, and an exclusive lock around the data for the writes. This got me wondering if a read-only lock has any overhead in and of itself:
www.bennadel.com/blog/4290-looking-at-the-performance-overhead-of-a-read-only-lock-in-lucee-cfml-5-3-8-201.htm
I ran some tests on my local Lucee CFML server, and I couldn't discern any obvious performance impact of having a read-only lock. That's pretty awesome!