Volatile Keys Can Expire Mid-MULTI-Transaction In Redis / Jedis
Yesterday, I was surprised to find 4 keys in my Redis namespace that had no expiration date (time-to-live). I'm creating hundreds of thousands (if not millions) of keys a day, all with expiration dates. And yet, 4 of them were persisted without one. Such a tiny number of errors felt like a race condition; but, the code was so simple that it didn't make sense. Then, I realized that the Redis sense of "transaction" doesn't preclude key expiration.
NOTE: I am running this test on Redis version 2.8.17 using Jedis for Java.
In Redis, you can use the MULTI and EXEC commands to queue up a collection of commands that will all be executed in a single step on the Redis server. When using MULTI and EXEC, the documentation states the following guarantee:
All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. This guarantees that the commands are executed as a single isolated operation.
This guarantee makes it sound like your queue of commands will operate in absolute isolation. But, this is not what it's saying. The "isolation" here is being qualified - it pertains to requests issued by other clients. This guarantee says nothing about operations performed by the Redis server itself, such as expiring a volatile key.
Later on, in the documentation about WATCH, things are a bit more explicit. The documentation states that a volatile key that expires after a WATCH command is called will not prevent EXEC from executing. Key-expiration, it further explains (in a linked forum thread), is not considered a "touch" of the key.
So, based on the somewhat-ambiguous nature of the transaction guarantee and the explicit exception of key-expiration in WATCH, I think it's fair to say that a volatile key can expire in the middle of a Redis MULTI transaction. And, in fact, that is what I can see from my demo code.
To reproduce this, I'm running a loop that executes a single transaction over and over again:
- Set a key, with TTL (time-to-live), if it doesn't exist (NX).
- Increment key value.
- Return current TTL.
In this transaction, there are two opportunities to create the key. The first - .set() - is obvious. But, the second command - .incr() - can also create the key. If the .incr() command is called when the key doesn't exist, it will create it and set the value to 1.
<cfscript>

	// .... Some setup code for Jedis removed .... //

	// We're going to throw an error when we hit our unexpected condition.
	try {

		var redis = jedisPool.getResource();

		// I keep track of the number of times the .set() command executed successfully.
		var setCount = 0;

		// What we're looking for here is a very very very small race-condition - a moment
		// in time in which a Redis key expires in the middle of a Redis / Jedis multi
		// transaction. To trigger this race condition, we're going to loop a large number
		// of times and work with a key whose expiration date (time to live) is very small.
		// --
		// 1. Start queuing commands (multi).
		// 2a. Set key w/ expiration if it doesn't exist.
		// 2b. Increment key.
		// 2c. Check TTL.
		// 3. Execute transaction queue.
		// --
		// The race condition that we're trying to find is if the key set in (2a) in a
		// previous loop iteration will expire between (2a) and (2b) in a subsequent loop
		// iteration, thereby forcing (2b) to re-create the key without a TTL.
		for ( var i = 0 ; i < 10000 ; i++ ) {

			var multi = redis.multi();

			// By using the "NX" option, we're only going to set this key if it does NOT yet
			// exist. And, we'll set it to only have a 20ms timeout.
			multi.set(
				javaCast( "string", "ben_test_race_condition" ),
				javaCast( "string", 0 ),
				javaCast( "string", "NX" ),
				javaCast( "string", "PX" ),
				javaCast( "long", 20 )
			);

			// Increment the key value by 1.
			// --
			// CAUTION: If the key does NOT EXIST, this will re-create the key and set its
			// value to 1 (however, it will not have a TTL at that point).
			multi.incr( javaCast( "string", "ben_test_race_condition" ) );

			// Query for the time-to-live of the given key.
			// --
			// CAUTION: If the key does not have a time-to-live, this value will be -1.
			multi.pttl( javaCast( "string", "ben_test_race_condition" ) );

			// Execute all the queued commands - this will return an array of results, one
			// index per queued command response.
			var multiResponse = multi.exec();

			// If the .set() command set the key successfully, because it didn't exist,
			// the first index will come back as "OK"; otherwise, if the key already
			// existed, and the "NX" setting prevented the .set() from executing, this
			// array index will be undefined.
			if ( arrayIsDefined( multiResponse, 1 ) ) {

				setCount++;

			}

			// Check the time-to-live (TTL). If the key EXISTED, but did NOT HAVE an
			// expiration date, this will return as -1. This is the edge-case that we are
			// trying to find - this will indicate that .incr() - NOT .set() - created the
			// key that we are checking.
			if ( multiResponse[ 3 ] == -1 ) {

				throw( type = "UnexpectedTTLState", message = "TTL is -1" );

			}

		}

	// Only catch errors for the edge-case.
	} catch ( UnexpectedTTLState error ) {

		writeOutput( "InvalidState [ #error.message# ]." );
		writeOutput( "<br />" );
		writeOutput( "Iterations: #numberFormat( i, ',' )#." );
		writeOutput( "<br />" );
		writeOutput( "Set Executed: #numberFormat( setCount, ',' )# times." );

	}

</cfscript>
When I run this code, on a clean Redis database, I get the following output:
InvalidState [ TTL is -1 ].
Iterations: 4,457.
Set Executed: 70 times.
As you can see, the for-loop was able to execute 4,457 times before we managed to create a key without a TTL (time-to-live). And, in fact, the key had successfully been created with a TTL 70 times before we hit our race condition. Given the fact that our code only had two possible key-creation steps, it's clear that this key - the one with no TTL - was created by the .incr() command.
But, given the transaction, how could the .incr() command ever be called in a context in which the key doesn't exist? After all, if the key didn't exist, it should have been created by .set() in the first step of the MULTI transaction. The only explanation is that transactions are not run in absolute isolation - only in isolation from other client requests. As such, there is a small race condition in which my key expires after the .set() command has been skipped (because the key still existed), but before the .incr() command executes.
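To make that interleaving concrete, here is a toy in-memory model written in Python. It is not Redis or Jedis: the `ToyRedis` class, its hand-advanced clock `now_ms`, and the `run_transaction` helper are all hypothetical names, and the model simply assumes that expiration is checked each time a command touches the key. Under that assumption, one millisecond passing between the skipped SET and the INCR is enough to leave a key with no TTL.

```python
class ToyRedis:
    """Tiny stand-in for the three commands in the transaction (an assumption, not Redis)."""

    def __init__(self):
        self.now_ms = 0        # fake clock, advanced by hand
        self.values = {}       # key -> integer value
        self.expires_at = {}   # key -> absolute deadline in ms

    def _expire_if_needed(self, key):
        # Drop the key if its deadline has passed by the time it is touched.
        deadline = self.expires_at.get(key)
        if deadline is not None and self.now_ms >= deadline:
            self.values.pop(key, None)
            self.expires_at.pop(key, None)

    def set_nx_px(self, key, value, ttl_ms):
        self._expire_if_needed(key)
        if key in self.values:
            return None        # NX: the key exists, so nothing is written
        self.values[key] = value
        self.expires_at[key] = self.now_ms + ttl_ms
        return "OK"

    def incr(self, key):
        self._expire_if_needed(key)
        # A missing key starts from 0, and no TTL is attached to it.
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def pttl(self, key):
        self._expire_if_needed(key)
        if key not in self.values:
            return -2          # key does not exist
        if key not in self.expires_at:
            return -1          # key exists but has no expiration
        return self.expires_at[key] - self.now_ms


def run_transaction(redis, key, gap_ms=0):
    """Run SET NX PX 20, INCR, PTTL in order; gap_ms models time passing after SET."""
    set_reply = redis.set_nx_px(key, 0, 20)
    redis.now_ms += gap_ms
    return [set_reply, redis.incr(key), redis.pttl(key)]


redis = ToyRedis()
first = run_transaction(redis, "ben_test_race_condition")
redis.now_ms += 19     # the key is now 1 ms away from its deadline
second = run_transaction(redis, "ben_test_race_condition", gap_ms=1)

print(first)   # ['OK', 1, 20]
print(second)  # [None, 1, -1]
```

In the second run, the SET is skipped because the key still exists, the key then expires, and the INCR re-creates it. That matches the `-1` TTL that the real test eventually reported.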
In Redis, the MULTI, EXEC, and WATCH commands allow us to execute a set of commands as a single atomic operation. That operation runs in isolation from other client requests; but, not in absolute isolation. Both the WATCH command and (apparently) the EXEC command allow volatile keys to expire without terminating the transaction workflow. This can lead to tiny race conditions that you just have to account for in your code.
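One way to account for it, offered here as an assumption and not as the fix used in this post, is to inspect the PTTL reply that comes back from EXEC and re-apply the expiration when it is -1. The Python sketch below only decides which follow-up command to send; the `repair_command` helper and the tuple it returns are hypothetical. PEXPIRE is the Redis command that sets a millisecond TTL on an existing key.

```python
from typing import List, Optional, Tuple

NO_TTL = -1  # PTTL reply for a key that exists but has no expiration


def repair_command(
    exec_replies: List[object], key: str, ttl_ms: int
) -> Optional[Tuple[str, str, int]]:
    """Return a follow-up command to send, or None when the TTL is intact.

    exec_replies is the [SET, INCR, PTTL] reply list from the transaction.
    """
    pttl_reply = exec_replies[2]
    if pttl_reply == NO_TTL:
        # INCR re-created the key, so nothing attached a TTL to it.
        return ("PEXPIRE", key, ttl_ms)
    return None


healthy = repair_command(["OK", 1, 20], "ben_test_race_condition", 20)
orphaned = repair_command([None, 1, -1], "ben_test_race_condition", 20)

print(healthy)
print(orphaned)
```

The follow-up command runs outside the transaction, so the key can still sit without a TTL for a brief window. A reader comment on this post describes a different guard: deleting the key from a Lua script when PTTL returns -1.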
Reader Comments
I'm also looking into a similar issue. Did you try the Lua script commands (EVAL / EVALSHA)? It is possible for the issue to happen there as well.
@Sundar,
I was going to try to look into that next; but, given the time constraints, I didn't want to go down too many rabbit holes. Solve the problem and move on, and so forth.
That said, I am not sure if eval'ing a script on the server would actually help. I think the point of the script-based eval was to make it atomic; but, if that's what the .multi() is already doing, then I'm not sure the eval would actually solve this problem in a different way.
Of course, I've never written a single line of Lua, so take this with a grain of salt.
@Ben,
I agree with you. LUA should also technically have the same issue, though it depends on how REDIS executes the LUA script. BTW, I have now written the LUA script and after incrementing a field in my hash key, I check for pttl of the key and if the pttl returns -1, then I directly delete my key from the script.
@Ben, how did you actually solve the problem? We have exactly the same case (transaction with SET NX EX, followed by INCR) and we are worried about this potential race condition.