You're reading the free online version of this book. If you'd like to support me, please consider purchasing the e-book.

Going Deep on Feature Flag Targeting

As web application developers, we generally communicate with the database anytime we need to gather information about the current request. These external calls are fast; but, they do have some overhead. And, the more calls that we make to the database while processing a request, the more latency we add to the response-time.

In order to minimize this cost, most feature flag implementations use an in-memory rules engine which allows feature flag state to be queried without having to communicate with an external storage provider. This keeps the processing time blazing fast! So fast, in fact, that you should be able to query for feature flag state as many times as you need to without having to worry about latency.

Aside: Obviously, all processing adds some degree of latency; but, the in-memory feature flag processing—when compared to a network call—should be considered negligible.

That said, shifting from a database mindset to a rules-engine mindset can present a stumbling block for the uninitiated. At first, it may be unclear as to why you need anything more than a User ID (for example) in order to do all the necessary user targeting. After all, in a traditional server-side context, said User ID opens the door to any number of database records that can then be fed into any number of control flow conditions.

But, when you can't go back to the database to get more information, all targeting must be done using the information at hand. Which means, in order to target based on a given data-point (such as a user's role or subscription plan), said data-point must be provided to the feature flag state negotiation process.

I find that a pure function provides a helpful analogy. A pure function will always result in the same output when invoked using the same inputs. This is because the result of the pure function is based solely on the inputs and whatever internal logic exists.

Consider this function:

function calc( a, b ) {

	return( a * b + 1 );

}

If you invoke the calc() function with parameters (2,4) you will always get the result, 9. Even if you invoke this function 1,000 times in a row (with the same parameters), you will get 9 on every single execution. This is because the result is based entirely on the inputs and the inputs aren't changing.

If you wanted the result of a (2,4) invocation to change, you'd have to change the underlying logic embedded within the function itself. For example, you might change the +1 to a +2. This would change the absolute result of the invocation; but, the relative result of all invocations going forward would be consistent.

Feature flag targeting works the same way. If you want to target based on an email address (for example), you'd have to invoke the targeting mechanism using the email address as an input. And, the same email input will always result in the same targeting output; unless, and until, you change the logic embedded within the targeting system; and, that's where the rules engine comes into play.

The rules engine represents that embedded logic. Only, unlike a pure function, a feature flag system allows us to dynamically change the embedded logic at runtime; and, to do so without deploying new code.

A Simplistic Feature Flag Targeting Implementation

Sometimes an engineer won't truly understand how something works until they pull back the covers and look at the code. As such, I want us to collaboratively build a completely lacking, overly simplistic, toy version of a feature flags system. What we look at here will not represent production-grade quality by any means; but, I hope that it contains detail enough to shed light on the core mechanics of targeting.

As in the previous chapters, we're going to define our feature flags using a series of simple data structures. Each structure contains the following information:

  • The variants that can be served by the feature flag targeting.
  • The default rollout strategy.
  • An optional rule (or set of rules) that can target a subset of requests and apply a different rollout strategy.

A Boolean feature flag configuration, with no optional rule, looks like this:

{
	variants: [ false, true ],
	distribution: [ 100, 0 ]
}

The first array (variants) defines the collection of valid variants that can be returned from the feature flag evaluation process. The second array (distribution) defines the weighted distribution of the corresponding variants. In this case, the 100 in the first index means that 100% of users will receive the false variant (and the 0 in the second index means that 0% of users will receive the true variant).

If we wanted to start rolling this feature out slowly, we could change the distribution property:

{
	variants: [ false, true ],
	distribution: [ 90, 10 ]
}

Here, we're saying that 90% of the users will now receive the first variant (false) and 10% of the users will now receive the second variant (true).

The number of elements in the variants array must match the number of elements in the distribution array (as the latter corresponds to the weighted distribution of the former). And, in the case of a Boolean feature flag, only two elements make sense. But, if we were to serve up non-Boolean data, these arrays can be longer.

For example, if we wanted to implement dynamic logging in a production application, we could create a string-based feature flag that serves the minimum log level to be emitted by the application:

{
	variants: [ "error", "warn", "info", "debug", "trace" ],
	distribution: [ 100, 0, 0, 0, 0 ]
}

Here, the variants represent the minimum log levels emitted by the application. And, in this case, 100% of all requests are configured to emit error level log entries and higher.

If we were in the middle of an incident and wanted to temporarily turn on lower-level logging, we might move the 100 from the 1st index (error) to the 4th index (debug). Then, all requests coming into the production application would start emitting error, warn, info, and debug level log entries (but not trace):

{
	variants: [ "error", "warn", "info", "debug", "trace" ],
	distribution: [ 0, 0, 0, 100, 0 ]
}

Of course, turning debug level logging on for an entire application at one time could overwhelm the system and / or lead to a non-trivial cost increase (depending on how you aggregate logs). Instead, we might try to enable low-level logging for 5% of users and hope that the slight increase in logging provides enough insight:

{
	variants: [ "error", "warn", "info", "debug", "trace" ],
	distribution: [ 95, 0, 0, 5, 0 ]
}

Here, 95% of users will continue to receive error level logging and 5% of users will receive debug level logging (and higher).

Or, maybe we only want to turn debug level logging on for specific users, such as our internal Product Engineers. That's where our rule property comes into play. The rule property allows us to target a subset of users and apply a different distribution of variants to just those users.

A rule contains an operator, two operands (the input to test and the values to test it against), and the weighted distribution that will be used if, and only if, the request passes the operator assertion. So, if we want to turn low-level logging on for our own developers, our rule could use the IsOneOf operator and test the incoming User ID against a set of known developer IDs:

{
	variants: [ "error", "warn", "info", "debug", "trace" ],
	distribution: [ 100, 0, 0, 0, 0 ],
	rule: {
		operator: "IsOneOf",
		input: "UserID",
		values: [ 1, 16, 34, 2009 ],
		distribution: [ 0, 0, 0, 100, 0 ]
	}
}

With this configuration, the feature flag serves the error log level to 100% of users by default. However, if the requesting UserID is one of the given set of values, [1, 16, 34, 2009], then 100% of that subset of four specific users will receive the debug log level (and higher).

We can make this targeting even more flexible by allowing for an array of rules in which each rule targets a different cohort of users. Instead of turning debug level logging on for just our engineers, perhaps we also want to turn it on for the customers reporting the bug.

Let's assume that all affected customers are part of the "Example, Inc" enterprise, which accesses the application using the URL subdomain, example-inc. We can create an additional rule that targets just that subdomain:

{
	variants: [ "error", "warn", "info", "debug", "trace" ],
	distribution: [ 100, 0, 0, 0, 0 ],
	rules: [
		{
			operator: "IsOneOf",
			input: "UserID",
			values: [ 1, 16, 34, 2009 ],
			distribution: [ 0, 0, 0, 100, 0 ]
		},
		{
			operator: "IsOneOf",
			input: "CompanySubdomain",
			values: [ "example-inc" ],
			distribution: [ 0, 0, 0, 100, 0 ]
		}
	]
}

The rules are evaluated in order. And, at most, only one rule is applied in the given feature flag evaluation. So, if the user making the request is one of our developers (UserID: 34), the first rule will be applied and the second rule—testing CompanySubdomain—won't be evaluated.

If, however, the user making the request is a member of "Example, Inc", the first rule will be tested and won't pass (no matching UserID); and, the rules engine will move onto the second rule, which will pass (subdomain: example-inc). As such, the distribution in the second rule is what gets applied in the feature flag evaluation. And, in this case, 100% of all "Example, Inc" users will emit debug level logs (and higher).

Using these simple constructs—variants, distributions, rules, and operators—we have everything we need to create a powerful targeting system.

And now that we have a sense of how the underlying configuration works, we need to write the code that maps application requests onto feature flag variants. Even though we're keeping this implementation as simple as possible, we'll still benefit from using best practices like decomposition, coding to interfaces, and encapsulation; which all just means taking the larger concepts and breaking them up into smaller pieces that are easier to understand.

To that end, I want to model the concepts of the "operator" and the "distribution" as separate components. This will help keep all the code within our top-level "Features" component at the same level of abstraction.

In CFML, which is the programming language that we're using in this exploration, the concept of a "component" is what other languages might call a "class". It can be instantiated, with an optional constructor function named init(); it has both a public scope (this) and a private scope (variables); and it can contain both properties and methods.

Let's start with the implementation of our operators. Each type of operator (ex, IsOneOf, StartsWith, EndsWith, etc) will be modeled as a separate component; but, each of these components will adhere to the same API interface which exposes a single method—test()—which allows each operator to be used in a uniform manner:

public boolean function test(
	required any contextValue,
	required array values
)

Note: Although I'm defining the contextValue argument to be of type any, it must be a "simple" value. In CFML, this includes a string, number, Boolean, or date.

The contextValue parameter is the value associated with the incoming request. In our previous configuration, this might be the User ID of the authenticated user or the URL subdomain under which the application is being accessed. The values parameter is the array of configured values (in our feature flag) against which the contextValue is being tested. In our previous configuration, this would be the collection of known developer IDs or the enterprise subdomains.

In order to implement our IsOneOf operator, all we have to do is loop over the values collection and perform an equality check against the contextValue:

// Implementation of IsOneOf operator.
component {

	public boolean function test(
		required any contextValue,
		required array values
		) {

		for ( var value in values ) {

			// If any value matches, short-circuit the
			// loop execution and return true. We don't
			// need to test any other values once we
			// find a targeted match.
			if ( value == contextValue ) {

				return( true );

			}

		}

		return( false );

	}

}

The API interface for every operator is the same; but, of course, the implementation details within the test() method will vary. As another example, let's look at the EndsWith operator which checks the contextValue against a collection of suffix values:

// Implementation of EndsWith operator.
component {

	public boolean function test(
		required any contextValue,
		required array values
		) {

		for ( var value in values ) {

			// Extract the TRAILING portion of the
			// context value input that matches the
			// length of the given test value.
			var suffix = right( contextValue, len( value ) );

			// If any value matches the suffix
			// substring in the context value short-
			// circuit the loop execution.
			if ( suffix == value ) {

				return( true );

			}

		}

		return( false );

	}

}

Using this simple interface, we can create any number of operators: equality testing, prefix testing, suffix testing, substring testing, date comparisons, regular expression pattern matching, range testing, etc. Each test() implementation is a pure function that operates solely on its own invocation arguments.
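As one more illustration of the shared interface, here is a hedged sketch of what a StartsWith operator might look like. The Features component shown later in this chapter instantiates a StartsWith operator, but its code is not included here, so this is an assumption that simply mirrors EndsWith using left() instead of right(). The guard against empty test values is also my own addition:

```cfml
// Sketch of a StartsWith operator (assumed implementation,
// modeled on the EndsWith operator above).
component {

	public boolean function test(
		required any contextValue,
		required array values
		) {

		for ( var value in values ) {

			// Skip empty test values: left() is documented
			// to take a non-zero count.
			if ( ! len( value ) ) {

				continue;

			}

			// Extract the LEADING portion of the context
			// value input that matches the length of the
			// given test value.
			var prefix = left( contextValue, len( value ) );

			// Note: the CFML == operator compares strings
			// without regard to case.
			if ( prefix == value ) {

				return( true );

			}

		}

		return( false );

	}

}
```

With this sketch, a contextValue of "example-inc" would pass when tested against [ "example" ] and fail when tested against [ "inc" ]. Because string equality in CFML is case-insensitive, "Example-Inc" would pass as well; a production implementation would need to decide whether that is the desired behavior.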

The distribution concept is a bit trickier to implement. In our configuration object, we're defining the variant distribution as a weighted distribution based on a percent allocation. When we have this configuration:

{
	variants: [ false, true ],
	distribution: [ 90, 10 ]
}

... it means that the false variant should be assigned to 90% of requests and the true variant should be assigned to 10% of requests. And, it's implied that the variants should be assigned in a way that is consistent across requests.

An effective way to implement this logic is to create an array with 100 elements, wherein 90 of the indices contain the first variant (false) and 10 of the indices contain the second variant (true):

// Generate a 100-element array with:
[
	// 90 false values.
	false, false, false, false, false,
	false, false, false, false, false,
	false, false, false, false, false,
	// false, false, ... and so on, 90-times.

	// 10 true values.
	true, true, true, true, true,
	true, true, true, true, true
]

If we then randomly choose from this 100-element array, 90% of the random choices should result in false and 10% of the random choices should result in true, on balance.

Of course, nothing about a feature flags system is random. In fact, randomness would completely negate the value-add of gating code behind a feature flag. Instead, we need a way to consistently map the incoming request onto one of the 100 indices illustrated above. And for this, we need to use a context property that consistently and repeatedly represents the same user.

In our exploration, we're going to assume that this user context property is numeric so that we can use the modulo operator in order to easily bucket an open-ended number of values into our 100-index distribution. A WeightedRollout component will then take this property, construct the 100-index allocation, and return the variant chosen by mapping the property onto the allocation index.

Apologies: This code is a bit complicated. I tried to simplify it as much as possible; but, it has some essential complexity that cannot be reduced. If you can't follow this code, don't stress - it won't negatively impact your consumption of this book.

// Implementation of WeightedRollout distribution.
component {

	public any function getVariant(
		required numeric contextKey,
		required array variants,
		required array distribution
		) {

		var weightedVariants = [];

		// Generate our 100-element array using the
		// given weighted distribution.
		distribution.each(
			( percent, variantIndex ) => {

				for ( var i = 1 ; i <= percent ; i++ ) {

					weightedVariants
						.append( variants[ variantIndex ] )
					;

				}

			}
		);

		// Consistently convert the same contextKey
		// into the same array index that we will use
		// in our weighted distribution.
		var keyIndex = ( ( ( contextKey - 1 ) % 100 ) + 1 );

		return( weightedVariants[ keyIndex ] );

	}

}

In this code, our weightedVariants variable is the 100-element array that contains the materialized distribution of the given variants. We then take the contextKey (our numeric User ID), consistently map it onto a 1..100 inclusive range, and return the corresponding variant. This way, the same user is always assigned the same variant when given the same distribution.
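To make the modulo mapping concrete, here is a small worked example that applies the same keyIndex formula to a handful of user IDs. The IDs are arbitrary values chosen only for illustration, and the variants in the comments assume the [ 90, 10 ] distribution from earlier:

```cfml
// Arbitrary user IDs, chosen only for illustration.
userIDs = [ 1, 90, 91, 100, 191, 12345 ];

for ( userID in userIDs ) {

	// Same formula used inside getVariant() above.
	keyIndex = ( ( ( userID - 1 ) % 100 ) + 1 );

	writeOutput( "User #userID# maps to index #keyIndex#<br />" );

}

// Resulting indices, and the variant each one would select
// under a [ 90, 10 ] distribution (indices 1-90 hold false,
// indices 91-100 hold true):
//
// 1     => index 1   (false)
// 90    => index 90  (false)
// 91    => index 91  (true)
// 100   => index 100 (true)
// 191   => index 91  (true)
// 12345 => index 45  (false)
```

Notice that users 91 and 191 land in the same bucket. Any two IDs that differ by a multiple of 100 will always receive the same variant, which is the trade-off of bucketing an open-ended set of IDs into 100 slots.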

At this point, we have our operator implementations and our weighted distribution implementation. Now, let's wire them together into a coherent feature flags system. And, to configure this feature flags system, we'll use a dictionary of individual feature configuration objects. This is the data structure that we'll administer in order to create dynamic runtime behaviors within our application.

Each feature within this data structure must have a unique name so that our application code has something to reference when incorporating feature flags into the request control flow. Example:

{
	"new-checkout-process": {
		variants: [ false, true ],
		distribution: [ 50, 50 ]
	},
	"internal-admin-tools": {
		variants: [ false, true ],
		distribution: [ 100, 0 ],
		rule: {
			operator: "IsOneOf",
			input: "userType",
			values: [ "Admin" ],
			distribution: [ 0, 100 ]
		}
	}
}

Remember that our feature flag evaluation process is acting like a pure function. Which means that when we check the state of a feature flag in the context of a given request, we have to pass in the request context data such that our pure function has all of the inputs it needs in order to process the evaluation.

At a minimum, checking the feature flag state requires the name of the feature in question (ex, new-checkout-process) and the unique ID representing the user. This unique ID is what we use to perform the modulo calculation within our WeightedRollout component. In our request context object, we're going to refer to this unique ID as key:

features.getVariant(
	"new-checkout-process",
	{
		key: request.user.id
	}
);

Since our feature flags implementation uses a weighted distribution of variants, all calls to the .getVariant() method must include the key property. And, if we ever check for the state of a feature flag that includes a rule that operates on an additional property, we must include that additional property in the context object.

So, if we wanted to check the internal-admin-tools feature, which has a rule that operates on userType, we'd have to pass userType in as an additional entry in the context object in the .getVariant() invocation:

features.getVariant(
	"internal-admin-tools",
	{
		key: request.user.id,
		userType: request.user.type
	}
);

With our feature flags configuration object and our request context object, the .getVariant() method implementation ends up doing little more than piping properties into function calls. Here's our Features component—the config object being passed into the init() constructor is our feature flags configuration data structure discussed earlier.

There's more code here than in the previous examples. I've tried to highlight the parts of this component that are the most meaningful regarding a high-level understanding of how feature flag evaluation works:

// Implementation of Features.
component {

	// Constructor function.
	public void function init( required struct config ) {

		variables.config = arguments.config;
		// Instantiate and cache instances of our
		// operator components. The object-keys here
		// must match the `operator` values in the
		// feature flags configuration.
		variables.operators = {
			IsOneOf: new operators.IsOneOf(),
			NotIsOneOf: new operators.NotIsOneOf(),
			StartsWith: new operators.StartsWith(),
			EndsWith: new operators.EndsWith()
		};
		// Instantiate and cache our rollout component.
		variables.weightedRollout = new rollouts.WeightedRollout();

	}

	public any function getVariant(
		required string featureName,
		required struct context
		) {

		var feature = config[ featureName ];
		var distribution = feature.distribution;

		// The rules property is OPTIONAL. If it doesn't exist,
		// fallback to an array so we can assume its existence.
		var rules = ( feature.rules ?: [] );

		// For simplicity, we want to allow for an OPTIONAL
		// single rule property as well. In that case, we'll
		// override the rules collection (so that we can
		// use an array going forward).
		if ( feature.keyExists( "rule" ) ) {

			rules = [ feature.rule ];

		}

		for ( var rule in rules ) {

			var operator = operators[ rule.operator ];
			var contextValue = context[ rule.input ];
			var values = rule.values;

			// If the operator test "passes", override the
			// distribution for the current feature flag evaluation.
			if ( operator.test( contextValue, values ) ) {

				distribution = rule.distribution;
				// The first rule that matches, wins. No need
				// to check any other rules.
				break;

			}

		}

		var contextKey = context.key;
		var variants = feature.variants;
		var variant = weightedRollout
			.getVariant( contextKey, variants, distribution )
		;

		return( variant );

	}

}

When we initialize our Features component, we cache our configuration (the dictionary of named feature flags), we instantiate and cache our operators, and we instantiate and cache our weighted distribution rollout strategy. Our .getVariant() method then weaves these parts together when the application needs to check the state of a given feature in the context of a given request.
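To connect this back to the rules discussed earlier, here is a hedged usage sketch that feeds the two-rule logging configuration into the Features component. The feature name "log-level", the "acme" subdomain, and the user ID 500 are hypothetical values invented for this example; and, the sketch assumes that all four operator components referenced in the constructor exist at the paths shown above:

```cfml
features = new lib.Features({
	// Hypothetical feature name for the logging example.
	"log-level": {
		variants: [ "error", "warn", "info", "debug", "trace" ],
		distribution: [ 100, 0, 0, 0, 0 ],
		rules: [
			{
				operator: "IsOneOf",
				input: "UserID",
				values: [ 1, 16, 34, 2009 ],
				distribution: [ 0, 0, 0, 100, 0 ]
			},
			{
				operator: "IsOneOf",
				input: "CompanySubdomain",
				values: [ "example-inc" ],
				distribution: [ 0, 0, 0, 100, 0 ]
			}
		]
	}
});

// A developer: the FIRST rule matches, so the second rule
// is never evaluated. Result: "debug".
developerLevel = features.getVariant(
	"log-level",
	{ key: 34, UserID: 34, CompanySubdomain: "acme" }
);

// An "Example, Inc" customer: the first rule fails and the
// SECOND rule matches. Result: "debug".
customerLevel = features.getVariant(
	"log-level",
	{ key: 500, UserID: 500, CompanySubdomain: "example-inc" }
);

// Anyone else: no rule matches, so the default
// distribution applies. Result: "error".
defaultLevel = features.getVariant(
	"log-level",
	{ key: 500, UserID: 500, CompanySubdomain: "acme" }
);
```

Note that every call passes both UserID and CompanySubdomain. Since the rules loop reads context[ rule.input ] for each rule it reaches, and CFML throws an error when a struct key doesn't exist, the context object needs an entry for every input that an evaluated rule might reference.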

If we wanted to change the feature flags configuration, we could re-instantiate the Features component with a new data structure. But, most applications cache components as part of the application bootstrapping process. As such, in order to enable the dynamic runtime behavior, we need to add a way to update the configuration within an existing Features instance. To do that, we can add a .setConfig() method:

component {

	// ... truncated code ...

	public void function setConfig( required struct config ) {

		variables.config = arguments.config;

	}

}

With that method, we can change the feature flags configuration at runtime, which can change the outcome of a feature flag evaluation. To illustrate that, let's perform a feature flag evaluation twice in a single request; but, change the distribution of the feature flag mid-request:

// These two distributions will always serve up the
// first variant or the second variant, respectively.
ALWAYS_FIRST_VARIANT = [ 100, 0 ];
ALWAYS_SECOND_VARIANT = [ 0, 100 ];

features = new lib.Features({
	"new-checkout-process": {
		variants: [ false, true ],
		distribution: ALWAYS_FIRST_VARIANT
	}
});

// Outputs FALSE since all users get the first
// variant under the current configuration.
writeOutput(
	features.getVariant(
		"new-checkout-process",
		{
			key: 12345
		}
	)
);

// Change the configuration at RUNTIME.
features.setConfig({
	"new-checkout-process": {
		variants: [ false, true ],
		distribution: ALWAYS_SECOND_VARIANT
	}
});

// Outputs TRUE since all users get the second
// variant under the new configuration.
writeOutput(
	features.getVariant(
		"new-checkout-process",
		{
			key: 12345
		}
	)
);

Running this code gives us the following output:

false  // ALWAYS_FIRST_VARIANT
true   // ALWAYS_SECOND_VARIANT

The first call to the .getVariant() method uses the initial configuration, resulting in false (with 100% of evaluations returning the first variant). Then, the second call to .getVariant() uses the runtime-updated configuration, resulting in true (with 100% of evaluations returning the second variant).

And just like that, we have a functioning feature flags implementation! It has no data validation and no type safety; and, it's missing the majority of functionality that you'd find in a production-grade implementation (such as a user interface for administering the feature flag configurations). But, I'm hoping that by diving into the mechanics of variants, user targeting, and percent-based distribution, you'll start to form a stronger mental model for how feature flags can be woven into both your application code and your product development workflow.
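As a small illustration of the data validation mentioned above, here is a hedged sketch of a check that could run before a configuration is handed to the Features component. The isValidFeature() function is a hypothetical helper, not part of the implementation in this chapter. It encodes two assumptions that the WeightedRollout component relies on: each variant has exactly one weight; and, the weights total 100 so that every index in the 1..100 range exists in the materialized array:

```cfml
// Hypothetical helper, not part of the chapter's implementation.
function isValidFeature( required struct feature ) {

	// Every variant needs exactly one corresponding weight.
	if ( arrayLen( feature.variants ) != arrayLen( feature.distribution ) ) {

		return( false );

	}

	// The weights are percentages. If they total less than
	// 100, some key indices would point past the end of the
	// weighted array.
	return( arraySum( feature.distribution ) == 100 );

}

// Passes: two variants, two weights, totaling 100.
isValid = isValidFeature({
	variants: [ false, true ],
	distribution: [ 90, 10 ]
});

// Fails: the weights only total 95.
isInvalid = isValidFeature({
	variants: [ false, true ],
	distribution: [ 90, 5 ]
});
```

This sketch only checks the default distribution. A fuller version would presumably apply the same checks to the distribution inside each rule.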

Targeting With Non-Numeric key Values

In our implementation above, every call to .getVariant() requires a key to be passed in:

features.getVariant(
	"new-checkout-process",
	{
		key: request.user.id
	}
);

This key is used to power the weighted distribution calculation, which uses the modulo operator internally. As such—for the sake of simplicity—I asserted that the key had to be a numeric value.

But, in reality, it doesn't have to be numeric. The key can be any simple value that converts to a string. Such as an IP address:

features.getVariant(
	"new-checkout-process",
	{
		key: request.client.ipAddress
	}
);

When given a string, we can use a CRC-32 checksum to consistently translate a non-numeric value into a 32-bit integer. This level of detail goes beyond the scope of this book; but, I wanted to provide one example for the sake of completeness. Especially since I do make use of this non-numeric key behavior later on in the book (see Use Cases).

In the following code, we take a string-based IP address and translate it into a number using a CRC-32 checksum (by way of Java's CRC32 class implementation).

// Using the IP address as the "key"
// for feature flag targeting.
key = "192.168.1.1";

// Convert the String key into a Number
// using a CRC-32 checksum.
keyChecksum = createObject( "java", "java.util.zip.CRC32" )
	.init()
;
keyChecksum.update( charsetDecode( key, "utf-8" ) );

numericKey = keyChecksum.getValue();

// Use the BigInteger class to perform the
// modulo operation on the numeric key
// against the 100-bucket index.
BigIntegerClass = createObject( "java", "java.math.BigInteger" );
bigKey = BigIntegerClass.valueOf( numericKey );
bigBucketCount = BigIntegerClass.valueOf( 100 );

writeDump( bigKey.mod( bigBucketCount ) + 1 );

This is not simple code; so, don't worry if it makes no sense—understanding it isn't that helpful for the book. All that you need to know is that when we use the key of 192.168.1.1 (for example), we can consistently convert that string value into the numeric value, 64; which, we can then use to access the 100-element weighted distribution array.

Which is all to say, we can power a weighted distribution of feature flag variants using non-numeric key values. This is something that we'll do in the upcoming chapters.
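If it helps, the CRC-32 steps above could be wrapped in a small function so that any string key can be reduced to a bucket before being handed to the feature flag system. The getBucketIndex() name is hypothetical; the body is just the same code from above, rearranged:

```cfml
// Hypothetical helper that wraps the CRC-32 steps shown above.
// Returns a number in the 1..100 inclusive range.
function getBucketIndex( required string key ) {

	// Convert the String key into a Number using a
	// CRC-32 checksum.
	var keyChecksum = createObject( "java", "java.util.zip.CRC32" )
		.init()
	;
	keyChecksum.update( charsetDecode( key, "utf-8" ) );

	// Use the BigInteger class to perform the modulo
	// operation against the 100-bucket index.
	var BigIntegerClass = createObject( "java", "java.math.BigInteger" );
	var bigKey = BigIntegerClass.valueOf( keyChecksum.getValue() );
	var bigBucketCount = BigIntegerClass.valueOf( 100 );

	return( bigKey.mod( bigBucketCount ) + 1 );

}

bucketIndex = getBucketIndex( "192.168.1.1" );
```

Since the result is already in the 1..100 range, it could be passed as the key in the context object without changing the WeightedRollout component: for any value from 1 to 100, the existing formula, ( ( key - 1 ) % 100 ) + 1, maps the value back onto itself.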

Have questions? Let's discuss this chapter: https://bennadel.com/go/4544