On Always Returning Collections From Data Access Layers (DAL) In ColdFusion
In my ColdFusion applications, I always have a "Data Access Layer," often referred to as a "DAL". This layer abstracts the persistence mechanisms for the application; and, encapsulates the complexities of interacting with low-level data inputs and outputs (I/O). While I love using a DAL, I've never felt confident in how I manage requests for records with a unique identifier. And, in fact, my approach changes with new implementations. This weekend, however, after listening to the JS Party episode on Postgres.js, I've decided to always return some form of collection from the data access layer in my ColdFusion applications.
Historically, I've thought about "filtering" and "direct access" as two distinctly different things:
Filtering: Get me records based on some column values. Zero or more records may match this filter; so, returning zero rows is a legitimate outcome.
Direct Access: Get me the record with a given unique value (ex, primary key). A single record is expected to exist; so, returning zero rows is an invalid request.
When filtering, my data access layer will happily return an empty collection if no records match. However, when using a direct access approach, I will often throw an error if the record with the given key cannot be found:
component {
/**
* Pseudo-code example of a direct-access method.
*/
public struct function getByID( required numeric id ) {
if ( ! cacheIndex.keyExists( id ) ) {
throw( type = "DAL.NotFound" );
}
return( cacheIndex[ id ] );
}
/**
* Pseudo-code example of a filter method.
*/
public array function getByUserID( required numeric userID ) {
var results = cache.filter(
( record ) => {
return( record.userID == userID );
}
);
return( results );
}
}
As you can see, in the "get by ID" method, if the record with the ID (primary key) doesn't exist, I am throw()
ing an error. I believe I started following this pattern for two reasons:
I really like the idea that a method (whether inside or outside of the DAL) throws an error when it cannot do what it was asked to do - ie, when it cannot uphold its "contract". In this case, the method was asked to retrieve a specific object; and, it cannot do that, so it must throw an error.
In some cases, it made the calling context easier to code because I never had to check for
isNull()
on the resultant value. I always knew that I would either get the record in question; or, the code would throw an error and the control-flow would return to higher-up in the call-stack. This allowed my calling code to be rich with "logic" and free of error-handling noise.
The "get by UserID" method, on the other hand, will happily return an empty collection if none of the records have the given userID
value. This is because there is no "contract" that this method should return records.
At first, this dichotomy worked well. But, over time, I've run into too many situations where having the "direct access" method throw an error was actually creating friction, not removing it. Then last week, I had to create a data access layer (DAL) that abstracted a remote API, not a database; and, translating 404 Not Found
HTTP response codes into throw()
statements just felt super dirty because it starting to build what felt like business logic into my data access layer.
Circling back to the JS Party podcast, the hosts discussed the fact that the SQL library for Postgres will always return an array, even when no rows can be found. And, more importantly, they talked about how nice that was. This is exactly what the CFQuery
object and the queryExecute()
method do in ColdFusion; but, for some reason, hearing people talk about the comforting predictability of this "symmetrical access pattern" in a different context gave my brain a moment to think about it from a fresh perspective.
And, what I've decided is that, going forward, my data access layer (DAL) will always return a collection when asked for records, even when a single record is expected:
component {
/**
* Pseudo-code example of a direct-access method.
*/
public array function getByID( required numeric id ) {
if ( cacheIndex.keyExists( id ) ) {
return( [ cacheIndex[ id ] ] );
}
return( [] );
}
/**
* Pseudo-code example of a filter method.
*/
public array function getByUserID( required numeric userID ) {
var results = cache.filter(
( record ) => {
return( record.userID == userID );
}
);
return( results );
}
}
In this approach, if there is no record with a given unique key, the DAL just returns an empty array. This means that the calling context - the associated "service layer" object - will have to check for this case; but, this will make all of my service layer / data access layer interactions consistent. And, I believe that this consistency will make my code simpler and more predictable in the long-run.
I think what I'm also coming to realize is that "filter" methods and "direct access" methods are NOT fundamentally different. They are both, ultimately, just filtering methods. What makes them different is what the calling context expects, not what the data access layer expects. In fact, we can code both patterns into a single "get by filters" method:
component {
/**
* Pseudo-code example of a filter method.
*/
public array function getByFilters( required struct filters ) {
var results = cache.filter(
( record ) => {
// Filter on id.
if ( filters.keyExists( "id" ) && ( record.id != filters.id ) ) {
return( false );
}
// Filter on userID.
if ( filters.keyExists( "userID" ) && ( record.userID != filters.userID ) ) {
return( false );
}
// If none of the filters excluded the record, allow it.
return( true );
}
);
return( results );
}
}
When we code it like this, it becomes much more clear that "get by ID" and "get by filter" are fundamentally the same: they are merely ways to winnow down records to those that match. Whether or not the results of the filtering are "valid" is a matter for the "service layer", not the "data access layer".
As always, programming is a journey. And, it's possible that this new path in my journey will prove to be problematic; and, if so, I'll address it then. But, for the time being, I'm going to start programming my data access layers to return arrays, even empty one; and, we'll see how it goes.
This Does Not Apply to Aggregates and Other Use-Cases
This use of collections only applies to DAL methods that aim to return records. This does not apply to DAL methods that return aggregates and existence checks. Those methods can continue to return numbers and Booleans.
Want to use code from this post? Check out the license.
Reader Comments
What we do for a DAL is this:
All services have a DAO. Let's say "CustomerService" and "CustomerDAO".
CustomerService has:
.getCustomerByID(id)
.getCustomerSearch(CustomerSearchCriteria)
Our BaseSearchCriteria object contains things like pagination request data (page: 2, pageSize: 10 (records)).
Our BaseSearchResults object contains things like TotalRecords (5250 found for the criteria) and RecordsReturned (10) because of pagination. Each inherited object (like "CUSTOMERSearchCriteria") contains getter/setter/properties for those specific items (CustomerName, CustomerNameLike), etc.
There are matching functions in the CustomerDAO, except, for when we have specific stored procs to call for taking advantage of specific indexes, determined through slow query analysis. For example, call proc A for normal searches, but if we've populated another search criteria value that requires a JOIN, then call this proc instead.
The DAO functions all return the raw queries. The Service functions then populate the query rows into model objects, and place everything in a Results object that includes pagination information. The Service functions also do what we call "the DAO split" where it analyzes the criteria (as noted above) for different DAO (and subsequently, different proc) calls.
For the .getXXXByID() calls, we expect the service to simply NOT return, therefore a check in the handler or other calling code of:
When it comes to something that should return one or more records, of course the service always returns an array, and the array could be 0-length or more. I should note that, while it might seem heavy (but isn't due to pagination), every "model" (e.g. Customer) is wrapped in a CustomerSearchResult that contains the TotalRecords. Yes, this means a 10 paginated result set is actually 20 objects. But, in our case, the Customer object itself was pulled from cache anyway and not created, only referenced for the return.
Lastly, for REST/API/Ajax calls, the entire array (or single model) is passed off to a DataExtractService which looks at the object, determines its type, and knows what (and how) to extract data into arrays/structs for JSON formatting compatibility. (We use Coldbox, and have a Wirebox listener/observer that injects the IOC (Wirebox) name into each model, service, etc. as its created, so we can always call getIOCAlias() in DataExtractService to determine its type.)
Since the majority of the calls are NOT ajax/json, we isolated the struct-extraction out.
Just another viewpoint to share! :) Great blog as always, Ben!
@Will,
Great write-up. Sounds like you have some pretty solid practices going on there. And, you raise something that I don't think I was even really thinking much about, which is that there's the "DAO", the "service layer", and then whatever is calling the service (controllers, workflows, scheduled tasks, etc). I think for this post, I was mostly thinking about the Service to Gateway (DAO) step. Meaning, the Service layer itself might still pass-on
null
values, orthrow
errors, depending on what it wants to do -- it's only the DAO that will make sure to pass back empty collections.Of course, I was trying to use the term "collection" abstractly so that my thinking wasn't too tightly coupled to the query object. But, in reality, in 99.99% of all DAO cases, I am returning a query object which is just a fancy collection with a
.recordCount
property instead of a.len()
method.All to say, from what I'm seeing in your comment, are two approaches aren't really at odds - I think they could / would work nicely together.
Personally we return a structure (or a model) for a getOne(:id) and an query (or a store) for a get(:config) (as we dont know how many records there are). then we can use the record directly rather than unwrap it.
Just thinking throwing an error for zero records would cause other issues and extra coding in many cases also... like telephone records, or orders for a person in a CRM which regularly have no records, so you'd then have to catch that error and ignore it? #hmmm sounds messy... just seems more consistant to have query.isEmpty(throw=true) from framework instead if you want that funcionality perhaps
Also revisiting our DAL and methods, we also have none()/getNone() as a method on our DAL as you may just want the field names or the query construct.
@Dawesi,
Yeah, having to catch errors just to see if something exists (for example), can definitely be a pain. Sometimes it makes sense, sometimes it's just a nuisance. If ColdFusion had always had a solid concept of
null
(they do nowish, but I haven't tried turning it on), then I wonder if I would have relied on that more for non-results.That's an interesting idea with the
none()
method. Curious what kind of stuff uses that? Is that more for internal debugging?On a somewhat related note, I'm going to start playing with "Maybe" data-structures. That is, a representation of an entity that may or may not exist at the moment:
www.bennadel.com/blog/4328-considering-control-flow-and-transient-data-relationships-in-coldfusion.htm
This way, I can easily - and efficiently - consume transient relationships in my workflows without having to do a bunch of Null-checks or try/catch blocks.
Post A Comment — ❤️ I'd Love To Hear From You! ❤️
Post a Comment →