I Always Design The Database Schema First, Then The ColdFusion Code

By Ben Nadel

Published 2022-01-27 in ColdFusion, SQL, Work — Comments (16)

Throughout my career, I've often heard that it is a best practice to design your "Domain Objects" and your "Business Logic" first and then, eventually, to design a database schema that allows your domain objects to be persisted. I've even seen many ORM (Object-Relational Mapping) systems that will happily churn-out database schemas based solely on your Objects (and their metadata). Personally, I've never done this. In fact, I find this approach to be antagonistic to how my brain operates. When I'm working on a ColdFusion application (or a feature therein), I always start with the database schema first and then layer the ColdFusion application upon it using an iterative, ground-up approach.

Relational Database Schemas Are Agile, Mostly Like Code

Relational database schemas are living, breathing organisms that have to evolve alongside the needs of the business. Adding tables, modifying tables, altering column definitions, adjusting database indexes - it's all par for the course. Some database operations are certainly more challenging than others: years ago, I used to manually perform online ALTER TABLE migrations using shadow tables.

These days, however, even relational database systems (as opposed to NoSQL document databases) have greatly lowered the barrier to entry. More mutations can be applied without locking or downtime. And, for operations that do require a table rebuild, tools like Percona's pt-online-schema-change exist to make this relatively painless (and transparent to your users).

Or course, locking really only becomes a concern with either high data volume or high user traffic - a "good problem" that most of don't actually have enough of. I'm sure for the majority of uses-cases, a good old ALTER statement will just get the job done.

To be clear, I'm not trying to argue that a relational database schema is as dynamic as a NoSQL document database. But, we do need to dispense with any anachronistic notions that relational database schemas are poured in concrete and unable to be changed. That simply isn't the world we live in anymore.

Building Blocks, From The Ground-Up

When I say that I "build the database schema first", I don't mean that I build the entire schema first and then the entire application on top of it. That would be entirely foolish and a waste of time. After all, we never know as little about our applications as we do "right now". We simply don't have enough information on-hand to design out an entire anything at this moment, let alone a database schema.

When I say that I "build the database schema first", what I mean is that, for any facet or feature within an application, I design just enough database schema so that my brain has something on which to anchor. Like a toddler, my brain needs a "safe space" from whence it can start to explore. Without the safe space of a table definition, my brain feels lost - like it doesn't know which direction to go in; or even which directions it has available to it.

Once I have a CREATE TABLE statement in front of me, I can start to build up the world around it in my head. I can start to think about data-access patterns that my application will likely use. And perhaps most importantly, I can start to consider uniqueness constraints and how those uniqueness constraints may be able to drive idempotent workflows.

Obviously, I won't know everything that I need to know about how this data will be used. But, there are some timeliness questions that always make sense:

Will I ever need to list-out the data in this table? If so, does the data need to be filtered or sorted? If so, should those columns be included as part of an index (spoiler alert, yes, most likely).
Will I ever need to aggregate any of the rows for different User Interfaces (UI) within the application? If so, do I have indexes that allow those aggregations to be calculated with good performance? Do I need different indexes to facilitate different UI aggregations?
If the table has columns that reference primary keys in another table, does it make sense for those values to be duplicated? Or, should I use a UNIQUE KEY in order to enforce uniqueness as a safety fall-back beyond the business logic?
Does the table have any non-foreign references that need to be unique (such as an email address)? If so, should I use a UNIQUE KEY in order to enforce that uniqueness for data integrity?
Will I ever need to "soft-delete" or "archive" any of the rows? If so, do I need to take that into consideration with my index design? Or, perhaps move the archived rows into an "archive table" in order to reduce the volume within the active data table?
Do any of the columns within one table merit a more robust data model? Meaning, should a column be "promoted" to it's own table so that I can store more relevant data about the concept being captured by that column?
Are any of the columns likely to change at a radically different tempo - perhaps I need to split my high-writes and my high-reads into different tables in order to reduce row-level locking?

Again, my goal here isn't to get it right; my goal is just to get the ball rolling - to give my brain something to start chewing on. I know that the database schema will change over time and I'm totally OK with that.

Data Access Layer

Once I have my CREATE TABLE statement(s) in place, I start working on the next building block: my data access abstraction components (aka "Data Access Objects" (DAO), aka "Data Gateways", aka "Data Repositories"). These are the ColdFusion components that encapsulate the SQL statements for CRUD (Create, Read, Update, Delete) operations. Usually, these ColdFusion components will have methods like:

GetThingByID( id )
GetThingsByFilter( ...filters )
DeleteThingByID( id )
DeleteThingsByFilter( ...filters )
UpdateThingByID( id, ...values )

Then, depending on much I've thought-through the application, I may have some more specialized operations like:

IncrementThingColumnValue( id )
DoesThingExistByID( id )

These data access objects are here for low-level CRUD-style operations only. These are not for reporting purposes! One of the biggest mistakes that I see teams make is an attempt to collocate CRUD queries and reporting queries into the same components - this becomes a maintenance nightmare (and often a performance nightmare) that only compounds over time. Reporting queries should be broken-out into their own specialized components.

Entity Access Layer (aka, Service Layer)

Once I have my data access components in place, I start working on the next building block: my entity access components. These ColdFusion components deal entirely with native ColdFusion data structures (think Structs and Arrays) and hide all low-level data-persistence operations through the use the data access layer.

I usually refer to this layer as the "Service Layer" of the application. It enforces business logic that is local to a given entity or "aggregate" (in Domain Driven Design parlance). Though, no all ColdFusion components in the service layer deal with "entities" - some are more infrastructure related (like sending emails and maintaining rate-limit counters).

Usually there is a 1-to-1 relationship between an Entity Access component and a Data Access component. So, for a theoretical Thing entity, I would have:

ThingService.cfc - the entity access layer.
ThingGateway.cfc - the data access layer for the entity access layer.

... and, the ThingGateway would be provided to the ThingService using Inversion of Control (IoC), possibly wired together using a Dependency-Injection (DI) container; though, for simple applications, I manually wire this kind of stuff together in order to reduce complexity.

ASIDE: A data access object (DOA) should only ever be consumed by its relevant entity access object. The moment a data access object is being used by multiple higher-level components, I consider that a code-smell. Instead, the "entity access" component should be the shared point of interaction.

The methods on the entity access layer may be reminiscent of the underlying data access component, but they are usually more "business focused". Meaning, the entity access components never expose a GetByFilter() style method; instead, they would expose several different GetBy___() methods, each of which is focused on accessing the data through different facets and may have different constraints.

Commands (aka, Workflows) And Queries (aka, Partials) Layer

Once I have my CREATE TABLE statements that feed my data access layer which is, in turn, utilized by my entity access layer, I have about as much "theoretical" code as I can create using nothing but my nascent mental model. At this point, I can start to work on the next building blocks: my commands and queries. Together, this is the layer that represents the "business rules" of the application. While the entity access layer may contain data-level constraints, the commands and queries contain business-level constraints; and, how those constraints need to be applied across multiple entities.

ASIDE: Entities don't know anything about each other. They may reference each other though foreign keys; but, this is just opaque data. All cross-entity constraints need to be enforced at the command layer.

The Command / Workflow layer is the set of ColdFusion components that tells the application to "do" something. Create this, delete that, upgrade this account, send out that alert. It's the way the state of the application is mutated over time.

The Query / Partial layer is the set of ColdFusion components that aggregates data for specific views within an application. And while I believe that this is an integral part of the application, if you squint hard enough, you might consider the Partials an "optimization technique." It is there to pull data out of the database with high-performance intended to answer a specific set of UI questions.

In theory, you could gather all your UI data using nothing but the entity access layer; though, I suspect this would quickly lead to N+1 problems. Which is why I consider the Query / Partial layer a necessity, not just an optimization.

This is where most of the iteration happens. All of the lower-layers within the application can be started using the power of thought and imagination. But, the Commands and Queries are where the rubber meets the road - where the theory meets the practice. At this point, I'll start to add missing entity layer methods that I need in order to power the business logic of the Command / Workflow components.

Much of the time, these new entity layer methods can turn around and consume existing data layer methods. However, sometimes new data layer methods need to be created as well. And, sometimes, new database table columns need to be created (or altered) in order to serve these new methods. A new table index may even have to be added in order to make sure that a newly-exposed data access pattern is performant.

The Query / Partial components also impact the structure of the database; however, this is usually more about indexes and performance. Though, sometimes, a partial will precipitate a rethinking of an existing structure and a potential decomposition of a single table into multiple tables.

Iterating Up And Down The Stack

When I start a new application or feature, I always start with the database design. This gives me a solid base on which to build my mental model and my ColdFusion code. I then work up the stack, creating larger "building blocks" that can be used to perform larger operations. As I build out these operations, though, I will often have to work back down the stack adding new capabilities. This in turn allows me to work back up the stack, fleshing out business logic.

And so on, up the stack and back down the stack, iterating as the needs of the application become more readily apparent; and, as the needs of the business change over time.

Certainly, I've made many mistakes in my ColdFusion application development. And, after maintaining a single, large application for a decade, I've learned a lot about what works and what doesn't work. And for me - and the way my brain functions - starting with the database schema first and then working my way up is a recipe for success.

Epilogue: ColdFusion Application Layering

I've been working with ColdFusion since 1999. And, in that time, I've learned a lot of ways to not build a maintainable application. This journey of discovery continues to this day, especially as technology evolves and the expectations of our users changes.

And while most this post talks about my iterative approach to designing a layered ColdFusion application, an older post of mine - Software Application Layers And Responsibilities, 2019 Edition - goes much more in depth on what each one of these layers does.

Of course all of this will continue to change as I get more experience. So, consider this my approach "as of now" - not a static concept frozen in time.

Short link: https://bennadel.com/4191

Reader Comments

Jason Jan 27, 2022 at 4:17 PM

142 Comments

Yep, totally with you. I haven't used CF in 5+ years now, but I was a CF dev for 20 years from 1996 on, and I have always worked up from the database. Partly because I started with database work before I got into web work, but also because my mental model is just like yours: once I can "see" the data structure, it's so much easier to think about workflow and UI, and then those paths quickly reveal changes / additions for the data structure.

These days I mostly use Angular on the frontend and Express + data access libraries on the backend (isn't it nice how CF's server lets us do both in a single app, though?!?), but I still start with that mental data model and then the actual data model before creating API endpoints and then Angular Services to consume them. It has just always made sense to me.

And, yes, it's generally (usually-but-not-always) easy to change data structures as necessary! Don't Fear the Refactor LOL

Ben Nadel Jan 27, 2022 at 4:20 PM

15,996 Comments

@Jason,

Totally! I think people over-estimate how hard it is to change a database table. And, sometimes, it is. But, most of the time, it's not that different from changing other stuff.

Sometimes, I'll actually DROP a table in production if it's still behind a feature flag and I need to change the structure more than a few simple alterations. If I know I need to make big changes, I'll just drop the whole table and create a new one. There's no sense in throwing good money after bad, so to speak, when a page-1 rewrite is what the doctor ordered (how many metaphors can I mix there?!).

Charles Robertson Jan 27, 2022 at 9:22 PM

462 Comments

Ben. Another great article! I think I am fairly similar, in that I need DB skeleton up & running before I start on the client/server stuff.

@Jason This is like the twilight zone. Although I still use CF for private projects, I haven't had any commercial CF work since about 2017. I spent 20 years working for various UK Companies as a CF Dev. There are very few CF opportunities in the UK, now.

I started working as an Angular Dev, commercially, a couple of years ago. And I also create NodeJs REST API apps with Express. Although I enjoy Angular/Node, wow do I miss being part of a Coldfusion team 😢 Anyway, it's interesting how similar our paths are 🙏

Ben Nadel Jan 27, 2022 at 9:24 PM

15,996 Comments

@Charles,

Angular is great, and I enjoy NodeJS; but, there's nothing that feels as easy as putting together a CFML back-end with the tag-based templating. I really wish there were better templating options for Node - it feels like all the ones I've seen are trying to do too much "magic" and not enough simplicity.

Jason Jan 27, 2022 at 10:24 PM

142 Comments

@Charles,

That's pretty cool. Old school CF for the win!

Charles Robertson Jan 27, 2022 at 11:53 PM

462 Comments

@Ben,

When you say templating options in Node, what exactly do you mean? I am afraid my only exposure to Node, is creating REST APIs with Express. I am also learning NestJS, which is an amazing TypeScript oriented system, used as a REST API framework for Node. It is heavily influenced by Angular ❤️. It uses decorators, DI, services and many other Angular features:

https://docs.nestjs.com/

But, yes, I am 100% with you. The beautiful simplicity of Coldfusion is its strongest asset. CF is incredibly flexible. I now use CF FW1 MVC framework, when I create pure server side rendered applications and Angular + CF Taffy.io REST API, when I want to use a client/server model. The combinations are literally endless! I will always use Coldfusion as my primary back end language, but I will also have to be practical and learn other server side technologies to pay the bills 😀

Ben Nadel Jan 28, 2022 at 10:58 AM

15,996 Comments

@Charles,

It's been a while but, I believe with Express.js you can just tell it use a different templating engine for the View rendering, like "handlebars" or "pug". But, I am not really talking from authority.

Charles Robertson Jan 28, 2022 at 11:24 AM

462 Comments

@Ben. OK. I see. So we are talking HTML templates etc. I guess, because I have had only built REST API with Node, I have never been exposed to this side of a Node website!
But I just had a look at pug. Oh the horror! 😱
Now, I realise how spoilt I have been, developing with CFML. The tag based language makes creating HTML content, completely seamless. In fact, CFML tags look just like HTML tags! 😂
Even the script based Coldfusion language looks easier than pug. I mean, we can use WriteOutput(), although I never use the script based version to output HTML.
I usually only use the script based language in my controller & service etc.

But, thanks for the heads up 🙏

Ben Nadel Jan 28, 2022 at 11:28 AM

15,996 Comments

@Charles,

Yeah, it's like the templating languages in the Node world are allergic to HTML 🤣

Charles Robertson Jan 28, 2022 at 11:31 AM

462 Comments

@Ben
One other thing.
I have recently been using a brilliant MD based WYSIWYG editor. It is Vanilla JS based and allows users to add MarkDown but in a TinyMCE like editor.

I thought it might be a nice addition to your blog?

It is extremely easy to set up and seeing as you already parse MarkDown, then there should be nothing to change on the server side. I presume you use FlexMark?

It is called SimpleMDE:

https://simplemde.com/

Ben Nadel Jan 28, 2022 at 11:39 AM

15,996 Comments

@Charles,

Groovy, I'll take a look and see what it does / how it works. Cheers!

Jesse Earle Jan 29, 2022 at 10:33 AM

1 Comments

I'm guessing the reason we hear Domain Objects and Business Logic first is because way too often the schema isn't controlled by the developer. So it's sort of a wishful writing/complaining to devs 20 years ago.

In other words, I'm given a greenfield UI and API in the latest and greatest, that still needs to rely on a 20 year old schema.

Typically what happens is I'm given a UI mockup with a form that has 2 fields Shoe and Hat. Shoe and Hat end up mapping to the columns Orange and Apple in the database (literally that counterintuitive and random). So I have no choice but to start from the modern domain first and work backward through adapters to the original domain.

If i'm lucky I can write a nodejs gateway between the old domain the modern UI. If I'm not I have to do the translation layer on the frontend (which has all the challenges of a backend).

I end up writing backend code on the frontend and I have a similar convention. What you call the Data Access Layer/Gateway I call Sources and the Entity Access Layer I also call Services with what sounds like the exact same IoC:

my-api/
  services/
  sources/
  index.ts

index.ts:

export const createMyAPI = ({ uri }: Props) => {
  const sources = createSources({ axios, uri });
  const services = createServices({ sources });

  return services;
}

I put what you call Command / Workflow and Query / Partial at the top level of a feature and optionally share it between features. So a Command / Workflow becomes a Query / Partial if it's shared to a typically child feature and that child feature needs to use it in a particular way. I've never had language around the Query / Partial so that's a great find.

Whether it comes from a sqldb, nosqldb or http response, we end up sharing the exact same problems without a shared language to speak about it.

Ben Nadel Jan 29, 2022 at 10:48 AM

15,996 Comments

@Jesse,

This is a great call-out. There's definitely a world of difference between "building a feature" and "maintaining a feature". When you get dropping into the middle of some existing "big ball of mud", then you just do what you gotta do. And, the idea of creating some sort of translation / adapter layer that hides some of the garbagey stuff feels like the right move.

As far as terminology, I borrow a little from here and a little from there. There is a notion of (Wikipedia: "Commands and Queries") that just generally talks about separation concerns. Then, there's a more generalized notion of "CQRS" - Command Query Responsibility Segregation. Though, this latter notion is often times coupled to "Event Sourcing", which is far more complicated than anything I've ever done.

What I call "Workflows", I think Robert "Uncle Bob" Martin calls "Use Cases". And, I'm pretty sure I borrowed the terminology for "Partial" from older view-templating languages where a piece of the View layer was called a "partial" (Ruby on Rails: Using Partials).

Ultimately, the biggest lesson I ever learned was simply this:

Don't use your CRUD layer to aggregate View data.

In my experience, this leads all sorts of nonsense building up in the CRUD layer that shouldn't be there. And, the more developers you have on the team, the more and more and more cruft shows up. Until, eventually, your getProductByID() method pulls back the product ... and the store description... and the name of the user that keyed it in... and the current inventory value... and the related products... and what should have been a simple WHERE id=? clause is now five INNER JOIN clauses. Why? Because this developer needed this, and that developer needed that, and they were all too lazy to build their own queries to get it 😱

Carlos May Feb 9, 2022 at 10:44 AM

1 Comments

@Ben, It is nice to say hello! thank you so much for sharing your experience. I did a CFML project when I was a student few years ago... These days adobe was the platform to run it. I would like to know how I can running (a service like apache or something) in my server... I am so lost now but I love CFML. Thanks for answering! best regards! 🙌

Ben Nadel Feb 9, 2022 at 10:47 AM

15,996 Comments

@Carlos,

That's awesome that you're into ColdFusion - it's such a great language. To be honest, though, I'm not really much of a "server" person. I actually pay for managed hosting because I don't feel comfortable (yet) dealing with things like IIS, Apache, Nginx, or whatever else might be serving requests.

That said, you might want to checkout Ortus Solutions' "Command Box" product - it's a CLI that helps you start/start ColdFusion servers. I use it locally for testing; but, I know that other people use it to run stuff in production.

I wish I had better advice. I know the CFML language really well; but, I'm still learning a lot when it comes to running servers.

Charles Robertson Feb 9, 2022 at 12:18 PM

462 Comments

@Carlos Well, I can tell you that I run two Coldfusion based Virtual Private Servers [VPS]. A VPS is very similar to bare metal, but costs a lot less to run. It runs Windows 2012R2, with Lucee application server, which is open source and free:

Lucee Application Server

I then use MySQL, community server, as my database. Again, this is free:

MySQL

I store all my static assets on the VPS, but you can look at cheap solution like AWS S3:

AWS S3

Actually, CFML now integrates with S3.

I am running IIS7+, which integrates seamlessly when you install Lucee.

Good Luck! 🙏

Oh my chickens, this post is old!

Hit me up on Twitter if you want to discuss it further.