Skip to main content
Ben Nadel at the jQuery Conference 2011 (Cambridge, MA) with: Scott Fitchet
Ben Nadel at the jQuery Conference 2011 (Cambridge, MA) with: Scott Fitchet

More Experimenting With The JSON Column Type In MySQL 5.7.32 And Lucee CFML 5.3.7.47

By
Published in , Comments (1)

Over the weekend, I took my first look at the JSON column type in MySQL 5.7. The JSON column type introduces a Document-like feature-set on top of the traditional relational database framework. In that post, I looked at storing Entity-Attribute-Values (EAV) in a JSON field. In this post, I wanted to look at an approach that we use at InVision in which an "invitation" to the system can be associated with Prototypes and Boards. Currently, we maintain these associations with a comma-delimited list of IDs (which is a garbage approach); and, I wanted to see if it would feel less like garbage if I used a JSON column type. To explore this, I'm using MySQL 5.7.32 and Lucee CFML 5.3.7.47.

CAUTION: This Should Have Always Been a Normalized Set Of Tables

With InVision, when you invite a user to the platform, you can start associating the pending invitation with other entities in the system such that when the recipient ultimately accepts the invitation, they are automatically associated with each of the aforementioned entities. Today, this is implemented in a rather naive fashion in which foreign entity IDs are stored as a comma-delimited list in the invitation record. Imagine a table that looks something like this (truncated):

  • email - The recipient of the invitation.
  • prototypeIDList - The comma-delimited list of prototypes associated with the invite.
  • boardIDList - The comma-delimited list of boards associated with the invite.

This type of structure leads to all sorts of problems:

  • We run into field-length overflows when too many IDs are added to the list.
  • We have to parse the ID-list as a String when searching for associations.
  • We can't index the IDs.
  • We end up leaving "dead IDs" in the various lists because it's overly complicated to remove IDs when the relevant entities are deleted.

Ultimately, the right solution to this problem is to break out the associations into their own tables in which each row associates a given invitation with a given entity. Something like:

  • invitation_board_association
  • invitation_prototype_association

Then, we could have added indexes to these tables that would have protected us against duplicates, made look-ups faster, and allowed us to easily delete associations as part of a cascading series of deletes.

The Better Version of a Bad Situation?

Using normalized tables is the right approach. But, my goal in this blog post is to see if it would feel any better to use the "current" approach with a JSON column type instead of a comma-delimited list in some VARCHAR fields.

At first, in this exploration, I tried to store the IDs as an array. But, this lead to more verbosity; and, ultimately failed because, at least in MySQL 5.7.x, you cannot search for integers within an Array. This became a show-stopped when I tried to break the association between an invitation and a foreign entity. Once I ran into this issue, I switched over to using a Hash in which the entity ID was the key and the value was always a hard-coded, true.

The more I tried to get this exploration to work, the more I realized it was just an incremental improvement beyond the current comma-delimited list approach. It became clear that a JSON column type was not the right approach. That said, it was still fun to explore the new JSON features in MySQL 5.7.x; so, I'm still going to share my implementation.

The Exploration

With all the copious caveating and hand-wringing out of the way, let's look at what I actually did. To explore this idea, I created a user_invitation table in which prototypeIDs and boardIDs would be stored as keys in a JSON hash:

CREATE TABLE `user_invitation` (
	`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
	`teamID` int(10) unsigned NOT NULL,
	`email` varchar(255) NOT NULL,
	`createdAt` datetime NOT NULL,
	`lastSentAt` datetime NOT NULL,
	`initialMemberships` json NOT NULL,
	PRIMARY KEY (`id`),
	UNIQUE KEY `byEmail` (`teamID`,`email`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

In this case, the initialMemberships JSON field looks something like this:

{
	"prototypeIDs": {
		"11": true,
		"22": true
	},
	"boardIDs": {
		"33": true
	}
}

In each of these sub-hashes, the keys are actually IDs and the true value is inconsequential - just a hook on which to hang the associated key/ID. Now, when an invitation is associated with prototype, for example, a new key-value pair is added to the prototypeIDs hash. And, when an invitation is disassociated from a prototype, a key is removed from the prototypeIDs hash.

To get a sense of how this might work, I created a data access layer (DAL) that provides some simple CRUD (Create, Read, Update, Delete) methods for acting on this database table:

component
	output = false
	hint = "I provide data persistence methods for user invitations."
	{

	/**
	* I add the given board as an initial membership for the given invitation.
	* 
	* @id I am the invitation being accessed.
	* @boardID I am the board being associated with the invitation.
	*/
	public void function addBoard(
		required numeric id,
		required numeric boardID
		) {

		addMembership( id, "boardIDs", boardID );

	}


	/**
	* I add the given prototype as an initial membership for the given invitation.
	* 
	* @id I am the invitation being accessed.
	* @prototypeID I am the prototype being associated with the invitation.
	*/
	public void function addPrototype(
		required numeric id,
		required numeric prototypeID
		) {

		addMembership( id, "prototypeIDs", prototypeID );

	}


	/**
	* I create a new user invitation.
	* 
	* @teamID I am the team to which the recipient is being invited.
	* @email I am the recipient of the invitation.
	* @createdAt I am the date at which the invitation was created.
	* @lastSentAt I am the date at which the invitation was last sent to the recipient.
	* @initialMemberships I am the set of initial associations to boards and prototypes.
	*/
	public numeric function createInvitation(
		required numeric teamID,
		required string email,
		required date createdAt,
		required date lastSentAt,
		required struct initialMemberships
		) {

		<!--- NOTE: There is a unique key on (TEAMID, EMAIL) field. --->
		```
		<cfquery result="local.results">
			INSERT INTO
				user_invitation
			SET
				teamID = <cfqueryparam value="#teamID#" sqltype="integer" />,
				email = <cfqueryparam value="#email#" sqltype="varchar" />,
				createdAt = <cfqueryparam value="#createdAt#" sqltype="timestamp" />,
				lastSentAt = <cfqueryparam value="#lastSentAt#" sqltype="timestamp" />,
				initialMemberships = <cfqueryparam value="#serializeJson( initialMemberships )#" sqltype="longvarchar" />
			;
		</cfquery>
		```

		return( val( results.generatedKey ) );

	}


	/**
	* I delete the invitation associated with the give email address. Each email can only
	* ever be associated with a single invitation.
	* 
	* @email I am the email for which invitations are being deleted.
	*/
	public void function deleteInvitationByEmail( required string email ) {

		```
		<cfquery name="local.results">
			DELETE
				i
			FROM
				user_invitation i
			WHERE
				i.email = <cfqueryparam value="#email#" sqltype="varchar" />
			;
		</cfquery>
		```

	}


	/**
	* I get invitations that have an initial membership for the given board.
	* 
	* @teamID I am the team under which we are searching.
	* @boardID I am the board association being searched.
	*/
	public void function getInvitationsByBoard(
		required numeric teamID,
		required numeric boardID
		) {

		return( getInvitationsByMembership( teamID, "boardIDs", boardID ) );

	}


	/**
	* I get invitations that have an initial membership for the given prototype.
	* 
	* @teamID I am the team under which we are searching.
	* @prototypeID I am the prototype association being searched.
	*/
	public void function getInvitationsByPrototype(
		required numeric teamID,
		required numeric prototypeID
		) {

		return( getInvitationsByMembership( teamID, "prototypeIDs", prototypeID ) );

	}


	/**
	* I remove the given board as an initial membership for the given invitation.
	* 
	* @id I am the invitation being accessed.
	* @boardID I am the board being disassociated from the invitation.
	*/
	public void function removeBoard(
		required numeric id,
		required numeric boardID
		) {

		removeMembership( id, "boardIDs", boardID );

	}


	/**
	* I remove the given prototype as an initial membership for the given invitation.
	* 
	* @id I am the invitation being accessed.
	* @prototypeID I am the prototype being disassociated from the invitation.
	*/
	public void function removePrototype(
		required numeric id,
		required numeric prototypeID
		) {

		removeMembership( id, "prototypeIDs", prototypeID );

	}

	// ---
	// PRIVATE METHODS.
	// ---

	/**
	* I create an association between the given invitation and the given type (either a
	* prototype or board membership). The association is stored as a key-value pain in
	* which the KEY is the MEMBERSHIP ID and the VALUE is always "true". The value, in
	* this doesn't really matter.
	* 
	* @id I am the invitation being accessed.
	* @membershipType I am the type of membership being accessed (prototypeIDs, boardIDs).
	* @membershipID I am the entity ID being added.
	*/
	private void function addMembership(
		required numeric id,
		required string membershipType,
		required numeric membershipID
		) {

		```
		<cfquery name="local.results">
			UPDATE
				user_invitation i
			SET
				i.initialMemberships = JSON_SET(
					i.initialMemberships,
					<cfqueryparam value="$.#membershipType#.""#membershipID#""" sqltype="varchar" />,
					true
				)
			WHERE
				i.id = <cfqueryparam value="#id#" sqltype="integer" />
			;
		</cfquery>	
		```

	}


	/**
	* I get the invitations associated with the given membership.
	* 
	* @teamID I am the team under which we are searching.
	* @membershipType I am the type of membership being accessed (prototypeIDs, boardIDs).
	* @membershipID I am the entity ID being searched.
	*/
	private query function getInvitationsByMembership(
		required numeric teamID,
		required string membershipType,
		required numeric membershipID
		) {

		```
		<cfquery name="local.results">
			SELECT
				i.id,
				i.teamID,
				i.email,
				i.createdAt,
				i.lastSentAt,
				i.initialMemberships
			FROM
				user_invitation i
			WHERE
				i.teamID = <cfqueryparam value="#teamID#" sqltype="integer" />
			AND
				JSON_CONTAINS_PATH(
					i.initialMemberships,
					'one',
					<cfqueryparam value="$.#membershipType#.""#membershipID#""" sqltype="varchar" />
				)
		</cfquery>
		```

		results = results.map(
			( row ) => {

				return({
					id: row.id,
					teamID: row.teamID,
					email: row.email,
					createdAt: row.createdAt,
					lastSentAt: row.lastSentAt,
					// The calling context should always deal in STRUCTS. It should not
					// know that the memberships go through stringification.
					initialMemberships: deserializeJson( row.initialMemberships ),
				});

			}
		);

		return( results );

	}


	/**
	* I break an association between the given invitation and the given type (either a
	* prototype or board membership).
	* 
	* @id I am the invitation being accessed.
	* @membershipType I am the type of membership being accessed (prototypeIDs, boardIDs).
	* @membershipID I am the entity ID being removed.
	*/
	private void function removeMembership(
		required numeric id,
		required string membershipType,
		required numeric membershipID
		) {

		```
		<cfquery name="local.results">
			UPDATE
				user_invitation i
			SET
				i.initialMemberships = JSON_REMOVE(
					i.initialMemberships,
					<cfqueryparam value="$.#membershipType#.""#membershipID#""" sqltype="varchar" />
				)
			WHERE
				i.id = <cfqueryparam value="#id#" sqltype="integer" />
			;
		</cfquery>	
		```

	}

}

As you can see, this ColdFusion component gateway relies on three JSON-functions in MySQL 5.7:

  • JSON_SET() - This is used to add new associations to a given invitation record. Each call adds a key-value pair to one of the sub-hashes within the JSON column. Each key is an ID for a foreign entity.

  • JSON_CONTAINS_PATH() - This is used to see if a key exists in one of the sub-hashes within the JSON column. In our case, since each key is an ID of a foreign entity, the existence of a given path indicates the existence of a relationship to that entity.

  • JSON_REMOVE() - This is used to remove an existing key in one of the sub-hashes within the JSON column. This, effectively, breaks the relationship to one of the foreign entities.

And, to test this out, I ran a simple script that put the various CRUD methods through their paces:

<cfscript>

	gateway = new UserInvitationGateway();
	teamID = 4;

	// Reset the demo.
	gateway.deleteInvitationByEmail( "ben@bennadel.com" );

	// Create a new invitation. Initially, the invitation will not be associated with any
	// particular set of prototypes or boards.
	id = gateway.createInvitation(
		teamID = teamID,
		email = "ben@bennadel.com",
		createdAt = now(),
		lastSentAt = now(),
		// NOTE: Originally, I was trying to store the IDs as ARRAYS. However, there is a
		// significant issue with JSON ARRAYS in MySQL 5.7 which is that you cannot
		// search JSON ARRAYS for INTEGER values. That said, storing the IDs in a hash
		// actually makes several things easier, at the expense of more storage space (
		// since Strings take more storage space than Integers in most cases and we have
		// to store an arbitrary "true" value for each ID).
		// --
		// Read More: https://bugs.mysql.com/bug.php?id=79233
		initialMemberships = {
			prototypeIDs: {},
			boardIDs: {}
		}
	);

	// Add some initial membership associations.
	gateway.addPrototype( id, 11 );
	gateway.addPrototype( id, 11 ); // Making sure duplicates don't show up.
	gateway.addPrototype( id, 11 ); // Making sure duplicates don't show up.
	gateway.addPrototype( id, 22 );
	gateway.addPrototype( id, 333333 );
	gateway.addBoard( id, 44 );
	gateway.addBoard( id, 55 );
	gateway.addBoard( id, 666666 );

	// Remove some initial membership associations.
	gateway.removePrototype( id, 333333 );
	gateway.removeBoard( id, 666666 );

	// Search for invitations.
	// --
	// NOTE: In order to prevent a FULL TABLE SCAN, all searching has to be done under
	// a given team. This way, the number of records is reduced to the point where having
	// to search though JSON objects shouldn't matter all that much.
	dump( gateway.getInvitationsByPrototype( teamID, 11 ) );
	echo( "<br />" );
	dump( gateway.getInvitationsByBoard( teamID, 44 ) );

</cfscript>

As you can see, we create an invitation record; and then, add and removed prototype and board associations before ultimately searching for the invitation based on a few of the associations. And, when we run this ColdFusion code, we get the following output:

Using JSON hashes to store associations in MySQL 5.7 and Lucee CFML.

As you can see, it works - each prototype and board association is stored as a key in the JSON column sub-hashes.

Again, Only Incrementally Better Than a Comma-Delimited List

Using a JSON column type is better than using a comma-delimited list of IDs in a VARCHAR field. But, only incrementally. I think this is a great example of where trying to use a "Document Store" starts to break down; and, a traditional "Relational Database" with its normalization standards would be best. But again, this was a fun - if not frustrating at times - exploration. And is starting to get me more familiar with how the MySQL JSON column type might be leveraged in a Lucee CFML application.

Want to use code from this post? Check out the license.

Reader Comments

15,848 Comments

@All,

A quick follow-up to this post, I've been thinking about the delta between the MySQL JSON support and the feature-set we get in more robust "document" databases (and the like). And, one thing I wanted to see if I could do was atomically increment a value within a JSON column:

www.bennadel.com/blog/3935-atomically-incrementing-json-column-values-in-mysql-5-7-32-and-lucee-cfml-5-3-7-47.htm

Turns out, since MySQL is build with ACID properties, this is actually fairly easy to build (if not a bit verbose).

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel