Converting A Query Into A Human-Readable CSV In Two Phases In ColdFusion
The other day, I was having a chat with Adam Cameron regarding a very old (2008) post that I wrote for converting a ColdFusion query into a CSV payload. The code in that post makes me cringe; and represents both a style and a mindset that feels archaic. As such, I wanted to go about modernizing that code. But, as I was rewriting it, I kept running into hurdles. What I realized is that converting a ColdFusion query directly into a CSV is simply not something I do that often. Instead, I use a two-phase process that first builds an Array-based representation of the "report data"; and then, I serialize this intermediary value as CSV (Comma Separated Values).
If all you need to do is take a ColdFusion query and serialize it as a system exchange payload to be consumed by another system, then you probably don't need much customization; all you need is a format that can be read by another computer. The problem is, in my line of work, I'm almost never generating CSV payloads for computers, I'm generating them for humans.
And, once a human is your target audience, then User Experience (UX) becomes a factor. Which means, how you format your report data matters. You can't just dump data to the screen: you have to consider how people are going to read it; and, how you might be able to remove "noise" in order to make it more human-consumable.
The other big issue that I run into with CSV report generation is that I often need to pull data from multiple sources. Which means, even if I had a function that converted a single query into a single CSV payload, I couldn't use it consistently without a lot of pre-serialization SQL and code gymnastics.
Since each CSV report has its own special constraints, I've taken to breaking the process up into two phases:
Generating a customized, two-dimensional array of data.
Generating a CSV from the intermediary array.
Having two phases means more overhead for the report generation. But, I have yet to run into a report where the database speed wasn't the biggest bottleneck. As such, I'm not concerned. And, I believe it makes the code more flexible, explicit, and easier to maintain.
Here's an example of generating a User report. You'll see that my intermediary data structure makes decisions about header names, column order, formatting dates, and omitting Falsy values in order to reduce noise in a Yes/No column. These aren't "best practices" across the board - they are simply the choices I've made for this report:
NOTE: I'm using the
CsvSerializer.cfc
that I created in my previous post on remediating CSV Injection attacks in ColdFusion.
<cfscript>
include "./sample-data2.cfm";
// ------------------------------------------------------------------------------- //
// ------------------------------------------------------------------------------- //
// STEP ONE: Taking the database query and mapping it onto an intermediary two-
// dimensional array that represent the cells in a spreadsheet. This allows me to be
// very exacting in how I want to name and format data. For example, I can completely
// rename headers, choose different date-masks for different columns, and exclude
// values that are "falsy".
rows = [
[
"ID",
"Name",
"Location",
"Created",
"Still Active"
]
];
loop query = sample {
rows.append([
sample.id,
sample.name,
sample.location,
// These are report-specific formatting functions.
formatReportDate( sample.createdAt ),
formatReportYesNo( sample.isActive )
]);
}
// STEP TWO: Once I have my row-data specified, complete with custom row headers and
// report-specific formatting for dates, Booleans, etc, I can now go about serializing
// the data down into a CSV payload. And, since we've extracted the high-touch, per-
// report formatting, it means that the CSV generation can be very generic.
csvData = new CsvSerializer()
.serializeArray( rows )
;
echo( "<pre>#encodeForHtml( csvData )#</pre>" );
// ------------------------------------------------------------------------------- //
// ------------------------------------------------------------------------------- //
/**
* I return the date/time format for the given value, returning an empty string for
* null values.
*/
private string function formatReportDate( required date input ) {
// If this is a NULL'able date, skip it.
if ( input == "" ) {
return( "" );
}
return( dateFormat( input, "yyyy-mm-dd" ) );
}
/**
* I return the Yes/No format for the given value, returning an empty string for false
* values.
*/
private string function formatReportYesNo( required boolean input ) {
return( input ? "Yes" : "" );
}
</cfscript>
As you can see, I'm simply looping over the ColdFusoin query and I'm mapping it (so to speak) onto an array, where each row is an array unto itself. I find this code to be straightforward. And, more importantly, I find it easy to modify. Meaning, within the CFLoop
- for example - I can easily add row-specific logic. Heck, I can even have rows of different lengths because the CsvSerializer.cfc
doesn't need to have a strict "Grid" of data - it just deals with Cells.
If I run this ColdFusion code in Lucee CFML, we get the following output:
As you can see, we were able to generate a CSV (Comma Separated Value) output in ColdFusion with customized headers, customized date masks, and customized Yes/No values.
I'm not saying that all CSVs have to be generated like this. This is just the approach that I use specifically for generating CSVs for human consumption. That said, I feel like it creates a nice separation of concerns; and doesn't overload the CSV serialization.
Want to use code from this post? Check out the license.
Reader Comments
I wish everyone wrote code as clearly and as easily read as you do. Heck, I wish I did. Yours is the standard I strive for though. As always, thanks for sharing
@Chris,
Thank you sir -- that is the highest compliment 😊
Post A Comment — ❤️ I'd Love To Hear From You! ❤️
Post a Comment →