Using CFLoop To Iterate Over A File Line-By-Line In ColdFusion
Yesterday, at work, I had to take a CSV (Comma-Separated Values) file with 4.7 million lines of data in it and break it up into smaller files that each had 25K lines of data. I don't do a lot of file I/O (Input, Output) at work, so I'm a bit rusty. I ended up using the fileOpen(), fileReadLine(), and fileClose() functions to imperatively iterate over the file without reading it fully into memory. It wasn't until after I was done that I remembered the CFLoop tag can actually do all of that for me declaratively in ColdFusion.
I actually wrote about using CFLoop to iterate over a file 13 years ago. But, practice makes perfect; and, I'm hoping that by exploring the topic once again I'll be better about recalling it next time.
First, let's look at the imperative approach that I used to consume the text file one line at a time:
<cfscript>

	echo( "<h1> Using FileOpen() </h1>" );

	// When we OPEN a file (as opposed to reading a file), we allow the file to be
	// consumed as a stream. This reduces the pressure that we exert on the memory of the
	// server.
	namesSource = fileOpen( "./names.txt", "read", "utf-8" );
	nameIndex = 0;

	try {

		// Read one line in at a time until we hit "EOF" (End of File).
		while ( ! fileIsEof( namesSource ) ) {

			nameIndex++;
			name = fileReadLine( namesSource );

			echo( "#nameIndex# : #name# <br />" );

		}

	} finally {

		fileClose( namesSource );

	}

</cfscript>
Instead of reading the entire file into memory at one time, the fileOpen() function returns a file object, which can be used to stream data from the file into memory as needed. The fileReadLine() function takes the file object and consumes just enough character data to return the next line. And finally, when we're done consuming the file (when we hit the "End-of-File" state), we close the file, removing any potential locking issues.
This code works well enough; but, we have to manage all the file details on our own. And, when we see just how little code we end up having in our CFLoop solution, even the above code seems absurdly verbose! The following ColdFusion code does the exact same thing, only we're using a declarative approach, offloading all of the heavy-lifting to the CFML engine (Lucee CFML in this case):
<cfscript>

	echo( "<h1> Using CFLoop </h1>" );

	// Instead of using FileOpen(), and explicitly reading lines and closing files when
	// we're done, we can defer all of the heavy-lifting to ColdFusion itself. The CFLoop
	// tag encapsulates all of that logic!
	loop
		index = "nameIndex"
		item = "name"
		file = "./names.txt"
		charset = "utf-8"
		{

		echo( "#nameIndex# : #name# <br />" );

	}

</cfscript>
As you can see, instead of dealing with all the file I/O explicitly in the code, we're simply telling ColdFusion that we want to iterate over a file, and we're letting the CFLoop tag implicitly figure out how to do that most effectively. It's so easy!
And, when we run both of these ColdFusion files side-by-side, each approach yields the same outcome: reading the text file line-by-line and echoing each line along with its index. However, the declarative CFLoop approach uses significantly less code. ColdFusion is just amazing!
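To connect this back to the task mentioned at the top of the post, here is a minimal sketch of how the same CFLoop construct might be used to break a large file into 25K-line chunks. This is not the code that was used at work; the ./chunks output directory, the chunk-N.txt naming scheme, and all of the variable names are hypothetical. It keeps its own line counter rather than relying on the loop index, and it assumes the tag-in-script syntax shown in the Lucee CFML example above:

<cfscript>

	// ASSUMPTIONS: the paths, the "chunk-N.txt" naming scheme, and the chunk size are
	// hypothetical placeholders for this sketch.
	sourcePath = expandPath( "./names.txt" );
	outputDirectory = expandPath( "./chunks" );
	linesPerChunk = 25000;

	if ( ! directoryExists( outputDirectory ) ) {

		directoryCreate( outputDirectory );

	}

	lineCount = 0;
	chunkCount = 0;
	isChunkOpen = false;

	try {

		// The CFLoop tag streams the SOURCE file for us; we only manage the output.
		loop
			index = "lineIndex"
			item = "line"
			file = sourcePath
			charset = "utf-8"
			{

			// Start a new output file whenever the previous chunk is full.
			if ( lineCount % linesPerChunk == 0 ) {

				if ( isChunkOpen ) {

					fileClose( chunkFile );
					isChunkOpen = false;

				}

				chunkCount++;
				chunkFile = fileOpen( "#outputDirectory#/chunk-#chunkCount#.txt", "write", "utf-8" );
				isChunkOpen = true;

			}

			fileWriteLine( chunkFile, line );
			lineCount++;

		}

	} finally {

		// Make sure the last (possibly partial) chunk is closed, even on error.
		if ( isChunkOpen ) {

			fileClose( chunkFile );

		}

	}

</cfscript>

Note that splitting a CSV file purely by line only works if no quoted field contains an embedded line break, and this sketch makes no attempt to repeat a header row in each chunk.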
Reader Comments
I would have reached for one of the file operators too. My eyes are now open to the loop option 👀! Thanks 🙏
@Chris,
I love that ColdFusion has lots of ways to do stuff; but, it does mean that it's hard to keep all the options in one's head.
When processing the file has been completed (after the fileClose operation), how long do you have the process wait before moving or deleting the file?
If a file move or delete is performed immediately afterwards (on Windows), I usually encounter an error indicating that the file is "still in use".
@James,
This isn't something that I've thought about in a while. I feel like I remember having issues with that in earlier versions of ColdFusion; but, it's not something that I currently think to do in more modern CF releases. That said, I do definitely get some strange IO issues showing up in the logs from time to time within a finally block that is meant to delete a scratch directory. But, I wouldn't say that I get any issues consistently.
Also, at work we've been using K8 with Unix servers, so maybe it's a bigger issue on Windows? I have a Windows VPS; but, I don't do much in the way of file processing on it.