Using Seekable Read Files In ColdFusion
Yesterday, when I was looking at how to loop over a file using CFLoop, I came across an old post of mine in which Raymond Camden mentioned the fileSeek() function. In all my years of ColdFusion, I've never used fileSeek(), which allows us to jump to an arbitrary offset within a file. I believe that fileSeek() works with both readable and writable files; however, the documentation on this is unclear. For this post, I'm looking at using seekable read files in ColdFusion.
In order to use the fileSeek() function, you must specify that a file is "seekable" when you call fileOpen(). At that point, you can call fileSeek(), providing a 0-based, non-negative offset from the start of the file.
From the ColdFusion documentation, it's unclear if the offset is the number of bytes or the number of characters. If we look at the Java documentation for RandomAccessFile, .seek() is defined as taking an offset measured in bytes from the beginning of the file. As such, I'll assume that ColdFusion is also using the number of bytes. This likely has implications if your file data contains multi-byte characters (in UTF-8, that means anything outside of ASCII, from accented letters to the emoji characters in the astral plane).
ASIDE: By looking at the Lucee CFML source code, it seems that Lucee will create an instance of java.io.RandomAccessFile on-the-fly when you attempt to call fileSeek() for the first time.
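To make the byte-versus-character distinction concrete, here's a minimal Java sketch that goes straight to RandomAccessFile. This is not ColdFusion code, and I'm only assuming that fileSeek() behaves the same way underneath; the class name ByteOffsetSeekDemo and its helper methods are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: shows that RandomAccessFile.seek() counts bytes.
final class ByteOffsetSeekDemo {

    // Seeks to a BYTE offset, reads byteCount bytes, and decodes them as UTF-8.
    static String readAt(Path path, long byteOffset, int byteCount) {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r")) {
            file.seek(byteOffset);
            byte[] buffer = new byte[byteCount];
            file.readFully(buffer);
            return new String(buffer, StandardCharsets.UTF_8);
        } catch (IOException error) {
            throw new UncheckedIOException(error);
        }
    }

    // Writes the text to a temporary file as UTF-8 and returns the file path.
    static Path writeTempFile(String text) {
        try {
            Path path = Files.createTempFile("seek-demo", ".txt");
            path.toFile().deleteOnExit();
            Files.write(path, text.getBytes(StandardCharsets.UTF_8));
            return path;
        } catch (IOException error) {
            throw new UncheckedIOException(error);
        }
    }

    public static void main(String[] args) {
        // \u00E9 (e with an acute accent) takes 2 bytes in UTF-8, so the
        // letter "b" is at character index 2 but at byte offset 3.
        Path path = writeTempFile("a\u00E9b");

        System.out.println(readAt(path, 3, 1)); // prints b
        // Byte offset 2 lands in the middle of the accented character, which
        // decodes to the U+FFFD replacement character.
        System.out.println(readAt(path, 2, 1).equals("\uFFFD")); // prints true
    }
}
```

If ColdFusion is passing the offset through to RandomAccessFile, then seeking by "character index" in a file like this one would land in the wrong place (or in the middle of a character).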
That said, for this exploration, I'm only going to deal with vanilla, single-byte characters. Specifically, I'm using the English alphanumeric characters, since the 26-letter alphabet and the 10 digits give us easy offsets to play with.
In the following ColdFusion code, I'm simply going to open a seekable file, jump around to random-access points, and then read some data from those points:
<cfscript>

	// Create a demo file with a known pattern for testing.
	path = expandPath( "./data.txt" );
	data = (
		// First 26 characters.
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ" &
		// Second 26 characters, offset 26.
		"abcdefghijklmnopqrstuvwxyz" &
		// Last 10 characters, offset 52.
		"1234567890"
	);
	fileWrite( path, data, "utf-8" );

	dataSource = fileOpen( path, "read", "utf-8", true ); // TRUE = Seekable.

	try {

		echoLine( dataSource.read( 3 ) );

		echoLine( "Seek to 26..." );
		dataSource.seek( 26 );
		echoLine( dataSource.read( 3 ) );

		echoLine( "Seek to 52..." );
		dataSource.seek( 52 );
		echoLine( dataSource.read( 3 ) );

		echoLine( "Seek to 0..." );
		dataSource.seek( 0 );
		echoLine( dataSource.read( 3 ) );

	} finally {

		dataSource.close();

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	public void function echoLine( required string value ) {

		writeOutput( value & "<br />" );

	}

</cfscript>
As you can see, I'm jumping ahead a few times before finally jumping back to the start of the file. And, when we run this ColdFusion code (in either Adobe ColdFusion or Lucee CFML), we get the following output:
ABC
Seek to 26...
abc
Seek to 52...
123
Seek to 0...
ABC
As you can see, by using fileSeek(), we were able to jump around to arbitrary offsets within the file before reading character data. To be honest, I don't have a use-case for this that jumps to mind; but, now that I know how this works, who knows what my brain will come up with.
Reader Comments
And it always puts 3 bytes of data from that starting point?
@Chris,
The .seek() call just jumps to a specific location. It's the subsequent call to .read() that is pulling out 3 bytes: dataSource.read( 3 )
And, it might be worth mentioning for completeness that calling .read() will also advance the internal pointer. Meaning, if you call .read(3) two times in a row, you'll read 6 consecutive characters, not the same 3 characters twice.
In the post above, I'm looking at using the fileRead() method in conjunction with seekable files. In a follow-up post, I look at using fileReadLine(): www.bennadel.com/blog/4515-using-filereadline-with-seekable-files-in-coldfusion.htm
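On the point about .read() advancing the internal pointer: here's a small Java sketch of the same behavior using RandomAccessFile directly. I'm assuming that the ColdFusion file object behaves the same way; the class name ReadAdvancesPointerDemo and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: each read continues where the previous one stopped.
final class ReadAdvancesPointerDemo {

    // Reads byteCount ASCII bytes starting at the file's CURRENT position.
    static String readChunk(RandomAccessFile file, int byteCount) throws IOException {
        byte[] buffer = new byte[byteCount];
        file.readFully(buffer);
        return new String(buffer, StandardCharsets.US_ASCII);
    }

    // Performs two back-to-back reads, then seeks to 0 and reads again.
    static String[] readTwiceThenRewind(String text) {
        try {
            Path path = Files.createTempFile("pointer-demo", ".txt");
            path.toFile().deleteOnExit();
            Files.write(path, text.getBytes(StandardCharsets.US_ASCII));

            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r")) {
                String first = readChunk(file, 3);
                // No seek in between: this read starts at byte offset 3.
                String second = readChunk(file, 3);
                // An explicit seek is what moves the pointer backwards.
                file.seek(0);
                String third = readChunk(file, 3);
                return new String[] { first, second, third };
            }
        } catch (IOException error) {
            throw new UncheckedIOException(error);
        }
    }

    public static void main(String[] args) {
        // prints ABC, DEF, ABC
        System.out.println(String.join(", ", readTwiceThenRewind("ABCDEFGHIJ")));
    }
}
```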
I'm noodling on ways to create a resumable large-text-file process that can pick up where it left off if it gets interrupted.
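As a rough starting point for that idea, here's a Java sketch (not ColdFusion) that reads a batch of lines and reports the byte offset to resume from. It assumes ASCII-only text and that the offset gets persisted somewhere between runs, which isn't shown; ResumableLineReader, Progress, and the method names are all made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: process a text file in batches, resuming by byte offset.
final class ResumableLineReader {

    // The lines read in one batch, plus the byte offset at which to resume.
    static final class Progress {
        final List<String> lines = new ArrayList<>();
        long nextOffset;
    }

    // Seeks to startOffset and reads up to maxLines lines (ASCII assumed).
    static Progress readLines(Path path, long startOffset, int maxLines) {
        Progress progress = new Progress();
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r")) {
            file.seek(startOffset);
            String line;
            while (progress.lines.size() < maxLines && (line = file.readLine()) != null) {
                progress.lines.add(line);
            }
            // The file pointer now sits just past the last line that was read.
            progress.nextOffset = file.getFilePointer();
            return progress;
        } catch (IOException error) {
            throw new UncheckedIOException(error);
        }
    }

    // Writes ASCII text to a temporary file and returns the file path.
    static Path writeTempFile(String text) {
        try {
            Path path = Files.createTempFile("resume-demo", ".txt");
            path.toFile().deleteOnExit();
            Files.write(path, text.getBytes(StandardCharsets.US_ASCII));
            return path;
        } catch (IOException error) {
            throw new UncheckedIOException(error);
        }
    }

    public static void main(String[] args) {
        Path path = writeTempFile("one\ntwo\nthree\n");

        Progress first = readLines(path, 0, 2);
        System.out.println(first.lines + " next=" + first.nextOffset); // prints [one, two] next=8

        // A later run (say, after an interruption) resumes from the saved offset.
        Progress second = readLines(path, first.nextOffset, 2);
        System.out.println(second.lines + " next=" + second.nextOffset); // prints [three] next=14
    }
}
```

RandomAccessFile.readLine() is unbuffered and treats each byte as one character, so this is only meant to show the seek-and-resume shape, not something tuned for large files.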