Posted May 27, 2008 at
7:59 AM
Tags:
ColdFusion
Last week, I posted a ColdFusion user defined function that took a ColdFusion query object and turned it into a comma separated value (CSV) string. I had made heavy use of the Java StringBuffer to build the string. This however, required many method calls and extra string concatenation that apparently was slowing things down. This is a good example of what happens when you try to optimize something without fully weighing the pros and cons of the tools in use.
Using some of the feedback that I received on my function, I am posting an updated version of it here:
Launch code in new window » Download code as text file »
- <cffunction
- name="QueryToCSV"
- access="public"
- returntype="string"
- output="false"
- hint="I take a query and convert it to a comma separated value string.">
-
- <cfargument
- name="Query"
- type="query"
- required="true"
- hint="I am the query being converted to CSV."
- />
-
- <cfargument
- name="Fields"
- type="string"
- required="true"
- hint="I am the list of query fields to be used when creating the CSV value."
- />
-
- <cfargument
- name="CreateHeaderRow"
- type="boolean"
- required="false"
- default="true"
- hint="I flag whether or not to create a row of header values."
- />
-
- <cfargument
- name="Delimiter"
- type="string"
- required="false"
- default=","
- hint="I am the field delimiter in the CSV value."
- />
-
- <cfset var LOCAL = {} />
-
- <cfset LOCAL.ColumnNames = [] />
-
- <cfloop
- index="LOCAL.ColumnName"
- list="#ARGUMENTS.Fields#"
- delimiters=",">
-
- <cfset ArrayAppend(
- LOCAL.ColumnNames,
- Trim( LOCAL.ColumnName )
- ) />
-
- </cfloop>
-
- <cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />
-
-
- <cfset LOCAL.Buffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />
-
- <cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />
-
-
- <cfif ARGUMENTS.CreateHeaderRow>
-
- <cfset LOCAL.RowData = [] />
-
- <cfloop
- index="LOCAL.ColumnIndex"
- from="1"
- to="#LOCAL.ColumnCount#"
- step="1">
-
- <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />
-
- </cfloop>
-
- <cfset LOCAL.Buffer.Append(
- JavaCast(
- "string",
- (
- ArrayToList(
- LOCAL.RowData,
- ARGUMENTS.Delimiter
- ) &
- LOCAL.NewLine
- ))
- ) />
-
- </cfif>
-
-
-
- <cfloop query="ARGUMENTS.Query">
-
- <cfset LOCAL.RowData = [] />
-
- <cfloop
- index="LOCAL.ColumnIndex"
- from="1"
- to="#LOCAL.ColumnCount#"
- step="1">
-
- <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
-
- </cfloop>
-
-
- <cfset LOCAL.Buffer.Append(
- JavaCast(
- "string",
- (
- ArrayToList(
- LOCAL.RowData,
- ARGUMENTS.Delimiter
- ) &
- LOCAL.NewLine
- ))
- ) />
-
- </cfloop>
-
-
- <cfreturn LOCAL.Buffer.ToString() />
- </cffunction>
Now, rather than calling StringBuffer.Append() on every field value, I am build an array of fields in a row, then I join that array using ArrayToList() and append it to the ongoing string value that is held in the StringBuffer. So, I am still using the Java StringBuffer to hold the CSV value, but I am using an intermediary ColdFusion array to build up individual line values. I am also escaping any double-quotes that are in the query values - something which I was neglecting to do before. I have to say that this did greatly simplify the amount of code that was in the method. This pleases me; I love to see more simple solutions be more effective.
Some people did tell me that the StringBuffer was completely overkill and that they, in fact, were getting better results using just array-joining for both the fields and the rows. I found this hard to believe as the StringBuffer has proven useful so many times in the past. I had trouble believing that using just array joining would be the fastest method, and so, I did a little testing of my own. I re-wrote the above user defined method to use only arrays and no StringBuffer:
Launch code in new window » Download code as text file »
- <cffunction
- name="QueryToCSV2"
- access="public"
- returntype="string"
- output="false"
- hint="I take a query and convert it to a comma separated value string.">
-
- <cfargument
- name="Query"
- type="query"
- required="true"
- hint="I am the query being converted to CSV."
- />
-
- <cfargument
- name="Fields"
- type="string"
- required="true"
- hint="I am the list of query fields to be used when creating the CSV value."
- />
-
- <cfargument
- name="CreateHeaderRow"
- type="boolean"
- required="false"
- default="true"
- hint="I flag whether or not to create a row of header values."
- />
-
- <cfargument
- name="Delimiter"
- type="string"
- required="false"
- default=","
- hint="I am the field delimiter in the CSV value."
- />
-
- <cfset var LOCAL = {} />
-
- <cfset LOCAL.ColumnNames = [] />
-
- <cfloop
- index="LOCAL.ColumnName"
- list="#ARGUMENTS.Fields#"
- delimiters=",">
-
- <cfset ArrayAppend(
- LOCAL.ColumnNames,
- Trim( LOCAL.ColumnName )
- ) />
-
- </cfloop>
-
- <cfset LOCAL.ColumnCount = ArrayLen( LOCAL.ColumnNames ) />
-
-
- <cfset LOCAL.NewLine = (Chr( 13 ) & Chr( 10 )) />
-
- <cfset LOCAL.Rows = [] />
-
-
- <cfif ARGUMENTS.CreateHeaderRow>
-
- <cfset LOCAL.RowData = [] />
-
- <cfloop
- index="LOCAL.ColumnIndex"
- from="1"
- to="#LOCAL.ColumnCount#"
- step="1">
-
- <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#LOCAL.ColumnNames[ LOCAL.ColumnIndex ]#""" />
-
- </cfloop>
-
- <cfset ArrayAppend(
- LOCAL.Rows,
- ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
- ) />
-
- </cfif>
-
-
-
- <cfloop query="ARGUMENTS.Query">
-
- <cfset LOCAL.RowData = [] />
-
- <cfloop
- index="LOCAL.ColumnIndex"
- from="1"
- to="#LOCAL.ColumnCount#"
- step="1">
-
- <cfset LOCAL.RowData[ LOCAL.ColumnIndex ] = """#Replace( ARGUMENTS.Query[ LOCAL.ColumnNames[ LOCAL.ColumnIndex ] ][ ARGUMENTS.Query.CurrentRow ], """", """""", "all" )#""" />
-
- </cfloop>
-
-
- <cfset ArrayAppend(
- LOCAL.Rows,
- ArrayToList( LOCAL.RowData, ARGUMENTS.Delimiter )
- ) />
-
- </cfloop>
-
-
- <cfreturn ArrayToList(
- LOCAL.Rows,
- LOCAL.NewLine
- ) />
- </cffunction>
As you can see, this uses ArrayToList() not only for the individual field values, but for all of the rows. Then, I put both of these to a basic speed test on several thousand rows of data:
Launch code in new window » Download code as text file »
- <cfset qPart = QueryNew(
- "id, serial_number, price",
- "cf_sql_integer, cf_sql_varchar, cf_sql_decimal"
- ) />
-
-
- <cfloop
- index="intI"
- from="1"
- to="5000"
- step="1">
-
- <cfset QueryAddRow( qPart ) />
-
- <cfset qPart[ "id" ][ qPart.RecordCount ] = JavaCast( "int", qPart.RecordCount ) />
- <cfset qPart[ "serial_number" ][ qPart.RecordCount ] = JavaCast( "string", CreateUUID() ) />
- <cfset qPart[ "price" ][ qPart.RecordCount ] = JavaCast( "float", RandRange( 1, 100 ) ) />
-
- </cfloop>
-
-
- <cftimer
- type="outline"
- label="Java String Buffer">
-
- <cfset strOutput = QueryToCSV(
- qPart,
- "id, serial_number, price"
- ) />
-
- Done.
-
- </cftimer>
-
-
- <cftimer
- type="outline"
- label="ArrayToList() Only">
-
- <cfset strOutput = QueryToCSV2(
- qPart,
- "id, serial_number, price"
- ) />
-
- Done.
-
- </cftimer>
I am happy to say that both of them executed in milliseconds, so no matter which way you go, you can't really go wrong. The StringBuffer method did, however, slightly outperform the ArrayToList() method over about 10 trials:
QueryToCSV() - ArrayToList() And StringBuffer
Average Time: 295 ms
QueryToCSV2() - ArrayToList() Only
Average Time: 373 ms
As you can see, we're really only talking fractions of a second, so very little difference. That being said, as much as the StringBuffer was slightly faster, definitely go with whatever version you feel more comfortable with - they are both quite fast.
Download Code Snippet ZIP File
Comments (1) |
Post Comment |
Ask Ben |
Permalink |
Other Searches |
Print Page
What Other People Are Searching For
[ local search ]
create a csv file using coldfusion
[ local search ]
how to create csv from query in coldfusion
[ local search ]
convert a query to a csv in coldfusion
Hi Ben,
Nice little research!... However I am really interested to know about the performance stats of POIUtility for this same scenario...
Posted by Dav R
on May 27, 2008
at 9:04 PM
Post Comment |
Ask Ben