Creating Excel Documents With ColdFusion And XML - GOTCHA!
I am all excited about the ability to create Microsoft Excel documents using ColdFusion and XML. However, I just ran into a tiny road block. This probably isn't XML-Excel specific, but rather a general Excel issue (or perhaps something related to Excel and the RAM on my computer). My documents have been crapping out and it has taken me over an hour to figure out why. Finally I narrowed it down to the record length. I was trying to create a report that had over 70,000 rows in it (at the client's request). Apparently, though, Excel can only handle 65,536 rows in a single sheet (at least that's what mine tells me).
This was such a pain to debug because the error in the Excel log was:
XML Spreadsheet Warning in Worksheet Setting
REASON: Bad Value
ACTION: Ignoring Tag
FILE: C:\Documents and Settings\bnadel\Desktop\matt_full_orders.cfm.xml
GROUP: Pane
TAG: ActiveRow
VALUE: 70873
"Bad Value"... that was like a wild goose chase because my initial thought was that the number of Row XML Nodes was different than the value in the ActiveRow node. Nope! The Bad Value in this case meant that I had too many rows for a viable Excel document.
Oh well. Now I have to work with the client to get a smaller report anyway (something I should have done anyway).
Reader Comments
Havent had a chance to test this Excel XML myself, but what kind of speed issues are you experiencing? How fast does it generate the excel files?
Havent had a chance to test this Excel XML myself, but what kind of speed issues are you experiencing? How fast does it generate the excel files?
Matt,
The above example created a 31 Megabyte file with 70,000+ records in about 25 seconds. Plus, I am stripping out ALL the inter-tag white space for smaller file size, which is probably a good chunk of that processing time.
I don't have a lot of experience, but to me, it seems file size is going to become more of an issue before processing time ever does.
This stuff is pretty exciting, though. I feel like I have so much more power over my Excel creation than ever before. Plus, I love that this is in XML format that ColdFusion can parse if it ever needs to.
Ben,
Just a few quick notes based on my experience working with CF/Excel:
In versions of Excel prior to 2007, the row limit is 64k, regardless of whether you generate the file using HTML, Excel, or Java.
In Excel 2007, the row limit is increased to 1M.
Using Java (or COM) to generate Excel files is both faster and makes for much smaller file sizes. We had one app that used to us COM that we then changed to the HTML method. That resulted in a HUGE bloat in the file size, somewhere between 10x-20x, depending on the file. We then switched over to using Apache POI, which works great and reduced the file size back down to where the COM method had been.
The bottom line is that generating a binary Excel file instead of a text file which Excel opens natively (as in the case of CSV, HTML, and XML) results in much smaller files.
Hope this helps...
-Rob
Ben,
Just a few quick notes based on my experience working with CF/Excel:
In versions of Excel prior to 2007, the row limit is 64k, regardless of whether you generate the file using HTML, Excel, or Java.
In Excel 2007, the row limit is increased to 1M.
Using Java (or COM) to generate Excel files is both faster and makes for much smaller file sizes. We had one app that used to us COM that we then changed to the HTML method. That resulted in a HUGE bloat in the file size, somewhere between 10x-20x, depending on the file. We then switched over to using Apache POI, which works great and reduced the file size back down to where the COM method had been.
The bottom line is that generating a binary Excel file instead of a text file which Excel opens natively (as in the case of CSV, HTML, and XML) results in much smaller files.
Hope this helps...
-Rob
@Rob,
I completely agree with the file bloat due to all the XML. In the end, I would love to be able to create the binary, which is why I have been fooling around with the JExcel package. However, I try to do all of this without installing anything on the server (other than physical files)... so for now, if I have to install a COM object, I am ruling that out until it becomes required. I am crazy like that ;)
Either way, I don't envision myself creating such large documents. I think having huge documents means that I am pulling the wrong information. Who can look at 70,000+ rows and understand it in any useful way?
Reports shouldn't be huge, they should be small with highly useful information. To this end, file size is not going to be a huge factor. But surely, I want to figure out how to use JExcel (which I can load using UrlClassLoader).
I am just exploring all options. Thanks so much for you feedback and real-world experiences.
Ben,
You would be surprised how people want to us the data. In my use case, we have a customer facing reporting application that pulls from terrabytes of legacy information and delivers the results in either HTML or Excel. We allow users to schedule reports for automatic delivery daily, weekly, or monthly. Many of those files contain tens of thousands of rows of data - because that's exactly what the customers want to see. Many of them use the data to run sophisticated planning and forecasting models, while others use the data to feed B2B systems. The common denominator is that they all want the data delivered in Excel, not in flat files, and not via EDI or other methods.
The COM method I mentioned was an old way we did things, long before CF supported Java, and long before Apache POI existed. We had a custom written COM wrapper for Excel which we used.
As for Apache POI, it's Java, so you should be able to use the same classloader as for JExcel. Also, CF comes with a (slightly older) version of POI already installed, so there's nothing for you to have to put on the server. the POI jar can be found here (your location may vary):
C:\JRun4\servers\cfusion\cfusion-ear\cfusion-war\WEB-INF\cfusion\lib\poi-2.5.1-final-20040804.jar
-Rob
PS, I haven't tried JQuery myself, but I hear it's pretty good as well.
Rob,
Very true re: how client wants to use the data... I mean, come on, that's how I found out about the max row stuff in the first place :)
I didn't know that a version of POI came with it already!!! That's some sweet stuff. I will look into that ASAP. I would rather use an already-installed object than a URL loaded object. Thanks for the awesome tip!
After your tip I did some fast Googling and found this by Dave Ross:
http://www.d-ross.org/index.cfm?objectid=9C65ECEC-508B-E116-6F8A9F878188D7CA
Thanks for pointing me in a new direction to explore!
Ive been writing a CFC to use as a "wrapper" for creating Excel documents using POI. Since POI is installed in MX7, it was my choice at first. I still am interested in the XML version, and will do some experimenting myself with it.
POI for me tends to slow down when you get a large number of records. For example i am creating excel reports with about 10,000+ records and it takes about 20-30 seconds to create this file. I ended up putting in a status message so the user knows what is being created.."Creating Worksheet x....y...z" . I ended up optimizing my queries to include some filtering, so Im not pulling in 10k all the time. Id hate to see 70,000.
Keep up the good work with this. Ive been playing with Excel and CF for a good while now.
Ive been writing a CFC to use as a "wrapper" for creating Excel documents using POI. Since POI is installed in MX7, it was my choice at first. I still am interested in the XML version, and will do some experimenting myself with it.
POI for me tends to slow down when you get a large number of records. For example i am creating excel reports with about 10,000+ records and it takes about 20-30 seconds to create this file. I ended up putting in a status message so the user knows what is being created.."Creating Worksheet x....y...z" . I ended up optimizing my queries to include some filtering, so Im not pulling in 10k all the time. Id hate to see 70,000.
Keep up the good work with this. Ive been playing with Excel and CF for a good while now.
@Matt,
That sounds like a good idea (the CFC wrapper). As far as POI vs. JExcel vs. XML... I think as developers, the more tools we have, the better decisions we can make for any given task.
This is all very exciting for me, I tell you what.
Just a footnote that you can output Excel files using <cfreport> and the ReportBuilder (guess that's what the bundled POI class is for). Formatting is quite limited but you do get a binary file that's smaller and can be opened in older versions of Excel.
@Julian,
Thanks for the tip. Unfortunately, I have never used CFReport. I will have to look into that. From what I have studied, it seems cool, but I have never seen the Report Builder either :)
Ben,
can u tell me how to convert datagrid in flex to excel with coldfusion query?thanks.. :)
I am having problems generating xls files via xml that are larger than 34k with cf9. The file has over 40 column, which may be part of the problem. The problem is I don't get any output/error after 34k, which could also point to a file size restriction on the web server?? I am adding requestTimeout=999999 to the url, but that doesn't help get the file built. After about 5 minutes, the only msg I get is: 500 - Internal server error.
There is a problem with the resource you are looking for, and it cannot be displayed. No debug info either.