Ask Ben: Working With Lists In An Object Oriented Programming Application
Your site has been a great help to me over the past year, so thank you and keep it up! However, I am now in the position where I just can't seem to find a coherent answer to my question anywhere else, so I need to ask directly. I am trying to learn OO and (I think) I was doing great until I needed to get data back out for a seemingly simple operation: output a list comprised of company name, their state, and primary contact.
I have created a company object; using composition, each company object has contact and address objects, etc. Normally, in procedural style, I would simply do a query with joins to get all the data as a single record set and display it, but with my new beans and DAOs and so on, the only way I can see to get that data back out and keep it in the objects I've just spent so long creating is to do 3 separate queries!
This would be okay if it was only 1 record, but what about for a row view page with 1000 records? Instead of 1 query, I'd be making 2001!! (1 for the companies returning 1000 rows, 1 contact and 1 address query per row). I can't seem to find any articles on repopulating complex objects from stored data, only on re-populating simple, non-nested objects. Any help you could provide would be invaluable.
I am certainly no OOP expert - heck, I'm not even mediocre at OOP yet. I am just getting my feet wet in this world, so please take this with a grain of salt. From everything that I have heard and read about OOP in ColdFusion (which is slightly distinct from the capabilities of true OOP purity), when you are dealing with lists, there is nothing wrong at all with using a ColdFusion query. Queries are awesome objects and are part of what makes ColdFusion such an easy language to use.
In a true OOP application, you probably would be creating all these objects, but ColdFusion just isn't there yet in terms of speed of component instantiation. Therefore, we can create a decent number of objects with good speed, but there is a point at which it is just too slow. Of course, keep your use-cases in mind - are you ever going to be displaying a page with 1,000 records? How many pages of your application will actually do that? As much as it is fun to get bogged down in the theoretical, always keep one foot in the reality of your situation.
That being said, I would continue to use a query for what you are doing. Remember, OOP is supposed to provide a benefit; if it is creating more friction than solution, it's clearly trying to solve a problem that you don't have.
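To make that concrete, here is a rough sketch of what the single-query approach might look like for the company list. The datasource, table, and column names are all made up for illustration (the question doesn't give a schema), so treat this as a starting point rather than something to copy and paste:

```cfml
<!---
	NOTE: The datasource, tables, and columns below are
	hypothetical - they are not from the original question.
--->
<cfquery name="qCompany" datasource="myDatasource">
	SELECT
		c.name AS company_name,
		a.state AS company_state,
		p.name AS contact_name
	FROM
		company c
	INNER JOIN
		address a
	ON
		a.company_id = c.id
	INNER JOIN
		contact p
	ON
		(
				p.company_id = c.id
			AND
				p.is_primary = 1
		)
	ORDER BY
		c.name ASC
</cfquery>

<!--- One query, one loop - no per-row database calls. --->
<cfoutput>
	<cfloop query="qCompany">
		<p>
			#qCompany.company_name#
			(#qCompany.company_state#) -
			#qCompany.contact_name#
		</p>
	</cfloop>
</cfoutput>
```

Whether the list has 10 rows or 1,000, this makes exactly one trip to the database.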
Now, what is nice about OOP is that it has a bunch of encapsulated functionality that might not be easily achieved with a query alone. But, as you say, you also don't want to be creating objects all the time or re-querying the database a thousand times. One thing we could try to do is to leverage a bastardized idea of Peter Bell's Iterating Business Object, but put a slightly different twist on it so that it can handle, not just single components, but components that have composed objects.
First, let's create a really simple Person.cfc ColdFusion component. This will only have a name, date of birth, and a "Best friend forever" (BFF) property:
<cfcomponent
output="false"
hint="I am a really really simple person object.">
<!--- Default instance variables. --->
<cfset VARIABLES.Instance = {
Name = "",
DateOfBirth = "",
BFF = ""
} />
<cffunction
name="Init"
access="public"
returntype="any"
output="false"
hint="I return an initialized object.">
<!--- Return This reference. --->
<cfreturn THIS />
</cffunction>
<cffunction
name="Get"
access="public"
returntype="any"
output="false"
hint="I get a given property value. If the property is invalid, I return empty string.">
<!--- Define arguments. --->
<cfargument
name="Property"
type="string"
required="true"
hint="I am the property being retrieved."
/>
<!---
Check to make sure that the property is valid, that
is, that it is already a key in the instance object.
--->
<cfif StructKeyExists( VARIABLES.Instance, ARGUMENTS.Property )>
<!--- Return property value. --->
<cfreturn VARIABLES.Instance[ ARGUMENTS.Property ] />
<!---
The property did not exist, but maybe there was a
pseudo property that could be gotten.
--->
<cfelseif (
IsDate( Get( "DateOfBirth" ) ) AND
(ARGUMENTS.Property EQ "Age")
)>
<!--- Return calculated age. --->
<cfreturn DateDiff(
"yyyy",
Get( "DateOfBirth" ),
Now()
) />
<cfelse>
<!---
The property is not valid, so just return the
empty string rather than throwing an error.
--->
<cfreturn "" />
</cfif>
</cffunction>
<cffunction
name="Set"
access="public"
returntype="any"
output="false"
hint="I store a value into the instance data. I return THIS object for method chaining.">
<!--- Define arguments. --->
<cfargument
name="Property"
type="string"
required="true"
hint="I am the property being set. I must already exist in the instance data."
/>
<cfargument
name="Value"
type="any"
required="true"
hint="I am the property value being set."
/>
<!---
Check to make sure that the property is valid, that
is, that it is already a key in the instance object.
--->
<cfif StructKeyExists( VARIABLES.Instance, ARGUMENTS.Property )>
<!--- Store the property value. --->
<cfset VARIABLES.Instance[ ARGUMENTS.Property ] = ARGUMENTS.Value />
</cfif>
<!--- Return This for method chaining. --->
<cfreturn THIS />
</cffunction>
</cfcomponent>
This contains just a bit of information and allows you to Get() and Set() that information. Notice that in our Get() method, we check not only for existing property values, but also for some special cases; while "Age" is not an internal property, we know that we can get this value based on the DateOfBirth property. It is this kind of encapsulated functionality that makes Object Oriented Programming so exciting.
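As a quick sanity check, here is a small sketch of how this component might be used on its own. It only calls the Init(), Get(), and Set() methods defined above; the objPerson variable name and the "Height" property are just examples I picked to show what happens with an invalid key:

```cfml
<!--- Create a person and chain a few Set() calls. --->
<cfset objPerson = CreateObject( "component", "Person" )
	.Init()
	.Set( "Name", "Ben Nadel" )
	.Set( "DateOfBirth", "09/21/1980" )
	/>

<!---
	"Height" is not a key in the instance data, so this
	Set() call is silently ignored (no error is thrown).
--->
<cfset objPerson.Set( "Height", "tall" ) />

<cfoutput>
	<!--- A stored property. --->
	Name: #objPerson.Get( "Name" )#<br />

	<!--- A calculated pseudo-property (based on DateOfBirth). --->
	Age: #objPerson.Get( "Age" )#<br />

	<!--- An invalid property comes back as an empty string. --->
	Height: [#objPerson.Get( "Height" )#]
</cfoutput>
```

Because Set() always returns THIS, the calls can be chained even when a property name turns out to be invalid.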
Ok, now let's look at our Iterating Business Object, PersonIBO.cfc. This object is designed to hold a query of a very specific structure - we can't just go throwing anything in there. It is designed to take a query that has done the SQL JOINs for us and returned both the Person data as well as the BFF data in a single record set:
<cfcomponent
output="false"
hint="I take a person query and mimic the iteration of person objects.">
<!--- Default the instance variables. --->
<cfset VARIABLES.Instance = {
Query = "",
Person = CreateObject( "component", "Person" ).Init(),
BFF = CreateObject( "component", "Person" ).Init()
} />
<cffunction
name="Init"
access="public"
returntype="any"
output="false"
hint="I return an initialized object.">
<!--- Define arguments. --->
<cfargument
name="Query"
type="query"
required="true"
hint="This is the query object that I will use to mimic my true objects."
/>
<!--- Store the query. --->
<cfset VARIABLES.Instance.Query = ARGUMENTS.Query />
<!--- Return This reference. --->
<cfreturn THIS />
</cffunction>
<cffunction
name="GetPerson"
access="public"
returntype="any"
output="false"
hint="I return the person at this row of the 'Collection.'">
<!--- Define arguments. --->
<cfargument
name="Index"
type="numeric"
required="true"
hint="I am the row in the record set that we want to go to and use objects for."
/>
<!--- Set the row index. --->
<cfset VARIABLES.Instance.RowIndex = ARGUMENTS.Index />
<!--- Populate our mock Person object. --->
<cfset VARIABLES.Instance.Person
.Set(
"Name",
VARIABLES.Instance.Query[ "name" ][ ARGUMENTS.Index ]
)
.Set(
"DateOfBirth",
VARIABLES.Instance.Query[ "dob" ][ ARGUMENTS.Index ]
)
.Set(
"BFF",
VARIABLES.Instance.BFF
)
/>
<!--- Populate the mock BFF object. --->
<cfset VARIABLES.Instance.BFF
.Set(
"Name",
VARIABLES.Instance.Query[ "bff_name" ][ ARGUMENTS.Index ]
)
.Set(
"DateOfBirth",
VARIABLES.Instance.Query[ "bff_dob" ][ ARGUMENTS.Index ]
)
.Set(
"BFF",
VARIABLES.Instance.Person
)
/>
<!--- Return the person object. --->
<cfreturn VARIABLES.Instance.Person />
</cffunction>
<cffunction
name="Size"
access="public"
returntype="numeric"
output="false"
hint="I return the number of objects in my 'Collection.'">
<!---
Since we are going to have an object for each
record, just return the record count of our
internal query.
--->
<cfreturn VARIABLES.Instance.Query.RecordCount />
</cffunction>
</cfcomponent>
This IBO acts as a sort of pseudo collection object. Every time the user requests a Person object via the GetPerson( i ) method, the IBO takes internal objects and re-populates them with data from the internal query object. This allows us to create the Person and BFF instances only once and then repopulate them whenever they are requested (without going back to the database). When you do this, though, you have to be SUPER aware that you do NOT have unique object instances; the IBO is designed to mimic OOP collections, but does not really create new objects. Therefore, you can't be caching these returned Person.cfc instances or passing them off to other services, as the data will become immediately corrupted when the next GetPerson() method call is made.
That being said, let's take a look at an example:
<!--- Create a fake query. --->
<cfset qPerson = QueryNew(
"name, dob, bff_name, bff_dob",
"cf_sql_varchar, cf_sql_varchar, cf_sql_varchar, cf_sql_varchar"
) />
<!--- Add several rows. --->
<cfset QueryAddRow( qPerson, 5 ) />
<!--- Store person data in our query. --->
<cfset qPerson[ "name" ][ 1 ] = "Ben Nadel" />
<cfset qPerson[ "dob" ][ 1 ] = "09/21/1980" />
<cfset qPerson[ "bff_name" ][ 1 ] = "Marisa Miller" />
<cfset qPerson[ "bff_dob" ][ 1 ] = "08/06/1978" />
<cfset qPerson[ "name" ][ 2 ] = "Libby Smith" />
<cfset qPerson[ "dob" ][ 2 ] = "02/14/1982" />
<cfset qPerson[ "bff_name" ][ 2 ] = "Daniel Pink" />
<cfset qPerson[ "bff_dob" ][ 2 ] = "11/16/1980" />
<cfset qPerson[ "name" ][ 3 ] = "John Crugar" />
<cfset qPerson[ "dob" ][ 3 ] = "03/14/1979" />
<cfset qPerson[ "bff_name" ][ 3 ] = "Maggie Allen" />
<cfset qPerson[ "bff_dob" ][ 3 ] = "02/22/1975" />
<cfset qPerson[ "name" ][ 4 ] = "Kim Hall" />
<cfset qPerson[ "dob" ][ 4 ] = "08/03/1981" />
<cfset qPerson[ "bff_name" ][ 4 ] = "Michael Greenburg" />
<cfset qPerson[ "bff_dob" ][ 4 ] = "06/12/1981" />
<cfset qPerson[ "name" ][ 5 ] = "Steve Harris" />
<cfset qPerson[ "dob" ][ 5 ] = "01/27/1988" />
<cfset qPerson[ "bff_name" ][ 5 ] = "Peggy Sue-Ellen" />
<cfset qPerson[ "bff_dob" ][ 5 ] = "01/27/1987" />
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>Composed Object IBO Example</title>
</head>
<body>
<cfoutput>
<h1>
Composed Object IBO Example
</h1>
<!---
Create an IBO for our person query. Now, we can't
just pass this any query. This iterating business
object was designed to handle a query of a very
specific structure (designed above).
--->
<cfset objIBO = CreateObject( "component", "PersonIBO" )
.Init( qPerson )
/>
<!---
Loop over the people "Collection". We are going to
treat this like an array of People instances.
--->
<cfloop
index="intIndex"
from="1"
to="#objIBO.Size()#"
step="1">
<!--- Get the next person in the collection. --->
<cfset objPerson = objIBO.GetPerson( intIndex ) />
<!--- Get the BFF instance. --->
<cfset objBFF = objPerson.Get( "BFF" ) />
<p>
#objPerson.Get( "Name" )#
(Age: #objPerson.Get( "Age" )#)
has the best friend forever:
#objBFF.Get( "Name" )#
(Age: #objBFF.Get( "Age" )#)
</p>
</cfloop>
</cfoutput>
</body>
</html>
As you can see, our qPerson ColdFusion query object has columns for the Person as well as that person's BFF information. The query is then loaded into the PersonIBO.cfc, which is then treated as if it were a collection of Person.cfc instances. And, since a real Person.cfc instance is returned from the GetPerson() method, we can use all of the methods that we would normally, including the calculated Get( "Age" ) method, which would not be available via the query alone (without the duplication of business logic - shame shame shame).
Running the above code, we get the following output:
Composed Object IBO Example
Ben Nadel (Age: 27) has the best friend forever: Marisa Miller (Age: 29)
Libby Smith (Age: 26) has the best friend forever: Daniel Pink (Age: 27)
John Crugar (Age: 29) has the best friend forever: Maggie Allen (Age: 33)
Kim Hall (Age: 26) has the best friend forever: Michael Greenburg (Age: 26)
Steve Harris (Age: 20) has the best friend forever: Peggy Sue-Ellen (Age: 21)
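To see why you can't hold on to the objects that GetPerson() hands back, here is a small sketch that could be dropped in after the loop in the demo above. It assumes the objIBO instance and qPerson data from that demo; the objFirst, objSecond, and strFirstName variable names are just ones I made up for this illustration:

```cfml
<!--- Ask for the first person and hold on to the reference. --->
<cfset objFirst = objIBO.GetPerson( 1 ) />

<!--- Copy out the simple value that we want to keep. --->
<cfset strFirstName = objFirst.Get( "Name" ) />

<!---
	Ask for the second person. The IBO re-populates the
	SAME internal Person instance rather than creating
	a new one.
--->
<cfset objSecond = objIBO.GetPerson( 2 ) />

<cfoutput>
	<!--- Both references now report row 2: Libby Smith. --->
	#objFirst.Get( "Name" )#<br />
	#objSecond.Get( "Name" )#<br />

	<!--- The copied string still holds row 1: Ben Nadel. --->
	#strFirstName#
</cfoutput>
```

If you need a value to survive past the next GetPerson() call, copy the simple value out (as with strFirstName) rather than keeping the object reference.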
So, that's my stab at answering your question. This is completely theoretical - I have never tried this in a real application. And remember, I have never even really tried OOP style programming in a real application. This could be way off base, so use with caution! But, I think what is cool about this is that you get to use object oriented programming practices without having to actually create a lot of different objects or have to query the database N times.
Reader Comments
I think this is what you need, but I'm actually just some jerk who may have no clue what the hell he's doing and all...
First, make sure you are using a framework which loads your object CFCs into the application scope (i believe most of them do). I think that would speed up object creation.
Second, separate DB calls into a separate CFC. (Eventually you should break these into CRUD and Gateway, but one step at a time). They should also be in application scope.
You have two choices for the Read operation: One function that always returns an array of objects or two functions, one to return a single object, the other to return an array of objects. Honestly, it doesn't matter (unless you want to really start developing true integration with Flex and display models... then it just gets easier to always return an array..)
So you mentioned you are not doing a flat DB query. i.e. you would need to call data from multiple tables to fill your needs. Just do it. Write a query that gets all your data sorted by object. So, if you had to get all clients and their addresses (and there could be >1 address per client), then grab the whole thing, ordered by client, address (with joins and where clauses of course).
Also, I'm assuming in your client object, you'd have an array to store the addresses (so it's an array of address Objects which defaults to ArrayNew(1)).
Then you loop through your query and group the loop statement by the sorted objects (i use cfoutput instead of cfloop because it's easier to loop inside queries you are already looping through... if that makes sense). In this example, I'm assuming you have a function in your cfc called init which returns 'this' (the variable).
<pre>
<cfset var clients = arrayNew(1)/>
<cfset var tmpClient = ""/>
<cfset var tmpAddress = ""/>
<!--- loop through each row grouped by client --->
<cfoutput query="q" group="clientId">
<!--- create client object --->
<cfset tmpClient = createObject("component", "beans.clientBean").init(
firstname = q.firstname,
lastname = q.lastname,
etc = q.etc
) />
<!--- loop through addresses from DB --->
<cfoutput>
<!--- create this address --->
<cfset tmpAddress = createObject("component", "beans.address").init(
city = q.city,
state = q.state,
soForth = q.soForth
) />
<!--- attach this address to the client object --->
<cfset ArrayAppend(tmpClient.address, tmpAddress) />
</cfoutput>
<!--- finished sublooping, now add the client to array --->
<cfset ArrayAppend(clients, tmpClient) />
</cfoutput>
<cfreturn clients />
</pre>
if any of this shows up wrong, i apologize... (Ben, make the comments box wider and taller, will ya? maybe a nice widget from the YUI library or something that floats the text box in an expanding container?).
Here's the key, this code COULD be right under your query in your CRUD cfc. OR, since you may be doing multiple Read queries based on different where clauses (but with the same Select statement) you can make this a separate function that is called if needed before returning the array of objects.
Just a few notes... you should have getters and setters for each property of the bean. Also, you can create functions in the bean to add and remove items from each Array of Objects. So you would ideally have "addAddress" as a method (in the bean) and you would use <cfset tmpClient.addAddress(tmpAddress) /> to add the address to the client object.
OK, that was my longest post anywhere for a while. I hope this helps!
One thing to keep in mind when tackling collections of objects, is that size does matter, and even in OO languages other than CFML (i.e. - Java, C#, Ruby, etc.), you probably won't find yourself creating a collection with 1000+ objects, at least I don't, and I work on some fairly large systems. It's just not performant.
The representation of data as an object is really only useful when behavior is associated with that object through methods, events, etc. If you're simply using an object as the equivalent of a struct (Value Object pattern), you really don't gain much of a benefit in this case. That being said, when trying to show a list of members in a collection, especially with gobs of members, you really are solving a reporting problem. When dealing with reports in an OOP mindset, the report itself can be considered your object and the data associated with that report is a property of that object. Since CF has a native datatype for query, this is usually a great datatype for the property of your report object. To render the report, based on how you like to architect your software, the report object would have a render method, or you could have something else render it, which would produce the list you want to show to your user.
Using the report-like objects does become tedious however when you start to use CRUD-based tools and scaffolding. In these cases, I found it more useful to look into using things like paging to reduce the size of the collection you have to deal with or, if performance isn't an issue, just don't worry about it at all.
In the end my take is, if the user wants to see 1000+ members in a collection, it's probably because they want to look at it from a reporting perspective and your object model should reflect that. If they want to find a member in a collection, the user would rather have good UI tools to help them find that member, rather than see a list of all 1000+ members.
Just my thoughts, feel free to brandish me now :'(
Ben, what if you needed to output your IBO like a <cfoutput query="myqry" group="bff_name"> #bff_name# <cfoutput> rest of query </cfoutput></cfoutput> ?
@CFJerk,
I see what you are saying, but I am not sure how your discussion addresses the question that was posed. The person asking the question was concerned with performance on huge lists and OOP; the concern was not so much about general data access or arrays of objects. I think your points are more general best practices and not much about this specific scenario.
@Brian,
"The representation of data as an object is really only useful when behavior is associated with that object through methods, events, etc. If you're simply using an object as the equivalent of a struct (Value Object pattern), you really don't gain much of a benefit in this case. That being said, when trying to show a list of members in a collection, especially with gobs of members, you really are solving a reporting problem. When dealing with reports in an OOP mindset, the report itself can be considered your object and the data associated with that report is a property of that object. Since CF has a native datatype for query, this is usually a great datatype for the property of your report object."
This might just be the smartest thing I've heard all day! I think you are absolutely right. I know personally, that as I am learning about OOP all I think about is in terms of Model objects. I don't necessarily think about Software Concerns. But what you say feels right - the concern here isn't to get an array of objects... the real concern is to provide a list of people and their best friends; a report, as you say.
Why should different concerns be implemented in the same way? I don't think they need / should be. Nice thought point.
@BradB,
Good point. This is just personal for me, but I absolutely HATE the "group" attribute of the cfoutput tag. In fact, I HATE using CFOutput to loop over queries. I think this was a really poor feature in the language. Now, if it were part of the CFLOOP tag, that would be a totally different situation; but the fact that it is part of the CFOutput tag makes this totally useless in my eyes.
Sorry if that sounds a bit strong, but after getting "nested CFOutput" errors a few times, I decided that this feature was too unusable to be helpful in any way that was consistent with the rest of my application.
@Brad:
What Ben said +1. I think you've really hit on something there that could help a lot of us struggling with OO in CF. Squillions of objects in a collection = reporting = different domain - nice!
I just want to point out (and getting away from what the OP was asking) that sometimes the fact that an object is in a collection is important, even if we don't care right now about the other objects in the collection. The collection may be a business object in its own right with its own business rules, and where we are dealing with collections of value objects or of strictly composed child objects, the object lifecycle will be controlled by the parent object via the collection.
The actual implementation of these ideas in CF may end up not looking much like what I've described above, but I find it handy to keep the principle in view.
@Ben, Jaime - To add to that, consider the example of a populated collection of User objects. What's to say you don't create an object called Group which is a derivation, or abstraction, of a List-like collection. You can give the object Group behaviors like bulkAddPermission(). Search is another great example of where custom collections are nice. Consider paging being a behavior of a SearchResults collection for objects of type SearchResult. SearchResults could abstract paging from the DB in a getPage() method, or be greedy and cache a large dataset internally for querying later. By using an object for SearchResults, the behavior of paging is encapsulated and closely associated with its semantics.
A lot of people have a different approach to software design but the way I figure, keep it simple by building for the basics. Use objects to emulate the problem domain you have to solve, don't just come up with a solution to solve the problem at hand. The customer will always help you grow your application from there :) The closer you emulate their problem domain with your software, the more flexible your software will surprisingly become to maintain and enhance. Again just my take on it.
As always great post Ben, keep up the great work.