Ask Ben: Delete Values In A Given List Using ColdFusion
I have an array, but it could be a list using ListToArray() or vice versa. Is there an easy way to delete all of a particular 'value' from a list or an array? For example, I have a list or array with:
"1,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,4,5"
And say I want to remove all instances of the number two... is there a simple way to do that?
There are probably a ton of ColdFusion user defined functions out there that already do this, but I figured I would take a crack at it as I figure it will keep my mind sharp (and plus, I always like having a place to point people on future questions). For this problem, let's assume that you want to delete a given value OR a list of values. To accomplish this, I am converting the target list to an array. This will allow us to iterate over the list faster than doing a list loop. I am also converting the value list (that we want to delete) into a struct in which the list values are the struct keys. This will provide a super fast value look-up that will tell us instantly whether or not we have a value we want to delete.
In the interest of speed, I am making some sacrifices. For starters, by converting the target list to an array, we are losing the delimiters. That means that if we have a multi-delimited list, there is no way we can recover the proper delimiters without some serious overhead. Also, by using the struct-key lookup, we can no longer perform a case-sensitive value comparison since ColdFusion struct keys are not case sensitive.
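To make these two trade-offs concrete, here is a minimal sketch. The variable names (lookup, items) are made up for illustration and are not part of the function below; it only shows that struct-key lookups ignore case and that converting a multi-delimited list to an array discards the original delimiters:

```cfml
<!--- Struct keys are matched without regard to case. --->
<cfset lookup = StructNew() />
<cfset lookup[ "abc" ] = true />

<!--- A multi-delimited list: both "," and ":" separate items. --->
<cfset items = ListToArray( "a,b:c", ",:" ) />

<cfoutput>
	<!--- Evaluates to true, even though the key was stored as "abc". --->
	#StructKeyExists( lookup, "ABC" )#
	<!--- The array holds 3 items; the original delimiters are gone. --->
	#ArrayLen( items )#
	<!--- Rejoining with a single delimiter gives: a,b,c --->
	#ArrayToList( items, "," )#
</cfoutput>
```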
While these are limitations, I don't think they are that bad. If you think about the common scenarios that involve list manipulation, I think you will find that often times it is a single-delimited list and usually involves numeric values which do not have case. Therefore, I think this function is speed-tailored for the general case scenario.
That being said, here is ListDeleteValue( TargetList, ValueList [, Delimiters] ), the ColdFusion user defined function that I came up with:
<cffunction
	name="ListDeleteValue"
	access="public"
	returntype="string"
	output="false"
	hint="Deletes a given value (or list of values) from a list. This is not case sensitive.">

	<!--- Define arguments. --->
	<cfargument
		name="List"
		type="string"
		required="true"
		hint="The list from which we want to delete values."
		/>

	<cfargument
		name="Value"
		type="string"
		required="true"
		hint="The value or list of values that we want to delete from the first list."
		/>

	<cfargument
		name="Delimiters"
		type="string"
		required="false"
		default=","
		hint="The delimiting characters used in the given lists."
		/>

	<!--- Define the local scope. --->
	<cfset var LOCAL = StructNew() />

	<!---
		Create an array in which we will store our new list.
		This will be faster than building a list via string
		concatenation.
	--->
	<cfset LOCAL.Result = ArrayNew( 1 ) />

	<!---
		Convert the target list into an array for faster
		list iteration.
	--->
	<cfset LOCAL.ListArray = ListToArray(
		ARGUMENTS.List,
		ARGUMENTS.Delimiters
		) />

	<!---
		Convert our value list into a struct. This will allow us
		to do super fast value look-ups to see if we have a
		value that requires deletion. We aren't going to bother
		converting this list to an array first (as we did above)
		because the likely scenario is that we won't have many
		values (and generally only one).
	--->
	<cfset LOCAL.ValueLookup = StructNew() />

	<!--- Loop over value list to create index. --->
	<cfloop
		index="LOCAL.ValueItem"
		list="#ARGUMENTS.Value#"
		delimiters="#ARGUMENTS.Delimiters#">

		<!--- Create index entry. --->
		<cfset LOCAL.ValueLookup[ LOCAL.ValueItem ] = true />

	</cfloop>

	<!---
		Now that we have our index in place, it's time to start
		looping over the target list and looking for target
		values in our index. NOTE: Since our index is a struct,
		the lookups will NOT be case sensitive.
	--->
	<cfloop
		index="LOCAL.ValueIndex"
		from="1"
		to="#ArrayLen( LOCAL.ListArray )#"
		step="1">

		<!--- Get a short hand to the current list value. --->
		<cfset LOCAL.Value = LOCAL.ListArray[ LOCAL.ValueIndex ] />

		<!--- Check to see if this value is in the index. --->
		<cfif NOT StructKeyExists(
			LOCAL.ValueLookup,
			LOCAL.Value
			)>

			<!---
				We are not deleting this value so add it to
				the result array.
			--->
			<cfset ArrayAppend(
				LOCAL.Result,
				LOCAL.Value
				) />

		</cfif>

	</cfloop>

	<!---
		At this point, our target list has been trimmed and
		stored in the results array. Now, we have to convert
		the array back to a list. This poses a little bit of
		complication: we can only use one delimiter. Therefore,
		we might lose some meaningful delimiters. This has been
		done in the tradeoff for faster processing.
	--->
	<cfreturn ArrayToList(
		LOCAL.Result,
		Left( ARGUMENTS.Delimiters, 1 )
		) />

</cffunction>
And now, just to run a quick test. Let's create a multi-delimiter list of numbers and then delete the even values:
<!--- Create a multi-delimited list. --->
<cfset lstNumbers = "1,1,2,2:3:3,4,4:5:5,6,6:7:7,8,8:9:9" />
<cfset lstOddNumbers = ListDeleteValue(
	lstNumbers,
	"2,4,6,8",
	",:"
	) />
<!--- Output odd values. --->
<cfoutput>#lstOddNumbers#</cfoutput>
Running the above code, we get the following output:
1,1,3,3,5,5,7,7,9,9
Notice that we deleted all the even values. Notice also that our dual-delimited list was converted into a comma-delimited list. This is because when creating the resulting list, we select only the first available delimiter.
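For the single-value case from the original question, a call might look like the following sketch. The variable names (lstValues, lstWithoutTwos) are only illustrative, and the third argument is omitted so the default comma delimiter applies:

```cfml
<!--- The list from the original question. --->
<cfset lstValues = "1,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,4,5" />

<!--- Delete every instance of 2, using the default comma delimiter. --->
<cfset lstWithoutTwos = ListDeleteValue( lstValues, "2" ) />

<!--- Expected: 1,1,1,1,1,3,3,3,4,4,4,4,5 --->
<cfoutput>#lstWithoutTwos#</cfoutput>
```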
Reader Comments
I've never done this before, but what about using replaceList to swap the values you want to get rid of with an empty string (e.g., lstNumbers = replaceList(lstNumbers, "2,4,6,8", ""))? Granted, that will leave behind extra delimiters, but as far as ColdFusion list functions are concerned (e.g., listLen), those values would be gone. Another issue would be that replaceList only supports comma as the list delimiter, but running listChangeDelims over the list in advance could take care of that.
There are lots of ways to do this, here is mine:
<cffunction name="removeDuplicateListElements" returntype="string">
	<cfargument name="list" required="yes" type="string">
	<cfargument name="delimiter" required="no" type="string" default=",">
	<cfscript>
		var returnStruct = structNew();
		var x = 1;
		for(x = 1; x LTE listLen(arguments.list, arguments.delimiter); x = x + 1){
			returnStruct[listgetAt(arguments.list, x, arguments.delimiter)] = "";
		}
	</cfscript>
	<cfreturn structKeyList(returnStruct)>
</cffunction>
<cfscript>
list = "1,2,1,1,2,3,4,1,1,3,4,5,5,3,2,2,4";
writeOutput(removeDuplicateListElements(list));
</cfscript>
Here's the most succinct solution I can think of:
<cfset thelist="1,2,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,2,4,5">
<cfset thelist=ReplaceNoCase(thelist,"2","","all")>
<cfset thelist=REReplace(thelist,",{2,}",",","all")>
<cfoutput>#thelist#</cfoutput>
If you want to replace multiple values at the same time:
<cfset thelist="1,2,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,2,4,5">
<cfset thelist=ReplaceList(thelist,"2,4",",")>
<cfset thelist=REReplace(thelist,",{2,}",",","all")>
<cfoutput>#thelist#</cfoutput>
Extraordinarily succinct indeed! The issue might be that by operating on the list as a string it would compromise the segregational integrity (not a real expression) of the list items. So, if you had some items that included one of the characters or sequences of characters targeted for removal they'd be chopped off, potentially changing unmatching list items as well as matching.
For example if the list was "12,2,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,2,4,5", the first value of the resulting list would be 1 instead of 12.
I'm sure the issue could be fixed by using a regular expression for the first replace as well as the second though. Which would probably win awards for conciseness and succinctness. Sometimes I think regular expressions are the secret to all power in programming. They'd probably let you even adapt the solution to handle infinitely many delimiters, one of the sacrifices Ben's solution consciously made for speed. Not to mention the option of enforcing case-sensitivity or not.
Speaking of which. Ben, you seem to be well-versed with ColdFusion performance. Would reReplace operations be significantly slower than your array vs. structure method? And how do they, and regular string searching and replacing compare to array and list operations in general?
Exciting discussion.
"Running the above code, we get the following output:
1,1,3,3,4,4,5,5,6,6,7,7,8,8,9,9
Notice that we deleted all the even value."
Erm... unless I'm missing something here (it happens a lot with me :P ) then the output you listed above is incorrect - as only the 2 is removed, NOT all even numbers??
Don't know if it was just a bad copy and paste??
@David
Good catch. Here's a method using a regular expression:
<cfset thelist="1,21,2,1,1,4,1,1,22,2,2,82,26,3,33,3,4,4,94,4,2,ab4,5">
<cfset itemstodelete="1|2|4">
<cfset delimiter=",">
<cfset thelist=ListChangeDelims(thelist,chr(7)&chr(7),delimiter)>
<cfset thelist=REReplaceNoCase(thelist,"(^|#chr(7)#)(#itemstodelete#)(#chr(7)#|$)","\1","all")>
<cfoutput>#ListChangeDelims(thelist,delimiter,chr(7))#</cfoutput>
Granted it's a little unconventional, but it executes 4 times faster than something more conventional, like this:
<cfset thelist="1,21,2,1,1,4,1,1,22,2,2,82,26,3,33,3,4,4,94,4,2,ab4,5">
<cfset delimiter=",">
<cfset newlist="">
<cfset itemstodelete="1,2,4">
<cfloop list="#thelist#" index="i" delimiters="#delimiter#">
	<cfif not ListFind(itemstodelete,i,delimiter)>
		<cfset newlist=ListAppend(newlist,i,delimiter)>
	</cfif>
</cfloop>
<cfoutput>#newlist#</cfoutput>
Good ideas with the ReplaceList() methodology. That didn't even occur to me. I don't see anything wrong with leveraging the fact that ColdFusion ignores empty list elements. If anything, that is a very clever use of that feature.
@Paolo,
Good catch. That was just a bad copy-n-paste job. After I tested that the list of values would work, I also tested to make sure a single-value delete would work as well (2). I must have copied the wrong test result.
@David,
As far as Regular Expression performance vs. list looping and struct usage, I cannot say for sure. I think it probably depends on the size of the list, but that's just a gut feeling, not an educated one. On a small list, I am sure regular expressions will be quite fast. The only thing you have to be concerned about is searching for reserved characters.
Yes, REReplace is the obvious solution, but it seems needlessly brittle here because it requires that you either know your data or take a bunch of precautions. For example, you can't just embed the delimiters and characters to remove into your regex without either writing them in regex syntax to begin with or escaping regex metacharacters within them.
For the task at hand specifically, <cfset lst = reReplace(lst, "\b[2468][,:]", "", "all") /> is probably sufficient.
Ah, Ben covered the special character issue... hadn't seen his comment when I responded.
BTW, when I said regexes are the obvious solution, that's because I'm a regex nut. I hope that didn't come across in a condescending way. But yeah, in the majority of cases a simple replaceList or reReplace based one-liner is probably good enough, despite the limitations.
great function to work with
I tried something with some regexp
<cfscript>
list1= "1,2,2,3,2,4,5,5,455,1122,231,5,2";
list2= "2,3,3,1";
replacementlist2= list2.ReplaceAll(",","|");
result=list1.ReplaceAll('(,?\b(#replacementlist2#),?\b)*','');
</cfscript>
<cfdump var="#result#">
@Ben, @Steve
The ReplaceList method suffers from the same problem mentioned by David: if you had some items that included one of the characters or sequences of characters targeted for removal, they'd be chopped off, potentially changing non-matching list items as well as matching ones.
For example if the list was "12,2,1,1,1,1,2,2,2,2,2,3,3,3,4,4,4,2,4,5", the first value of the resulting list would be 1 instead of 12.
I found a short piece of cfscript that seems to do a great job.
http://www.cflib.org/index.cfm?event=page.udfbyid&udfid=1815
I have a list (^B^C^^E^^^). I want to remove all the empty values at the end of the list, but keep all the other empty fields:
(^B^C^^E^^^) = (^B^C^^E)
How could I accomplish it?
thanks