User-Friendly Sort Of Alpha-Numeric Data In JavaScript
At InVision, we deal with a lot of design files. And, when you deal with a lot of design files, you deal with a large number of user-created naming conventions. Unfortunately, a standard alphabetical sort of file names doesn't usually line up with the order the user intended. As such, I wanted to play around with creating a more "intelligent" sorting algorithm that would normalize values with mixed, alpha-numeric data.
The root of this alpha-numeric sorting problem can be demonstrated with three file names using a standard alphabetical sort:
- kittens-v1.jpg
- kittens-v11.jpg
- kittens-v2.jpg
Clearly, the user is using the "vN" approach to organizing their files; however, if you look at the sort above, you can see the problem - "v11" comes after "v2" in a logical context; but, in an alphabetical context, "v11" is less than "v2".
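To see the problem in isolation, here is a minimal sketch using the built-in Array.prototype.sort(), which compares values as strings when no comparator is supplied. The fileNames and sorted variables are just illustrative names, not part of the demo below.

```javascript
// Illustrative list of file names (the same three names as above,
// deliberately out of order).
var fileNames = [ "kittens-v2.jpg", "kittens-v11.jpg", "kittens-v1.jpg" ];

// With no comparator, sort() compares the values character by character,
// so the "1" in "v11" is compared against the "2" in "v2" and wins.
var sorted = fileNames.slice().sort();

console.log( sorted );
// [ "kittens-v1.jpg", "kittens-v11.jpg", "kittens-v2.jpg" ]
```

The slice() call is only there so the original array is left untouched.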
To try and overcome this disconnect, I wanted to see if I could normalize the numeric values within a given file name. This is a little bit tricky because you have to deal with values as both integers and decimals. And then, you can have some naming conventions that use multiple dot notations (ex. v3.5.01.5). However, for the sake of this exploration, I'm only going to care about "normal" decimal formats.
To normalize the values, I'm going to create fixed-width numbers. This means that integers will have leading zeroes and decimals will have trailing zeroes. Then, once the fixed-width numbers are in place, I'll perform a normal alphabetical sort:
<!doctype html>
<html ng-app="Demo" ng-controller="DemoController">
<head>
<meta charset="utf-8" />
<title>
User-Friendly Sort Of Alpha-Numeric Data In JavaScript
</title>
</head>
<body>
<h1>
User-Friendly Sort Of Alpha-Numeric Data In JavaScript
</h1>
<ul>
<li ng-repeat="file in files">
{{ file.name }}
</li>
</ul>
<form ng-submit="saveFile()">
<input type="text" ng-model="form.name" size="20" />
<input type="submit" value="Add File" />
</form>
<!-- Load jQuery and AngularJS from the CDN. -->
<script
type="text/javascript"
src="//code.jquery.com/jquery-2.0.0.min.js">
</script>
<script
type="text/javascript"
src="//ajax.googleapis.com/ajax/libs/angularjs/1.0.4/angular.min.js">
</script>
<script type="text/javascript">
// Create an application module for our demo.
var app = angular.module( "Demo", [] );
// -------------------------------------------------- //
// -------------------------------------------------- //
// I control the main application.
app.controller(
"DemoController",
function( $scope ) {
// I am the initial list of files.
$scope.files = [
{
id: 1,
name: "kittens-1.jpg"
},
{
id: 2,
name: "kittens-2.jpg"
},
{
id: 3,
name: "kittens-12.jpg"
}
];
// Sort the initial list of files so that they are
// in mixed-type, alpha-numeric order.
sortFiles();
// I hold the form values for ngModel.
$scope.form = {
name: "kittens-3.jpg"
};
// ---
// PUBLIC METHODS.
// ---
// I process the intake form for file names.
$scope.saveFile = function() {
if ( ! $scope.form.name ) {
return;
}
addFile( $scope.form.name );
};
// ---
// PRIVATE METHODS.
// ---
// I add a file with the given name to the current
// collection.
function addFile( name ) {
$scope.files.push({
id: ( new Date() ).getTime(),
name: name
});
sortFiles();
}
// I take a value and try to return a value in which
// the numeric values have a standardized number of
// leading and trailing zeros. This *MAY* help make
// an alphabetic sort seem more natural to the user's
// intent.
function normalizeMixedDataValue( value ) {
var padding = "000000000000000";
// Loop over all numeric values in the string and
// replace them with a value of a fixed-width for
// both leading (integer) and trailing (decimal)
// padded zeroes.
value = value.replace(
/(\d+)((\.\d+)+)?/g,
function( $0, integer, decimal, $3 ) {
// If this numeric value has "multiple"
// decimal portions, then the complexity
// is too high for this simple approach -
// just return the padded integer.
if ( decimal !== $3 ) {
return(
padding.slice( integer.length ) +
integer +
decimal
);
}
decimal = ( decimal || ".0" );
return(
padding.slice( integer.length ) +
integer +
decimal +
padding.slice( decimal.length )
);
}
);
console.log( value );
return( value );
}
// I sort the current files based on the file name.
function sortFiles() {
$scope.files.sort(
function( a, b ) {
// Normalize the file names with fixed-
// width numeric data.
var aMixed = normalizeMixedDataValue( a.name );
var bMixed = normalizeMixedDataValue( b.name );
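// Return 0 for equal values so that the comparator
// stays consistent for duplicate names.
if ( aMixed === bMixed ) {
return( 0 );
}
return( aMixed < bMixed ? -1 : 1 );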
}
);
}
}
);
</script>
</body>
</html>
This approach requires a lot of work. Since I don't want to overwrite the original values, I have to recalculate the fixed-numeric-width values for each comparison involved in the sort. Clearly, you could add some sort of caching, but for this exploration, the overhead is not a deterrent.
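The caching mentioned above could be sketched roughly as follows: normalize each name once, sort on the cached key, and then unwrap the original objects. This is only one possible approach, not what the demo does; sortFilesWithCachedKeys() is a hypothetical helper name, and normalizeMixedDataValue() is repeated here (without the logging) so that the snippet runs on its own.

```javascript
var PADDING = "000000000000000";

// Same normalization idea as in the demo: leading zeros for integers,
// trailing zeros for simple (single-dot) decimals.
function normalizeMixedDataValue( value ) {
    return value.replace(
        /(\d+)((\.\d+)+)?/g,
        function( $0, integer, decimal, $3 ) {
            var paddedInteger = ( PADDING.slice( integer.length ) + integer );
            // Multiple decimal portions: only the integer is padded.
            if ( decimal !== $3 ) {
                return( paddedInteger + decimal );
            }
            decimal = ( decimal || ".0" );
            return( paddedInteger + decimal + PADDING.slice( decimal.length ) );
        }
    );
}

// Hypothetical helper: compute each normalized key exactly once, sort on
// that key, and return a new array holding the original file objects.
function sortFilesWithCachedKeys( files ) {
    return files
        .map( function( file ) {
            return { key: normalizeMixedDataValue( file.name ), file: file };
        })
        .sort( function( a, b ) {
            if ( a.key === b.key ) {
                return( 0 );
            }
            return( a.key < b.key ? -1 : 1 );
        })
        .map( function( pair ) {
            return( pair.file );
        });
}

var unsortedFiles = [
    { id: 1, name: "kittens-v11.jpg" },
    { id: 2, name: "kittens-v2.jpg" },
    { id: 3, name: "kittens-v1.jpg" }
];

var sortedNames = sortFilesWithCachedKeys( unsortedFiles ).map(
    function( file ) {
        return( file.name );
    }
);

console.log( sortedNames );
// [ "kittens-v1.jpg", "kittens-v2.jpg", "kittens-v11.jpg" ]
```

With this shape, the normalization runs once per file rather than twice per comparison, and the original name values are never overwritten.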
Using the Kittens example from above, the sorting approach in this demo provides the following sorted output:
- kittens-v1.jpg
- kittens-v2.jpg
- kittens-v5.jpg
- kittens-v11.jpg
- kittens-v22.jpg
- kittens-v50.jpg
Notice that, while this doesn't implement a normal alphabetical sort, it does sort the values in alignment with what you'd "expect" based on the naming convention.
Reader Comments
That's a great trick.
I think it might be more MVC-ish to use
and add
thereby putting the sorting into the view.
Oops, I meant
and I forgot the return keyword in my function. Derp.
This is called natural sort. I created a CF solution for it based on this Javascript solution: http://sourcefrog.net/projects/natsort/natcompare.js
Oh, and here is a description of the algorithm as well as implementations in other languages: http://sourcefrog.net/projects/natsort/
More thoughts:
I think this might be really useful as a global function that can be applied to orderBy as desired, so here's what I came up with:
jsFiddle: http://jsfiddle.net/TyHQj/
The function natural() is added to the $rootScope - obviously this may or may not work for everyone, but it's an easy solution.
That function takes one argument, the property on the object that is going to be sorted. It then returns a function that processes the natural sort on the object.
I also modified the function to include multiple-dot values, like versioning. In the case of a number containing 2 or more dots, it treats it as a series of integers. This seems to work pretty well.
Usage is really simple:
Now you can enable natural sorting for any array easily. It's also completely removed from the controllers, so you don't have to mix presentation and business logic.
The only real drawback is a flaw (in my opinion) in orderBy, that there is no way to flip the direction of a function-based sorting parameter, unless you want to flip the whole result. Meaning, if you sorted by, say, ['userid', natural('title'), 'date'], you could flip the direction for userid and date, but not for natural().
@Phil,
I honestly don't know that much about filters; so, some of your code is a little bit hard to read (basically where the definition and invocation involves the "-id" thing - not sure how filters are wired together).
That said, I thought your modification to treat multiple dots as a series of integers was awesome! I can't believe I didn't think of that. At first, I thought about treating it like a bunch of decimals... but that wouldn't have made any sense. Your way makes so much more sense! Awesome.
@Sean,
Ahh, nice, "natural sort". I figured it had to have a name, but I didn't know it. Thanks for the link - I like that yours appears to be able to use any length of numbers, whereas mine was fixed to the width of the given "padding" value. Good stuff.
@Ben,
Hope you don't mind, I took your idea and ran with it. I created a reusable module, and added date parsing into it, too.
The source code is much better commented than the jsFiddle was.
I wrote up a blog about it here: http://blog.overzealous.com/post/55829457993/natural-sorting-within-angular-js
There's also a repository for the code: https://bitbucket.org/OverZealous/angularjs-naturalsort
@Phil,
Really cool! Left a comment on your blog.
Why does this give a syntax error in some templates when I execute this code? I am unable to understand the error, though I find this trick working in some places.