Examining the Scriptaculous Unit Testing Implementation
5: toQueryParams: an Example of Prototype Extension
In the previous page, I remarked on the curious string of symbols in the line of code that appears in both of Test.Unit.Runner's "query" functions: window.location.search.parseQuery(). Now I'm going to make sense of the code, which will lead me deep into the workings and even the philosophy of prototype.js.
I know the global object window has a property called location: it's usually the value in the browser's location bar. When I open string_test.html in my browser, the location should be something like
file:///C:/here/be/dragons/scriptaculous-js-1.7.0/test/unit/string_test.html
Now, the next symbol is .search. Does the location have a search member? Yes, because window.location is not a simple string, but an object. The search member holds the query portion of the URL: the '?' and whatever follows it, up to any '#'. In my case, there's no question mark at all, so search is an empty string. But it looks as though we could control the values returned by the two functions by specifying certain URL parameters.
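To see what the search member holds without opening a browser, here is a minimal sketch using the standard URL class, which exposes a search property like the one on window.location. The example.com addresses and the tests parameter are made up for illustration.

```javascript
// Hypothetical URLs: only the part from '?' onward ends up in .search.
var withQuery = new URL('http://example.com/test/unit/string_test.html?tests=testA');
var withoutQuery = new URL('http://example.com/test/unit/string_test.html');

console.log(withQuery.search);    // "?tests=testA"
console.log(withoutQuery.search); // "" (an empty string, so there is nothing to parse)
```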
window.location.search returns a string. So what is parseQuery -- a string method, or maybe something inherited from Object? No: parseQuery is an addition to the String prototype that's made in prototype.js. In fact, it's an alias for an addition. Here's the code:
Object.extend(String.prototype, {
  ...
  toQueryParams: function(separator) {
    var match = this.strip().match(/([^?#]*)(#.*)?$/);
    if (!match) return {};
    return match[1].split(separator || '&').inject({}, function(hash, pair) {
      if ((pair = pair.split('='))[0]) {
        var name = decodeURIComponent(pair[0]);
        var value = pair[1] ? decodeURIComponent(pair[1]) : undefined;
        if (hash[name] !== undefined) {
          if (hash[name].constructor != Array)
            hash[name] = [hash[name]];
          if (value) hash[name].push(value);
        }
        else hash[name] = value;
      }
      return hash;
    });
  },
  ...
});
...
String.prototype.parseQuery = String.prototype.toQueryParams;
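The pattern at work here, adding a method to String.prototype and then aliasing it, can be tried in isolation. The sketch below is a simplified stand-in, not the library's source: extendSketch only approximates what Object.extend does, and shoutSketch and yellSketch are made-up method names.

```javascript
// Simplified stand-in for Object.extend: copy every property of source onto destination.
function extendSketch(destination, source) {
  for (var property in source) destination[property] = source[property];
  return destination;
}

// Add a (made-up) method to every string by extending String.prototype.
extendSketch(String.prototype, {
  shoutSketch: function() {
    return this.toUpperCase() + '!'; // `this` is the string the method is called on
  }
});

// An alias is just a second property that points at the same function.
String.prototype.yellSketch = String.prototype.shoutSketch;

console.log('hello'.shoutSketch()); // "HELLO!"
console.log('hello'.yellSketch());  // "HELLO!"
```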
The first two lines of the toQueryParams function check to see whether the string matches a pattern for URL query information. In line one, this is the string object to which the toQueryParams function now "belongs" through inclusion in the String class's prototype. Unless the string matches the pattern, this function will just return an empty hash/object.
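To get a feel for what that pattern captures, it can be run on its own against a few sample strings. The sample query strings below are invented. Note that an empty string still matches, with an empty first group.

```javascript
var pattern = /([^?#]*)(#.*)?$/;

// A made-up query string with a fragment on the end.
var m = '?tests=testA&x=1#top'.match(pattern);
console.log(m[0]); // "tests=testA&x=1#top" (the whole match)
console.log(m[1]); // "tests=testA&x=1"     (first group: the query text)
console.log(m[2]); // "#top"                (second group: the fragment)

// Both groups are optional, so an empty string matches too.
var emptyMatch = ''.match(pattern);
console.log(emptyMatch[1]); // ""
```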
Assuming we get past the regex test, we next come right upon something we were not expecting: a return statement. There are a dozen lines left in this function, but we're already returning a value? This is confusing to read. It's an example of the chaining of function calls and inline-function definitions that are becoming more common in contemporary JavaScript code. I've even started writing this way at times too, because it seems the most expressive and least awkward way to do some things. But I won't deny that it is hard to read. The density of those powerful dots, parens, and brackets is off-putting, and it's still jarring to see "function" near the right end of a long line instead of at the left end where it's always belonged before.
Here's what's going on. In the first lines, we applied a regular expression against the value of the String object. That call to match stored its results in the match array variable: match[0] holds the whole matched text, and match[1] holds the first captured group, which is the query text without the leading '?' or any trailing '#' fragment. The rest of the toQueryParams function concerns itself with that second element, match[1]. Specifically, it splits this string, either by the argument separator, if it has been supplied, or by an ampersand. That 'either or' functionality is effected by the || operator. As described before, if the first operand of an or expression is truthy, that is the value of the whole expression. Otherwise, the value of the second operand is returned. The value of '&', after all, is the string '&'.
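The default-separator idiom can be isolated in a few lines. This is only a sketch: splitQuerySketch is a made-up helper name, and the sample strings are invented.

```javascript
// Made-up helper that isolates the split step of toQueryParams.
function splitQuerySketch(query, separator) {
  // If separator is undefined (or otherwise falsy), '&' is used instead.
  return query.split(separator || '&');
}

console.log(splitQuerySketch('a=1&b=2'));      // [ 'a=1', 'b=2' ]
console.log(splitQuerySketch('a=1;b=2', ';')); // [ 'a=1', 'b=2' ]
```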
The split function returns an array. Chained to the call to split is a call to the function inject. By chained, I mean that the dot operator between the split call and the inject call indicates that inject will be interpreted as a method of the object returned by split. What is inject? It is a new function for arrays and hashes, provided by prototype.js, and it's quite sophisticated. Here's the call to inject, in a nutshell.
inject({}, function(hash, pair) { ... })
You can find some documentation on inject, and the rest of the prototype.js API, at http://www.gotapi.com/prototypejs. I also recommend Scott Raymond's Prototype Quick Reference.
The first thing you should know about the inject function is that it is going to act over the entire array on which it is called, element by element. In other words, if my_array has ten members, inject is going to act ten times. prototype.js includes several of these iterative functions for arrays and hashes.
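The once-per-element behavior is easy to observe with the built-in Array.prototype.reduce, a close relative of inject that takes its arguments in the opposite order (function first, initial value second). This is a swapped-in technique for illustration, not Prototype's inject itself.

```javascript
var my_array = ['a', 'b', 'c'];
var calls = 0;

// The callback runs once per element, carrying the accumulator along.
var joined = my_array.reduce(function(accumulator, value) {
  calls++;
  return accumulator + value;
}, '');

console.log(calls);  // 3
console.log(joined); // "abc"
```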
How does inject behave? The first thing it does, before any iteration takes place, is set up an initial value. You pass it that value in the first argument to inject. In this case, it's an empty hash. The second argument you pass is a function. This function is going to execute once for each member of the array. Here is the function passed in our example:
function(hash, pair) {
  if ((pair = pair.split('='))[0]) {
    var name = decodeURIComponent(pair[0]);
    var value = pair[1] ? decodeURIComponent(pair[1]) : undefined;
    if (hash[name] !== undefined) {
      if (hash[name].constructor != Array)
        hash[name] = [hash[name]];
      if (value) hash[name].push(value);
    }
    else hash[name] = value;
  }
  return hash;
}
This function is by convention called an iterator. inject expects the iterator function to have a certain signature: it has to accept two or three arguments. The first is for an object called an accumulator. Each time the iterator function is called, inject passes the accumulator as the first argument. The first time the iterator is called, the initial value is passed as the accumulator. It is up to you, the coder, to make sure that each iteration has the potential to alter the accumulator object as appropriate and that the accumulator is always returned at the end of the iteration.
The second argument passed to the iterator function is called the value. It is going to receive, from inject, a representation of the element of the array on which the iterator is to act. Remember, in toQueryParams, the inject function is called on an array of strings. So the value of each element of the array is a string. The iterator receives the string in the parameter pair.
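The accumulator-and-value contract described above can be sketched as a small stand-alone function. injectSketch is a hypothetical approximation written for this page, not the code in prototype.js, and the sample strings are invented.

```javascript
// Approximation of inject as a plain function: the accumulator starts as the
// initial value and is replaced by whatever the iterator returns each time.
function injectSketch(array, accumulator, iterator) {
  for (var i = 0; i < array.length; i++) {
    accumulator = iterator(accumulator, array[i], i);
  }
  return accumulator;
}

// Record the length of each string in a hash, returning the hash every time.
var lengths = injectSketch(['tests=testA', 'x=1'], {}, function(hash, pair) {
  hash[pair] = pair.length;
  return hash;
});

console.log(lengths); // { 'tests=testA': 11, 'x=1': 3 }
```

If the iterator forgot its return statement, the accumulator would become undefined after the first pass, which is why the final return hash matters so much.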
Now, what's going on inside the iterator? First, we attempt to split the pair string by an equal sign. There's some very C-like code in that test statement: (pair = pair.split('='))[0]. The pair variable symbol is now assigned to the array that results by splitting pair the string at an equal sign. And then the [0] notation returns the first element of the resulting new array. If that element resolves to a boolean value of true, then we continue. Otherwise, we don't do anything except return the hash variable, which is our accumulator, back to inject without any modification.
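Here is that assign-then-test idiom on its own, as a sketch. firstSegmentSketch is a made-up wrapper so the three cases can be compared side by side.

```javascript
// Made-up wrapper around the test from the iterator.
function firstSegmentSketch(pair) {
  // Reassign pair to the split array, then test its first element for truthiness.
  if ((pair = pair.split('='))[0]) return pair;
  return null; // an empty name means the pair is skipped
}

console.log(firstSegmentSketch('bunny=Harvey')); // [ 'bunny', 'Harvey' ]
console.log(firstSegmentSketch('=Harvey'));      // null (the name is an empty string)
console.log(firstSegmentSketch(''));             // null
```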
Assuming pair the string split successfully, pair the array is ready for use. We assign some well-named variables, name and value, with the name and value segments of the pair array. Well, we do some stuff to the values first--we attempt to decode them from URI encoding--but let's skip over that.
Here's the next section of code:
if (hash[name] !== undefined) {
  if (hash[name].constructor != Array)
    hash[name] = [hash[name]];
  if (value) hash[name].push(value);
}
else hash[name] = value;
So the first line checks to see whether there is a record in the hash variable, the object that is serving as our accumulator, with a key that matches the value of the string variable name. If there isn't such a record, hash[name] returns undefined. The undefined value is an odd beast. It isn't a boolean false, and it isn't the null value, although it can be treated as both in certain contexts. The first line handles it strictly, by using the !== operator, not !=, to test whether a value is unequal to the undefined value. If you use !=, then undefined and null are treated as equal to each other. (Yikes, I am going to need to review some of my code now.)
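The difference between the two operators shows up as soon as a key holds null. A minimal sketch, with made-up property names:

```javascript
// One key is present but holds null; the other key does not exist at all.
var sample = { present: null };

console.log(sample.missing === undefined); // true: no such record
console.log(sample.present !== undefined); // true: strict comparison keeps null and undefined apart
console.log(sample.present != undefined);  // false: loose comparison treats null and undefined as equal
```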
If there is a record for name in hash, the second line of code above checks to see whether the value hash[name] is an array. Note how it does this--by checking whether the constructor for hash[name] equals Array. What is Array? It is not a class type. It is a function, the constructor function used to make instances of Array. What would we have gotten using the typeof operator on an array? That would have returned 'object'. Checking the constructor is a better way to determine whether an object is of a specific pseudoclass.
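The two checks can be compared directly. The sketch below also shows Array.isArray, a built-in added to the language in later versions, for readers on newer environments; it is not what the code under discussion uses.

```javascript
var list = ['Harvey'];
var single = 'Harvey';

console.log(typeof list);                  // "object" (not specific enough)
console.log(list.constructor === Array);   // true
console.log(single.constructor === Array); // false (its constructor is String)
console.log(Array.isArray(list));          // true (later built-in alternative)
```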
What does line three do? Executing when hash[name] is not an array, it reassigns hash[name] to an array with one value, the previous value of hash[name]. So, if hash['bunny'] were equal to 'Harvey', hash['bunny'] is now equal to ['Harvey'].
Line four pushes the new value onto the array hash[name], provided the value is not empty. So if value were 'Peter' at this point, hash['bunny'] now equals ['Harvey', 'Peter'].
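The Harvey and Peter walk-through can be replayed with the same branch logic pulled out into a function. addValueSketch is a made-up name; the body mirrors the section of code quoted above.

```javascript
// The branch logic from the iterator, isolated for experimentation.
function addValueSketch(hash, name, value) {
  if (hash[name] !== undefined) {
    // Second value for this key: promote a lone string to an array first.
    if (hash[name].constructor != Array) hash[name] = [hash[name]];
    if (value) hash[name].push(value);
  } else hash[name] = value; // first value for this key: store it as-is
  return hash;
}

var bunnies = {};
addValueSketch(bunnies, 'bunny', 'Harvey');
console.log(bunnies.bunny); // "Harvey"
addValueSketch(bunnies, 'bunny', 'Peter');
console.log(bunnies.bunny); // [ 'Harvey', 'Peter' ]
```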
How about that push method for our array? If you have experience programming JavaScript, you may remember this: my_array[my_array.length] = "foo". That was always kind of awkward. There's something much more satisfying about my_array.push("foo").
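Side by side, the two styles give the same result. A tiny sketch with invented array names:

```javascript
// The old habit: assign to the slot just past the end.
var oldStyle = ['a'];
oldStyle[oldStyle.length] = 'foo';

// The push method does the same thing and returns the new length.
var newStyle = ['a'];
var newLength = newStyle.push('foo');

console.log(oldStyle);  // [ 'a', 'foo' ]
console.log(newStyle);  // [ 'a', 'foo' ]
console.log(newLength); // 2
```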
Finally, if there were no matching record already in hash, we add such a record. Note that the value of the record is only a string, not an array. If there proves to be another value with this key to add later, then we'll convert this value into an array.
This brings us to the final line of the iterator function:
return hash;
This returns the hash variable, our accumulator object, back to the inject function. This closes the circle, if you like, handing the accumulator object over to be used again in the next iteration. What if there is no further iteration? The inject function returns the accumulator object, and that in turn is returned, via that return statement that perplexed us earlier, as the value of toQueryParams().
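To try the whole thing outside a browser and without prototype.js, here is a stand-alone sketch of the same logic. It makes two substitutions that are assumptions on my part: the built-in trim stands in for strip, and the built-in reduce stands in for inject. toQueryParamsSketch is a made-up name, and the query string is invented.

```javascript
// Stand-alone approximation of toQueryParams using only built-ins.
function toQueryParamsSketch(queryString, separator) {
  var match = queryString.trim().match(/([^?#]*)(#.*)?$/);
  if (!match) return {};
  // reduce takes the iterator first and the initial value second.
  return match[1].split(separator || '&').reduce(function(hash, pair) {
    if ((pair = pair.split('='))[0]) {
      var name = decodeURIComponent(pair[0]);
      var value = pair[1] ? decodeURIComponent(pair[1]) : undefined;
      if (hash[name] !== undefined) {
        if (hash[name].constructor != Array) hash[name] = [hash[name]];
        if (value) hash[name].push(value);
      } else hash[name] = value;
    }
    return hash;
  }, {});
}

var params = toQueryParamsSketch('?bunny=Harvey&bunny=Peter&tests=testA');
console.log(params.bunny); // [ 'Harvey', 'Peter' ]
console.log(params.tests); // "testA"
console.log(toQueryParamsSketch('')); // {}
```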
Whew.
BTW, the third, optional, argument of the iterator function, if it were present, would receive the index of the member. In this case, in toQueryParams, the index didn't matter, so it was omitted.
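Here is what an iterator that does use the index might look like. This sketch uses the built-in reduce as a stand-in for inject; its callback also receives the index as the third argument.

```javascript
// Label each pair with its position in the array.
var labelled = ['a=1', 'b=2'].reduce(function(accumulator, value, index) {
  accumulator.push(index + ': ' + value);
  return accumulator;
}, []);

console.log(labelled); // [ '0: a=1', '1: b=2' ]
```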
Why was I looking so closely at the extended String method toQueryParams? So that I would know what the methods parseResultsURLQueryParameter and parseTestsQueryParameter of Test.Unit.Runner do.
So let's get back to those functions.