Innovaptor Logo

Performance Benchmark of Methods for Enumerating a NSArray Pt.1

A few days ago, a colleague of ours had to compare the objects in an NSArray with each other in order to find two specific matching elements. When he asked us how to achieve this in a more efficient manner I started thinking.

The first thing that comes to mind is Fast Enumeration of course. The name implies that it is the fastest method. But if your comparison is somewhat expensive, you don’t want to simply use two nested Fast Enumeration Loops because you would compare each pair of elements twice. Therefore, you have to find a way to start the enumeration in the inner loop from the current element of the outer loop. A graphical illustration of the comparisons would yield a triangle.

When I started playing around with different methods to achieve this task, I figured that I might as well test the performance of various types of enumerations. Because unless you test it, assumptions about the performance may as well be educated guesses.

In this first part, I examine various methods for a very commonly used task: enumerating every object of an array. In the upcoming parts, I will take a look at enumerating only parts of an array and enumerating every combination of two elements in an array.

Most of the techniques can be used for enumerating other collections as well, however performance results may vary drastically.

Simple Enumeration

In this part, I started with one of the most common collection tasks: enumerating through an NSArray.

Three methods come to mind:

  • counting for-loop
  • Iterator Pattern (NSEnumerator)
  • Fast Enumeration
For completeness and because it is needed later I’ve also evaluated backward enumeration for every method.

counting for-loop

The naive approach is similar to what you would do to enumerate a c-array. It uses a simple for-loop, but instead of using pointer arithmetic to access the single elements, you have to use NSArray’s method objectAtIndex:. The Code is pretty straight foreward:

// Sample 1
for(NSInteger i = ([_testArray count] - 1); i >= 0 ; i--)
{
    NSNumber *number = [_testArray objectAtIndex:i];
    // your code here
}
// Sample 2
for(NSUInteger i = 0; i < [_testArray count]; i++)
{
    NSNumber *number = [_testArray objectAtIndex:i];
    // your code here
}

NSEnumerator

NSEnumerator is the Cocoa-Implementation of the Iterator Design-Pattern. Usually iterators have two Methods:

  • bool hasNext()
  • id next()
NSEnumerator implements only the latter and will return nil if there is no next object. Different collections provide different enumerators. For example, a NSDictionary has methods for both an object-enumerator and a key enumerator, whereas NSArray offers the methods objectEnumerator and reverseObjectEnumerator to obtain an NSEnumerator for forward an backward enumeration.

// Sample 3
NSEnumerator *enumerator = [_testArray objectEnumerator];
NSNumber *number;
while(number = [enumerator nextObject])
{
    // your code here
}
// Sample 4
NSEnumerator *enumerator = [_testArray reverseObjectEnumerator];
NSNumber *number;
while(number = [enumerator nextObject])
{
    // your code here
}

Fast Enumeration

In Objective-C 2.0, a convenient way for enumerating collections was introduced: the FastEnumeration protocol. Aside from more concise and readable code, it is claimed to be much faster and, in fact, Apple encourages you to use FastEnumeration over the other methods.

// Sample 5
for(NSNumber *number in _testArray)
{
    // your code here
}

FastEnumeration does not directly provide a way to enumerate your array backwards, however it will take an NSEnumerator Object that you provide, so you can provide an reverseObjectEnumerator like this:

// Sample 6
for(NSNumber *number in [_testArray reverseObjectEnumerator])
{
    // your code here
}

Enumerate with Block

With the release of GCD on the iPhone with iOS 4.0, Apple added yet another way to enumerate collections: blocks. You pass a block with 3 parameters - the current object, the index of the current object and a stop flag - to the collection. It will then automatically invoke the aforementioned block for every object within the collection, passing it as the first object into the block.

// Sample 7
[_testArray enumerateObjectsUsingBlock:^(NSNumber *number, NSUInteger idx, BOOL *stop)
{
// your code here
}];
// Sample 8
[_testArray enumerateObjectsWithOptions:NSEnumerationReverse
                             usingBlock:^(NSNumber *number, NSUInteger idx, BOOL *stop)
{
    // your code here
}];

Since Fast Enumeration imposes the restriction that the collection may not be mutated during enumeration anyway, you can even execute these blocks concurrently!

Utilizing multicore processors for potentially costly enumerations is really just as easy as passing NSEnumerationConcurrent as option to the call:

// Sample 9
[_testArray enumerateObjectsWithOptions:NSEnumerationConcurrent
                             usingBlock:^(NSNumber *number, NSUInteger idx, BOOL *stop)
{
    // your code here
}];

Results

Benchmark results were gathered by running the samples on an iPad 2 with an array containing 1.000.000 NSNumber Objects (numbers from 0 to 999.999). Total time is the time needed to iterate over the whole array. Time per element is the total time divided by the total number of objects in the array. The factor is relative to Fast Enumeration.

Sample Nr. Enumeration Total Time Enumeration Time per Element Setup Total Time Setup Time per Element Factor
Fast Enumeration (whole array) 42.51 ms 0.0425 µs negligible negligible 1
Counting for-Loop 1 639.28 ms 0.6392 µs negligible negligible 14.26
Fast-Enumeration, Pointer Check 2 45.99 ms 0.04599 µs negligible negligible 1.08
Fast-Enumeration, Subarray 3 42.55 ms 0.0425 µs 177.14 ms 0.1771 us 1.0001
C-Style-Array 4 356.90 ms 0.3569 µs 14.77 ms 0.0147 us 8.39
Fast-Enumeration, Ignore List 5 781.94 ms 0.781942 us negligible negligible 17.45
Enumeration with Block 7 469.27 ms 0.469 us 23.61 ms 0.0236 us 11.03
Enumeration with Block concurrent 9 217.30 ms 0.2173 us 24.33 ms 0.0243 us 5.11
Also, the more general Methods have been measured with an extra testcase where 20 random elements are excluded from the enumeration: –>
Sample Nr. Enumeration Total Time Enumeration Time per Element
Fast-Enumeration, Ignore List 6 1385.19 ms 1.3851 us
Enumeration with Block 8 576.39 ms 0.5763 us
Enumeration with Block concurrent 214.29 ms 0.2142 us

Instruments

The following image shows an overview of my profiling session in instruments. The timeline shows each one of the samples executed after each other. The profiler results show in which function the program spends the most time during execution, so in essence, this shows a ranking of the methods. Sample 4, which is NSEnumerator with reverse Order took the most time, whereas Sample 5 - the Fast Enumeration - was the fastest one.

Digging deeper into the stack traces allows for a more thorough understanding:

  • counting for-loop (Sample 1 and 2): Only 4,7% (28ms) of the time is actually spent within the objectAtIndex: method. In total, more than 60% of execution time is needed for objc_retain and objc_release messages. Moreover, there is almost no difference for forward- and backwards enumeration, which is not very surprising.
  • For NSEnumeration (sample 3 and 4), the distribution is quite similar to the counting for-loop, more than 50 % of the time are spent with objc_retain and objc_release calls. But unexpectedly for me, the [NSEnumerator nextObject]; method took 8,5% which is with 53ms almost double the time that objectAtIndex: took. For the reverseEnumerator, it was even higher with 9% (61ms). All in all, NSEnumeration is a little bit faster than a for-loop. On a side-note: a NSFastEnumerationEnumerator object is used for the forward enumeration.
  • Fast Enumeration (sample 5): Nothing. The stack-trace consists solely of the the call to the enumeration method, the only thing that shows up is the NSLog statement that I use after the enumeration to log the time it took. I’m not quite sure what I should think about this, either the sampling frequency of 1ms is too slow and misses everything that goes on (unlikely), or the Fast Enumeration Protocol is implemented on such a low level that doesn’t show up in the sample instrument, which could of course explain the almost non-existent performance overhead for the enumeration.
  • Fast Enumeration reverse (sample 6): 6.6% (17ms) is spent in the NSEnumerator method countByEnumeratingWithState:objects:count. A look at the documentation for the FastEnumeration protocol reveals that this is the one method that every object that adopts the NSFastEnumeration protocol has to implement. The rest of the FastEnumeration is was not recorded by the sampler instrument just like for the previous sample.
  • The block methods (sample 7, 8 and 9) are all alike, enumerateObjectsWithOptions:usingBlock calls NSArrayEnumerate which again calls NSArrayChunkIterate where about 1/3 of the time is spent. Then, the block is invoked (which takes 15%) and the rest is objcretain and *objcrelease calls. *The only difference for concurrent blocks is that there are some GCD calls inbetween, which don’t take much time tough. The CPU strategy view shows that the workload is evenly distributed over both CPU cores

Load Distribution for concurrent Blocks *Load Distribution*

Conclusion

Fast Enumeration is fast, being more than 10 times faster than every other method. Keep in mind that these numbers come from iterating a fairly large array and once you do work in each iteration, the time of the iteration method itself becomes less significant, but there is no reason why you would use NSEnumerator or a for-loop when you need to enumerate the whole array.

Although reverse enumeration seems to be slower in all methods, keep in mind that these numbers are just pure enumeration. If you have to iterate over your array in reverse order, take a closer look at the work being done in each iteration. If it is significantly more than enumeration itself, the hassle of changing your algorithm to forward enumeration because of the faster enumeration is likely not worth it.

Enumeration with concurrent blocks open up a interestingly easy way to utilize multicore CPUs.

Sample Project

I’ve created a small Sample-Project containing all presented methods (including the methods for the upcoming parts).

You can get it here: https://github.com/iv-mexx/enumerationbenchmark

Update

I’ve already posted the next part about partial enumeration, you can find it here.

Markus Chmelar Portrait

Markus Chmelar, MSc

Markus is a technical mastermind and one of the founders of Innovaptor. He studied Computer Engineering at Vienna University of Technology and contributes a valuable diversification to Innovaptor's qualifications. Markus is eager to find innovative solutions to complex problems in order to provide products that are tailored directly to the customers' needs