The Software Purist |

May/10

12

Binary Searches and the 80/20 Principle

I recently came across this post: http://reprog.wordpress.com/2010/04/19/are-you-one-of-the-10-percent/ . I thought this was a very interesting topic, not because a binary search is a particularly interesting algorithm, but because of the realization that only 10% of programmers could get this right on the first try. To test, I did it myself, on paper. I was able to work it out on paper before officially testing it, but I can also see how most people can miss it. It appears to work, according to my tests. Here’s the solution I came up with:

template <class TCollection>
size_t binarySearch(const TCollection& p_collection, const typename TCollection::value_type& p_value)
{
	size_t oldMax = p_collection.size() - 1;
	size_t oldMin = 0;
	size_t newIndex = (oldMax + oldMin) / 2;

	while (oldMin != oldMax)
	{
		if (p_collection[newIndex] > p_value)
		{
			if (oldMax == newIndex)
			{
				break;
			} // end if

			oldMax = newIndex;
			newIndex = (oldMax + oldMin) / 2;
		}
		else if (p_collection[newIndex] < p_value)
		{
			if (oldMin == newIndex)
			{
				break;
			} // end if

			oldMin = newIndex;
			newIndex = (oldMax + oldMin + 1) / 2;
		}
		else
		{
			return newIndex;
		} // end if
	} // end while

	return size_t(-1);
} // end binarySearch

void test()
{
	std::vector<int> collection;
	collection.push_back(1);
	collection.push_back(2);
	collection.push_back(3);
	collection.push_back(5);
	collection.push_back(6);
	collection.push_back(7);
	collection.push_back(9);
	collection.push_back(10);

	assert(binarySearch(collection, 0) == size_t(-1));
	assert(binarySearch(collection, 1) == 0);
	assert(binarySearch(collection, 2) == 1);
	assert(binarySearch(collection, 3) == 2);
	assert(binarySearch(collection, 4) == size_t(-1));
	assert(binarySearch(collection, 5) == 3);
	assert(binarySearch(collection, 6) == 4);
	assert(binarySearch(collection, 7) == 5);
	assert(binarySearch(collection, 8 ) == size_t(-1));
	assert(binarySearch(collection, 9) == 6);
	assert(binarySearch(collection, 10) == 7);
	assert(binarySearch(collection, 11) == size_t(-1));
} // end test



It’s easy to miss it, because there are edge cases, a truncation problem, and multiple termination conditions. I only discovered these cases when I worked through a very small sample set in my head. What worried me about the initial find is that I wasn’t shocked by the result of only 10% of developers getting this right. I think what’s concerning to me is that many people wouldn’t have tested it in isolation, just felt that they had a solution that worked, and moved on. But, then I can’t blame developers for this. I think this is the fault of working in an industry where being thorough isn’t as rewarded as it should be.

The more I work, the more I notice this problem. There is so much focus on short-term goals and results, that it’s easy to cut corners. “You need to finish feature X by COB today.” Well, in order to accomplish this, I notice most people will skip otherwise essential things such as unit testing, and sometimes not even test at all. The design phase is also often skipped, perhaps even more often than testing. When you’re given ambitious or impossible deadlines, it’s natural to skip designing and start coding right away. Worse is that if you do spend time designing and testing the software, you often aren’t rewarded for your efforts; Nobody notices.

Ultimately, what I’m noticing is that a lot of developers remove as much as 50% of the engineering effort, for 80% of the result. While maybe not the 80/20 principle, it’s close enough to feel like it. The problem is that I don’t believe software can follow the 80/20 principle and be effective. That loss of 20% is a lot of bugs… many relatively benign, but I also feel like that remaining 20% also encompasses most of the difficult to reproduce bugs; The bugs we are forced to write off and sometimes never successfully fix. At best, we pull out a roll of duct tape, use enough tape to keep it stable enough to hold together, hoping nobody notices.

As a developer, I can fully admit that when I’ve been pressed with unreasonable schedules, I’ve occasionally responded by cutting corners, just like most people would have handled it. This rarely resulted in a better product. It’s also almost impossible to later go back and rework the foundation when the tower is in danger of toppling over. In other industries, proceeding in this manner would be seen as ludicrous, but in software, it’s accepted. But, it’s unfair to blame developers for this; I blame short-sighted business models and managers who focus mostly on CYOA.

Anyway, coming full circle, I wouldn’t be surprised if there’s a search algorithm in your code base that doesn’t quite work for all cases. If you find a broken search in your code base, let me know and maybe I’ll even post your findings on this website.

RSS Feed

1 Comment for Binary Searches and the 80/20 Principle

Tweets that mention Binary Searches and the 80/20 Principle - The Software Purist -- Topsy.com | May 15, 2010 at 11:04 am

[...] This post was mentioned on Twitter by The Software Purist. The Software Purist said: Binary Searches and the 80/20 Principle http://bit.ly/axLtpF [...]

Leave a comment!

<<

>>

Find it!