This is a simple guide on knowing how to write a binary search in a non-duplicate array. It could improve your code performance by a small to big amount depending on the size of the list you’re keeping track of and on whether you are adding and removing to the list often. It could also be very handy in interview questions.
I find myself often looking at code where the usual technique to add items to an array is just appending at the end. Whenever you need to remove items from the list you do a quick
1 |
myArray.splice(myArray.indexOf(item), 1); |
or even worse
1 |
jQuery.inArray(item); |
(and do not get me started about why I do not like jQuery)
Now let us examine the time complexity here. Insertion takes O(1) while removing an item takes O(n) (In case you did not know, indexOf() takes O(n)!!). Can we do better? Of course we can. Let’s try to keep a sorted array. The thing that I like best with using Binary Search, is that you can use the same piece of code for both insertion and deletion.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
function binarySearchAddRemove(myArray, item, isRemove) { var minIdx = 0; var maxIdx = myArray.length - 1; var currIdx = ((minIdx + maxIdx) / 2) | 0; // Same as Math.floor but faster. var currElement; var itemFound = false; while ((minIdx < maxIdx) && !itemFound ) { currIdx = ((minIdx + maxIdx) / 2) | 0; currElement = myArray[currIdx]; if (currElement < item) { minIdx = currIdx + 1; } else if (currElement > item) { maxIdx = currIdx - 1; } else { itemFound = true; } } //removing and adding cases if (isRemove && itemFound) { myArray.splice(currIdx, 1); } else if (!isRemove && (currElement > item)) { myArray.splice(currIdx, 0, item); } else if (!isRemove) { myArray.splice(currIdx + 1, 0, item); } } |
Code is pretty straight forward. The only thing you might find confusing is the | 0. As explained in the comment that does the same operation as Math.floor would. How is that you ask? Well | is the simple bitwise or operation. Bitwise operators convert numbers to a 32-bit sequence. So for any positive signed numbers (which is what we would have for indexes of an array), the conversion will drop the decimal part.
Now let’s do a little analysis, which would be very positive in an interview. When do we want to use our binarySearchAddRemove? The time complexity for adding and/or removing is O(log(n)). Hence in any case when we expect adding and removing items from a list we will be better off using binarySearchAddRemove. On the other hand if you were only going to add items to the list (and the possibly cleared it), you will be better off using the simple appending at the end of the list discussed at the beginning.
Edit: Since I was asked about it, below you’ll find a quick Java adaptation of the code. Enjoy 🙂
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
public void binarySearchAddRemove(ArrayList<Integer> myArray, Integer item, boolean isRemove) { int minIdx = 0; int maxIdx = myArray.size() - 1; int currIdx = (minIdx + maxIdx) / 2; int currElement = myArray.get(currIdx);; boolean itemFound = false; while ((minIdx < maxIdx) && !itemFound ) { currIdx = (minIdx + maxIdx) / 2; currElement = myArray.get(currIdx); if (currElement < item) { minIdx = currIdx + 1; } else if (currElement > item) { maxIdx = currIdx - 1; } else { itemFound = true; } } //removing and adding cases if (isRemove && itemFound) { myArray.remove(currIdx); } else if (!isRemove && (currElement > item)) { myArray.add(currIdx, item); } else if (!isRemove) { myArray.add(currIdx + 1, item); } } |
Edit2: New version of the code ;D with less checks (as per prof. Gries comment)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
public void binarySearchAddRemove(ArrayList<Integer> myArray, Integer item, boolean isRemove) { int minIdx = 0; int maxIdx = myArray.size() - 1; while (minIdx < maxIdx) { int currIdx = (minIdx + maxIdx) / 2; if (myArray.get(currIdx) < item) { minIdx = currIdx + 1; } else { maxIdx = currIdx; } } // covering case where list is empty, you don't wanna run out of bounds do you? boolean itemFound = ((maxIdx != -1) && (myArray.get(maxIdx) == item)); // removing and adding cases if (itemFound && isRemove) { myArray.remove(maxIdx); } else if (!isRemove && (myArray.get(maxIdx) > item)) { myArray.add(currIdx, item); } else if (!isRemove) { myArray.add(currIdx + 1, item); } |
Final thoughts and things to note: Our code now runs a check at the end of the loop. This will decrease our best case performance (since now we have to go through all the loop all the time / cannot exit when we find the element), but will increase our performance for the worst case scenario, since we are removing 2 checks from the loop>
Other than that we have to be mindful about the 2 edge cases:
1. List is empty
2. Item we’re looking for is greater than everything in the list so it need to be appended at the end.
Finally with this final rewriting, we support a duplicate array too, where we will find the leftmost occurrence of our item :).
Latest posts by Ajk (see all)
- Find Median from Numbers Array - January 1, 2018
- Find Whether a Number Can Be the Sum of Two Squares - August 12, 2017
- How to Find Two Primes Whose Sum is Equal to a Number N - April 22, 2017
Nice tip!
Here’s a lovely interview-style followup: in your while loop, you test && !itemFound on every iteration. It of course evaluates to true at most once. The same thing goes for your if/else statement: the second clause, currElement > item, will only be false when/if you find the item.
One consequence of this style of binary search is that, if there are duplicate items, you don’t know which one you found. It’s fairly straightforward to rewrite the loop so that it finds the index of the rightmost item <= the item you're looking for. Your code gets shorter as a result!
That's the first challenge: simplify the boolean expression in your while condition, and also simplify your if/else/else so that it doesn't need the extra expressions. You'll have less code in your loop, although you'll need to add a statement after the loop to set itemFound.
The second challenge is a profiling one: is this new approach faster? Always? What if the list is enormous so that swapping occurs — do you get a benefit if you exit early?
Professor Gries! It’s an honor to have you comment on my writing :).
Very nice follow up question. I wrote the code for an out of the loop condition check which will increase our performance on the worst case (by removing the extra check(s) in every iteration), but will decrease it on the best case (where we will have to do the full log(n) runs of the loop and not immediately exit).
I would assume that with gigantic lists we would like to exit the loop as early as possible, since the paging costs of running the loop until the end on best and average cases, will outweigh “a few extra checks per iteration” on the worst case scenario, therefore giving us a better average case complexity and a tiny bit worse worst-case. ;D
[…] off try to keep the array sorted. This way we will only have to do a Binary Search whenever we check for an item in the list. This will significantly reduce our time […]