XOR using Tries - Part 1

XOR using Tries - Part 1

Trie

Trie can store information about keys/numbers/strings compactly in a tree.

Tries consists of nodes, where each node stores a character/bit. We can insert new strings/numbers accordingly.

Storing numbers in trie

We can store numbers in trie using binary representation.
If we have 4 numbers - 1, 4, 5, 7.

We first write binary representations of the same, which are:

001
100
101
111

And then, for every node, 0 goes on the left hand side and 1 goes on the right hand side.

Lets insert 1, 0001

  1. We start of with epsilon (E), which is root.

  2. Then first bit is 0. So, go to left of root, create a node.

  3. Next bit is 0, go to left of this node and create a node.

  4. Next bit is 0, go to left of this node and create a node.

  5. Next bit is 1, go to right of this node and create a node.

  6. Similarly for other numbers.

While traversing the trie, if we already have a node while inserting 0 or 1, simply go to that node and move on to the next bit.

The number of leaves of the tree are the number of integers we are storing in the trie.

XOR

Exclusive or is a logical operation that outputs true only when inputs differ.

Problem

Given an array, find two numbers whose XOR is maximum in the array.

Simple solution

Run two loops and for every pair combination, calculate the xor. If the xor is greater than max_xor_so_far, it replaces the value in the max_xor_so_far.

Algorithm

A[n] = [a1, a2, a3 …. an]
Max_so_far = -1
For i=1 to n
For j=1 to i
If a[i] xor a[j] > max_so_far
Max_so_far = a[i] xor a[j]
Print max_so_far

But, as we see, this has 2 loops and so takes O(N2).

Optimised solution

Now, we use the properties of xor, bit manipulation and a data structure, trie to optimise this solution.

Let us try exploring the properties of xor.

Now, basic properties:

0 xor 0 = 0
0 xor 1 = 1
1 xor 0 = 1
1 xor 1 = 0

Now, forget about getting maximum xor pair from the array.

What is maximum N-bit number we can get when we xor an N-bit number with another N-bit number?

The answer is simple, it is all 1s.

Or

XXXXXXX xor YYYYYYY = 1111111

So, basically if I have a number say 19, I can represent it in binary form as 10011.

If I wish to xor it with some other and get 11111 (all 1s), what do I xor it with? 🤔

Here is where we use the basic properties.

Let’s call that number X.

So,

19 xor X = 31

Or

10011 xor X = 11111

From left hand side to right hand side,

1st bit of X should be 0, since, 1 xor 0 = 1
2nd bit of X should be 1, since 0 xor 1 = 1

Likewise...

The number should be 01100

So, basically every bit of X should be opposite of every bit of 19 to get 31.

10011
01100
--------
11111

At this stage, we can have an intuition for our original problem. So, basically, for each number in the array, I need to look for some other number in the array whose bits are opposite of this number. Pretty simple right?

But, how do we ensure we have a number with opposite bits "at all places" for each number in the array?

Here’s the interesting part, we, sort of, compromise at places where we are unable to get the opposite bit, so, we take the same bit and move on to the next place.

What we mean by this will be much clearer through the below example.

Suppose, I have an array: 9, 3, 10, 1, 13.

We have already learnt how to insert numbers in a trie, so using that, insert 9 into the trie.

Now, consider number 3.

3 = 0011

So, we need 1100 to make 1111 (maximum possible), that is 0011 xor 1100 = 1111

We have 1, go to it.

Next, we need 1 but don’t have it, so we compromise and take 0 and go to it.

Next, we need 0, so go to it.

Next, we need 0, but don’t have it so we compromise again and take 1 and go to it.

Finally, we get 1001, which is 9.

So, before 3, 9 is the number that will fetch the maximum xor for 3, that is, 3 xor 9 = 10.

This is max_so_far = 10.

Now, insert 3 into the trie.

Now, consider number 10.

10 = 1010

So, we need 0101 to make 1111.

We have 0, so go to it.
We don't have 1, so take 0 and go to it.
We don't have 0, so take 1 and go to it.
We have 1, so take it.

So, finally we have 0011 which is 3 as the number in the array before 10, which gives us the maximum possible xor for 10 (considering numbers appearing before 10 in the array), that is, 9.

As max_so_far = 10 > 9, our max_so_far stays.

Insert 10 into the trie.

Now, consider 1.

1 = 0001

So, we need 1110 to make it 1111.

We have 1, so take it.
We don't have 1 so, take 0.
We have 1 so take it.
We have 0 so take it.

So, we get 1010 which is 10, so for 1, 10 gives us maximum possible xor (considering numbers appearing before 1 in the array), that is, 11.

Max_so_far = 10 < 11, so max_so_far = 11

Insert 1 into the trie.

Now, consider 13.
13 = 1101
So, we need 0010 to make 1111

We have 0 so take it
We have 0 so take it
We have 1 so take it
We don't have 0, so take 1

So, we get 0011, which is 3, so for 13, 3 gives us maximum possible xor, that is, 14.

Max_so_far = 14 > 11, so max_so_far = 14

Insert 13 into the trie.

Now, this is the complete trie with all the numbers.

So, finally the maximum xor we got from the array for a pair of numbers is 14.

Now, let’s see it’s time complexity.

Well, we are inserting N numbers into trie.
Each insertion takes log(MAX) operations.

What is MAX now?

MAX is the number which is maximum possible for given number of bits or say (all 1s).

How to choose it, well, choose the maximum number from the array, 13 in the above case, choose the next power of 2 minus 1, which is 16-1, 15, so, 1111, so 4 bits. So we need a trie where we represent every number in the array as 4 bits.

Next the traversal in the trie takes log(MAX) operations too. For N numbers we do N traversals, so, totally for insertions and traversals, thats 2*N*log(MAX) which is, O(N log(MAX)).

Neat! We have drastically reduced it from O(N2) to O(N log(MAX)).

Algorithm

A[N] = [a1, a2, a3 ..... an]
insert_in_trie(a1)
for i = 2 to N
max_xor_so_far_for_ai = traverse_in_trie_to_get_max_xor(ai)
if max_xor_so_far_for_ai xor ai > max_so_far
max_so_far = max_xor_so_far_for_ai xor ai
insert_in_trie(ai)
print max_so_far

We will explore another similar use case in another problem.