Monday, January 31, 2011

Binary Trees Revisited

In a previous post, I said that my array-ed BT would only work on a balanced tree. I realize now this is false. It works on any type of tree, there just needs to be checking to make sure a 'node' exists there first.

Also, I've been thinking about how to make a binary tree balance itself. I know that there are probably plenty of data structures and other things already in existence,  but why ruin the fun when I can just think about it first?

So this is how I think it would work:
In the case of a completely skewed tree, you grab the 'middle' node, then from there, you assign that as the root. from the nodes that were higher than that node, you grab the middle of that, and from the lower nodes, the middle of that, those are it's right and left nodes respectively.

In other words, recursively grab the middle of the tree. If you had this:
5
6       this is the left of the middle node
7
8        this is the middle node.
9
10       this is the right of the middle node
11

can you picture it? It's not too hard to code either. The matter is, that that example was a completely skewed tree. But what about if you have a more normal tree? like:
     5
  2     8
1  3      9
               10

I really hope that formats correctly... Anyway, OBVIOUSLY you grab the 9 and throw it where the 8 is and you get a perfectly balanced tree. But what exactly is the thought process?

As a matter of fact, it's the same thing we were doing before, just part of our tree is already balanced, and if you take a look, our tree contains a high value of 10, half of that is 5, our root! In other thoughts, that means that if our tree already has the proper root we shouldn't bother changing it, but continue onward and examine the branches of that root. You'll notice that once again, the left sub tree of our root is balanced already, so no need to check it, but the right side is skewed. In fact, if you ignore the root and left sub tree, you'd just have the skewed tree of 8,9,10. and if you ran our algorithm on that? Well, there are 3 nodes in our tree, and it so happens, that the middle one of those is 9 and to balance our tree? We need that 9 to be the root. Tada.

Now you may have noticed a bit of... entropy creeping into my arguments for example, when we look at our middle, I've used both the value of the node, and the node themselves. So which should our algorithm take on? Obviously, the value side of this. So in that case, how do we select 9? Well, we can look at tree depth? But once again that's more of our nodes really. 10 + 8 divided by 2 gets us it, but is this just a happy coincidence?

If we look at our left tree we see it's already balanced, but what if it were like our right side? it'd go 3 2 1. The last node, and the first node added together and then divided by two? get's us 2. Which happens to be the value we need to balance! Seems nice right?

What if you don't have the right value? How do you find the closest? If two values are the same distance away, then how do you choose? I'm going to think about these more and get back to you. Maybe even with some code to show!