Data Structures and Algorithms Discussion Board
May 18, 2012, 04:24:29 PM *
Welcome, Guest. Please login or register.
Login with username, password and session length
News: Looking for a reliable webhosting provider? Read HostGator review to find 7 arguments in support of HostGator.
 
   Home   Help Search Login Register  
Pages: [1]
  Print  
Author Topic: complexity of BST union  (Read 4514 times)
anildas
Newbie
*

Rating: 0
Offline Offline

Posts: 3


« on: July 17, 2009, 01:26:39 PM »

Sedgewick's Algorithms in Java book gives a BST join algorithm and claims the complexity is linear.

The algorithm (in pseudo code) is:

join(a, b)
/* Join the trees rooted at a and b and return the root of the join */
{
  if (a == null) return b;
  if (b == null) return a;

  r = root_insert(b, a);
  r.left = join(a.left, r.left);
  r.right = join(a.right, r.right);
  return r;
}

i.e. Insert the root of a into b, using "root insert" method so that it becomes the new root of b.
The left subtree of a will now have nodes that are all less than this new root. So will the new left subtree of b.
Recursively join these two subtrees and set it as the new left subtree of b. Do a similar operation for the right subtree.

root_insert is done by inserting the node into the tree, and then rotating it up to the root, recursively using order preserving rotations.

insert_root(r, n)
/* Insert node n into subtree tree rooted at r and return new root */
{
  if (r == null) return n;
  if (n.key < r.key) {
     r.left = insert_root(r.left, n); r = rotR(r); return r;
  } else {
     r.right = insert_root(r.right, n); r = rotL(r); return r;
  }
}

I won't write out rotR and rotL. They are well known. rotR rotates the tree to the right so that the left child is the new root and so on.

Anyways, for the join algorithm, the book claims "Each node can be the root node on a recursive call at most once, so the total time is linear.". This has me confused. It is true that join will be called on each node only once, but each join call results in an insert_root call, and the insert root has cost proportional to the height of the subtree because of the insertion and the rotations. So wouldn't the overall cost of the join be proportional to the number of nodes times the height of b? And that doesn't seem to be any better than just inserting each node of a into b, one by one.

What am I missing?
Logged
anildas
Newbie
*

Rating: 0
Offline Offline

Posts: 3


« Reply #1 on: July 20, 2009, 09:06:54 AM »

I have some further analysis which suggests that the cost might indeed be linear, if the tree into which insertion is done remains close to balanced through the entire process.

Consider a BST where all root-to-leaf paths are of height h. This tree has 2^(h+1)-1 nodes. Now number the levels, starting from 1 at the leaves, and ending with h+1 at the root. If the cost of some operation at each node is proportional to its level number (i.e. number of nodes in its path to the leaf), then the total cost can be summed as:

Sum[i=1...h+1](i*2^(h+1-i)) =   2^(h+1)*Sum[=1...h+1](i*2^-i).
The Sum factor in the rhs above is the partial sum of a convergant infinite series (1/2 + 2*1/4 +3*1/8+4*1/16.....) which converges to the constant, 2.0. So, the cost is proportional to 2 * 2^(h+1), and since n = 2^(h+1)-1, this is O(n).

Each operation in the join is a root insert of an element from a, into the current tree b. I do think root insert tends to keep the tree in balance, but I have no proof for it, esp if b is not balanced to begin with. I will leave this at that.


Logged
Pages: [1]
  Print  
 
Jump to:  

 
Partners Ads        eharmony