|NU Year||Set: 3.(a) Marks: 5 Year: 2008|
inary search trees are used to organize a set of keys for fast access: the tree maintains
the keys in-order so that comparison with the query at any node either results in a
match, or directs us to continue the search in left or right subtree.
• A balanced search tree achieves a worst-case time O(log n) for each key search, but
fails to take advantage of the structure in data.
• For instance, in a search tree for English words, a frequently appearing word such as
“the” may be placed deep in the tree while a rare word such as “machiocolation” may
appear at the root because it is a median word.
• In practice, key searches occur with different frequencies, and an Optimal Binary Search
Tree tries to exploit this non-uniformity of access patterns, and has the following
• The input is a list of keys (words) w1, w2, . . . , wn, along with their access probabilities
p1, p2, . . . , pn. The prob. are known at the start and do not change.
• The interpretation is that word wi will be accessed with relative frequency (fraction
of all searches) pi
. The problem is to arrange the keys in a binary search tree that
minimizes the (expected) total access cost.
• In a binary search tree, accessing a key at depth d incurs search cost d + 1. Therefore,
if the word wi
is placed at depth di
in the tree, the total search cost (the quantity we
want to minimize) is:
pi × (di + 1)