A Short History of File Structure Design (Cont.)
1970s (Hashing)
B-trees can guarantee that one file entry among millions of others can be found with only three or four trips to the disk.
An approach called hashing can reduce the number of disk accesses to one.
Hashing is a scheme for providing rapid access to data items which are distinguished by some key.
It functions as follows:
- Each data item to be stored is associated with a key.
- A hash function is applied to the item's key.
- The resulting hash value is used as an index to select one of a number of “hash buckets” in a hash table.
- The table contains pointers to the original items.
Summary
The key problem addressed throughout this evolution has been
finding ways to minimize disk accesses for files that keep changing in content and size.
Tracking these developments takes us first through work on sequential file access, then through developments in tree-structured access, and finally to relatively recent work on direct access to information in files.
Though B-tree is probably the most popular file structure, none of the file structures dominates the applications.