Abstract
Traditional file structures that provide multikey access to records, for example, inverted files, are extensions of file structures originally designed for single-key access. They manifest various deficiencies in particular for multikey access to highly dynamic files. We study the dynamic aspects of file structures that treat all keys symmetrically, that is, file structures which avoid the distinction between primary and secondary keys. We start from a bitmap approach and treat the problem of file design as one of data compression of a large sparse matrix. This leads to the notions of a grid partition of the search space and of a grid directory, which are the keys to a dynamic file structure called the grid file. This file system adapts gracefully to its contents under insertions and deletions, and thus achieves an upper bound of two disk accesses for single record retrieval; it also handles range queries and partially specified queries efficiently. We discuss in detail the design decisions that led to the grid file, present simulation results of its behavior, and compare it to other multikey access file structures.
- 1 BATORY, D.S. On searching transposed files. ACM Trans. Database Syst. 4, 4 (Dec. 1979), 531- 544. Google ScholarDigital Library
- 2 BENTLEY, J.L. Multidimensional search trees used for associative searching. Commun. ACM 18, 9 (Sept. 1975}, 509-517. Google ScholarDigital Library
- 3 BENTLEY, J.L. Multidimensional binary search trees in database applications. IEEE Trans. Softw. Eng. SE-5, 4 (July 1979), 333-340.Google ScholarDigital Library
- 4 BURKHARD, W.A. Interpolation-based index maintenance. In Proc, A CM Syrup. Principles of Database Systems (1983), 76-89. Google ScholarDigital Library
- 5 CASEY, R.G. Design of tree structures for efficient querying. Commun. ACM 16, 9 (Sept. 1973), 549-556. Google ScholarDigital Library
- 6 FAGIN, R., NIEVERGELT, J., PIPPENGER, N., AND STRONG, H.R. Extendible hashing~a fast access method for dynamic files. ACM Trans. Database Syst. 4, 3 (Sept. 1979), 315-344. Google ScholarDigital Library
- 7 FINKEL, R.A., AND BENTLEY, J.L. Quad trees--a data structure for retrieval on composite keys. Acta Inf. 4 {1974), 1-9.Google Scholar
- 8 GONNET, G.H., AND LARSON, P.A. External hashing with limited internal storage. In Proc. ACM Syrup. Principles of Database Systems (1982), 256-261. Google ScholarDigital Library
- 9 GUETING, H., AND KRIEGEL, H.P. Multidimensional B-tree: an efficient dynamic file structure for exact match queries. Forschungsbericht Nr. 105, Informatik, Univ. Dortmund, Dortmund, West Germany, 1980.Google Scholar
- 10 H{NRICHS, K., AND NIEVERGELT, J. The grid file: a data structure designed to support proximity queries on spatial objects. In Proc. Workshop on Graph Theoretic Concepts in Computer Science (Osnabruck, 1983).Google Scholar
- 11 H{NTERBERGER, H., AND NIEVERGELT, J. Concurrency control in two-level file structures. Working paper, Informatik, ETH Zurich, 1983.Google Scholar
- 12 J0SHI, S.M., SANYAL, S., BANERJEE, S., AND SRIKUMAR, S. Using grid files for a relational database management system. Speech and Digital Systems Group, Tata Institute of Fundamental Research, Homi Bhabha Road, Bombay 400 005, India.Google Scholar
- 13 KASHYAP, R.L., SUBAS, S.K.C., AND YAO, S.B. Analysis of the multiattribute tree database organization. IEEE Trans. Softw. Eng. 2, 6 (Nov. 1977).Google Scholar
- 14 KNUTH, D.E. The Art of Computer Programming. Vo{. 3, Sorting and Searching. Addison- Wesley, Reading, Mass., 1973. Google ScholarDigital Library
- 15 LEE, D.T., AND TONG, C.K. Quintary trees: a file structure for multidimensional database systems. ACM Trans. Database Syst. 5, 3 (Sept. 1980), 339-353. Google ScholarDigital Library
- 16 L}TWXN, W. Linear hashing: a new tool for file and table addressing. In Proc. 6th international Conference on Very Large Data Bases, 1980, pp. 212-223.Google Scholar
- 17 LIOU, J.H., AND YAO, S.S. Multidimensional clustering for database organizations. Inf. Syst. 2 (1977), 187-198.Google ScholarCross Ref
- 18 LUM, V.Y. Multiattribute retrieval with combined indices. Commun. ACM 13, 11 (Nov. 1970), 660-665. Google ScholarDigital Library
- 19 BARNES, M.C., COLLENS, D.S. Storing hierarchic database structures in transposed form. Datafair, 1973.Google Scholar
- 20 MERRETT, T.H. Multidimensional paging for efficient database querying. In Proc. ICMOD (Milano, Italy, June 1978), pp. 277-289.Google Scholar
- 21 MERRETT, T.H., AND OTOO, E.J. Dynamic multipaging: a storage structure for large shared data banks. Rep. SOCS-81-26, McGill Univ., 1981.Google Scholar
- 22 MULLIN, J.K. Retrieval-update speed trade-offs using combined indices. Commun A CM 14, 12 {Dec. 1971), 775-778. Google ScholarDigital Library
- 23 NIEVERGELT, J. Trees as data and file structures. In CAAP '81, Proc. 6th Colloquium on Trees in Algebra and Programming, E. Astesiano and C. Bohm, Eds., Lecture Notes in Computer Science 112, Springer Verlag, 1981, pp. 35-45. Google ScholarDigital Library
- 24 NIEVERGELT, J., HINTERBERGER, H., AND SEVCIK, K.C. The grid file: an adaptable, symmetric multikey file structure. In Trends in Information Processing Systems, Proc. 3rd ECI Conference, A. Duijvestijn and P. Lockemann, Eds., Lecture Notes in Computer Science 123, Springer Verlag, 1981, pp. 236-251. Google ScholarDigital Library
- 25 ORENSTEIN J.A. Multidimensional TRIEs used for associative searching. Inf. Process. Left. 14, 4 (June 1982), 150-157.Google ScholarCross Ref
- 26 PFALTZ, J.L., BERMAN, W.J., AND CAGLEY, E.M. Partial-match retrieval using indexed descriptor files. Commun. ACM 23, 9 (Sept. 1980), 522-528. Google ScholarDigital Library
- 27 RIVEST, R.L. Partial-match retrieval algorithms. SIAM J. Comput. 5, 1 (1976), 19-50.Google ScholarCross Ref
- 28 ROBINSON, J.T. The k-d-B-tree: a search structure for large multidimensional dynamic indexes. In Proc. SIGMOD Conference 1981, ACM, New York, pp. 10-18. Google ScholarDigital Library
- 29 ROTHNIE, J.B., AND LOZANO, T. Attribute-based file organisation in a paged environment. Commun. ACM 17, 2 (Feb. 1974), 63-69. Google ScholarDigital Library
- 30 SCHEUERMANN, P., AND OUKSEL, M. Multidimensional B-trees for associative searching in database systems, Inf. Syst. 7, 2 (1982), 123-137.Google ScholarCross Ref
- 31 TAMMINEN, M. The extendible cell method for closest point problems. BIT 22 (1982), 27-41.Google ScholarCross Ref
- 32 VALLARINO, O. On the use of bit maps for multiple key retrieval. ACM SIGPLAN Notices 11, (Mar. 1976), 108-114.Google ScholarDigital Library
Index Terms
- The Grid File: An Adaptable, Symmetric Multikey File Structure
Recommendations
Gfarm Grid File System
AbstractGfarm Grid file system is a global distributed file system to share data and to support distributed data-intensive computing. It federates local file systems on compute nodes to maximize distributed file I/O bandwidth, and allows to store multiple ...
A multiple-file write scheme for improving write performance of small files in Fast File System
Fast File System (FFS) stores files to disk in separate disk writes, each of which incurs a disk positioning (seek + rotation) limiting the write performance for small files. We propose a new scheme called co-writing to accelerate small file writes in ...
Comments