Studies in Applied Data Structures

Research output: ThesisDoctoral Thesis (compilation)

Abstract

The design of efficient data structures is of primary importance in creation of theoretical algorithms as well their more tangible descendants, computer programs. In this dissertation we study computational aspects of data structures and their respective algorithms from a theoretical viewpoint, which are however of direct importance in the implementation of solutions for real-world problems. We present results for the following problems:

In tolerancing, the Out-Of-Roundness factor determines the relative circularity of planar shapes. We show that the Minimum Radial Separation algorithm given by Le and Lee runs in Theta(n^2) time even for convex polygons. Furthermore, we present an optimal O(n) time algorithm to compute the Minimum Radial Separation of convex polygons, which represents not only a factor n improvement over the previously best known algorithm, but also a factor of log n improvement over Le and Lee's conjectured complexity for the problem.

We consider the general problem of (2-dimensional) range reporting allowing arbitrarily convex queries. We show that using a traditional approach, a polylogarithmic query time can not be achieved unless more than linear space is used. Our arguments are based on a new non-trivial lower bound in a new model of computation, Layered Partitions, which can be used to describe all known algorithms for processing range queries, as well as many other data structures used to represent multi-dimensional data. We show that Omega((log n)/(log T(n))) partitions must be used to allow queries in O(T(n) + k) time, for n total and k reported elements, and for any growing function T(n).

We discuss an intrinsic generalization of the suffix tree, designed to index a string of length n which has a natural partitioning into m multi-character substrings or words. This word suffix tree} represents only the m suffixes that start at word boundaries. Since traditional suffix tree construction algorithms rely heavily on the fact that all suffixes are inserted, construction of a word suffix tree is nontrivial, in particular when only O(m) construction space is allowed. We solve this problem, presenting an algorithm with O(n) expected running time. In applications that require strict node ordering, an additional cost of sorting O(m') characters arises, where m' is the number of distinct words. In either case, this is a significant improvement over previously known solutions. Furthermore, when the alphabet is small, we may assume that the n characters in the input string occupy o(n) machine words. We illustrate that this can allow a word suffix tree to be built in sublinear time.

We propose a new data structure for storing sparse matrices which are too large to fit entirely within main memory. This data structure is optimized to use the computer's page size and is arranged in order to be able to efficiently handle random access and updates useful for a wide range of matrix operations. We also present several variations on an ancillary structure which greatly decreases the probability of unnecessary page faults when accessing the structure, even when the size of main memory is extremely limited. We assert that these data structures are easy to implement and provide very good results in practice.

Details

Authors
  • Kurt Swanson
Organisations
Research areas and keywords

Subject classification (UKÄ) – MANDATORY

  • Computer Science

Keywords

  • Systems engineering, Sparse Matrix, Suffix Tree, Roundness, Range searching, Data- och systemvetenskap, computer technology
Original languageEnglish
QualificationDoctor
Awarding Institution
Supervisors/Assistant supervisor
  • [unknown], [unknown], Supervisor, External person
Award date1998 Oct 8
Publisher
  • Department of Computer Science, Lund University
Print ISBNs91-628-3155-0
Publication statusPublished - 1998
Publication categoryResearch

Bibliographic note

Defence details Date: 1998-10-08 Time: 10:15 Place: E:1406 External reviewer(s) Name: Boyar, Joan Title: Dr Affiliation: Odense --- Article: Kurt Swanson, Der-Tsai Lee, and Vanban L. Wu.An optimal algorithm for roundness determination on convex polygons.Computational Geometry: Theory and Applications, 5(4):225-235, 1995. Article: Arne Andersson and Kurt Swanson.On the difficulty of range searching.Computational Geometry: Theory and Applications, 8(3):115-122, 1997. Article: Arne Andersson, N. Jesper Nilsson, and Kurt Swanson.Suffix trees on words.Algorithmica. To appear. Article: Arne Andersson and Kurt Swanson.Efficient representation of sparse matrices in secondary memory.Submitted.