CIKM'09 (abstract)
AS-Index is a new index structure for exact string search in disk
resident databases. It uses hashing, unlike known alternatives, whether
baesd on trees or tries. It typically indexes every n-gram in the
database, though non-dense indexing is possible.
The hash function uses the algebraic signatures of n-grams.
Use of hashing provides for constant index access time for arbitrarily long
patterns, unlike other structures whose search cost is at best
logarithmic.
The storage overhead of AS-Index is basically 500 - 600%, similar to that of
alternatives or smaller.
We show the index structure, our use of algebraic signatures and the search
algorithm. We
present the theoretical and experimental
performance analysis.
We compare the AS-Index to main alternatives. We conclude that
AS-Index is an attractive structure and we indicate directions for future work.
Retourner à la liste des publications.