Next: The design space
Up: Data Structures for
Previous: The piece table
The discussions in the previous two sections suggest that
it is possible to characterize all sequence data structures
given the following assumptions:
- The computer has main memory with sequentially addressed cells
and has disk memory with sequentially addressed blocks
of sequentially addressed cells.
- Items have a fixed size in memory and on disk.
- Items are stored directly in the memory or on disk, that is,
they are not encoded.
Hence every item must exist somewhere in memory or on disk.
- The main memory is of limited size and hence cannot hold all
the items in large sequences.
- The environment provides dynamic memory allocation
(although some sequence data structures will do their own
and not use the environment's dynamic memory allocation).
- The environment provides a reasonably efficient file system
for item storage that provides files of
(for all practical purposes) unlimited size.
The following concepts are used in the model.
- An item is the basic component of our sequences.
In most cases it will be a character or byte
(but it might be a descriptor
in a recursive sequence data structure).
- A sequence is an ordered set of items.
During editing, items will be inserted into
and deleted from the sequence.
The items in the sequence are logically contiguous.
- A buffer is some contiguous space in main memory
or on disk that can contain items (Assumption 4).
All items in the sequence are kept in buffers (Assumption 3).
Consecutive items in a buffer are physically contiguous.
- A span is one or more items that are logically
contiguous in the sequence and are also physically
contiguous in the buffer.
(Assumption 1)
- A descriptor is a data structure that represents a span.
Usually the descriptor contains a pointer to the span
but it is also possible for the descriptor to
contain the buffer that contains the span.
A sequence data structure is either
- A basic sequence data structure which is one of:
- An array.
- An array with a gap.
- A linked list of items.
- A more complex linked structure of items.
- A recursive sequence data structure which comprises:
- Zero or more buffers each of which contains zero or more spans.
- A (recursive) sequence data structure
of zero or more descriptors to spans in the buffers.
This model is recursive in that to implement a sequence of
items it is necessary to implement a sequence of descriptors.
This recursion is usually only one step, that is, the sequence
of descriptors in implemented with a basic sequence data structure.
The deficiencies of the basic sequence data structures
for implementing character
sequences are less critical for sequences of descriptors
since there are usually far fewer descriptors and so sophisticated
methods are not required.
Next: The design space
Up: Data Structures for
Previous: The piece table
Charlie Crowley
Thu Jun 27 15:36:10 MDT 1996