CHum 240

SQL Keys and Indexes

Is the plural 'indexes' or 'indicies'? I've heard both, and
according to the dictionary, the latter is mostly used by us
technical types.

A key is a significant bit of information, while an index is
a device for finding information.


Keys

Each row in a table should refer to one and only one
subject. At least one, but often two or more, column should
identify the subject of the row. These columns are called
'key' columns. (The remainder of the columns describe other
non-key attributes of the subject.)

In order to join the rows multiple tables together, at least
one column in those rows must have some fixed relation to
one another. Often this relation is equality, and our join
statement combines the rows by having the equality test in
the WHERE clause:

  select * from a, b where a.col = b.col;

But the relation can be computed as well:

  select * from a, b where a.col * 2 <= b.col;

A key column in one table refers to, or points to, a key
column in another table.


Indexes

In a printed book, an index contains keywords and page
numbers to help readers to quickly find the information they
are looking for. To find information in a unindexed book
that one has not read before, one must read or scan it from
the beginning to find the passage of interest.

Likewise, a database table without an index results in what
is called a 'table scan'. the DBMS examines every row in the
table to find the information requested. This can take a
very, very long time if there are many rows and the matching
criteria are complex.

Because a key column in one table refers to a key column in
another table, keys and indexes are often used together.


Creating indexes

To speed things up, we instruct the DBMS to create indexes.
When we search for things the DBMS will automatically use
the indexes that will help make the search more efficient.

We, as database creators, decide what needs to be indexed
and what does not. We try to discriminate between columns
holding values we are likely to search for which columns do
not. Creating indexes that will never be used is a waste of
space.

When we search for data, the DBMS looks at the indexes we
have created and decides which ones will help find the
desired information quickly and efficiently. Creating
indexes that will never be used wastes the DBMSs time.

The strength of the algorithms for choosing indexes is one
way DBMS vendors distinquish themselves from one another.


Types of Indexes

There are several types of indexes depending on the nature
of the data being indexed. Each has advantages and 
disadvantages, but choosing the type of index to use is
straight forward.

A database index is often just another table, and to create 
one we use either the 'CREATE INDEX...' statement, or in the 
'CREATE TABLE...' statement.


Indexes and Composite Indexes

An index is created by using the 'CREATE INDEX' statement:

  create index mediumind on art(medium);

Because an index is a table, the name of the index must be
unique. A common idiom is to append a common suffix to all
indexes you create. For example, you might add the letter
'i' or the letters 'ind' or 'inx' to the index name. It is
also a good idea to use a name that gives a hint as to which
column is being indexed.
  
An index can include more than one column:

  create index pplbdidx on people (place, birth);
  
An index on multiple columns will be used only when all of
the columns appear in the WHERE clause. The order of the
columns in the index column list may or may not be
significant, depending on how smart the DBMS is.

Remember, the DBMS decides which indexes will be used, and
the only thing you can do is create indexes you hope will
be helpful.

To get rid of an index, use the 'DROP INDEX' statement:

  drop index mediumind;


Unique Indexes

A UNIQUE index requires that no two rows in the index have
the same value. Its advantage is that it can be used to
locate the desired data more quickly than a simple index
because for a given key, only one row needs to be found.

  create unique index empssnidx on emp (ssn);


Primary Indexes

A table may have only one primary index, which is limited to
one column, and the values for each row must be distinct (or
unique) for the entire table.

In some DBMSs the primary key determines how the data is
stored, which makes searching using the primary key as
efficient as possible. When this is done, the layout of data
in the table itself serves as the index, and no index table
is required, thus saving space and time.

If a table has a primary index, it must be specified when
the table is created as part of the 'CREATE TABLE'
statement:

  create table emp( ssn int primary key, ln varchar(25) );

It is often possible to instruct the DBMS to automatically
generate a unique value for the primary key of each row,
although it is a non-standard feature, and so, the syntax
varies across DBMSs.

  create table t( id integer primary key autoincrement, ...
  create table t( id integer primary key auto_increment, ...

SQLite requires that autoincrement columns be of type
INTEGER, the native numeric type. Other DBMS may have other
limitations or requirements.

Automatically assigning a key value to each row saves you a
lot of time and worry otherwise spent in trying to make sure
the key values are unique. But, because the values are
unrelated to the subject of the row, overuse of
automatically generated primary keys can lead to situations
where you are forced to deal with large quantities of
meaningless numbers when you could be using something
meaningful instead.

Avoid using automatically generated primary keys unless
there is no other way to assign a unique, meaningful value
to a row.


NULL and Indexes

The NULL (or unknown) value can be indexed just like any
other value. Because every NULL value is different from
every other NULL value, primary and unique indexes can
contain NULLs.


Table Creation and Indexing

Some DBMSs expand the create table statement to allow
indexes to be specified at table creation time.
  
  create table u (a int, b char(2), index ua (a));

This just makes documenting the keys and indexes a little
simpler.

- - - -

Copyright ©2007 Brigham Young University