DataModel1.gif (5854 bytes) Data Model Reference

 


About Look-up Tables

Purpose

The purpose of this whitepaper is to discuss and describe concepts and techniques related to Look-up Tables, also called "domain tables", "code tables", etc. The goal is for the reader to understand when and how to use look-ups, and how best to construct them. The reader will also be challenged to think critically about the business situation before adopting any given strategy for handling look-up tables.

Why Look-up Tables?

In any database application, there will be situations where the user will be required to enter a value from a list of valid values, rather than just enter text strings or numbers. For instance, a common data entry field to encounter is "State", as in Iowa or Nebraska. If the field on the screen is just called "State", and is left as free-form text, the user will be unclear what to enter. Should they type out the name or abbreviation. Data quality issues result as some people either misspell state names, or enter the wrong abbreviation for the state (is MI for Michigan or Mississippi?). There are only so many valid values to enter, so presenting the user with a list is much more efficient for the user, and ensures better data in the database.

Where to Put Them?

Now the question is where to store the list of valid values? The programmer could use an array in code to store the list, or maybe a text file on the local hard drive. For absolutely static domains this might just work, but what about slowly changing domains, such as Country? New countries form periodically, and others split or change names. If the list of countries is in an array, then the programmer would have to edit the code and redeploy the application every time a change was made. If the local file is used, the file will have to be pushed out to the users. A solution to that may be to put the look-up file on the network. Now the issue is how many people can open the file at the same time. Files accessed by many people at the same time over a network are big red flags to any system administrator, and an opportunity to use the inherent strengths of the database engine.

Is it Really a Look-up Table?

So, the list of values should be in the database, as a look-up table. Of course that is simplistic. There are many lists of valid values. Some have only 2 or 3 values, while others have hundreds. Some are simply a list of names, while others have a lot of other information about each value. Arguably, when a look-up table has additional metadata other than the name, it is no longer truly a look-up, but rather an ordinary table like any other. As a matter of fact, the "State" table is rarely just a list of states, but includes foreign country provinces and has links to a "Country" table and possibly other ISO specific information. For international databases, this table can be complex indeed. Considering the risk of assuming simplicity, the data modeler should be very careful in designing these tables.

Consolidate or Not?

Now, after the look-up tables have been split off from the more complex tables, the temptation is to consolidate them into a single abstract look-up table with a "Code Type" idea to segregate the values, as shown in the picture:
Code Table
For any program to use the Code Table, it must know which code type to use for each list of values. There is no direct referential tie, because most DBMS will not allow PK/FK links with a static value. This makes it difficult to manage the list of valid values if it needs to change, since there is no easy way to know exactly where it is used. The best way to model the look-up tables it to keep them distinct through the modeling process, treating each as unique types of business information. In the physical model, if there is absolutely no other choice but to consolidate, then do so at that time, but keep the logical and conceptual models "pure", as with all other denormalizations.

Best Practices

As domains of values come out of business requirements discussions, convert them from the initial attribute to an entity. For instance, the business originally said they need a "Product Type" field. Initially, this is just an attribute. In later discussions, pin them down on exactly what types of products there are, and how much other information they may need about the product types. Promote the "Product Type" attribute to an entity, with a relationship back to the original entity it started in. Add the Product Type attributes (Code, Owner, Name, Description, Effective Date, Expiration Date, etc, are common things the business will come out with for these look-ups. Some shops will give the entity a distinct name, such as "Product Type Domain". This makes it easier to communicate with IT the idea that these tables are look-ups rather than transactional tables. Having a consistent structure also makes it easier to manage look-up tables.

Disclaimer and Notes

Your mileage may vary. In other words, if you disagree, great. Please provide your feedback to the webmaster regarding the information I have provided. I am always wanting to learn, and heck, I might very well have left out some very important details, considering I wrote this whole thing in about an hour off the top of my head.


Visitors on this page: [an error occurred while processing this directive]
Last Updated: [an error occurred while processing this directive]