There are usually many ways to represent any given data, ranging from fully normalized to fully denormalized and anything in between. In a normalized database, each fact is represented once and only once. Conversely, in a denormalized database, information is duplicated, or stored in multiple places.
Pros and Cons of a Normalized Schema
Normalized Schema works well for the following reasons:
- Normalized updates are usually faster than denormalized updates.
- When the data is well normalized, there's little or no duplicated data, so there's less data to change.
- Normalized tables are usually smaller, so they fit better in memory and perform better.
- The lack of redundant data means there's less need for DISTINCT or GROUP BY queries when retrieving lists of values.
The drawbacks of it usually have to do with retrieval. Any nontrivial query on a well-normalized schema will probably require at least one join, and perhaps several. This is not only expensive, but it can make some indexing strategies impossible.
Pros and Cons of a Denormalized Schema
Denormalized Schema works well because everything is in the same table, which avoids joins. If you don't need to join tables, the worst case for most queries—even the ones that don't use indexes—is a full table scan. This can be much faster than a join when the data doesn't fit in memory, because it avoids random I/O. A single table can also allow more efficient indexing strategies.
Mixture of Normalized and Denormalized
Both normalized and denormalized schemas have benefits and drawbacks, how can we choose the best design? The most common way to denormalize data is to duplicate, or cache, selected columns from one table in another table. In newer MySQL,we can use triggers to update the cached values, which makes the implementation easier. You must consider how frequently you'll have to make changes (change in multiple places when using normalized schema) and how long they will take, compared to how often you'll run the SELECT query.