The physical design of your database optimizes performance while ensuring data integrity by avoiding unnecessary data redundancies. The task of building the physical design is a job that truly never ends. As a result, you need to continually monitor the performance and data integrity as time passes. Many factors necessitate periodic refinements to the physical design.

In this article, we will discuss the concept of how physical structures of databases affect performance, including specific examples, guidelines, as well as best and worst practices.

Physical Database Design Process

Physical database design is the process of transforming a data model into the physical data structure of a particular database management system (DBMS). Normally, Physical Design is accomplished in multiple steps, which include expanding a business model into a fully attributed model (FAM) and then transforming the fully attributed model into a physical design model.

Conceptual, Logical, and Physical Data Models

You begin with a summary-level business data model that’s most often used on strategic data projects. It typically describes an entire enterprise, which allows you to understand at a high level the different entities in your data and how they relate to one another. Due to its highly abstract nature, it may be referred to as a conceptual model.

Common characteristics of a conceptual data model:

  • A conceptual data model identifies important entities and the high-level relationships among them. This means no attribute or primary key is specified. Moreover, complexity increases as you expand from a conceptual data model.

A logical data model, otherwise known as a fully attributed data model, allows you to understand the details of your data without worrying about how the data will be implemented in the database. Additionally, a logical data model will normally be derived from and or linked back to objects in a conceptual data model. It is independent of DBMS, technology, data storage or organizational constraints.

Common characteristics of a logical data model:

  • Unlike the conceptual model, a logical model includes all entities and relationships among them. Additionally, all attributes, the primary key, and foreign keys (keys identifying the relationship between different entities) are specified. As a result, normalization occurs at this level. 

The steps for designing the logical data model are as follows:

  • First, specify primary keys for all entities.
  • Then find the relationships between different entities.
  • Find all attributes for each entity.
  • Lastly, resolve many-to-many relationships.

Finally, the physical data model will show you exactly how to implement your data model in the database of choice. This model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables. Correspondingly, the target implementation technology may be a relational DBMS, an XML document, a spreadsheet, or any other data implementation option.

Common characteristics of a physical data model:

  • Describes data requirements for a single project or application.
  • Specifies all tables and columns.
  • Contains foreign keys used to identify relationships between tables.
  • Physical considerations may cause the physical data model to be different from the logical data model.

The physical design is where you translate schemas into actual database structures. After that, you transform the entities into tables, instances into rows, and attributes into columns. At this time, you have to map:

  • Entity to Table
  • Attribute to Column
  • Primary Key and Alternate Key to Unique Index
  • Index to Non-unique Index
  • Foreign Keys to Non-unique Index

Transformation

  • Choosing a physical data structure for the data constructs in the data model.
  • Optionally choosing DBMS options for the existence constraints in the data model.
  • Does not change the business meaning of the data.
  • First transformation of a data model should be a “one-to-one” transformation.
  • Should not denormalize the data unless required for performance.
  • Based on shop design standards and DBA experience & biases.
  • Entity Subsetting – Choosing to transform only a subset of the attributes in an entity.
  • Dependent Encasement – Collapsing a dependent entity into its parent to form a repeating group of attributes or collapsing a dependent entity into its parent to form a new set of attributes.
  • Category Encasement
  • Category Discriminator Collapse
  • Horizontal Split
  • Vertical Split
  • Data Migration
  • Synthetic Keys – The merging of similar entities using a made-up key. Always loses business vocabulary and is never more than BCNF. Also used a lot in packages to make them easily extendable.
  • Adding Summaries – Adding new entities that are each a summary of data upon a single level of a dimension.
  • Adding Dimensions

Moreover, it is often necessary to apply multiple transforms to a single entity to get the desired physical performance characteristics. All physical design transformations are compromises.

Database Definition Language (DDL)

Except for data cubes, a one-to-one translation of the Physical Model into the specific DDL of the database:

  • Entity to Table
  • Attribute to Column
  • Primary Key and Alternate Key to Unique Index
  • Index to Non-unique Index
  • Foreign Keys to Non-unique Index 

Data Cubes

Data cubes are the exception to the DDL one-to-one translation because:

  • Proprietary physical databases
  • Have their own structure definition language
  • Have their own data loaders

Physical Database Design: Watson-Watt’s Law of Third Best

  • The best never comes.
  • Second Best takes too long.
  • Identify the Third Best – the design that can be validated in time to meet an identified.

In conclusion, physical database design is no easy feat. If you feel intimidated by it, you’re not the only one. Learn more about Relational Junction for data warehousing and integration or contact a member of our team for more information!