Database Normalization: Boyce-Codd Normal Form

Table containing data about the student, course, and instructor relationship
Using student and course as the key
More data anomalies
Student Instructor table after removing Course
Resulting Instructor table
Using student and instructor as the key
Removing course
Creating a new table with course

This article follows on from Database Normalization: 3rd Normal Form

E.F. Codd and R.F. Boyce, two of the people instrumental in the development of the database model, have been honored by the name of this normal form. E.F. Codd developed and expanded the relational model, and also developed normalization for relational models in 1970, while R.F. Boyce was one of the creators of Structured Query Language (then called SEQUEL).

In spite of some resources stating the contrary, Boyce-Codd normal form is not the same as 4th normal form. Let's look at an example of data anomalies, which are presented in 3rd normal form and solved by transforming into Boyce-Codd normal form, before defining it.

Table containing data about the student, course, and instructor relationship

Student Course Instructor table
Student
Course
Instructor

Assume that the following is true for the table above:

Each instructor takes only one course
Each course can have one or more instructors
Each student only has one instructor per course
Each student can take one or more courses

What would the key be? None of the fields on their own would be sufficient to uniquely identify a records, so you have to use two fields. Which two should you use?

Perhaps student and instructor seem like the best choice, as that would allow you to determine the course. Or you could use student and course, which would determine the instructor. For now, let's use student and course as the key:

Using student and course as the key

Student Course Instructor table
Student
Course
Instructor

What normal form is this table in? It's in 1st normal form, as it has a key and no repeating groups. It's also in 2nd normal form, as the instructor is dependent on both other fields (students have many courses, and therefore instructors, and courses have many instructors). Finally, it's also in 3rd normal form, as there is only one non-key attribute.

But there are still some data anomalies. Look at the data sample below:

More data anomalies

Student	Course	Instructor
Conrad Pienaar	Biology	Nkosizana Asmal
Dingaan Fortune	Mathematics	Kader Dlamini
Gerrie Jantjies	Science	Helen Ginwala
Mark Thobela	Biology	Nkosizana Asmal
Conrad Pienaar	Science	Peter Leon
Alicia Ncita	Science	Peter Leon
Quinton Andrews	Mathematics	Kader Dlamini

The fact that Peter Leon teaches science is stored redundantly, as are Kader Dlamini with mathematics and Nkosizana Asmal with biology. The problem is that the instructor determines the course. Or put another, course is determined by instructor. The table conforms to 3rd normal form rules because no non-key attribute is dependent upon a non-key attribute! Again, you use the familiar method of removing this field and placing it into another table, along with its key:

Student Instructor table after removing Course

Student Course Instructor table
Student
Instructor

After removing the course field, the primary key needs to include both remaining fields to uniquely identify a record.

Resulting Instructor table

Student Course Instructor table
Instructor
Course

Although we had chosen course as part of the primary key in the original table, the instructor determines the course, which is why we make it the primary key in this table. As you can see, the redundancy problem has been solved.

Thus, a table is in Boyce-Codd normal form if:

it is in 3rd normal form
each determinant is a candidate key

That sounds scary! For most people new to database design, these are new terms. If you followed along with the example above, however, the terms will soon become clear:

a determinant is an attribute that determines the value of another attribute.
a candidate key is either the key or an alternate key (in other words, the attribute could be a key for that table)

In the initial table, instructor is not a candidate key (alone it cannot uniquely identify the record), yet it determines the course, so the table is not in Boyce-Codd normal form.

Let's look at the example again, and see what happens if you chose student and instructor as the key. What normal form is the table in this time?

Using student and instructor as the key

Student Course Instructor table
Student
Instructor
Course

Once again it's in 1st normal form because there is a primary key and there are no repeating groups. This time, though, it's not in 2nd normal form because course is determined by only part of the key: the instructor. By removing course and its key, instructor, you get the structure shown below:

Removing course

Student Instructor table
Student
Instructor

Creating a new table with course

Student Course Instructor table
Instructor
Course

Either way you do it, by making sure the tables are normalized into Boyce-Codd normal form, you get the same two resulting tables. It's usually the case that when there are alternate fields to choose as a key, it doesn't matter which ones you choose initially because after normalizing you get the same results either way.

Comments

Comments loading...

Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.

↑ Database Normalization ↑
Database Normalization Overview
Database Normalization: 1st Normal Form
Database Normalization: 2nd Normal Form
Database Normalization: 3rd Normal Form
Database Normalization: Boyce-Codd Normal Form
Database Normalization: 4th Normal Form
Database Normalization: 5th Normal Form and Beyond
Understanding Denormalization

Database Normalization: Boyce-Codd Normal Form

Contents

Table containing data about the student, course, and instructor relationship

Using student and course as the key

More data anomalies

Student Instructor table after removing Course

Resulting Instructor table

Using student and instructor as the key

Removing course

Creating a new table with course

Comments

Products

Services

Pricing

Resources

About Us

Download MariaDB