Normal Forms in Normalization: DBMS

There are several different normal forms of normalization in DBMS that can be used to normalize a database, each with its own specific requirements

Introduction

In the context of database design, normalization is the process of organizing a database in a way that reduces redundancy and dependency and improves data integrity. There are several different normal forms that can be used to normalize a database, each with its own specific requirements.

Normalization Forms

The most common normalization forms are:

1st Normal Form (1NF)

The first normal form (1NF) is a property of a database table that requires that all attributes in the table be :

  • Each column should contain atomic values atomic. This means that each attribute must represent a single piece of data, and cannot be further divided into smaller pieces.
  • A column should contain values that are of the same type. Do not intermix different types of values in any column.
  • Each column should have a unique name to avoid confusion at the time of data retrieval.
  • The order in which data is saved doesn’t matter.
    Using SQL Query, you can easily fetch data in any order from a table.

For example, consider a table that stores information about students, including their names, address, and phone number. In the first normal form, the name attribute would be broken down into separate attributes for first name, middle name, and last name. The address attribute would be broken down into separate attributes for street address, city, state, and zip code. And the phone number attribute would be broken down into separate attributes for the area code, exchange, and number.

By breaking down the attributes into their atomic components, the table is said to be in its first normal form. This helps to ensure that the data is organized in a way that is easy to understand and maintain, and reduces the risk of data inconsistencies and errors.

2nd Normal Form (2NF)

The second normal form (2NF) is a property of a database table that requires the table to follow these rules:

  • The table should be in first normal form (1NF)
  • The table should not have any Partial Dependencies that all non-key attributes be fully dependent on the primary key.
Partial dependency is a type of dependency that occurs when an attribute in a database table is dependent on only part of a composite primary key, rather than the entire key. This can lead to data redundancy and inconsistencies, and can make it more difficult to maintain the table.

For example, consider a table that stores information about students, including their student ID, name, and enrollment status. The student ID attribute could be used as the primary key, and the name and enrollment status attributes would be dependent on the student ID.

However, if the name attribute is only dependent on the student ID and not the enrollment status, this is considered a partial dependency. This means that the name of a student could be repeated for different enrollment statuses, leading to data redundancy and the possibility of data inconsistencies.

To eliminate partial dependencies and ensure that a table is in second normal form (2NF), all non-key attributes must be fully dependent on the primary key. In this example, this would mean that the name attribute would need to be dependent on both the student ID and the enrollment status. This helps to reduce redundancy and dependency in the table, and improves data integrity.

In other words, for a table to be in second normal form, it must have a primary key (a set of one or more attributes that uniquely identifies each row in the table) and all other attributes in the table must be dependent on the primary key. This means that each attribute must be related to the primary key in some way, and cannot be determined by any other attribute in the table.

For example, consider a table that stores information about students, including their student ID, name, and enrollment status. The student ID attribute could be used as the primary key, and the name and enrollment status attributes would be dependent on the student ID. This means that the name and enrollment status of a particular student can be determined based on their student ID, but not the other way around.

By ensuring that all non-key attributes are fully dependent on the primary key, the table is said to be in second normal form. This helps to reduce redundancy and dependency in the table and improves data integrity.

3rd Normal Form (3NF)

The third normal form (3NF) is a property of a database table that requires that the table is in the second normal form (2NF) and that all non-key attributes be dependent on the primary key and nothing else.

In other words, for a table to be in third normal form, it must have a primary key (a set of one or more attributes that uniquely identifies each row in the table) and all other attributes in the table must be directly dependent on the primary key. This means that each attribute must be related to the primary key in some way, and cannot be determined by any other attribute in the table, with the exception of the primary key itself.

For example, consider a table that stores information about students, including their student ID, name, and enrollment status. The student ID attribute could be used as the primary key, and the name and enrollment status attributes would be dependent on the student ID. This means that the name and enrollment status of a particular student can be determined based on their student ID, but not the other way around.

By ensuring that all non-key attributes are directly dependent on the primary key, the table is said to be in third normal form. This helps to further reduce redundancy and dependency in the table and improves data integrity. It also makes it easier to update and maintain the table, since changes to the primary key will not affect any of the other attributes in the table.

Boyce-Codd normal form (BCNF)

Boyce-Codd normal form (BCNF) is a stricter version of the third normal form (3NF) that requires that all determinants (attributes that determine other attributes) must be part of the primary key.

In other words, for a table to be in BCNF, it must have a primary key (a set of one or more attributes that uniquely identifies each row in the table) and all non-key attributes must be fully dependent on the primary key. In addition, all determinants (attributes that determine other attributes) must also be part of the primary key.

For example, consider a table that stores information about students, including their student ID, name, and enrollment status. The student ID attribute could be used as the primary key, and the name and enrollment status attributes would be dependent on the student ID. This means that the name and enrollment status of a particular student can be determined based on their student ID, but not the other way around.

However, if the enrollment status attribute is used to determine the name attribute (i.e., if a student’s name is different depending on their enrollment status), then the enrollment status attribute would be considered a determinant. In this case, the enrollment status attribute would need to be part of the primary key in order for the table to be in BCNF.

By ensuring that all determinants are part of the primary key, the table is said to be in Boyce-Codd’s normal form. This helps to further reduce redundancy and dependency in the table and improves data integrity. It also makes it easier to update and maintain the table, since changes to the primary key will not affect any of the other attributes in the table.

4th Normal Form (4NF)

The fourth normal form (4NF) is a property of a database table that requires that the table is in Boyce-Codd normal form (BCNF) and that there be no multiple sets of multivalued dependencies.

In other words, for a table to be in fourth normal form, it must have a primary key (a set of one or more attributes that uniquely identifies each row in the table) and all non-key attributes must be fully dependent on the primary key. In addition, there must be no multiple sets of multivalued dependencies in the table.

A multivalued dependency is a type of dependency that occurs when one attribute is dependent on another attribute, but not on the primary key. For example, consider a table that stores information about students, including their student ID, name, and enrollment status. If the enrollment status attribute is dependent on the name attribute, but not on the student ID, this would be considered a multivalued dependency.

Multiple sets of multivalued dependencies can lead to data redundancy and inconsistencies and can make it more difficult to maintain the table. To eliminate multiple sets of multivalued dependencies and ensure that a table is in fourth normal form, the table must be redesigned to eliminate the dependencies.

By eliminating multiple sets of multivalued dependencies, the table is said to be in fourth normal form. This helps to further reduce redundancy and dependency in the table and improves data integrity. It also makes it easier to update and maintain the table, since changes to the primary key will not affect any of the other attributes in the table.

5th Normal Form (5NF)

The fifth normal form (5NF) is a property of a database table that requires that the table is in the fourth normal form (4NF) and that there be no non-trivial join dependencies.

In other words, for a table to be in fifth normal form, it must have a primary key (a set of one or more attributes that uniquely identifies each row in the table) and all non-key attributes must be fully dependent on the primary key. In addition, there must be no non-trivial join dependencies in the table.

A join dependency is a type of dependency that occurs when the primary key of a table can be reconstructed from the combination of two or more attributes in the table. This is known as a trivial join dependency and is allowed in the fifth normal form. However, a non-trivial join dependency occurs when the primary key cannot be reconstructed from the combination of the attributes and is not allowed in the fifth normal form.

Non-trivial join dependencies can lead to data redundancy and inconsistencies and can make it more difficult to maintain the table. To eliminate non-trivial join dependencies and ensure that a table is in the fifth normal form, the table must be redesigned to eliminate the dependencies.

By eliminating non-trivial join dependencies, the table is said to be in fifth normal form. This helps to further reduce redundancy and dependency in the table and improves data integrity. It also makes it easier to update and maintain the table, since changes to the primary key will not affect any of the other attributes in the table.

Conclusion

It is important to note that normalization is not always necessary, and in some cases, denormalizing a database (i.e., relaxing some of the normalization rules) can actually improve performance. The decision to normalize or denormalize a database depends on the specific needs of the application.

Share your love
Lamba Pankaj
Lamba Pankaj
Articles: 14

Leave a Reply

Your email address will not be published. Required fields are marked *

I'm not a robot *Time limit exceeded. Please complete the captcha once again.