The normalization is explored in DBMS to know about the
- Problems without normalization and
- How does normalization solve these problems by removing anomalies
What is Normalization?
so normalization is a technique of organizing the data into multiple related tables, to minimize data redundancy. Normalization makes the data more meaningful and usable.
A formal process of decomposing relations with anomalies to produce smaller, well-structured, and stable relations.
Must read:
Normal forms in Normalization: DBMS
Algorithms Analysis and why it is important? : DSA
How to Create Beautiful Info Cards – CSS Tutorial for Beginners
What is Data Redundancy and why should we reduce it?
Redundancy normally is the repetition of data.
Repetition of data increases the size of the Database and causes other issues like:
- Insertion Problems
- Deletion Problems
- Updation Problems
Normalization we can say “Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
A well-structured relation that contains minimal data redundancy and allows the users to insert, delete, and update rows without causing data inconsistencies.
Anomalies in DBMS
- Insertion Anomaly – adding new rows will force users to create duplicate data.
- Deletion Anomaly – deleting a row may cause the loss of other data representing different facts.
- Modification Anomaly – changing data in a row forces changes to other rows because of duplication.
Let’s see How Normalization will solve this problem?
consider an example of a STUDENT Table containing students data along with their roll_no, name, branch name, their branch HOD name, and HOD’s office tel.
roll_no | name | branch | HOD | office_tel |
---|---|---|---|---|
1 | Ravinder | CSE | X | 12357 |
2 | Baldev | CSE | X | 12357 |
3 | Mahesh | CSE | X | 12357 |
4 | Anil | CSE | X | 12357 |
If in a situation the Hod of branch CSE changes then you have to change every row of Hod name for all the students and if you left a single row it will result in data inconsistency.
So to reduce the redundancy we will normalize this table and will create two different tables here.
First naming with STUDENT table and second naming BRANCH table.
roll_no | name | branch |
---|---|---|
1 | Kavish | CSE |
2 | Nahid | CSE |
3 | Sahil | CSE |
branch | HOD | office_tel |
---|---|---|
CSE | X | 12357 |
Now, in every student record the branch name CSE will repeat and in the BRANCH table the Hod name for branch CSE has to be written once.
if we have to change the value of Hod then just have to update a single field.
Less redundancy means fewer problems in inserting, deleting, and updating the data.
Using the above example of student records let’s see how the data will look like after normalization in this case.
- Insertion Anomaly – If we add a new tuple of new student data then we just have to add roll_no, name, and branch name. The data about the branch will be automatically used from the BRANCH table. so we do not need to repeat the Hod name and office_tel of Hod for every student.
- Deletion Anomaly – If we delete all the student data containing tuples, we still have the branch information so we can add new students for branch CSE.
- Modification Anomaly – If the Hod changes, we have to change only a single row in the BRANCH table.
Must read:
Normal forms in Normalization: DBMS
Algorithms Analysis and why it is important? : DSA
How to Create Beautiful Info Cards – CSS Tutorial for Beginners
Types of Normalization
Normally these types are of normalization:
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form
- The advanced version is BCNF
read in detail about normal forms : Normal forms in Normalization: DBMS
Conclusion
Normalization solves the problem by simply dividing the data into simple logical entities that are independent and related to relations with the primary key and others.
If this article helped you in understanding what really normalization and its terminologies like redundancy, and anomaly are. Then leave a comment in the box below.