Question
Introduction This research paper mainly focuses on history of evolvement of normalization from a set of relation schemas, along with sequential normalization forms with suitable
- Introduction
- Database normalization 1. the paper must concentrate on a topic in the textbook. the textbook must be used as one of the minimum three references for the paper. 2. apa format should be following in regard to the completion of this paper. it should be apa 6.0 format. 3. minimum three references used and identified in body of paper. concentrate on topic: normalization
This research paper mainly focuses on history of evolvement of normalization from a set of relation schemas, along with sequential normalization forms with suitable examples. This paper includes the importance of Normalization in Relational Database Management system in the real world.
Normalization is a process of arranging data in a database. The main aim of the database is to develop relation schemas without any redundant data (Hoffer, 2011). Database stores the data in the form of tables consisting of entities and attributes. Entities are the primary building blocks of a data model which translates the conceptual design to logical relationship. Attributes specifies the properties of entity. For example, student, course, and grade are some of the tables included in a university database, with other entities to determine the organizational data structure. Student have attributes such as StudentName, StudentId, address, Email along with other properties which segregates each attribute record within the entity. The entities, its attributes and their relationships are graphically represented as Entity – Relationship diagram. Relationships are the connections between the entities that shows how the entities are related. Normalization helps in reducing input and output delays, reduce usage of memory, minimizes the data redundancy and so on. Normalization is the method that eliminates the data redundancy by normal form approach with some rules and regulations (Hoffer, 2011). Normal form is a rule followed for normalization. The different types of normal forms are First Normal Form(1NF), Second Normal Form(2NF), Third Normal Form(3NF), Boyce-codd Normal Form (BCNF), Fourth Normal Form (4NF) and Fifth Normal Form(5NF).
Normalization is mainly used to eliminate Redundancy of data and Functional dependency. Redundancy is nothing but repetition of data whereas functional dependency is the limitation between attributes in a relational data, in which value of one attribute is defined by value of another attribute. Anomalies (Differences) occur, when the certain attributes cannot be changed (inserted/deleted/updated), without the change in other attributes.
Normalization helps in reducing the differences, and eliminate the dependencies to such an extent where there is a least redundancy is occurred.
Process of Normalization:
I would like to explain the Normalization process with an example, a data schema of purchase orders in a fragrance distributing warehouse as shown below. (Please refer to below table
as an example only)
PO_ NO | Date | Cust_ID | Cust_Name | Cust_Address | Product_ ID | Product_Description | Product_Cost | PO_Quantity |
PO1 | 4/18/2016 | Cust1 | A | FL | 001 | D&G | 12 | 1 |
002 | ARMANI | 10 | 2 | |||||
PO2 | 4/18/2016 | Cust2 | B | NY | 004 | BURBERRY | 9 | 3 |
006 | GUCCI | 7 | 4 | |||||
008 | RALPH LUREN | 10 | 5 | |||||
PO3 | 4/18/2016 | Cust3 | C | CA | 001 | D&G | 12 | 6 |
From the above table we observe that purchase order’s po1 and po2 have repeating groups of products in each orders.
First Normal form (1NF) requires that there are no redundancy groups in a relation and a single fact for each row and column, and each row must be uniquely identified by a primary key. Thus, when the redundancy groups are isolated as per 1NF, the table looks as below.
PO_NO | Date | Cust_ID | Cust_name | Cust_Address | Product_ID | Product Description | Product Cost | PO_ Quantity |
Ord1 | 1/1/2016 | Cust1 | A | NJ | 1 | D&G | 12 | 1 |
Ord1 | 1/1/2016 | Cust1 | A | NJ | 2 | ARMANI | 10 | 2 |
Ord2 | 1/2/2016 | Cust2 | B | NY | 4 | BURBERRY | 9 | 3 |
Ord2 | 1/2/2016 | Cust2 | B | NY | 6 | GUCCI | 7 | 4 |
Ord2 | 1/2/2016 | Cust2 | B | NY | 8 | RALPH LUREN | 10 | 5 |
Ord3 | 1/3/2016 | Cust3 | C | NC | 1 | D&G | 12 | 6 |
The functional dependencies from the above table are:
PO_NO -> Date, Cust_ID, Cust_Name and Cust_Address
Cust_ID -> Cust_ID, Cust_Name and Cust_Address
ProductID -> Product_Description, Product _Cost
PO_NO, ProductID -> Ordered_Quantity
When there are no duplicate rows, redundancy groups and same attributes are identified then the table is in the first normal form (Wyllys. 2003).
After applying first normal form for the data, then the data can be seen that if any request should be overhauled or included, every single other insight with respect to the client likewise should be redesigned or added again which prompts abnormalities (Malaika, 2011). Additionally, if there is any adjustment in points of interest of client, it should be redesigned in all the important records. With a specific end goal to understand the issues, ensuing Normal structures appeared. (Malaika, 2011)
The Second Normal Form reports the distinctions in the first normal form and requires that for a relation is said to be in Second Normal Form, if it fulfills the states of First Normal form and any partial dependencies be isolated (Malaika, 2011). All together for a table be in second ordinary structure, make a connection for every primary key quality that decides fractional reliance, which is an essential key in new relation, and move the non-key characteristics subject to primary key credits to the new relationship.
When partial dependencies are removed from the first table then it would result in three separate tables: Purchase order (PO_NO), Product and Customer-order.
Order table:
PO_NO | ProductID | PO_Quantity |
Product Table:
Product_ID | Product Description | Product Cost |
Customer-order:
PO_NO | Date | Cust_ID | Cust_Name | Cust_Address |
From the above table, if there is any need of a Purchase order to be updated for a specific order, then the customer details also needed to be updated as well. And if the customer details are changed, then all the records related to the customer must be updated to the newer values. The above problems can be easily overcomed in third normal form (3NF).
In third normal form the relation must be in the second normal form with no transitive dependencies. Transitive Dependencies are function dependencies between primary key and another non key characteristic which is subject to primary key by means of another non key attribute (Wabler, 2011). From the above decomposed tables, the Product table and Order table are already in Third Normal Form and Customer-Order table have transitive dependencies in the form of PO_NO->Cust_ID-> Cust_Name and PO_NO->Cust_ID->Cust_Address. This can be removed by creating a new relation for each non key attribute that is determinant in the relation, move all the related attributes that are related to primary key of new relation, to the new relation, and leave the primary key of the new relation in the old table.
Basically, the Cust_Name and Cust_Address columns should be moved to a new table and the Cust_ID should be made as the primary key for the new table, and the Cust_ID must be left in the Customer-Order table as shown below:
Order:
PO_NO | Date | Cust_ID |
Customer:
Cust_ID | Cust_Name | Cust_Address |
Hence, the purchase order of the fragrance warehouse is completely normalized disintegrating the relations into smallest possible ones.
Boyce-Codd Normal Form needs to separate the tables with irregularities resulting from multiple candidate keys. Hence the example we already took is in Boyce-Codd form, We can now take another example. Typically, Boyce-Codd Normal form requires that “Each attribute must represent a fact about the key, the whole key, and nothing but the key” (Kozubek. 2014). In other words, there should not be any non-trivial functional dependencies.
For example, I have following table,
Food | Cuisine | Chef | Restaurant | taste |
The above table has functional dependencies as Food -> Taste, Cuisine & Chef -> Restaurant. The above table can be disintegrated into three different tables as shown below:
Food | Cuisine | taste |
|
If there are still any differences resulting from Multivolume dependencies would be addressed in Fourth and Fifth normal form.
Two or more independent multi-valued attributes must not be contained in a record about an entity in fourth normal form. For example, I have below table,
Student | Language | Skillset |
Anusha | English | |
Anusha | Telugu | |
Anusha | ORACLE | |
Anusha | JAVA |
The above table must be disintegrated as,
Student | Language |
Anusha | English |
Anusha | Telugu |
Student | Skillset |
Anusha | ORACLE |
Anusha | JAVA |
Until Third Normalization Form, majority of the relations would be successively disintegrated but to reduce the irregularities further we use Boyce-Codd Normal Form and Fourth Normal Form. Fifth Normal Form is rarely used to reduce the anomalies further.
Step by Step Solution
3.38 Rating (157 Votes )
There are 3 Steps involved in it
Step: 1
It was first proposed by British computer scientist Edgar F Codd as part of his relational model Normalization entails organizing the columns attributes and tables relations of a database to ensure th...Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started