Database Normalization: Explain 1NF, 2NF, 3NF, BCNF With Examples + PDF

The purpose of normalization is to make the life of users easier. Insertion, update and deletion anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take an example of a Student ... A relation is said to be in 2NF if it is already in 1NF and each and every ... In the 3NF example, Stud_ID is a super-key in the Student_Detail relation.

Author: Faujas Bajora
Country: South Sudan
Language: English (Spanish)
Genre: Art
Published (Last): 25 November 2011
Pages: 135
ISBN: 956-3-30051-324-2

When developing the schema of a relational database, one of the most important aspects to take into account is ensuring that duplication is minimized. This is done for 2 purposes: reducing the storage needed for the data, and avoiding the conflicts that creep in when multiple copies of the same data are stored. Database normalization is a technique that helps in designing the schema of the database in an optimal manner so as to ensure the above points. The core idea of database normalization is to divide the tables into smaller subtables and store pointers to data rather than replicating it.

For a better understanding of what we just said, here is a simple normalization example. Consider a sample database with a single table that stores, for each course, the instructor and the instructor's mobile number. At first, this design seems to be good. However, issues start to develop once we need to modify information. For instance, suppose Prof. George changes his mobile number.

In such a situation, we will have to make edits in 2 places. What if someone edited the mobile number against one CS course, but forgot to edit it for the other? The two rows would then disagree about the same instructor.


Basically, we store the instructors separately and, in the course table, we do not store the entire data of the instructor. We rather store the ID of the instructor. Now, if we were to change the mobile number of Prof. George, it can be done in exactly 1 place. Further, if you observe, the mobile number now need not be stored 2 times. We have stored it at just 1 place. This also saves storage. This may not be obvious in the above simple example.

However, think about the case when there are hundreds of courses and instructors and for each instructor, we have to store not just the mobile number, but also other details like office address, email address, specialization, availability, etc. In such a situation, replicating so much data will increase the storage requirement unnecessarily.
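The split into an instructor table and a course table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, course codes (CS-A, CS-B) and phone numbers are assumptions, not values from the example above.

```python
import sqlite3

# Hypothetical schema and sample values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (
        instructor_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        mobile        TEXT NOT NULL
    );
    CREATE TABLE course (
        course_code   TEXT PRIMARY KEY,
        instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id)
    );
    INSERT INTO instructor VALUES (1, 'Prof. George', '555-0100');
    INSERT INTO course VALUES ('CS-A', 1), ('CS-B', 1);
""")

# The mobile number lives in exactly one row, so one UPDATE is enough.
conn.execute(
    "UPDATE instructor SET mobile = ? WHERE instructor_id = ?",
    ("555-0199", 1),
)

# Both courses pick up the new number through the join.
rows = conn.execute("""
    SELECT c.course_code, i.name, i.mobile
    FROM course AS c
    JOIN instructor AS i ON i.instructor_id = c.instructor_id
    ORDER BY c.course_code
""").fetchall()
print(rows)
```

Because the course table only holds the instructor's ID, there is no second copy of the mobile number that could be left stale.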

The above is a simple example of how database normalization works. We will now study it more formally. Each normal form has an importance which helps in optimizing the database to save storage and to reduce redundancies.

The first normal form simply says that each cell of a table should contain exactly one value. Let us take an example. Suppose we are storing the courses that a particular instructor takes. We could store all the course codes of an instructor together in a single cell of that instructor's row.


Here, the issue is that in the first row, we are storing 2 courses against a single professor. A better method would be to store the courses separately.
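As a rough sketch of that conversion, the snippet below turns rows whose course cell holds several codes into one row per course. The instructor names other than Prof. George and all course codes are made up for the illustration.

```python
# Hypothetical un-normalized rows: one cell holds several course codes.
unnormalized = [
    ("Prof. George", "CS-A, CS-B"),
    ("Prof. Atkins", "CS-C"),
]


def to_first_normal_form(rows):
    """Return one (instructor, course_code) row per course."""
    result = []
    for instructor, courses in rows:
        for code in courses.split(","):
            result.append((instructor, code.strip()))
    return result


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

Every cell of the result holds exactly one value, which is what the first normal form asks for.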

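The table then has two columns, instructor's name and course code, with one row for each course taken by a professor. Also, observe that each row stores unique information. There is no repetition. This is the First Normal Form.

For a table to be in second normal form, 2 conditions are to be met: the table should be in first normal form, and the primary key of the table should consist of exactly 1 column. (Strictly speaking, 2NF requires that no non-key column depends on only a part of a candidate key; with a single-column key, no such partial dependency on it can arise.) The first point is obviously straightforward since we just studied 1NF. Let us understand the second point: 1 column primary key. Well, a primary key is a set of columns that uniquely identifies a row. Basically, no 2 rows have the same primary keys. Here, in this table, the course code is unique. So, that becomes our primary key.

Let us take another example of storing student enrollment in various courses.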

Each student may enrol in multiple courses. Similarly, each course may have multiple enrollments. A sample table may hold two columns (student name and course code). Here, the first column is the student name and the second column is the course taken by the student. The student name column is not unique, since a student enrolled in several courses appears in several rows. Similarly, the course code column is not unique, as we can see that there are 2 entries corresponding to course code CS in row 2 and row 4.

However, the tuple (student name, course code) is unique since a student cannot enroll in the same course more than once. So, these 2 columns when combined form the primary key for the database. To achieve the same (1NF to 2NF), we can rather break it into 2 tables:

Student name    Enrolment number
Rahul           1
Rajat           2
Raman           3

Here the second column is unique and it indicates the enrollment number for the student.
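A minimal sketch of this two-table split with sqlite3 follows. The student names and enrollment numbers come from the table above; the course codes (CS-A, CS-B), the table names and the choice of which student takes which course are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        enrollment_number INTEGER PRIMARY KEY,
        student_name      TEXT NOT NULL
    );
    -- Hypothetical second table: attaches course codes to enrollment numbers.
    CREATE TABLE course_enrollment (
        enrollment_number INTEGER NOT NULL REFERENCES student(enrollment_number),
        course_code       TEXT NOT NULL,
        PRIMARY KEY (enrollment_number, course_code)
    );
    INSERT INTO student VALUES (1, 'Rahul'), (2, 'Rajat'), (3, 'Raman');
    INSERT INTO course_enrollment VALUES
        (1, 'CS-A'), (2, 'CS-A'), (1, 'CS-B'), (3, 'CS-B');
""")

# The original (student name, course code) pairs are recovered by a join.
pairs = conn.execute("""
    SELECT s.student_name, e.course_code
    FROM course_enrollment AS e
    JOIN student AS s ON s.enrollment_number = e.enrollment_number
    ORDER BY s.student_name, e.course_code
""").fetchall()
print(pairs)

# A student cannot enroll in the same course twice: the key rejects it.
try:
    conn.execute("INSERT INTO course_enrollment VALUES (1, 'CS-A')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

In this sketch the second table keeps a two-column key, but it has no other columns, so nothing in it can depend on only a part of that key.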

Clearly, the enrollment number is unique. Now, we can attach each of these enrollment numbers to course codes.

Before we delve into details of third normal form, let us understand the concept of a functional dependency on a table.

Column A is said to be functionally dependent on column B if the value of B determines the value of A, so that changing the value of B in a row may require a change in the value of A. As an example, consider a table that stores, for each course, the professor name and the professor's department. Here, the department column is dependent on the professor name column. This is because if, in a particular row, we change the name of the professor, we will also have to change the department value. As an example, suppose MA is now taken by Prof. Ronald, who happens to be from the Mathematics department. Here, when we changed the name of the professor, we also had to change the department column. This is not desirable since someone who is updating the database may remember to change the name of the professor, but may forget to update the department value.
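One way to make the idea of a functional dependency concrete is to test it against actual rows. The helper below is a hypothetical sketch, not a standard library function, and the sample rows (course codes, and the department of Prof. George) are invented for the illustration.

```python
def fd_holds(rows, determinant, dependent):
    """True if each value of `determinant` maps to one value of `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


courses = [
    {"course": "MA-X", "professor": "Prof. Ronald", "department": "Mathematics"},
    {"course": "CS-A", "professor": "Prof. George", "department": "Computer Science"},
    {"course": "CS-B", "professor": "Prof. George", "department": "Computer Science"},
]
consistent_before = fd_holds(courses, ["professor"], ["department"])
print(consistent_before)

# Someone changes the department in one row and forgets the other row.
courses[2]["department"] = "Mathematics"
consistent_after = fd_holds(courses, ["professor"], ["department"])
print(consistent_after)
```

The second check fails because the same professor now appears with two different departments, which is exactly the kind of inconsistency a partial update produces.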

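Forgetting such an update can cause inconsistency in the database. Third normal form avoids this by storing the details of each professor in a separate table against an ID. Whenever we want to reference the professor elsewhere, we do not have to repeat those details. We can simply use the ID.

Boyce-Codd Normal Form (BCNF) is a slightly stronger version of third normal form. Let us first understand what a superkey means. Here, the first column (course code) is unique across various rows. So, it is a superkey. Consider the combination of columns (course code, professor name).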


It is also unique across various rows. So, it is also a superkey. A superkey is basically a set of columns such that the value of that set of columns is unique across various rows. That is, no 2 rows have the same set of values for those columns.
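The uniqueness test behind this definition can be sketched in a few lines. The function name and the sample rows are assumptions made for the illustration, and a check against sample rows can only refute a superkey, not prove that it holds for every possible row.

```python
def is_superkey(rows, columns):
    """True if no two rows share the same values for `columns`."""
    projected = [tuple(row[c] for c in columns) for row in rows]
    return len(set(projected)) == len(projected)


# Hypothetical rows: one professor takes two courses.
table = [
    {"course": "CS-A", "professor": "Prof. George"},
    {"course": "CS-B", "professor": "Prof. George"},
    {"course": "MA-X", "professor": "Prof. Ronald"},
]
course_is_key = is_superkey(table, ["course"])
pair_is_key = is_superkey(table, ["course", "professor"])
professor_is_key = is_superkey(table, ["professor"])
print(course_is_key, pair_is_key, professor_is_key)
```

Here the course column alone and the (course, professor) pair both pass, while the professor column alone does not, because Prof. George appears in two rows.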

A superkey that is minimal, that is, one from which no column can be dropped without losing uniqueness, is called a candidate key. For instance, the first superkey above has just 1 column.

The second one and the last one have 2 columns. So, the first superkey (course code) is a candidate key.

A table is in Boyce-Codd normal form if, for every functional dependency A → B that holds on it (A and B being sets of columns), at least one of the following is true:

A is a superkey: Basically, if a set of columns B can be determined knowing some other set of columns A, then A should be a superkey. A superkey basically determines each row uniquely.

It is a trivial functional dependency: A trivial functional dependency means that all columns of B are contained in the columns of A. A non-trivial dependency on something that is not a superkey, such as the department depending on the professor name, may lead to an inconsistent database.

A table is said to be in fourth normal form if there are no two or more independent, multivalued facts describing the relevant entity.
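Returning to the Boyce-Codd condition, it can be checked mechanically once the functional dependencies of a table are written down. The sketch below uses the textbook attribute-closure approach; the function names, column names and dependencies are assumptions chosen to mirror the professor and department example, and only the listed dependencies are examined.

```python
def closure(attrs, fds):
    """Columns determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Listed dependencies A -> B that are non-trivial with A not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) == set(all_attrs)
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical single table: course -> professor, professor -> department.
attrs = {"course", "professor", "department"}
fds = [
    (("course",), ("professor",)),
    (("professor",), ("department",)),
]
violations = bcnf_violations(attrs, fds)
print(violations)

# After moving professor details into their own table, each table passes.
course_table = bcnf_violations(
    {"course", "professor"}, [(("course",), ("professor",))]
)
professor_table = bcnf_violations(
    {"professor", "department"}, [(("professor",), ("department",))]
)
print(course_table, professor_table)
```

The single table is flagged because the professor column determines the department without being a superkey; splitting the table removes the violation.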

The various forms of database normalization are useful while designing the schema of a database in such a way that there is no data replication which may possibly lead to inconsistencies. While designing the schema for applications, we should always think about how we can make use of these forms.

Entrepreneur, Coder, Speed-cuber, Blogger, fan of Air crash investigation! Fascinated by the world of technology he went on to build his own start-up – AllinCall Research and Solutions to build the next generation of Artificial Intelligence, Machine Learning and Natural Language Processing based solutions to power businesses.

View all posts by Aman Goel.

A lock is a mechanism to prevent the overwriting of data. Database locks serve to protect shared resources or objects like tables, rows, etc.

In the star schema, dimensions are denormalized. For example, suppose you have an employee dimension and each employee belongs to a particular department. Then, in a star schema, you will only have the employee table and repeat the department data for each employee.

This will increase the data retrieval speed, since fewer joins are needed, at the cost of some extra storage for the repeated data. Fact tables are completely normalized because the redundant information is maintained in the dimension tables. Dimension tables can be normalized or denormalized. If anyone says that the fact table is denormalized because it might contain duplicate foreign keys, that would be only partially correct.

There can be some situations where the fact table contains a lot of columns. In that case, we can say that the fact table is denormalized, but it would be much better to say that the schema is denormalized.