Normalization 
            Normalization
theory is built the concept of normal forms. Normalizations reduces redundancy.
Redundancy is unnecessary repetition of data. It can cause problems with
storage and retrieval of data. During the process of normalization, dependencies
can be identified, which can cause problems during deletion and updation.
Normalization theory is based on the fundamental notation of functional
dependency. Normalization helps in simplifying the structure of tables.
Suppose for an entity customer if attributes like cunno,
name, address are required, then it can be perceived that for a particular
cusno, only one name and address is possible. Hence the attribute name, address
are said to be functionally dependent on the attribute cusno. There are four
levels of normalization.
Need of  Normalization  
 Normalization is the aim of well design Relational Database
Management System (RDBMS). It is step by step set of rules by which data is put
in its simplest forms. We normalize the relational database management system because
of the following reasons: 
  
Minimize data redundancy i.e. no unnecessarily duplication
of data.
To make database structure flexible i.e. it should be
possible to add new data values and rows without reorganizing the database
structure.
Data should be consistent throughout the database i.e. it
should not suffer from following anomalies.
Insert Anomaly - Due to lack of data i.e., all the data
available for insertion such that null values in keys should be avoided. This
kind of anomaly can seriously damage a database
Update Anomaly - It is due to data redundancy i.e. multiple
occurrences of same values in a column. This can lead to inefficiency.
Deletion Anomaly - It leads to loss of data for rows that
are not stored else where. It could result in loss of vital data.
Complex queries required by the user should be easy to
handle.
On decomposition of a relation into smaller relations with
fewer attributes on normalization the resulting relations whenever joined must
result in the same relation without any extra rows. The join operations can be
performed in any order. This is known as Lossless Join decomposition.
The resulting relations (tables) obtained on normalization
should possess the properties such as each row must be identified by a unique
key, no repeating groups, homogenous columns, each column is assigned a unique
name etc.
ADVANTAGES OF NORMALIZATION 
- More
     efficient data structure.
 
- Avoid
     redundant fields or columns.
 
- More
     flexible data structure i.e. we should be able to add new rows and data
     values easily
 
- Better
     understanding of data.
 
- Ensures
     that distinct tables exist when necessary.
 
-  
Easier to maintain data structure i.e. it is easy to perform
operations and complex queries can be easily handled. 
 
-    
Minimizes data duplication.
 
-     Close modeling of real world entities, processes and their
relationships. 
 
DISADVANTAGES OF NORMALIZATION 
  
The following are disadvantages of normalization. 
1. You
cannot start building the database before you know what the user needs. 
2. On
Normalizing the relations to higher normal forms i.e. 4NF, 5NF the performance
degrades. 
3. 
It
is very time consuming and difficult process in normalizing relations of higher
degree. 
4.  Careless
decomposition may leads to bad design of database which may leads to serious
problems. 
Normal forms-
      First Normal Form
- Second Normal Form
 
- Third Normal Form
 
- Boyce-Codd Normal Form
 
- Fourth Normal Form
 
- Fifth Normal Form
 
First Normal Form (1st NF)
- The table cells must be of single value. 
 
- Eliminate repeating groups in individual tables. 
 
- Create a separate table for each set of related data. 
 
- Identify each set of related data with a primary key.
 
Second Normal Form (2nd NF)
- Remove Partial
     Dependencies. 
 
- Functional
     Dependency: The value of one attribute in a table is determined entirely
     by the value of another. 
 
- Partial
     Dependency: A type of functional dependency where an attribute is
     functionally dependent on only part of the primary key (primary key must
     be a composite key). 
 
- Create separate
     table with the functionally dependent data and the part of the key on
     which it depends. Tables created at this step will usually contain
     descriptions of resources.
 
A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation. 
Third
Normal Form (3rd NF)
- Remove
     transitive dependencies. 
 
- Transitive
     Dependency A type of functional dependency where an attribute is
     functionally dependent on an attribute other than the primary key. Thus
     its value is only indirectly determined by the primary key. 
 
- Create a
     separate table containing the attribute and the fields that are
     functionally dependent on it. Tables created at this step will usually
     contain descriptions of either resources or agents. Keep a copy of the key
     attribute in the original file.
 
A
relation is in third normal form, if it is in 2NF and every
non-key attribute of the relation is non-transitively dependent on each
candidate key of the relation. 
Boyce-Codd
Normal Form (BCNF)
- When a relation
     has more than one candidate key, anomalies may result even though the
     relation is in 3NF. 
 
- 3NF does not
     deal satisfactorily with the case of a relation with overlapping candidate
     keys 
 
- i.e. composite
     candidate keys with at least one attribute in common. 
 
- BCNF is based on
     the concept of a determinant. 
 
- A determinant is
     any attribute (simple or composite) on which some other attribute is fully
     functionally dependent. 
 
- A relation is in
     BCNF is, and only if, every determinant is a candidate key. 
 
Fourth
Normal Form (4th NF)
An entity is in Fourth Normal Form (4NF) when it meets the requirement of being
in Third Normal Form (3NF) and additionally: 
- Has no multiple
     sets of multi-valued dependencies. In other words, 4NF states that no
     entity can have more than a single one-to-many relationship within an
     entity if the one-to-many attributes are independent of each other. 
 
- Fourth Normal
     Form applies to situations involving many-to-many relationships. 
 
In
relational databases, many-to-many relationships are expressed through
cross-reference tables.
A table is in fourth normal form (4NF) if and only if it is in BCNF and
contains no more than one multi-valued dependency.
  
Fifth Normal Form (5th NF)
- A relation that
     has a join dependency cannot be decomposed by a projection into other
     relations without spurious results
     
 
- A relation is in
     5NF when its information content cannot be reconstructed from several
     smaller relations i.e. from relations having fewer attributes than the
     original relation
 
A
table is in fifth normal form (5NF) or Project-Join Normal Form (PJNF) if it is
in 4NF and it cannot have a lossless decomposition into any number of smaller
tables.
 
Denormalization 
            Normalization
helps in reducing redundancy. But at times when information is required from
more than one table, at a faster rate, it is wiser to introduce some amount of
redundancy in the table. This intentional introduction of redundancy in to the
table is called Denormalization.
For example, if the query were used to display the names of
students and their marks obtained in each subject, then it would be advisable
to maintain information about the students. Otherwise, every time such a query
is made, a join will have to be carried out on the two tables. For the sake of
better performance some amount of redundancy has to be introduced.
The language widely used in relational database is
Structured Query Language, more popularly known as SQL. It is the standard
relational database access language. They can be roughly divided in to three
categories based on their functions namely Data Definition Language, Data
Manipulation Language and Data Control Language.