DATABASE CONCEPTS
Prof. K ADISESHA (Ph. D)
Introduction
Data Abstraction
Architecture of DBMS
Data Models
Data Warehouse
Introduction
Definition:
 Data:
 Data is a collection of facts, numbers, letters or symbols that the computer processes
into meaningful information.
 Information:
 Information is data that has been processed, stored, or transmitted by a computer.
 Database:
 A Database is a collection of logically related data organized in a way that data can
be easily accessed, managed and updated.
Applications of Database:
 Banking: For customer information, accounts and loans, and banking transactions.
 Colleges: For student information, course registrations and grades.
 Credit card transactions: For purchases on credit cards and generation of monthly
statements.
 Finance: For storing information about holdings, sales and purchases of financial
instruments such as stocks and bonds.
 Telecommunication: For keeping records of calls made, generating monthly bills, and
storing information about the communication networks.
 Voter ID/Aadhaar database: One of the largest databases in the world, storing data about
more than a billion people residing in India.
 Sales: For customer, product, and purchase information.
Difference between Manual and Computerized data processing:
Manual Data Processing | Computerized Data Processing
• The volume of data that can be processed is limited. | • The volume of data that can be processed is large.
• Requires a large quantity of paper. | • Requires less paper.
• Speed and accuracy are limited. | • Execution is faster and more accurate.
• Labor cost is high. | • Labor cost is low.
• Storage medium is paper. | • Storage medium is hard disk etc.
Data processing cycle
Data processing cycle:
The order in which information is processed in a computer information management
system is called the data processing cycle.
 The data processing cycle involves the following stages:
 Data Collection
 Data Input
 Data Processing
 Data storage
 Output
 Communication
Each stage of the data processing cycle is described below:
 Data Collection: It is the systematic gathering of data that has been observed, recorded and
organized from various sources.
 Data Input: The raw data is put into the computer using a keyboard, mouse or other devices such as
a scanner, microphone or digital camera.
 Data Processing: Processing is the series of actions or operations performed on the input data to generate
outputs.
 Data storage: Data and information should be stored in memory so that they can be accessed later.
 Output: The result obtained after processing the data must be presented to the user in a form the
user can understand.
 Communication: Computers can communicate over network connections; data may be
transmitted as an e-mail or posted to a website where online services are rendered.
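The stages above can be sketched as a small pipeline. This is only an illustration: the student names, marks and function names are hypothetical, an in-memory dictionary stands in for real storage, and printing the report stands in for the communication stage.

```python
# Hypothetical marks data; each function corresponds to one stage of the cycle.

def collect():
    # Data collection: gather raw facts from a source (hard-coded here).
    return ["Asha,78", "Ravi,91", "Meena,64"]

def data_input(raw_lines):
    # Data input: turn raw text into structured records.
    records = []
    for line in raw_lines:
        name, marks = line.split(",")
        records.append({"name": name, "marks": int(marks)})
    return records

def process(records):
    # Data processing: derive information (the average) from the data.
    total = sum(r["marks"] for r in records)
    return {"count": len(records), "average": total / len(records)}

storage = {}  # Data storage: an in-memory stand-in for a disk or database.

def store(key, value):
    storage[key] = value

def output(info):
    # Output: present the result in a form the user can read.
    return f"{info['count']} students, average marks {info['average']:.1f}"

records = data_input(collect())
info = process(records)
store("class_summary", info)
report = output(info)
print(report)  # Communication: here the report is simply printed.
```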
Features of Database
Features or advantages of Database:
 Redundancy can be minimized or controlled: In a DBMS environment, if redundancy is
present, it can be controlled by propagating updates to every place where the
redundant data is present.
 Data Integrity: Data Integrity refers to the correctness of the data in the database. In
other words, the data available in the database is reliable data.
 Data Sharing: In DBMS, data is stored in the centralized database and all the
permitted users can access the same piece of information required at the same time.
 Database Security: DBMS provides a variety of security mechanisms for the user to
protect his or her data stored in the database.
 Supports Concurrent access: DBMS supports concurrent access to the same data
stored in the database by applying locking and time stamp mechanisms.
Database users
Database users:
Many people are involved in designing, using and maintaining the database.
 The people who work with the database include:
 System Analysts
 Application programmers
 Database Administrators (DBA)
 End Users (Database Users)
 System Analysts: System analysts determine the requirements of the users (especially
end users) in order to create a solution for their business needs, and focus on both non-technical and
technical aspects.
 Application programmers: These are the computer professionals who implement the
specifications given by the system analysts and develop the application programs.
 Database Administrators (DBA): The DBA is a person who has central control over both
data and applications. The responsibilities of the DBA are authorizing access, schema
definition and modification, software installation, and security enforcement and
administration.
 End users (database users): Those who interact with the database in order to query and update
it, and to generate reports.
Data Abstraction
Data Abstraction:
A major purpose of a database system is to provide users with an abstract view of the data.
 That is, the system hides certain details of how the data are stored and maintained.
 There are three levels of data abstraction:
 Physical Level (Internal level)
 Conceptual Level (Logical level)
 View Level (External level)
Physical Level:
 It is the lowest level of abstraction that describes how the data are actually stored.
 The physical level describes complex low-level data structures in detail.
 It contains the definition of the stored records, the method of representing the data fields,
and the access aids used.
Conceptual Level:
 It is the next higher level of abstraction that describes what data are stored in the
database and what relationships exist among those data.
 It also contains the method of deriving the objects in the conceptual view from the
objects in the internal view.
View Level:
 It is the highest level of abstraction that describes only part of the entire database.
 It also contains the method of deriving the objects in the external view from the objects
in the conceptual view.
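As a rough illustration of the three levels, the sketch below uses Python's built-in sqlite3 module. The table, view and column names are hypothetical. The table definition plays the role of the conceptual level, the view plays the role of the view level, and the physical level (how SQLite lays out the data) stays hidden from the program.

```python
import sqlite3

# Physical level: SQLite decides how the data is actually stored;
# the program never sees those details.
conn = sqlite3.connect(":memory:")

# Conceptual level: what data is stored and how it is structured.
conn.execute(
    "CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, marks INTEGER, phone TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", 78, "555-0101"), (2, "Ravi", 91, "555-0102")],
)

# View level: expose only part of the database (the phone column is hidden).
conn.execute("CREATE VIEW student_marks AS SELECT regno, name, marks FROM student")

view_rows = conn.execute("SELECT * FROM student_marks ORDER BY regno").fetchall()
print(view_rows)
```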
DBMS Architecture
DBMS Architecture:
The design of a database management system depends heavily on its architecture:
 It can be centralized, decentralized or hierarchical.
 Database architecture is logically divided into three types:
 Logical one-tier (1-tier) Architecture
 Logical two-tier Client/Server Architecture
 Logical three-tier Client/Server Architecture
Logical one-tier (1-tier) Architecture:
The DBMS is the only entity; the user works directly on the DBMS and uses it.
 Any changes made here are applied directly to the DBMS itself.
 It does not provide handy tools for end users, so it is mainly database designers and
programmers who use single-tier architecture.
Logical two-tier Client/Server Architecture:
Two-tier client/server architecture is used for user interface programs and
application programs that run on the client side.
 An interface called ODBC (Open Database Connectivity) provides an API that allows
client-side programs to call the DBMS.
 Most DBMS vendors provide ODBC drivers.
 A client program may connect to several DBMSs.
Logical three-tier Client/Server Architecture:
Three-tier client/server database architecture is a commonly used architecture for web
applications. An intermediate layer, called the application server or web server, stores
the web connectivity software and the business logic (constraints) part of the application,
which is used to access the right amount of data from the database server.
 This layer acts as a medium for sending partially processed data between the database
server and the client.
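The separation of the three tiers can be sketched in a single Python file. This is a simplification under stated assumptions: an in-memory SQLite database stands in for the database server, plain function calls stand in for the network, and the account table and function names are hypothetical.

```python
import sqlite3

# Tier 3: database server (an in-memory SQLite database stands in for it).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
db.executemany("INSERT INTO account VALUES (?, ?, ?)", [(1, "Asha", 500), (2, "Ravi", 40)])

# Tier 2: application server, which holds the business logic (constraints).
def withdraw(account_id, amount):
    row = db.execute("SELECT balance FROM account WHERE id = ?", (account_id,)).fetchone()
    if row is None:
        return {"ok": False, "error": "no such account"}
    if amount <= 0 or amount > row[0]:
        return {"ok": False, "error": "invalid amount"}
    db.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, account_id))
    db.commit()
    return {"ok": True, "balance": row[0] - amount}

# Tier 1: client, which only formats requests and replies and never touches the database.
def client_request(account_id, amount):
    reply = withdraw(account_id, amount)
    if reply["ok"]:
        return f"New balance: {reply['balance']}"
    return f"Rejected: {reply['error']}"

print(client_request(1, 200))
print(client_request(2, 100))
```

The client never issues SQL itself; only the middle tier decides which rows are read or changed.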
Database Model
Database Model:
A data model is a collection of conceptual tools for describing data, data relationships, data
semantics and constraints.
 Data model theory is a formal description of how data may be structured and used.
 A data model instance is a practical data model designed for a particular
application.
 In the history of database design, three models have been in use:
 Hierarchical Model
 Network Model
 Relational Model
Hierarchical data model:
The Hierarchical data model organizes data in a tree structure. In this data model, data is
represented by a collection of records and the relationships are represented by links.
 In this model, each entity has only one parent but can have several children.
 At the top of the hierarchy there is only one entity, which is called the root node.
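A tree of this kind can be sketched with a small Python class. The class name and the college example are hypothetical; the point is only that every record has exactly one parent link, so there is exactly one path from the root to any record.

```python
class Node:
    """A record in a hierarchical model: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        # Follow the parent links up to the root, then reverse the order.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


college = Node("College")            # root node
science = Node("Science", college)
commerce = Node("Commerce", college)
physics = Node("Physics", science)

print(physics.path())
```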
Advantages of the hierarchical data model:
 Simplicity: The relationship between the various layers is logically simple.
 Data Security: Data security is provided by the DBMS.
 Data Integrity: There is always a link between the parent segment and the child segment
under it.
 Efficiency: It is very efficient when the database contains a large number of one-to-many
relationships and when the user requires a large number of transactions.
Disadvantages of the hierarchical data model:
 Implementation complexity
 Database management problem
 Lack of structural Independence.
 Operational Anomalies
Network data model:
In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the
network model. In this model, data is represented by a collection of records and the
relationships are represented by links.
 Each record is a collection of fields, each of which contains only one data value. A link is an
association between two records. In the network model, entities are organized in a graph,
in which some entities can be accessed through several paths.
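The idea that a record can be reached through several paths can be sketched with a small graph. The supplier, part and project records below are hypothetical, and the links are treated as two-way associations for simplicity.

```python
from collections import defaultdict

links = defaultdict(set)  # record -> set of records it is linked to

def link(a, b):
    # A link is an association between two records.
    links[a].add(b)
    links[b].add(a)

link("Supplier:Acme", "Part:Bolt")
link("Supplier:Zen", "Part:Bolt")
link("Part:Bolt", "Project:Bridge")
link("Supplier:Acme", "Project:Bridge")

def paths(start, goal, seen=()):
    # Enumerate every loop-free path from start to goal.
    if start == goal:
        return [[goal]]
    found = []
    for nxt in sorted(links[start]):
        if nxt not in seen:
            for rest in paths(nxt, goal, seen + (start,)):
                found.append([start] + rest)
    return found

routes = paths("Supplier:Acme", "Project:Bridge")
for route in routes:
    print(" -> ".join(route))
```

Unlike the tree of the hierarchical model, the project record here is reachable both directly and through the part record.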
Advantages of the network data model:
 It is simple and easy to implement.
 It can handle many relationships within the organization.
 It has better data independence compared to the hierarchical model.
Disadvantages of the network data model:
 The database structure is more complex.
 Lack of structural independence.
Relational Data Model:
E. F. Codd developed the relational data model in 1970. Unlike the hierarchical and network
models, it has no physical links. All data is maintained in the form of tables consisting
of rows and columns.
 Each row (record) represents an entity and each column (field) represents an attribute of the
entity.
 In this model, data is organized in two-dimensional tables called relations.
 The tables or relations are related to each other.
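A relation can be tried out with Python's built-in sqlite3 module. The student table and its rows below are hypothetical; each row is one entity and each column is one attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One relation (table): columns are attributes, rows are entities.
conn.execute("CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", "Physics"), (102, "Ravi", "Commerce"), (103, "Meena", "Physics")],
)

# Rows are found by their values, not by following physical links.
physics_students = conn.execute(
    "SELECT regno, name FROM student WHERE course = ? ORDER BY regno", ("Physics",)
).fetchall()
print(physics_students)
```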
Normalization:
Normalization is a step-by-step process of removing the different kinds of redundancy
and anomalies from the database, one step at a time.
 E. F. Codd introduced it for the relational data model in 1970.
 Normalization rules are divided into normal forms, such as first (1NF), second (2NF) and third (3NF) normal form.
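One such step can be sketched in plain Python. The enrolment rows are hypothetical. The course fee depends on the course rather than on the student, so it is repeated in every row; moving it to its own table removes the redundancy, which is the kind of dependency that third normal form addresses.

```python
# Unnormalized rows: the course fee is repeated for every student on the course.
enrolments = [
    {"regno": 101, "name": "Asha", "course": "Physics", "fee": 1200},
    {"regno": 102, "name": "Ravi", "course": "Commerce", "fee": 900},
    {"regno": 103, "name": "Meena", "course": "Physics", "fee": 1200},
]

# Decompose: facts about a course move to their own table, keyed by course name.
courses = {row["course"]: {"fee": row["fee"]} for row in enrolments}
students = [
    {"regno": row["regno"], "name": row["name"], "course": row["course"]}
    for row in enrolments
]

# A fee change is now a single update instead of one update per student,
# which avoids the update anomaly.
courses["Physics"]["fee"] = 1300

fees = [(s["name"], courses[s["course"]]["fee"]) for s in students]
print(fees)
```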
Data Independence
Data Independence:
The capacity to change the schema at one level without affecting the schema at another level is
called data independence.
 The two types of data independence, with their related topics, are:
 Physical Data Independence
   File Organization
   Data Model
 Logical Data Independence
   Relational Data Model
   Entity Relationship
Physical data independence:
It is the capacity to change the internal level without having to change the
schemas at the conceptual or external levels.
 Changes to the internal schema may be needed because some physical files have to be
reorganized.
 Physical data independence insulates an application from the
physical storage structure only, so it is easier to achieve than logical data independence.
 The topics covered under physical data independence are:
 File Organization
 Database Architecture
 Database Models
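Physical data independence can be observed with a small experiment in Python's sqlite3 module. The table and index names are hypothetical. Adding an index changes how the data is accessed internally, but the query text and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", "Physics"), (102, "Ravi", "Commerce"), (103, "Meena", "Physics")],
)

query = "SELECT name FROM student WHERE course = 'Physics' ORDER BY name"
before = conn.execute(query).fetchall()

# Internal-level change: add an access structure on the course column.
conn.execute("CREATE INDEX idx_student_course ON student (course)")

# The application's query is untouched and returns the same rows.
after = conn.execute(query).fetchall()
print(before == after)
```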
File organization
File organization Methods:
The different file organization methods are:
 Serial File Organization
 Direct Access File Organization
 Index sequential file organization (ISAM)
Serial and direct access file organization differ as follows:
 Serial File Organization:
 Organization is continuous and simple.
 Data processing, which requires the use of all records, is best suited to use this
method.
 Direct Access File Organization
 The type of storage device used is comparatively expensive.
 It is less efficient in the usage of storage space compared to the sequential
organization.
Index sequential file organization (ISAM):
The index sequential file organization is a combination of sequential file organization
and an index file. It is also referred to as ISAM (indexed sequential access method).
 Data is stored physically in adjacent storage locations, and a logical
relationship among the stored data is maintained by an ordering field. An additional file, called the
index file, is created, which contains n records.
 Each record of the index file has two fields:
 The first field is of the same data type as the ordering key field, and
 The second field is a pointer to a disk block (a block address).
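The lookup through an index file can be sketched in Python. This is a simplified model, not a real ISAM implementation: Python lists stand in for disk blocks, list positions stand in for block addresses, and the record values are hypothetical.

```python
import bisect

# Data file: blocks of records stored in order of the key field.
blocks = [
    [(101, "Asha"), (105, "Ravi")],
    [(110, "Meena"), (118, "John")],
    [(121, "Sara"), (130, "Kiran")],
]

# Index file: one entry per block, (first key in the block, block address).
index = [(block[0][0], address) for address, block in enumerate(blocks)]
index_keys = [key for key, _ in index]

def lookup(key):
    # Find the last index entry whose key is <= the search key.
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # the key is smaller than every key in the file
    _, address = index[pos]
    # Scan only the one block the index points to.
    for record_key, value in blocks[address]:
        if record_key == key:
            return value
    return None

print(lookup(118))
```

Only the small index is searched first, so a lookup reads one data block instead of scanning the whole file.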
E-R diagram
Components of E-R model:
An E-R diagram is a visual representation of data that describes how data items are related to each
other.
Entity:
 An Entity can be any object, place, person or class.
Attribute:
 An Attribute describes a property or characteristic of an entity.
 Example: Roll_No, Name and Birth date can be attributes of a student
Relationship:
 A relationship type is a meaningful association between entity types.
 Relationship types are represented on the E-R diagram by a series of lines.
Different notations of E-R diagram:
 Entity: An entity is represented using a rectangle.
 Attribute: Attributes are represented by means of ellipses.
 Relationship: A relationship is represented using a diamond-shaped box.
Relationship:
A relationship describes an association between entities. A relationship is represented using a
diamond-shaped box.
 There are three types of relationship that exist between entities:
 Binary Relationship
 Recursive Relationship
 Ternary Relationship
Binary Relationship:
It is a relationship between two entities.
 This is further divided into three types:
 One to One
 One to Many
 Many to Many
 One to One:
 This type of relationship is rarely seen in the real world.
 For example, one student can enroll for only one course and a
course can have only one student. This is not what you will usually see in a relationship.
 One to Many:
 It reflects a business rule that one entity is associated with many instances of another
entity.
 For example, a Student enrolls for only one Course, but a Course can have many
Students.
 Many to Many:
 It reflects a business rule that many entities are associated with many instances of another
entity.
 For example, many students can each enroll for more than one
course.
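In a relational database, a many-to-many relationship is commonly stored in a third table that pairs the two entities. The sketch below uses Python's sqlite3 module; the table names (including the enrolment table) and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT);
-- Each row pairs one student with one course.
CREATE TABLE enrolment (
    student_id INTEGER REFERENCES student(id),
    course_id INTEGER REFERENCES course(id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO course VALUES (10, 'Physics'), (20, 'Maths');
INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

rows = conn.execute("""
SELECT s.name, c.title
FROM enrolment e
JOIN student s ON s.id = e.student_id
JOIN course c ON c.id = e.course_id
ORDER BY s.name, c.title
""").fetchall()
print(rows)
```

Asha appears with two courses and Physics appears with two students, which is the many-to-many case. A one-to-many relationship would need only a course column in the student table.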
Relational Keys
Keys used in database:
The different types of keys are:
 Primary key:
 It is a field in a table which uniquely identifies each row/record in a database table.
Primary keys must contain unique values.
 A primary key column cannot have NULL values.
 Ex: In the relation STUDENT, Regno serves as the primary key.
 Candidate Key:
 When more than one attribute, or group of attributes, can serve as a unique identifier, each
is called a candidate key.
 Alternate Key:
 The alternate keys of a table are those candidate keys that are not currently
selected as the primary key. An alternate key is also known as a secondary key.
 Foreign key:
 A key used to link two tables together is called a foreign key, also known as a
referencing key.
 A foreign key is a field that matches the primary key column of another table.
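These keys can be tried out with Python's sqlite3 module. The tables below are hypothetical: regno and email are both candidate keys of student, regno is chosen as the primary key, email is therefore an alternate key, and result.regno is a foreign key. Note that SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE student ("
    " regno INTEGER PRIMARY KEY,"   # primary key
    " email TEXT UNIQUE NOT NULL,"  # alternate key (a candidate key not chosen as primary)
    " name TEXT)"
)
conn.execute(
    "CREATE TABLE result ("
    " id INTEGER PRIMARY KEY,"
    " regno INTEGER NOT NULL REFERENCES student(regno),"  # foreign key
    " marks INTEGER)"
)
conn.execute("INSERT INTO student VALUES (101, 'asha@example.com', 'Asha')")
conn.execute("INSERT INTO result (regno, marks) VALUES (101, 78)")

def try_insert(sql):
    # Report whether the DBMS accepted the row or rejected it for violating a key.
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_key = try_insert("INSERT INTO student VALUES (101, 'x@example.com', 'X')")
dangling_ref = try_insert("INSERT INTO result (regno, marks) VALUES (999, 50)")
print(duplicate_key, dangling_ref)
```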
Generalization
Generalization:
Generalization is a bottom-up approach in which two or more lower-level entities combine to
form a higher-level entity.
 In generalization, a number of entities are brought together into one generalized entity
based on their similar characteristics.
 For example, Student and Parent details can both be generalized into a group ‘Person’ holding
personal details.
Specialization
Specialization:
Specialization is a top-down approach in which one higher-level entity is broken
down into two or more lower-level entities.
 Specialization is the opposite of generalization.
 In specialization, a group of entities is divided into sub-groups based on their
characteristics.
 Take a group ‘Person’ for example. A person has a name, date of birth, gender, etc.
In a school database, persons can be specialized as teacher, student, or staff,
based on the role they play in the school.
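The Person example maps naturally onto class inheritance, which is one way to picture both directions. The sketch below is an analogy in Python, not an E-R notation; the attribute names and sample values are hypothetical.

```python
class Person:
    """Generalized entity: attributes shared by every kind of person."""

    def __init__(self, name, date_of_birth):
        self.name = name
        self.date_of_birth = date_of_birth

    def describe(self):
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """Specialized entity: adds attributes that only students have."""

    def __init__(self, name, date_of_birth, regno):
        super().__init__(name, date_of_birth)
        self.regno = regno


class Teacher(Person):
    """Specialized entity: adds attributes that only teachers have."""

    def __init__(self, name, date_of_birth, subject):
        super().__init__(name, date_of_birth)
        self.subject = subject


people = [Student("Asha", "2005-04-12", 101), Teacher("Rao", "1980-09-30", "Physics")]
labels = [p.describe() for p in people]
print(labels)
```

Reading the classes from Student and Teacher up to Person corresponds to generalization; reading from Person down corresponds to specialization.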
Relational Algebra
Relational Algebra:
Relational algebra is a procedural query language that consists of a set of operations
that take one or more relations as input and produce a new relation as output.
 The relational algebraic operations can be divided into:
 Basic set-oriented operations:
 Union, Set difference, Cartesian product
 Relational-oriented operations:
 Selection, Projection, Division, Joins
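Several of these operations can be sketched in plain Python by treating a relation as a list of rows, each row a dictionary. The function names and the student rows are hypothetical, and the Cartesian product here assumes the two relations have no attribute names in common. Division and joins are left out of the sketch.

```python
def select(relation, predicate):
    # Selection: keep the rows that satisfy the condition.
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    # Projection: keep only the named columns and drop duplicate rows.
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def union(r, s):
    # Union: rows that appear in r or in s, without duplicates.
    result = list(r)
    for row in s:
        if row not in result:
            result.append(row)
    return result

def difference(r, s):
    # Set difference: rows of r that do not appear in s.
    return [row for row in r if row not in s]

def cartesian_product(r, s):
    # Cartesian product: every row of r paired with every row of s.
    return [{**a, **b} for a in r for b in s]

students = [
    {"regno": 101, "name": "Asha", "course": "Physics"},
    {"regno": 102, "name": "Ravi", "course": "Commerce"},
    {"regno": 103, "name": "Meena", "course": "Physics"},
]
physics = select(students, lambda row: row["course"] == "Physics")
physics_names = project(physics, ["name"])
course_list = project(students, ["course"])
others = difference(students, physics)
print(physics_names, course_list, others)
```

Each function takes relations as input and returns a new relation, so the operations can be nested, as in the projection of a selection above.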
Data Warehouse
Data Warehouse:
A data warehouse is a repository of an organization's electronically stored data.
Data warehouses are designed to facilitate reporting and data analysis.
 The concept of data warehouses was introduced in the late 1980s.
 The components of data warehouse are:
 Data Source
 Data Transformation
 Reporting
 Metadata
 Additional components are Dependent data marts, Logical Data marts,
Operational Data store.
Data Mining
Data Mining:
Data mining is a process of discovering patterns in large data sets involving
methods at the intersection of machine learning, statistics, and database systems.
 Data mining is the process of finding anomalies, patterns and correlations within
large data sets to predict outcomes.
 Data mining allows you to:
 Sift through all the chaotic and repetitive noise in your data.
 Understand what is relevant and then make good use of
that information to assess likely outcomes.
 Accelerate the pace of making informed decisions.
Questions
Important Questions:
 Define the following database terms:
a. Data Model b. Tuple c. Domain d. Primary key e. Foreign key
 Write the difference between manual and electronic data processing.
 Explain any five applications of database.
 Briefly explain the data processing cycle.
 Write the difference between Hierarchical data model and network data model.
 What is normalization? Explain second normal form with an example.
 What is database model? Explain Hierarchical model.
 Explain 3-level DBMS architecture.
 What is data warehouse? Briefly explain its components.

More Related Content

Dbms

  • 1. DATABASE CONCEPTS Prof. K ADISESHA (Ph. D)
  • 2. Introduction Data Abstraction Architecture of DBMS Data Models Data Warehouse 2DATABASE CONCEPTS Prof. K. Adisesha
  • 3. Introduction Prof. K. Adisesha (Ph. D) 3 Definition:  Data:  Data is a collection of facts, numbers, letters or symbols that the computer process into meaningful information.  Information:  Information is processed data, stored, or transmitted by a computer.  Database:  A Database is a collection of logically related data organized in a way that data can be easily accessed, managed and updated.
  • 4. Introduction Prof. K. Adisesha (Ph. D) 4 Applications of Database:  Banking: For customer information, accounts and loans, and banking transactions.  Colleges: For student information, course registrations and grades.  Credit card transactions: For purchases on credit cards and generation of monthly statements.  Finance: For storing information about holdings, sales and purchases of financial  instruments such as stocks and bonds.  Telecommunication: For keeping records of call made, generating monthly bills, and storing information about the communication networks.  Voter id/Aadhaar database: This is the biggest database in the world storing a data about 60 million people residing in India.  Sales: For customer, product, and purchase information.
  • 5. Introduction Prof. K. Adisesha (Ph. D) 5 Difference between Manual and Computerized data processing: Manual Data Processing Computerized Data Processing • The volume of data, which can be processed, is limited. • The volume of data, which can be processed is large • Requires large quantity of paper • Requires less quantity of paper • Speed and accuracy is executed is limited • Execution is Faster and Accurate • Labor cost is high • Labor cost is low • Storage medium is paper. • Storage medium is Hard disk etc.
  • 6. Data processing cycle Prof. K. Adisesha (Ph. D) 6 Data processing cycle: The order in which information is processed in a computer information management system is called data process cycle.  To design, use and maintain the database, Data processing cycle involves.  Data Collection  Data Input  Data Processing  Data storage  Output  Communication
  • 7. Data processing cycle Prof. K. Adisesha (Ph. D) 7 Data processing cycle: To design, use and maintain the database, many peoples are involved.  Data Collection: It is the process of systematic gathering of data from various sources that has been systematically observed, recorded and organized.  Data Input: The raw data is put into the computer using a keyboard, mouse or other devices such as the scanner, microphone and the digital camera.  Data Processing: Processing is the series of actions or operations on the input data to generate outputs.  Data storage: Data and information should be stored in memory so that it can be accessed later.  Output: The result obtained after processing the data must be presented to the user in user understandable form.  Communication: Computers have communication ability in communication connections, data may be transmitted as an e-mail or posted to the website where the online services are rendered.
  • 8. Features of Database Prof. K. Adisesha (Ph. D) 8 Features or advantages of Database:  Redundancy can be minimized or controlled: In DBMS environment if redundancy is present, then it can be controlled by propagating updates in all the places where ever redundant data is present.  Data Integrity: Data Integrity refers to the correctness of the data in the database. In other words, the data available in the database is reliable data.  Data Sharing: In DBMS, data is stored in the centralized database and all the permitted users can access the same piece of information required at the same time.  Database Security: DBMS provides a variety of security mechanisms for the user to protect his or her data stored in the database.  Supports Concurrent access: DBMS supports concurrent access to the same data stored in the database by applying locking and time stamp mechanisms.
  • 9. Database users Prof. K. Adisesha (Ph. D) 9 Database users: To design, use and maintain the database, many peoples are involved.  The people who work with the database include:  System Analysts  Application programmers  Database Administrators (DBA)  End Users (Database Users)
  • 10. Database users Prof. K. Adisesha (Ph. D) 10 Database users:  System Analysts: System analysts determine the requirement of end users; (especially end users), to create a solution for their business need and focus on non-technical and technical aspects.  Application programmers: These are the computer professionals who implement the specifications given by the system analysts and develop the application programs.  Database Administrators (DBA): DBA is a person who has central control over both data and application. The responsibilities of DBA are authorization access, schema definition and modification, software installation and security enforcement and administration.  Database users: Are those who interact with the database in order to query and update the database, and generate reports.
  • 11. Data Abstraction Prof. K. Adisesha (Ph. D) 11 Data Abstraction: A major purpose of a database system is to provide users with an abstract view of the data.  That is the system hides certain details of how the data are stored and maintained.  There are three level of data abstraction.  Physical Level( Internal level)  Conceptual Level (Logical level)  View Level(External level)
  • 12. Data Abstraction Prof. K. Adisesha (Ph. D) 12 Data Abstraction: Physical Level:  It is the lowest level of abstraction that describes how the data are actually stored.  The physical level describes complex low-level data structures in detail.  It contains the definition of stored record and method of representing the data fields and access aid used.
  • 13. Data Abstraction Prof. K. Adisesha (Ph. D) 13 Data Abstraction: Conceptual Level:  It is the next higher level of abstraction that describes what data are stored in the database and what relationships exist among those data.  It also contains the method of deriving the objects in the conceptual view from the objects in the internal view.
  • 14. Data Abstraction Prof. K. Adisesha (Ph. D) 14 Data Abstraction: View Level:  It is the highest level of abstraction that describes only part of the entire database.  It also contains the method of deriving the objects in the external view from the objects in the conceptual view.
  • 15. DBMS Architecture Prof. K. Adisesha (Ph. D) 15 DBMS Architecture: The design of Database Management System highly depends on its architecture:  It can be centralized or decentralized or hierarchical.  Database architecture is logically divided into three types.  Logical one-tier in 1-tier Architecture  Logical two-tier Client/Server Architecture.  Logical three-tier Client/Server Architecture.
  • 16. DBMS Architecture Prof. K. Adisesha (Ph. D) 16 Logical one-tier in 1-tier Architecture: DBMS is the only entity where user directly sits on DBMS and uses it.  Any changes done here will directly be on DBMS itself.  It does not provide handy tools for end users and preferably database designers and programmers use single tier architecture
  • 17. DBMS Architecture Prof. K. Adisesha (Ph. D) 17 Logical two-tier Client/Server Architecture: Two-tier Client / Server architecture is used for User Interface program and Application Programs that runs on client side.  An interface called ODBC (Open Database Connectivity) provides an API that allows client side program to call the DBMS.  Most DBMS vendors provide ODBC drivers.  A client program may connect to several DBMS's.
  • 18. DBMS Architecture Prof. K. Adisesha (Ph. D) 18 Logical three-tier Client/Server Architecture: Three-tier Client / Server database architecture is commonly used architecture for web applications. Intermediate layer called Application server or Web Server stores .  The web connectivity software and the business logic (constraints) part of application used to access the right amount of data from the database server.  This layer acts like medium for sending partially processed data between the database server and the client.
  • 19. Database Model Prof. K. Adisesha (Ph. D) 19 Database Model: Data model is a collection of conceptual tools for describing data, data relationship, data semantics and constraints.  Data model theory, which is a formal description of how data may be structured and used.  Data model instance, which is a practical data model designed for a particular application.  In history of database design, three models have been in use.  Hierarchical Model  Network Model  Relational Model
  • 20. Database Model Prof. K. Adisesha (Ph. D) 20 Hierarchical data model: The Hierarchical data model organizes data in a tree structure. In this data model, data is represented by a collection of records and the relationships are represented by links.  In this model, each entity has only one parent but can have several children.  At the top of hierarchy, there is only one entity, which is called Root node.
  • 21. Database Model Prof. K. Adisesha (Ph. D) 21 Hierarchical data model: Advantages:  Simplicity: The relationship between the various layers is logically simple.  Data Security: The data security is provided by the DBMS.  Data Integrity: There is always link between the parent segment and the child segment under it.  Efficiency: It is very efficient because when the database contains a large number of one to many relationships and when the user requires large number of transaction.
  • 22. Database Model Prof. K. Adisesha (Ph. D) 22 Hierarchical data model: Disadvantages:  Implementation complexity  Database management problem  Lack of structural Independence.  Operational Anomalies
  • 23. Database Model Prof. K. Adisesha (Ph. D) 23 Network data model: In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the network models. In this model, data is represented by a collection of records and the relationships are represented by links.  Each record is collection of fields, which contains only one data value. A link is an association between two records. In the network model, entities are organized in a graph, in which some entities can be accessed through several paths.
  • 24. Database Model Prof. K. Adisesha (Ph. D) 24 Network data model: Advantages:  It is simple and easy to implement.  It can handle many relationships within the organization.  It has better data independence compared to hierarchical model. Disadvantages:  More complex system of database structure  Lack of structural dependence.
  • 25. Database Model Prof. K. Adisesha (Ph. D) 25 Relation Data Model: E.F Codd developed the relation data model in 1970. Unlike, hierarchical and network model, there are no physical links. All data is maintained in the form of tables consisting of rows and columns.  Each row (record) represents an entity and a column (field) represents an attribute of the entity.  In this model, data is organized in two-dimensional tables called relations.  The tables or relations are related to each other.
  • 26. Database Model Prof. K. Adisesha (Ph. D) 26 Normalization: Normalization is a step by step process of removing the different kinds of redundancy and anomaly one step at a time from the database.  E.F Codd developed for the relation data model in 1970.  Normalization rules are divided into following normal form:
  • 27. Database Model Prof. K. Adisesha (Ph. D) 27 Normalization: Normalization is a step by step process of removing the different kinds of redundancy and anomaly one step at a time from the database.
  • 28. Data Independence Prof. K. Adisesha (Ph. D) 28 Data Independence: The capacity to change data at one layer does not affect the data at another layer is called data independence.  Two types of data independence are  Physical Data Independence  File Organization  Data Model  Logical Data Independence  Relational Data Model  Entity Relationship
  • 29. Data Independence Prof. K. Adisesha (Ph. D) 29 Physical data independence: It is the capacity to change the internal level without having to change the schemas at either the conceptual or the external level.  Changes to the internal schema may be needed because some physical files have to be reorganized.  Physical data independence insulates an application from the physical storage structure only, so it is easier to achieve than logical data independence.  The topics covered under physical data independence are:  File Organization  Database Architecture  Database Models
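Physical data independence can be observed with a small experiment, sketched here using Python's `sqlite3` module. Adding an index changes how the data is stored and searched internally, yet the application's query and its result stay the same. The table, index name, and data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (regno INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(101, "Asha"), (102, "Ravi"), (103, "Meena")],
)

# The application's query: it never mentions how the data is stored.
QUERY = "SELECT name FROM student WHERE regno = ?"

before = conn.execute(QUERY, (102,)).fetchall()

# Change only the internal (physical) level: add an index on regno.
conn.execute("CREATE INDEX idx_student_regno ON student (regno)")

# The same query, unchanged, returns the same result.
after = conn.execute(QUERY, (102,)).fetchall()
```

The query text is untouched by the storage change, which is the insulation the slide describes.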
  • 30. File organization Prof. K. Adisesha (Ph. D) 30 File organization Methods: The different file organization methods are:  Serial File Organization  Direct Access File Organization  Index sequential file organization (ISAM)
  • 31. File organization Prof. K. Adisesha (Ph. D) 31 File organization Methods: The differences between serial and direct access file organization are:  Serial File Organization:  Organization is continuous and simple.  It is best suited to data processing that requires the use of all records.  Direct Access File Organization:  The type of storage device used is comparatively expensive.  It is less efficient in the usage of storage space compared to serial organization.
  • 32. File organization Prof. K. Adisesha (Ph. D) 32 Index sequential file organization (ISAM): The index sequential file organization is a combination of sequential file organization and an index file. It is also referred to as ISAM (indexed sequential access method).  Data is stored physically in adjacent storage locations, and a logical relationship among the stored data is maintained through an ordering field. An additional file, called the index file, is created; it contains n records.  Each record of the index file has two fields:  The first field is of the same data type as the ordering key field, and  The second field is a pointer to a disk block (a block address).
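The two-field index record described above can be sketched as follows. This is a simplified in-memory model, assuming one index entry per block (the first key stored in that block) and using list positions as stand-ins for disk block addresses; the `blocks`, `index`, and `lookup` names are hypothetical.

```python
from bisect import bisect_right

# Data file: blocks of records stored in order of the ordering key.
blocks = [
    [(101, "Asha"), (104, "Ravi")],
    [(110, "Meena"), (115, "John")],
    [(120, "Sara"), (131, "Kiran")],
]

# Index file: one record per block, holding
# (first ordering-key value in the block, block address).
index = [(block[0][0], address) for address, block in enumerate(blocks)]
index_keys = [key for key, _ in index]


def lookup(key):
    """Find a record by searching the index, then scanning a single block."""
    # Last index entry whose key is <= the search key.
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every stored key
    _, address = index[pos]
    for record_key, value in blocks[address]:
        if record_key == key:
            return value
    return None
```

For example, `lookup(115)` searches the small index first and then reads only the second block, instead of scanning the whole file serially.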
  • 33. E-R diagram Prof. K. Adisesha (Ph. D) 33 Components of E-R model: An E-R diagram is a visual representation of data that describes how data items are related to each other. Entity:  An entity can be any object, place, person or class. Attribute:  An attribute describes a property or characteristic of an entity.  Example: Roll_No, Name and Birth date can be attributes of a student. Relationship:  A relationship type is a meaningful association between entity types.  Relationship types are represented on the E-R diagram by a series of lines.
  • 34. E-R diagram Prof. K. Adisesha (Ph. D) 34 Different notations of E-R diagram: The notations used in an E-R diagram are:  Entity: An entity is represented using a rectangle.  Attribute: Attributes are represented by means of ellipses.  Relationship: A relationship is represented using a diamond-shaped box.
  • 35. E-R diagram Prof. K. Adisesha (Ph. D) 35 Relationship: A relationship describes an association between entities and is represented using a diamond-shaped box.  There are three types of relationship that exist between entities:  Binary Relationship  Recursive Relationship  Ternary Relationship
  • 36. E-R diagram Prof. K. Adisesha (Ph. D) 36 Binary Relationship: It means a relationship between two entities.  This is further divided into three types:  One to One  One to Many  Many to Many  One to One:  This type of relationship is rarely seen in the real world.  For example, one student can enroll in only one course and a course will have only one student. This is not what you will usually see in a relationship.
  • 37. E-R diagram Prof. K. Adisesha (Ph. D) 37 Binary Relationship:  One to Many:  It reflects the business rule that one entity is associated with many instances of another entity.  For example, a Student enrolls in only one Course, but a Course can have many Students.  Many to Many:  It reflects the business rule that many instances of one entity are associated with many instances of another entity.  For example, a student can enroll in more than one course, and a course can have many students.
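In a relational database, a many-to-many relationship is commonly stored in a separate junction table. The sketch below uses Python's `sqlite3` module; the `student`, `course`, and `enrollment` tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT);

    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollment (
        regno     INTEGER REFERENCES student(regno),
        course_id TEXT    REFERENCES course(course_id),
        PRIMARY KEY (regno, course_id)
    );

    INSERT INTO student VALUES (101, 'Asha'), (102, 'Ravi');
    INSERT INTO course  VALUES ('C1', 'Databases'), ('C2', 'Networks');
    INSERT INTO enrollment VALUES (101, 'C1'), (101, 'C2'), (102, 'C1');
""")

# One student can be enrolled in many courses...
courses_of_asha = [r[0] for r in conn.execute(
    "SELECT course_id FROM enrollment WHERE regno = 101 ORDER BY course_id")]

# ...and one course can have many students.
students_in_c1 = [r[0] for r in conn.execute(
    "SELECT regno FROM enrollment WHERE course_id = 'C1' ORDER BY regno")]
```

A one-to-many relationship needs no junction table: a single `course_id` column in `student` would be enough, since each student would then belong to exactly one course.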
  • 38. Relational Keys Prof. K. Adisesha (Ph. D) 38 Keys used in database: The different types of keys are:  Primary key:  It is a field that uniquely identifies each row/record in a database table. Primary keys must contain unique values.  A primary key column cannot have NULL values.  Ex: In the relation STUDENT, Regno serves as a primary key.  Candidate Key:  When more than one attribute, or group of attributes, can serve as a unique identifier, each of them is called a candidate key.
  • 39. Relational Keys Prof. K. Adisesha (Ph. D) 39 Keys used in database: The different types of keys are:  Alternate Key:  The alternate keys of a table are those candidate keys which are not currently selected as the primary key. An alternate key is also known as a secondary key.  Foreign key:  A key used to link two tables together is called a foreign key, also called a referencing key.  A foreign key is a field that matches the primary key column of another table.
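The behaviour of primary, alternate, and foreign keys can be checked with a short sketch using Python's `sqlite3` module. The tables, the `email` column used as an alternate key, and the `try_insert` helper are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# regno is the primary key; email is a candidate key not chosen as
# primary key, so it acts as an alternate key (declared UNIQUE).
conn.execute(
    "CREATE TABLE student ("
    "regno INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL, name TEXT)"
)
# marks.regno is a foreign key referencing the primary key of student.
conn.execute(
    "CREATE TABLE marks ("
    "regno INTEGER NOT NULL REFERENCES student(regno), subject TEXT, score INTEGER)"
)
conn.execute("INSERT INTO student VALUES (101, 'asha@example.com', 'Asha')")
conn.execute("INSERT INTO marks VALUES (101, 'DBMS', 88)")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_primary_key = try_insert(
    "INSERT INTO student VALUES (101, 'x@example.com', 'X')")
duplicate_alternate_key = try_insert(
    "INSERT INTO student VALUES (102, 'asha@example.com', 'Y')")
dangling_foreign_key = try_insert(
    "INSERT INTO marks VALUES (999, 'DBMS', 50)")
```

All three inserts are rejected: the first repeats a primary key value, the second repeats an alternate key value, and the third refers to a student that does not exist.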
  • 40. Generalization Prof. K. Adisesha (Ph. D) 40 Generalization: Generalization is a bottom-up approach in which two or more lower-level entities combine to form a higher-level entity.  In generalization, a number of entities are brought together into one generalized entity based on their similar characteristics.  For example, Student and Parent details can be generalized as a group ‘Person’ holding personal details.
  • 41. Specialization Prof. K. Adisesha (Ph. D) 41 Specialization: Specialization is a top-down approach in which one higher-level entity is broken down into two or more lower-level entities.  Specialization is the opposite of generalization.  In specialization, a group of entities is divided into sub-groups based on their characteristics.  Take a group ‘Person’ for example. A person has a name, date of birth, gender, etc. In a school database, persons can be specialized as teacher, student, or staff, based on the role they play in the school.
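Generalization and specialization are E-R modelling concepts, but class inheritance in a programming language gives a rough analogy. The sketch below assumes the ‘Person’ example from the slides; the specific attributes `roll_no` and `subject` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Higher-level entity: attributes shared by every person."""
    name: str
    date_of_birth: str
    gender: str


@dataclass
class Student(Person):
    """Specialized entity: a person with student-only attributes."""
    roll_no: int


@dataclass
class Teacher(Person):
    """Specialized entity: a person with teacher-only attributes."""
    subject: str


people = [
    Student("Asha", "2005-04-12", "F", roll_no=17),
    Teacher("Ravi", "1980-09-30", "M", subject="DBMS"),
]

# Every specialized entity can still be treated as a Person.
names = [p.name for p in people]
```

Reading the class hierarchy from `Student` and `Teacher` up to `Person` corresponds to generalization; reading it from `Person` down corresponds to specialization.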
  • 42. Relational Algebra Prof. K. Adisesha (Ph. D) 42 Relational Algebra: Relational algebra is a procedural query language that consists of a set of operations that take one or more relations as input and produce a new relation as output.  The relational algebraic operations can be divided into:  Basic set-oriented operations:  Union, Set difference, Cartesian product  Relational-oriented operations:  Selection, Projection, Division, Joins
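A few of these operations can be sketched in Python by modelling a relation as a pair of (attribute names, set of rows). This representation and the function names are assumptions made for the example; division and joins are omitted to keep the sketch short.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    attrs, rows = relation
    return attrs, {row for row in rows if predicate(dict(zip(attrs, row)))}


def project(relation, wanted):
    """Projection: keep only the wanted columns (duplicates vanish in a set)."""
    attrs, rows = relation
    positions = [attrs.index(a) for a in wanted]
    return tuple(wanted), {tuple(row[i] for i in positions) for row in rows}


def union(r, s):
    """Union: rows that appear in either relation (same attributes required)."""
    if r[0] != s[0]:
        raise ValueError("relations must have the same attributes")
    return r[0], r[1] | s[1]


def difference(r, s):
    """Set difference: rows of r that do not appear in s."""
    if r[0] != s[0]:
        raise ValueError("relations must have the same attributes")
    return r[0], r[1] - s[1]


def product(r, s):
    """Cartesian product: every row of r paired with every row of s."""
    return r[0] + s[0], {a + b for a in r[1] for b in s[1]}


student = (
    ("regno", "name", "course"),
    {(101, "Asha", "BCA"), (102, "Ravi", "BSc"), (103, "Meena", "BCA")},
)
labs = (("lab",), {("L1",), ("L2",)})

bca = select(student, lambda t: t["course"] == "BCA")
bca_names = project(bca, ["name"])
not_bca = difference(student, bca)
everyone = union(bca, not_bca)
pairs = product(bca_names, labs)
```

Each function takes relations as input and returns a new relation, so operations can be chained, as in `project(select(...), ...)`.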
  • 43. Data Warehouse Prof. K. Adisesha (Ph. D) 43 Data Warehouse: A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and to support data analysis.  The concept of data warehouses was introduced in the late 1980s.  The components of a data warehouse are:  Data Source  Data Transformation  Reporting  Metadata  Additional components are Dependent data marts, Logical data marts, and Operational data store.
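How the four components fit together can be sketched as a tiny pipeline in Python. This is a toy model under stated assumptions: the sales records, the cleaning rules in `transform`, and the `report_by_region` function are all hypothetical, and a real warehouse would use dedicated storage and tooling.

```python
from collections import defaultdict

# Data source: raw operational records with inconsistent formatting.
source_rows = [
    {"date": "2024-01-05", "region": "south ", "amount": "120.50"},
    {"date": "2024-01-09", "region": "North", "amount": "80"},
    {"date": "2024-02-01", "region": "South", "amount": "40.25"},
]


def transform(row):
    """Data transformation: clean and standardize one source record."""
    return {
        "month": row["date"][:7],
        "region": row["region"].strip().title(),
        "amount": float(row["amount"]),
    }


# The warehouse holds the cleaned, analysis-ready rows.
warehouse = [transform(r) for r in source_rows]

# Metadata: data about the stored data.
metadata = {
    "source": "sales",
    "row_count": len(warehouse),
    "columns": sorted(warehouse[0]),
}


def report_by_region(rows):
    """Reporting: total sales amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


report = report_by_region(warehouse)
```

The transformation step is what lets `"south "` and `"South"` be counted as the same region in the report.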
  • 44. Data Mining Prof. K. Adisesha (Ph. D) 44 Data Mining: Data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.  Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes.  Data mining allows you to:  Sift through all the chaotic and repetitive noise in your data.  Understand what is relevant and then make good use of that information to assess likely outcomes.  Accelerate the pace of making informed decisions.
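As a very small illustration of finding anomalies in a data set, the sketch below flags values that lie far from the mean. It assumes a simple z-score rule with a threshold of 2 standard deviations, which is only one of many possible techniques; the order counts are made-up sample data.

```python
from statistics import mean, pstdev

# Hypothetical data: number of orders per day, with one unusual day.
daily_orders = [10, 12, 11, 13, 12, 11, 95]


def find_anomalies(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


anomalies = find_anomalies(daily_orders)
```

With this data only the day with 95 orders is flagged; the other days stay well within the threshold.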
  • 45. Questions Important Questions:  Define the following database terms: a. Data Model b. Tuple c. Domain d. Primary key e. Foreign key  Write the difference between manual and electronic data processing.  Explain any five applications of database.  Briefly explain the data processing cycle.  Write the difference between Hierarchical data model and network data model.  What is normalization? Explain second normal form with an example.  What is database model? Explain Hierarchical model.  Explain 3-level DBMS architecture.  What is data warehouse? Briefly explain its components.