Evolution of Database Management Systems

                      
 RELATIONAL
                         DATABASE
                       MANAGEMENT
                             SYSTEM
1950s file systems, punched cards
1960s hierarchical 
 
IMS
  
1970s network 
  
CODASYL, IDMS
1980s relational 
  
INGRES, ORACLE, DB2, Sybase
Paradox, dBase
  
1990s object oriented and object
relational
O2, GemStone, Ontos
Sets
collections of items of the same type
no order
no duplicates
Mappings
domain
range
1:many
many:1
1:1
many:many
Database  
A database consists of a collection of interrelated data.
R
D
B
M
S
 
A
 
r
e
l
a
t
i
o
n
a
l
 
d
a
t
a
b
a
s
e
 
m
a
n
a
g
e
m
e
n
t
 
s
y
s
t
e
m
(
R
D
B
M
S
)
 
i
s
 
a
 
d
a
t
a
b
a
s
e
 
e
n
g
i
n
e
/
s
y
s
t
e
m
 
b
a
s
e
d
 
o
n
 
t
h
e
r
e
l
a
t
i
o
n
a
l
 
m
o
d
e
l
 
s
p
e
c
i
f
i
e
d
 
b
y
 
E
d
g
a
r
 
F
.
 
C
o
d
d
-
-
t
h
e
 
f
a
t
h
e
r
 
o
f
m
o
d
e
r
n
 
r
e
l
a
t
i
o
n
a
l
 
d
a
t
a
b
a
s
e
 
d
e
s
i
g
n
-
-
i
n
 
1
9
7
0
.
A
 
r
e
l
a
t
i
o
n
a
l
 
d
a
t
a
b
a
s
e
 
r
e
f
e
r
s
 
t
o
 
a
 
d
a
t
a
b
a
s
e
 
t
h
a
t
 
s
t
o
r
e
s
 
d
a
t
a
i
n
 
a
 
s
t
r
u
c
t
u
r
e
d
 
f
o
r
m
a
t
,
 
u
s
i
n
g
 
r
o
w
s
 
a
n
d
 
c
o
l
u
m
n
s
.
Relation 
a subset of the cartesian product of its domains. Given a
relation schema R, a relation on that schema r, a set of attributes A
1
..A
n
for that relation then
   
r(R) 
 
(dom(A
1
) 
 dom(A
2
) 
 ... 
 dom(A
n
))
Relation Schema
 denoted by R(A
1
, A
2
, …, A
n
), is made up of relation
name R and list of attributes A
1
, A
2
, …, A
n
.
(N)-tuple 
a set of (n) attribute-value pairs representing a single instance
of a relation’s mapping between its domains.
Degree 
the number of attributes a relation has.
Cardinality 
a number of tuples a relation has.
Attribute 
a function on a domain for each instance of the mapping or
tuple
Attribute Value 
the result of the attribute function. Each instance of
the mapping is represented by one attribute value drawn from each
domain or a special NULL value. Given a tuple t and an attribute A for
a relation r, t[A]--> a, where a is the attribute’s value for that tuple.
Domain 
set of all possible values for an attribute; for attribute A, the
domain is represented as dom(A). A domain has a format and a base
data type.
 
SuperKey
a 
set
 of attributes whose values 
together
 
uniquely
 identify a
tuple in a relation
Candidate Key
a superkey for which no proper subset is a superkey…a key
that is 
minimal
 . 
Can be more than one for a relation
Primary Key
a candidate key chosen to be the main key for the relation.   
One for each relation
Keys can be 
composite
Foreign Key
     A foreign key is a column or group of columns in a relational
database table that provides a link between data in two
tables. It acts as a cross-reference between tables because it
references the primary key of another table, thereby
establishing a link between them
.
 
 
Domain Constraints
Referential Integrity
Assertions
Triggers
Functional Dependencies
Integrity constraints guard against accidental damage to
the database, by ensuring that authorized changes
to the database do not result in a loss of data consistency.
Functional dependencies (FDs) are used to
specify 
formal measures
  of the "goodness" of
relational designs
FDs and keys are used to define 
normal forms
for relations
FDs are 
constraints
 that are derived from the
meaning
  and 
interrelationships
  of the data
attributes
 
A set of attributes X 
functionally determines
  a
set of attributes Y if the value of X determines a
unique value for Y
X 
Y holds if whenever two tuples have the same
value for X, they 
must have
  the same value for Y
If
  t1[X]=t2[X], 
then
  t1[Y]=t2[Y] 
in any relation
instance r(R)
X 
 
Y in R specifies a 
constraint
 on all relation
instances r(R)
FDs are derived from the real-world constraints
on the attributes
Given a set of FDs F, we can 
infer
 additional
FDs that hold whenever the FDs in F hold
Armstrong's inference rules
A1. (Reflexive) If Y 
subset-of
 X, then X 
 
Y
A2. (Augmentation) If X 
 
Y, then XZ 
 
YZ
  
 (Notation: XZ stands for X 
U
 Z)
A3. (Transitive) If X 
 
Y and Y 
 
Z, then X 
 
Z
A1, A2, A3 form a 
sound
  and
 complete
  set
of inference rules
Normalization
: Process of decomposing
unsatisfactory "bad" relations by breaking
up their attributes into smaller relations
Normal form
: Condition using keys and FDs
of a relation to certify whether a relation
schema is in a particular normal form
2NF, 3NF, BCNF based on keys and FDs of a
relation schema
4NF based on keys, multi-valued dependencies
91.2914
12
F
i
r
s
t
 
N
o
r
m
a
l
 
F
o
r
m
W
e
 
s
a
y
 
a
 
r
e
l
a
t
i
o
n
 
i
s
 
i
n
 
1
N
F
 
i
f
 
a
l
l
 
v
a
l
u
e
s
 
s
t
o
r
e
d
 
i
n
 
t
h
e
 
r
e
l
a
t
i
o
n
 
a
r
e
 
s
i
n
g
l
e
-
v
a
l
u
e
d
 
a
n
d
 
a
t
o
m
i
c
.
1NF places restrictions on the structure of relations. Values must be
simple.
91.2914
13
T
h
e
 
f
o
l
l
o
w
i
n
g
 
i
n
 
n
o
t
 
i
n
 
1
N
F
EmpNum
EmpPhone
EmpDegrees
123
233-9876
333
233-1231
BA, BSc, PhD
679
233-1231
BSc, MSc
EmpDegrees is a multi-valued field:
employee 679 has two degrees: 
BSc
 and 
MSc
employee 333 has three degrees:
 BA, BSc, PhD
91.2914
14
S
e
c
o
n
d
 
N
o
r
m
a
l
 
F
o
r
m
A relation is in 
2NF
 if it is in 1NF, and every non-key attribute is fully
dependent on each candidate key. (That is, we don’t have any partial
functional dependency.)
 
2NF (and 3NF) both involve the concepts of key and 
 
non-key attributes.
 A 
key attribute
 is any attribute that is part of a key; 
 
any attribute that is not
a key attribute, is a 
non-key 
 
attribute
. 
 Relations that are not in BCNF have data redundancies
 A relation in 2NF will not have any partial dependencies
91.2914
15
T
h
i
r
d
 
N
o
r
m
a
l
 
F
o
r
m
A relation is in 
3NF
 if the relation is in 2NF and all determinants
of 
non-key
 attributes are candidate keys
 
That is, for any functional dependency: X 
 Y, where Y is a non-
key attribute (or a set of non-key attributes), X is a candidate
key.
This definition of 3NF differs from BCNF only in the specification
of non-key attributes - 3NF is weaker than BCNF. (BCNF requires
all determinants to be candidate keys.)
A relation in 3NF will not have any transitive dependencies
 
of non-key attribute on a candidate key through another non-
key attribute.
91.2914
16
B
o
y
c
e
-
C
o
d
d
 
N
o
r
m
a
l
 
F
o
r
m
BCNF is defined very simply:
a relation is in BCNF if it is in 1NF and if every determinant is a
candidate key.
If our database will be used for OLTP (on line transaction
processing), then BCNF is our target. Usually, we meet this
objective. However, we might denormalize (3NF, 2NF, or 1NF) for
performance reasons.
91.2914
17
Formalism for creating new relations from
existing ones
Its place in the big picture:
Declartive
query
language
Algebra
Implementation
SQL,
relational calculus
Relational algebra
Relational bag algebra
Five operators:
Union: 
Difference: -
Selection:

Projection: 
Cartesian Product: 
Derived or auxiliary operators:
Intersection, complement
Joins (natural,equi-join, theta join, semi-join)
Renaming:

R1 
 R2
Example:
ActiveEmployees 
 RetiredEmployees
R1 – R2
Example:
AllEmployees -- RetiredEmployees
It is a derived operator
R1 
 R2 = R1 – (R1 – R2)
Also expressed as a join (will see later)
Example
UnionizedEmployees 
 RetiredEmployees
Returns all tuples which satisfy a condition
Notation: 

c
(R)
Examples
 
Salary > 40000
 
(Employee)
 
name = “Smith”
 
(Employee)
The condition c can be =, <, 
, >,
 
, <>
Salary > 40000
 
(Employee)
Eliminates columns, then removes
duplicates
Notation:   

A1,…,An
 
(R)
Example: project social-security number
and names:
  
 
SSN, Name
 (Employee)
Output schema:   Answer(SSN, Name)
 
Name,Salary
 (Employee)
Each tuple in R1 with each tuple in R2
Notation: R1 
 R2
Example:
Employee 
 Dependents
Very rare in practice; mainly used to express
joins
 
Notation: R1 |
|
 R2
Meaning:  R1 
|
|
 R2 = 
A
(
C
(R1 
 R2))
Where:
The selection 
C 
checks equality of all common
attributes
The projection eliminates the duplicate common
attributes
R=                                S=
R |
|
 S=
A join that involves a predicate
R1 |
|
 
 R2   =  
 
 (R1 
 R2)
Here 

can be any condition
Eq-join
A theta join where 

is an equality
R1 |
|
 
A=B
 R2   =  

A=B
 (R1 
 R2)
Example:
Employee |
|
 
SSN=SSN
 Dependents
Most useful join in practice
R |
 S  = 
 
A1,…,An
 (R |
|
 S)
Where A
1
, …, A
n
 are the attributes in R
Example:
Employee |
 Dependents
Cannot compute “transitive closure”
Find all direct and indirect relatives of Fred
Cannot express in RA !!!  Need to write C
program
Slide Note
Embed
Share

The evolution of Database Management Systems (DBMS) began with file systems and punched cards in the 1950s, followed by hierarchical and network models in the 1960s and 1970s. The 1980s introduced relational databases like Ingres, Oracle, DB2, and Sybase. The 1990s saw the rise of object-oriented and object-relational databases. A DBMS consists of interrelated data stored in a structured format using rows and columns. It utilizes relational models with attributes, tuples, keys, and integrity constraints to maintain data consistency.

  • Database Management Systems
  • DBMS Evolution
  • Relational Databases
  • Integrity Constraints
  • Data Consistency

Uploaded on Nov 12, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. RELATIONAL RELATIONAL DATABASE DATABASE MANAGEMENT MANAGEMENT SYSTEM SYSTEM

  2. 1950s file systems, punched cards 1960s hierarchical IMS 1970s network CODASYL, IDMS 1980s relational INGRES, ORACLE, DB2, Sybase Paradox, dBase 1990s object oriented and object relational O2, GemStone, Ontos

  3. Sets collections of items of the same type no order no duplicates Mappings domain range 1:many many:1 1:1 many:many

  4. Database A database consists of a collection of interrelated data. RDBMS A relational database management system (RDBMS) is a database engine/system based on the relational model specified by Edgar F. Codd--the father of modern relational database design--in 1970. A relational database refers to a database that stores data in a structured format, using rows and columns. Relation a subset of the cartesian product of its domains. Given a relation schema R, a relation on that schema r, a set of attributes A1..An for that relation then r(R) (dom(A1) dom(A2) ... dom(An)) Relation Schema denoted by R(A1, A2, , An), is made up of relation name R and list of attributes A1, A2, , An.

  5. (N)-tuple a set of (n) attribute-value pairs representing a single instance of a relation s mapping between its domains. Degree the number of attributes a relation has. Cardinality a number of tuples a relation has. Attribute a function on a domain for each instance of the mapping or tuple Attribute Value the result of the attribute function. Each instance of the mapping is represented by one attribute value drawn from each domain or a special NULL value. Given a tuple t and an attribute A for a relation r, t[A]--> a, where a is the attribute s value for that tuple. Domain set of all possible values for an attribute; for attribute A, the domain is represented as dom(A). A domain has a format and a base data type.

  6. SuperKey a set of attributes whose values togetheruniquely identify a tuple in a relation Candidate Key a superkey for which no proper subset is a superkey a key that is minimal . Can be more than one for a relation Primary Key a candidate key chosen to be the main key for the relation. One for each relation Keys can be composite Foreign Key A foreign key is a column or group of columns in a relational database table that provides a link between data in two tables. It acts as a cross-reference between tables because it references the primary key of another table, thereby establishing a link between them.

  7. Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency. Domain Constraints Referential Integrity Assertions Triggers Functional Dependencies

  8. Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs FDs and keys are used to define normal forms for relations FDs are constraints meaning and interrelationships of the data attributes normal forms constraints that are derived from the

  9. A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y X Y holds if whenever two tuples have the same value for X, they must have the same value for Y If t1[X]=t2[X], then t1[Y]=t2[Y] in any relation instance r(R) X Y in R specifies a constraint on all relation instances r(R) FDs are derived from the real-world constraints on the attributes

  10. Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold Armstrong's inference rules A1. (Reflexive) If Y subset-of X, then X Y A2. (Augmentation) If X Y, then XZ YZ (Notation: XZ stands for X U Z) A3. (Transitive) If X Y and Y Z, then X Z A1, A2, A3 form a sound and complete set of inference rules

  11. Normalization unsatisfactory "bad" relations by breaking up their attributes into smaller relations Normal form of a relation to certify whether a relation schema is in a particular normal form 2NF, 3NF, BCNF based on keys and FDs of a relation schema 4NF based on keys, multi-valued dependencies Normalization: Process of decomposing Normal form: Condition using keys and FDs

  12. First Normal Form We say a relation is in 1NF if all values stored in the relation are single- valued and atomic. 1NF places restrictions on the structure of relations. Values must be simple. 12 91.2914

  13. The following in not in 1NF EmpNum 123 333 679 EmpNum EmpPhone 233-9876 233-1231 BA, BSc, PhD 233-1231 BSc, MSc EmpPhone EmpDegrees EmpDegrees EmpDegrees is a multi-valued field: employee 679 has two degrees: BSc and MSc employee 333 has three degrees: BA, BSc, PhD 13 91.2914

  14. Second Normal Form A relation is in 2NF dependent on each candidate key. (That is, we don t have any partial functional dependency.) 2NF if it is in 1NF, and every non-key attribute is fully 2NF (and 3NF) both involve the concepts of key and A key attribute is any attribute that is part of a key; a key attribute, is a non-key attribute. Relations that are not in BCNF have data redundancies A relation in 2NF will not have any partial dependencies non-key attributes. any attribute that is not 14 91.2914

  15. Third Normal Form A relation is in 3NF of non-key attributes are candidate keys That is, for any functional dependency: X Y, where Y is a non- key attribute (or a set of non-key attributes), X is a candidate key. This definition of 3NF differs from BCNF only in the specification of non-key attributes - 3NF is weaker than BCNF. (BCNF requires all determinants to be candidate keys.) A relation in 3NF will not have any transitive dependencies of non-key attribute on a candidate key through another non- key attribute. 3NF if the relation is in 2NF and all determinants 15 91.2914

  16. Boyce-Codd Normal Form BCNF is defined very simply: a relation is in BCNF if it is in 1NF and if every determinant is a candidate key. If our database will be used for OLTP (on line transaction processing), then BCNF is our target. Usually, we meet this objective. However, we might denormalize (3NF, 2NF, or 1NF) for performance reasons. 16 91.2914

  17. In 3NF, but not in BCNF: In 3NF, but not in BCNF: Instructor teaches one course only. student_no course_no instr_no Student takes a course and has one instructor. {student_no, course_no} instr_no instr_no course_no since we have instr_no course-no, but instr_no is not a Candidate key. 17 91.2914

  18. Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language Algebra Implementation Relational algebra Relational bag algebra SQL, relational calculus

  19. Five operators: Union: Difference: - Selection: Projection: Cartesian Product: Derived or auxiliary operators: Intersection, complement Joins (natural,equi-join, theta join, semi-join) Renaming:

  20. R1 R2 Example: ActiveEmployees RetiredEmployees R1 R2 Example: AllEmployees -- RetiredEmployees

  21. It is a derived operator R1 R2 = R1 (R1 R2) Also expressed as a join (will see later) Example UnionizedEmployees RetiredEmployees

  22. Returns all tuples which satisfy a condition Notation: c(R) Examples Salary > 40000(Employee) name = Smith (Employee) The condition c can be =, <, , >, , <>

  23. SSN 1234545 5423341 4352342 Name John Smith Fred Salary 200000 600000 500000 Salary > 40000(Employee) SSN 5423341 4352342 Name Smith Fred Salary 600000 500000

  24. Eliminates columns, then removes duplicates Notation: A1, ,An(R) Example: project social-security number and names: SSN, Name (Employee) Output schema: Answer(SSN, Name)

  25. SSN 1234545 5423341 4352342 Name John John John Salary 200000 600000 200000 Name,Salary (Employee) Name John John Salary 20000 60000

  26. Each tuple in R1 with each tuple in R2 Notation: R1 R2 Example: Employee Dependents Very rare in practice; mainly used to express joins

  27. Notation: R1 || R2 Meaning: R1 | | R2 = A( C(R1 R2)) Where: The selection C checks equality of all common attributes The projection eliminates the duplicate common attributes

  28. R= S= A X X Y Z B Y Z Z V B Z V Z C U W V R | | S= A X X Y Y Z B Z Z Z Z V C U V U V W

  29. A join that involves a predicate R1 | | R2 = (R1 R2) Here can be any condition Eq Eq- -join join A theta join where is an equality R1 | | A=B R2 = A=B (R1 R2) Example: Employee | | SSN=SSN Dependents Most useful join in practice

  30. R | S = A1,,An (R || S) Where A1, , An are the attributes in R Example: Employee | Dependents

  31. Cannot compute transitive closure Name1 Fred Mary Mary Nancy Name2 Mary Joe Bill Lou Relationship Father Cousin Spouse Sister Find all direct and indirect relatives of Fred Cannot express in RA !!! Need to write C program

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#