Introduction
The relational model is one of the most popular data models in used nowadays. It was first formalized in 1970. This model is a formal model with a mathematical foundation based on set theory and first-order predicate logic. It has no graphical representation which makes it less suitable to be used as a conceptual model.
It consists of the following basic concepts.
The relational model represents the database as a collection of relation
- A relation is defined as a collection of tuples that each represents a similar, real-world entity such as product, supplier, employee, etc.
- A tuple is an ordered list of (attribute) values that each describes an aspect of this entity such as supplier number, name, address, etc. Each tuple is uniquely identified by its primary key. Besides, tuples are interrelated by means of foreign keys.
Various commercial implementations exist, provided by Oracle, IBM, Microsoft, etc.
The Relational Model
A relation can also be interpreted as a table of values. Let’s take for example a relation called SUPPLIER. A relation corresponds to an entity type from the EER model. Each tuple corresponds to a row in a table or to an entity in the EER model. Attribute types can be seen as column names. These are identical to the attribute types in the EER model (Ex: supplier number, name, address, city, status, etc.). Each attribute type value corresponds to a single cell.
A relation schema R, denoted by R(A1, A2, A3, …, An) is made up of a relation name R and a list of attribute types (A1, A2, A3, …, An). Each attribute type Ai is the name of a role played by some domain Di in the relation schema R.
Di is called the domain of Ai and is denoted by dom(Ai). A gender domain can specify the values male and female. A time domain can specify time as day followed by month followed by year. R is the name of the relation schema. It is recommended to use a meaningful name.
Example:
Student (SN, Name, HomePhone, Address)
Professor (SSN, Name, HomePhone, office phone, e-mail) Course (CourseNo, CourseName
A domain can be used multiple times in a relation scheme. An advantage of using a domain is that if definition would ever have to changed, then this changed should only be done in the domain definition. Therefore, this improves the maintainability of the model.
A relation also, called extension, r(R) for the relation schema R(A1, A2, A3, …, An) is a set of m- tuples r={ t1, t2, t3, …, tm}. each tuple is an ordered list of n values t = <v1, v2, v3,… vn>, corresponding to a particular entity such as a particular student, supplier, etc. Each value vi, 1 ≤ i ≤ n, is an element of dom(Ai) or is a special null value. NULL value means that the value is missing, irrelevant or not applicable.
Example:
Student(100, Michael Johnson, 123 456 789, 532 Seventh Avenue)
Professor(50, Bart Baesens, 987 654 321, 876 543 210, Bart.Baesens@kuleuven.be) Course(10, Principles of Database Management)
More formally, A relation r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2), dom(A3), …, dom(An) and is a subset of the Cartesian product of the domains that define R : r(R) ⊆ (dom(A1) x dom(A2) x dom(A3) x … x dom(An))
The Cartesian product specifies all possible combinations of values from the underlying domains. Of all these possible combinations, the current relation state represents only the valid tuples that represent a particular state of the real world.
It is important to know that a representation essentially represents a set. Hence, there is no logical ordering of tuple sin a relation. The relation also does not have any duplicate tuples. There is however an ordering on tuples based on how the relationship was defined. According to the relational model, each attribute type is single-valued and atomic. Therefore, no composite or multivalued attribute types are allowed.
Type of Constraints
The relational model allows defining of various constraints on the values of the attribute types.
1. Domain constraints
Domain constraints state that the value of each attribute type A must be an atomic and single value from the domain dom(A).
Example:
COURSE(coursenr, coursename, study points)
Right: (10, Principles of Database Management, 6)
Wrong: (10, (Principles of Database Management, Database Modeling), 6). It specifies two values for the course name.
2. Key constraints
To specify the key and uniqueness constraints, it seems important to first explain the concept of a key. A relation is a set of tuples. All tuples must therefore be distinct. No two tuples can have the same combination of values for all their attribute types. A super key is defined as a subset of attribute types of a relation schema R with the property that no two tuples in any relation state should have the same combination of values for these attribute types. A super key specifies a uniqueness constraint in the sense that no distinct tuples in a stat can have the same value for the super key.
Every relation has at least one default superkey, the set of all its attribute types. A super key can have redundant attribute types. Ex: {StudentNr, StudentName, StudentBirthdate} is a superkey, but note that both StudentName and StudentBirthdate are redundant.
A key K of a relation scheme R is a super key of R with the additional property that removing any attribute type from K leaves a set of attribute types that is no super key of R. A key does not have redundant attribute types; hence it is also called a minimal super key. For a student relation, Student number is a key.
In general, a relation scheme may have more than one key. For instance, for a product relation, a product can have a unique product number and unique product name. Each of these keys is called a candidate key. One of them is designated as the primary key of the relation. This primary key will be used to identify tuples in the relation and establish connections in other relations. It can also be used for storage purposes and define indexes in the physical scheme. The other candidate keys are then referred to as candidate keys.
3. Entity integrity
The attribute types that make up the primary key must always satisfy the NOT NULL constraint. Otherwise it would not be possible to identify some tuples. This is the entity integrity constraint. A NOT NULL constraint can also be defined for other attribute types such as the alternative keys.
A relational database schema S is a set of relation schemas {R1, R2, R3, …, Rm} and a set of integrity constraints IC.
A relational database state DB of S is a set of relation states {r1, r2, r3, …, rm} such that each ri is a state of Ri and such that the relation states satisfy the integrity constraints specified in IC.
The DBMS will have to take care that the integrity constraints are always checked and the violations reported if the database state is updated.
4. Referential integrity constraints
Similar to relationship types in the EER model, also relations in the relational model can be connected. These connections are established thanks to a foreign key. A set of attribute types FK in a relation schema R1 is a foreign key of R1 if two conditions are satisfied.
The attribute types in FK have the same domains as the primary key attribute types PK of a relation R2;
A value FK in a tuple t1 of the current state r1 either occurs as a value of PK for some tuple t2 in the current state r2 or is NULL.
The conditions for a foreign key specify a referential integrity constraint between two relation schemas R1 and R2.
5. Semantic integrity constraints
The relational model has its shortcomings. These are more general constraints, which cannot be enforced in the standard relation mode. Extensions have been developed to the relational model to enforce some of these constraints e.g. stored procedures or triggers.