35 terms

# ch6 Relational algebra

#### Terms in this set (...)

Relational algebra
is the basic set of operations for the relational model.
-These operations enable a user to specify basic retrieval requests (or queries)
-The result of an operation is a new relation, which may have been formed from one or more input relations
-procedural way of starting query
The algebra operations
produce new relations
relational algebra expression
A sequence of relational algebra operations
Relational Algebra consists of several groups of operations
1) Unary:
SELECT (symbol: σ (sigma))
PROJECT (symbol: π (pi))
RENAME (symbol: ρ (rho))
2)Relational Algebra Operations From Set Theory:
-UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, - )
-CARTESIAN PRODUCT ( x )
3)Binary Relational Operations
-JOIN (several variations of JOIN exist)
-DIVISION
OUTER JOINS, OUTER UNION
AGGREGATE FUNCTIONS
The SELECT operation
σ <selection condition>(R)
is used to select a subset of the tuples from a relation based on a selection condition.
-The selection condition acts as a filter
-Keeps only those tuples that satisfy the qualifying condition
-Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out)
SELECT Operation Properties
1.The SELECT operation produces a relation S that has the same schema (same attributes) as R
2. is commutative
3.Because of commutativity property, a cascade (sequence) of SELECT operations may be applied in any order:
4. A cascade of SELECT operations may be replaced by a single selection with a conjunction(AND) of all the conditions
5.The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R
PROJECT Operation
π<attribute list>(R))
-keeps certain columns (attributes) from a relation and discards the other columns
-creates a vertical partitioning
-removes any duplicate tuples
-result of the project operation must be a set of tuples
PROJECT Operation Properties
The number of tuples in the result of projection is always <= to the number of tuples in R
-If the list of attributes includes a key of R, then the number of tuples in the result of PROJECT is equal to the number of tuples in R
-is NOT commutative
The RENAME operator ρ
- when a query requires multiple operations
-Necessary in some cases (see JOIN operation later)
ρS (B1, B2, ..., Bn )(R)
changes both:
the relation name to S, and
the column (attribute) names to B1, B1, .....Bn
ρS (R) changes:
the relation name only to S
ρ(B1, B2, ..., Bn )(R)
changes:
the column (attribute) names only to B1, B1, .....Bn
UNION Operation ∪
-The result of R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S
-Duplicate tuples are eliminated
-The two operand relations R and S must be "type compatible" (or UNION compatible)
-R and S must have same number of attributes
-Each pair of corresponding attributes must be type compatible (have same or compatible domains)
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are type compatible if:
-they have the same number of attributes
-the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n).
INTERSECTION ∩
The result of the operation R ∩ S, is a relation that includes all tuples that are in both R and S
-The two operand relations R and S must be "type compatible"
SET DIFFERENCE (MINUS or EXCEPT) -
The result of R - S, is a relation that includes all tuples that are in R but not in S
-The two operand relations R and S must be "type compatible"
CARTESIAN (or CROSS) PRODUCT Operation
-used to combine tuples from two relations in a combinatorial fashion.
-Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)
-Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
-The resulting relation state has one tuple for each combination of tuples—one from R and one from S.
-Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS tuples, then R x S will have nR * nS tuples.
-The two operands do NOT have to be "type compatible"
JOIN Operation (babochka)
-The sequence of CARTESIAN PRODUCT followed by SELECT is used quite commonly to identify and select related tuples from two relations, JOIN combines this sequence into a single operation
-it allows us combine related tuples from various relations
-R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R <join condition>S
where R and S can be any relations that result from general relational algebra expressions.
Some properties of JOIN
R(A1, A2, . . ., An)JOINS(B1, B2, . . ., Bm)
-Result is a relation Q with degree n + m attributes:
-The resulting relation state has one tuple for each combination of tuples—r from R and s from S, but only if they satisfy the join condition r[Ai]=s[Bj]
-Hence, if R has nR tuples, and S has nS tuples, then the join result will generally have less than nR * nS tuples.
-Only related tuples (based on the join condition) will appear in the -
-Theta-join(can be any general boolean expression on the attributes of R and S; for example:
)
EQUIJOIN Operation
The most common use of join involves join conditions with equality comparisons only
-In the result of an EQUIJOIN we always have one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple.
NATURAL JOIN Operation *
Another variation of JOIN called NATURAL JOIN — denoted by * — was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.because one of each pair of attributes with identical values is superfluous
-The standard definition of natural join requires that the two join attributes, have the same name in both relations
complete set
The set of operations including SELECT σ, PROJECT π , UNION ∪, DIFFERENCE − , RENAME ρ, and CARTESIAN PRODUCT X is called a complete set because any other relational algebra expression can be expressed by a combination of these five operations.
DIVISION Operation
-applied to two relations
-R(Z) ÷ S(X), where X subset Z. Let Y = Z - X (and hence Z = X ∪ Y); that is, let Y be the set of attributes of R that are not attributes of S.
-The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR [Y] = t, and with
tR [X] = ts for every tuple ts in S.
-For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with every tuple in S.
-for tuple t to appear in the result T of the DIVISIOn, the values in t must must appear in R in combination with every tuple in S
aggregate functions
-A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database.
-retrieving the average or total salary of all employees or the total number of employee tuples.
-These functions are used in simple statistical queries that summarize information from the database tuples.
-Common functions applied to collections of numeric values include
SUM, AVERAGE, MAXIMUM, and MINIMUM.
-The COUNT function is used for counting tuples or values.
Recursive Closure Operations
This operation is applied to a recursive relationship.
The OUTER JOIN Operation
-can be used when we want to keep all the tuples in R, or all those in S, or all those in both relations in the result of the join, regardless of whether or not they have matching tuples in the other relation.
The left outer join operation
keeps every tuple in the first or left relation R in R S; if no matching tuple is found in S, then the attributes of S in the join result are filled or "padded" with null values
right outer join
keeps every tuple in the second or right relation S in the result of R S.
Full outer join
keeps all tuples in both the left and the right relations when no matching tuples are found, padding them with null values as needed.
OUTER UNION Operations
The outer union operation was developed to take the union of tuples from two relations if the relations are not type compatible.
-This operation will take the union of tuples in two relations R(X, Y) and S(X, Z) that are partially compatible, meaning that only some of their attributes, say X, are type compatible.
-The attributes that are type compatible are represented only once in the result, and those attributes that are not type compatible from either relation are also kept in the result relation T(X, Y, Z).
relational calculus
expression creates a new relation, which is specified in terms of variables that range over rows of the stored database relations (in tuple calculus) or over columns of the stored relations (in domain calculus).
-In a calculus expression, there is no order of operations to specify how to retrieve the query result—a calculus expression specifies only what information the result should contain.
-is considered to be a nonprocedural language.
The tuple relational calculus
based on specifying a number of tuple variables.
-Each tuple variable usually ranges over a particular database relation, meaning that the variable may take as its value any individual tuple from that relation.
-A simple tuple relational calculus query is of the form
{t | COND(t)}
where t is a tuple variable and COND (t) is a conditional expression involving t.
-The result of such a query is the set of all tuples t that satisfy COND (t).
universal quantifier (∀) ,
the existential quantifier (∃)
-The formula (∃ €t)(F) is true if the formula F evaluates to true for some (at least one) tuple assigned to free occurrences of t in F; otherwise (∃ t)(F) is false.
-The formula (∀ t)(F) is true if the formula F evaluates to true for every tuple (in the universe) assigned to free occurrences of t in F; otherwise (∀ t)(F) is false.
∀ is called the universal or "for all" quantifier
every tuple in "the universe of" tuples must make F true to make the quantified formula true.
∃ is called the existential or there exist
quantifier because any tuple that exists in "the universe of" tuples may make F true to make the quantified formula true.