41 terms

Physical Design

Physical Design : Purpose
Design structure of database and identify technical
specifications to optimize database performance
Physical Design : Output
An implementable database design
Physical Database Design
Process of designing the structure of a database and
identifying technical specifications to optimize database
performance at runtime
Ensuring Data Integrity Requirements (4)
Domain integrity
values entered into columns are valid, Specify the appropriate data type for each column
can a column contain null values?
Check constraint
do the column values belong to a set list,
range, etc.?
Unique constraint
is the column value unique?
Default constraint
is there a default value if nothing is
Entity integrity
each row is uniquely identified, Each entity has a primary key
Artificial key
auto generated number / identity column
Referential integrity
references to other tables remain valid , The value of a foreign key is "constrained" to the values of
a primary key in a different table
Policy integrity
values adhere to business rules, May be enforceable through domain, entity, and/or
referential integrity , More complicated rules - use triggers or application
Snippets of programs that run (or fire) in response to an
event, More resource intensive than domain, entity, or referential
integrity constraints
Constraining Relationship (FK)
Specify what should happen to the many side of a 1:M
On Delete (Update) - Restrict
If we try to delete a record on the "one side" of a 1:M
relationship, the database will reject the delete if there are any
matching FK values on the "many side"
On Delete (Update) - Cascade
If we try to delete a record on the "one side" of the 1:M
relationship, both that record and all matching records on the
"many side" value will be deleted
On Delete (Update) - Set-to-Null
If a record on the "one side" of the 1:M relationship is deleted,
that record will be deleted and the matching FK on the "many
side" will be changed to null
On Delete (Update) - Default
If a record on the "one side" of the 1:M relationship is deleted,
that record will be deleted and the matching FK on the "many
side" will be changed to a predefined default value
Factors Affecting DB Performance: Data-related
Large data volumes: How many records, how large is each record
Factors Affecting DB Performance: Database structure
Searching on attributes without direct access
Composite keys
Factors Affecting DB Performance: Data storage
Data dispersed all over the disk (multiple tables): Slows data access
Factors Affecting DB Performance: Application
Need for joins: Consumes substantial time and resources
Need to calculate totals: Time-consuming for large tables
Factors Affecting DB Performance: Business environment
Too many data access operations: How frequently are tables
Overly liberal data access: Who access these tables, for what
purpose (security)
Indexes are mechanisms for providing direct access to specific
fields (or combination of fields) in a table
Why use indexes?
Improves data retrieval speed
Useful for search and join operations
Which field(s) to index?
Primary keys
Foreign keys used for joins
Search attributes
Attributes used for grouping
Why not to use indexes?
Having many indexes may take more space than actual data
Too many indexes may slow down performance: each update
causes DBMS to update related indexes
Small tables may not need indexes (read entire table into
Querying a table without an index
Read each record into memory
Find records meeting the condition
Return records containing required data
Querying a table with an index
Read index into memory
Search index to Iind records meeting the condition
Access only those records containing required data
Tradeoffs to not using an index
Slower queries, increase disk accesses
- Except for very small tables
Faster maintenance: only need to add / delete / update
Tradeoffs to using an index
Faster queries, since disk accesses are reduced
Slower maintenance: Require at least two accesses for
adding/deleting/ updating records
Static databases beneIit more overall
logical combinations of a subset of records and
attributes in one or more tables , Often implemented via table joins , No data is physically duplicated - hence no redundancy, Can protect data security and privacy, by restricting the
data provided to users
SpliVng Tables: Horizontal Par''oning
Records of a table are split into two or more separate files
Each partition has the same fields but different records
Why split?
Works well if applications do not need to access all records at
the same time
Easier to retrieve data from smaller tables
Improves performance
Splitting Tables: Vertical Partitioning
Fields in a table are distributed across two or more files/
partitions, Works well if different fields are needed for different
Adding Attributes: Creaying New PK
Changes the logical design
Replace composite primary key with a new single-attribute
May be ideal for association relations
Single attribute PK are faster to index than composite keys
Single attribute PK are faster to join than composite keys
Adding Attributes: Storing Derived Data
If the same values have to be calculated over and over
again, compute them and store as new Iield
Introduces transitive and/or partial dependency: Table no
longer in 3NF/2NF
Data redundancy introduced: Derived data must be updated
every time values are added or updated in the referenced
Rule: Use sparingly and with caution
Combining Tables: 1:1 Rela'onships
Avoids the need for joins
Makes joint retrieval of data from both tables faster
Tables no longer logically or physically independent
May introduce anomalies (e.g., can't add OfIice_Num without SP_Num)
Makes retrieval of data from each table slower
Rule of thumb:
Somewhat safe for 1:1 relationships; don't try for 1:M or M:N
Combining Tables: Repea'ng Groups
If repeating groups are well controlled, they can be folded
into one table
Rule of thumb:
"Well controlled" means no more than two repeating groups
If more than two of, if in doubt, don't combine tables
Combining Tables: Denormaliza'on
The process of combining normalized relations into larger
unnormalized relations to improve database performance