The design of the physical ___ of the data warehouse is central to optimizing query performance.
What are the 2 kinds of schema types?
snowflake and star
___ occurs any time data is stored multiple times. (aka stored redundantly)
When a schema contains no redundancy it is considered completely ___.
What are the 3 basic forms of a snowflake schema?
Completely normalized; Moderately denormalized, Completely denormalized
t/f: a completely normalized snowflake schema doesn't store any data redundantly.
A completely normalized snowflake schema's lookup table will contain only 3 pieces of information/
1) Attribute ID 2) Attribute Desc 3) ID of immediate parent attribute
One obvious benefit of normalization is that the ___ store the minimum amount of information.
One obvious disadvantage of normalization is that the number of tables in a snowflake schema requires many ___ to relate in the SQL.
A ___ ___ snowflake schema's lookup tables will contain an Attribute ID, an Attribute Desc, the immediate parent Attribute ID, and IDs of all higher-level attributes
Which type of snowflake schema contains the greatest level of redundancy?
What additional level of information does a completely denormalized snowflake lookup table contain (not counting the pieces of info also found in a moderately denormalized schema)?
descriptions of all other higher-level attributes, if they exist
What snowflake would typically require the fewest number of joins to relate information?
A ___ ___ is a design that contains only one lookup table for each hierarchy in the data model instead of having separate lookup tables for each attribute.
t/f: a star schema is always completely denormalized.
t/f: because a star schema is completely denormalized, there are many joins involved in relating attributes and facts in the SQL
t/f: using a star schema is not recommended if you plan to use aggregated fact tables because it can result in double counting (as a result of the redundant data in a lookup table, for example).
Factors to consider for optimal query performance...
1) LU tbl volume 2) fact tbl volume 3) data type of ID columns 4) fact tbl keys
___ ___ refers to the amount of data in a table.
t/f: for LU tables, table volume is determined by the cardinality of the attributes and the amount of descriptive information stored for those attributes.
The recommended optimal schema design is to completely ___ the lookup tables.
If a lookup table has massive ___ volume, any subsequent SQL joins will consume system resources. As such, those extra joins should be avoided.
t/f: If lookup table has minimal table volume, then extra joins are less of a resource hog. As such, you can retain a normalized schema for the lookup tables.
For ___ tables, the table volume is calculated by the number of records stored and the amount of information (Attribute IDs) stored for each record.
t/f: denormalizing the fact table to include attributes can help reduce the amount of joins to lookup tables, thus potentially improving overall query performance.
t/f: when creating indexes, the ID columns in all columns should be text or character (varchar).
False (should always be integer, number or date)
MSTR recommends not defining a ___ key for fact tables as it can cause indexes to be used inefficiently.
t/f: individual indexes should be used on each key in the fact table in order to maximize optimal indexing.
The ___ key consists of any columns in a table for which you have defined an attribute in the project.
t/f: when the logical key is used and the attributes are present on the report template, the SQL only needs to select the records from the fact table - it does not need to aggregate for a given fact or group by given attributes.
t/f: a fact table contains an ID that has an unmapped/nonexistant Attribute in the project, then MSTR does not include it in the logical key.
The volume of a LU tbl is determined by ___ of attributes and characteristic/descriptive attributes stored in the tbl.