WGU C175 Study Notecards

Flat Files
A file having no internal hierarchy.
Hashed Files
A file in which each record's storage location is determined by applying a hashing function to a key field, allowing fast direct access to a record.
Heap File
An unsorted set of records.
Information
The transformation of raw data into useful facts.
Punch Card
A card that is perforated and can hold commands or data.
Structured Data
Information with a high degree of organization.
Unstructured Data
Information that does not have structure (such as text)
Binary Relationship
A relationship between two entity types.
Unary Relationship
A relationship that associates occurrences of an entity type with other occurrences of the same entity type.
Cardinality
The maximum number of entities that can be involved in a particular relationship.
E-R Model

*E-R = Entity - Relationship
Diagram of entities together with their attributes and the relationship among them.
Intersection Data
It is data that describes a many-to-many relationship.
Modality
It is a minimum number of entity occurrences that can be involved in a relationship.
One-to-one Binary Relationship
It means that a single occurrence of one entity type can be associated with a single occurrence of the other entity type and vice versa.
Ternary Relationship
Involves three different entity types.
Unique identifier
It is used to uniquely identify each record in a database table.
Attribute
A property, characteristic, or fact that we know about an entity.
"A salesperson works in one office."

What is the name of this relationship?
One-to-one binary relationship
"A salesperson sells to many customers."

What is the name of this relationship?
One-to-many binary relationship
"A salesperson is authorized to sell many products, and a product can be sold by many salespersons."

What is the name of this relationship?
Many-to-many binary relationship
What is the positioning and meaning for Cardinality and Modality on an E-R model?
Cardinality is the outer symbol; represents the maximum.

Modality is the inner symbol; represents the minimum.
"A salesperson works in a minimum of one and a maximum of one office, and an office may be occupied by or assigned to a minimum of zero and a maximum of one salesperson."
"A salesperson may have no customers or many customers."
Describe the ER model for "Each salesperson is authorized to sell to at least one or many products, and each product can be sold by at least one or many salespeople."
"One salesperson backs-up another salesperson."

What is the name of this model?
One-to-one unary relationship
"A salesperson manages zero to many other salespersons, and a salesperson is managed by exactly one other salesperson."

What is the name of this model?
One-to-many unary relationship
"A product can either be part of no other products or be part of several other products, and a product can either be composed of no other products or be composed of several other products."

What is the name of this model?
Many-to-many unary relationship
What does 'refer' in Referential Integrity imply?
It refers to using values in one relation to refer to (look up) data in another relation in the database.
Define the delete rule RESTRICT.
If the delete rule between two relations is RESTRICT and an attempt is made to delete a record on the "one side" of the one-to-many relationship, the system will forbid the delete to take place if there are any matching foreign key values in the relation on the "many side".
Define the delete rule CASCADE.
If the delete rule between two relations is CASCADE and an attempt is made to delete a record on the "one side" of the relationship, not only will the record be deleted but all of the records on the "many side" of the relationship that have a matching foreign key value will also be deleted.

In other words, the delete will "cascade" from one relation to the other.
Define the delete rule SET-TO-NULL.
If the delete rule between the two relations is SET-TO-NULL and an attempt is made to delete a record on the "one side" of the one-to-many relationship, that record will be deleted and the matching foreign key values in the records on the "many side" of the relationship will be set to null.

It works like the CASCADE delete rule, except that the matching records on the "many side" are kept and only their foreign key values are set to NULL.
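The three delete rules above can be tried side by side. This is a minimal sketch using Python's built-in sqlite3 module, assuming SQLite's `ON DELETE RESTRICT`, `ON DELETE CASCADE`, and `ON DELETE SET NULL` clauses as stand-ins for the rules on these cards. The row values and the `remaining_customers` helper are hypothetical.

```python
import sqlite3

def remaining_customers(delete_rule):
    """Delete salesperson 137 under the given rule; return (CUSTOMER rows, SALESPERSON count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER PRIMARY KEY, SPNAME TEXT)")
    conn.execute(
        "CREATE TABLE CUSTOMER ("
        " CUSTNUM INTEGER PRIMARY KEY,"
        " SPNUM INTEGER REFERENCES SALESPERSON(SPNUM) ON DELETE " + delete_rule + ")"
    )
    conn.execute("INSERT INTO SALESPERSON VALUES (137, 'Baker')")  # the "one side"
    conn.execute("INSERT INTO CUSTOMER VALUES (121, 137)")         # the "many side"
    try:
        conn.execute("DELETE FROM SALESPERSON WHERE SPNUM = 137")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the system forbids the delete
    customers = conn.execute("SELECT CUSTNUM, SPNUM FROM CUSTOMER").fetchall()
    salespersons = conn.execute("SELECT COUNT(*) FROM SALESPERSON").fetchone()[0]
    conn.close()
    return customers, salespersons

restrict_result = remaining_customers("RESTRICT")   # nothing is deleted
cascade_result = remaining_customers("CASCADE")     # the customer row is deleted too
set_null_result = remaining_customers("SET NULL")   # the customer row stays, SPNUM becomes NULL
```

Other database systems may spell these options differently or default to a different rule, so treat the clause names as SQLite-specific.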
Which entity is uniquely identified by concatenating the primary keys of the two entities it connects?
Associative Entity
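As an illustration of an associative entity, here is a small sketch in Python's sqlite3 in which a SALES table is identified by the concatenation of SPNUM and PRODNUM, with QUANTITY as its intersection data. The column layout and row values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES ("
    " SPNUM INTEGER, PRODNUM INTEGER, QUANTITY INTEGER,"
    " PRIMARY KEY (SPNUM, PRODNUM))"  # concatenated key of the two connected entities
)
conn.execute("INSERT INTO SALES VALUES (137, 19440, 473)")
conn.execute("INSERT INTO SALES VALUES (137, 24013, 170)")  # same salesperson, other product: allowed
try:
    conn.execute("INSERT INTO SALES VALUES (137, 19440, 999)")  # repeated (SPNUM, PRODNUM) pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

sales_row_count = conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0]
```

Each half of the key may repeat on its own; only the combination has to be unique.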
Which type of entity is also called a dependent entity?
Weak Entity
Candidate Key
An attribute or minimal group of attributes that uniquely identifies an entity. The term applies when a relation has more than one such attribute or group; each one is a candidate to become the primary key.
Concurrency Problem
When two or more users are trying to update the same record simultaneously.
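One common form of the concurrency problem is the "lost update". The sketch below simulates it in plain Python with no real database or threads; the record, the user names, and the quantities are hypothetical.

```python
# A single shared record, standing in for one row in a table.
record = {"quantity": 100}

# Both users read the record before either one writes.
user_a_read = record["quantity"]
user_b_read = record["quantity"]

# User A subtracts 10 and writes the result back.
record["quantity"] = user_a_read - 10

# User B subtracts 20 from the stale value and overwrites A's update.
record["quantity"] = user_b_read - 20

final_quantity = record["quantity"]   # A's update has been lost
expected_quantity = 100 - 10 - 20     # what both updates together should give
```

Concurrency control in a DBMS, such as locking, exists to prevent this kind of interleaving.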
Equijoin
Combines two or more tables based on a column that is common to the tables.

Example: Joining the Client and Salesman tables, which both contain a SalesmanID column holding matching values.
Foreign Key
When an attribute or group of attributes serves as the primary key of one relation and also appears in another relation.
Natural Join
Matches each row in a table against each row in another table based on common values found in columns sharing a common name and data type.
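A visible difference between the two joins is what happens to the shared column. In this sqlite3 sketch (tables and rows are hypothetical), the equijoin keeps both copies of SPNUM while the natural join keeps one; this reflects SQLite's behavior and is assumed to be typical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER, SPNAME TEXT)")
conn.execute("CREATE TABLE CUSTOMER (CUSTNUM INTEGER, SPNUM INTEGER)")
conn.execute("INSERT INTO SALESPERSON VALUES (137, 'Baker')")
conn.execute("INSERT INTO CUSTOMER VALUES (1525, 137)")

# Equijoin: the join condition is written out, and both SPNUM columns appear.
equi = conn.execute(
    "SELECT * FROM SALESPERSON, CUSTOMER WHERE SALESPERSON.SPNUM = CUSTOMER.SPNUM"
)
equi_columns = [d[0] for d in equi.description]

# Natural join: matching is implied by the shared column name, which appears once.
natural = conn.execute("SELECT * FROM SALESPERSON NATURAL JOIN CUSTOMER")
natural_columns = [d[0] for d in natural.description]
```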
Tuple
Rows/records are referred to as tuples when talking about relations. A tuple serves exactly the same function as a row; it simply has a different name in the context of relations.
What are the five basic principles of The Database Concept?
1. The creation of a datacentric environment that is a significant company resource, which can be shared inside and outside the company.
2. The ability to achieve data integration while storing data in a non-redundant fashion.
3. The ability to store data representing entities involved in multiple relationships w/o introducing data redundancy.
4. Managing data control issues such as data security, backup and recovery, and concurrency control.
5. High degree of data independence.
What are the four major DBMS approaches?
- Hierarchical
- Network
- Relational
- Object-oriented
What are four key differences between a RELATION and a FILE?
- The columns of a relation can be arranged in any order w/o affecting the meaning of the data. That is not true of a file.
- Similarly, the rows of a relation can be arranged in any order, which is not true of a file.
- Every row/column position, sometimes referred to as a "cell", can have only a single value, which is not necessarily true in a file.
- No two rows of a relation are identical, which is not necessarily true in a file.
* in the SELECT clause
- It indicates that all attributes of the selected row are to be retrieved
AND operator
- It displays a record only if all of the conditions joined by AND are true.
AVG() function
- It returns the average value of a numeric column.
BETWEEN operator
- It allows you to specify a range of numeric values in a search.
DISTINCT operator
- It is used to eliminate duplicate rows in a query result.
IN operator
- It allows you to specify a list of character strings to be included in a search
JOIN clause
- It is used to combine rows from more than one table, based on a common field between them. Sometimes it is done by using the '=' symbol.
LIKE operator
- It allows you to specify partial character strings in a "wildcard" sense.
OR operator
- It displays a record if either the first condition OR the second condition is true.
ORDER BY clause
- It takes the results of a SQL query and orders them by one or more specified attributes.
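The BETWEEN, DISTINCT, IN, LIKE, and ORDER BY cards above can be checked against a tiny table. This is a sketch with Python's sqlite3; the CUSTOMER rows and the `column` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUSTNUM INTEGER, CUSTNAME TEXT, HQCITY TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?)",
    [
        (1001, "Acme", "New York"),
        (1200, "Apex", "Atlanta"),
        (1600, "Baxter", "Chicago"),
        (1800, "Corey", "Atlanta"),
    ],
)

def column(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# BETWEEN includes both ends of the range.
between_nums = column(
    "SELECT CUSTNUM FROM CUSTOMER WHERE CUSTNUM BETWEEN 1200 AND 1600 ORDER BY CUSTNUM"
)
# IN matches any value in the list.
in_names = column(
    "SELECT CUSTNAME FROM CUSTOMER WHERE HQCITY IN ('Atlanta', 'Chicago') ORDER BY CUSTNAME"
)
# LIKE with % matches any run of characters after the 'A'.
like_names = column("SELECT CUSTNAME FROM CUSTOMER WHERE CUSTNAME LIKE 'A%' ORDER BY CUSTNAME")
# DISTINCT removes the repeated 'Atlanta'.
distinct_cities = column("SELECT DISTINCT HQCITY FROM CUSTOMER ORDER BY HQCITY")
```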
SELECT command
- Data retrieval in SQL is accomplished with the SELECT command.
Subquery
- When one SELECT statement is "nested" within another, it is known as a subquery. It appears as a second SELECT phrase within a set of parentheses.
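A short sketch of a subquery, run through Python's sqlite3: the inner SELECT in parentheses finds the lowest commission percentage, and the outer SELECT lists the salespersons who have it. The rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER, SPNAME TEXT, COMMPERCT INTEGER)")
conn.executemany(
    "INSERT INTO SALESPERSON VALUES (?, ?, ?)",
    [(137, "Baker", 10), (186, "Adams", 15), (204, "Dickens", 10)],
)

query = (
    "SELECT SPNAME FROM SALESPERSON "
    "WHERE COMMPERCT = (SELECT MIN(COMMPERCT) FROM SALESPERSON) "  # nested SELECT
    "ORDER BY SPNAME"
)
lowest_commission_names = [row[0] for row in conn.execute(query)]
```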
Common DDL commands:
- DROP
- ALTER
- RENAME
- CREATE
- TRUNCATE
Common DML commands:
- UPDATE
- DELETE
- INSERT
- MERGE
- SELECT
Write the basic SQL query command:
SELECT <columns>
FROM <table>
WHERE <predicates identifying rows to be included>;
Write the SQL query to "Find the commission percentage and year of hire of salesperson 186":
SELECT COMMPERCT, YEARHIRE
FROM SALESPERSON
WHERE SPNUM=186;
Write the SQL query to "Retrieve the entire record for salesperson 186":
SELECT *
FROM SALESPERSON
WHERE SPNUM=186;
Write the SQL query to "List the salesperson numbers and salesperson names of those salespersons who have a commission percentage of 10.":
SELECT SPNUM, SPNAME
FROM SALESPERSON
WHERE COMMPERCT=10;
Write the SQL query to "List the salesperson numbers, salesperson names, and commission percentages of the salespersons whose commission percentage is less than 12.":
SELECT SPNUM, SPNAME, COMMPERCT
FROM SALESPERSON
WHERE COMMPERCT<12;
Write the SQL query to "List the customer numbers and headquarters cities of all customers that have a customer number of at least 1700":
SELECT CUSTNUM, HQCITY
FROM CUSTOMER
WHERE CUSTNUM>=1700;
Write the SQL query to "List the customer numbers, customer names, and headquarters cities of the customers that are headquartered in New York and that have a customer number higher than 1500":
SELECT CUSTNUM, CUSTNAME, HQCITY
FROM CUSTOMER
WHERE HQCITY='New York'
AND CUSTNUM>1500;
Write the SQL query to "List the customer numbers, customer names, and headquarters cities of the customers that are headquartered in New York OR that have customer numbers higher than 1500":
SELECT CUSTNUM, CUSTNAME, HQCITY
FROM CUSTOMER
WHERE HQCITY='New York'
OR CUSTNUM>1500;
Write the SQL query to "List the customers, customer names, and headquarters cities of the customers that are headquartered in New York or that satisfy the two conditions of having a customer number higher than 1500 and being headquartered in Atlanta":
SELECT CUSTNUM, CUSTNAME, HQCITY
FROM CUSTOMER
WHERE HQCITY='New York'
OR (CUSTNUM>1500
AND HQCITY='Atlanta');
Write the SQL query to "List the customer records for those customers whose names begin with the letter 'A' ":
SELECT *
FROM CUSTOMER
WHERE CUSTNAME LIKE 'A%';
Write the SQL query to "Find the customer numbers, customer names, and headquarters cities of those customers with the customer numbers greater than 1000. List the results in alphabetic order by headquarters cities (and have the customer names within the same city alphabetized)":
SELECT CUSTNUM, CUSTNAME, HQCITY
FROM CUSTOMER
WHERE CUSTNUM>1000
ORDER BY HQCITY, CUSTNAME;
Write the SQL query to "Find the average quantity of units of the different products that Salesperson 137 has sold":
SELECT AVG(QUANTITY)
FROM SALES
WHERE SPNUM=137;
Write the SQL query to "Find the total quantity of units of all products that Salesperson 137 has sold":
SELECT SUM(QUANTITY)
FROM SALES
WHERE SPNUM=137;
Write the SQL query to "Find the name of the salesperson responsible for Customer Number 1525":
SELECT SPNAME
FROM SALESPERSON, CUSTOMER
WHERE SALESPERSON.SPNUM=CUSTOMER.SPNUM
AND CUSTNUM=1525;
Write the SQL query to "List the NAMES of the products of which salesperson Adams has sold more than 2000 units":
SELECT PRODNAME
FROM SALESPERSON, PRODUCT, SALES
WHERE SALESPERSON.SPNUM=SALES.SPNUM
AND SALES.PRODNUM=PRODUCT.PRODNUM
AND SPNAME='Adams'
AND QUANTITY>2000;
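The two join queries above put the join conditions in the WHERE clause. The sketch below, using Python's sqlite3, runs the "Adams" query in that form and in an equivalent explicit `JOIN ... ON` form to show that they return the same rows. The table contents, including the product names, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER, SPNAME TEXT)")
conn.execute("CREATE TABLE PRODUCT (PRODNUM INTEGER, PRODNAME TEXT)")
conn.execute("CREATE TABLE SALES (SPNUM INTEGER, PRODNUM INTEGER, QUANTITY INTEGER)")
conn.executemany("INSERT INTO SALESPERSON VALUES (?, ?)", [(186, "Adams"), (137, "Baker")])
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?)", [(16386, "Wrench"), (19440, "Hammer")])
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?)",
    [(186, 16386, 2500), (186, 19440, 500), (137, 19440, 3000)],
)

# Join conditions in the WHERE clause, as on the card.
where_form = conn.execute(
    "SELECT PRODNAME FROM SALESPERSON, PRODUCT, SALES "
    "WHERE SALESPERSON.SPNUM = SALES.SPNUM "
    "AND SALES.PRODNUM = PRODUCT.PRODNUM "
    "AND SPNAME = 'Adams' AND QUANTITY > 2000"
).fetchall()

# The same query with explicit JOIN ... ON clauses.
join_form = conn.execute(
    "SELECT PRODNAME FROM SALESPERSON "
    "JOIN SALES ON SALESPERSON.SPNUM = SALES.SPNUM "
    "JOIN PRODUCT ON SALES.PRODNUM = PRODUCT.PRODNUM "
    "WHERE SPNAME = 'Adams' AND QUANTITY > 2000"
).fetchall()
```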
CREATE TABLE command
The command that creates base tables and tells the system what attributes will be in them.
CREATE VIEW command
Specifies the base tables on which the view is to be based and the attributes and rows of the table that are to be included in the view.
DELETE command
Specify which row(s) of a table are to be deleted based on data values within those rows.
DROP TABLE command
Discards an entire table from a database.
DROP VIEW command
Discards views.
Normalization
The process of organizing the fields and tables of a relational database to minimize redundancy (duplication) and dependency.
Second Normal Form
All non-key attributes must be functionally dependent on the entire key of that table.
Third Normal Form
Non-key attributes are not allowed to define other non-key attributes.
What are three important points about Third Normal Form?
1. It is completely free of redundancy
2. All foreign keys appear where needed to logically tie together related tables.
3. It is the same structure that would have been derived from a properly drawn entity-relationship diagram of the same business environment.
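To make the second normal form rule concrete, the sketch below checks functional dependencies on a few sample rows in plain Python. It assumes a hypothetical unnormalized table keyed by (SPNUM, PRODNUM) that also stores SPNAME; the `determines` helper and the rows are illustrative, and a check on sample data can only suggest a dependency, not prove it.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"SPNUM": 137, "PRODNUM": 19440, "SPNAME": "Baker", "QUANTITY": 473},
    {"SPNUM": 137, "PRODNUM": 24013, "SPNAME": "Baker", "QUANTITY": 170},
    {"SPNUM": 186, "PRODNUM": 19440, "SPNAME": "Adams", "QUANTITY": 2241},
]

# SPNAME depends on only part of the key: a second normal form violation.
partial_dependency = determines(rows, ["SPNUM"], ["SPNAME"])
# QUANTITY is not determined by SPNUM alone; it needs the entire key.
quantity_needs_full_key = not determines(rows, ["SPNUM"], ["QUANTITY"])

# Decomposition: move SPNAME to a table keyed by SPNUM alone.
salesperson = sorted({(r["SPNUM"], r["SPNAME"]) for r in rows})
sales = [(r["SPNUM"], r["PRODNUM"], r["QUANTITY"]) for r in rows]
```

After the split, each salesperson name is stored once instead of once per sale.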
Write the SQL query to "Add a new salesperson into the SALESPERSON table whose salesperson number is 489, name is Quinlan, commission percentage is 15, year of hire is 2011, and department number is 59.":
INSERT INTO SALESPERSON
VALUES
('489','Quinlan',15,'2011','59');

*Hint, this is DML, so remember that INSERT is one of the keywords for DML.
Write the SQL query to "Delete the row for salesperson 186 from the SALESPERSON table.":
DELETE FROM SALESPERSON
WHERE SPNUM = '186';
What is the correct syntax of the INSERT command?
INSERT INTO table_name VALUES (value1,value2,value3,...);
What is the correct syntax of the CREATE VIEW command?
CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition;
What is called a decomposition process?
Data normalization
In which of the normal forms should every non-key attribute be fully functionally dependent on the entire key of a table?
Second normal form
What is the correct syntax of the CREATE TABLE command?
CREATE TABLE table_name (
column_name1 data_type(size),
column_name2 data_type(size),
...
);
What is the correct syntax of the UPDATE command?
UPDATE table_name
SET column1=value1,column2=value2,...
WHERE some_column=some_value;
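The CREATE TABLE, INSERT, UPDATE, CREATE VIEW, and DELETE syntax cards can be exercised together. This is a sketch in Python's sqlite3 built around the Quinlan row from the earlier card. The column types, the DEPTNUM column name, the HIRED_2011 view name, and the new commission value of 12 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALESPERSON ("
    " SPNUM CHAR(3) PRIMARY KEY,"
    " SPNAME VARCHAR(12),"
    " COMMPERCT DECIMAL(3,0),"
    " YEARHIRE CHAR(4),"
    " DEPTNUM CHAR(3))"  # hypothetical name for the department number column
)
conn.execute("INSERT INTO SALESPERSON VALUES ('489', 'Quinlan', 15, '2011', '59')")

# UPDATE changes only the rows selected by the WHERE clause.
conn.execute("UPDATE SALESPERSON SET COMMPERCT = 12 WHERE SPNUM = '489'")
updated_commission = conn.execute(
    "SELECT COMMPERCT FROM SALESPERSON WHERE SPNUM = '489'"
).fetchone()[0]

# A view exposes chosen columns and rows of the base table.
conn.execute(
    "CREATE VIEW HIRED_2011 AS "
    "SELECT SPNUM, SPNAME FROM SALESPERSON WHERE YEARHIRE = '2011'"
)
view_rows = conn.execute("SELECT * FROM HIRED_2011").fetchall()

conn.execute("DELETE FROM SALESPERSON WHERE SPNUM = '489'")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM SALESPERSON").fetchone()[0]
```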
Association Rules
Association rules specify a relation between attributes that appears more frequently than expected if the attributes were independent.
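"More frequently than expected if the attributes were independent" can be put into numbers. The sketch below uses the common data mining measures support and lift, which are names introduced here and not taken from these cards; the baskets are hypothetical. A lift above 1 means the items occur together more often than independence would predict.

```python
baskets = [  # hypothetical transactions
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"eggs"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for basket in baskets if items <= basket) / len(baskets)

observed = support({"bread", "milk"})                  # how often they appear together
expected = support({"bread"}) * support({"milk"})      # frequency if they were independent
lift = observed / expected
```

Here the items appear together in 2 of 5 baskets (0.40) against an expected 0.36, so the lift is slightly above 1.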
Business Intelligence
The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business action.
Classification
Classification involves examining the attributes of a particular object and assigning it to a defined class.
Clustering
Clustering is the task of taking a large collection of objects and dividing them into smaller groups of objects that exhibit some similarity.
Affinity Grouping
Affinity Grouping is a process of evaluating relationships or associations between data elements that demonstrate some kind of affinity between objects.
What are the values of Business Intelligence?
- Financial value associated w/ increased profitability.
- Productivity value associated with increased throughput.
- Trust value (customer, employee, supplier satisfaction) as well as increased confidence in forecasting.
- Risk value - decreased risk associated with decision making
What are the reasons for using the Dimensional Model for Business Intelligence?
- Simplicity.
- Lack of bias.
- Extensibility.
What are the fundamental aspects of a Data Warehouse?
- Centralized repository of information.
- Organized around relevant subject areas.
- Provides platform for queries.
- Used for analysis and not transactional processing.
- Data is nonvolatile.
- Target location for integrating data from multiple sources.
What is the general theme of the ETL process?
1. Get the data
2. Map the data to staging area
3. Validate and clean the data
4. Apply necessary transformations
5. Map data to loading model
6. Move data to repository
7. Load data to warehouse
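The seven steps above can be sketched end to end in a few lines of Python, with an in-memory SQLite database standing in for the warehouse. The source rows, the validation rule, and the SALES_FACT table name are hypothetical.

```python
import sqlite3

raw_rows = [  # step 1: get the data (a hypothetical source extract)
    {"spnum": "137", "quantity": " 473 "},
    {"spnum": "186", "quantity": "2241"},
    {"spnum": "", "quantity": "12"},        # missing key
    {"spnum": "204", "quantity": "n/a"},    # non-numeric quantity
]

def is_valid(row):
    """Keep only rows whose fields are whole numbers once whitespace is trimmed."""
    return row["spnum"].strip().isdigit() and row["quantity"].strip().isdigit()

def transform(row):
    """Convert text fields to the integer types of the loading model."""
    return int(row["spnum"]), int(row["quantity"])

staging = list(raw_rows)                             # step 2: map to the staging area
clean = [row for row in staging if is_valid(row)]    # step 3: validate and clean
load_ready = [transform(row) for row in clean]       # steps 4-5: transform, map to loading model

warehouse = sqlite3.connect(":memory:")              # steps 6-7: move and load
warehouse.execute("CREATE TABLE SALES_FACT (SPNUM INTEGER, QUANTITY INTEGER)")
warehouse.executemany("INSERT INTO SALES_FACT VALUES (?, ?)", load_ready)
loaded = warehouse.execute(
    "SELECT SPNUM, QUANTITY FROM SALES_FACT ORDER BY SPNUM"
).fetchall()
```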
What is the key factor based on the need for linear scalability?
Performance
What is used for populating summaries or any cube dimensions that can be performed at the staging area (ETL)?
Aggregation
What data mining activity is a process of assigning some continuously valued numeric value to an object?
Estimation
What includes exploiting the discovery of table and foreign keys for representing linkage between different tables?
Integration
What data mining activity is the process of organizing data into predefined classes?
Classification
Which activity groups data members that have similarities?
Clustering
Data Warehouse
A data warehouse is the primary source of information that feeds the analytical processing within an organization.
Data Mart
A data mart is a subject-oriented data repository, similar in structure to the enterprise data warehouse, but its main purpose is to serve directed reporting and drill-down into specific data.
OLAP
OLAP (Online Analytical Processing) is both a process of viewing comparative metrics via a multidimensional analysis of data and the infrastructure to support that process.
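OLTP
OLTP (online transaction processing) supports an organization's day-to-day operational transactions, typically many short inserts, updates, and deletes. It contrasts with OLAP, which presents data sourced from a data warehouse or a data mart in a way that allows the data consumer to view comparative metrics across multiple dimensions.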
Cartesian product
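The result of pairing every row of one table with every row of a second table, so the number of result rows is the product of the two tables' row counts. It is usually caused by a missing join condition, but it can also be used deliberately to expand the data of one table by the number of rows in the second.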
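A quick sqlite3 sketch of a Cartesian product: the same two tables are queried without and with a join condition. The rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER, SPNAME TEXT)")
conn.execute("CREATE TABLE CUSTOMER (CUSTNUM INTEGER, SPNUM INTEGER)")
conn.executemany(
    "INSERT INTO SALESPERSON VALUES (?, ?)",
    [(137, "Baker"), (186, "Adams"), (204, "Dickens")],
)
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [(1525, 137), (1700, 186)])

# No join condition: every salesperson row is paired with every customer row.
cartesian_count = conn.execute("SELECT COUNT(*) FROM SALESPERSON, CUSTOMER").fetchone()[0]

# With the join condition, only the matching pairs remain.
joined_count = conn.execute(
    "SELECT COUNT(*) FROM SALESPERSON, CUSTOMER "
    "WHERE SALESPERSON.SPNUM = CUSTOMER.SPNUM"
).fetchone()[0]
```

Three salespersons times two customers gives six rows without the condition and two rows with it.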
Data volatility
Describes how often stored data is updated.
DCL
Data control language is used to control access to data stored in a database.
Definer
Definer is the MySQL term; AuthID is the equivalent term in other DBMSs.
Domain of values
The shared values between a primary key and foreign key.
Extraction essentially boils down to two questions:
1. What data should be extracted?
2. How should that data be extracted?
Inner join
Shows only the rows that have matches in both tables.
Logical view
Logical view is a mapping onto a physical table or tables that allows an end user to access only a specified portion of data.
Outer join
Shows all the rows of one table, including those that have no match in the other table. Two kinds of outer joins are left and right joins.
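The difference between an inner join and an outer join shows up as soon as one row has no match. In this sqlite3 sketch (hypothetical rows), salesperson Dickens has no customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (SPNUM INTEGER, SPNAME TEXT)")
conn.execute("CREATE TABLE CUSTOMER (CUSTNUM INTEGER, SPNUM INTEGER)")
conn.executemany(
    "INSERT INTO SALESPERSON VALUES (?, ?)",
    [(137, "Baker"), (186, "Adams"), (204, "Dickens")],
)
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [(1525, 137), (1700, 186)])

# Inner join: only salespersons with a matching customer row.
inner_rows = conn.execute(
    "SELECT SPNAME, CUSTNUM FROM SALESPERSON "
    "JOIN CUSTOMER ON SALESPERSON.SPNUM = CUSTOMER.SPNUM ORDER BY SPNAME"
).fetchall()

# Left outer join: every salesperson, with NULL where there is no customer.
left_rows = conn.execute(
    "SELECT SPNAME, CUSTNUM FROM SALESPERSON "
    "LEFT JOIN CUSTOMER ON SALESPERSON.SPNUM = CUSTOMER.SPNUM ORDER BY SPNAME"
).fetchall()
```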
Referential Integrity
Referential integrity is a database concept that ensures that relationships between tables remain consistent.
Response Time
Is the delay from the time that the Enter Key is pressed to execute a query until the result appears on screen.
Scalar subquery
Is the most restrictive subquery because it produces only a single value
Superkey
A superkey is any set of one or more columns whose combined values are unique for every row.
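A small plain-Python sketch of the idea: a column set behaves as a superkey for a sample of rows if no two rows share the same values in those columns. The rows and the `is_superkey` helper are hypothetical, and passing the check on sample data does not prove the key holds for all possible data.

```python
rows = [
    {"SPNUM": 137, "SPNAME": "Baker", "YEARHIRE": 1995},
    {"SPNUM": 186, "SPNAME": "Adams", "YEARHIRE": 2001},
    {"SPNUM": 204, "SPNAME": "Adams", "YEARHIRE": 1998},
]

def is_superkey(rows, columns):
    """True if no two sample rows share the same values in the given columns."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)

spnum_alone = is_superkey(rows, ["SPNUM"])               # unique by itself
spnum_and_name = is_superkey(rows, ["SPNUM", "SPNAME"])  # still unique, but not minimal
name_alone = is_superkey(rows, ["SPNAME"])               # 'Adams' repeats
```

A superkey with no removable column, such as SPNUM alone here, is what the Candidate Key card describes.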
Throughput
Is the measure of how many queries from simultaneous users must be satisfied in a given period of time by the application set and the database that it supports.
UNION
To create a result set that combines the results from several queries
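A sketch of UNION in Python's sqlite3, combining city names from two queries. The OFFICE table and all rows are hypothetical. Plain UNION drops duplicate rows, while UNION ALL keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUSTNUM INTEGER, HQCITY TEXT)")
conn.execute("CREATE TABLE OFFICE (OFFNUM INTEGER, OFFCITY TEXT)")  # hypothetical table
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [(1525, "New York"), (1700, "Atlanta")])
conn.executemany("INSERT INTO OFFICE VALUES (?, ?)", [(1, "Atlanta"), (2, "Chicago")])

# UNION: one combined result set, with the duplicate 'Atlanta' removed.
union_cities = [
    row[0]
    for row in conn.execute(
        "SELECT HQCITY FROM CUSTOMER UNION SELECT OFFCITY FROM OFFICE ORDER BY 1"
    )
]

# UNION ALL: the same combination with duplicates kept.
union_all_count = len(
    conn.execute("SELECT HQCITY FROM CUSTOMER UNION ALL SELECT OFFCITY FROM OFFICE").fetchall()
)
```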
What are 6 potential problems that need to be considered with databases?
1. Size
2. Ease of updating
3. Accuracy
4. Security
5. Redundancy
6. Importance