Database Systems Chapter 2
Chapter 2 Summary - One
A data model is an abstraction of a complex real-world data environment. Database designers use data models to communicate with programmers and end users. The basic data-modeling components are entities, attributes, relationships, and constraints. Business rules are used to identify and define the basic modeling components within a specific real-world environment.
Chapter 2 Summary - Two
The hierarchical and network data models were early models that are no longer used, but some of their concepts are found in current data models.
Chapter 2 Summary - Three
The relational model is the current database implementation standard. In the relational model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modeling that complements the relational model. The ER model allows database designers to visually present different views of the data - as seen by database designers, programmers, and end users - and to integrate the data into a common framework.
Chapter 2 Summary - Four
The object-oriented data model (OODM) uses objects as the basic modeling structure. Like the relational model's entity, an object is described by its factual content. Unlike an entity, however, the object also includes information about relationships between the facts, as well as relationships with other objects, thus giving its data more meaning.
Chapter 2 Summary - Five
The relational model has adopted many object-oriented (OO) extensions to become the extended relational data model (ERDM). Object/relational database management systems (O/R DBMS) were developed to implement the ERDM. At this point the OODM is largely used in specialized engineering and scientific applications, while the ERDM is primarily geared to business applications.
Chapter 2 Summary - Six
Emerging Big Data technologies such as Hadoop, MapReduce, and NoSQL provide distributed, fault-tolerant and cost-efficient support for Big Data analytics. NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organizations. NoSQL databases offer distributed data stores that provide high scalability, availability, and fault-tolerance by sacrificing data consistency and shifting the burden of maintaining relationships and data integrity to the program code.
Chapter 2 Summary - Seven
Data-modeling requirements are a function of different data views (global versus local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual, and internal. The fourth and lowest level of data abstraction, called the physical level, is concerned exclusively with physical storage methods.
The process of creating a specific data model for a determined problem domain. A problem domain is a clearly defined area within the real-world environment with a well-defined scope and boundaries that will be systematically addressed. It is a progressive process: you start with a simple understanding of the problem domain, and as your understanding increases, so does the level of detail of the data model.
A representation, usually graphic, of a complex "real-world" data structure. Data models are used in the database design phase of the Database Life Cycle. An implementation-ready data model should contain the following:
1. A description of the data structure that will store the end user data
2. A set of enforceable rules to guarantee the integrity of the data
3. A data manipulation methodology to support the real-world transformations.
In short, data models are a communication tool.
A person, place, thing, concept, or event for which data can be stored. Also see attribute.
A characteristic of an entity or object. An attribute has a name and a data type.
For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone number, customer address, and customer credit limit.
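The CUSTOMER example above can be sketched in Python as a dataclass, where each field plays the role of an attribute with a name and a data type. This is only an illustration: the field names and the sample values are assumptions, not part of the original set.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """One CUSTOMER entity instance; each field is an attribute (name + data type)."""
    last_name: str
    first_name: str
    phone: str
    address: str
    credit_limit: float


# A hypothetical entity instance with made-up values.
customer = Customer("Doe", "Jane", "555-0100", "12 Elm Street", 500.00)

# Map each attribute name to its declared data type.
attribute_types = {f.name: f.type for f in fields(Customer)}
```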
An association between entities
One-to-Many ( 1:M or 1..*) Relationship
Associations among two or more entities that are used by data models. In a 1:M relationship, one entity instance is associated with many instances of the related entity.
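A 1:M relationship is commonly implemented by placing a foreign key on the "many" side. The sketch below uses Python's built-in sqlite3 module; the customer and invoice tables and their columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE customer (
    cus_id   INTEGER PRIMARY KEY,
    cus_name TEXT NOT NULL
);
CREATE TABLE invoice (
    inv_id    INTEGER PRIMARY KEY,
    cus_id    INTEGER NOT NULL REFERENCES customer(cus_id),  -- the "many" side
    inv_total REAL NOT NULL
);
""")

# One customer (the "1" side) is associated with many invoices.
conn.execute("INSERT INTO customer VALUES (1, 'Jane Doe')")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(101, 1, 25.0), (102, 1, 40.0)],
)

invoice_count = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE cus_id = 1"
).fetchone()[0]
```

Each invoice row points to exactly one customer, while a customer may appear in any number of invoice rows.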
Many-to-Many (M:N or *..*) Relationship
Association among two or more entities in which one occurrence of an entity is associated with many occurrences of a related entity and one occurrence of the related entity is associated with many occurrences of the first entity.
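In a relational database, an M:N relationship is usually broken into two 1:M relationships through a bridge table. A minimal sketch with sqlite3 follows; the student, course, and enroll tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
CREATE TABLE course  (crs_id INTEGER PRIMARY KEY, crs_title TEXT);
-- Bridge table: each row links one student to one course.
CREATE TABLE enroll (
    stu_id INTEGER REFERENCES student(stu_id),
    crs_id INTEGER REFERENCES course(crs_id),
    PRIMARY KEY (stu_id, crs_id)
);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enroll  VALUES (1, 10), (1, 20), (2, 10);
""")

# One student is associated with many courses...
courses_for_ana = [
    row[0]
    for row in conn.execute(
        "SELECT crs_id FROM enroll WHERE stu_id = 1 ORDER BY crs_id"
    )
]
# ...and one course is associated with many students.
students_in_databases = [
    row[0]
    for row in conn.execute(
        "SELECT stu_id FROM enroll WHERE crs_id = 10 ORDER BY stu_id"
    )
]
```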
One-to-One (1:1 or 1..1) Relationship
Associations among two or more entities that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity.
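One common way to enforce a 1:1 relationship is a foreign key that is also declared UNIQUE. The sketch below assumes a hypothetical rule that a professor can chair only one department; the table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, prof_name TEXT);
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT,
    -- UNIQUE keeps any professor from chairing more than one department.
    chair_id  INTEGER UNIQUE REFERENCES professor(prof_id)
);
INSERT INTO professor  VALUES (1, 'Dr. Doe');
INSERT INTO department VALUES (10, 'History', 1);
""")

try:
    # A second department with the same chair breaks the 1:1 rule.
    conn.execute("INSERT INTO department VALUES (20, 'Physics', 1)")
    second_chair_rejected = False
except sqlite3.IntegrityError:
    second_chair_rejected = True
```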
A restriction placed on data, usually expressed in the form of rules. For example, "A student's GPA must be between 0.00 and 4.00." Constraints are important because they help to ensure data integrity.
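The GPA rule above maps naturally onto a CHECK constraint. The following sketch uses sqlite3 with a hypothetical student table; the DBMS rejects any row that violates the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    stu_id  INTEGER PRIMARY KEY,
    stu_gpa REAL NOT NULL CHECK (stu_gpa BETWEEN 0.00 AND 4.00)
)
""")

conn.execute("INSERT INTO student VALUES (1, 3.25)")  # within range: accepted

try:
    conn.execute("INSERT INTO student VALUES (2, 4.70)")  # out of range
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```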
A description of a policy, procedure, or principle within an organization. For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period, or a professor may teach up to four classes during a semester.
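A business rule such as "a professor may teach up to four classes during a semester" eventually becomes a constraint that some layer of the system must enforce. A minimal sketch in application code follows; the function name and the class codes are hypothetical.

```python
MAX_CLASSES_PER_SEMESTER = 4  # limit taken from the business rule in the text


def assign_class(teaching_load: list[str], new_class: str) -> list[str]:
    """Return the new teaching load, or raise if the business rule would be broken."""
    if len(teaching_load) >= MAX_CLASSES_PER_SEMESTER:
        raise ValueError("a professor may teach up to four classes per semester")
    return teaching_load + [new_class]


# Build up a full load of four hypothetical classes.
load: list[str] = []
for title in ["DB101", "DB201", "CS110", "CS220"]:
    load = assign_class(load, title)

# A fifth class violates the rule and is refused.
try:
    assign_class(load, "CS330")
    fifth_class_allowed = True
except ValueError:
    fifth_class_allowed = False
```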
The process of identifying and documenting business rules is essential to database design for several reasons:
1. It helps to standardize the company's view of data.
2. It can be a communication tool between users and designers.
3. It allows the designer to understand the nature, role, and scope of the data.
4. It allows the designer to understand business processes.
5. It allows the designer to develop appropriate relationship participation rules and constraints and to create an accurate data model.
An early database model whose basic concepts and characteristics formed the basis for subsequent database development. This model is based on an upside-down tree structure in which each record is called a segment. The top record is the root segment. Each segment has a 1:M relationship to the segment directly below it.
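The upside-down tree can be sketched as a small Python structure in which every segment holds a list of child segments (a 1:M relationship from parent to children). The segment names below are made up for illustration and do not come from any real hierarchical DBMS.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A record in the hierarchy; one parent segment owns many child segments."""
    name: str
    children: list["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        child = Segment(name)
        self.children.append(child)
        return child


def walk(segment: Segment, depth: int = 0):
    """Visit segments top-down, starting at the root segment."""
    yield depth, segment.name
    for child in segment.children:
        yield from walk(child, depth + 1)


root = Segment("COMPANY")              # root segment
sales = root.add_child("SALES_DEPT")   # level-1 segments
root.add_child("IT_DEPT")
sales.add_child("EMPLOYEE_A")          # level-2 segment under SALES_DEPT

outline = list(walk(root))
```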
In the hierarchical data model, the equivalent of a file system's record type.
An early data model that represented data as a collection of record types in 1:M relationships.
A logical grouping of database objects, such as tables, indexes, views and queries, that are related to each other. The conceptual organization of the entire database as viewed by the database administrator.
The portion of the database that interacts with application programs that actually produce the desired information from the data within the database.
Data Manipulation Language (DML)
The set of commands that allows an end-user to manipulate the data in the database, such as SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK.
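The commands listed above can be tried with Python's built-in sqlite3 module. In this sketch the product table and its values are hypothetical; COMMIT and ROLLBACK are issued through the connection's commit() and rollback() methods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")

# INSERT and UPDATE, then COMMIT to make the changes permanent.
conn.execute("INSERT INTO product VALUES ('A1', 9.99)")
conn.execute("UPDATE product SET p_price = 12.50 WHERE p_code = 'A1'")
conn.commit()

# DELETE, then ROLLBACK to undo everything since the last COMMIT.
conn.execute("DELETE FROM product WHERE p_code = 'A1'")
conn.rollback()

# SELECT shows the committed row survived the rolled-back DELETE.
rows = conn.execute("SELECT p_code, p_price FROM product").fetchall()
```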
Data Definition Language (DDL)
The language that allows a database administrator to define the database structure, schema, and subschema.
Developed by E.F. Codd of IBM in 1970, the relational model is based on mathematical set theory and represents data as independent relations. Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns).
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model.
In the relational model, a table row.
Relational Database Management System (RDBMS)
A collection of programs that manages a relational database. The RDBMS software translates a user's logical requests (queries) into commands that physically locate and retrieve the requested data.
A graphical representation of a relational database's entities, the attributes within those entities, and the relationships among the entities.
Entity Relationship (ER) Model (ERM)
A data model that describes relationships (1:1, 1:M, M:N) among entities at the conceptual level with the help of ER diagrams. The model was developed by Peter Chen.
Entity Relationship Diagram (ERD)
A diagram that depicts an entity relationship model's entities, attributes and relations.
Entity Instance (Entity Occurrence)
A row in a relational table.
A collection of like entities.
The type of relationship between entities. Classifications include 1:1, 1:M, and M:N.
See entity relationship (ER) model
Crow's Foot Notation
A representation of the entity relationship diagram that uses a three-pronged symbol to represent the "many" sides of the relationship.
Class Diagram Notation
The set of symbols used in the creation of class diagrams.
Object Oriented Data Model (OODM)
A data model whose basic modeling structure is an object. Unlike an entity, an object includes information about relationships between the facts about the object, as well as information about its relationships with other objects. Therefore, the facts within the object are given greater meaning.
An abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself.
Object Oriented Database Management System (OODBMS)
Data management software used to manage data in an object-oriented database model.
Semantic Data Model
The first of a series of data models that more closely represented the real world, modeling both data and their relationships in a single structure known as an object. The SDM, published in 1981, was developed by M. Hammer and D. McLeod.
A collection of similar objects with shared structure (attributes) and behavior (methods). A class encapsulates an object's data representation and a method's implementation. Classes are organized in a class hierarchy.
The organization of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass. Also see inheritance.
In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy. Also see class hierarchy.
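Class hierarchy and inheritance can be illustrated with a short Python sketch. The Person, Employee, and Pilot classes and their attributes are hypothetical; the point is that a Pilot object inherits data structure and methods from the classes above it.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def contact_label(self):
        return f"Contact: {self.name}"


class Employee(Person):  # subclass of Person
    def __init__(self, name, hire_year):
        super().__init__(name)
        self.hire_year = hire_year


class Pilot(Employee):  # subclass of Employee
    def __init__(self, name, hire_year, license_type):
        super().__init__(name, hire_year)
        self.license_type = license_type


pilot = Pilot("Jane Doe", 2015, "ATP")

# The method is defined on Person but is available to Pilot through inheritance.
label = pilot.contact_label()

# The class hierarchy, from the subclass up to the root superclass.
ancestors = [cls.__name__ for cls in Pilot.__mro__]
```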
Unified Modeling Language (UML)
A language based on object-oriented concepts that provides tools such as diagrams and symbols to graphically model a system.
A diagram used to represent data and their relationships in UML object notation.
Extended Relational Data Model (ERDM)
A model that includes the object-oriented model's best features in an inherently simpler relational database structural environment. See extended entity relationship model (EERM).
Object/Relational Database Management System (O/R DBMS)
A DBMS based on the extended relational model (ERDM). The ERDM, championed by many relational database researchers, constitutes the relational model's response to the OODM. This model includes many of the object-oriented model's best features within an inherently simpler relational database structure.
Extensible Markup Language (XML)
A metalanguage used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document's data elements. XML facilitates the exchange of structured documents such as orders and invoices over the Internet.
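As a small illustration of representing and manipulating data elements, the sketch below builds and re-reads an invoice with Python's standard xml.etree.ElementTree module. The element and attribute names are made up for the example and do not follow any particular invoice standard.

```python
import xml.etree.ElementTree as ET

# Build a tiny invoice document from data elements.
invoice = ET.Element("invoice", number="1001")
line = ET.SubElement(invoice, "line")
ET.SubElement(line, "product").text = "Widget"
ET.SubElement(line, "quantity").text = "3"

# Serialize to text, as it might be exchanged between two systems.
xml_text = ET.tostring(invoice, encoding="unicode")

# The receiver parses the text and reads individual data elements.
parsed = ET.fromstring(xml_text)
invoice_number = parsed.get("number")
quantity = int(parsed.find("line/quantity").text)
```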
A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
3 Vs
The three characteristics of big data databases:
1. Volume: refers to the amounts of data being stored.
2. Velocity: refers not only to the speed with which data grows but also to the need to process this data quickly in order to generate information and insight.
3. Variety: refers to the fact that the data being collected comes in multiple different data formats.
A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework. Hadoop uses low-cost hardware to create clusters of thousands of computer nodes to store and process data.
Hadoop Distributed File System (HDFS)
A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
One of three types of nodes used in the Hadoop Distributed File System (HDFS). The name node stores all the metadata about the file system. See also client node and data node.
One of three types of nodes used in the Hadoop Distributed File System (HDFS). The data node stores fixed-size data blocks (that could be replicated to other data nodes). See also client node and name node.
One of three types of nodes used in the Hadoop Distributed File System (HDFS). The client node acts as the interface between the user application and the HDFS. See also name node and data node.
An open-source application programming interface (API) that provides fast data analytics services; one of the main Big Data technologies that allows organizations to process massive data stores.
A new generation of database management systems that is not based on the traditional relational database model.
A data model based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values. The key-value data model is also called the associative or attribute-value data model.
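The key-value idea can be sketched with a plain Python dictionary standing in for the store. The put and get helpers and the key naming scheme are hypothetical; real key-value databases expose their own APIs.

```python
# A toy in-memory key-value store: the store only knows keys and opaque values.
store = {}


def put(key, value):
    store[key] = value


def get(key, default=None):
    return store.get(key, default)


# Values need not share a structure; interpreting them is left to the program code.
put("customer:1001", {"name": "Jane Doe", "credit_limit": 500})
put("customer:1002", {"name": "John Roe", "tags": ["wholesale", "overseas"]})

first = get("customer:1001")
missing = get("customer:9999")
```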
A case in which the number of table attributes is very large but the number of actual data instances is low.
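A quick way to see sparseness is to count how many attribute slots actually hold a value. In this sketch the attribute names and rows are made up; each row stores only the attributes that have data.

```python
# Hypothetical table with many possible attributes.
attributes = ["height", "weight", "allergy", "blood_type", "shoe_size"]

# Each row records only the attributes that actually have a value.
rows = [
    {"id": 1, "blood_type": "O+"},
    {"id": 2, "height": 180},
]


def fill_ratio(rows, attributes):
    """Fraction of attribute slots that actually contain data."""
    filled = sum(1 for row in rows for name in attributes if row.get(name) is not None)
    return filled / (len(rows) * len(attributes))


ratio = fill_ratio(rows, attributes)  # 2 filled slots out of 10
```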
A model for database consistency in which updates to the database will propagate through the system so that all data copies will be consistent eventually.
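The behavior can be illustrated with a toy simulation: a write lands on one replica first and reaches the others only when queued updates are propagated. The Cluster class below is a hypothetical sketch, not how any particular NoSQL product replicates data.

```python
from collections import deque


class Cluster:
    """Toy replicated store: writes hit replica 0 and are copied to the rest later."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # queued (replica_index, key, value) updates

    def write(self, key, value):
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].get(key)

    def propagate(self):
        """Apply all queued updates so that every copy becomes consistent."""
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value


cluster = Cluster(replica_count=3)
cluster.write("stock:widget", 7)

stale_read = cluster.read(2, "stock:widget")  # replica 2 has not seen the write yet
cluster.propagate()
fresh_read = cluster.read(2, "stock:widget")  # now consistent with replica 0
```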
The application programmer's view of the data environment. Given its business focus, an external model works with a data subset of the global database schema.
The specific representation of an external view; the end user's view of the data environment.
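In a relational DBMS, one way to give a group of end users their own subset of the global schema is a view. The sketch below uses sqlite3; the employee table, the sales_directory view, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Global table, as seen across the whole database.
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    emp_name   TEXT,
    emp_salary REAL,
    dept       TEXT
);
INSERT INTO employee VALUES (1, 'Ana', 72000, 'SALES'), (2, 'Ben', 65000, 'IT');

-- The sales staff see only a subset: no salaries, only their own department.
CREATE VIEW sales_directory AS
    SELECT emp_id, emp_name FROM employee WHERE dept = 'SALES';
""")

view_rows = conn.execute("SELECT * FROM sales_directory").fetchall()
```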
The output of the conceptual design process. The conceptual model provides a global view of an entire database and describes the main data objects, avoiding details.
A representation of the conceptual model, usually expressed graphically. See also conceptual model.
A property of any model or application that does not depend on the software used to implement it.
A condition in which a model does not depend on the hardware used in the model's implementation. Therefore, changes in the hardware will have no effect on the database design at the conceptual level.
A stage in the design phase that matches the conceptual design to the requirements of the selected DBMS and is therefore software-dependent. Logical design is used to translate the conceptual design into the internal model for a selected database management system, such as DB2, SQL Server, Oracle, IMS, Informix, Access, or Ingres.
In database modeling, a level of data abstraction that adapts the conceptual model to a specific DBMS model for implementation. The internal model is the representation of a database as "seen" by the DBMS. In other words, the internal model requires a designer to match the conceptual model's characteristics and constraints to those of the selected implementation model.
A representation of an internal model using the database constructs supported by the chosen database.
A condition in which the internal model can be changed without affecting the conceptual model. (The internal model is hardware-independent because it is unaffected by the computer on which the software is installed. Therefore, a change in storage devices or operating systems will not affect the internal model.)
A model in which physical characteristics such as location, path, and format are described for the data. The physical model is both hardware- and software-dependent. See also physical design.
A condition in which the physical model can be changed without affecting the internal model.