Upgrade to remove ads
Terms in this set (34)
Performance tuning is all about
it's a set of procedures that reduce the response time of a database system. all factors operate at optimum level with minimal bottlenecks
People are calling commands to the RDBMS, and the critical backend connects to the backend data. Performance tuning aims to make this as streamlined as possible.
Client side SQL performance tuning
Generates SQL query that returns the correct answer in the least amount of time
Server side DBMS performance tuning
DBMS environment configured to respond to clients' requests as fast as possible
All data in database stored in data files.
*Data files automatically expand in predefined increments known as "extends" (every time it's full at 2GB)
Data files are grouped in file groups or table spaces
* Table space or file group: logical grouping of several data files that store data with similar characteristics
Data cache or buffer cache: shared, reserved memory area. Stores most recently accessed data blocks in RAM. Runs the SQL queries and does data calculations.
SQL cache: stores the most recently executed SQL statements or PL/SQL procedures
Makes use of the SQL cache and the Data cache. The optimizer looks to see if the access plan for the query already exists in the SQL cache, if so it uses it. analyzes SQL queries and finds the most efficient way to access the data
Set of instructions generated when the application is compiled that predetermines how the application will access the database at run time. Stored in the SQL cache.
SQL Cache or Procedure cache
A shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures, including triggers and functions.
DBMS retrieves data from permanent storage and places them in RAM. Input/Output request: low-level data access operation that reads or writes data from the computer devices (note: reads fetch entire disk datablocks)
Data cache is faster than working with data files.
Majority of performance-tuning activities focus on minimizing I/O operations (the bottleneck, critical backend)
Listens for clients' requests and handles the processing of the SQL requests to other DBMS processes. Once a valid request is received, the listener passes the request to the appropriate user process.
The DBMS creates a user process to manage each client session. Therefore, when you log on to the DBMS, you are assigned a user process. This process handles all requests you submit to the server.
The scheduler process organizes the concurrent execution of SQL requests. How to interleave concurrent transactions. Maximize the number of people that can run in parallel.
This process manages all locks places on database objects, including disk pages
Database Query Optimization Modes
Algorithms proposed for query optimization are based on selection of the optimum order for faster runtime, selection of sites to be accessed to minimize communication costs (what kind of pattern used for accessing the data).
Evaluated on operation mode (automatic/manual), timing of operation (static/dynamic), type of information (statistically based/rule based)
Operation Mode: Automatic query optimization
DBMS finds the most cost-effective access path without user intervention. You don't need to know anything about it, but it may become inefficient over time.
Operation Mode: Manual query optimization
Requires that the optimization be selected and scheduled by the end user or programmer. More powerful, but need to know a lot about what you're optimizing.
Timing of operation: Static query optimization
best optimization strategy is selected when the query is compiled by the DBMS
- at compilation time, best for embedded queries
maybe not valid after a while
Timing of operation: Dynamic query optimization
access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database
- more overhead processing
Type of information: Statistically based query optimization algorithm
Statistics are used by the DBMS to determine the best access strategy (constantly gathered as people are accessing the data). Data about tables, indexes, resources)
Stats info generated through:
dynamic statistical generation mode - auto eval (e.g. have it run every few hours)
manual statistical generation mode - via user (whenever the user runs it)
A cost-based optimizer takes into account the processing cost, RAM cost, I/O cost, etc.
reminds him of machine learning
Type of information: Rule-based query optimization algorithm
based on a set of user-defined rules to determine the best query access strategy. Go to an expert to write rules about the best access strategy.
A rule-based optimizer minimizes the cost.
reminds him of AI
Query Processing - how does your SQL query run?
Parsing -DBMS parses the SQL query, makes sure it's valid (syntax, access rights) and chooses the most efficient access/execution plan. Done by the query optimizer
SQL Cache stores the access plan
Execution -DBMS executes the SQL query using the chosen execution plan, retrieves data blocks from data files, place data blocks in data cache
Data Cache - run your SQL query
Fetching -DBMS sends the result set back to the client
SQL parsing phase (done by query optimizer)
1. Query is broken down into smaller units
2. Original SQL query is transformed into slightly different version of the original SQL code which is fully equivalent and more efficient
3. Query optimizer - analyzes SQL query and finds most efficient way to access data
4. Access plans - DBMS-specific and translate client's SQL query into a series of complex I/O operations. If access plan already exists for query in the SQL cache DBMS reuses it. Otherwise it evaluates various plans and chooses one to be placed in the SQL cache for use.
SQL Execution phase
All I/O operations indicated in the access plan are executed. Locks are acquired, data are retrieved and placed in the data cache.
Transaction management commands are processed.
NOTE: Execution here refers to executing the access plan (fetching/sending data from/to the backend database) the SQL query execution is in the next step.
SQL Fetching Phase
Rows of resulting query result set are returned to the client. DBMS may use temporary table space to store temporary data. Database server coordinates the movement of the result set rows from the server cache to the client cache (e.g. will send blocks of rows to the client, wait for the next request, send next block..)
Query Processing Bottlenecks
Delay introduced in the processing of an I/O operation that slows the system. Caused by the:
-CPU (slow processor..)
-RAM (shared among running processes)
-Hard disk (disk speed, transfer rates..)
-Network - (bandwidth shared among clients)
- bad user code, poor bd design...
AKA buy the best equipment or more RAM, fastest connection, good SQL code etc.
Indexes and Query Optimization
Indexes help speed up data access
Facilitate searching, sorting, aggregate functions, join operation s
Ordered set of values that contain the index key and pointers
More efficient than a full table scan (runtime lookup costs using the index are vastly cheaper than doing full searches through non-indexed rows).
Number of different values a column could have, low sparsity => index might be useless
Data structures used to implement indexes
DBMSs determine best type of index to use
Is based on an ordered list of hash values, computed from a key column, using a hashing algorithm - it is much faster to search through the hash values than search the columns. Each hash value then points to the actual (column) data.
A user query (eg. LNAME="Johnson") is converted to a hash 'key' which is then used to search through the pre-computed list of hash values and retrieve an exact match or a small set of values that are all stored with the same key (as a result of 'hash collision').
B-tree index, bitmap index
B-tree index used in columns with high data sparsity (many different values). Bitmap index used in columns with low data sparsity (few different values)
The likelihood that an index will be used in query processing. Indexes are used when a subset of rows from a large table is to be selected based on a given condition. It cannot always improve performance
Function-based index: Based on a specific SQL function or expression (e.g. EMP_SALARY+COMMISSION)
Indexes are useful when:
Indexes are useful when:
- an indexable column occurs in a WHERE or HAVING search expression
- an indexable column appears in a GROUP BY or ORDER BY clause
- MAX or MIN is applied to an indexable column
- there is high data sparsity on an indexable column
Worth creating indexes on single columns that appear in WHERE, HAVING, ORDER BY, GROUP BY and join conditions.
Special instructions for the optimizer, embedded in the SQL command text.
use to affect the optimizer choices. Because it might not make the best decision (e.g. based on old statistics)
3 Optimizer hints
ALL_ROWS - minimize overall execution time
FROM PRODUCT WHERE P_QOH < 10;
FIRST_ROWS - minimize time needed to process first set of rows (generally for interactive processes)
INDEX(name) - forces the optimizer to use the name index to process this query.
YOU MIGHT ALSO LIKE...
Module 13 - Chapter 11: DB Performance Tuning and…
Database Performance Tuning and Query Optimization
Chapter 11 (MIS 4330) Performance Tuning
ADBMD CHAPTER 11
OTHER SETS BY THIS CREATOR
585 Lecture 3