A SQL statement involving two or more tables requires Oracle to join data from multiple sources in order to achieve the desired result. The Oracle Server has several built-in methods for performing joins, and each method has its own performance and resource usage characteristics. While an application developer or DBA doesn’t need to explicitly tell Oracle which join method to use or the order in which different tables should be joined, a solid understanding of how Oracle processes joins allows us to design and build systems for optimal performance. In this presentation we’ll examine the different methods Oracle uses for joining data from multiple sources, and we’ll see how this understanding can be used to design better database schemas, tune SQL statements, and tune databases as a whole for better join performance. Since most SQL statements involve joins, tuning joins leads to improved overall system performance.
How Oracle Processes SQL Statements Involving Joins
Oracle’s optimizer must choose an execution plan for each SQL statement submitted. The plan will indicate how data will be accessed—which indexes will be used and so forth. If the SQL statement involves a join or a subquery, then the optimizer must also choose the order in which tables will be joined together, and the method to be used for each join operation.
A row source is a logical set of rows from a partially complete query. The row source may be rows in an actual table or rows in a temporary segment stored on disk or in memory. Oracle always joins two row sources at a time. Therefore, if a SQL statement joins three tables, Oracle will actually join two tables together and then join the results with the third table.
The order in which row sources are joined can have a significant impact on performance. The optimizer makes a list of all physically possible join orders and then chooses what it believes will be the best option. The cost-based optimizer makes this decision by estimating the cost of each option and choosing the one with the lowest cost. The rule-based optimizer, meanwhile, scores each option based on a fixed set of rules and chooses the option with the best score.
Outer joins limit the number of possible join orders because a table being outer joined must appear in the join order after the table it joins to.
Let’s use the following query as an example:
SELECT E1.name "Employee Name", D.deptname "Department",
E2.name "Manager Name"
FROM emp E1, dept D, emp E2
WHERE E1.empno = :empno
AND D.deptno = E1.deptno
AND E2.empno (+) = E1.mgr_empno;
Oracle could execute this SQL statement by joining emp table E1 to dept, and then joining the results to emp table E2. Alternatively, the order could be E1 - E2 - D. Note that the order D - E1 - E2 is also possible, but probably not too efficient. Also note that the order D - E2 - E1 is not physically possible because E2 is being outer joined to E1 and therefore must appear after E1 in the join order. And since D and E2 share no criteria in the WHERE clause, such a join order wouldn’t seem to make much sense anyway.
Oracle has four methods for joining two row sources together:

- The nested loops join
- The sort-merge join
- The cluster join
- The hash join
Nested Loops Joins
Suppose somebody gave us a telephone book and a list of 20 names, and asked us to write down each person’s name and telephone number on a fresh sheet of paper. We would probably go down the list of names, looking each up in the telephone book one at a time. This would be pretty easy, because the telephone book is alphabetized by name. Moreover, somebody looking over our shoulder could begin calling the first few numbers we write down while we are still looking up the rest. This scene describes a nested loops join.
In a nested loops join, Oracle reads the first row from the first row source and then checks the second row source for matches. All matches are then placed in the result set and Oracle goes on to the next row from the first row source. This continues until all rows in the first row source have been processed. The first row source is often called the outer table or driving table, while the second row source is called the inner table.
Nested loops joins are also flexible in that any two row sources can always be joined by nested loops—regardless of join condition and schema definition.
However, nested loops joins can be very inefficient if the inner row source does not have an index on the joined columns or if the index is not highly selective. Also, if the driving row source is quite large, other join methods may be more efficient.
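As an illustrative sketch (the table, index, and plan formatting here are hypothetical, not output from an actual database), an execution plan for a nested loops join driven by a dept table might look something like this in EXPLAIN PLAN output:

NESTED LOOPS
  TABLE ACCESS (FULL) OF 'DEPT'
  TABLE ACCESS (BY ROWID) OF 'EMP'
    INDEX (RANGE SCAN) OF 'EMP_DEPTNO_N1' (NON-UNIQUE)

For each row fetched from the driving dept table, Oracle probes the emp index once. Without that index, the inner access would degenerate into a full table scan repeated for every driving row.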
Sort-Merge Joins

Suppose two salespeople attend a conference and each collects over 100 business cards from potential new customers. Each salesperson now has a pile of cards in random order, and they want to see how many cards are duplicated in both piles. So each salesperson alphabetizes their pile, and then they call off names one at a time. Since both piles of cards have been sorted, it becomes much easier to find the names that appear in both piles. This scene describes a sort-merge join.
In a sort-merge join, Oracle sorts the first row source by its join columns, sorts the second row source by its join columns, and then “merges” the sorted row sources together. As matches are found, they are put into the result set.
Sort-merge joins can be effective when lack of data selectivity or useful indexes render a nested loops join inefficient, or when both of the row sources are quite large. However, sort-merge joins can only be used for equijoins (WHERE D.deptno = E1.deptno, as opposed to WHERE T.temperature >= D.daily_avg_temperature). Also, sort-merge joins require temporary segments for sorting. This can lead to extra memory utilization and extra disk I/O in the temporary tablespace.
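The memory available for these sorts is controlled by the sort_area_size initialization parameter, which can also be adjusted at the session level. As a sketch (the value shown is arbitrary), a session about to run a large sort-merge join might request more sort memory:

ALTER SESSION SET SORT_AREA_SIZE = 1048576;

Sorts that do not fit in the sort area spill to temporary segments on disk, which is where the extra I/O in the temporary tablespace comes from.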
Cluster Joins

Imagine a filing cabinet full of personnel records. For each employee there is a separate folder containing a basic information sheet, a list of vacation days used, and other papers pertaining to the employee. If the boss wants to see a report that lists each employee’s name, social security number, and number of vacation days used this year, then we could work through the filing cabinet one file at a time. When we pull out one employee’s file, we immediately have access to the basic information sheet (which has the employee’s social security number on it) and the list of vacation days used. If we imagine that the basic information is stored in one database table and the vacation days used in another, then this scene could describe a cluster join.
A cluster join is really just a special case of the nested loops join. If the two row sources being joined are actually tables that are part of a cluster and if the join is an equijoin between the cluster keys of the two tables, then Oracle can use a cluster join. In this case, Oracle reads each row from the first row source and finds all matches in the second row source by using the cluster index. In the personnel cabinet example, the filing cabinet is the cluster and the individual employee folders are the cluster key.
Cluster joins are extremely efficient, since the joining rows in the two row sources will actually be located in the same physical data block. However, clusters carry certain caveats of their own, and you can’t have a cluster join without a cluster. Therefore, cluster joins are not very commonly used. In a later section we’ll look at the pros and cons of clusters.
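To make the personnel example concrete, the two tables could be stored in a cluster like this (all object names and column definitions are hypothetical):

CREATE CLUSTER personnel (empno NUMBER(9));

CREATE INDEX personnel_cluster_idx ON CLUSTER personnel;

CREATE TABLE emp_basic_info
  (empno NUMBER(9), name VARCHAR2(60), ssn VARCHAR2(11))
  CLUSTER personnel (empno);

CREATE TABLE emp_vacation_days
  (empno NUMBER(9), vacation_date DATE)
  CLUSTER personnel (empno);

Rows from both tables with the same empno value are stored together in the same data block, so an equijoin between the two tables on empno becomes a candidate for a cluster join.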
Hash Joins

In a hash join, Oracle reads all of the join column values from the second row source, builds a hash table, and then probes the hash table for each of the join column values from the first row source. This is like a nested loops join, except that first Oracle builds a hash table to facilitate the operation.
Hash joins can be effective when the lack of a useful index renders nested loops joins inefficient. The hash join might be faster than a sort-merge join in this case because only one row source needs to be sorted, and could possibly be faster than a nested loops join because probing a hash table in memory can be faster than traversing a B-tree index. As with sort-merge joins and cluster joins, though, hash joins only work on equijoins. Also, as with sort-merge joins, hash joins use memory resources and can drive up I/O in the temporary tablespace. Finally, hash joins are only available in Oracle 7.3 and later, and then only when cost-based optimization is used.
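Hash join behavior is governed by initialization parameters such as hash_join_enabled and hash_area_size. As a sketch (the memory figure is arbitrary), a session could enable hash joins and size the hash memory area like this:

ALTER SESSION SET HASH_JOIN_ENABLED = TRUE;
ALTER SESSION SET HASH_AREA_SIZE = 2097152;

If the hash table does not fit within hash_area_size, Oracle partitions the work and spills to the temporary tablespace, which is the source of the extra I/O mentioned above.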
Effective Schema Design
The database schema plays an important role in tuning joins. Certain features in the schema, or the absence of these features, will limit the options available to the optimizer for joining tables efficiently. Also, poorly placed features in the schema can fool the optimizer into making a poor decision.
All primary and unique keys should be identified and appropriate constraints declared. This will cause the creation of unique indexes on the primary and unique key columns. Indexes should usually be created explicitly on foreign key columns to facilitate joins and to eliminate locking problems when Oracle enforces referential integrity constraints.
Consider an orders table with primary key order_id, and an order_lines table with primary key order_line_id and foreign key order_id referencing the orders table.
SELECT O.order_number, L.line_number, L.item_number, L.quantity
FROM orders O, order_lines L
WHERE O.order_id = :order_id
AND L.order_id = O.order_id
ORDER BY O.order_number, L.line_number
If the database will contain many orders, then the order_id column in the order_lines table is very selective and should be indexed. This index, plus the unique index on the order_id column of the orders table, will allow Oracle to execute this query extremely efficiently using a nested loops join.
First Oracle will use the unique index on the orders table to create a row source with one row from the orders table. Then Oracle will join this row source with all rows from the order_lines table. The index on the order_id column of the order_lines table allows Oracle to efficiently find all order lines for the specified order. Without this index, Oracle would probably perform a sort-merge join or hash join, requiring a scan of all rows in the order_lines table.
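The indexes that enable this plan could be created as follows (constraint and index names are hypothetical); declaring the primary key builds the unique index on orders automatically:

ALTER TABLE orders
  ADD CONSTRAINT orders_pk PRIMARY KEY (order_id);

CREATE INDEX order_lines_order_id_n1
  ON order_lines (order_id);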
It’s not a hard and fast rule, though, that all foreign key columns should be indexed. Indexes should not be placed on columns that are not selective, since such indexes can trick the optimizer into choosing a bad execution plan. Suppose that an order tracking application has a table called order_placement_methods that lists the ways orders can be placed, and an orders table that lists active orders. The orders table has an order_placement_method_id column referencing the order_placement_methods table to indicate how a specific order was placed. If orders can be placed by telephone or via the Internet, and roughly half the orders are expected to be placed each way, then this foreign key in the orders table should definitely not be indexed. The index would not be selective, always retrieving about half the rows in the orders table.
The Star Formation
Beginning in Oracle 7.3, the optimizer recognizes a schema design known as a star formation. This design is common in data warehouses, and the optimizer can handle queries against these types of schemas using a special nested loops join that is an exception to the “only two row sources can be joined at a time” rule.
In a star formation, several relatively small tables called dimension tables or lookup tables have simple primary keys. A large table called the fact table has a primary key that is a concatenation of the primary keys of the dimension tables. A schema with small tables for time, region, and product and a large table for sales figures would follow a star formation if each record in the sales table indicated sales of one product in one region during one time period.
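A sketch of such a fact table (column names and data types are hypothetical):

CREATE TABLE sales
  (time_id     NUMBER(9),
   region_id   NUMBER(9),
   product_id  NUMBER(9),
   units_sold  NUMBER,
   sale_amount NUMBER,
   CONSTRAINT sales_pk
     PRIMARY KEY (time_id, region_id, product_id));

The concatenated primary key index on the fact table is what allows the optimizer to probe sales efficiently after evaluating the small dimension tables.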
The following query would be executed very efficiently if the primary key on the sales table consists of the concatenation of the primary keys of the other three tables:
SELECT S.*
FROM time T, regions R, products P, sales S
WHERE T.end_date BETWEEN SYSDATE - 1 AND SYSDATE
AND R.name = 'West Coast'
AND P.bar_code = '184901724'
AND S.time_id = T.time_id
AND S.region_id = R.region_id
AND S.product_id = P.product_id
Clusters

Clusters can improve join performance when two or more tables are usually accessed together, as in a master-detail relationship. Queries that join the clustered tables together run faster because each master record is stored in the same data block with its related detail records. Even a single table that is frequently self-joined or queried for all records with the same value of a key can benefit from being stored in a cluster.
However, updates are slower when a table is clustered, as are full table scans. Also, storage tends not to be as efficient in a cluster. Therefore, much thought should be put in before clustering tables. In practice, clusters are not frequently used.
Distributed Databases

Because each node of a distributed database is an autonomous unit, distributed queries can be a serious bottleneck. Whereas Oracle can efficiently join two row sources located on the same node, joins between a local row source and a remote one can be costly. In a cross-node join, the local node must connect to the remote node and then formulate standard SQL statements to extract the desired data from the remote node. The optimizer on the local node does not have access to the remote node’s optimizer and therefore must make educated guesses as to the optimal way to execute the distributed query.
In many cases, distributed queries can be tuned to work efficiently. This will be discussed in the section below on tuning SQL. For distributed queries that just can’t be made efficient, or where maximum performance is critical, replication can be used. Simple snapshots or Oracle’s Advanced Replication Option can be used to copy data from a remote node so that queries do not need to join row sources across nodes.
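As a sketch (object names, database link, and refresh interval are all hypothetical), a simple snapshot could maintain a local, hourly-refreshed copy of the remote rows of interest; fast refresh additionally requires a snapshot log on the master table at the remote node:

CREATE SNAPSHOT local_open_bookings
  REFRESH FAST
  START WITH SYSDATE NEXT SYSDATE + 1/24
  AS SELECT *
     FROM bookings@sales_node
     WHERE status = 'OPEN';

Local queries can then join to local_open_bookings without any cross-node traffic at query time.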
Choosing Data Types
Choose data types for all primary and unique keys carefully, and remain consistent throughout the schema. For example, if the primary key on an invoice table is a column called invoice_number and you select a data type of NUMBER(9) for this column, then always use this same data type for invoice numbers throughout the entire schema. If you were to use a data type of VARCHAR2(10) for the invoice_number column of the invoice_items table, then Oracle would need to perform a data type conversion when joining the invoices and invoice_items tables. This conversion costs CPU cycles and can sometimes defeat indexes, having a disastrous effect on performance.
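For example, if invoices.invoice_number is NUMBER(9) but invoice_items.invoice_number was declared VARCHAR2(10), a join between the two tables (sketched below; the line_number column is hypothetical) forces Oracle to implicitly apply TO_NUMBER() to the character column, which defeats any ordinary index on invoice_items.invoice_number:

SELECT I.invoice_number, II.line_number
FROM invoices I, invoice_items II
WHERE II.invoice_number = I.invoice_number
AND I.invoice_number = :invoice_number;

Keeping both columns NUMBER(9) avoids the conversion entirely and leaves all indexes usable.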
Primary and unique keys for large tables or tables that are heavily joined should be kept small and simple. I’ve worked on multiple projects where a critical table had as its primary key a column with a data type such as VARCHAR2(100). The primary key was actually a series of strings, concatenated together in order to make a unique “key”. Such a primary key is expensive because the large key must be carried in every referencing table, and also because Oracle must compare long strings when performing a join.
A NUMBER(9) or VARCHAR2(5) column requires far fewer bytes and allows faster comparisons during a join. If the true unique key for a table in your application involves many columns or long strings, then consider declaring a unique key on these columns and adding a surrogate primary key column with data type NUMBER(9) that is populated from a sequence. All referencing tables could then carry the NUMBER(9) value instead of the more complex true unique key.
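A sketch of this approach (table, column, constraint, and sequence names are hypothetical):

CREATE SEQUENCE shipment_id_seq;

CREATE TABLE shipments
  (shipment_id  NUMBER(9),
   carrier_name VARCHAR2(30),
   waybill_code VARCHAR2(30),
   ship_date    DATE,
   CONSTRAINT shipments_pk PRIMARY KEY (shipment_id),
   CONSTRAINT shipments_uk UNIQUE (carrier_name, waybill_code, ship_date));

Applications populate shipment_id from shipment_id_seq.NEXTVAL at insert time, and referencing tables carry only the compact NUMBER(9) value instead of the long true unique key.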
Tuning SQL

The more a database schema is designed with an eye toward optimizing joins, the less individual SQL statements need to be tuned. But still, SQL is such a flexible language that there are usually many ways to write a query. Being able to identify the efficient ways is a good skill to have.
The Usual Suspects
All of the usual fundamentals of tuning SQL still apply when tuning joins: using selective indexes where possible, avoiding unintentional index defeat caused by NVL() and other functions, and avoiding excessive sorting caused by overworking the DISTINCT and GROUP BY operators. Since this presentation already has so much ground to cover, these basic tuning concepts will not be discussed here.
Early Elimination of Candidate Rows
Suppose we have a list of 1000 residents of our town along with each resident’s street address, and we are asked to prepare an alphabetized list of the residents who have propane delivered to their homes.
We could first alphabetize the list of 1000 names and then look up each street address in the list of propane delivery locations, but instead we would probably look up each street address first and alphabetize second. Why? Either way we need to perform 1000 street address lookups. However, doing the lookups first eliminates many names from the list (since not all residents have propane delivered to their homes), and thus the alphabetizing job becomes easier.
We can apply the same concept when writing SQL that joins tables together. The Oracle optimizer is pretty smart about choosing the most efficient order in which to do things, but how a query is written can constrain the options available to the optimizer.
The query below leaves the optimizer no choice but to read all of Acme’s invoice lines, when in fact only the unpaid invoices are of interest:
SELECT V.vendor_num, I.invoice_num, SUM (L.amount)
FROM vendors V, invoices I, invoice_lines L
WHERE V.vendor_name = 'Acme'
AND L.vendor_num = V.vendor_num
AND I.vendor_num = L.vendor_num
AND I.invoice_num = L.invoice_num
AND I.paid = 'N'
GROUP BY V.vendor_num, I.invoice_num
ORDER BY I.invoice_num
This query could be rewritten as follows:
SELECT V.vendor_num, I.invoice_num, SUM (L.amount)
FROM vendors V, invoices I, invoice_lines L
WHERE V.vendor_name = 'Acme'
AND I.vendor_num = V.vendor_num
AND I.paid = 'N'
AND L.vendor_num = I.vendor_num
AND L.invoice_num = I.invoice_num
GROUP BY V.vendor_num, I.invoice_num
ORDER BY I.invoice_num
In the rewritten query, the optimizer eliminates all of the paid invoices before joining to the invoice_lines table. If most of the invoices in the database have already been paid, then the rewritten query will be significantly faster. (The schema design in this example is dubious, and is used for illustrative purposes only.)
Forcing a Specific Join Method
When choosing an execution plan for a query involving joins, the optimizer considers all possible join methods and join orders. The optimizer does its best to evaluate the merits of each option and to choose the optimal execution plan, but sometimes there is just no substitute for human intelligence.
In these situations the USE_NL, USE_MERGE, and USE_HASH hints can be used to request a specific join method, and the ORDERED hint can be used to request a specific join order. The optimizer will do its best to observe the wishes of these hints, but if you ask for something impossible (such as a sort-merge join on an anti-join) the hint will be ignored. Note that there is no hint to request a cluster join.
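As a hypothetical example using an invoice schema: the ORDERED hint fixes the join order to the order of the FROM clause, and USE_HASH(L) requests a hash join when row source L is joined in:

SELECT /*+ ORDERED USE_HASH(L) */ I.invoice_num, SUM (L.amount)
FROM invoices I, invoice_lines L
WHERE I.paid = 'N'
AND L.invoice_num = I.invoice_num
GROUP BY I.invoice_num;

Note that hints go in a comment immediately after the SELECT keyword and reference tables by their aliases when aliases are used.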
Tuning Distributed Queries

Distributed queries can sometimes be disastrous for performance. In particular a nested loops join between two row sources on separate nodes of a distributed database can be very slow. Figure 4 shows a simple distributed query and its execution plan. This query will be slow because for each row retrieved from the customers table, a separate query will be dispatched to the remote node in order to see if there is an open booking for the customer. This will result in many small network packets moving between the two nodes of the database, and the network latency and overhead will degrade performance.
When distributed queries cannot be avoided, use IN clauses, set operators such as UNION and MINUS, and whatever else we can to reduce the network traffic between nodes of the database. Using views will not necessarily improve performance. Views will hide the complexity from the application developer, but all of the complexity and issues are still there under the covers.
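For example, the customer and booking query described above might be rewritten so that the remote table appears in a single subquery (table, column, and database link names are hypothetical), letting Oracle fetch the matching keys from the remote node in one operation instead of dispatching one probe per customer row:

SELECT C.customer_id, C.customer_name
FROM customers C
WHERE C.customer_id IN
  (SELECT B.customer_id
   FROM bookings@remote_node B
   WHERE B.status = 'OPEN');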