Friday, March 17, 2023

Testing Different Access Paths : Concatenated Index

Oracle Concatenated Indexes - Practical Deep Dive
Oracle concatenated index deep dive

Concatenated IndexesHow composite indexes really work, why column order matters, and when skip scan changes the story

Concatenated indexes, also called composite indexes, are easy to explain badly and surprisingly rich to explain well. The usual summary is “Oracle can use the index only when the leading column is present,” but that is only the starting point. To design them properly, you need to think about leading portions, equality versus range predicates, ordering requirements, skip scan eligibility, covering behavior, and whether one composite index can replace several single-column indexes in a given workload.

What a concatenated index actually is

A concatenated index is a single index built on more than one column, such as (department_id, job_id) or (customer_id, order_date, status). These are also known as composite indexes, and their usefulness depends heavily on whether queries can use a leading portion of the key. That phrase - leading portion - is the heart of composite-index design.

Key idea

If an index is defined on (A, B, C), then the leading portions are (A), (A, B), and (A, B, C). This is why a query on B alone is fundamentally different from a query on A alone.

Index key( A , B , C )Leading portions(A)(A,B)(A,B,C)Not leading(B)(C)(B,C)
Recollect: A composite index is not just “three single-column indexes stuck together.” It is an ordered search structure, and order is the whole story.

The leading-edge rule, explained properly

Oracle’s tuning guidance says a composite index can support an access path when the SQL statement uses a leading portion of the index. That is the basic rule behind range scans on composite keys. If the first column is missing, the optimizer usually cannot do a normal index range scan based on the composite ordering. This is why queries on A, A+B, or A+B+C behave differently from a query on B alone.

Classic range-scan friendly predicates

  • WHERE A = :1
  • WHERE A = :1 AND B = :2
  • WHERE A = :1 AND B = :2 AND C = :3
  • WHERE A = :1 AND B > :2

Not leading by default

  • WHERE B = :1
  • WHERE C = :1
  • WHERE B = :1 AND C = :2
  • WHERE C BETWEEN :1 AND :2
Important nuance

“Cannot use the index” is too strong in modern Oracle. A non-leading predicate often prevents a standard range scan, but the optimizer may still consider INDEX SKIP SCAN when the leading column has very few distinct values.

Why column order matters so much

Column order affects far more than whether a predicate is legal for a range scan. It also affects selectivity usage, sort elimination opportunities, the usefulness of the index for partially specified predicates, and whether a range condition cuts off the effective use of columns that come after it. Oracle’s index guidance emphasizes designing composite indexes around actual query patterns rather than abstract rules.

Equality first

Columns commonly used in equality predicates are often strong candidates for earlier positions.

Range later

A range predicate often limits how much benefit later columns can provide for access.

Ordering-aware design

If queries often need ORDER BY A, B, matching index order can reduce or remove sorting work.
Avoid slogans

The advice “put the most selective column first” is not universally right. If your workload almost always filters on a less-selective column first and rarely on the more-selective one alone, query pattern can matter more than raw selectivity.

Index skip scan: the exception that keeps confusing people

Index skip scan is why composite-index discussions often become muddled. Oracle can sometimes use a composite index even when the leading column is not present, but only by logically probing multiple subtrees for the missing leading-column values. Oracle documents this access path as INDEX SKIP SCAN and describes it as most useful when the leading column has low cardinality. In other words: skip scan is real, but it is not a reason to ignore column order.

Index (A,B)A has only a few valuesA=10 subtreeA=20 subtreeQuery on B onlyOracle cannot do a normalleading range scan on BSkip scan ideaProbe several A subtreeslooking for the same B value
SQL - Encourage skip scan for testing
EXPLAIN PLAN FOR
SELECT /*+ INDEX_SS(e idx_emp_dept_job) */ *
FROM   emp_test e
WHERE  job_id = 'ST_CLERK';

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);
Healthy interpretation

If skip scan appears, treat it as an optimizer option that can sometimes rescue a non-leading predicate, not as proof that column order no longer matters.

Reading execution plans for concatenated indexes

Oracle plans tell you a lot once you know what to look for. A composite index may be used via INDEX RANGE SCAN, INDEX UNIQUE SCAN, INDEX FULL SCAN, or INDEX SKIP SCAN depending on the predicate shape and selectivity. The goal is not to memorize plan names blindly, but to tie each access path back to the SQL pattern that produced it.

Plan operatorTypical meaning hereWhat it suggests
INDEX RANGE SCANOracle is traversing a leading portion of the indexClassic composite-index success case
INDEX UNIQUE SCANAll key columns of a unique index are specifiedMost precise access pattern
INDEX SKIP SCANOracle is probing non-leading predicates through multiple leading-key branchesPossible rescue path, but workload-dependent
INDEX FULL SCANOracle may be using the index for ordering or covering needsUseful in some report and sort-elimination cases
SQL - Compare leading and non-leading plans
EXPLAIN PLAN FOR
SELECT * FROM emp_test WHERE department_id = 50;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

EXPLAIN PLAN FOR
SELECT * FROM emp_test WHERE job_id = 'ST_CLERK';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Recollect: Plans are the truth serum for composite-index design. If the plan shape contradicts your mental model, trust the plan and revisit the model.

Covering behavior and sort elimination

Composite indexes are useful not just for filtering but also for returning rows in a useful order and, in some cases, avoiding table access when all required columns are present in the index. Oracle documents that concatenated indexes can help satisfy ORDER BY clauses efficiently when the ordering matches the index structure closely enough.

Sort elimination

If a query filters and orders in a way that matches index order, Oracle may avoid a separate sort step.

Covering potential

If the query can be answered from the index alone, Oracle may choose an index-only style access path or at least reduce table visits.
SQL - Composite index that helps filter and order
CREATE INDEX idx_orders_cust_date_status
  ON orders(customer_id, order_date, status);

EXPLAIN PLAN FOR
SELECT customer_id, order_date, status
FROM   orders
WHERE  customer_id = 101
ORDER BY order_date, status;

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);

This is one reason blindly splitting a composite index into multiple single-column indexes can make a workload worse: you may lose both access efficiency and ordering benefit.

Design rules that hold up in real workloads

The best rules are those that survive contact with real applications. Oracle’s performance guidance consistently points toward designing indexes around actual predicate patterns and optimizer behavior rather than slogans.

Rule 1

Lead with columns that the workload commonly constrains first, especially equality predicates.

Rule 2

Push range columns later if earlier equality predicates are what make the access path useful.

Rule 3

Consider ORDER BY and grouping needs, not just the WHERE clause.

Rule 4

Do not assume one composite index can replace every access path you care about. Plans decide that, not hope.
Balanced rule of thumb

Put columns first because the workload needs them first, not because an isolated selectivity formula told you to.

Common mistakes and how to avoid them

Mistake 1

Assuming a query on the second column will always ignore the index entirely.

Mistake 2

Believing “most selective first” is always the right answer.

Mistake 3

Creating many overlapping composite indexes without checking whether they duplicate each other.
Operational caution

Too many overlapping composite indexes increase DML cost, maintenance overhead, and optimizer complexity. Composite indexes should simplify the workload, not create an index graveyard.

End-to-end demo: testing a composite index on department and job

This demo uses a safe and familiar sample pattern: build a larger test table from HR data, create a composite index on (department_id, job_id), gather stats, and compare plan shapes for leading-column, full-key, and non-leading-column predicates. The goal is not to invent fixed costs or elapsed times, because those depend on environment. The goal is to show real commands and the exact plan patterns you should verify.

Demo principle

Use exact commands, then verify with metadata and plans. Do not teach fake outputs where Oracle’s optimizer would legitimately vary by data volume and statistics.

Step 1: create the demo table and gather stats

SQL - Build test data
CREATE TABLE emp_test AS
SELECT employee_id,
       department_id,
       job_id,
       salary,
       last_name
FROM   hr.employees
       CROSS JOIN (
         SELECT level AS n
         FROM   dual
         CONNECT BY level <= 1000);

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'EMP_TEST',
    cascade => FALSE);
END;
/

Step 2: create the concatenated index and verify it exists

SQL - Create and verify index
CREATE INDEX idx_emp_dept_job
  ON emp_test(department_id, job_id);

SELECT index_name,
       index_type,
       status,
       visibility
FROM   user_indexes
WHERE  index_name = 'IDX_EMP_DEPT_JOB';

The success condition here is one visible row in USER_INDEXES with the expected index name and a usable status.

Step 3: test the leading-column predicate

SQL - Leading column only
EXPLAIN PLAN FOR
SELECT *
FROM   emp_test
WHERE  department_id = 50;

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);

The plan you hope to see is an index access path such as INDEX RANGE SCAN on IDX_EMP_DEPT_JOB, because the leading column is present.

Step 4: test both indexed columns together

SQL - Full leading portion
EXPLAIN PLAN FOR
SELECT *
FROM   emp_test
WHERE  department_id = 50
  AND  job_id = 'ST_CLERK';

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);

This is the classic sweet spot. The plan should normally show the composite index used very efficiently because both leading columns are constrained.

Step 5: test the non-leading column alone

SQL - Non-leading predicate only
EXPLAIN PLAN FOR
SELECT *
FROM   emp_test
WHERE  job_id = 'ST_CLERK';

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);

At this point the plan may show a full table scan, or it may show an INDEX SKIP SCAN if Oracle decides the leading column has few enough distinct values for that to be worthwhile. Either result is consistent with Oracle’s documented behavior; that is why the demo avoids pretending one outcome is guaranteed on every system.

Step 6: test skip scan explicitly

SQL - Encourage skip scan
EXPLAIN PLAN FOR
SELECT /*+ INDEX_SS(emp_test idx_emp_dept_job) */ *
FROM   emp_test
WHERE  job_id = 'ST_CLERK';

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY);

The success condition here is not a specific cost. It is whether the plan now shows INDEX SKIP SCAN, which proves the optimizer can sometimes use the composite index even without the leading column predicate.

Step 7: test with actual execution statistics

SQL - Runtime stats
SELECT /*+ gather_plan_statistics */ *
FROM   emp_test
WHERE  department_id = 50
  AND  job_id = 'ST_CLERK';

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
What counts as a successful demo?

Success means the index exists, plans for leading-column predicates show normal composite-index access, the non-leading-only case demonstrates the leading-edge rule, and skip scan can be demonstrated honestly as a conditional exception rather than a guarantee.

Knowledge check

These questions are built to sharpen judgment, not just terminology. Submit answers and read the explanations carefully.

Q1. For an index on (A, B, C), which set correctly describes the leading portions?
(B), (C), (B,C)
(A), (A,B), (A,B,C)
(A,C), (B,C), (C)
Any subset of the columns is a leading portion.
Correct answer: The leading portions start from the first column. That is the central structural rule behind how concatenated indexes are searched efficiently.
Q2. Why can a query on the second column alone behave differently from a query on the first column alone?
Because Oracle forbids references to the second column in SQL.
Because the second column is automatically invisible.
Because a standard range scan depends on the ordered leading structure of the index, and the first column anchors that order.
Because Oracle always rewrites second-column predicates into bitmap scans.
Correct answer: The first column anchors the index ordering. That is why the leading-edge rule exists in the first place.
Q3. What is the healthiest way to think about INDEX SKIP SCAN?
It is a useful exception that can sometimes help with non-leading predicates, but it does not remove the importance of column order.
It proves the leading-column rule is obsolete.
It always beats a full table scan.
It only works on unique indexes.
Correct answer: Skip scan is conditional, not magical. It can help when the leading column has low cardinality, but it is not a substitute for good composite-index design.
Q4. Which statement about column order is the most accurate?
The most selective column must always be first, no exceptions.
Column order only matters for unique indexes.
Order matters only for sorting, never for filtering.
Column order should reflect real predicate patterns, equality versus range usage, and ordering needs.
Correct answer: Workload pattern decides. Selectivity matters, but so do predicate shape and ordering behavior. Good design balances all of them.
Q5. In the demo, what is the safest way to validate that the concatenated index behaved as expected?
Assume success because the index creation statement completed.
Verify index metadata and compare execution-plan operators for leading, full-key, and non-leading predicates.
Only look at table row counts.
Drop all single-column indexes automatically.
Correct answer: Plans plus metadata tell the real story. Composite-index design is validated by what Oracle actually does, not by what we hoped it would do.
Q6. Why can a composite index help with ORDER BY as well as filtering?
Because every index automatically sorts all result sets.
Because ORDER BY is ignored when an index exists.
Because if the query’s order matches the index order closely enough, Oracle may reduce or avoid an explicit sort.
Because concatenated indexes are always covering indexes.
Correct answer: Index order can do double duty. It can help locate rows and sometimes help return them in a useful order too.
Q7. What is the healthiest rule of thumb for composite-index design?
Design around actual workload patterns and verify with execution plans rather than relying on slogans.
Put columns in alphabetical order for clarity.
Never create a composite index if single-column indexes already exist.
Always replace every single-column index with one large composite index.
Correct answer: Workload first, slogans second. Oracle indexing decisions become much clearer when you let real SQL patterns and real plans drive the design.

No comments:

Post a Comment