Column Order in Composite IndexesHow leading portions, skip scans, ordering, and workload shape the right key sequence
Composite indexes are not just several columns bundled into one object. Their order determines which predicates form a usable key prefix, which statements can benefit from ordered access, and whether one index can support several important query patterns or only one narrow path. In Oracle, that makes column order a design choice with optimizer consequences.
ORDER BY affect design
05. Skip scanWhat changes when the leading key is absent
06. DiagnosticsPlans, usage monitoring, and invisible tests
07. Edge casesNulls, covering, overlaps, and misconceptions
08. Hands-on labReproducible SQL for your sandbox
09. Quiz and takeawaysCheck instincts before you design
Start with the right mental model: a composite index is one ordered key
Oracle documentation uses both composite and concatenated index. Both terms describe the same concept: Oracle stores one ordered key, not several independent mini-indexes. An index on (customer_id, order_date, status) is ordered first by customer_id, then by order_date within each customer, and then by status.
If an index is defined as (A, B, C), the documented leading portions are (A), (A, B), and (A, B, C). That is why a predicate on B alone is fundamentally different from a predicate on A alone.
Good instinct
Weak instinct
How order changes which query patterns the index can serve
Suppose two possible indexes exist on the same table: (customer_id, order_date) and (order_date, customer_id). They contain the same columns, but they do not support the same workload. The difference is concrete: each order exposes a different leading prefix.
| Typical predicate | Index (customer_id, order_date) |
Index (order_date, customer_id) |
Reason |
|---|---|---|---|
WHERE customer_id = :b1 |
Strong fit | No leading prefix | The first index exposes customer_id as a documented leading portion. |
WHERE customer_id = :b1 AND order_date >= :b2 |
Strong fit | Possible for date-led SQL, but not aligned to the same prefix | The first order matches the common “customer first, then date range” pattern. |
WHERE order_date BETWEEN :d1 AND :d2 |
No leading prefix | Strong fit | Only the second index leads with order_date. |
WHERE customer_id = :b1 ORDER BY order_date |
Often attractive | Poor alignment | The first order can both filter by customer and preserve date order within that customer slice. |
A single composite index can satisfy several important statements only when the workload uses compatible leading prefixes. Same columns does not mean same usefulness.
CREATE INDEX idx_orders_cust_date ON orders(customer_id, order_date); CREATE INDEX idx_orders_date_cust ON orders(order_date, customer_id); -- Query pattern A: customer lookup with optional date range SELECT * FROM orders WHERE customer_id = :cust_id AND order_date >= :start_date; -- Query pattern B: reporting by date range SELECT * FROM orders WHERE order_date BETWEEN :d1 AND :d2;
How to choose the order without falling for bad rules of thumb
“Put the most selective column first” is too blunt to be a complete design rule. Selectivity matters, but Oracle’s own guidance starts with workload shape: which keys appear together, which keys appear alone, and which leading prefixes let multiple important statements use the same index.
1. Lead with reused prefixes
customer_id alone and also by customer_id + order_date, leading with customer_id gives both patterns a useful prefix.2. Keep equality predicates early
3. Include ordering needs
4. Validate with plans
(A, B) is usually betterA appears by itself often, and the common two-column statements are A = ... AND B ....
(B, A) may be betterThe real workload mostly starts with B, or reporting statements are driven primarily by B and only sometimes add A.
| Observed workload pattern | Usually stronger order | Reasoning |
|---|---|---|
WHERE account_id = :x and WHERE account_id = :x AND tran_date >= :d | (account_id, tran_date) | The shorter and longer patterns share the same leading prefix. |
WHERE status = :s is rare, but WHERE status = :s AND created_on BETWEEN ... is common | Depends on whether status must ever drive access alone | If not, another order may better reflect the real access path. |
WHERE region = :r ORDER BY sale_date | (region, sale_date) | That order can align both filtering and sorting needs. |
Queries frequently omit column A but use B heavily | Do not assume (A, B) is enough | The missing leading column may require a different index or may rely only on skip scan in narrow cases. |
A selective column placed first is not automatically the right answer if that order makes the workload’s common partial predicates unusable. Column order is about useful prefixes, not just raw distinct-count trivia.
Equality first, broader ranges later, and never ignore ORDER BY
Once you understand leading portions, a practical pattern becomes clear: earlier equality predicates usually let Oracle navigate deeper into the key before the scan opens into a range. That is why an index such as (customer_id, order_date) is often a stronger fit for statements like customer_id = :x AND order_date >= :d than the reverse order.
Oracle documentation also notes that an index can eliminate a sort when the ORDER BY columns match the order of the leading index columns. Column order therefore has to be evaluated against sorting needs as well as filtering predicates.
CREATE INDEX idx_sales_region_date ON sales(region, sale_date); SELECT sale_id, sale_date, amount FROM sales WHERE region = :region ORDER BY sale_date; -- This order is attractive because the region slice is narrow first, -- and rows inside that slice are already arranged by sale_date.
| Design question | Why it matters |
|---|---|
| Does the workload frequently do equality on one column and range on the next? | That pattern often benefits from putting the equality-driving column first. |
| Does the query need rows back in key order? | The right composite order can reduce or remove an explicit sort. |
| Is the query broad enough that Oracle may prefer a full scan anyway? | Even a well-ordered index remains optional to the optimizer, not mandatory. |
Index skip scan is the exception that softens the old oversimplification
A common simplification is that Oracle can use a composite index only when the leading column is present. Oracle documentation qualifies that statement: index skip scan is the main exception. Skip scan lets Oracle consider a composite index even when the leading column is missing by conceptually probing logical subindexes based on the leading key’s distinct values.
Skip scan does not make column order irrelevant. It adds another optimizer option in some cases. It is not a blanket substitute for a properly ordered index.
Good skip-scan candidate
Weak skip-scan candidate
Still workload-driven
Be cautious
CREATE INDEX idx_customer_gender_email ON customers(gender, email); -- Leading column GENDER is absent here. -- Oracle may consider skip scan if gender has few distinct values -- and email is selective enough. SELECT customer_id FROM customers WHERE email = :email_value;
How to validate column order responsibly
Index design needs plan evidence. Once two plausible orders exist, validate them with current statistics, representative SQL, and runtime plan inspection. Oracle also gives you operational tools to observe whether an index is used and to test an alternative order without exposing it broadly right away.
1. Gather current stats
2. Run representative SQL
3. Inspect runtime plans
DBMS_XPLAN.DISPLAY_CURSOR with actual row-source stats where possible.4. Compare alternatives safely
BEGIN DBMS_STATS.GATHER_TABLE_STATS( ownname => USER, tabname => 'ORDERS', cascade => TRUE); END; / ALTER SESSION SET statistics_level = ALL; SELECT /*+ gather_plan_statistics */ order_id, order_date, amount FROM orders WHERE customer_id = :cust_id AND order_date >= :start_date ORDER BY order_date; SELECT * FROM TABLE( DBMS_XPLAN.DISPLAY_CURSOR( NULL, NULL, 'ALLSTATS LAST +PREDICATE +ALIAS +BYTES +COST' ) );
-- Observe whether an existing index is actually used. ALTER INDEX idx_orders_cust_date MONITORING USAGE; SELECT index_name, used, monitoring, start_monitoring, end_monitoring FROM user_object_usage WHERE index_name = 'IDX_ORDERS_CUST_DATE'; -- Build an alternative order invisibly. CREATE INDEX idx_orders_date_cust_inv ON orders(order_date, customer_id) INVISIBLE; -- In a controlled session, allow the optimizer to consider it. ALTER SESSION SET optimizer_use_invisible_indexes = TRUE;
An invisible index is maintained by DML but ignored by the optimizer unless OPTIMIZER_USE_INVISIBLE_INDEXES is enabled. That makes it useful for comparing an alternative column order without changing general optimizer choices immediately.
Edge cases and misconceptions that trip people up
| Misconception | Better interpretation |
|---|---|
| “A query on the second column can never use the composite index.” | Not as a normal leading-prefix range lookup, but Oracle may still choose skip scan or a broader index scan in specific circumstances. |
| “Most selective first” is always right. | Not reliably. The more durable question is which leading prefix the workload needs most often. |
| “If all needed columns are in the index, leading-column rules stop mattering.” | Covering can reduce table visits, but it does not create a missing leading prefix for navigational access. |
| “Rows with null key values are always in the B-tree index.” | Oracle does not index rows where all key columns are null in a normal B-tree index. That matters for some query designs. |
| “One big composite index replaces every smaller index.” | Sometimes, but not automatically. Overlapping indexes can still differ in size, maintenance cost, and suitability for important SQL. |
Null nuance
Covering nuance
Maintenance nuance
Do not drop a shorter overlapping index only because a longer composite index starts with the same columns. Oracle may still prefer the shorter one for some statements because it is smaller or because it satisfies the statement more cheaply.
Hands-on lab: compare two column orders in a controlled test
This lab works well in a sandbox. It gives you a repeatable setup and highlights the plan shapes worth inspecting when you compare two plausible index orders.
Step 1: build a table with mixed access patterns
DROP TABLE order_demo PURGE; CREATE TABLE order_demo AS SELECT level AS order_id, MOD(level, 5000) + 1 AS customer_id, DATE '2024-01-01' + MOD(level, 365) AS order_date, CASE WHEN MOD(level, 20) = 0 THEN 'CLOSED' WHEN MOD(level, 5) = 0 THEN 'PENDING' ELSE 'OPEN' END AS status, MOD(level, 2000) AS amount FROM dual CONNECT BY level <= 200000; BEGIN DBMS_STATS.GATHER_TABLE_STATS( ownname => USER, tabname => 'ORDER_DEMO', cascade => FALSE); END; /
Step 2: create two competing orders
CREATE INDEX idx_od_cust_date_status ON order_demo(customer_id, order_date, status); CREATE INDEX idx_od_date_cust_status ON order_demo(order_date, customer_id, status) INVISIBLE; BEGIN DBMS_STATS.GATHER_TABLE_STATS( ownname => USER, tabname => 'ORDER_DEMO', cascade => TRUE); END; /
Step 3: run three representative statements
ALTER SESSION SET statistics_level = ALL; -- Query A: pure leading-column lookup SELECT /*+ gather_plan_statistics */ order_id, order_date, status FROM order_demo WHERE customer_id = 1001; -- Query B: customer plus date range and ordered output SELECT /*+ gather_plan_statistics */ order_id, order_date, amount FROM order_demo WHERE customer_id = 1001 AND order_date BETWEEN DATE '2024-03-01' AND DATE '2024-03-31' ORDER BY order_date; -- Query C: broad date-driven reporting SELECT /*+ gather_plan_statistics */ order_id, customer_id, status FROM order_demo WHERE order_date BETWEEN DATE '2024-03-01' AND DATE '2024-03-31';
Step 4: what to inspect in the plan output
Expect for Query A
(customer_id, order_date, status) order is naturally aligned because customer_id is the leading portion.Expect for Query B
Expect for Query C
order_date.Keep in mind
SELECT * FROM TABLE( DBMS_XPLAN.DISPLAY_CURSOR( NULL, NULL, 'ALLSTATS LAST +PREDICATE +BYTES +COST' ) );
Quiz: do you have the right instincts about column order?
Answer from Oracle behavior, not rules of thumb
(customer_id, order_date) usually better than (order_date, customer_id) for queries that often filter by customer_id alone?customer_id, placing it first lets that shorter predicate use the index.(region, sale_date) be appealing for WHERE region = :r ORDER BY sale_date?Think in prefixes, not column lists. Put columns first because the workload needs them first. Treat skip scan as a documented exception, not as a design strategy. And do not finalize an order until the plans, row counts, and operational trade-offs agree with the theory.
No comments:
Post a Comment