Thursday, January 5, 2023

Exadata X8M Smart Flash Cache : Write Back and Write Through mode Conversions

Oracle Exadata Smart Flash Cache Modes Explained - WriteBack, WriteThrough, and Safe Conversion
Oracle Exadata Series

What `WriteBack` and `WriteThrough` really change, how to convert safely, and what to verify before and after the switch.

Exadata Smart Flash Cache write mode changes how writes are staged, acknowledged, and later persisted to disk. In `WriteThrough`, writes are acknowledged only after they are also written to disk. In `WriteBack`, writes can be acknowledged from flash first and destaged to disk later. That difference affects performance expectations, maintenance steps, and the validation checks you should run before changing modes.

  • 2 modes: `WriteThrough` and `WriteBack`
  • 1 key setting: `flashCacheMode` on the cell
  • Flush matters: before leaving `WriteBack`
  • Warm-up cost: cache must repopulate after recreation

The core distinction: acknowledgement timing changes between `WriteThrough` and `WriteBack`

The two Smart Flash Cache write modes differ in when writes are acknowledged and when they reach disk. In `WriteThrough` mode, writes go to both flash cache and disk, and acknowledgement waits for disk completion. In `WriteBack` mode, writes are first written to flash cache and are destaged to disk later. If a flash device fails before destaging, the copy of that data held only in flash is gone from the cell, and recovery then depends on mirrored copies.

That means the practical trade-off is not simply “safe versus fast”. Both modes exist for valid reasons, but `WriteBack` introduces a staged-write behavior that deserves more disciplined operational checks. The mode is powerful, but it is not the kind of setting you should change casually or explain with simplistic rules.

`WriteThrough`: write goes to flash and disk; acknowledgement waits for disk persistence.
`WriteBack`: write lands in flash first; destage to disk happens later.
Operational consequence: changing the mode changes write behavior, flush requirements, and the validation you should complete before maintenance.
The most important difference is when the system considers the write safe enough to acknowledge.

`WriteThrough` in practice

  • Write acknowledgement waits for disk completion.
  • Flash still participates, but disk persistence is on the critical path.
  • Moving away from it is usually motivated by write-latency goals.

`WriteBack` in practice

  • Writes can be acknowledged from flash first.
  • Dirty data exists until destage completes.
  • Leaving the mode safely means dealing with dirty cache contents first.
The nuance that matters

`WriteBack` is not just “faster flash”. It is a different write-persistence path, which is why Exadata includes explicit flush and conversion procedures around it.
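
The acknowledgement-timing difference can be illustrated with a toy model. This is a minimal sketch, not how Exadata is implemented: the `ToyCell` class, its fields, and the returned strings are all hypothetical names chosen for illustration. It only shows why dirty data exists in `WriteBack` and not in `WriteThrough`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCell:
    """Toy model of one storage cell (illustrative only, not Exadata internals)."""
    mode: str = "WriteThrough"                  # or "WriteBack"
    flash: dict = field(default_factory=dict)
    disk: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)     # blocks in flash but not yet on disk

    def write(self, block: str, data: str) -> str:
        """Apply a write and report what the acknowledgement waited for."""
        self.flash[block] = data
        if self.mode == "WriteThrough":
            self.disk[block] = data             # disk persistence is on the critical path
            self.dirty.discard(block)
            return "acknowledged after disk"
        self.dirty.add(block)                   # WriteBack: destage happens later
        return "acknowledged from flash"

    def destage(self) -> int:
        """Copy every dirty block to disk and return how many were written."""
        count = len(self.dirty)
        for block in self.dirty:
            self.disk[block] = self.flash[block]
        self.dirty.clear()
        return count
```

In this model, a `WriteBack` cell that has accepted one write holds one dirty block and an empty disk until `destage()` runs. That gap is the state the flush step exists to close.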

The right control point: in current Exadata procedures, the mode is managed at the cell level

The older shorthand of looking for a `writeThrough` detail in flash-cache output is less useful than focusing on the cell attribute flashCacheMode, which can be set to WriteThrough or WriteBack. That attribute is the practical control point for enabling or disabling the mode.

This is important because it corrects a common operational mistake: treating Smart Flash Cache mode as only a property of the flash-cache object instead of a cell-level operating mode. You still inspect the flash cache itself for status and capacity, but the mode conversion workflow is anchored at the cell.

  • `flashCacheMode`: cell-level mode setting
  • `LIST CELL DETAIL`: best verification entry point
  • `LIST FLASHCACHE DETAIL`: cache object status and size

Inspect the current mode and flash cache state
-- Cell-level operating mode
CellCLI> LIST CELL ATTRIBUTES name, flashCacheMode, status

-- More detail if needed
CellCLI> LIST CELL DETAIL

-- Flash cache object details
CellCLI> LIST FLASHCACHE DETAIL
Mental model

Think of Smart Flash Cache mode as a cell operating choice that is then reflected in how the flash-cache object behaves, not as an isolated checkbox on the flash-cache object alone.
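
If the inspection is scripted, the cell-level check can be reduced to a small parser. The sketch below assumes that `LIST CELL ATTRIBUTES name, flashCacheMode, status` prints one whitespace-separated row in the requested attribute order; that output shape, the function names, and the sample cell name are assumptions for illustration, so verify them against real output on your release.

```python
def parse_cell_attributes(output: str) -> dict:
    """Parse one row of `LIST CELL ATTRIBUTES name, flashCacheMode, status`.

    Assumes a single whitespace-separated row in the requested attribute order.
    """
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            name, mode, status = fields
            return {"name": name, "flashCacheMode": mode, "status": status}
    raise ValueError("no three-field attribute row found in output")


def mode_matches(output: str, target: str) -> bool:
    """Compare case-insensitively, since capitalization of the mode may vary."""
    reported = parse_cell_attributes(output)["flashCacheMode"]
    return reported.lower() == target.lower()


# Hypothetical captured output; "exacel01" is a made-up cell name.
sample = "\t exacel01\t WriteBack\t online\n"
print(parse_cell_attributes(sample))
```

The point of the design is that the mode is read from the cell row, not inferred from flash-cache object details.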

Safe conversion workflow: the documented procedure depends on direction

The procedure is not fully symmetric even though both directions involve recreating the flash cache. Moving to WriteBack means changing the cell mode and rebuilding the cache so the new behavior takes effect. Moving away from WriteBack adds a crucial extra step: dirty flash-cache contents must be flushed and verified before you finish the change.

That is exactly the kind of detail that gets lost in oversimplified runbooks. If you are changing from WriteBack to WriteThrough, the flush step is not decorative. It is the step that converts staged-but-not-yet-destaged data into a clean state before the cache is dropped and recreated in WriteThrough mode.

1. Inspect current state

Confirm mode, flash-cache object status, and whether dirty write-back data exists.

2. Flush if leaving `WriteBack`

Do not skip the dirty-data check when converting away from staged writes.

3. Change the cell mode

Set flashCacheMode to the target value at the cell.

4. Recreate and verify

Rebuild the flash cache, then confirm the new mode and cache status.
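
The direction-dependent workflow above can be sketched as a function that assembles the command list. This is a hedged illustration: `conversion_plan` is a hypothetical helper that only builds the CellCLI command text used in this article and runs nothing. Release-specific steps are out of scope.

```python
from typing import List

MODES = ("WriteThrough", "WriteBack")


def conversion_plan(current: str, target: str) -> List[str]:
    """Return the CellCLI commands for a mode change, in order.

    Only assembles command text shown in this article; it executes nothing.
    """
    if current not in MODES or target not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if current == target:
        return ["LIST CELL ATTRIBUTES flashCacheMode"]  # nothing to change

    plan = ["LIST CELL ATTRIBUTES flashCacheMode"]
    if current == "WriteBack":
        # Leaving WriteBack: flush, then check dirty bytes before dropping.
        plan += ["ALTER FLASHCACHE ALL FLUSH", "LIST METRICCURRENT FC_BY_DIRTY"]
    plan += [
        "DROP FLASHCACHE",
        f"ALTER CELL flashCacheMode={target}",
        "CREATE FLASHCACHE ALL",
        "LIST CELL ATTRIBUTES flashCacheMode",
        "LIST FLASHCACHE DETAIL",
    ]
    return plan


for step in conversion_plan("WriteBack", "WriteThrough"):
    print(step)
```

Encoding the asymmetry in one place makes it harder for a runbook to drop the flush step when the direction is `WriteBack` to `WriteThrough`.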

Documented pattern: `WriteThrough` to `WriteBack`
-- Check the current cell mode
CellCLI> LIST CELL ATTRIBUTES flashCacheMode

-- Drop and recreate the cache in the new mode
CellCLI> DROP FLASHCACHE
CellCLI> ALTER CELL flashCacheMode=WriteBack
CellCLI> CREATE FLASHCACHE ALL

-- Verify
CellCLI> LIST CELL ATTRIBUTES flashCacheMode
CellCLI> LIST FLASHCACHE DETAIL
Documented pattern: `WriteBack` to `WriteThrough`
-- Flush dirty contents before leaving WriteBack
CellCLI> ALTER FLASHCACHE ALL FLUSH

-- Verify dirty bytes have drained
CellCLI> LIST METRICCURRENT FC_BY_DIRTY

-- Then switch mode and rebuild cache
CellCLI> DROP FLASHCACHE
CellCLI> ALTER CELL flashCacheMode=WriteThrough
CellCLI> CREATE FLASHCACHE ALL

-- Verify
CellCLI> LIST CELL ATTRIBUTES flashCacheMode
Maintenance reality

Even when the command sequence is short, the operational effect is not. Recreating flash cache removes cached benefit until the cache warms again, so a maintenance window or low-traffic period is still the sensible posture.

Validation and monitoring: prove the cache is clean, rebuilt, and behaving the way you expect

The most important validation during a mode change away from WriteBack is whether dirty flash-cache content has drained. The key metric for that check is FC_BY_DIRTY, which shows dirty bytes in flash cache. If that value has not dropped to zero after the flush, dirty data remains and the conversion should not proceed to the drop step yet.
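
A scripted version of the dirty-byte check might look like the sketch below. It assumes each `LIST METRICCURRENT FC_BY_DIRTY` row has the shape "metric name, object name, numeric value, unit"; that row shape and the helper names are assumptions, so confirm them against real output before relying on the parser.

```python
import re

# Assumed row shape: metric name, object name, numeric value, unit (e.g. "MB").
_ROW = re.compile(r"^\s*FC_BY_DIRTY\s+(\S+)\s+([\d,]+(?:\.\d+)?)\s+(\S+)\s*$")


def dirty_values(output: str) -> dict:
    """Map object name -> (value, unit) for each FC_BY_DIRTY row found."""
    found = {}
    for line in output.splitlines():
        match = _ROW.match(line)
        if match:
            obj, value, unit = match.groups()
            found[obj] = (float(value.replace(",", "")), unit)
    return found


def flush_is_drained(output: str) -> bool:
    """True only if at least one row was parsed and every value is zero."""
    values = dirty_values(output)
    return bool(values) and all(v == 0.0 for v, _ in values.values())
```

Returning `False` when no row can be parsed is deliberate: missing evidence is treated as "not drained" instead of as success.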

After recreation, the next thing to remember is that a newly created flash cache is empty of useful working-set history. Performance after the change therefore depends on repopulation. This is why a technically successful conversion can still feel operationally different for a period afterward.

1. Check mode: verify current `flashCacheMode`.
2. Flush if needed: drain dirty bytes from `WriteBack`.
3. Recreate cache: apply target mode and rebuild.
4. Validate the aftermath: confirm mode, flash cache status, and dirty-byte behavior. Then remember that useful cache contents must warm again.
A mode switch is complete only when the cache state and the post-change behavior both make sense.
Validation sequence after or during conversion
-- Verify mode and cell status
CellCLI> LIST CELL ATTRIBUTES name, status, flashCacheMode

-- Verify flash cache object state
CellCLI> LIST FLASHCACHE DETAIL

-- When leaving WriteBack, confirm dirty bytes are gone
CellCLI> LIST METRICCURRENT FC_BY_DIRTY

-- Optional historical context if you are comparing before and after
CellCLI> LIST METRICHISTORY WHERE objectType='FLASHCACHE'

Before the switch

Know the current mode and whether dirty write-back content exists.

During the switch

Treat flush completion as an evidence check, not as an assumption.

After the switch

Expect a warm-up period because a rebuilt cache does not contain the prior working set.
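
The warm-up effect can be made concrete with a toy replay. The sketch below pushes a repeating access pattern through an empty cache and reports the hit ratio per window. LRU eviction is used only as a stand-in, not as a claim about how Exadata Smart Flash Cache chooses what to keep, and `hit_ratios` is a hypothetical helper.

```python
from collections import OrderedDict
from typing import Iterable, List


def hit_ratios(accesses: Iterable[int], capacity: int, window: int) -> List[float]:
    """Replay block accesses through an initially empty LRU cache.

    Returns the hit ratio for each consecutive full window of accesses.
    """
    cache = OrderedDict()
    ratios, hits, seen = [], 0, 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used
        seen += 1
        if seen == window:
            ratios.append(hits / window)
            hits, seen = 0, 0
    return ratios


# Same 100-block working set read three times into a cache that fits it.
print(hit_ratios(list(range(100)) * 3, capacity=100, window=100))
# prints [0.0, 1.0, 1.0]
```

The first window has no hits because the rebuilt cache starts empty. That is the period a change window should budget for.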

Design guidance and caveats: where flash-cache assumptions become misleading

| Claim you may hear | More careful reading | Why it matters |
| --- | --- | --- |
| “`WriteBack` is always the better mode.” | It can improve write behavior, but it also introduces dirty-cache management and a stricter operational posture. | Mode choice should reflect workload needs and operational discipline, not only speed goals. |
| “Just recreate the cache and you are done.” | If you are leaving `WriteBack`, the dirty-data flush and verification step is part of the real procedure. | Skipping that step misunderstands what `WriteBack` means. |
| “The old cache returns immediately after recreation.” | The cache must warm again because the prior working set is gone. | Performance right after the change may not resemble steady state. |
| “This is only a flash-cache setting.” | The mode is managed with the cell attribute `flashCacheMode`. | It changes where you inspect and reason about the setting. |
| “All Exadata configurations behave identically.” | PMEM-capable systems and older software releases can introduce additional mode interactions and release-specific procedures. | Use the procedure that matches the actual platform generation and software version. |

Misconception: detail output is the whole story

Flash-cache detail matters, but the mode decision itself is best understood from the cell-level setting.

Misconception: dirty bytes are an implementation detail

They are exactly what makes `WriteBack` operationally distinct during conversion.

Misconception: a successful command means a finished change

You still need to validate flush completion, recreated cache state, and post-change warm-up behavior.

Misconception: one runbook fits every vintage

Older Exadata System Software releases had different procedural details, so version alignment still matters.

Best operator question

Before changing the mode, ask three concrete questions: what is the current `flashCacheMode`, is there dirty write-back content to flush, and how will you verify the rebuilt cache after the switch?

Operational lab: a grounded checklist for inspection, conversion, and post-change proof

This lab gives you a concrete way to inspect current mode, switch safely, and verify the result while remembering that the cache has to repopulate.

Inspection and precheck
-- Confirm the current mode at the cell
CellCLI> LIST CELL ATTRIBUTES name, flashCacheMode, status

-- Inspect cache status and size
CellCLI> LIST FLASHCACHE DETAIL

-- If the cell is in WriteBack, inspect dirty bytes
CellCLI> LIST METRICCURRENT FC_BY_DIRTY
Mode conversion and validation
-- Example: leave WriteBack safely
CellCLI> ALTER FLASHCACHE ALL FLUSH
CellCLI> LIST METRICCURRENT FC_BY_DIRTY
CellCLI> DROP FLASHCACHE
CellCLI> ALTER CELL flashCacheMode=WriteThrough
CellCLI> CREATE FLASHCACHE ALL
CellCLI> LIST CELL ATTRIBUTES flashCacheMode
CellCLI> LIST FLASHCACHE DETAIL

What a clean outcome looks like

  • The cell reports the target flashCacheMode.
  • Flash-cache detail output shows a healthy recreated object.
  • Dirty-byte checks make sense for the direction you used.
  • The team expects a repopulation period after recreation.

What should stop the runbook

  • You are leaving `WriteBack` but never validated dirty-byte drainage.
  • The cell mode and flash-cache object state disagree with expectations.
  • The procedure in hand does not match the actual Exadata software generation.
  • The change window assumes steady-state performance immediately after recreation.
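
The stop conditions above can be sketched as an explicit gate. This is an illustrative checklist encoder, not an Oracle tool: `RunbookState`, its field names, and `stop_reasons` are hypothetical, and the operator still supplies every fact by hand. For brevity, the second check compares only the reported mode, not the full flash-cache object state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunbookState:
    """Facts gathered by the operator; all field names are illustrative."""
    current_mode: str
    target_mode: str
    dirty_bytes_verified_zero: Optional[bool]   # None means never checked
    reported_mode_after: Optional[str] = None   # None means not yet converted
    procedure_matches_release: bool = False
    window_allows_warmup: bool = False


def stop_reasons(state: RunbookState) -> List[str]:
    """Return every reason to halt; an empty list means no stop condition hit."""
    reasons = []
    leaving_writeback = (
        state.current_mode == "WriteBack" and state.target_mode != "WriteBack"
    )
    if leaving_writeback and state.dirty_bytes_verified_zero is not True:
        reasons.append("leaving WriteBack without validated dirty-byte drainage")
    if (state.reported_mode_after is not None
            and state.reported_mode_after != state.target_mode):
        reasons.append("cell mode disagrees with the target")
    if not state.procedure_matches_release:
        reasons.append("procedure does not match the Exadata software generation")
    if not state.window_allows_warmup:
        reasons.append("change window assumes steady-state performance immediately")
    return reasons
```

The defaults are pessimistic on purpose: anything the operator has not confirmed counts as a reason to stop.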

Quick quiz

The questions below test the distinctions that matter most in real Smart Flash Cache operations: acknowledgement timing, cell-level mode control, and flush discipline.

7 questions on CellCLI and mode semantics

Q1. In `WriteThrough` mode, when is a write acknowledged?

  • Immediately after it reaches flash only
  • After it has also been written to disk
  • Only after cache warm-up finishes
  • Only after `FC_BY_DIRTY` becomes zero

Correct answer: `WriteThrough` waits for disk completion.

Q2. What is the cleanest current place to verify or reason about the flash-cache write mode?

  • `V$ASM_DISKGROUP` only
  • `FLASHCACHECONTENT` only
  • The cell attribute flashCacheMode
  • The database parameter file

Correct answer: the mode is managed and verified as a cell-level setting.

Q3. Why is `FC_BY_DIRTY` especially important when converting from `WriteBack` to `WriteThrough`?

  • Because it helps confirm that dirty write-back contents have been drained before the mode change completes
  • Because it shows ASM failure groups
  • Because it replaces `LIST FLASHCACHE DETAIL` entirely
  • Because it is used only for PMEM status

Correct answer: it is the dirty-byte proof point for leaving staged writes cleanly.

Q4. Which statement is safest after recreating Smart Flash Cache in a new mode?

  • The cache should immediately behave like it did before recreation
  • No verification is needed if the command succeeded
  • Only disk-group metrics matter now
  • The cache must warm again, so post-change behavior can differ from steady state for a while

Correct answer: a rebuilt cache must repopulate.

Q5. Which command sequence best matches the documented direction for moving to `WriteBack` in current procedures?

  • `ALTER FLASHCACHE ALL FLUSH` then `LIST METRICCURRENT FC_BY_DIRTY` only
  • `DROP FLASHCACHE`, `ALTER CELL flashCacheMode=WriteBack`, then `CREATE FLASHCACHE ALL`
  • `ALTER CELL flashCacheMode=WriteBack` without recreating the cache
  • `ALTER GRIDDISK ... CACHINGPOLICY` only

Correct answer: the documented pattern changes the cell mode and recreates the cache.

Q6. What is the safest interpretation of a recommendation to use `WriteBack`?

  • It is always correct for every Exadata configuration and workload
  • It removes the need for maintenance planning
  • It may be valuable, but you still need the right platform, procedure, and validation discipline
  • It means dirty data never exists in flash

Correct answer: mode choice is operational as well as technical.

Q7. Which three checks best define a careful flash-cache mode change?

  • Current flashCacheMode, dirty-byte state if relevant, and post-recreation verification
  • Only AWR load profile and ASM rebalance power
  • Only disk group free space and listener status
  • Only Smart Scan statistics

Correct answer: mode, dirty-state awareness, and post-change proof form the core checklist.
