Capture Model, Registration, and Checkpoints
How to reason about Oracle GoldenGate Extract without confusing redo access, database registration, trail position, and restart behavior.
Extract is not just the process that reads change data. In Oracle GoldenGate for Oracle databases, it is a contract between the source database, the capture engine, the GoldenGate deployment, and the trail that receives committed work. Most operational mistakes happen when those boundaries are blurred: operators create an Extract before registering it, treat a data pump like a source capture, assume a checkpoint table controls Extract restart, or read INFO EXTRACT lag as if it were exact end-to-end latency. A stable design starts by separating capture model, registration scope, checkpoint meaning, and runtime behavior.
For Oracle database sources, the practical baseline is integrated Extract. It captures through the database logmining server rather than reading redo directly, as a classic Oracle Extract does.
Registration applies only to a primary source Extract. Data pumps are not registered because they read a trail, not redo or LCR streams.
Extract owns read checkpoints in the source stream and write checkpoints in the trail. That is a different concern from the database checkpoint table that operators usually associate with Replicat.
Define what Extract actually owns before you start issuing commands
An Extract group is more than a running executable. GoldenGate treats it as a processing group with its own parameter file, checkpoint state, report history, and associated trail relationship. The source database view of that group, however, depends on the capture model you selected.
That distinction matters because operators often collapse three different objects into one mental picture. The first object is the source-side capture binding inside the database, which exists only when registration is required. The second object is the GoldenGate Extract group, which has its own lifecycle in the deployment or classic installation. The third object is the trail relationship that marks where Extract has already written committed work. If you mix those together, restart and troubleshooting decisions become guesswork.
Extract is managed as a named group. In practice that means one process identity, one parameter file, one checkpoint lineage, one report lineage, and one outbound trail contract.
For integrated capture, database state exists outside the GoldenGate home as well. Registration and the logmining server become part of the real system, not just the GoldenGate filesystem.
Restart position is not just "last thing I saw." Extract maintains read and write checkpoint state so it can preserve commit ordering and avoid skipping or re-emitting the wrong records.
When a source capture incident occurs, ask four separate questions in order: was the source prepared correctly, was the Extract registered correctly, where is the read checkpoint, and where is the write checkpoint?
Do not diagnose Extract using only the trail RBA. A healthy write checkpoint can coexist with a bad source registration, missing archive log, or recovery checkpoint that is far older than you expect.
Treat INFO EXTRACT, INFO EXTRACT ... SHOWCH, SEND EXTRACT ... STATUS, and the report file as four different lenses, not interchangeable commands.
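The four questions above can be sketched as an ordered triage check. This is a minimal illustration only: the dictionary keys and the function name are hypothetical, and the boolean findings are assumed to come from an operator's own reading of the commands and report, not from any GoldenGate API.

```python
from typing import Optional

# Order taken from the four questions in the text.
# The key names are hypothetical labels, not GoldenGate identifiers.
TRIAGE_ORDER = [
    ("source_prepared", "source preparation"),
    ("extract_registered", "registration"),
    ("read_checkpoint_ok", "read checkpoint"),
    ("write_checkpoint_ok", "write checkpoint"),
]


def first_failed_boundary(findings: dict) -> Optional[str]:
    """Return the first boundary that is not confirmed healthy, in triage order.

    A missing finding counts as unconfirmed, so an operator cannot skip
    ahead to the trail side without first answering the earlier questions.
    """
    for key, label in TRIAGE_ORDER:
        if not findings.get(key, False):
            return label
    return None


findings = {
    "source_prepared": True,
    "extract_registered": False,
    "read_checkpoint_ok": True,
    "write_checkpoint_ok": True,  # a healthy trail RBA does not clear the incident
}
print(first_failed_boundary(findings))
```

The point of the ordering is that a healthy write checkpoint is only examined last, so it can never mask a registration or source-preparation problem.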
Choose the right capture model before you decide anything about registration
Registration is not the first design choice. Capture model is. The source path determines whether Extract reads redo directly, receives logical change records from the logmining server, or reads a trail as a downstream pump.
In modern Oracle GoldenGate practice for Oracle database sources, integrated Extract is the default design. Integrated Extract receives logical change records from the database logmining server instead of reading redo itself, and it is the model Oracle continues to center in current GoldenGate documentation and modern multitenant guidance. Older classic Oracle capture material still matters when you inherit estates that were built years ago, but it should not be the design center for a new Oracle source deployment.
| Capture path | What the process reads | Where registration matters | Operational consequence |
|---|---|---|---|
| Integrated primary Extract | Logical change records from the database logmining server. | Must be registered with the source or mining database before ADD EXTRACT ... INTEGRATED TRANLOG. | This is the normal Oracle capture path for current GoldenGate practice, including multitenant and downstream mining designs. |
| Classic primary Extract | Redo or archive logs read directly by Extract. | Relevant mainly to older Oracle estates and older documentation. It is not the current design baseline for Oracle sources. | Useful mostly as legacy context. Its support story is increasingly restrictive in newer Oracle guidance. |
| Data pump Extract | Existing local trail. | No database registration, because it is not a source capture process. | Checkpointing is still real, but the source is trail position rather than database redo position. |
| Downstream integrated Extract | Logical change records mined on a downstream database from shipped redo or archive data. | Requires the same integrated registration logic, plus the mining-database login boundary. | Moves capture overhead and mining state away from the production source host at the cost of extra infrastructure and archive discipline. |
Do not multiply source Extracts by habit
For one Oracle database, or for a controlled set of PDBs inside one CDB, one integrated Extract is often sufficient. Add more only when scope, ownership, or dependency boundaries justify it.
Use downstream to move mining, not to hide confusion
Downstream capture is justified when production overhead isolation, platform separation, or operational policy requires it. It is not the answer to unclear registration or trail design.
Remember that pumps are not source capture
A pump can fan out, bridge networks, or normalize trail movement, but it cannot fix missing source logging, bad registration scope, or a broken primary Extract checkpoint chain.
Register the right object at the right scope, especially in multitenant estates
Registration is a source-database decision, not a formatting step. Its job is to create the capture binding that the primary Extract depends on. If you register the wrong scope, the rest of the build may still look syntactically correct while being operationally wrong.
For integrated Extract, registration must happen before the Extract group is added. In a non-CDB, that is usually straightforward: log into the source alias and register the Extract against the database. In a CDB, the design question is whether you want a per-PDB Extract or a root-level Extract that captures one or more PDBs. Oracle's current multitenant guidance favors per-PDB Extract where ownership is meant to stay at the pluggable database level. If you connect as the local PDB user, you do not need a CONTAINER clause or SOURCECATALOG just to build the Extract.
Best when ownership follows a single pluggable database
Connect as a local GoldenGate user in the PDB, register there, and build the Extract as if it were a non-CDB source. This is usually the cleanest operational boundary.
Best when one Extract must cover multiple PDBs
Connect at root with the right privileges, register the intended PDB scope explicitly with CONTAINER, and keep that scope visible in your verification commands and runbooks.
| Scope choice | How you log in | How you register | What operators forget |
|---|---|---|---|
| Non-CDB integrated Extract | Use the source database credential alias. | REGISTER EXTRACT name DATABASE | The registration timestamp influences where BEGIN NOW effectively starts for integrated capture. |
| Per-PDB integrated Extract | Connect as the PDB-local GoldenGate user. | Register directly in that PDB with no CONTAINER clause. | Operators often overcomplicate this by using a common root user and three-part object naming when they do not need to. |
| Root-level integrated Extract for selected PDBs | Connect as the common user at root. | REGISTER EXTRACT name DATABASE CONTAINER (...) | If Extract will fetch data, the root user needs the right container-wide privilege model. |
| Downstream integrated Extract | Use the source login boundary and the mining-database login boundary. | Register with the mining database after DBLOGIN and MININGDBLOGIN as required. | The registration target is the mining side, not the place where you merely happen to type the command. |
DBLOGIN USERIDALIAS ggsrc_fin
REGISTER EXTRACT efin01 DATABASE

DBLOGIN USERIDALIAS ggsrc_pdbops
REGISTER EXTRACT eops01 DATABASE

DBLOGIN USERIDALIAS cgg_root
REGISTER EXTRACT ecdb01 DATABASE CONTAINER (pdbfin, pdbops)

DBLOGIN USERIDALIAS ggsrc_root
MININGDBLOGIN USERIDALIAS ggmine_root
REGISTER EXTRACT edown01 DATABASE CONTAINER (pdbdist)
Multitenant changes do not stop once the Extract is registered. If you add or drop PDB scope later, the operation must be deliberate. The supported pattern is to stop the Extract, log into the correct database scope again, and then use REGISTER EXTRACT ... DATABASE ADD CONTAINER (...) or DROP CONTAINER (...). That is a configuration change, not a harmless metadata tweak.
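A small planning sketch can make a scope change explicit before anyone types it. The helper below is hypothetical; it only compares the registered PDB list with the intended one and prints the command sequence described above (stop, log in, then ADD CONTAINER or DROP CONTAINER). It does not talk to GoldenGate, and the alias and group names are reused from this article's examples.

```python
from typing import Iterable, List


def plan_container_change(
    extract: str,
    alias: str,
    registered: Iterable[str],
    intended: Iterable[str],
) -> List[str]:
    """Return the command sequence for moving from the registered PDB scope
    to the intended one. An empty list means no change is required."""
    to_add = sorted(set(intended) - set(registered))
    to_drop = sorted(set(registered) - set(intended))
    if not to_add and not to_drop:
        return []

    # Stop first, then log in at the correct scope, as the text describes.
    commands = [f"STOP EXTRACT {extract}", f"DBLOGIN USERIDALIAS {alias}"]
    if to_add:
        commands.append(
            f"REGISTER EXTRACT {extract} DATABASE ADD CONTAINER ({', '.join(to_add)})"
        )
    if to_drop:
        commands.append(
            f"REGISTER EXTRACT {extract} DATABASE DROP CONTAINER ({', '.join(to_drop)})"
        )
    return commands


for line in plan_container_change(
    "ecdb01", "cgg_root", ["pdbfin", "pdbops"], ["pdbfin", "pdbdist"]
):
    print(line)
```

Writing the plan out this way keeps the change reviewable: the diff between registered and intended scope is visible before the Extract is stopped.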
A root-level Extract that was meant to isolate one PDB can become an ownership problem later because credential scope, object naming, and operational responsibility are now wider than the original intent.
If you register at a specific SCN, that SCN must align with a valid dictionary build boundary. Do not improvise a historical SCN just because you want to move the start point backward.
Registration can return immediately while database-side work continues in the background. Treat it as asynchronous enough that you verify completion before pushing to the next step.
Build Extract in the correct order or you will create a believable but broken configuration
A good Extract build is really a dependency chain. The database must be prepared first. Registration comes next for integrated capture. Only then should you add the Extract group, bind the trail, and start the process.
Prepare the source database for capture, not just connectivity
Set ENABLE_GOLDENGATE_REPLICATION to TRUE, make sure the source is in ARCHIVELOG mode, enable forced logging, and enable the required supplemental logging strategy for the replicated objects. Extract cannot capture from a read-only Data Guard or Active Data Guard standby.
Establish the right privilege model for the database release
On Oracle Database 21c and lower, integrated Extract privilege flows commonly rely on DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE. On 26ai and higher, Oracle's role-based OGG_CAPTURE model replaces that procedure.
Register the primary source Extract before adding it
Integrated Extract is not created first and registered later. Registration is a prerequisite for ADD EXTRACT ... INTEGRATED TRANLOG and defines the database-side capture binding.
Add the Extract group and then the local trail
The Extract group defines the runtime object. The trail binding defines where committed work will be written. A data pump can come later if transport design requires it.
Start, verify, and read the checkpoints immediately
Do not stop at RUNNING. Confirm that the process is positioned where you expect, the trail is being written, and the checkpoint view matches the intended source and container scope.
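The dependency chain in the steps above can be expressed as a simple ordering check. This is a sketch under stated assumptions: the step names are hypothetical labels for the five steps in this section (with group and trail creation split in two), and the function only detects out-of-order steps, not missing ones.

```python
from typing import List, Optional, Tuple

# Hypothetical step labels, in the order this section prescribes.
BUILD_ORDER = [
    "prepare_database",
    "grant_privileges",
    "register_extract",
    "add_extract",
    "add_exttrail",
    "start_extract",
    "verify_checkpoints",
]


def first_order_violation(steps: List[str]) -> Optional[Tuple[str, str]]:
    """Return (late_step, step_it_should_precede) for the first step that
    appears after a step it was supposed to come before, else None."""
    rank = {name: i for i, name in enumerate(BUILD_ORDER)}
    highest_rank = -1
    highest_name = ""
    for step in steps:
        if step not in rank:
            raise ValueError(f"unknown step: {step}")
        if rank[step] < highest_rank:
            return (step, highest_name)
        highest_rank, highest_name = rank[step], step
    return None


# Registration attempted after the group was added: the reversed dependency.
print(first_order_violation(
    ["prepare_database", "grant_privileges", "add_extract", "register_extract"]
))
```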
-- Source database: check the current state (SQL*Plus)
SHOW PARAMETER enable_goldengate_replication;
SELECT supplemental_log_data_min, force_logging FROM v$database;
ARCHIVE LOG LIST;
-- Source database: enable what capture requires
ALTER SYSTEM SET enable_goldengate_replication = TRUE SCOPE = BOTH;
ALTER DATABASE FORCE LOGGING;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER SYSTEM SWITCH LOGFILE;

-- GoldenGate command interface: register, add, start, verify
DBLOGIN USERIDALIAS ggsrc_fin
REGISTER EXTRACT efin01 DATABASE
ADD EXTRACT efin01, INTEGRATED TRANLOG, BEGIN NOW
ADD EXTTRAIL ./dirdat/es, EXTRACT efin01, MEGABYTES 256
EDIT PARAMS efin01
START EXTRACT efin01
INFO EXTRACT efin01
INFO EXTRACT efin01, SHOWCH

-- Parameter file for efin01
EXTRACT efin01
USERIDALIAS ggsrc_fin
EXTTRAIL ./dirdat/es
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
DDL INCLUDE MAPPED
TABLE sales_fin.*;
Two sequencing details are easy to miss. First, BEGIN NOW is not a magical runtime bookmark independent of registration. For integrated Extract, the effective "now" is tied to the registration event. Second, on the first startup GoldenGate uses the begin point established when the group was created, but after the group has run, normal restarts use the checkpoint chain, not the original BEGIN clause. If you forget that, you will misread recovery behavior after a stop or abend.
Before you start the process
- The source is in read/write mode and archive logging is enabled.
- The GoldenGate user has the release-appropriate capture privileges.
- The registration scope matches the intended database or PDB boundary.
- The Extract name fits the naming rules and the trail path is local to the capture runtime.
RUNNING is not enough
The process can be running yet positioned at an unexpected checkpoint, waiting on missing archives, or suspended by an event action. Always read checkpoint and runtime state after the first start.
Register only the primary Extract
When you create a pump with EXTTRAILSOURCE, its start point is trail-based. Registration and source-database capture privileges are not part of that object's lifecycle.
Read checkpoints like an operator instead of treating them as a generic "last good position"
Extract checkpoints are layered. There is a startup position, a recovery position, a current read position in the source stream, and a current write position in the trail. If you do not distinguish them, restart behavior looks random when it is actually doing exactly what it should.
The most important operational insight is that Extract is tracking two realities at once. It must know how far it has read in the source stream, and it must know how far it has safely emitted committed work into the trail. That is why INFO EXTRACT ... SHOWCH matters so much: it exposes both the read side and the write side. The recovery checkpoint is particularly important because it points to the oldest still-open transaction that has not been fully processed. That is the point that determines how far back recovery may need to look.
Startup position: The start location GoldenGate will use when the group begins from its configured start point rather than from an already advanced runtime chain.
Recovery position: The source position of the oldest transaction not yet fully processed. This often governs how much archive history recovery needs.
Current read position: The last source record read by Extract. In ordinary monitoring this aligns with the summary Log Read Checkpoint view.
Current write position: The trail position currently being written. This is the write-side guarantee that lets pumps and Replicat consume a durable sequence.
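To make the layering concrete, the sketch below models the four positions as plain data and asks which source logs a restart would still need. It is a simplification for illustration: source positions are represented as redo log sequence numbers, the class and field names are hypothetical, and nothing here parses real SHOWCH output.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ExtractCheckpoints:
    """Hypothetical, simplified view of the layered Extract checkpoints."""
    startup_seq: int        # where the group was originally told to begin
    recovery_seq: int       # oldest transaction not yet fully processed
    current_read_seq: int   # last source record read
    write_trail_seq: int    # trail file currently being written
    write_trail_rba: int    # offset within that trail file


def logs_needed_for_restart(cp: ExtractCheckpoints) -> range:
    """Restart must be able to read from the recovery position forward,
    not merely from the current read position."""
    return range(cp.recovery_seq, cp.current_read_seq + 1)


def missing_logs(cp: ExtractCheckpoints, available: Iterable[int]) -> List[int]:
    """Return the needed log sequences that are no longer available."""
    have = set(available)
    return [seq for seq in logs_needed_for_restart(cp) if seq not in have]


cp = ExtractCheckpoints(
    startup_seq=100, recovery_seq=118, current_read_seq=131,
    write_trail_seq=42, write_trail_rba=9876,
)
# Logs older than sequence 120 were purged, but recovery still needs 118 and 119.
print(missing_logs(cp, range(120, 132)))
```

The write-side fields play no part in the calculation, which is the point: a healthy trail position says nothing about whether the source history behind the recovery position still exists.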
INFO EXTRACT efin01
INFO EXTRACT efin01, SHOWCH
INFO EXTRACT efin01, SHOWCH 5
SEND EXTRACT efin01, STATUS
SEND EXTRACT efin01, SHOWTRANS
VIEW REPORT efin01
| Field or command | What it really tells you | What it does not guarantee | How to use it correctly |
|---|---|---|---|
| Log Read Checkpoint | The last source position checkpointed by Extract. | That all older transactions are fully written and safely irrelevant for recovery. | Compare it with SHOWCH detail and recovery checkpoint when restart or archive issues exist. |
| Recovery Checkpoint | The oldest source record still needed to recover open work. | That recovery will be short if required archive or online logs are gone. | Use it to decide how far back archive availability must reach before restart or upgrade work. |
| Checkpoint Lag | Lag relative to the timestamp of the last record processed at the time the last checkpoint was written. | Exact end-to-end business latency at this instant. | Pair it with stats, trail movement, and downstream lag before making operational conclusions. |
| SEND EXTRACT ... STATUS | Whether the process is active, recovering, or suspended and which recovery stage it is in. | Detailed root cause on its own. | Use it when RUNNING looks suspicious or when a recovery appears stalled. |
| Checkpoint files | Persisted process positioning state. | A replacement for command-level inspection. | Let GoldenGate own the checkpoint artifacts; use commands and reports as the authoritative operator view. |
When operators talk about a checkpoint table, they are usually thinking about Replicat. For Extract, the practical day-two conversation should begin with read and write checkpoints shown by INFO EXTRACT ... SHOWCH.
Bounded Recovery is the part of the Extract checkpoint facility that governs restart recovery for Oracle capture. It exists to keep restart time from growing without limit when long-running transactions are present. The default bounded recovery interval is four hours, and archive retention needs to cover at least twice that interval if you want recovery to succeed consistently.
If the archive logs needed by the recovery checkpoint are not available, the restart problem is not cosmetic. You have removed the history that the Extract needs to recover correctly.
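The retention rule above is simple arithmetic, sketched here so it can be dropped into a pre-maintenance check. The function names are hypothetical, and the result is only a floor derived from the bounded recovery interval: the actual recovery checkpoint shown by SHOWCH can still require older history.

```python
from datetime import timedelta

# Default bounded recovery interval stated in the text.
DEFAULT_BR_INTERVAL = timedelta(hours=4)


def minimum_archive_retention(br_interval: timedelta = DEFAULT_BR_INTERVAL) -> timedelta:
    """Archive retention should cover at least twice the bounded recovery interval."""
    return 2 * br_interval


def retention_is_sufficient(
    retention: timedelta, br_interval: timedelta = DEFAULT_BR_INTERVAL
) -> bool:
    """True when the configured retention meets the two-interval floor."""
    return retention >= minimum_archive_retention(br_interval)


print(minimum_archive_retention())                   # floor for the default interval
print(retention_is_sufficient(timedelta(hours=6)))   # below the floor
print(retention_is_sufficient(timedelta(days=1)))    # above the floor
```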
Monitor runtime behavior without fooling yourself about what RUNNING and lag mean
Healthy Extract operations come from reading state in context. Status, lag, checkpoints, open transactions, report messages, and container scope each answer a different question.
Oracle's own command semantics make this explicit. STARTING means the process has started but has not yet locked its checkpoint file for processing. RUNNING does not always mean active data movement; it can also mean the process is suspended and preserving current state. Checkpoint Lag is approximate and checkpoint-based. The right habit is to combine summary status with deeper commands whenever lag, trail movement, or restart timing feels inconsistent.
INFO EXTRACT efin01
INFO EXTRACT efin01, SHOWCH
INFO EXTRACT efin01, CONTAINERS
SEND EXTRACT efin01, STATUS
STATS EXTRACT efin01, TOTAL
VIEW REPORT efin01
| Verification step | Command | Signal to look for | What it means operationally |
|---|---|---|---|
| Confirm lifecycle state | INFO EXTRACT group | Status is no longer STARTING and trail association is present. | The process is up far enough to be inspected meaningfully. |
| Confirm source and trail positioning | INFO EXTRACT group, SHOWCH | Read and write checkpoint details are advancing in a way that matches recent source activity. | The capture side and trail side agree on forward movement. |
| Confirm multitenant scope | INFO EXTRACT group, CONTAINERS | The registered PDB list matches the intended replication boundary. | You are not accidentally capturing too much or too little of the CDB. |
| Confirm active versus suspended | SEND EXTRACT group, STATUS | The process is active, or it clearly reports a recovery stage or suspended state. | You can distinguish "alive but not moving" from real forward processing. |
| Confirm work actually processed | STATS EXTRACT group, TOTAL | Operation counts align with the intended workload. | Useful when lag looks low but business users still suspect capture issues. |
| Confirm the story behind the state | VIEW REPORT group | No hidden warnings about registration, container scope, archive access, or parameter interpretation. | The report often explains why a state exists when the summary commands only describe it. |
Expect the begin point to matter
The very first run uses the point you established when adding the Extract. For integrated capture, that point ties back to registration semantics, not just the moment you typed ADD EXTRACT.
Expect the checkpoint chain to matter more
After the Extract has run, a normal restart should honor the current and recovery checkpoints rather than replaying the original startup idea.
Expect status phases, not instant readiness
If long-running transactions were open, SEND EXTRACT ... STATUS may show named recovery stages such as In recovery[1] and In recovery[2] while GoldenGate walks back through the necessary source history.
Checkpoint lag is useful, but it is not the whole latency story
The lag shown by INFO EXTRACT reflects the timestamp of the last processed record at the moment the last checkpoint was written. It is not a perfect substitute for business-observed commit latency or target apply latency.
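A short worked example shows why the figure is approximate. Assuming the definition above (lag is measured when the checkpoint is written, against the timestamp of the last record processed), the number an operator reads is already as old as the checkpoint itself. The function names and timestamps below are hypothetical.

```python
from datetime import datetime, timedelta


def checkpoint_lag(last_record_ts: datetime, checkpoint_written_at: datetime) -> timedelta:
    """Lag as of the moment the checkpoint was written."""
    return checkpoint_written_at - last_record_ts


def reading_age(checkpoint_written_at: datetime, now: datetime) -> timedelta:
    """How stale the reported lag figure is when the operator looks at it."""
    return now - checkpoint_written_at


last_record = datetime(2024, 1, 1, 12, 0, 0)   # timestamp of last processed record
checkpoint = datetime(2024, 1, 1, 12, 0, 7)    # when the checkpoint was written
now = datetime(2024, 1, 1, 12, 0, 17)          # when the operator runs the command

print(checkpoint_lag(last_record, checkpoint))  # the figure that gets reported
print(reading_age(checkpoint, now))             # time the figure does not account for
```

Neither value includes pump transport or target apply time, which is why the reported lag cannot stand in for end-to-end latency.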
A process can be running and still not be processing data
When automation or event actions suspend an Extract, the preserved runtime state can mislead operators who look only at a RUNNING summary. SEND EXTRACT ... STATUS closes that gap.
Diagnose the failure paths early, because most Extract incidents are boundary mistakes
The highest-value troubleshooting skill with Extract is not memorizing every message. It is recognizing which boundary failed: source preparation, registration scope, checkpoint continuity, or runtime recovery.
| Symptom | Likely cause | What to inspect first | Next action |
|---|---|---|---|
| ADD EXTRACT ... INTEGRATED TRANLOG fails or later start fails immediately | The primary Extract was not registered, or it was registered in the wrong database scope. | Registration history, the database alias used with DBLOGIN, and whether the intended PDB or mining database was actually the registration target. | Correct the registration boundary before re-creating or restarting the Extract. |
| Extract starts but does not capture the expected PDB data | Wrong per-PDB versus root-level registration model, or the wrong container list. | INFO EXTRACT group, CONTAINERS and the credential alias used to build the process. | Stop the process and correct the registered container scope with the proper login boundary. |
| Restart takes far longer than expected | Long-running open transactions are forcing a deeper recovery walk. | SEND EXTRACT group, STATUS, SEND EXTRACT group, SHOWTRANS, and the recovery checkpoint from SHOWCH. | Preserve the required archives and let recovery complete unless you have a deliberate transaction-handling plan. |
| Recovery will not complete after maintenance or upgrade work | The archive logs required by the recovery checkpoint are gone. | The recovery checkpoint location and the oldest available archive history on the source or mining side. | Restore the needed log history or revise the restart strategy with full understanding of the data-loss implications. |
| Operators keep changing checkpoint table settings but Extract behavior does not change | The wrong object is being tuned. The incident is on the Extract checkpoint chain, not Replicat checkpoint-table state. | INFO EXTRACT group, SHOWCH, the report, and trail write behavior. | Shift the diagnostic focus back to source read/write checkpoints and trail continuity. |
| Privilege procedure suddenly stops working on a new database release | The environment moved to the newer role-based privilege model where the old privilege procedure is disabled. | Database release level and the exact privilege grant method used for the GoldenGate user. | Grant the current capture role model instead of assuming the older package-based grant still applies. |
Registration after the fact
Operators sometimes build the group, then try to "fix" it by registering later. For integrated Extract, that reverses the dependency chain. Build order is part of correctness.
Trail-centric troubleshooting
If you look only at the latest trail sequence and RBA, you may miss that recovery still depends on older source history because the oldest open transaction predates the current write point.
Privilege drift across upgrades
Source registration failures after database modernization are often grant-model failures, not GoldenGate syntax failures. The command is unchanged while the privilege method is not.
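A release check along these lines can catch grant-model drift before a registration attempt fails. It is a sketch that encodes only the two boundaries this article states (21c and lower, 26ai and higher); the function name is hypothetical, and releases between those boundaries are deliberately left undecided rather than guessed.

```python
def capture_privilege_method(db_major_release: int) -> str:
    """Map a database major release to the capture grant method named in the text."""
    if db_major_release <= 21:
        # Package-based grant used on older releases.
        return "DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE"
    if db_major_release >= 26:
        # Role-based model; the older procedure is disabled here.
        return "OGG_CAPTURE role"
    # The article does not state the method for releases in between.
    raise ValueError(
        f"release {db_major_release}: check the release documentation for the grant model"
    )


print(capture_privilege_method(19))
print(capture_privilege_method(26))
```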
Check source readiness, registration scope, checkpoint detail, recovery checkpoint archive availability, then report output. That order catches most real Extract incidents quickly.
Commands that force or skip open transactions exist, but they are incident tools with data consequences. Do not use them casually just because a recovery window is uncomfortable.
Before maintenance, read SHOWCH to understand which source history is still needed. Archive retention mistakes during patching and upgrade work are a repeatable cause of painful Extract restarts.
Keep version boundaries straight so your Extract guidance fits the estate you actually run
The core ideas are stable: Extract captures committed source changes, registration creates the database-side binding for integrated capture, and checkpoints govern restart. What changed over time is which capture mode Oracle expects you to design around, how multitenant scope is expressed, and how privileges are granted.
| Area | Older practical estate | Current practical estate | What to carry forward |
|---|---|---|---|
| Oracle capture method | Older Oracle GoldenGate documentation and training still document classic Extract alongside integrated Extract, while marking classic capture deprecated for Oracle. | Current 23 and 26-era operational guidance centers integrated Extract and modern multitenant registration patterns. | For Oracle sources, design around integrated Extract unless you are deliberately maintaining an inherited legacy estate. |
| Multitenant scope | Root-level common-user designs were a more common mental default. | Per-PDB Extract is now a first-class operational model and is the cleaner default when ownership is PDB-local. | Choose scope by ownership and blast radius, not by habit. |
| Privilege grant model | Oracle Database 21c and lower commonly use DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE to prepare the capture user. | Oracle AI Database 26ai and higher use role-based grants such as OGG_CAPTURE; the older procedure is disabled there. | Validate privilege method against the database release before blaming GoldenGate syntax. |
| Registration syntax nuance | Older command examples often show more explicit database-name phrasing and legacy scoping assumptions. | From Oracle GoldenGate 21.3 onward, the Oracle registration syntax no longer requires the database name, while container-oriented scope remains central where applicable. | Use current command semantics, but keep older examples readable when supporting inherited runbooks. |
| Checkpoint recovery posture | Checkpoint theory was often learned in classic GGSCI-centric estates with filesystem emphasis. | Microservices changes the administration surface, but the Extract recovery logic still depends on read checkpoints, write checkpoints, bounded recovery, and archive continuity. | The control plane changed more than the recovery physics. |
Use per-PDB integrated Extract whenever you want a narrow operational boundary and there is no real need to capture across multiple PDBs from root.
Include SHOWCH, STATUS, and container verification in every Extract runbook. Those are not advanced commands; they are the normal tools for stable operations.
If you inherit classic Oracle Extract, do not merely preserve the syntax. Re-check support boundaries, data type assumptions, and the modern privilege model before treating it as a permanent baseline.
Extract becomes straightforward once you separate capture, registration, and restart state
Choose the capture model first. For Oracle database sources, that usually means integrated Extract. Then register the primary source Extract at the correct source or mining boundary. In multitenant environments, decide explicitly whether the ownership model is per-PDB or root-level across named PDBs. Only after that should you add the Extract group, bind the trail, and start the process.
After the process is running, operate it through checkpoints rather than intuition. The read checkpoint tells you where Extract has advanced in the source stream. The recovery checkpoint tells you how far back recovery may still need to look. The write checkpoint tells you what has been durably emitted to trail. If those three ideas stay distinct in your head, Extract incidents become easier to diagnose, restart behavior becomes predictable, and upgrade or maintenance planning stops being guesswork.