Thursday, March 19, 2026

GoldenGate Extract: Capture Model, Registration & Checkpoints

Extract in Oracle GoldenGate: Capture Model, Registration, Checkpoints, and Operational Behavior

Define what Extract actually owns before you start issuing commands

An Extract group is more than a running executable. GoldenGate treats it as a processing group with its own parameter file, checkpoint state, report history, and associated trail relationship. The source database view of that group, however, depends on the capture model you selected.

That distinction matters because operators often collapse three different objects into one mental picture. The first object is the source-side capture binding inside the database, which exists only when registration is required. The second object is the GoldenGate Extract group, which has its own lifecycle in the deployment or classic installation. The third object is the trail relationship that marks where Extract has already written committed work. If you mix those together, restart and troubleshooting decisions become guesswork.

Process Group

Extract is managed as a named group. In practice that means one process identity, one parameter file, one checkpoint lineage, one report lineage, and one outbound trail contract.

Capture State

For integrated capture, database state exists outside the GoldenGate home as well. Registration and the logmining server become part of the real system, not just the GoldenGate filesystem.

Recovery State

Restart position is not just "last thing I saw." Extract maintains read and write checkpoint state so it can preserve commit ordering and avoid skipping or re-emitting the wrong records.

Operator Rule

When a source capture incident occurs, ask four separate questions in order: was the source prepared correctly, was the Extract registered correctly, where is the read checkpoint, and where is the write checkpoint?

Common Mistake

Do not diagnose Extract using only the trail RBA. A healthy write checkpoint can coexist with a bad source registration, missing archive log, or recovery checkpoint that is far older than you expect.

Useful Habit

Treat INFO EXTRACT, INFO EXTRACT ... SHOWCH, SEND EXTRACT ... STATUS, and the report file as four different lenses, not interchangeable commands.

Choose the right capture model before you decide anything about registration

Registration is not the first design choice. Capture model is. The source path determines whether Extract reads redo directly, receives logical change records from the logmining server, or reads a trail as a downstream pump.

In modern Oracle GoldenGate practice for Oracle database sources, integrated Extract is the default design. Integrated Extract receives logical change records from the database logmining server instead of reading redo itself, and it is the model Oracle continues to center in current GoldenGate documentation and modern multitenant guidance. Older classic Oracle capture material still matters when you inherit estates that were built years ago, but it should not be the design center for a new Oracle source deployment.

| Capture path | What the process reads | Where registration matters | Operational consequence |
|---|---|---|---|
| Integrated primary Extract | Logical change records from the database logmining server. | Must be registered with the source or mining database before ADD EXTRACT ... INTEGRATED TRANLOG. | This is the normal Oracle capture path for current GoldenGate practice, including multitenant and downstream mining designs. |
| Classic primary Extract | Redo or archive logs read directly by Extract. | Relevant mainly to older Oracle estates and older documentation. It is not the current design baseline for Oracle sources. | Useful mostly as legacy context. Its support story is increasingly restrictive in newer Oracle guidance. |
| Data pump Extract | Existing local trail. | No database registration, because it is not a source capture process. | Checkpointing is still real, but the source is trail position rather than database redo position. |
| Downstream integrated Extract | Logical change records mined on a downstream database from shipped redo or archive data. | Requires the same integrated registration logic, plus the mining-database login boundary. | Moves capture overhead and mining state away from the production source host at the cost of extra infrastructure and archive discipline. |

When One Extract Is Enough

Do not multiply source Extracts by habit

For one Oracle database, or for a controlled set of PDBs inside one CDB, one integrated Extract is often sufficient. Add more only when scope, ownership, or dependency boundaries justify it.

When Downstream Helps

Use downstream to move mining, not to hide confusion

Downstream capture is justified when production overhead isolation, platform separation, or operational policy requires it. It is not the answer to unclear registration or trail design.

When Pump Matters

Remember that pumps are not source capture

A pump can fan out, bridge networks, or normalize trail movement, but it cannot fix missing source logging, bad registration scope, or a broken primary Extract checkpoint chain.
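The registration consequences of each capture path can be condensed into a small lookup. The following Python sketch is illustrative only: `CapturePath`, `CAPTURE_PATHS`, and `must_register` are hypothetical names rather than anything GoldenGate provides, and classic capture is left out because the text treats it as legacy context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturePath:
    reads: str                # what the process consumes
    needs_registration: bool  # whether REGISTER EXTRACT applies
    registration_target: str  # where the capture binding lives, if anywhere


# Hypothetical summary of the capture paths discussed above (classic omitted).
CAPTURE_PATHS = {
    "integrated": CapturePath("LCRs from the logmining server", True, "source database"),
    "downstream": CapturePath("LCRs mined on a downstream database", True, "mining database"),
    "pump": CapturePath("existing local trail", False, "none"),
}


def must_register(path_name: str) -> bool:
    """Return True when the named capture path depends on database registration."""
    return CAPTURE_PATHS[path_name].needs_registration


for name in CAPTURE_PATHS:
    print(name, must_register(name))
```

The point of the table form is that a pump never appears on the registration side, which is why a pump cannot compensate for a registration mistake made on the primary Extract.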

Two Valid Integrated Shapes

Local shape: source database, then register, then Integrated Extract, then write, then trail or pump.

  • Local integrated capture: the source database exposes redo to its own logmining services.
  • Integrated Extract: consumes LCRs, filters, transforms, and writes the local trail.
  • Trail or pump: durable handoff for transport and downstream apply.

Downstream shape: source database, then mine, then downstream mining database, then consume, then Integrated Extract.

  • Source database: ships redo or archives to the downstream mining database.
  • Downstream mining database: hosts the logmining server and is the registration target.
  • Integrated Extract: writes the trail after mining occurs off the production host.

Register the right object at the right scope, especially in multitenant estates

Registration is a source-database decision, not a formatting step. Its job is to create the capture binding that the primary Extract depends on. If you register the wrong scope, the rest of the build may still look syntactically correct while being operationally wrong.

For integrated Extract, registration must happen before the Extract group is added. In a non-CDB, that is usually straightforward: log into the source alias and register the Extract against the database. In a CDB, the design question is whether you want a per-PDB Extract or a root-level Extract that captures one or more PDBs. Oracle's current multitenant guidance favors per-PDB Extract where ownership is meant to stay at the pluggable database level. If you connect as the local PDB user, you do not need a CONTAINER clause or SOURCECATALOG just to build the Extract.

Per-PDB Model

Best when ownership follows a single pluggable database

Connect as a local GoldenGate user in the PDB, register there, and build the Extract as if it were a non-CDB source. This is usually the cleanest operational boundary.

Root-Level Model

Best when one Extract must cover multiple PDBs

Connect at root with the right privileges, register the intended PDB scope explicitly with CONTAINER, and keep that scope visible in your verification commands and runbooks.

| Scope choice | How you log in | How you register | What operators forget |
|---|---|---|---|
| Non-CDB integrated Extract | Use the source database credential alias. | REGISTER EXTRACT name DATABASE | The registration timestamp influences where BEGIN NOW effectively starts for integrated capture. |
| Per-PDB integrated Extract | Connect as the PDB-local GoldenGate user. | Register directly in that PDB with no CONTAINER clause. | Operators often overcomplicate this by using a common root user and three-part object naming when they do not need to. |
| Root-level integrated Extract for selected PDBs | Connect as the common user at root. | REGISTER EXTRACT name DATABASE CONTAINER (...) | If Extract will fetch data, the root user needs the right container-wide privilege model. |
| Downstream integrated Extract | Use the source login boundary and the mining-database login boundary. | Register with the mining database after DBLOGIN and MININGDBLOGIN as required. | The registration target is the mining side, not the place where you merely happen to type the command. |

GoldenGate Command or configuration snippet
-- Non-CDB source
DBLOGIN USERIDALIAS ggsrc_fin
REGISTER EXTRACT efin01 DATABASE

-- Per-PDB: local PDB user, no CONTAINER clause
DBLOGIN USERIDALIAS ggsrc_pdbops
REGISTER EXTRACT eops01 DATABASE

-- Root-level: common user, explicit PDB list
DBLOGIN USERIDALIAS cgg_root
REGISTER EXTRACT ecdb01 DATABASE CONTAINER (pdbfin, pdbops)

-- Downstream: source login plus mining-database login
DBLOGIN USERIDALIAS ggsrc_root
MININGDBLOGIN USERIDALIAS ggmine_root
REGISTER EXTRACT edown01 DATABASE CONTAINER (pdbdist)

Multitenant changes do not stop once the Extract is registered. If you add or drop PDB scope later, the operation must be deliberate. The supported pattern is to stop the Extract, log into the correct database scope again, and then use REGISTER EXTRACT ... DATABASE ADD CONTAINER (...) or DROP CONTAINER (...). That is a configuration change, not a harmless metadata tweak.
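To keep scope visible in runbooks, it can help to generate the registration command from an explicit description of the scope. The sketch below only assembles the command strings already shown in this section. `register_command` is a hypothetical helper, and the PDB name `pdbhr` in the usage lines is invented for illustration.

```python
def register_command(name, containers=None, change=None):
    """Render a REGISTER EXTRACT command string for the scopes discussed above.

    containers: None for non-CDB or per-PDB registration, else a list of PDB names.
    change: None for the initial registration, or "ADD" / "DROP" for a later scope change.
    """
    if change not in (None, "ADD", "DROP"):
        raise ValueError("change must be None, 'ADD', or 'DROP'")
    base = f"REGISTER EXTRACT {name} DATABASE"
    if not containers:
        if change:
            raise ValueError("a scope change needs at least one container")
        return base
    prefix = f"{change} " if change else ""
    return f"{base} {prefix}CONTAINER ({', '.join(containers)})"


print(register_command("efin01"))
print(register_command("ecdb01", ["pdbfin", "pdbops"]))
print(register_command("ecdb01", ["pdbhr"], change="ADD"))
```

Generating the string does not replace the operational steps: the Extract still has to be stopped and the correct DBLOGIN scope re-established before an ADD CONTAINER or DROP CONTAINER change is issued.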

Wrong-Scope Failure

A root-level Extract that was meant to isolate one PDB can become an ownership problem later because credential scope, object naming, and operational responsibility are now wider than the original intent.

SCN Caution

If you register at a specific SCN, that SCN must align with a valid dictionary build boundary. Do not improvise a historical SCN just because you want to move the start point backward.

Timing Caution

Registration can return immediately while database-side work continues in the background. Treat it as asynchronous: verify that it has completed before moving to the next step.

Build Extract in the correct order or you will create a believable but broken configuration

A good Extract build is really a dependency chain. The database must be prepared first. Registration comes next for integrated capture. Only then should you add the Extract group, bind the trail, and start the process.

01

Prepare the source database for capture, not just connectivity

Set ENABLE_GOLDENGATE_REPLICATION to TRUE, make sure the source is in ARCHIVELOG mode, enable forced logging, and enable the required supplemental logging strategy for the replicated objects. Extract cannot capture from a read-only Data Guard or Active Data Guard standby.

02

Establish the right privilege model for the database release

On Oracle Database 21c and lower, integrated Extract privilege flows commonly rely on DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE. On Oracle AI Database 26ai and higher, Oracle's role-based OGG_CAPTURE model replaces that procedure.

03

Register the primary source Extract before adding it

Integrated Extract is not created first and registered later. Registration is a prerequisite for ADD EXTRACT ... INTEGRATED TRANLOG and defines the database-side capture binding.

04

Add the Extract group and then the local trail

The Extract group defines the runtime object. The trail binding defines where committed work will be written. A data pump can come later if transport design requires it.

05

Start, verify, and read the checkpoints immediately

Do not stop at RUNNING. Confirm that the process is positioned where you expect, the trail is being written, and the checkpoint view matches the intended source and container scope.
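The five steps above form a dependency chain, and a runbook can check that chain mechanically before each action. This is a minimal sketch under stated assumptions: the step names in `BUILD_ORDER` are hypothetical labels for the steps in this section, and step 04 is split into two entries because it contains two commands.

```python
# Hypothetical labels for the build steps described above, in dependency order.
BUILD_ORDER = [
    "prepare_source",
    "grant_privileges",
    "register_extract",
    "add_extract",
    "add_exttrail",
    "start_extract",
    "verify_checkpoints",
]


def missing_prerequisites(done, next_step):
    """Return the earlier steps that have not been completed before next_step."""
    index = BUILD_ORDER.index(next_step)
    return [step for step in BUILD_ORDER[:index] if step not in done]


done = {"prepare_source", "grant_privileges"}
print(missing_prerequisites(done, "add_extract"))  # ['register_extract']
```

An empty list means the next step is safe to run. Anything else names the boundary that was skipped, which is usually cheaper to fix before the group exists than after.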

GoldenGate Command or configuration snippet
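-- Check current state before changing anything
SHOW PARAMETER enable_goldengate_replication;

SELECT supplemental_log_data_min, force_logging
FROM   v$database;

ARCHIVE LOG LIST;

-- Apply the settings that are missing
ALTER SYSTEM SET enable_goldengate_replication = TRUE SCOPE = BOTH;
ALTER DATABASE FORCE LOGGING;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- Switch to a new redo log after changing the logging settings
ALTER SYSTEM SWITCH LOGFILE;

GoldenGate Command or configuration snippet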
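-- Register first: this creates the database-side capture binding
DBLOGIN USERIDALIAS ggsrc_fin
REGISTER EXTRACT efin01 DATABASE

-- Then add the group and bind its local trail
ADD EXTRACT efin01, INTEGRATED TRANLOG, BEGIN NOW
ADD EXTTRAIL ./dirdat/es, EXTRACT efin01, MEGABYTES 256

-- Create the parameter file before the first start
EDIT PARAMS efin01

-- Start, then read state and checkpoints immediately
START EXTRACT efin01
INFO EXTRACT efin01
INFO EXTRACT efin01, SHOWCH

GoldenGate Command or configuration snippet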
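EXTRACT efin01
USERIDALIAS ggsrc_fin
EXTTRAIL ./dirdat/es
-- Include the supplementally logged columns in each trail record
LOGALLSUPCOLS
-- Write update before and after images as one compact record
UPDATERECORDFORMAT COMPACT
-- Capture DDL only for objects covered by a TABLE statement
DDL INCLUDE MAPPED
TABLE sales_fin.*;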

Two sequencing details are easy to miss. First, BEGIN NOW is not a magical runtime bookmark independent of registration. For integrated Extract, the effective "now" is tied to the registration event. Second, on the first startup GoldenGate uses the begin point established when the group was created, but after the group has run, normal restarts use the checkpoint chain, not the original BEGIN clause. If you forget that, you will misread recovery behavior after a stop or abend.
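The second sequencing detail reduces to a simple rule, sketched here in Python. `effective_start` is a hypothetical function and the positions are opaque placeholder strings. The sketch models only the rule stated above, not how GoldenGate stores or resolves positions.

```python
from typing import Optional


def effective_start(begin_point: str, checkpoint_position: Optional[str]) -> str:
    """Pick the position a start would use under the rule described above.

    begin_point: position fixed when the group was created (for example via BEGIN NOW).
    checkpoint_position: None until the group has run and written a checkpoint.
    """
    if checkpoint_position is None:
        return begin_point        # first startup uses the begin point
    return checkpoint_position    # later restarts follow the checkpoint chain


print(effective_start("begin@registration", None))
print(effective_start("begin@registration", "checkpoint@later"))
```

Once a checkpoint exists, the begin point never wins again. That is why re-reading the original ADD EXTRACT command tells you little about where a restarted process will resume.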

Checklist

Before you start the process

  • The source is in read/write mode and archive logging is enabled.
  • The GoldenGate user has the release-appropriate capture privileges.
  • The registration scope matches the intended database or PDB boundary.
  • The Extract name fits the naming rules and the trail path is local to the capture runtime.

Do Not Assume

RUNNING is not enough

The process can be running yet positioned at an unexpected checkpoint, waiting on missing archives, or suspended by an event action. Always read checkpoint and runtime state after the first start.

Data Pump Reminder

Register only the primary Extract

When you create a pump with EXTTRAILSOURCE, its start point is trail-based. Registration and source-database capture privileges are not part of that object's lifecycle.

Read checkpoints like an operator instead of treating them as a generic "last good position"

Extract checkpoints are layered. There is a startup position, a recovery position, a current read position in the source stream, and a current write position in the trail. If you do not distinguish them, restart behavior looks random when it is actually doing exactly what it should.

The most important operational insight is that Extract is tracking two realities at once. It must know how far it has read in the source stream, and it must know how far it has safely emitted committed work into the trail. That is why INFO EXTRACT ... SHOWCH matters so much: it exposes both the read side and the write side. The recovery checkpoint is particularly important because it points to the oldest still-open transaction that has not been fully processed. That is the point that determines how far back recovery may need to look.

Startup checkpoint

The start location GoldenGate will use when the group begins from its configured start point rather than from an already advanced runtime chain.

Recovery checkpoint

The source position of the oldest transaction not yet fully processed. This often governs how much archive history recovery needs.

Current read checkpoint

The last source record read by Extract. In ordinary monitoring this aligns with the summary Log Read Checkpoint view.

Current write checkpoint

The trail position currently being written. This is the write-side guarantee that lets pumps and Replicat consume a durable sequence.
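The four layers can be held side by side in one small structure, which makes the difference between "where I am reading" and "how far back I may need to go" explicit. The following is a sketch under simplifying assumptions: `ExtractCheckpoints` is a hypothetical type, source positions are plain integers standing in for SCN-like values, and the write position is a (trail sequence, RBA) pair.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ExtractCheckpoints:
    startup: int                    # configured begin position
    recovery: int                   # oldest transaction not yet fully processed
    current_read: int               # last source record read
    current_write: Tuple[int, int]  # (trail sequence, RBA) being written

    def source_history_needed_from(self) -> int:
        """Oldest source position a restart may need to revisit."""
        return min(self.recovery, self.current_read)

    def open_transaction_span(self) -> int:
        """How far the recovery position trails the current read position."""
        return self.current_read - self.recovery


cp = ExtractCheckpoints(startup=1000, recovery=4200, current_read=9800, current_write=(12, 51234))
print(cp.source_history_needed_from())  # 4200
print(cp.open_transaction_span())       # 5600
```

Note that the write position never enters either calculation. Trail progress and source-history requirements live in different coordinate spaces, which is the reason trail RBA alone cannot answer a recovery question.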

Why recovery checkpoint is the field that saves you during incidents

  • Source redo history: contains old open transaction beginnings and new committed work.
  • Recovery checkpoint (oldest open): marks the oldest still-needed source position, not merely the latest read position.
  • Trail write checkpoint (commit order): shows what has been durably emitted after transaction completion rules are preserved.

GoldenGate Command or configuration snippet
-- Summary state, lag, and Log Read Checkpoint
INFO EXTRACT efin01
-- Full read and write checkpoint detail
INFO EXTRACT efin01, SHOWCH
-- Same detail, including the five previous checkpoints
INFO EXTRACT efin01, SHOWCH 5
-- Runtime state, including recovery stage or suspension
SEND EXTRACT efin01, STATUS
-- Open transactions, useful for spotting long-running ones
SEND EXTRACT efin01, SHOWTRANS
-- Process report with warnings and parameter interpretation
VIEW REPORT efin01

| Field or command | What it really tells you | What it does not guarantee | How to use it correctly |
|---|---|---|---|
| Log Read Checkpoint | The last source position checkpointed by Extract. | That all older transactions are fully written and safely irrelevant for recovery. | Compare it with SHOWCH detail and recovery checkpoint when restart or archive issues exist. |
| Recovery Checkpoint | The oldest source record still needed to recover open work. | That recovery will be short if required archive or online logs are gone. | Use it to decide how far back archive availability must reach before restart or upgrade work. |
| Checkpoint Lag | Lag relative to the timestamp of the last record processed at the time the last checkpoint was written. | Exact end-to-end business latency at this instant. | Pair it with stats, trail movement, and downstream lag before making operational conclusions. |
| SEND EXTRACT ... STATUS | Whether the process is active, recovering, or suspended and which recovery stage it is in. | Detailed root cause on its own. | Use it when RUNNING looks suspicious or when a recovery appears stalled. |
| Checkpoint files | Persisted process positioning state. | A replacement for command-level inspection. | Let GoldenGate own the checkpoint artifacts; use commands and reports as the authoritative operator view. |

Do Not Confuse

When operators talk about a checkpoint table, they are usually thinking about Replicat. For Extract, the practical day-two conversation should begin with read and write checkpoints shown by INFO EXTRACT ... SHOWCH.

Bounded Recovery

Bounded Recovery is the part of the Extract checkpoint facility that governs recovery behavior for Oracle integrated capture. It exists to keep restart time from growing without limit when long-running transactions are present. The default bounded recovery interval is four hours, and archive retention needs to cover at least twice that interval (eight hours at the default) if you want recovery to succeed consistently.
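The retention rule is plain arithmetic, and writing it down removes any doubt about which side of the boundary a given retention policy sits on. This sketch uses hypothetical function names and takes the four-hour default and the "at least twice the interval" rule directly from the text above. It is a planning aid, not a statement about how GoldenGate enforces anything.

```python
from datetime import timedelta

DEFAULT_BR_INTERVAL = timedelta(hours=4)  # default interval stated in the text


def minimum_archive_retention(br_interval: timedelta = DEFAULT_BR_INTERVAL) -> timedelta:
    """Retention floor implied by the 'at least twice the interval' rule."""
    return 2 * br_interval


def retention_is_sufficient(retention: timedelta,
                            br_interval: timedelta = DEFAULT_BR_INTERVAL) -> bool:
    """True when the archive retention window meets the floor."""
    return retention >= minimum_archive_retention(br_interval)


print(minimum_archive_retention())                  # 8:00:00
print(retention_is_sufficient(timedelta(hours=6)))  # False
print(retention_is_sufficient(timedelta(days=1)))   # True
```

If the bounded recovery interval is changed from its default, the retention floor moves with it, so the two settings are best reviewed together.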

Retention Consequence

If the archive logs needed by the recovery checkpoint are not available, the restart problem is not cosmetic. You have removed the history that the Extract needs to recover correctly.

Monitor runtime behavior without fooling yourself about what RUNNING and lag mean

Healthy Extract operations come from reading state in context. Status, lag, checkpoints, open transactions, report messages, and container scope each answer a different question.

Oracle's own command semantics make this explicit. STARTING means the process has started but has not yet locked its checkpoint file for processing. RUNNING does not always mean active data movement; it can also mean the process is suspended and preserving current state. Checkpoint Lag is approximate and checkpoint-based. The right habit is to combine summary status with deeper commands whenever lag, trail movement, or restart timing feels inconsistent.
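The status semantics described above can be written as a small decision function, which makes the ambiguity of RUNNING visible. This is an illustrative sketch: `interpret` is a hypothetical helper, and the `detail_state` labels ("active", "recovering", "suspended") are simplified stand-ins for what an operator concludes from the deeper status command, not literal GoldenGate output.

```python
def interpret(summary_status, detail_state=None):
    """Combine the summary status with a deeper status reading into one statement."""
    if summary_status == "STARTING":
        return "started, checkpoint file not yet locked"
    if summary_status == "RUNNING":
        if detail_state == "suspended":
            return "running but suspended, state preserved"
        if detail_state == "recovering":
            return "running, still in recovery"
        if detail_state == "active":
            return "running and processing"
        return "running, detail unknown: check deeper status"
    return f"not running ({summary_status.lower()})"


print(interpret("RUNNING"))
print(interpret("RUNNING", "suspended"))
```

The fall-through branch for RUNNING is the important one: without a deeper reading, the honest answer is that the summary alone does not say whether data is moving.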

GoldenGate Command or configuration snippet
INFO EXTRACT efin01
INFO EXTRACT efin01, SHOWCH
INFO EXTRACT efin01, CONTAINERS
SEND EXTRACT efin01, STATUS
STATS EXTRACT efin01, TOTAL
VIEW REPORT efin01

| Verification step | Command | Signal to look for | What it means operationally |
|---|---|---|---|
| Confirm lifecycle state | INFO EXTRACT group | Status is no longer STARTING and trail association is present. | The process is up far enough to be inspected meaningfully. |
| Confirm source and trail positioning | INFO EXTRACT group, SHOWCH | Read and write checkpoint details are advancing in a way that matches recent source activity. | The capture side and trail side agree on forward movement. |
| Confirm multitenant scope | INFO EXTRACT group, CONTAINERS | The registered PDB list matches the intended replication boundary. | You are not accidentally capturing too much or too little of the CDB. |
| Confirm active versus suspended | SEND EXTRACT group, STATUS | The process is active, or it clearly reports a recovery stage or suspended state. | You can distinguish "alive but not moving" from real forward processing. |
| Confirm work actually processed | STATS EXTRACT group, TOTAL | Operation counts align with the intended workload. | Useful when lag looks low but business users still suspect capture issues. |
| Confirm the story behind the state | VIEW REPORT group | No hidden warnings about registration, container scope, archive access, or parameter interpretation. | The report often explains why a state exists when the summary commands only describe it. |

First Start

Expect the begin point to matter

The very first run uses the point you established when adding the Extract. For integrated capture, that point ties back to registration semantics, not just the moment you typed ADD EXTRACT.

Normal Restart

Expect the checkpoint chain to matter more

After the Extract has run, a normal restart honors the current and recovery checkpoints rather than returning to the original begin point.

Recovery Event

Expect status phases, not instant readiness

If long-running transactions were open, SEND EXTRACT ... STATUS may show named recovery stages such as In recovery[1] and In recovery[2] while GoldenGate walks back through the necessary source history.

Lag Interpretation

Checkpoint lag is useful, but it is not the whole latency story

The lag shown by INFO EXTRACT reflects the timestamp of the last processed record at the moment the last checkpoint was written. It is not a perfect substitute for business-observed commit latency or target apply latency.
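A short worked example shows why the reported lag can look healthy while being stale. The sketch assumes a simplified model in which lag is the processing time minus the record's source timestamp, captured when the checkpoint was written. The function names and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def checkpoint_lag(record_timestamp: datetime, processed_at: datetime) -> timedelta:
    """Lag as of the last checkpoint: processing time minus the record's source timestamp."""
    return processed_at - record_timestamp


def lag_reading_age(checkpoint_written_at: datetime, now: datetime) -> timedelta:
    """How old the reported lag value itself is."""
    return now - checkpoint_written_at


record_ts = datetime(2026, 3, 19, 10, 0, 0)
processed = datetime(2026, 3, 19, 10, 0, 7)
checkpointed = datetime(2026, 3, 19, 10, 0, 10)
now = datetime(2026, 3, 19, 10, 0, 40)

print(checkpoint_lag(record_ts, processed))  # 0:00:07
print(lag_reading_age(checkpointed, now))    # 0:00:30
```

In this example the displayed lag is seven seconds, but the figure describes the situation thirty seconds ago. Nothing in it covers trail transport or target apply, which is why it has to be paired with other signals.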

Suspended State

A process can be running and still not be processing data

When automation or event actions suspend an Extract, the preserved runtime state can mislead operators who look only at a RUNNING summary. SEND EXTRACT ... STATUS closes that gap.

Diagnose the failure paths early, because most Extract incidents are boundary mistakes

The highest-value troubleshooting skill with Extract is not memorizing every message. It is recognizing which boundary failed: source preparation, registration scope, checkpoint continuity, or runtime recovery.

| Symptom | Likely cause | What to inspect first | Next action |
|---|---|---|---|
| ADD EXTRACT ... INTEGRATED TRANLOG fails or later start fails immediately | The primary Extract was not registered, or it was registered in the wrong database scope. | Registration history, the database alias used with DBLOGIN, and whether the intended PDB or mining database was actually the registration target. | Correct the registration boundary before re-creating or restarting the Extract. |
| Extract starts but does not capture the expected PDB data | Wrong per-PDB versus root-level registration model, or the wrong container list. | INFO EXTRACT group, CONTAINERS and the credential alias used to build the process. | Stop the process and correct the registered container scope with the proper login boundary. |
| Restart takes far longer than expected | Long-running open transactions are forcing a deeper recovery walk. | SEND EXTRACT group, STATUS, SEND EXTRACT group, SHOWTRANS, and the recovery checkpoint from SHOWCH. | Preserve the required archives and let recovery complete unless you have a deliberate transaction-handling plan. |
| Recovery will not complete after maintenance or upgrade work | The archive logs required by the recovery checkpoint are gone. | The recovery checkpoint location and the oldest available archive history on the source or mining side. | Restore the needed log history or revise the restart strategy with full understanding of the data-loss implications. |
| Operators keep changing checkpoint table settings but Extract behavior does not change | The wrong object is being tuned. The incident is on the Extract checkpoint chain, not Replicat checkpoint-table state. | INFO EXTRACT group, SHOWCH, the report, and trail write behavior. | Shift the diagnostic focus back to source read/write checkpoints and trail continuity. |
| Privilege procedure suddenly stops working on a new database release | The environment moved to the newer role-based privilege model where the old privilege procedure is disabled. | Database release level and the exact privilege grant method used for the GoldenGate user. | Grant the current capture role model instead of assuming the older package-based grant still applies. |

Failure Pattern

Registration after the fact

Operators sometimes build the group, then try to "fix" it by registering later. For integrated Extract, that reverses the dependency chain. Build order is part of correctness.

Failure Pattern

Trail-centric troubleshooting

If you look only at the latest trail sequence and RBA, you may miss that recovery still depends on older source history because the oldest open transaction predates the current write point.

Failure Pattern

Privilege drift across upgrades

Source registration failures after database modernization are often grant-model failures, not GoldenGate syntax failures. The command is unchanged while the privilege method is not.
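One way to keep the grant model from drifting is to make the release check explicit in provisioning tooling. The sketch below encodes only the two boundaries this article states (21c and lower, 26ai and higher). `capture_grant_method` is a hypothetical helper, the integer argument is assumed to be the marketing release number (19 for 19c, 21 for 21c, 26 for 26ai), and anything between the two boundaries is deliberately left undecided.

```python
def capture_grant_method(release: int) -> str:
    """Map a database release number to the capture grant method named in this article."""
    if release <= 21:
        return "DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE"
    if release >= 26:
        return "OGG_CAPTURE role"
    # The article does not cover releases between the two boundaries.
    return "verify against the release documentation"


for release in (19, 21, 26):
    print(release, capture_grant_method(release))
```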

Fast Triage Order

Check source readiness, registration scope, checkpoint detail, recovery checkpoint archive availability, then report output. That order catches most real Extract incidents quickly.
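The triage order is easy to encode so that an incident checklist always stops at the first boundary that failed. This is a minimal sketch: `TRIAGE_ORDER` restates the order given above, `first_failed_check` is a hypothetical helper, and a check that was never performed is treated as failed.

```python
# The triage order from the text, expressed as ordered check names.
TRIAGE_ORDER = [
    "source readiness",
    "registration scope",
    "checkpoint detail",
    "recovery checkpoint archive availability",
    "report output",
]


def first_failed_check(results):
    """Return the first check in triage order that failed or was never run, else None."""
    for check in TRIAGE_ORDER:
        if not results.get(check, False):
            return check
    return None


results = {
    "source readiness": True,
    "registration scope": True,
    "checkpoint detail": True,
    "recovery checkpoint archive availability": False,
}
print(first_failed_check(results))  # recovery checkpoint archive availability
```

Treating an unperformed check as a failure is a deliberate choice: it prevents a later check from being read as meaningful when an earlier boundary was never confirmed.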

Open Transaction Caution

Commands that force or skip open transactions exist, but they are incident tools with data consequences. Do not use them casually just because a recovery window is uncomfortable.

Upgrade Caution

Before maintenance, read SHOWCH to understand which source history is still needed. Archive retention mistakes during patching and upgrade work are a repeatable cause of painful Extract restarts.
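A pre-maintenance check can compare the log sequence named by the recovery checkpoint against the archives that still exist. The following is a sketch under simplifying assumptions: a single redo thread, integer log sequence numbers, and a set of available sequences gathered by whatever means the site already uses. `missing_sequences` is a hypothetical helper.

```python
def missing_sequences(recovery_seq, current_seq, available):
    """List log sequences between the recovery position and the current one that are absent."""
    return [seq for seq in range(recovery_seq, current_seq + 1) if seq not in available]


# Recovery still needs sequence 118, but archives before 120 were already purged.
print(missing_sequences(118, 122, {120, 121, 122}))  # [118, 119]

# Archives reach back past the recovery position, so nothing is missing.
print(missing_sequences(118, 122, set(range(117, 123))))  # []
```

A non-empty result before patching is the moment to restore or retain logs. After the maintenance window, the same gap turns into a restart that cannot complete.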

Keep version boundaries straight so your Extract guidance fits the estate you actually run

The core ideas are stable: Extract captures committed source changes, registration creates the database-side binding for integrated capture, and checkpoints govern restart. What changed over time is which capture mode Oracle expects you to design around, how multitenant scope is expressed, and how privileges are granted.

| Area | Older practical estate | Current practical estate | What to carry forward |
|---|---|---|---|
| Oracle capture method | Older Oracle GoldenGate documentation and training still document classic Extract alongside integrated Extract, while marking classic capture deprecated for Oracle. | Current 23 and 26-era operational guidance centers integrated Extract and modern multitenant registration patterns. | For Oracle sources, design around integrated Extract unless you are deliberately maintaining an inherited legacy estate. |
| Multitenant scope | Root-level common-user designs were a more common mental default. | Per-PDB Extract is now a first-class operational model and is the cleaner default when ownership is PDB-local. | Choose scope by ownership and blast radius, not by habit. |
| Privilege grant model | Oracle Database 21c and lower commonly use DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE to prepare the capture user. | Oracle AI Database 26ai and higher use role-based grants such as OGG_CAPTURE; the older procedure is disabled there. | Validate privilege method against the database release before blaming GoldenGate syntax. |
| Registration syntax nuance | Older command examples often show more explicit database-name phrasing and legacy scoping assumptions. | From Oracle GoldenGate 21.3 onward, the Oracle registration syntax no longer requires the database name, while container-oriented scope remains central where applicable. | Use current command semantics, but keep older examples readable when supporting inherited runbooks. |
| Checkpoint recovery posture | Checkpoint theory was often learned in classic GGSCI-centric estates with filesystem emphasis. | Microservices changes the administration surface, but the Extract recovery logic still depends on read checkpoints, write checkpoints, bounded recovery, and archive continuity. | The control plane changed more than the recovery physics. |

Design Rule

Use per-PDB integrated Extract whenever you want a narrow operational boundary and there is no real need to capture across multiple PDBs from root.

Runbook Rule

Include SHOWCH, STATUS, and container verification in every Extract runbook. Those are not advanced commands; they are the normal tools for stable operations.

Migration Rule

If you inherit classic Oracle Extract, do not merely preserve the syntax. Re-check support boundaries, data type assumptions, and the modern privilege model before treating it as a permanent baseline.

Extract becomes straightforward once you separate capture, registration, and restart state

Choose the capture model first. For Oracle database sources, that usually means integrated Extract. Then register the primary source Extract at the correct source or mining boundary. In multitenant environments, decide explicitly whether the ownership model is per-PDB or root-level across named PDBs. Only after that should you add the Extract group, bind the trail, and start the process.

After the process is running, operate it through checkpoints rather than intuition. The read checkpoint tells you where Extract has advanced in the source stream. The recovery checkpoint tells you how far back recovery may still need to look. The write checkpoint tells you what has been durably emitted to trail. If those three ideas stay distinct in your head, Extract incidents become easier to diagnose, restart behavior becomes predictable, and upgrade or maintenance planning stops being guesswork.
