Capture. Trail. Transport. Apply. The core replication pipeline is not a pile of independent processes. It is a staged movement system with deliberate handoff points, restart boundaries, and operational evidence at every hop.
The emphasis is on the practical boundaries that matter in real estates: what capture actually owns, why trails exist as a durable buffer, how transport changes between classic and Microservices layouts, and how Replicat becomes the final integrity gate instead of a passive consumer.
This model earns its keep in design reviews, onboarding runbooks, migration planning, and troubleshooting sessions where the main question is which stage is healthy and which one is merely running.
A common modern baseline: an Oracle source with integrated Extract, local trail persistence, Distribution Service or Receiver Service for network movement, and integrated or integrated parallel Replicat on the target.
Scope: the ongoing replication pipeline only. Initial load, conflict detection, and detailed security hardening are adjacent topics and stay outside this article so the core data path remains crisp.
Topic boundary and the model you should keep in your head
The core GoldenGate pipeline is broader than Extract and narrower than the entire product. It starts where committed source change becomes eligible for capture and ends where Replicat commits that change on the target. Everything between those two points exists to make that journey durable, restartable, and observable.
That framing matters because GoldenGate estates are often described in misleadingly vague language: "capture is fine," "replication is behind," or "the path looks up." Those statements are too coarse to be useful. Capture, trail persistence, transport, and apply fail differently, recover differently, and need different evidence. Treating them as distinct stages is what turns a general replication platform into an operable system.
Classic vs. Microservices Transport
Both architectures solve the exact same handoff problem, but the control surfaces change significantly.
Modern Baseline
The modern routing layer replaces physical data pumps with explicit service-based paths.
Legacy Baseline
Inherited estates often rely on standalone data pump processes for network transit.
When an operator says "GoldenGate is behind," the right response is to find the first boundary where progress stops. If the local trail advances, capture is not your main problem. If the remote trail advances but Replicat checkpoints do not, transport is not your main problem. The article stays organized around that logic all the way through.
Capture on the source is where correctness starts, not where the network starts
Capture is responsible for turning committed source change into GoldenGate-owned change data. That sounds obvious, but it leads to an important operational rule: the source side must already be prepared before you care about paths, remote trails, or Replicat throughput.
For Oracle sources, the modern command surface exposes integrated capture directly. You add the primary Extract with INTEGRATED TRANLOG, then register it with the database. That sequencing is a strong reminder that integrated capture is not only a GoldenGate process definition. It is also a database relationship that has to exist before change can be mined reliably.
- Source logging must preserve enough row identity and change detail for downstream apply, especially for updates and deletes.
- The database must be prepared for GoldenGate capture before Extract registration becomes a routine step.
- Credentials, deployment ownership, and storage location for local trails should be settled before the process is started.
- The first local trail should exist before you ask capture to run continuously.
- Missing logging does not always fail at startup. It often surfaces later as apply ambiguity or missing key data.
- Late registration creates avoidable confusion about whether the capture edge actually belongs to the database.
- Starting Extract before defining the persistence boundary makes recovery reasoning harder than it needs to be.
- Source-side discipline is what lets downstream lag analysis stay about lag instead of basic correctness.
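The source-logging bullet above is concrete in practice. A minimal Oracle-side preparation sketch, run before any ADD EXTRACT, might look like the following; the oggadmin credential alias and the app schema are placeholder names, not fixed conventions:

-- SQL, as a privileged DBA: enable supplemental logging and GoldenGate support
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER DATABASE FORCE LOGGING;
ALTER SYSTEM SET ENABLE_GOLDENGATE_REPLICATION = TRUE SCOPE=BOTH;

-- GoldenGate command line: add schema-level trandata so updates and deletes
-- carry enough row identity for downstream apply
DBLOGIN USERIDALIAS oggadmin
ADD SCHEMATRANDATA app

The point of the sketch is ordering: every line above belongs before the capture edge exists, not after the first ambiguous update shows up on the target.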
Mine committed change
Capture works from the source change stream. The work is not "copy rows" but "materialize committed transactions into GoldenGate records."
Bind Extract to the source
Integrated Extract for Oracle is explicit. Registration is part of the source contract, not a postscript.
Write the local trail
The local trail is the first GoldenGate-owned durable queue. It separates database mining from the rest of the pipeline.
Only then think about transport
If the capture edge is sloppy, every downstream symptom becomes noisier than it has to be.
-- Run only after source logging, privileges, and database-side GoldenGate prep are complete
ADD EXTRACT excore, INTEGRATED TRANLOG, BEGIN NOW
REGISTER EXTRACT excore DATABASE
ADD EXTTRAIL trail_name, EXTRACT excore, MEGABYTES 500
START EXTRACT excore
The order in that bundle is the real lesson. ADD EXTRACT defines the group. REGISTER EXTRACT completes the Oracle capture relationship. ADD EXTTRAIL creates the first durable handoff. In the example, trail_name is the local trail base name for your deployment. Starting the process before those pieces are settled may still produce something that looks alive, but it does not produce a cleanly reasoned pipeline.
If capture starts before the source has been prepared correctly, the failure may not show up as an immediate startup error. It can surface later as missing metadata, broken update identity, or target-side confusion that gets misdiagnosed as an apply problem. Capture quality is defined by what it emits into the trail, not only by whether the process stays green.
The trail is the durable handoff, not an implementation detail you can mentally skip
Trail files are what make GoldenGate a controllable movement system instead of a direct source-to-target stream. They decouple capture from transport, decouple transport from apply, and give every downstream phase a recoverable restart boundary.
That is why experienced operators talk about the local trail and remote trail separately. The local trail proves whether capture is still producing change. The remote or target-side trail proves whether transport is really delivering. Once you think in those two queue boundaries, incident triage becomes dramatically faster.
| Boundary | Written by | Consumed by | Why it exists operationally |
|---|---|---|---|
| Local trail | Primary Extract | Classic data pump or Distribution Service | Protects the source capture edge from network instability and gives you the first durable proof that change is being emitted correctly. |
| Remote or receiver-side trail | Data pump, Distribution Service, or Receiver Service path | Replicat | Protects the apply edge from transport interruptions and creates a clean target-side backlog signal. |
| Checkpoint state | Replicat metadata and checkpoint table | Replicat on restart | Lets apply resume from a known position rather than guessing from process status alone. |
Why the trail is the first thing to inspect
If the local trail is advancing, Extract is still turning source change into GoldenGate-owned records. If the local trail is frozen, transport and apply analysis is premature. If the remote trail is advancing but target commits are not, transport is not the bottleneck. This is the single most useful way to reduce troubleshooting noise.
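That inspection can be made concrete with two commands, assuming the excore Extract group from the earlier example:

-- Is the local trail write position advancing while source transactions commit?
INFO EXTRACT excore, DETAIL
-- How far behind the source is capture at its last checkpoint?
LAG EXTRACT excore

If the write position in the DETAIL output moves while source transactions commit, capture is healthy and attention can shift downstream.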
Why mixed-version estates care about trail format
Trail compatibility is not accidental. The trail definition can pin a compatibility target with FORMAT RELEASE, which becomes important when downstream readers are not yet on the same major or minor level as the writer. This is one of the cleanest upgrade pressure valves in the pipeline.
-- Use FORMAT RELEASE only when downstream version compatibility requires it
ADD EXTTRAIL trail_name, EXTRACT excore, MEGABYTES 500, FORMAT RELEASE 21.3
Trail cleanup is healthy only when it follows actual downstream progress. If remote transport stalls or Replicat falls behind, an aggressive purge rule can turn a recoverable lag incident into a rebuild. Trails are supposed to absorb pressure. Purge policy should respect that design rather than defeat it.
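In classic deployments, that principle maps to checkpoint-aware purge rules rather than age-only rules. A sketch as a Manager parameter, assuming trails live under ./dirdat; the 24-hour floor is an illustrative value, not a recommendation:

-- Purge trail files only after every registered reader has checkpointed past them,
-- and keep a minimum retention window as an additional safety margin
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPHOURS 24

USECHECKPOINTS is the part that makes the purge follow downstream progress instead of the calendar.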
Transport is where the same pipeline splits into classic pump logic or Microservices path logic
The transport stage moves trail data between systems without collapsing the source and target boundaries. In classic architecture this usually means a data pump writing a remote trail. In Microservices Architecture it means Distribution Service and, where needed, Receiver Service paths.
Current Microservices guidance matters because it changes how you reason about the network hop. Distribution Service is not just a renamed pump. It is the routing layer for path-managed delivery, point-to-point distribution, fan-out, and deployment-to-deployment movement. Receiver Service gives the target side a dedicated inbound role and, in target-initiated designs, the ability to pull from the source-side Distribution Service when the network policy demands that direction.
| Transport pattern | Best fit | Main consequence |
|---|---|---|
| Classic data pump | Inherited classic estates, simple point-to-point flows. | You keep familiar pump semantics, but you also keep classic operational tooling. |
| Source-initiated Distribution path | Modern default when the source is allowed to initiate outbound delivery. | The outbound path is defined and managed from the source-side Distribution Service. |
| Target-initiated Receiver path | Restricted target zones where the source cannot initiate inbound sessions. | The target deployment owns the pull relationship and writes the received trail locally. |
Whatever the pattern, transport is healthy only when four things hold:
- The path can authenticate and connect in the right direction for the network policy.
- Target-side storage is receiving trail data where Replicat expects to read it.
- Transport backlog is measurable as a queue, not guessed from general network health.
- Path ownership is obvious enough that restart, pause, and resync actions are not ad hoc.
Choose path direction deliberately instead of treating it as wiring trivia
The correct direction is a security and operability decision, not just a network workaround.
Source-initiated path
- Create the outbound path on the source-side Distribution Service.
- Point it at the target deployment and target trail landing point.
- Use this when the source environment is allowed to initiate the session toward the target.
- Operationally, this keeps ownership of the network send action on the source side.
Target-initiated path
- Create the path on the target-side Receiver Service.
- Point it to the source Distribution Service URI and define the received trail.
- Use this when the target is allowed to pull but the source cannot open into the destination zone.
- Operationally, this makes the target side responsible for fetching and landing the trail.
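In Admin Client terms, the two directions correspond to different path objects. The following is a sketch under stated assumptions: the path name, hostnames, ports, and trail names are placeholders, and the exact URI schemes depend on version and TLS configuration, so treat the shape rather than the literals as the point:

-- Source-initiated: the outbound path lives on the source Distribution Service
ADD DISTPATH path_ab SOURCE trail://source-host:9002/services/v2/sources?trail=aa TARGET wss://target-host:9103/services/v2/targets?trail=ab

-- Target-initiated: the pull path lives on the target Receiver Service
ADD RECVPATH path_ab TARGET wss://target-host:9103/services/v2/targets?trail=ab SOURCE wss://source-host:9102/services/v2/sources?trail=aa

Either way, the object that owns the path is also the object you restart, pause, and inspect, which is exactly the ownership clarity the bullets above ask for.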
Even in Microservices, the right mental model is not "a REST-managed stream." It is still queued data moving from one durable boundary to another. The services change path ownership and observability, but they do not eliminate the need to reason about backlog, landing location, or which side owns the next restart.
Apply on the target is where availability, ordering, and restart discipline stop being abstract
Replicat is the last stage in the core pipeline, but it is not just the last reader. It is the place where trail records become committed target transactions, where checkpoint state must be trustworthy, and where poor upstream reasoning often shows up as false blame.
For Oracle targets, current GoldenGate command flow exposes both integrated Replicat and integrated parallel Replicat. That choice should be conscious. Integrated apply is a common modern shape for Oracle targets. Integrated parallel apply belongs where throughput and dependency behavior justify the extra concurrency and tuning surface. Neither choice should be made by inheritance alone.
Integrated Replicat
Strong modern baseline for Oracle targets when the goal is dependable transactional apply with database-aware coordination.
Integrated parallel Replicat
Use when target-side workload shape, dependency profile, and throughput needs justify additional parallel apply machinery.
Checkpoint table
Not optional window dressing. It records the read and write positions that give restart behavior its discipline.
-- Establish checkpoint state before starting target apply
ADD CHECKPOINTTABLE ggops.gg_ckpt

-- Integrated Replicat
ADD REPLICAT rcore, INTEGRATED, EXTTRAIL trail_name, CHECKPOINTTABLE ggops.gg_ckpt
START REPLICAT rcore

-- Integrated parallel Replicat when the target workload justifies it
ADD REPLICAT rcorep, PARALLEL, INTEGRATED, EXTTRAIL trail_name, CHECKPOINTTABLE ggops.gg_ckpt
START REPLICAT rcorep
The checkpoint table matters because target apply is never only about current status. It is about where Replicat last read, where it last wrote, and whether a restart resumes from a defensible position. In the example, trail_name stands for the target-side trail base name that Replicat will read. If the target side is under pressure, checkpoint movement is often more informative than a simple running state.
If Replicat falls behind, the diagnosis still starts with queue boundaries. Is the remote trail receiving data at the expected rate? Are checkpoints stalled because apply is the bottleneck, or because the path stopped landing new trail records? Treating all target lag as a pure SQL or index problem is one of the quickest ways to waste an incident window.
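Those questions map onto a small command surface, assuming the rcore group from the example:

-- Where is Replicat reading in the trail, and where did it last apply?
INFO REPLICAT rcore, DETAIL
-- How old is the last record Replicat processed?
LAG REPLICAT rcore
-- What has apply actually done since the last stats reset?
STATS REPLICAT rcore, LATEST

If LAG grows while the read position is stuck at the end of the landed trail, the bottleneck is upstream; if the read position trails far behind fresh trail data, apply itself is the problem boundary.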
Build order matters because each step defines the next observable boundary
A strong GoldenGate onboarding sequence does more than create objects in the right general order. It makes each step independently verifiable before the next stage is introduced. That is how you avoid a deployment that technically starts but is hard to reason about under load.
Prepare source
Finish source logging and database-side prep before Extract.
Add Extract
Create integrated capture, register it with the database, then make it runnable.
Persistence
Create the local trail and settle compatibility first.
Transport
Choose classic pump or specific Microservices path direction.
Target apply
Add checkpoints, create Replicat, and validate end to end.
-- Connect to each participating database and add heartbeat objects
ADD HEARTBEATTABLE
Heartbeat tables belong near the end of bring-up but before you call the pipeline production-ready. They convert "lag feels high" into a stage-aware observation with actual timestamps moving through the route. For a serious estate, that is the difference between subjective replication complaints and a measurable data path.
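Once heartbeat objects exist, lag becomes queryable on the target database. A sketch, assuming the default lag views that ADD HEARTBEATTABLE creates in the GoldenGate administration schema; view names and retention can vary with configuration:

-- Current per-path lag, broken down by stage timestamps
SELECT * FROM gg_lag;

-- Retained lag history for trend analysis
SELECT * FROM gg_lag_history;

The history view is what turns "lag feels high today" into a comparison against how the same path behaved last week.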
| Phase | What to inspect | Healthy signal | What it means if not healthy |
|---|---|---|---|
| Capture bring-up | Extract definition, registration, and local trail progression | Extract is running and the first trail boundary advances when source transactions occur | If the local trail does not move, transport and apply analysis can wait. |
| Transport bring-up | Distribution or Receiver path state, remote landing trail activity | Path is established and target-side trail receives new records | If target-side trail stays idle while the local trail advances, the gap is in transport ownership, connectivity, or path definition. |
| Apply bring-up | Replicat status and checkpoint movement | Replicat checkpoints move as remote trail data arrives | If checkpoints stall while the remote trail advances, apply is now the primary problem boundary. |
| End-to-end visibility | Heartbeat state and lag interpretation | Timestamps move cleanly through the path and lag stays explainable by workload and topology | If heartbeats do not advance, process status alone is not trustworthy evidence. |
Diagnostics are easiest when you ask which boundary stopped moving first
GoldenGate incidents feel chaotic mainly when teams jump straight to the most visible symptom. A cleaner method is stage-first diagnosis: inspect capture, then local trail, then transport landing, then apply checkpoints, then heartbeat movement. That order mirrors the actual pipeline and keeps you from treating the wrong tier as guilty.
Source activity should translate into advancing local trail state. If not, the problem is still upstream of the network.
Target-side trail landing is the proof that a path is more than merely configured.
Replicat checkpoint movement is stronger evidence than a green process badge.
| Symptom | Most likely stalled boundary | What to inspect next | Why this usually happens |
|---|---|---|---|
| Source commits exist, but no downstream movement is visible | Capture to local trail | Confirm Extract registration, source readiness, and whether the local trail advances at all | Source preparation was incomplete, or the capture edge never became healthy even though the process exists. |
| Local trail advances, but target-side trail stays quiet | Transport | Inspect path ownership, route direction, authentication, and target landing definition | The path is paused, misdirected, blocked by network policy, or writing somewhere Replicat is not reading. |
| Target-side trail advances, but commits on the target do not | Apply | Inspect Replicat mode, checkpoint movement, target contention, and dependency pressure | Replicat is the true bottleneck, not capture or transport. |
| Updates and deletes fail or behave ambiguously after onboarding | Capture correctness | Review source logging assumptions and whether the captured records preserve required identity information | Process health was mistaken for data quality at the capture edge. |
| Restart behavior is messy or duplicates are suspected | Checkpoint discipline | Inspect checkpoint table usage, trail bindings, and recent recovery actions | The restart boundary was not treated as a first-class design object. |
| Mixed-version path breaks during upgrade work | Trail compatibility boundary | Review whether the writing side needs an explicit FORMAT RELEASE for downstream readers | Version mismatch was handled socially instead of through the trail contract. |
"Is Extract up?" is weaker than "Is the local trail moving?" "Is the path configured?" is weaker than "Is the target-side trail receiving data?" "Is Replicat green?" is weaker than "Are checkpoints advancing?" Queue-aware questions align with the actual pipeline and produce shorter incidents.
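Expressed as a triage sequence, with excore and rcore from the earlier examples and path_ab as a hypothetical Distribution path name:

-- Necessary but not sufficient: overall process status
INFO ALL
-- Boundary 1: is the local trail write position advancing?
INFO EXTRACT excore, DETAIL
-- Boundary 2: is the path running and landing data? (Microservices paths)
INFO DISTPATH path_ab
-- Boundary 3: are read and apply checkpoints advancing?
INFO REPLICAT rcore, DETAIL

Running these in order mirrors the pipeline itself, so the first command whose output stops moving names the guilty stage.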
Version-aware design guidance: keep the pipeline model, update the control surface
The underlying replication stages are longstanding. What changes across generations is how you configure, route, observe, and scale them. The safest design posture is to preserve the core pipeline mental model while updating the surrounding tooling and defaults.
| Area | Longstanding reality | Current design bias | Operational consequence |
|---|---|---|---|
| Architecture | Classic vocabulary and process-group thinking still appear in older estates. | Treat Microservices Architecture as the current baseline for new build-outs. | Teams must stay bilingual during migration work, but new operational design should center on deployments and services. |
| Oracle capture | Extract has always been the source edge of the pipeline. | For Oracle, current command flow explicitly exposes integrated Extract and separate Extract registration. | Source onboarding is a database-integrated design task, not just a GoldenGate shell task. |
| Transport | Data pumps solved the classic remote trail hop. | Distribution Service and Receiver Service should be treated as the modern routing and landing control surface. | Transport direction, path ownership, and landing location become first-class design decisions. |
| Apply | Replicat has always been the target-side commit boundary. | Integrated Replicat or integrated parallel Replicat deserve an explicit review for Oracle targets. | Do not carry an old apply mode forward automatically just because the old estate survived on it. |
| Compatibility | Mixed-version movement has always required careful boundaries. | Use trail compatibility settings deliberately when readers lag behind writers. | Trail format becomes a clean upgrade contract instead of a source of surprise at cutover time. |
| Observability | Operators always needed lag and checkpoint evidence. | Heartbeat objects and service-level path visibility should be added early, not after the first incident. | Modern estates should diagnose by stage movement, not by intuition or a single status page. |
The most productive way to operate Oracle GoldenGate is to stop thinking of it as one opaque replication engine and start thinking of it as four linked boundaries: capture, trail, transport, and apply. Each boundary owns a different failure mode, a different restart story, and a different kind of evidence.
Once that model is clear, the rest of the architecture becomes easier to reason about. Integrated Extract defines the Oracle source edge. Trails provide the durable queue. Distribution Service and Receiver Service modernize the network hop without erasing the queueing model. Replicat, backed by checkpoint state, finishes the path on the target. That is the core pipeline, and nearly every serious GoldenGate design or incident can be understood by asking which of those four boundaries is truly moving.
Test your understanding
Q1 — Which GoldenGate component interacts directly with the Oracle database logmining server to capture committed changes?
Q2 — What is the primary operational value of GoldenGate trail files?
Q3 — In a modern Microservices Architecture, which service handles the outbound routing of trail data to a remote deployment?
Q4 — If the Replicat process crashes and restarts, how does it know exactly where to resume applying transactions without causing duplication?