How to Replicate Aptible PostgreSQL to Google BigQuery using Datastream
Last updated: December 9, 2025
Overview
This guide outlines the architecture and steps to establish a real-time Change Data Capture (CDC) pipeline from an Aptible-hosted PostgreSQL database to Google BigQuery.
The Architecture
We utilize Google Datastream (Serverless CDC) to read the PostgreSQL Write-Ahead Log (WAL) via a secure TCP Endpoint.
Aptible PostgreSQL -> Aptible Database Endpoint (with IP Filtering) -> Google Datastream -> Google BigQuery
Key Features:
Secure: Access is restricted to specific Google Cloud static IPs and encrypted via TLS using the Aptible Environment CA.
Real-time: Changes (Inserts, Updates, Deletes) are replicated in near real-time.
No Maintenance: Uses native PostgreSQL replication features (pgoutput) and fully managed Google services.
Prerequisites
Administrator access to the Aptible Environment.
The Aptible CLI installed.
A Google Cloud Platform (GCP) project with billing enabled.
Datastream API and BigQuery API enabled in the GCP project.
Step 1: Configure Aptible Network Access
To allow Google Datastream to connect to your database, you must create an external endpoint and allowlist Google's IP addresses.
Create a Database Endpoint
Navigate to the Aptible Dashboard > Database > Endpoints.
Create a new TCP Endpoint.
Note: Keep this page open. You will need the Host (e.g., elb-xxx.aptible.in) and Port later.
Get Google Datastream IPs
Open the Google Cloud Console.
Navigate to Datastream > Connection Profiles > Create Profile.
Select PostgreSQL.
Select the Region where you plan to run the stream (e.g., us-east1).
Select "IP allowlisting" as the connectivity method.
Copy the list of Static IP addresses displayed in the UI.
Allowlist IPs in Aptible
Return to the Aptible Dashboard (or use the CLI).
Edit the Endpoint created in Step 1.
Add the Google IPs to the IP Allowlist.
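Datastream typically lists several static IPs for the chosen region. If you copy them as a single comma-separated string, a quick way to reformat them one per line for pasting into the allowlist form (the addresses below are placeholders from the documentation-reserved TEST-NET-3 range, not real Datastream IPs):

```shell
# Placeholder IPs; substitute the addresses shown in the Datastream UI.
DATASTREAM_IPS="203.0.113.10,203.0.113.11,203.0.113.12"

# One address per line, ready to paste into the Aptible IP Allowlist field.
echo "$DATASTREAM_IPS" | tr ',' '\n'
```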
Step 2: Configure PostgreSQL Replication
We will configure the database to output logical replication changes using the default aptible superuser.
Connect to the Database
Use the Aptible CLI to create a tunnel to your database:
```bash
aptible db:tunnel <YOUR_DATABASE_HANDLE>
```
Copy the connection URL provided (e.g., postgresql://aptible:PASS@localhost:PORT/db).
Run the Configuration SQL
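The tunnel URL can be split into its component fields for clients that ask for host, port, user, and password separately. A minimal POSIX shell sketch, assuming an example URL (the password and port below are placeholders, not real values):

```shell
# Example tunnel URL as printed by `aptible db:tunnel` (placeholder values).
DB_URL="postgresql://aptible:PASS@localhost:5432/db"

stripped="${DB_URL#postgresql://}"   # aptible:PASS@localhost:5432/db
creds="${stripped%%@*}"              # aptible:PASS
hostportdb="${stripped#*@}"          # localhost:5432/db

DB_USER="${creds%%:*}"               # aptible
DB_PASS="${creds#*:}"                # PASS
DB_HOST="${hostportdb%%:*}"          # localhost
portdb="${hostportdb#*:}"            # 5432/db
DB_PORT="${portdb%%/*}"              # 5432
DB_NAME="${portdb#*/}"               # db

echo "$DB_USER@$DB_HOST:$DB_PORT/$DB_NAME"
```

Note this simple splitting assumes the password contains no `@` or `:` characters; URL-encode the password first if it does.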
Connect via your preferred SQL client (e.g., psql, DBeaver) using the tunnel credentials and run the following commands:
```sql
-- 1. Create the Publication
-- This tells Postgres to send changes for ALL tables.
CREATE PUBLICATION datastream_publication FOR ALL TABLES;

-- 2. Create the Replication Slot
-- We use the standard 'pgoutput' plugin required by Google Datastream.
SELECT pg_create_logical_replication_slot('datastream_slot', 'pgoutput');

-- 3. Prevent Idle Timeouts
-- Aptible Load Balancers have a 60-minute idle timeout.
ALTER SYSTEM SET wal_sender_timeout = 0;

-- Reload the configuration so the ALTER SYSTEM change takes effect.
SELECT pg_reload_conf();

-- 4. Verify Creation
SELECT * FROM pg_replication_slots;
```
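For repeatability, the setup statements can be saved to a file and applied through the open tunnel with psql. A sketch (the file name and connection URL are illustrative; `pg_reload_conf()` is included because `ALTER SYSTEM` changes only take effect after a configuration reload):

```shell
# Write the replication setup statements to a reusable file.
cat > setup_replication.sql <<'SQL'
CREATE PUBLICATION datastream_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('datastream_slot', 'pgoutput');
ALTER SYSTEM SET wal_sender_timeout = 0;
SELECT pg_reload_conf();
SELECT * FROM pg_replication_slots;
SQL

# With the tunnel open in another terminal (not executed here):
# psql "postgresql://aptible:PASS@localhost:PORT/db" -f setup_replication.sql
```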
Step 3: Configure Google Datastream
3.1 Retrieve the Aptible CA Certificate
To ensure the connection is securely encrypted and verified, you must download the Certificate Authority (CA) for your Aptible environment.
Identify your Environment Handle (found in the URL of your dashboard, e.g., aptible.com/environments/YOUR_HANDLE).
Run the following CLI command to save the certificate to a file:
```bash
aptible environment:ca_cert <YOUR_ENVIRONMENT_HANDLE> > aptible_ca.pem
```
This creates a file named aptible_ca.pem in your current directory.
3.2 Create Source Connection Profile (PostgreSQL)
Hostname: Enter the Aptible Endpoint URL (e.g., elb-xxx.aptible.in).
Port: Enter the Aptible Endpoint Port.
Username: aptible
Password: <Your Database Password>
Database Name: db (or your specific database name).
Encryption: Select Require SSL/TLS.
Server verification: Select Server verification.
CA Certificate: Click Upload and select the aptible_ca.pem file you generated in Step 3.1.
Connectivity: Select IP allowlisting (verify the IPs match Step 1).
Click RUN TEST. Once successful, click CREATE.
3.3 Create Destination Connection Profile (BigQuery)
Navigate to Connection Profiles > Create Profile.
Select BigQuery.
Select your existing GCP Project.
Click CREATE.
3.4 Create the Stream
Navigate to Streams > Create Stream.
Name: aptible-to-bigquery.
Source: Select the Postgres profile created above.
Destination: Select the BigQuery profile created above.
Source Config:
Replication Slot Name: datastream_slot
Publication Name: datastream_publication
Destination Config:
Choose your Staleness limit (usually 0-15 minutes).
Select the Region for the BigQuery dataset.
Review & Start: Click START.
Step 4: Validation
Wait approx. 2-5 minutes for the stream to initialize and perform the "Backfill" (initial load of existing data).
Navigate to BigQuery in the Google Cloud Console.
You will see a new Dataset (likely named after your source DB).
Run a test query:
```sql
SELECT * FROM `your-project.your_dataset.your_table` LIMIT 100;
```
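The same check can be scripted with the bq CLI if you prefer the terminal. A sketch, where the project, dataset, and table names are placeholders for your own:

```shell
# Placeholders; substitute your own project/dataset/table.
PROJECT="your-project"
DATASET="your_dataset"
TABLE="your_table"
FQ_TABLE="${PROJECT}.${DATASET}.${TABLE}"
echo "$FQ_TABLE"

# Requires authenticated gcloud/bq configuration (not executed here):
# bq query --use_legacy_sql=false "SELECT COUNT(*) FROM \`${FQ_TABLE}\`"
```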
For specific configuration steps for Database Endpoints, refer to our Database Endpoints documentation. For Google Datastream setup, consult Google Cloud's Datastream documentation.