begriffs open source - pg_scribe/blob - logical-replication-backup-design.md

   1 # Incremental SQL Backup System Using PostgreSQL Logical Replication
   2
   3 ## Executive Summary
   4
   5 This document details a proof-of-concept design for a PostgreSQL backup system that produces human-readable, plain SQL incremental backups using logical replication. The system aims to create backups that remain readable and restorable for 10+ years while supporting online operation and crash safety.
   6
   7 **Key Finding**: This approach is **feasible and significantly aided by built-in PostgreSQL tools**.
   8
   9 **UPDATE**: The built-in `pg_recvlogical` utility provides most of the incremental collection functionality, substantially reducing implementation complexity from initial estimates.
  10
  11 **Major Challenges**:
  12 1. **DDL is not captured by logical replication** - requires event triggers (which have gaps for shared objects)
  13    - ✅ **Addressed by**: `pg_dumpall --globals-only` for databases/roles/tablespaces
  14 2. **Replica identity enforcement** - tables without proper configuration will error on UPDATE/DELETE at the source
  15    - ⚠️ **No tool assistance** - requires manual configuration and enforcement
  16 3. **Operational risk** - inactive replication slots can cause database shutdown via transaction wraparound
  17    - ⚠️ **Partial assistance** - monitoring views exist (`pg_stat_replication_slots`) but alerting must be custom
  18 4. **Data type complexity** - transforming WAL output to portable SQL is non-trivial
  19    - ✅ **Simplified by**: `pg_recvlogical` provides structured output from `test_decoding` plugin
  20 5. **Incomplete coverage** - event triggers don't fire for databases, roles, or tablespaces
  21    - ✅ **Addressed by**: `pg_dumpall --globals-only` captures all shared objects
  22
  23 **Bottom Line**: While technically feasible, this approach requires substantially more engineering effort, operational maturity, and ongoing maintenance than standard PostgreSQL backup solutions (pg_basebackup + WAL archiving). However, the existence of `pg_recvlogical` reduces complexity significantly. Only pursue this if human-readable SQL backups are a hard requirement (e.g., regulatory compliance).
  24
  25 ## Architecture Overview
  26
  27 ### High-Level Design
  28
  29 ```
  30 ┌─────────────────────────────────────────────────────────────┐
  31 │                    PostgreSQL Database                       │
  32 │                                                              │
  33 │  ┌────────────────┐         ┌──────────────────┐           │
  34 │  │  Regular Tables│────────▶│  WAL (Write-Ahead│           │
  35 │  │  (DML Changes) │         │       Log)       │           │
  36 │  └────────────────┘         └──────────────────┘           │
  37 │                                      │                       │
  38 │                                      ▼                       │
  39 │                         ┌─────────────────────────┐         │
  40 │                         │ Logical Decoding Process│         │
  41 │                         │  (Decodes WAL to SQL)   │         │
  42 │                         └─────────────────────────┘         │
  43 │                                      │                       │
  44 └──────────────────────────────────────┼───────────────────────┘
  45                                        │
  46                                        ▼
  47                         ┌─────────────────────────────┐
  48                         │  Replication Slot           │
  49                         │  (Tracks position, durable) │
  50                         └─────────────────────────────┘
  51                                        │
  52                                        ▼
  53                         ┌─────────────────────────────┐
  54                         │    pg_recvlogical Tool     │
  55                         │ (Built-in PostgreSQL util)  │
  56                         └─────────────────────────────┘
  57                                        │
  58                         ┌──────────────┴──────────────┐
  59                         ▼                             ▼
  60           ┌─────────────────────┐      ┌─────────────────────┐
  61           │  Incremental Files  │      │  Full pg_dump       │
  62           │  (SQL Changes)      │      │  (Periodic)         │
  63           │  - 2024-01-01.sql   │      │  - base-2024-01.sql │
  64           │  - 2024-01-02.sql   │      │  - base-2024-02.sql │
  65           │  - ...              │      │  - ...              │
  66           └─────────────────────┘      └─────────────────────┘
  67 ```
  68
  69 ### Core Components
  70
  71 1. **Logical Replication Slot**: Durable position tracker in PostgreSQL
  72 2. **Output Plugin**: Transforms binary WAL to human-readable format (e.g., `test_decoding`)
  73 3. **`pg_recvlogical`**: Built-in PostgreSQL tool that streams logical decoding output
  74 4. **Base Backup System**: Regular full `pg_dump` backups with `--snapshot` for consistency
  75 5. **Schema Tracking System**: Event triggers + `pg_dumpall --globals-only` for DDL changes
  76
  77 ## How Logical Replication Works
  78
  79 ### What Logical Replication Provides
  80
  81 PostgreSQL's logical replication decodes the Write-Ahead Log (WAL) into logical changes:
  82
  83 ```sql
  84 -- Example output from test_decoding plugin:
  85 BEGIN 10298
  86 table public.users: INSERT: id[integer]:1 name[text]:'Alice' email[text]:'alice@example.com'
  87 table public.users: UPDATE: id[integer]:1 name[text]:'Alice Smith' email[text]:'alice@example.com'
  88 table public.orders: DELETE: id[integer]:42
  89 COMMIT 10298
  90 ```
  91
  92 This can be transformed into standard SQL:
  93
  94 ```sql
  95 BEGIN;
  96 INSERT INTO public.users (id, name, email) VALUES (1, 'Alice', 'alice@example.com');
  97 UPDATE public.users SET name = 'Alice Smith' WHERE id = 1;
  98 DELETE FROM public.orders WHERE id = 42;
  99 COMMIT;
 100 ```
 101
 102 ### Key Properties
 103
 104 - **Crash-safe**: Replication slots persist position across crashes
 105 - **Consistent**: Transaction boundaries are preserved
 106 - **Online**: Runs without blocking database operations
 107 - **Idempotent positioning**: Can restart from last known position
 108
 109 ## Critical Challenge: DDL is NOT Replicated
 110
 111 ### The Problem
 112
 113 From PostgreSQL documentation (Section 29.8):
 114
 115 > **The database schema and DDL commands are not replicated.** The initial schema can be copied by hand using `pg_dump --schema-only`. Subsequent schema changes would need to be kept in sync manually.
 116
 117 This means logical replication will **NOT** capture:
 118
 119 - `CREATE TABLE`
 120 - `ALTER TABLE` (adding columns, changing types, etc.)
 121 - `DROP TABLE`
 122 - `CREATE INDEX`
 123 - Any other schema modifications
 124
 125 ### Example of the Issue
 126
 127 ```sql
 128 -- Day 1: Base backup taken
 129 CREATE TABLE users (id INT PRIMARY KEY, name TEXT);
 130
 131 -- Day 2: Data changes (CAPTURED by logical replication)
 132 INSERT INTO users VALUES (1, 'Alice');
 133
 134 -- Day 3: Schema change (NOT CAPTURED)
 135 ALTER TABLE users ADD COLUMN email TEXT;
 136
 137 -- Day 4: More data changes (CAPTURED, but in new schema)
 138 UPDATE users SET email = 'alice@example.com' WHERE id = 1;
 139 ```
 140
 141 **Problem**: The incremental backup for Day 4 will contain:
 142 ```sql
 143 UPDATE users SET email = 'alice@example.com' WHERE id = 1;
 144 ```
 145
 146 But if you restore from Day 1 base backup, the `email` column doesn't exist yet, and the UPDATE will fail.
 147
 148 ### Solutions for DDL Tracking
 149
 150 #### Option 1: Event Triggers (Recommended)
 151
 152 PostgreSQL supports **event triggers** that can capture DDL events:
 153
 154 ```sql
 155 -- Create a table to log DDL changes
 156 CREATE TABLE ddl_history (
 157     id SERIAL PRIMARY KEY,
 158     executed_at TIMESTAMP DEFAULT now(),
 159     ddl_command TEXT,
 160     object_type TEXT,
 161     object_identity TEXT,
 162     query TEXT
 163 );
 164
 165 -- Create event trigger function
 166 CREATE OR REPLACE FUNCTION log_ddl_changes()
 167 RETURNS event_trigger AS $$
 168 DECLARE
 169     obj RECORD;
 170 BEGIN
 171     -- Capture DDL command details
 172     FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
 173     LOOP
 174         INSERT INTO ddl_history (ddl_command, object_type, object_identity, query)
 175         VALUES (obj.command_tag, obj.object_type, obj.object_identity, current_query());
 176     END LOOP;
 177 END;
 178 $$ LANGUAGE plpgsql;
 179
 180 -- Register the event trigger for DDL commands
 181 CREATE EVENT TRIGGER capture_ddl ON ddl_command_end
 182     EXECUTE FUNCTION log_ddl_changes();
 183 ```
 184
 185 **How it works**:
 186 1. Event trigger captures all DDL commands
 187 2. Stores them in `ddl_history` table with timestamps
 188 3. Your backup script exports this table periodically
 189 4. During restore, apply DDL changes in chronological order before applying DML changes
 190
 191 **Benefits**:
 192 - ✓ Captures all DDL automatically
 193 - ✓ Timestamp-ordered for correct replay
 194 - ✓ The `ddl_history` table itself gets replicated via logical replication
 195
 196 **Limitations**:
 197 - ✗ **Event triggers don't fire for shared objects** - this is a significant gap:
 198   - `CREATE/DROP/ALTER DATABASE` commands are not captured
 199   - `CREATE/DROP/ALTER ROLE` and role membership (`GRANT ROLE`) are not captured
 200   - `CREATE/DROP/ALTER TABLESPACE` commands are not captured
 201   - `ALTER SYSTEM` commands are not captured
 202   - These require a **separate tracking mechanism** or periodic `pg_dumpall --globals-only`
 203 - ✗ Event trigger functions cannot themselves use `CREATE EVENT TRIGGER` (triggers on triggers)
 204 - ✗ The `ddl_history` table's own schema evolution creates a bootstrap problem
 205
 206 #### Option 2: Application-Level Tracking
 207
 208 Track all schema migrations in your application:
 209
 210 ```python
 211 # migration_001.sql
 212 CREATE TABLE users (id INT PRIMARY KEY, name TEXT);
 213
 214 # migration_002.sql
 215 ALTER TABLE users ADD COLUMN email TEXT;
 216 ```
 217
 218 Store these migrations alongside your backups and apply them in order during restore.
 219
 220 **Benefits**:
 221 - ✓ Clean, version-controlled schema evolution
 222 - ✓ Works well if you already use migration tools (Flyway, Liquibase, Alembic)
 223
 224 **Limitations**:
 225 - ✗ Requires discipline - all DDL must go through migration system
 226 - ✗ Ad-hoc DDL changes will be missed
 227
 228 #### Option 3: Periodic Schema Dumps
 229
 230 Take regular `pg_dump --schema-only` snapshots:
 231
 232 ```bash
 233 # Every hour or day
 234 pg_dump --schema-only dbname > schema-2024-01-01-14:00.sql
 235 ```
 236
 237 During restore:
 238 1. Restore base backup
 239 2. Apply schema dumps in order
 240 3. Apply incremental DML changes
 241
 242 **Benefits**:
 243 - ✓ Simple to implement
 244 - ✓ Catches all schema changes
 245
 246 **Limitations**:
 247 - ✗ May capture schema at wrong moment (between related DDL+DML)
 248 - ✗ Harder to determine exact order of schema vs. data changes
 249
 250 ## Things That Work Better Than Expected
 251
 252 Before diving into challenges, it's worth noting what works well:
 253
 254 ### 1. Crash Safety is Excellent
 255
 256 **From documentation (Section 47.2.2)**:
 257 > A logical slot will emit each change just once in normal operation. Slots persist independently of the connection using them and are crash-safe.
 258
 259 The replication slot mechanism is robust and well-tested. After crashes:
 260 - No data loss occurs (all changes are preserved in WAL)
 261 - Position tracking is durable (persisted at checkpoints)
 262 - Only duplicate delivery is possible (easily handled)
 263
 264 ### 2. Transaction Boundaries Are Preserved
 265
 266 Logical decoding respects ACID properties:
 267 - All changes within a transaction are grouped together
 268 - Transactions appear atomically in the change stream
 269 - Transaction ordering is maintained
 270 - This makes restore much simpler than systems that capture individual row changes
 271
 272 ### 3. Online Operation Has No Blocking
 273
 274 Unlike some backup methods:
 275 - No locks are held during logical decoding
 276 - No impact on normal database operations
 277 - Can run continuously without maintenance windows
 278 - Minimal performance impact (though measurable - see Section 8)
 279
 280 ### 4. Exported Snapshots Solve Initial Consistency
 281
 282 PostgreSQL provides built-in snapshot export when creating a replication slot, ensuring the base backup and incremental stream are perfectly aligned with no gaps. This is a solved problem.
 283
 284 ## Other Challenges and Limitations
 285
 286 ### 1. Sequences Are Not Replicated
 287
 288 **Problem**: Sequence values (used by `SERIAL` and `IDENTITY` columns) are not replicated.
 289
 290 **From documentation (Section 29.8)**:
 291 > Sequence data is not replicated. The data in serial or identity columns backed by sequences will of course be replicated as part of the table, but the sequence itself would still show the start value on the subscriber.
 292
 293 **Example**:
 294 ```sql
 295 CREATE TABLE users (id SERIAL PRIMARY KEY, name TEXT);
 296
 297 -- Insert 1000 rows
 298 -- Sequence on source is now at 1001
 299 -- Sequence on restored database is still at 1
 300
 301 -- Next insert will fail with duplicate key error
 302 INSERT INTO users (name) VALUES ('Bob');  -- Tries to use id=1, but id=1 already exists
 303 ```
 304
 305 **Solution**:
 306 Use `pg_dump --sequence-data` (enabled by default) in periodic full dumps to capture sequence values. After restore, synchronize sequences:
 307
 308 ```sql
 309 -- After restore, sync all sequences
 310 SELECT setval('users_id_seq', (SELECT MAX(id) FROM users));
 311 ```
 312
 313 The `pg_dump --sequence-data` flag is enabled by default and will include current sequence values in full backups.
 314
 315 ### 2. Large Objects Are Not Replicated
 316
 317 **From documentation (Section 29.8)**:
 318 > Large objects (see Chapter 33) are not replicated. There is no workaround for that, other than storing data in normal tables.
 319
 320 **Solution**:
 321 - **Preferred**: Avoid using PostgreSQL large objects. Use `BYTEA` columns instead, which ARE replicated.
 322 - **Alternative**: Use `pg_dump --large-objects` (or `-b`) in periodic full backups to capture large objects
 323   - Note: Incremental changes to large objects will NOT be captured between full backups
 324
 325 ### 3. Replication Slots Prevent WAL Cleanup
 326
 327 **From documentation (Section 47.2.2)**:
 328 > Replication slots persist across crashes and know nothing about the state of their consumer(s). They will prevent removal of required resources even when there is no connection using them. This consumes storage because neither required WAL nor required rows from the system catalogs can be removed by VACUUM as long as they are required by a replication slot.
 329
 330 **Problem**: If your backup script stops running, the replication slot will:
 331 1. Prevent WAL files from being cleaned up → **disk fills up**
 332 2. Prevent VACUUM from cleaning old row versions → **table bloat**
 333 3. Prevent VACUUM from cleaning old transaction IDs → **database shutdown to prevent transaction ID wraparound**
 334
 335 **From documentation (Section 47.2.2)**:
 336 > In extreme cases this could cause the database to shut down to prevent transaction ID wraparound.
 337
 338 This is **not a theoretical risk** - an inactive replication slot can and will cause production outages.
 339
 340 **Solution**:
 341 - **Monitor slot lag aggressively**:
 342   ```sql
 343   SELECT * FROM pg_replication_slots;
 344   ```
 345 - Set up **critical alerting** when `restart_lsn` falls more than 1GB behind
 346 - Set up **emergency alerting** when slot lag exceeds 10GB or age exceeds 24 hours
 347 - Have a **documented emergency procedure** to drop the slot if it threatens database availability
 348 - Consider using `pg_replication_slot_advance()` to skip ahead if needed (loses incremental backup coverage)
 349 - Use `max_slot_wal_keep_size` parameter (PostgreSQL 13+) to limit how much WAL a slot can retain
 350
 351 ### 4. Slot Position is Only Persisted at Checkpoint
 352
 353 **From documentation (Section 47.2.2)**:
 354 > The current position of each slot is persisted only at checkpoint, so in the case of a crash the slot might return to an earlier LSN, which will then cause recent changes to be sent again when the server restarts.
 355
 356 **Problem**: After a crash, you might receive duplicate changes.
 357
 358 **Example**:
 359 ```
 360 1. Script fetches changes, receives up to LSN 1000
 361 2. Script writes changes to file
 362 3. Database crashes before checkpoint
 363 4. Slot position rolled back to LSN 900
 364 5. Script fetches again, receives LSN 900-1000 again (duplicates!)
 365 ```
 366
 367 **Solution**:
 368 - `pg_recvlogical` handles this automatically through its status reporting mechanism
 369 - It sends periodic status updates back to the server confirming processed positions
 370 - On restart, `pg_recvlogical` can use `--startpos` to specify a starting LSN
 371 - For custom implementations: track highest LSN and pass to `pg_logical_slot_get_changes(slot_name, upto_lsn, ...)`
 372
 373 ### 5. TRUNCATE Replication Complexity
 374
 375 **From documentation (Section 29.8)**:
 376 > Replication of TRUNCATE commands is supported, but some care must be taken when truncating groups of tables connected by foreign keys.
 377
 378 **Problem**: If you truncate multiple related tables, the incremental backup needs to capture all of them.
 379
 380 **Solution**: Logical replication handles this automatically, but be aware of the behavior.
 381
 382 ### 6. Replica Identity Required for UPDATE/DELETE
 383
 384 For UPDATE and DELETE operations, PostgreSQL needs to identify which rows to modify. This requires a **replica identity**.
 385
 386 **From documentation (Section 29.1.1)**:
 387 > A published table must have a replica identity configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side.
 388
 389 **Replica Identity Modes**:
 390
 391 - **DEFAULT**: Records old values of primary key columns
 392   - This is the default for non-system tables
 393   - **WARNING**: When there is no primary key, behavior is the same as `NOTHING`
 394
 395 - **USING INDEX _index_name_**: Records old values of columns in the specified unique index
 396   - Index must be unique, not partial, not deferrable
 397   - Index columns must be marked `NOT NULL`
 398   - If the index is later dropped, behavior becomes the same as `NOTHING`
 399
 400 - **FULL**: Records old values of all columns in the row
 401   - Allows UPDATE/DELETE without a primary key, but very expensive
 402   - Can use indexes on subscriber for searching, but inefficient without them
 403
 404 - **NOTHING**: Records no information about the old row
 405   - This is the default for system tables
 406   - Only `INSERT` operations are replicated
 407
 408 **Critical Behavior**:
 409
 410 From documentation (Section 29.1.1):
 411 > Tables with a replica identity defined as `NOTHING`, `DEFAULT` without a primary key, or `USING INDEX` with a dropped index, **cannot support UPDATE or DELETE operations** when included in a publication replicating these actions. **Attempting such operations will result in an error on the publisher.**
 412
 413 This means UPDATE/DELETE will **fail on the source database**, not just during restore!
 414
 415 **Example**:
 416 ```sql
 417 -- Table without primary key
 418 CREATE TABLE logs (timestamp TIMESTAMPTZ, message TEXT);
 419
 420 -- By default, replica identity is DEFAULT
 421 -- Since there's no PK, this behaves as NOTHING
 422 -- UPDATE/DELETE operations will ERROR if this table is in a publication!
 423
 424 -- Attempting this will fail:
 425 UPDATE logs SET message = 'Updated' WHERE timestamp < now();
 426 -- ERROR:  cannot update table "logs" because it does not have a replica identity
 427
 428 -- Fix: Set replica identity to FULL
 429 ALTER TABLE logs REPLICA IDENTITY FULL;
 430
 431 -- Now UPDATE/DELETE will work (but be inefficient)
 432 UPDATE logs SET message = 'Updated' WHERE timestamp < now();  -- OK
 433 ```
 434
 435 **Best practice**: Ensure all tables have either:
 436 - A primary key (automatic replica identity), OR
 437 - A unique index with all columns NOT NULL, configured via `REPLICA IDENTITY USING INDEX`, OR
 438 - Explicit `REPLICA IDENTITY FULL` setting (only as a last resort - very inefficient)
 439
 440 ### 7. Long-Term SQL Readability Challenges
 441
 442 The goal of producing SQL that remains readable and executable for 10+ years introduces additional complexities:
 443
 444 **Version Compatibility Issues**:
 445 - PostgreSQL syntax may change between major versions (though rare)
 446 - Data type representations may evolve
 447 - Default behaviors may change (e.g., how NULLs are handled in certain contexts)
 448
 449 **Extension Dependencies**:
 450 - Custom data types from extensions (PostGIS geometries, hstore, etc.) may not exist in future systems
 451 - Extension versions may be incompatible (e.g., PostGIS 2.x vs 4.x)
 452 - Must document all extension names and versions alongside backups
 453
 454 **Encoding and Collation**:
 455 - Character encoding standards may evolve
 456 - Locale/collation definitions may change
 457 - Text comparison behavior may differ across PostgreSQL versions
 458
 459 **Recommendations**:
 460 - Include a metadata file with each backup containing:
 461   - PostgreSQL version (full version string)
 462   - All installed extension names and versions
 463   - Database encoding
 464   - Locale and collation settings
 465   - Custom data types and enums
 466 - Periodically test restoring old backups on current PostgreSQL versions
 467 - Consider limiting use of PostgreSQL-specific SQL features for better forward compatibility
 468 - Prefer standard SQL types over custom types where possible
 469
 470 ### 8. Performance and Overhead
 471
 472 Logical replication introduces measurable overhead:
 473
 474 **Write Amplification**:
 475 - WAL must be written (normal)
 476 - WAL must be decoded into logical format (additional CPU)
 477 - Event triggers fire on every DDL operation (additional overhead)
 478 - The `ddl_history` table itself generates more WAL entries
 479
 480 **Disk I/O**:
 481 - Additional WAL volume retained by replication slots
 482 - More frequent checkpoint I/O if tuned for faster slot position persistence
 483
 484 **Recommendations**:
 485 - Benchmark the overhead on a test system with production-like workload
 486 - Monitor CPU usage of WAL sender processes
 487 - Consider the trade-off: is the overhead worth human-readable backups?
 488
 489 ## Prerequisites and Configuration
 490
 491 ### PostgreSQL Configuration
 492
 493 ```ini
 494 # postgresql.conf
 495
 496 # Required: Set WAL level to logical
 497 wal_level = logical
 498
 499 # Required: Allow at least one replication slot
 500 max_replication_slots = 10
 501
 502 # Recommended: Allow replication connections
 503 max_wal_senders = 10
 504
 505 # Recommended: Keep more WAL for safety
 506 wal_keep_size = 1GB
 507
 508 # Optional: Tune checkpoint frequency to persist slot positions more often
 509 checkpoint_timeout = 5min
 510 ```
 511
 512 ### Client Requirements
 513
 514 - PostgreSQL client utilities installed (`pg_recvlogical`, `pg_dump`, `pg_dumpall`)
 515 - Superuser or role with `REPLICATION` privilege
 516 - Permission to create replication slots
 517
 518 ## Built-in PostgreSQL Tools That Help
 519
 520 PostgreSQL provides several built-in utilities that address the major challenges of this backup approach:
 521
 522 ### `pg_recvlogical` - Logical Decoding Stream Collector
 523
 524 **Purpose**: Streams logical decoding output from a replication slot to files.
 525
 526 **Key Features**:
 527 - Creates/drops logical replication slots (`--create-slot`, `--drop-slot`)
 528 - Streams changes continuously with automatic reconnection (`--start`)
 529 - Supports multiple output plugins (`--plugin=test_decoding`)
 530 - Handles crash recovery with LSN position tracking
 531 - File rotation via SIGHUP signal
 532 - Configurable fsync intervals (`--fsync-interval`)
 533 - Can stop at specific LSN (`--endpos`)
 534
 535 **Impact**: Eliminates need to write custom code for streaming infrastructure, connection management, and position tracking.
 536
 537 ### `pg_dump` - Database Backup Utility
 538
 539 **Relevant Options**:
 540 - `--snapshot=name`: Synchronize dump with a logical replication slot's exported snapshot
 541 - `--schema-only`: Capture only schema for DDL tracking
 542 - `--sequence-data`: Include sequence values (enabled by default)
 543 - `--large-objects` / `-b`: Include large objects if needed
 544
 545 **Impact**: Provides consistent initial backup synchronized with the replication slot.
 546
 547 ### `pg_dumpall` - Cluster-wide Backup Utility
 548
 549 **Relevant Options**:
 550 - `--globals-only`: Dump only shared objects (databases, roles, tablespaces)
 551
 552 **Impact**: Solves the event trigger gap for shared objects that cannot be captured via DDL events.
 553
 554 ### Monitoring Views
 555
 556 **Relevant Views**:
 557 - `pg_replication_slots`: Monitor slot health, lag, and active status
 558 - `pg_stat_replication`: Track replication connections and status
 559 - `pg_stat_replication_slots`: Statistics on replication slot usage
 560
 561 **Impact**: Provides visibility into replication slot health for critical alerting.
 562
 563 ## Implementation Components
 564
 565 When implementing this system, the following programs/scripts will be needed:
 566
 567 ### 1. Initial Setup Script
 568
 569 **Purpose**: Bootstrap the backup system
 570
 571 **Tasks**:
 572 - Create logical replication slot with appropriate output plugin
 573 - Export the snapshot created by the slot for consistency
 574 - Take initial base backup using `pg_dump` **using the exported snapshot**
 575 - Set up DDL tracking infrastructure (event triggers and ddl_history table)
 576 - Create supplemental backup for shared objects (`pg_dumpall --globals-only`)
 577 - Set `REPLICA IDENTITY` on tables without primary keys
 578 - Document PostgreSQL version and installed extensions with versions
 579
 580 **Critical Detail - Initial Snapshot Consistency**:
 581
 582 From documentation (Section 47.2.5):
 583 > When a new replication slot is created using the streaming replication interface, a snapshot is exported which will show exactly the state of the database after which all changes will be included in the change stream. This can be used to create a new replica by using SET TRANSACTION SNAPSHOT to read the state of the database at the moment the slot was created.
 584
 585 This ensures the base backup and incremental stream are perfectly aligned with no gaps or overlaps.
 586
 587 ### 2. Incremental Backup Collection
 588
 589 **Tool**: Use built-in `pg_recvlogical` utility
 590
 591 **Key Capabilities**:
 592 - Creates and manages logical replication slots
 593 - Streams decoded changes continuously from the replication slot
 594 - Handles connection failures and automatic reconnection
 595 - Tracks LSN positions with status updates to the server
 596 - Supports file rotation via SIGHUP signal
 597 - Can stop at a specific LSN with `--endpos`
 598 - Configurable fsync intervals for crash safety
 599
 600 **Example Usage**:
 601 ```bash
 602 # Create slot and start streaming
 603 pg_recvlogical \
 604     --dbname=mydb \
 605     --slot=backup_slot \
 606     --plugin=test_decoding \
 607     --create-slot \
 608     --start \
 609     --file=/backups/incremental/changes.sql \
 610     --fsync-interval=10
 611 ```
 612
 613 **Additional Tasks** (custom wrapper script):
 614 - File rotation and timestamping
 615 - Transform `test_decoding` output into clean, portable SQL
 616 - Handle output format conversion for long-term readability
 617 - Coordinate with monitoring system
 618
 619 ### 3. Periodic Full Backup Script
 620
 621 **Purpose**: Take regular full backups as restore points
 622 **Tasks**:
 623 - Execute `pg_dump` to create full database backup
 624 - Take schema-only dump separately for reference
 625 - Compress old backups to save space
 626 - Implement retention policy (delete old backups)
 627 - Export DDL history table
 628
 629 ### 4. Restore Script
 630
 631 **Purpose**: Restore database from base + incremental backups
 632 **Tasks**:
 633 - Locate most recent full backup
 634 - Find all incremental backups since that full backup
 635 - Recreate target database
 636 - Restore full backup
 637 - Apply incremental backups in chronological order
 638 - Handle point-in-time recovery (stop at specific timestamp)
 639 - Synchronize sequence values to current maximums
 640 - Verify data integrity
 641
 642 ### 5. Output Plugin / Format Transformer
 643
 644 **Purpose**: Convert logical decoding output to clean SQL
 645
 646 **Tasks**:
 647 - Parse output plugin format (test_decoding or pgoutput)
 648 - Generate proper INSERT/UPDATE/DELETE statements
 649 - Handle complex data type conversions and escaping:
 650   - Arrays (e.g., `{1,2,3}`)
 651   - JSON/JSONB values
 652   - BYTEA (binary data)
 653   - Custom/composite types
 654   - NULL vs empty string vs literal 'null'
 655   - Text with quotes, backslashes, and special characters
 656   - Temporal types (timestamp with timezone, intervals)
 657 - Preserve transaction boundaries (BEGIN/COMMIT)
 658 - Format output for human readability
 659 - Ensure output is valid SQL that can be executed years later
 660
 661 **Challenges**:
 662 - The `test_decoding` plugin is primarily for debugging, not production use
 663 - Consider using the `pgoutput` plugin (used by PostgreSQL's built-in logical replication)
 664 - Data type semantics may change across PostgreSQL major versions
 665 - Custom types and extensions (PostGIS, etc.) may not exist in future systems
 666
 667 ### 6. Monitoring and Health Check Script
 668
 669 **Purpose**: Ensure backup system is functioning correctly
 670 **Tasks**:
 671 - Check replication slot health and lag
 672 - Alert if slot is not advancing
 673 - Monitor disk space in backup directories
 674 - Verify backup files are being created
 675 - Test restore procedures periodically
 676
 677 ## Testing Strategy
 678
 679 ### 1. Basic Functionality Test
 680
 681 ```sql
 682 -- Create test database
 683 CREATE DATABASE backup_test;
 684 \c backup_test
 685
 686 -- Run setup script
 687 -- Then create test data
 688 CREATE TABLE test_users (id SERIAL PRIMARY KEY, name TEXT, created_at TIMESTAMP DEFAULT now());
 689
 690 INSERT INTO test_users (name) VALUES ('Alice'), ('Bob'), ('Charlie');
 691
 692 -- Wait for incremental backup to run
 693 -- Add more data
 694 UPDATE test_users SET name = 'Alice Smith' WHERE id = 1;
 695 DELETE FROM test_users WHERE id = 3;
 696
 697 -- Take another incremental
 698 -- Add schema change
 699 ALTER TABLE test_users ADD COLUMN email TEXT;
 700 UPDATE test_users SET email = 'alice@example.com' WHERE id = 1;
 701
 702 -- Now restore and verify all data is present and correct
 703 ```
 704
 705 ### 2. Crash Recovery Test
 706
 707 ```bash
 708 # Start collecting incrementals
 709 # Generate load with pgbench
 710 # Simulate crash with pg_ctl stop -m immediate
 711 # Restart PostgreSQL
 712 # Collect incrementals again - verify no data loss and duplicates are handled
 713 # Restore and verify data matches
 714 ```
 715
 716 ### 3. Long-Term Storage Test
 717
 718 ```bash
 719 # Create backup
 720 # Store backup in time capsule (simulate 10 years)
 721 # On modern system, verify SQL is still readable and executable
 722 ```
 723
 724 ## Operational Procedures
 725
 726 ### Monitoring
 727
 728 Monitor these metrics:
 729
 730 ```sql
 731 -- Check replication slot health
 732 SELECT
 733     slot_name,
 734     slot_type,
 735     database,
 736     active,
 737     pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as replication_lag,
 738     pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) as confirmed_lag
 739 FROM pg_replication_slots
 740 WHERE slot_type = 'logical';
 741
 742 -- Alert if lag > 1GB or slot is inactive for > 1 hour
 743 ```
 744
 745 ### Backup Schedule
 746
 747 **Recommended**:
 748 - **Incremental**: Every 15 minutes (or continuously streaming)
 749 - **Full backup**: Daily at 2 AM
 750 - **Schema-only dump**: Every hour
 751 - **DDL history export**: Every 15 minutes (part of incremental)
 752
 753 ### Retention Policy
 754
 755 - **Incremental backups**: Keep 7 days
 756 - **Full backups**: Keep 30 days, then one per month for 1 year
 757 - **Monitor disk space**: Alert if backup directory exceeds 80% capacity
 758
 759 ### Disaster Recovery Runbook
 760
 761 1. **Stop application** to prevent new writes during restore
 762 2. **Create new database** (don't overwrite production)
 763 3. **Restore latest full backup**
 764 4. **Apply DDL changes** from `ddl_history` in chronological order
 765 5. **Apply incremental backups** in chronological order
 766 6. **Sync sequences** to latest values
 767 7. **Verify data integrity** with checksums or row counts
 768 8. **Test application** against restored database
 769 9. **Switch over** application to restored database
 770
 771 ## Comparison with Alternatives
 772
 773 | Feature | Logical Replication | pg_dump Only | WAL Archiving |
 774 |---------|-------------------|--------------|---------------|
 775 | Incremental | ✓ Yes | ✗ No | ✓ Yes |
 776 | Human-readable | ✓ SQL-like | ✓ SQL | ✗ Binary |
 777 | Online operation | ✓ Yes | ✓ Yes | ✓ Yes |
 778 | Crash-safe position | ✓ Yes | N/A | ✓ Yes |
 779 | Captures DDL | ✗ **No** | ✓ Yes | ✓ Yes |
 780 | Long-term readable | ✓ Yes | ✓ Yes | ✗ No |
 781 | Point-in-time recovery | ✓ Yes | ✗ No | ✓ Yes |
 782 | Storage efficiency | ✓ Good | ✗ Poor | ✓ Excellent |
 783
 784 ## Conclusion
 785
 786 ### Feasibility: YES, but More Complex Than Initially Assessed
 787
 788 The logical replication approach **is feasible** for incremental SQL backups, but the implementation is **more complex than it initially appears**. Key requirements:
 789
 790 1. **Separate DDL tracking** using event triggers (with gaps for shared objects like databases/roles)
 791 2. **Additional tracking** for shared objects via periodic `pg_dumpall --globals-only`
 792 3. **Aggressive monitoring** of replication slot health to prevent production outages
 793 4. **Replica identity configuration** on all tables (or face errors on UPDATE/DELETE)
 794 5. **Sequence synchronization** during restore
 795 6. **Duplicate handling** for crash scenarios
 796 7. **Complex data type handling** in the SQL transformation layer
 797 8. **Version and extension metadata** tracking for long-term readability
 798 9. **Performance overhead** monitoring and capacity planning
 799
 800 ### Primary Limitations
 801
 802 **1. DDL Capture is Incomplete**
 803
 804 While logical replication doesn't capture DDL, event triggers only partially solve this:
 805 - ✓ Captures table/index/function DDL
 806 - ✗ Does NOT capture database creation/deletion
 807 - ✗ Does NOT capture role management and permissions
 808 - ✗ Does NOT capture tablespace changes
 809 - Requires supplemental `pg_dumpall --globals-only` backups
 810
 811 **2. Replica Identity Enforcement is Strict**
 812
 813 Tables without proper replica identity will **cause errors on the source database** for UPDATE/DELETE operations, not just during restore. This requires proactive configuration.
 814
 815 **3. Operational Risk from Inactive Slots**
 816
 817 An inactive replication slot can cause database shutdown via transaction ID wraparound. This is not a theoretical concern - it requires 24/7 monitoring and documented emergency procedures.
 818
 819 ### When This Approach Makes Sense
 820
 821 - You need human-readable backups for compliance/archival (legal requirement)
 822 - You want continuous incremental backups without WAL shipping
 823 - You have **strong operational maturity** and 24/7 monitoring capability
 824 - You can dedicate engineering time to building and maintaining the system
 825 - You can enforce discipline around replica identity and schema changes
 826 - Your database has moderate change velocity (not extremely high-volume OLTP)
 827 - You're willing to accept the performance overhead
 828
 829 ### When to Consider Alternatives
 830
 831 - If you need zero-configuration backup → use `pg_dump` + `pg_basebackup`
 832 - If operational complexity is too high → use WAL archiving + `pg_basebackup` (standard PITR)
 833 - If you lack 24/7 monitoring capability → any approach without replication slots
 834 - If you need guaranteed DDL capture → use WAL archiving (captures everything)
 835 - If performance overhead is unacceptable → use `pg_basebackup` + WAL archiving
 836 - If you need native PostgreSQL tooling only → use continuous archiving
 837
 838 ### Realistic Complexity Assessment
 839
 840 **This is NOT a simple weekend project.** A production-ready implementation requires:
 841
 842 - **Development effort**: 2-4 weeks for initial implementation (reduced from 4-8 weeks due to `pg_recvlogical`)
 843   - Most streaming infrastructure already exists via `pg_recvlogical`
 844   - Primary development work is output transformation and orchestration
 845 - **Testing effort**: 2-4 weeks for comprehensive testing
 846 - **Documentation**: Detailed runbooks for operations team
 847 - **Ongoing maintenance**: Regular updates as PostgreSQL evolves
 848 - **24/7 monitoring**: Critical alerting on replication slot health
 849 - **Operational expertise**: Deep PostgreSQL knowledge required
 850
 851 **Key Insight**: The existence of `pg_recvlogical` substantially reduces implementation complexity. The original assessment significantly underestimated the availability of built-in PostgreSQL tools for this use case.
 852
 853 ### Next Steps for Proof of Concept
 854
 855 1. **Initial Setup**: Create replication slot and DDL tracking infrastructure
 856    - Use `pg_recvlogical --create-slot` to create logical replication slot
 857    - Set up event triggers for DDL capture
 858    - Take initial `pg_dump` with `--snapshot` synchronized to the slot
 859 2. **Streaming Collection**: Test `pg_recvlogical` for continuous streaming
 860    - Run `pg_recvlogical --start` to begin capturing changes
 861    - Verify `test_decoding` output format
 862 3. **Output Transformation**: Build parser for `test_decoding` format
 863    - Transform to clean, portable SQL
 864    - Handle complex data types (arrays, JSON, bytea, etc.)
 865 4. **DDL Handling**: Implement shared object tracking
 866    - Schedule periodic `pg_dumpall --globals-only`
 867    - Coordinate with event trigger output
 868 5. **Restore Process**: Build and test restore scripts
 869    - Test point-in-time recovery
 870    - Validate sequence synchronization
 871 6. **Crash Recovery**: Test duplicate handling and LSN tracking
 872 7. **Performance**: Measure storage and CPU overhead
 873 8. **Documentation**: Write operational runbooks and emergency procedures
 874
 875 ## References
 876
 877 - PostgreSQL Documentation: Chapter 25 - Backup and Restore
 878 - PostgreSQL Documentation: Chapter 29 - Logical Replication
 879 - PostgreSQL Documentation: Chapter 47 - Logical Decoding
 880 - PostgreSQL Documentation: Section 29.8 - Logical Replication Restrictions
 881 - PostgreSQL Documentation: `pg_recvlogical` man page
 882 - PostgreSQL Documentation: `pg_dump` man page
 883 - PostgreSQL Documentation: `pg_dumpall` man page