1 # pg_scribe Command Line Interface
3 This document describes the command line interface for **pg_scribe**, an incremental SQL backup system for PostgreSQL.
7 pg_scribe uses action flags similar to other PostgreSQL tools like `pg_recvlogical`. The tool performs one primary action per invocation, specified by action flags.
9 The system organizes backups into **chains** - each chain consists of one base backup plus zero or more differential backups. See `doc/file-handling.md` for detailed terminology and file organization.
14 # Initialize backup system (one-time)
15 pg_scribe --init [OPTIONS]
17 # Start streaming incremental backups
18 pg_scribe --start [OPTIONS]
20 # Stop active streaming process
21 pg_scribe --stop [OPTIONS]
23 # Rotate the differential file
24 pg_scribe --rotate-diff [OPTIONS]
26 # Create new chain with fresh base backup
27 pg_scribe --new-chain [OPTIONS]
29 # Restore from backups
30 pg_scribe --restore [OPTIONS]
32 # Check replication slot status
33 pg_scribe --status [OPTIONS]
38 These options apply to all actions:
40 ### Connection Options
43 -d, --dbname=DBNAME Database name (can be a connection string)
44 -h, --host=HOSTNAME Database server host (default: localhost)
45 -p, --port=PORT Database server port (default: 5432)
46 -U, --username=USERNAME Database user (default: $PGUSER or $USER)
47 -w, --no-password Never prompt for password
48 -W, --password Force password prompt
54 -v, --verbose Enable verbose mode
55 -V, --version Print version and exit
56 -?, --help Show help and exit
61 Exactly one of the following action flags must be specified:
65 Initialize the backup system by creating a replication slot, setting up DDL capture via the wal2sql extension, and taking an initial base backup in the first chain.
67 **This is a one-time initialization operation** - following PostgreSQL conventions (`initdb`, `pg_basebackup`), it requires an empty backup directory and will fail if already initialized. Use `--if-not-exists` to make initialization idempotent (succeeds silently if already initialized). If initialization fails, partial state is automatically cleaned up (replication slot dropped, partial backup files removed).
69 **Additional options:**
72 --backup-dir=DIRECTORY Backup output directory (required)
73 --slot=SLOTNAME Replication slot name (default: pg_scribe)
74 --if-not-exists Do not error if already initialized (idempotent)
75 --force Skip validation and force initialization (dangerous!)
76 --compress=METHOD Compression method: gzip, lz4, zstd, or none
77 Can include level (e.g., zstd:9) (default: none)
82 **Phase 1: Validation** (runs first, can fail)
84 1. **CRITICAL Checks** (must pass or initialization fails):
85 - Verify `wal_level = logical`
86 - Verify `max_replication_slots >= 1`
87 - Verify `max_wal_senders >= 1`
88 - Check all tables have adequate replica identity (PRIMARY KEY, USING INDEX, or FULL)
90 2. **Coverage Warnings** (non-blocking, informational):
91 - List unlogged tables (will not be backed up)
92 - Check for large objects (not incrementally backed up)
94 **Phase 2: Setup** (only runs if validation passes or `--force` used)
96 1. Verifies backup directory is empty (or doesn't exist)
97 2. Creates the wal2sql extension if it doesn't exist (`CREATE EXTENSION IF NOT EXISTS wal2sql;`)
98 - This automatically installs the DDL event trigger
99 3. Creates a logical replication slot (fails if slot already exists)
100 4. Generates chain ID from current timestamp (e.g., `20231215T120000Z`)
101 5. Creates first chain directory: `chain-{CHAIN_ID}/`
102 6. Takes synchronized base backup using the slot's snapshot → `chain-{CHAIN_ID}/base.sql`
103 7. Creates `pg_dumpall --globals-only` backup → `chain-{CHAIN_ID}/globals.sql`
104 8. Generates metadata file → `chain-{CHAIN_ID}/metadata.json` (PostgreSQL version, extensions, encoding)
105 9. Applies compression if `--compress` specified
110 pg_scribe --init -d mydb --backup-dir /backups/mydb --slot mydb_backup
113 **Output should convey:**
115 - **Validation results** with clear pass/fail status for:
116 - PostgreSQL configuration (wal_level, max_replication_slots, max_wal_senders)
117 - Replica identity for all tables
118 - Coverage warnings (unlogged tables, large objects)
119 - **If validation fails**: List CRITICAL issues with specific fix commands, then exit with error code 5 (unless `--force` used)
120 - **If validation passes** (or `--force`): Progress through setup steps (extension creation, slot creation with snapshot ID, chain creation, backup paths)
121 - **Final status**: Success message with chain ID created
127 ✓ Replication slot 'mydb_backup' created at LSN 0/1234567
128 ✓ Initial chain created: 20231215T120000Z
129 Location: /backups/mydb/chain-20231215T120000Z/
131 Next step: Start streaming with:
132 pg_scribe --start --backup-dir /backups/mydb
139 Start streaming incremental backups continuously to the latest chain in the backup directory.
141 **Additional options:**
144 --backup-dir=DIRECTORY Backup directory (required)
145 --slot=SLOTNAME Replication slot name (default: pg_scribe)
146 -s, --status-interval=SECS Status update interval in seconds (default: 10)
147 -F, --fsync-interval=SECS Fsync interval in seconds (default: 10, 0 to disable)
152 1. Validates database connection and replication slot
153 2. Auto-detects latest chain (newest by timestamp)
154 3. Checks for existing streaming process (via `.pg_scribe.pid`)
155 4. Creates top-level pidfile: `{backup-dir}/.pg_scribe.pid`
156 5. Uses `exec` to replace itself with `pg_recvlogical`, which then:
157 - Streams decoded changes using the wal2sql plugin
158 - Writes SQL to `chain-{CHAIN_ID}/active.sql`
159 - Periodically fsyncs output file for crash safety
160 - Reports LSN position and lag to stderr
161 - Responds to SIGHUP by closing and reopening output file (for log rotation)
163 **Implementation:** `pg_scribe --start` is a thin wrapper that validates prerequisites and uses `exec` to become `pg_recvlogical`. This design has several advantages:
164 - **No orphaned processes**: pg_scribe becomes pg_recvlogical (same PID), eliminating parent-child complexity
165 - **Direct signal handling**: SIGHUP, SIGTERM, SIGINT go directly to pg_recvlogical without forwarding
166 - **Simpler code**: No need for signal forwarding, child process tracking, or wait loops
167 - **Reliable cleanup**: Tests and process managers interact with a single process
169 **Log rotation support:** File rotation is provided automatically by `pg_recvlogical`:
171 1. Rename the current output file
172 2. Send SIGHUP to the pg_scribe/pg_recvlogical process
173 3. pg_recvlogical closes the old file and opens a new one with the same name
178 # Stream to latest chain (foreground)
179 pg_scribe --start -d mydb --backup-dir /backups/mydb --slot mydb_backup
181 # Run as background daemon (redirect stderr to log file)
182 pg_scribe --start -d mydb --backup-dir /backups/mydb --slot mydb_backup \
183 2>/var/log/pg_scribe.log &
185 # Stream to stdout (for processing with other tools) - NOT SUPPORTED in chain mode
186 # Use --backup-dir; active.sql will be created in latest chain
189 **Output (to stderr):**
191 ✓ Found latest chain: 20231215T120000Z
192 ✓ Starting replication from slot 'mydb_backup'
193 ✓ Writing to: /backups/mydb/chain-20231215T120000Z/active.sql
194 [periodic LSN updates...]
197 **Note:** Only one `--start` process can run per backup directory at a time.
203 Stop the active streaming process gracefully.
205 **Additional options:**
208 --backup-dir=DIRECTORY Backup directory containing active process (required)
213 1. Locates the pidfile at `{backup-dir}/.pg_scribe.pid`
214 2. Validates the PID exists and is a `pg_recvlogical` process
215 3. Sends SIGTERM to the process for graceful shutdown
216 4. Waits up to 30 seconds for the process to stop
217 5. Sends SIGKILL if the process doesn't stop gracefully
218 6. Removes the pidfile
223 pg_scribe --stop --backup-dir /backups/mydb
228 ✓ Found pg_recvlogical process (PID 12345)
229 ✓ Active chain: 20231215T120000Z
230 ✓ Sending SIGTERM to process 12345...
231 ✓ Waiting for process to stop...
232 ✓ Streaming process stopped
235 **Use case:** Clean shutdown before maintenance, before transitioning to a new chain manually, or when stopping backups temporarily.
237 **Edge cases handled:**
238 - Stale pidfile (process doesn't exist) - removes pidfile and exits successfully
239 - Wrong process (not pg_recvlogical) - refuses to stop and exits with error
240 - Process doesn't respond to SIGTERM - sends SIGKILL after timeout
246 Rotate the differential file within the active chain, sealing the current differential and starting a new one (similar to log rotation).
248 **Additional options:**
251 --backup-dir=DIRECTORY Backup directory (required)
256 1. Finds the active streaming process via `.pg_scribe.pid`
257 2. Validates process exists and is pg_recvlogical
258 3. Locates the active chain (finds `active.sql`)
259 4. Generates new differential name: `diff-{TIMESTAMP}.sql`
260 5. **Atomic rotation:**
261 - Rename `active.sql` → `diff-{TIMESTAMP}.sql`
262 - Send SIGHUP to pg_recvlogical process
263 - Wait for new `active.sql` to appear (timeout after 30s)
264 - Validate new file is being written to
269 pg_scribe --rotate-diff --backup-dir /backups/mydb
274 ✓ Found active chain: 20231215T120000Z
275 ✓ Rotated differential: diff-20231216T083000Z.sql (2.1 GB)
276 ✓ New differential started
279 **Use case:** Daily or weekly differential rotation to keep individual files manageable.
281 **Edge cases handled:**
282 - Stale pidfile (process doesn't exist)
283 - Wrong process (not pg_recvlogical)
284 - SIGHUP timeout (new file doesn't appear)
290 Create a new chain with a fresh base backup. This operation takes a new base backup while streaming continues to the old chain.
292 **Additional options:**
295 --backup-dir=DIRECTORY Backup directory (required)
296 --start Automatically stop old streaming and start new (one-step transition)
297 --compress=METHOD Compression method: gzip, lz4, zstd, or none
298 Can include level (e.g., zstd:9) (default: none)
303 1. Generates new chain ID from current timestamp
304 2. Creates new chain directory: `chain-{NEW_ID}/`
305 3. **Takes new base backup** (while streaming continues to old chain if active):
306 - `pg_dump` → `chain-{NEW_ID}/base.sql`
307 - `pg_dumpall --globals-only` → `chain-{NEW_ID}/globals.sql`
308 - Metadata → `chain-{NEW_ID}/metadata.json`
309 - Apply compression if specified
310 4. **If `--start` flag specified:**
311 - Stops old streaming process (sends SIGTERM, waits up to 30s)
312 - Starts streaming to new chain using `exec pg_recvlogical`
313 - Process continues running in foreground (or background if run with `&`)
314 5. **If `--start` NOT specified:**
315 - Reports new chain created
316 - Displays manual transition instructions
321 # Create new chain only (manual transition)
322 pg_scribe --new-chain -d mydb --backup-dir /backups/mydb
324 # Create new chain with automatic transition (recommended)
325 pg_scribe --new-chain --start -d mydb --backup-dir /backups/mydb --compress=zstd:9
328 **Output (without --start):**
330 Taking new base backup...
331 ✓ Base backup complete: chain-20231222T120000Z (12.3 GB)
333 New chain created: 20231222T120000Z
334 To transition to the new chain:
335 1. Stop the current streaming process:
336 pg_scribe --stop -f /backups/mydb
337 2. Start streaming to the new chain:
338 pg_scribe --start -d mydb -f /backups/mydb -S mydb_backup
341 **Output (with --start):**
343 Taking new base backup...
344 ✓ Base backup complete: chain-20231222T120000Z (12.3 GB)
346 Automatic Transition (--start specified)
347 ✓ Stopping old streaming process (PID 12345)...
348 ✓ Old streaming process stopped
350 Starting streaming to new chain: 20231222T120000Z
351 ✓ Output file: /backups/mydb/chain-20231222T120000Z/active.sql
352 [continues streaming in foreground...]
355 **Use case:** Monthly or quarterly base backup refresh to keep chains manageable. Use `--start` for automated transition in production (recommended), or omit it for manual control.
357 **Note on overlap:** Since the base backup is taken while streaming continues, there will be some overlap between the old chain's final differential and the new chain's base backup. This is acceptable and simplifies the design. Restore operates on one chain at a time, so overlap doesn't cause issues.
359 **Note on --start:** When using `--start`, the command requires `-d/--dbname` and other connection options since it will exec into `pg_recvlogical` to begin streaming.
365 Restore database from a chain's base backup plus its differential backups.
367 **Additional options:**
370 --backup-dir=DIRECTORY Backup input directory (required)
371 -d, --dbname=DBNAME Target database name (required)
372 -C, --create Create target database
373 --chain-id=ID Specific chain to restore (default: latest)
374 --include-active Also apply incomplete active.sql (risky!)
375 --up-to=TIMESTAMP Only apply differentials up to this time
376 --no-sync-sequences Skip sequence synchronization
381 1. Determines target chain (latest or specified via `--chain-id`)
382 2. Locates base backup: `chain-{ID}/base.sql`
383 3. Finds all sealed differentials: `chain-{ID}/diff-*.sql` (sorted by timestamp)
384 4. Creates target database (if `--create` specified)
385 5. Restores globals: `chain-{ID}/globals.sql`
386 6. Restores base backup: `chain-{ID}/base.sql`
387 7. Applies sealed differentials in chronological order
388 8. Synchronizes sequences using `setval()` (unless `--no-sync-sequences`)
389 9. Reports statistics
394 # Restore latest chain to new database
395 pg_scribe --restore --backup-dir /backups/mydb -d mydb_restored --create
397 # Restore specific chain
398 pg_scribe --restore --backup-dir /backups/mydb --chain-id 20231215T120000Z \
399 -d mydb_restored --create
401 # Restore with incomplete active.sql (risky!)
402 pg_scribe --restore --backup-dir /backups/mydb -d mydb_restored --create --include-active
407 Restoring from chain: 20231215T120000Z
409 ✓ Restored base backup (1,234,567 rows)
410 ✓ Applied 15 differentials
411 - diff-20231216T083000Z.sql
412 - diff-20231217T083000Z.sql
414 ✓ Synchronized 42 sequences
415 Restore complete (took 3m 42s)
418 **Default behavior:** By default, `active.sql` is **ignored** because it's incomplete and may be inconsistent. Use `--include-active` only if you need the absolute latest changes and accept the risk.
424 Check replication slot health and backup system status.
426 **Additional options:**
429 --slot=SLOTNAME Replication slot name (default: pg_scribe)
430 --backup-dir=DIRECTORY Backup directory to analyze (optional)
435 1. Queries `pg_replication_slots` for slot health
436 2. Shows replication lag and WAL retention
437 3. Analyzes backup directory if provided:
438 - Lists all chains with timestamps
439 - Shows differential counts and sizes
440 - Indicates which chain is actively streaming
441 4. Reports warnings about potential issues
446 pg_scribe --status -d mydb --slot mydb_backup --backup-dir /backups/mydb
451 Replication Slot: mydb_backup
453 Current LSN: 0/9876543
454 Confirmed LSN: 0/9876540
458 Backup Directory: /backups/mydb
461 chain-20231215T120000Z
462 Base: 2023-12-15 12:00:00 (10.2 GB)
463 Differentials: 15 files (2.1 GB total)
466 chain-20231222T120000Z (ACTIVE - streaming)
467 Base: 2023-12-22 12:00:00 (10.5 GB)
468 Differentials: 3 files (1.8 GB total)
469 Last activity: 2 minutes ago
472 Total backup size: 24.1 GB
477 ## Chain-Based File Structure
479 The backup directory contains multiple chain subdirectories, each representing a complete backup lineage:
483 .pg_scribe.pid # Single pidfile for backup directory
484 chain-20231215T120000Z/ # First chain (sealed)
485 base.sql # Base backup (pg_dump)
486 globals.sql # Roles, tablespaces (pg_dumpall)
487 metadata.json # PG version, extensions, etc.
488 diff-20231216T083000Z.sql # Sealed differential
489 diff-20231217T083000Z.sql # Sealed differential
491 chain-20231222T120000Z/ # Second chain (active)
495 active.sql # Currently streaming (incomplete)
499 - **Chain ID**: Timestamp of base backup creation (e.g., `20231215T120000Z`)
500 - **Base backup**: Complete database snapshot at chain creation time
501 - **Active differential**: Currently streaming changes (incomplete, named `active.sql`)
502 - **Sealed differential**: Completed differential (immutable, named `diff-{TIMESTAMP}.sql`)
503 - **One active chain**: Only one chain receives streaming changes at a time
504 - **One pidfile**: Single `.pg_scribe.pid` tracks the active streaming process
506 See `doc/file-handling.md` for comprehensive details on file organization and terminology.
512 Chain management uses standard Unix tools:
518 # View chain contents
519 ls -lh /backups/mydb/chain-20231215T120000Z/
524 # Remove old chain (manual cleanup)
525 rm -rf /backups/mydb/chain-20231215T120000Z/
527 # Check if streaming is active
528 if [ -f /backups/mydb/.pg_scribe.pid ]; then
529 cat /backups/mydb/.pg_scribe.pid
530 ps -p $(cat /backups/mydb/.pg_scribe.pid)
534 **Cleanup strategy**: Old chains accumulate over time. Removal is manual and left to the user (or external scripts) to maintain full control over retention policies.
543 2 Database connection error
544 3 Replication slot error
545 4 Backup/restore error
546 5 Invalid arguments or validation failure (--init without --force)
547 10 Warning conditions (--status only)
552 ## Environment Variables
554 Standard PostgreSQL environment variables are supported:
559 PGDATABASE Database name
561 PGPASSWORD Database password (not recommended, use .pgpass instead)
562 PG_COLOR Use color in diagnostics: always, auto, or never
569 ### Initial Setup and Daily Operation
572 # 1. Initialize backup system (one-time setup, requires empty directory)
573 pg_scribe --init -d production --backup-dir /backups/production --slot prod_backup
575 # 2. Start streaming backups (run as daemon/service)
576 pg_scribe --start -d production --backup-dir /backups/production --slot prod_backup \
577 2>/var/log/pg_scribe.log &
579 # 3. Daily differential rotation (cron: 0 2 * * *)
580 pg_scribe --rotate-diff --backup-dir /backups/production
582 # 4. Monthly new chain with automatic transition (cron: 0 3 1 * *)
583 # This stops old streaming and starts new in one command
584 pg_scribe --new-chain --start -d production --backup-dir /backups/production \
585 --slot prod_backup --compress=zstd:9 2>/var/log/pg_scribe.log &
587 # 5. Monitor slot health (cron: */15 * * * *)
588 pg_scribe --status -d production --slot prod_backup --backup-dir /backups/production
590 # Alternative: Manual transition (if preferred)
591 # 4a. Create new chain
592 pg_scribe --new-chain -d production --backup-dir /backups/production --compress=zstd:9
593 # 4b. Stop old streaming
594 pg_scribe --stop --backup-dir /backups/production
595 # 4c. Start streaming to new chain
596 pg_scribe --start -d production --backup-dir /backups/production --slot prod_backup \
597 2>/var/log/pg_scribe.log &
600 ### Disaster Recovery
603 # 1. Check available backups
604 pg_scribe --status --backup-dir /backups/production
606 # 2. Restore latest chain to new database
607 pg_scribe --restore --backup-dir /backups/production -d production_restored --create
609 # 3. Restore specific older chain
610 pg_scribe --restore --backup-dir /backups/production --chain-id 20231215T120000Z \
611 -d production_december --create
613 # 4. Test restored database
614 psql -d production_restored -c "SELECT COUNT(*) FROM users;"
616 # 5. Switch application to restored database (manual step)
621 ## Comparison with PostgreSQL Tools
623 pg_scribe follows the same CLI conventions as core PostgreSQL tools:
625 | Tool | Action Flags | Connection Options | File Options |
626 |------|-------------|-------------------|--------------|
627 | **pg_recvlogical** | `--create-slot`, `--drop-slot`, `--start` | `-d`, `-h`, `-p`, `-U` | `-f` (output file), `--slot` (slot) |
628 | **pg_dump** | (positional dbname) | `-d`, `-h`, `-p`, `-U` | `-f` (output file) |
629 | **pg_basebackup** | (none) | `-d`, `-h`, `-p`, `-U` | `-D` (data directory!) |
630 | **pg_restore** | (none) | `-d`, `-h`, `-p`, `-U` | positional (archive file) |
631 | **pg_scribe** | `--init`, `--start`, `--rotate-diff`, `--new-chain`, `--restore`, `--status` | `-d`, `-h`, `-p`, `-U` | `--backup-dir` (directory), `--slot` (slot) |
633 **Note:** `pg_basebackup` uses `-D/--pgdata` because it creates an actual PostgreSQL data directory cluster. `pg_scribe` uses `--backup-dir` because it manages a collection of chains with SQL backup files.
637 ## Implementation Notes
639 ### Design Principles
641 1. **Consistent with PostgreSQL**: Follow exact same conventions as `pg_recvlogical`, `pg_dump`, `initdb`, `pg_basebackup`
642 2. **Fail-fast initialization**: `--init` requires empty directory, cleans up on failure (like `initdb` and `pg_basebackup`)
643 3. **Clear output**: Progress to stderr, data to stdout/files
644 4. **Scriptable**: Clear text output format, proper exit codes
645 5. **Opinionated file structure**: Chains provide consistent organization
647 ### Technology Choices (POC)
649 - **Language**: Bash or Python
650 - Bash: Minimal dependencies, matches PostgreSQL tool style
651 - Python: Better error handling, easier testing
652 - **Dependencies**: Only PostgreSQL client tools (pg_recvlogical, pg_dump, pg_dumpall, psql)
654 ### Key Implementation Components
656 1. **Connection management**: Use libpq environment variables, .pgpass
657 2. **Error handling**: Validate prerequisites before starting operations
658 3. **Chain management**:
659 - Auto-detect latest chain for `--start`
660 - Single pidfile at `{backup-dir}/.pg_scribe.pid`
661 - Chain ID format: `YYYYMMDDTHHMMSSZ` (ISO 8601, UTC, sortable)
662 4. **Signal handling**:
663 - SIGTERM/SIGINT for graceful shutdown (handled directly by pg_recvlogical after exec)
664 - SIGHUP for file rotation (handled directly by pg_recvlogical after exec)
665 5. **Output conventions**:
666 - Progress and status messages → stderr
667 - SQL output → `active.sql` in chain directory
668 6. **Process management for --start**:
669 - Validate prerequisites (connection, replication slot, no existing process)
670 - Display configuration to stderr
671 - Use `exec` to replace pg_scribe with `pg_recvlogical`
672 - Benefits: No orphaned processes, direct signal handling, simpler code, same PID
674 **Extensibility Note**: This POC design uses a modular, action-based CLI that can accommodate additional features and options in future versions without breaking compatibility.