# pg_scribe Command Line Interface This document describes the command line interface for **pg_scribe**, an incremental SQL backup system for PostgreSQL. ## Overview pg_scribe uses action flags similar to other PostgreSQL tools like `pg_recvlogical`. The tool performs one primary action per invocation, specified by action flags. ## Synopsis ```bash # Initialize backup system pg_scribe --init [OPTIONS] # Start streaming incremental backups pg_scribe --start [OPTIONS] # Take a full backup pg_scribe --full-backup [OPTIONS] # Restore from backups pg_scribe --restore [OPTIONS] # Check replication slot status pg_scribe --status [OPTIONS] ``` ## Common Options These options apply to all actions: ### Connection Options ``` -d, --dbname=DBNAME Database name (can be a connection string) -h, --host=HOSTNAME Database server host (default: localhost) -p, --port=PORT Database server port (default: 5432) -U, --username=USERNAME Database user (default: $PGUSER or $USER) -w, --no-password Never prompt for password -W, --password Force password prompt ``` ### General Options ``` -v, --verbose Enable verbose mode -V, --version Print version and exit -?, --help Show help and exit ``` ## Action Flags Exactly one of the following action flags must be specified: ### `--init` Initialize the backup system by creating a replication slot, setting up DDL capture via the wal2sql extension, and taking an initial base backup. **This is a one-time initialization operation** - following PostgreSQL conventions (`initdb`, `pg_basebackup`), it requires an empty backup directory and will fail if already initialized. If initialization fails, partial state is automatically cleaned up (replication slot dropped, partial backup files removed). **Additional options:** ``` -f, --file=DIRECTORY Backup output directory (required) -S, --slot=SLOTNAME Replication slot name (default: pg_scribe) --force Skip validation and force initialization (dangerous!) ``` **What it does:** **Phase 1: Validation** (runs first, can fail) 1. **CRITICAL Checks** (must pass or initialization fails): - Verify `wal_level = logical` - Verify `max_replication_slots >= 1` - Verify `max_wal_senders >= 1` - Check all tables have adequate replica identity (PRIMARY KEY, USING INDEX, or FULL) 2. **Coverage Warnings** (non-blocking, informational): - List unlogged tables (will not be backed up) - Check for large objects (not incrementally backed up) **Phase 2: Setup** (only runs if validation passes or `--force` used) 1. Verifies backup directory is empty (or doesn't exist) 2. Creates the wal2sql extension if it doesn't exist (`CREATE EXTENSION IF NOT EXISTS wal2sql;`) - This automatically installs the DDL event trigger 3. Creates a logical replication slot (fails if slot already exists) 4. Takes synchronized base backup using `pg_dump` 5. Creates initial `pg_dumpall --globals-only` backup 6. Generates metadata file with PostgreSQL version, extensions, encoding **Example:** ```bash pg_scribe --init -d mydb -f /backups/mydb -S mydb_backup ``` **Output should convey:** - **Validation results** with clear pass/fail status for: - PostgreSQL configuration (wal_level, max_replication_slots, max_wal_senders) - Replica identity for all tables - Coverage warnings (unlogged tables, large objects) - **If validation fails**: List CRITICAL issues with specific fix commands, then exit with error code 5 (unless `--force` used) - **If validation passes** (or `--force`): Progress through setup steps (extension creation, slot creation with snapshot ID, backup paths) - **Final status**: Success message or warning if forced past validation failures --- ### `--start` Start streaming incremental backups continuously from a replication slot. **Additional options:** ``` -f, --file=FILENAME Output file (use '-' for stdout, required) -S, --slot=SLOTNAME Replication slot name (default: pg_scribe) -s, --status-interval=SECS Status update interval in seconds (default: 10) -F, --fsync-interval=SECS Fsync interval in seconds (default: 10, 0 to disable) ``` **What it does:** 1. Validates database connection and replication slot 2. Displays configuration (database, slot name, output file, intervals) 3. Uses `exec` to replace itself with `pg_recvlogical`, which then: - Streams decoded changes using the wal2sql plugin - Writes SQL to output file - Periodically fsyncs output file for crash safety - Reports LSN position and lag to stderr - Responds to SIGHUP by closing and reopening output file (for log rotation) **Implementation:** `pg_scribe --start` is a thin wrapper that validates prerequisites and uses `exec` to become `pg_recvlogical`. This design has several advantages: - **No orphaned processes**: pg_scribe becomes pg_recvlogical (same PID), eliminating parent-child complexity - **Direct signal handling**: SIGHUP, SIGTERM, SIGINT go directly to pg_recvlogical without forwarding - **Simpler code**: No need for signal forwarding, child process tracking, or wait loops - **Reliable cleanup**: Tests and process managers interact with a single process **Log rotation support:** File rotation is provided automatically by `pg_recvlogical`: 1. Rename the current output file 2. Send SIGHUP to the pg_scribe/pg_recvlogical process 3. pg_recvlogical closes the old file and opens a new one with the same name **Example:** ```bash # Stream to a file (foreground) pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup # Run as background daemon (redirect stderr to log file) pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup \ 2>/var/log/pg_scribe.log & # Stream to stdout (for processing with other tools) pg_scribe --start -d mydb -f - -S mydb_backup > /backups/mydb/incremental.sql ``` **Log rotation:** ```bash #!/bin/bash BACKUP_FILE="/backups/mydb/incremental.sql" mv "$BACKUP_FILE" "$BACKUP_FILE.$(date +%Y%m%d)" killall -SIGHUP pg_scribe ``` **Output (to stderr):** Connection status, output file path, periodic LSN position and lag, file rotation events --- ### `--full-backup` Take a full backup using `pg_dump` and `pg_dumpall --globals-only`. **Additional options:** ``` -f, --file=DIRECTORY Backup output directory (required) -Z, --compress=METHOD Compression method: gzip, lz4, zstd, or none Can include level (e.g., zstd:9) (default: gzip) ``` **What it does:** 1. Takes full `pg_dump` backup 2. Takes `pg_dumpall --globals-only` backup 3. Generates metadata file 4. Compresses backups if enabled **Example:** ```bash pg_scribe --full-backup -d mydb -f /backups/mydb pg_scribe --full-backup -d mydb -f /backups/mydb --compress=zstd:9 ``` **Output should convey:** Progress for pg_dump and pg_dumpall steps, file paths, backup size (compressed and uncompressed if applicable) --- ### `--restore` Restore database from base backup plus incremental backups. **Additional options:** ``` -f, --file=DIRECTORY Backup input directory (required) -d, --dbname=DBNAME Target database name (required) -C, --create Create target database --base-backup=FILENAME Specific base backup file (default: latest) --no-sync-sequences Skip sequence synchronization ``` **What it does:** 1. Locates base backup (latest or specified) 2. Finds all incremental backups since base 3. Creates target database (if `--create` specified) 4. Restores globals (roles, tablespaces) 5. Restores base backup 6. Applies incremental backups in chronological order 7. Synchronizes sequences using `setval()` (unless `--no-sync-sequences`) 8. Reports basic statistics **Example:** ```bash # Restore latest to new database pg_scribe --restore -f /backups/mydb -d mydb_restored --create # Restore specific base backup pg_scribe --restore -f /backups/mydb -d mydb_restored \ --base-backup=/backups/mydb/base-2024-01-10.sql ``` **Output should convey:** Base backup identified, number of incremental backups to apply, progress through each restore phase (globals, base, incrementals with counts), sequence synchronization details, final statistics (duration, row counts) --- ### `--status` Check replication slot health and backup system status. **Additional options:** ``` -S, --slot=SLOTNAME Replication slot name (default: pg_scribe) -f, --file=DIRECTORY Backup directory to analyze (optional) ``` **What it does:** 1. Queries `pg_replication_slots` for slot health 2. Shows replication lag and WAL retention 3. Analyzes backup directory if provided 4. Reports warnings about potential issues **Example:** ```bash pg_scribe --status -d mydb -S mydb_backup -f /backups/mydb ``` **Output should convey:** Replication slot details (active status, LSN positions, lag, WAL retention, age), backup directory analysis (base backup count and dates, incremental file count and size, last backup timestamp, coverage continuity), overall health status with warnings if applicable --- ## Exit Status ``` 0 Success 1 General error 2 Database connection error 3 Replication slot error 4 Backup/restore error 5 Invalid arguments or validation failure (--init without --force) 10 Warning conditions (--status only) ``` --- ## Environment Variables Standard PostgreSQL environment variables are supported: ``` PGHOST Database host PGPORT Database port PGDATABASE Database name PGUSER Database user PGPASSWORD Database password (not recommended, use .pgpass instead) PG_COLOR Use color in diagnostics: always, auto, or never ``` --- ## Example Workflows ### Initial Setup and Daily Operation ```bash # 1. Initialize backup system (one-time setup, requires empty directory) pg_scribe --init -d production -f /backups/production -S prod_backup # 2. Start streaming backups (run as daemon/service) pg_scribe --start -d production -f /backups/production/incremental.sql \ -S prod_backup 2>/var/log/pg_scribe.log & # 3. Set up log rotation (logrotate or custom script) cat > /etc/logrotate.d/pg_scribe <