1 # pg_scribe Command Line Interface
3 This document describes the command line interface for **pg_scribe**, an incremental SQL backup system for PostgreSQL.
7 pg_scribe uses action flags similar to other PostgreSQL tools like `pg_recvlogical`. The tool performs one primary action per invocation, specified by action flags.
12 # Initialize backup system
13 pg_scribe --init [OPTIONS]
15 # Start streaming incremental backups
16 pg_scribe --start [OPTIONS]
19 pg_scribe --full-backup [OPTIONS]
21 # Restore from backups
22 pg_scribe --restore [OPTIONS]
24 # Check replication slot status
25 pg_scribe --status [OPTIONS]
30 These options apply to all actions:
32 ### Connection Options
35 -d, --dbname=DBNAME Database name (can be a connection string)
36 -h, --host=HOSTNAME Database server host (default: localhost)
37 -p, --port=PORT Database server port (default: 5432)
38 -U, --username=USERNAME Database user (default: $PGUSER or $USER)
39 -w, --no-password Never prompt for password
40 -W, --password Force password prompt
46 -v, --verbose Enable verbose mode
47 -V, --version Print version and exit
48 -?, --help Show help and exit
53 Exactly one of the following action flags must be specified:
57 Initialize the backup system by creating a replication slot, setting up DDL capture via the wal2sql extension, and taking an initial base backup.
59 **This is a one-time initialization operation** - following PostgreSQL conventions (`initdb`, `pg_basebackup`), it requires an empty backup directory and will fail if already initialized. If initialization fails, partial state is automatically cleaned up (replication slot dropped, partial backup files removed).
61 **Additional options:**
64 -f, --file=DIRECTORY Backup output directory (required)
65 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
66 --force Skip validation and force initialization (dangerous!)
71 **Phase 1: Validation** (runs first, can fail)
73 1. **CRITICAL Checks** (must pass or initialization fails):
74 - Verify `wal_level = logical`
75 - Verify `max_replication_slots >= 1`
76 - Verify `max_wal_senders >= 1`
77 - Check all tables have adequate replica identity (PRIMARY KEY, USING INDEX, or FULL)
79 2. **Coverage Warnings** (non-blocking, informational):
80 - List unlogged tables (will not be backed up)
81 - Check for large objects (not incrementally backed up)
83 **Phase 2: Setup** (only runs if validation passes or `--force` used)
85 1. Verifies backup directory is empty (or doesn't exist)
86 2. Creates the wal2sql extension if it doesn't exist (`CREATE EXTENSION IF NOT EXISTS wal2sql;`)
87 - This automatically installs the DDL event trigger
88 3. Creates a logical replication slot (fails if slot already exists)
89 4. Takes synchronized base backup using `pg_dump`
90 5. Creates initial `pg_dumpall --globals-only` backup
91 6. Generates metadata file with PostgreSQL version, extensions, encoding
96 pg_scribe --init -d mydb -f /backups/mydb -S mydb_backup
99 **Output should convey:**
101 - **Validation results** with clear pass/fail status for:
102 - PostgreSQL configuration (wal_level, max_replication_slots, max_wal_senders)
103 - Replica identity for all tables
104 - Coverage warnings (unlogged tables, large objects)
105 - **If validation fails**: List CRITICAL issues with specific fix commands, then exit with error code 5 (unless `--force` used)
106 - **If validation passes** (or `--force`): Progress through setup steps (extension creation, slot creation with snapshot ID, backup paths)
107 - **Final status**: Success message or warning if forced past validation failures
113 Start streaming incremental backups continuously from a replication slot.
115 **Additional options:**
118 -f, --file=FILENAME Output file (use '-' for stdout, required)
119 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
120 -s, --status-interval=SECS Status update interval in seconds (default: 10)
121 -F, --fsync-interval=SECS Fsync interval in seconds (default: 10, 0 to disable)
126 1. Wraps `pg_recvlogical` to stream from the specified replication slot
127 2. Streams decoded changes using the wal2sql plugin
128 3. Writes SQL to output file
129 4. Periodically fsyncs output file for crash safety
130 5. Reports LSN position and lag to stderr
131 6. Responds to SIGHUP by closing and reopening output file (for log rotation)
133 **Implementation:** `pg_scribe --start` is a thin wrapper around `pg_recvlogical --start` that:
134 - Invokes `pg_recvlogical` with the `wal2sql` plugin and appropriate options
135 - Forwards signals (SIGHUP, SIGTERM, SIGINT) to the child process
136 - Provides consistent interface with other pg_scribe commands
138 **Log rotation support:** File rotation is provided automatically by `pg_recvlogical`:
140 1. Rename the current output file
141 2. Send SIGHUP to pg_scribe process (which forwards it to pg_recvlogical)
142 3. pg_recvlogical closes the old file and opens a new one with the same name
147 # Stream to a file (foreground)
148 pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup
150 # Run as background daemon (redirect stderr to log file)
151 pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup \
152 2>/var/log/pg_scribe.log &
154 # Stream to stdout (for processing with other tools)
155 pg_scribe --start -d mydb -f - -S mydb_backup > /backups/mydb/incremental.sql
162 BACKUP_FILE="/backups/mydb/incremental.sql"
163 mv "$BACKUP_FILE" "$BACKUP_FILE.$(date +%Y%m%d)"
164 killall -SIGHUP pg_scribe
167 **Output (to stderr):** Connection status, output file path, periodic LSN position and lag, file rotation events
173 Take a full backup using `pg_dump` and `pg_dumpall --globals-only`.
175 **Additional options:**
178 -f, --file=DIRECTORY Backup output directory (required)
179 -Z, --compress=METHOD Compression method: gzip, lz4, zstd, or none
180 Can include level (e.g., zstd:9) (default: zstd:9)
185 1. Takes full `pg_dump` backup
186 2. Takes `pg_dumpall --globals-only` backup
187 3. Generates metadata file
188 4. Compresses backups if enabled
193 pg_scribe --full-backup -d mydb -f /backups/mydb
194 pg_scribe --full-backup -d mydb -f /backups/mydb --compress=zstd:9
197 **Output should convey:** Progress for pg_dump and pg_dumpall steps, file paths, backup size (compressed and uncompressed if applicable)
203 Restore database from base backup plus incremental backups.
205 **Additional options:**
208 -f, --file=DIRECTORY Backup input directory (required)
209 -d, --dbname=DBNAME Target database name (required)
210 -C, --create Create target database
211 --base-backup=FILENAME Specific base backup file (default: latest)
212 --no-sync-sequences Skip sequence synchronization
217 1. Locates base backup (latest or specified)
218 2. Finds all incremental backups since base
219 3. Creates target database (if `--create` specified)
220 4. Restores globals (roles, tablespaces)
221 5. Restores base backup
222 6. Applies incremental backups in chronological order
223 7. Synchronizes sequences using `setval()` (unless `--no-sync-sequences`)
224 8. Reports basic statistics
229 # Restore latest to new database
230 pg_scribe --restore -f /backups/mydb -d mydb_restored --create
232 # Restore specific base backup
233 pg_scribe --restore -f /backups/mydb -d mydb_restored \
234 --base-backup=/backups/mydb/base-2024-01-10.sql
237 **Output should convey:** Base backup identified, number of incremental backups to apply, progress through each restore phase (globals, base, incrementals with counts), sequence synchronization details, final statistics (duration, row counts)
243 Check replication slot health and backup system status.
245 **Additional options:**
248 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
249 -f, --file=DIRECTORY Backup directory to analyze (optional)
254 1. Queries `pg_replication_slots` for slot health
255 2. Shows replication lag and WAL retention
256 3. Analyzes backup directory if provided
257 4. Reports warnings about potential issues
262 pg_scribe --status -d mydb -S mydb_backup -f /backups/mydb
265 **Output should convey:** Replication slot details (active status, LSN positions, lag, WAL retention, age), backup directory analysis (base backup count and dates, incremental file count and size, last backup timestamp, coverage continuity), overall health status with warnings if applicable
274 2 Database connection error
275 3 Replication slot error
276 4 Backup/restore error
277 5 Invalid arguments or validation failure (--init without --force)
278 10 Warning conditions (--status only)
283 ## Environment Variables
285 Standard PostgreSQL environment variables are supported:
290 PGDATABASE Database name
292 PGPASSWORD Database password (not recommended, use .pgpass instead)
293 PG_COLOR Use color in diagnostics: always, auto, or never
300 ### Initial Setup and Daily Operation
303 # 1. Initialize backup system (one-time setup, requires empty directory)
304 pg_scribe --init -d production -f /backups/production -S prod_backup
306 # 2. Start streaming backups (run as daemon/service)
307 pg_scribe --start -d production -f /backups/production/incremental.sql \
308 -S prod_backup 2>/var/log/pg_scribe.log &
310 # 3. Set up log rotation (logrotate or custom script)
311 cat > /etc/logrotate.d/pg_scribe <<EOF
312 /backups/production/incremental.sql {
317 killall -SIGHUP pg_scribe
322 # 4. Schedule daily full backups (cron: 0 2 * * *)
323 pg_scribe --full-backup -d production -f /backups/production
325 # 5. Monitor slot health (cron: */15 * * * *)
326 pg_scribe --status -d production -S prod_backup
329 ### Disaster Recovery
332 # 1. Check available backups
333 pg_scribe --status -f /backups/production
335 # 2. Restore to new database
336 pg_scribe --restore -f /backups/production \
337 -d production_restored --create
339 # 3. Test restored database
340 psql -d production_restored -c "SELECT COUNT(*) FROM users;"
342 # 4. Switch application to restored database (manual step)
347 ## Comparison with PostgreSQL Tools
349 pg_scribe follows the same CLI conventions as core PostgreSQL tools:
351 | Tool | Action Flags | Connection Options | File Options |
352 |------|-------------|-------------------|--------------|
353 | **pg_recvlogical** | `--create-slot`, `--drop-slot`, `--start` | `-d`, `-h`, `-p`, `-U` | `-f` (output file), `-S` (slot) |
354 | **pg_dump** | (positional dbname) | `-d`, `-h`, `-p`, `-U` | `-f` (output file) |
355 | **pg_basebackup** | (none) | `-d`, `-h`, `-p`, `-U` | `-D` (data directory!) |
356 | **pg_restore** | (none) | `-d`, `-h`, `-p`, `-U` | positional (archive file) |
357 | **pg_scribe** | `--init`, `--start`, `--full-backup`, `--restore`, `--status` | `-d`, `-h`, `-p`, `-U` | `-f` (file/directory), `-S` (slot) |
359 **Note:** `pg_basebackup` uses `-D/--pgdata` because it creates an actual PostgreSQL data directory cluster. `pg_scribe` uses `-f/--file` like `pg_dump` and `pg_recvlogical` because it creates backup files.
363 ## Implementation Notes
365 ### Design Principles
367 1. **Consistent with PostgreSQL**: Follow exact same conventions as `pg_recvlogical`, `pg_dump`, `initdb`, `pg_basebackup`
368 2. **Fail-fast initialization**: `--init` requires empty directory, cleans up on failure (like `initdb` and `pg_basebackup`)
369 3. **Clear output**: Progress to stderr, data to stdout (when using `-f -`)
370 4. **Scriptable**: Clear text output format, proper exit codes
372 ### Technology Choices (POC)
374 - **Language**: Bash or Python
375 - Bash: Minimal dependencies, matches PostgreSQL tool style
376 - Python: Better error handling, easier testing
377 - **Dependencies**: Only PostgreSQL client tools (pg_recvlogical, pg_dump, pg_dumpall, psql)
379 ### Key Implementation Components
381 1. **Connection management**: Use libpq environment variables, .pgpass
382 2. **Error handling**: Validate prerequisites before starting operations
383 3. **File management**:
384 - For `--start`: Invoke `pg_recvlogical` with file specified by `-f`
385 - For `--full-backup`: Write to directory specified by `-f`
386 - For `--restore`: Read from directory specified by `-f`
387 4. **Signal handling**:
388 - SIGTERM/SIGINT for graceful shutdown (forward to child processes)
389 - SIGHUP for file rotation (--start only, forwarded to pg_recvlogical)
390 5. **Output conventions**:
391 - Progress and status messages → stderr
392 - SQL output → file specified by `-f` (or stdout if `-f -`)
393 6. **Process management for --start**:
394 - Spawn `pg_recvlogical` as child process
395 - Forward signals to child process
396 - Monitor child exit status
397 - Wrap child output to provide consistent pg_scribe formatting
399 **Extensibility Note**: This POC design uses a modular, action-based CLI that can accommodate additional features and options in future versions without breaking compatibility.