1 # pg_scribe Command Line Interface
3 This document describes the command line interface for **pg_scribe**, an incremental SQL backup system for PostgreSQL.
7 pg_scribe uses action flags similar to other PostgreSQL tools like `pg_recvlogical`. The tool performs one primary action per invocation, specified by action flags.
12 # Initialize backup system
13 pg_scribe --init [OPTIONS]
15 # Start streaming incremental backups
16 pg_scribe --start [OPTIONS]
19 pg_scribe --full-backup [OPTIONS]
21 # Restore from backups
22 pg_scribe --restore [OPTIONS]
24 # Check replication slot status
25 pg_scribe --status [OPTIONS]
30 These options apply to all actions:
32 ### Connection Options
35 -d, --dbname=DBNAME Database name (can be a connection string)
36 -h, --host=HOSTNAME Database server host (default: localhost)
37 -p, --port=PORT Database server port (default: 5432)
38 -U, --username=USERNAME Database user (default: $PGUSER or $USER)
39 -w, --no-password Never prompt for password
40 -W, --password Force password prompt
46 -v, --verbose Enable verbose mode
47 -V, --version Print version and exit
48 -?, --help Show help and exit
53 Exactly one of the following action flags must be specified:
57 Initialize the backup system by creating a replication slot, setting up DDL capture via the wal2sql extension, and taking an initial base backup.
59 **This operation is idempotent** - it can be safely re-run if initialization fails partway through. Existing slots and extensions will not cause errors.
61 **Additional options:**
64 -f, --file=DIRECTORY Backup output directory (required)
65 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
70 1. Creates the wal2sql extension if it doesn't exist (`CREATE EXTENSION IF NOT EXISTS wal2sql;`)
71 - This automatically installs the DDL event trigger
72 2. Creates a logical replication slot with snapshot export (skips if slot exists)
73 3. Takes synchronized base backup using `pg_dump --snapshot`
74 4. Creates initial `pg_dumpall --globals-only` backup
75 5. Generates metadata file with PostgreSQL version, extensions, encoding
76 6. Prints replica identity warnings for tables without primary keys
81 pg_scribe --init -d mydb -f /backups/mydb -S mydb_backup
87 Creating wal2sql extension...
88 Extension 'wal2sql' created (includes DDL event trigger).
89 Creating replication slot 'mydb_backup'...
90 Snapshot exported: 00000003-00000001-1
92 Base backup saved to: /backups/mydb/base-2024-01-15.sql
94 Globals saved to: /backups/mydb/globals-2024-01-15.sql
97 Initialization complete!
100 Table 'public.logs' has no replica identity - UPDATE/DELETE will fail
101 Run: ALTER TABLE public.logs REPLICA IDENTITY FULL;
104 **Output when re-running (idempotent):**
107 Checking wal2sql extension...
108 Extension 'wal2sql' already exists.
109 Checking replication slot 'mydb_backup'...
110 Slot 'mydb_backup' already exists, skipping creation.
111 Taking base backup...
112 Base backup saved to: /backups/mydb/base-2024-01-15.sql
114 Globals saved to: /backups/mydb/globals-2024-01-15.sql
117 Initialization complete!
124 Start streaming incremental backups continuously from a replication slot.
126 **Additional options:**
129 -f, --file=FILENAME Output file (use '-' for stdout, required)
130 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
131 -s, --status-interval=SECS Status update interval in seconds (default: 10)
132 -F, --fsync-interval=SECS Fsync interval in seconds (default: 10, 0 to disable)
133 -n, --no-loop Don't retry on connection failure
138 1. Wraps `pg_recvlogical` to stream from the specified replication slot
139 2. Streams decoded changes using the wal2sql plugin
140 3. Writes SQL to output file
141 4. Periodically fsyncs output file for crash safety
142 5. Reports LSN position and lag to stderr
143 6. Responds to SIGHUP by closing and reopening output file (for log rotation)
145 **Implementation:** `pg_scribe --start` is a thin wrapper around `pg_recvlogical --start` that:
146 - Invokes `pg_recvlogical` with the `wal2sql` plugin and appropriate options
147 - Forwards signals (SIGHUP, SIGTERM, SIGINT) to the child process
148 - Provides consistent interface with other pg_scribe commands
150 **Log rotation support:** File rotation is provided automatically by `pg_recvlogical`:
152 1. Rename the current output file
153 2. Send SIGHUP to pg_scribe process (which forwards it to pg_recvlogical)
154 3. pg_recvlogical closes the old file and opens a new one with the same name
159 # Stream to a file (foreground)
160 pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup
162 # Run as background daemon (redirect stderr to log file)
163 pg_scribe --start -d mydb -f /backups/mydb/incremental.sql -S mydb_backup \
164 2>/var/log/pg_scribe.log &
166 # Stream to stdout (for processing with other tools)
167 pg_scribe --start -d mydb -f - -S mydb_backup > /backups/mydb/incremental.sql
170 **Log rotation script example:**
174 # Rotate incremental backup file
175 BACKUP_FILE="/backups/mydb/incremental.sql"
176 mv "$BACKUP_FILE" "$BACKUP_FILE.$(date +%Y%m%d)"
177 killall -SIGHUP pg_scribe
180 **Output (to stderr):**
182 Progress and status messages are written to stderr (following PostgreSQL tool conventions). When running as a background daemon, redirect stderr to a log file using `2>`.
185 Connecting to replication slot 'mydb_backup'...
186 Streaming to: /backups/mydb/incremental.sql
187 LSN: 0/1A2B3C4D Lag: 128 bytes
188 LSN: 0/1A2B3C8F Lag: 0 bytes
189 File rotated (SIGHUP received)
190 Streaming to: /backups/mydb/incremental.sql
198 Take a full backup using `pg_dump` and `pg_dumpall --globals-only`.
200 **Additional options:**
203 -f, --file=DIRECTORY Backup output directory (required)
204 -Z, --compress=METHOD Compression method: gzip, lz4, zstd, or none
205 Can include level (e.g., zstd:9) (default: zstd:9)
210 1. Takes full `pg_dump` backup
211 2. Takes `pg_dumpall --globals-only` backup
212 3. Generates metadata file
213 4. Compresses backups if enabled
218 pg_scribe --full-backup -d mydb -f /backups/mydb
220 # With specific compression
221 pg_scribe --full-backup -d mydb -f /backups/mydb --compress=zstd:9
227 Taking full backup...
228 Full backup saved to: /backups/mydb/base-2024-01-15.sql.zst
230 Globals saved to: /backups/mydb/globals-2024-01-15.sql
233 Full backup complete!
234 Size: 1.2 GB (compressed from 5.4 GB)
241 Restore database from base backup plus incremental backups.
243 **Additional options:**
246 -f, --file=DIRECTORY Backup input directory (required)
247 -d, --dbname=DBNAME Target database name (required)
248 -C, --create Create target database
249 -c, --clean Drop target database before creating (requires -C)
250 --base-backup=FILENAME Specific base backup file (default: latest)
251 --recovery-target-time=TIME Restore until timestamp (format: YYYY-MM-DD HH:MM:SS)
252 --recovery-target-name=NAME Restore until specific incremental file
253 -j, --jobs=NJOBS Parallel restore jobs (default: 1)
254 --no-sync-sequences Skip sequence synchronization
259 1. Locates base backup (latest or specified)
260 2. Finds all incremental backups since base
261 3. Creates target database (if `--create` specified)
262 4. Restores globals (roles, tablespaces)
263 5. Restores base backup
264 6. Applies incremental backups in chronological order
265 7. Synchronizes sequences using `setval()` (unless `--no-sync-sequences`)
266 8. Reports basic statistics
271 # Restore latest to new database
272 pg_scribe --restore -f /backups/mydb -d mydb_restored --create
274 # Restore to specific point in time
275 pg_scribe --restore -f /backups/mydb -d mydb_restored \
276 --recovery-target-time="2024-01-15 14:30:00"
278 # Restore specific base backup
279 pg_scribe --restore -f /backups/mydb -d mydb_restored \
280 --base-backup=/backups/mydb/base-2024-01-10.sql
286 Located base backup: base-2024-01-10.sql (5.4 GB)
287 Found 5 incremental backups to apply
288 Creating database 'mydb_restored'...
290 Restoring base backup... (5 minutes)
291 Applying incremental backup 1/5: incremental-2024-01-10.sql
292 Applying incremental backup 2/5: incremental-2024-01-11.sql
293 Applying incremental backup 3/5: incremental-2024-01-12.sql
294 Applying incremental backup 4/5: incremental-2024-01-13.sql
295 Applying incremental backup 5/5: incremental-2024-01-14.sql
296 Synchronizing sequences...
297 users_id_seq: set to 12845
298 orders_id_seq: set to 98234
299 products_id_seq: set to 456
303 Rows restored: 1,284,567
310 Check replication slot health and backup system status.
312 **Additional options:**
315 -S, --slot=SLOTNAME Replication slot name (default: pg_scribe)
316 -f, --file=DIRECTORY Backup directory to analyze (optional)
317 --format=FORMAT Output format: text or json (default: text)
322 1. Queries `pg_replication_slots` for slot health
323 2. Shows replication lag and WAL retention
324 3. Analyzes backup directory if provided
325 4. Reports warnings about potential issues
330 pg_scribe --status -d mydb -S mydb_backup -f /backups/mydb
336 Replication Slot: mydb_backup
340 Restart LSN: 0/1A2B3C4D
341 Confirmed LSN: 0/1A2B3C8F
342 Replication lag: 128 bytes
343 WAL retention: 2.1 GB
344 Slot age: 2 days 14 hours
346 Backup Directory: /backups/mydb
347 Base backups: 5 (oldest: 2024-01-01, newest: 2024-01-15)
348 Incremental: 45 files (12.4 GB total)
349 Last backup: 2024-01-15 14:23:01 (2 minutes ago)
350 Coverage: Continuous since 2024-01-01
361 Replication lag is 1.2 GB (threshold: 1.0 GB)
362 Slot has been inactive for 6 hours
363 Last incremental backup is 2 hours old
368 ## Size and Time Specifications
370 Size values accept suffixes:
371 - `K` or `KB` - kilobytes
372 - `M` or `MB` - megabytes
373 - `G` or `GB` - gigabytes
375 Time intervals accept suffixes:
382 - `--compress=zstd:9` (zstd compression, level 9)
383 - `--fsync-interval=10` (10 seconds)
392 2 Database connection error
393 3 Replication slot error
394 4 Backup/restore error
396 10 Warning conditions (--status only)
401 ## Environment Variables
403 Standard PostgreSQL environment variables are supported:
408 PGDATABASE Database name
410 PGPASSWORD Database password (not recommended, use .pgpass instead)
411 PG_COLOR Use color in diagnostics: always, auto, or never
418 ### Initial Setup and Daily Operation
421 # 1. Initialize backup system (idempotent - safe to re-run)
422 pg_scribe --init -d production -f /backups/production -S prod_backup
424 # 2. Start streaming backups (run as daemon/service)
425 pg_scribe --start -d production -f /backups/production/incremental.sql \
426 -S prod_backup 2>/var/log/pg_scribe.log &
428 # 3. Set up log rotation (logrotate or custom script)
429 cat > /etc/logrotate.d/pg_scribe <<EOF
430 /backups/production/incremental.sql {
435 killall -SIGHUP pg_scribe
440 # 4. Schedule daily full backups (cron: 0 2 * * *)
441 pg_scribe --full-backup -d production -f /backups/production
443 # 5. Monitor slot health (cron: */15 * * * *)
444 pg_scribe --status -d production -S prod_backup
447 ### Disaster Recovery
450 # 1. Check available backups
451 pg_scribe --status -f /backups/production
453 # 2. Restore to new database
454 pg_scribe --restore -f /backups/production \
455 -d production_restored --create
457 # 3. Test restored database
458 psql -d production_restored -c "SELECT COUNT(*) FROM users;"
460 # 4. Switch application to restored database (manual step)
463 ### Point-in-Time Recovery
466 # Restore to specific timestamp
467 pg_scribe --restore -f /backups/production \
468 -d production_pit --create \
469 --recovery-target-time="2024-01-15 09:30:00"
474 ## Comparison with PostgreSQL Tools
476 pg_scribe follows the same CLI conventions as core PostgreSQL tools:
478 | Tool | Action Flags | Connection Options | File Options |
479 |------|-------------|-------------------|--------------|
480 | **pg_recvlogical** | `--create-slot`, `--drop-slot`, `--start` | `-d`, `-h`, `-p`, `-U` | `-f` (output file), `-S` (slot) |
481 | **pg_dump** | (positional dbname) | `-d`, `-h`, `-p`, `-U` | `-f` (output file) |
482 | **pg_basebackup** | (none) | `-d`, `-h`, `-p`, `-U` | `-D` (data directory!) |
483 | **pg_restore** | (none) | `-d`, `-h`, `-p`, `-U` | positional (archive file) |
484 | **pg_scribe** | `--init`, `--start`, `--full-backup`, `--restore`, `--status` | `-d`, `-h`, `-p`, `-U` | `-f` (file/directory), `-S` (slot) |
486 **Note:** `pg_basebackup` uses `-D/--pgdata` because it creates an actual PostgreSQL data directory cluster. `pg_scribe` uses `-f/--file` like `pg_dump` and `pg_recvlogical` because it creates backup files.
490 ## Implementation Notes
492 ### Design Principles
494 1. **Consistent with PostgreSQL**: Follow exact same conventions as pg_recvlogical, pg_dump, etc.
495 2. **Safe by default**: Require explicit flags for destructive operations (`-c/--clean`)
496 3. **Idempotent where sensible**: `--init` can be safely re-run
497 4. **Clear output**: Progress to stderr, data to stdout (when using `-f -`)
498 5. **Scriptable**: Support JSON output (`--format=json`), proper exit codes
500 ### Technology Choices (POC)
502 - **Language**: Bash or Python
503 - Bash: Minimal dependencies, matches PostgreSQL tool style
504 - Python: Better error handling, easier testing
505 - **Dependencies**: Only PostgreSQL client tools (pg_recvlogical, pg_dump, pg_dumpall, psql)
507 ### Key Implementation Components
509 1. **Connection management**: Use libpq environment variables, .pgpass
510 2. **Error handling**: Validate prerequisites before starting operations
511 3. **File management**:
512 - For `--start`: Invoke `pg_recvlogical` with file specified by `-f`
513 - For `--full-backup`: Write to directory specified by `-f`
514 - For `--restore`: Read from directory specified by `-f`
515 4. **Signal handling**:
516 - SIGTERM/SIGINT for graceful shutdown (forward to child processes)
517 - SIGHUP for file rotation (--start only, forwarded to pg_recvlogical)
518 5. **Output conventions**:
519 - Progress and status messages → stderr
520 - SQL output → file specified by `-f` (or stdout if `-f -`)
521 6. **Process management for --start**:
522 - Spawn `pg_recvlogical` as child process
523 - Forward signals to child process
524 - Monitor child exit status
525 - Wrap child output to provide consistent pg_scribe formatting
527 ### Future Enhancements (Beyond POC)
529 - Parallel restore with concurrent psql sessions
530 - Retention policy enforcement with automatic cleanup
531 - Backup verification and integrity checks
532 - Remote backup destinations (S3, rsync)
533 - Encryption support (similar to `pg_basebackup --compress`)
534 - Web UI for backup monitoring