4 26.4.1. User's Overview
5 26.4.2. Handling Query Conflicts
6 26.4.3. Administrator's Overview
7 26.4.4. Hot Standby Parameter Reference
10 Hot standby is the term used to describe the ability to connect to the
11 server and run read-only queries while the server is in archive
12 recovery or standby mode. This is useful both for replication purposes
13 and for restoring a backup to a desired state with great precision. The
14 term hot standby also refers to the ability of the server to move from
15 recovery through to normal operation while users continue running
16 queries and/or keep their connections open.
18 Running queries in hot standby mode is similar to normal query
19 operation, though there are several usage and administrative
20 differences explained below.
22 26.4.1. User's Overview #
24 When the hot_standby parameter is set to true on a standby server, it
25 will begin accepting connections once the recovery has brought the
26 system to a consistent state and be ready for hot standby. All such
27 connections are strictly read-only; not even temporary tables may be
30 The data on the standby takes some time to arrive from the primary
31 server so there will be a measurable delay between primary and standby.
32 Running the same query nearly simultaneously on both primary and
33 standby might therefore return differing results. We say that data on
34 the standby is eventually consistent with the primary. Once the commit
35 record for a transaction is replayed on the standby, the changes made
36 by that transaction will be visible to any new snapshots taken on the
37 standby. Snapshots may be taken at the start of each query or at the
38 start of each transaction, depending on the current transaction
39 isolation level. For more details, see Section 13.2.
41 Transactions started during hot standby may issue the following
43 * Query access: SELECT, COPY TO
44 * Cursor commands: DECLARE, FETCH, CLOSE
45 * Settings: SHOW, SET, RESET
46 * Transaction management commands:
47 + BEGIN, END, ABORT, START TRANSACTION
48 + SAVEPOINT, RELEASE, ROLLBACK TO SAVEPOINT
49 + EXCEPTION blocks and other internal subtransactions
50 * LOCK TABLE, though only when explicitly in one of these modes:
51 ACCESS SHARE, ROW SHARE or ROW EXCLUSIVE.
52 * Plans and resources: PREPARE, EXECUTE, DEALLOCATE, DISCARD
53 * Plugins and extensions: LOAD
56 Transactions started during hot standby will never be assigned a
57 transaction ID and cannot write to the system write-ahead log.
58 Therefore, the following actions will produce error messages:
59 * Data Manipulation Language (DML): INSERT, UPDATE, DELETE, MERGE,
60 COPY FROM, TRUNCATE. Note that there are no allowed actions that
61 result in a trigger being executed during recovery. This
62 restriction applies even to temporary tables, because table rows
63 cannot be read or written without assigning a transaction ID, which
64 is currently not possible in a hot standby environment.
65 * Data Definition Language (DDL): CREATE, DROP, ALTER, COMMENT. This
66 restriction applies even to temporary tables, because carrying out
67 these operations would require updating the system catalog tables.
68 * SELECT ... FOR SHARE | UPDATE, because row locks cannot be taken
69 without updating the underlying data files.
70 * Rules on SELECT statements that generate DML commands.
71 * LOCK that explicitly requests a mode higher than ROW EXCLUSIVE
73 * LOCK in short default form, since it requests ACCESS EXCLUSIVE
75 * Transaction management commands that explicitly set non-read-only
77 + BEGIN READ WRITE, START TRANSACTION READ WRITE
78 + SET TRANSACTION READ WRITE, SET SESSION CHARACTERISTICS AS
79 TRANSACTION READ WRITE
80 + SET transaction_read_only = off
81 * Two-phase commit commands: PREPARE TRANSACTION, COMMIT PREPARED,
82 ROLLBACK PREPARED because even read-only transactions need to write
83 WAL in the prepare phase (the first phase of two phase commit).
84 * Sequence updates: nextval(), setval()
87 In normal operation, “read-only” transactions are allowed to use LISTEN
88 and NOTIFY, so hot standby sessions operate under slightly tighter
89 restrictions than ordinary read-only sessions. It is possible that some
90 of these restrictions might be loosened in a future release.
92 During hot standby, the parameter transaction_read_only is always true
93 and may not be changed. But as long as no attempt is made to modify the
94 database, connections during hot standby will act much like any other
95 database connection. If failover or switchover occurs, the database
96 will switch to normal processing mode. Sessions will remain connected
97 while the server changes mode. Once hot standby finishes, it will be
98 possible to initiate read-write transactions (even from a session begun
101 Users can determine whether hot standby is currently active for their
102 session by issuing SHOW in_hot_standby. (In server versions before 14,
103 the in_hot_standby parameter did not exist; a workable substitute
104 method for older servers is SHOW transaction_read_only.) In addition, a
105 set of functions (Table 9.98) allow users to access information about
106 the standby server. These allow you to write programs that are aware of
107 the current state of the database. These can be used to monitor the
108 progress of recovery, or to allow you to write complex programs that
109 restore the database to particular states.
111 26.4.2. Handling Query Conflicts #
113 The primary and standby servers are in many ways loosely connected.
114 Actions on the primary will have an effect on the standby. As a result,
115 there is potential for negative interactions or conflicts between them.
116 The easiest conflict to understand is performance: if a huge data load
117 is taking place on the primary then this will generate a similar stream
118 of WAL records on the standby, so standby queries may contend for
119 system resources, such as I/O.
121 There are also additional types of conflict that can occur with hot
122 standby. These conflicts are hard conflicts in the sense that queries
123 might need to be canceled and, in some cases, sessions disconnected to
124 resolve them. The user is provided with several ways to handle these
125 conflicts. Conflict cases include:
126 * Access Exclusive locks taken on the primary server, including both
127 explicit LOCK commands and various DDL actions, conflict with table
128 accesses in standby queries.
129 * Dropping a tablespace on the primary conflicts with standby queries
130 using that tablespace for temporary work files.
131 * Dropping a database on the primary conflicts with sessions
132 connected to that database on the standby.
133 * Application of a vacuum cleanup record from WAL conflicts with
134 standby transactions whose snapshots can still “see” any of the
136 * Application of a vacuum cleanup record from WAL conflicts with
137 queries accessing the target page on the standby, whether or not
138 the data to be removed is visible.
140 On the primary server, these cases simply result in waiting; and the
141 user might choose to cancel either of the conflicting actions. However,
142 on the standby there is no choice: the WAL-logged action already
143 occurred on the primary so the standby must not fail to apply it.
144 Furthermore, allowing WAL application to wait indefinitely may be very
145 undesirable, because the standby's state will become increasingly far
146 behind the primary's. Therefore, a mechanism is provided to forcibly
147 cancel standby queries that conflict with to-be-applied WAL records.
149 An example of the problem situation is an administrator on the primary
150 server running DROP TABLE on a table that is currently being queried on
151 the standby server. Clearly the standby query cannot continue if the
152 DROP TABLE is applied on the standby. If this situation occurred on the
153 primary, the DROP TABLE would wait until the other query had finished.
154 But when DROP TABLE is run on the primary, the primary doesn't have
155 information about what queries are running on the standby, so it will
156 not wait for any such standby queries. The WAL change records come
157 through to the standby while the standby query is still running,
158 causing a conflict. The standby server must either delay application of
159 the WAL records (and everything after them, too) or else cancel the
160 conflicting query so that the DROP TABLE can be applied.
162 When a conflicting query is short, it's typically desirable to allow it
163 to complete by delaying WAL application for a little bit; but a long
164 delay in WAL application is usually not desirable. So the cancel
165 mechanism has parameters, max_standby_archive_delay and
166 max_standby_streaming_delay, that define the maximum allowed delay in
167 WAL application. Conflicting queries will be canceled once it has taken
168 longer than the relevant delay setting to apply any newly-received WAL
169 data. There are two parameters so that different delay values can be
170 specified for the case of reading WAL data from an archive (i.e.,
171 initial recovery from a base backup or “catching up” a standby server
172 that has fallen far behind) versus reading WAL data via streaming
175 In a standby server that exists primarily for high availability, it's
176 best to set the delay parameters relatively short, so that the server
177 cannot fall far behind the primary due to delays caused by standby
178 queries. However, if the standby server is meant for executing
179 long-running queries, then a high or even infinite delay value may be
180 preferable. Keep in mind however that a long-running query could cause
181 other sessions on the standby server to not see recent changes on the
182 primary, if it delays application of WAL records.
184 Once the delay specified by max_standby_archive_delay or
185 max_standby_streaming_delay has been exceeded, conflicting queries will
186 be canceled. This usually results just in a cancellation error,
187 although in the case of replaying a DROP DATABASE the entire
188 conflicting session will be terminated. Also, if the conflict is over a
189 lock held by an idle transaction, the conflicting session is terminated
190 (this behavior might change in the future).
192 Canceled queries may be retried immediately (after beginning a new
193 transaction, of course). Since query cancellation depends on the nature
194 of the WAL records being replayed, a query that was canceled may well
195 succeed if it is executed again.
197 Keep in mind that the delay parameters are compared to the elapsed time
198 since the WAL data was received by the standby server. Thus, the grace
199 period allowed to any one query on the standby is never more than the
200 delay parameter, and could be considerably less if the standby has
201 already fallen behind as a result of waiting for previous queries to
202 complete, or as a result of being unable to keep up with a heavy update
205 The most common reason for conflict between standby queries and WAL
206 replay is “early cleanup”. Normally, PostgreSQL allows cleanup of old
207 row versions when there are no transactions that need to see them to
208 ensure correct visibility of data according to MVCC rules. However,
209 this rule can only be applied for transactions executing on the
210 primary. So it is possible that cleanup on the primary will remove row
211 versions that are still visible to a transaction on the standby.
213 Row version cleanup isn't the only potential cause of conflicts with
214 standby queries. All index-only scans (including those that run on
215 standbys) must use an MVCC snapshot that “agrees” with the visibility
216 map. Conflicts are therefore required whenever VACUUM sets a page as
217 all-visible in the visibility map containing one or more rows not
218 visible to all standby queries. So even running VACUUM against a table
219 with no updated or deleted rows requiring cleanup might lead to
222 Users should be clear that tables that are regularly and heavily
223 updated on the primary server will quickly cause cancellation of longer
224 running queries on the standby. In such cases the setting of a finite
225 value for max_standby_archive_delay or max_standby_streaming_delay can
226 be considered similar to setting statement_timeout.
228 Remedial possibilities exist if the number of standby-query
229 cancellations is found to be unacceptable. The first option is to set
230 the parameter hot_standby_feedback, which prevents VACUUM from removing
231 recently-dead rows and so cleanup conflicts do not occur. If you do
232 this, you should note that this will delay cleanup of dead rows on the
233 primary, which may result in undesirable table bloat. However, the
234 cleanup situation will be no worse than if the standby queries were
235 running directly on the primary server, and you are still getting the
236 benefit of off-loading execution onto the standby. If standby servers
237 connect and disconnect frequently, you might want to make adjustments
238 to handle the period when hot_standby_feedback feedback is not being
239 provided. For example, consider increasing max_standby_archive_delay so
240 that queries are not rapidly canceled by conflicts in WAL archive files
241 during disconnected periods. You should also consider increasing
242 max_standby_streaming_delay to avoid rapid cancellations by
243 newly-arrived streaming WAL entries after reconnection.
245 The number of query cancels and the reason for them can be viewed using
246 the pg_stat_database_conflicts system view on the standby server. The
247 pg_stat_database system view also contains summary information.
249 Users can control whether a log message is produced when WAL replay is
250 waiting longer than deadlock_timeout for conflicts. This is controlled
251 by the log_recovery_conflict_waits parameter.
253 26.4.3. Administrator's Overview #
255 If hot_standby is on in postgresql.conf (the default value) and there
256 is a standby.signal file present, the server will run in hot standby
257 mode. However, it may take some time for hot standby connections to be
258 allowed, because the server will not accept connections until it has
259 completed sufficient recovery to provide a consistent state against
260 which queries can run. During this period, clients that attempt to
261 connect will be refused with an error message. To confirm the server
262 has come up, either loop trying to connect from the application, or
263 look for these messages in the server logs:
264 LOG: entering standby mode
266 ... then some time later ...
268 LOG: consistent recovery state reached
269 LOG: database system is ready to accept read-only connections
271 Consistency information is recorded once per checkpoint on the primary.
272 It is not possible to enable hot standby when reading WAL written
273 during a period when wal_level was not set to replica or logical on the
274 primary. Even after reaching a consistent state, the recovery snapshot
275 may not be ready for hot standby if both of the following conditions
276 are met, delaying accepting read-only connections. To enable hot
277 standby, long-lived write transactions with more than 64
278 subtransactions need to be closed on the primary.
279 * A write transaction has more than 64 subtransactions
280 * Very long-lived write transactions
282 If you are running file-based log shipping ("warm standby"), you might
283 need to wait until the next WAL file arrives, which could be as long as
284 the archive_timeout setting on the primary.
286 The settings of some parameters determine the size of shared memory for
287 tracking transaction IDs, locks, and prepared transactions. These
288 shared memory structures must be no smaller on a standby than on the
289 primary in order to ensure that the standby does not run out of shared
290 memory during recovery. For example, if the primary had used a prepared
291 transaction but the standby had not allocated any shared memory for
292 tracking prepared transactions, then recovery could not continue until
293 the standby's configuration is changed. The parameters affected are:
295 * max_prepared_transactions
296 * max_locks_per_transaction
298 * max_worker_processes
300 The easiest way to ensure this does not become a problem is to have
301 these parameters set on the standbys to values equal to or greater than
302 on the primary. Therefore, if you want to increase these values, you
303 should do so on all standby servers first, before applying the changes
304 to the primary server. Conversely, if you want to decrease these
305 values, you should do so on the primary server first, before applying
306 the changes to all standby servers. Keep in mind that when a standby is
307 promoted, it becomes the new reference for the required parameter
308 settings for the standbys that follow it. Therefore, to avoid this
309 becoming a problem during a switchover or failover, it is recommended
310 to keep these settings the same on all standby servers.
312 The WAL tracks changes to these parameters on the primary. If a hot
313 standby processes WAL that indicates that the current value on the
314 primary is higher than its own value, it will log a warning and pause
315 recovery, for example:
316 WARNING: hot standby is not possible because of insufficient parameter settings
317 DETAIL: max_connections = 80 is a lower setting than on the primary server, whe
318 re its value was 100.
319 LOG: recovery has paused
320 DETAIL: If recovery is unpaused, the server will shut down.
321 HINT: You can then restart the server after making the necessary configuration
324 At that point, the settings on the standby need to be updated and the
325 instance restarted before recovery can continue. If the standby is not
326 a hot standby, then when it encounters the incompatible parameter
327 change, it will shut down immediately without pausing, since there is
328 then no value in keeping it up.
330 It is important that the administrator select appropriate settings for
331 max_standby_archive_delay and max_standby_streaming_delay. The best
332 choices vary depending on business priorities. For example if the
333 server is primarily tasked as a High Availability server, then you will
334 want low delay settings, perhaps even zero, though that is a very
335 aggressive setting. If the standby server is tasked as an additional
336 server for decision support queries then it might be acceptable to set
337 the maximum delay values to many hours, or even -1 which means wait
338 forever for queries to complete.
340 Transaction status "hint bits" written on the primary are not
341 WAL-logged, so data on the standby will likely re-write the hints again
342 on the standby. Thus, the standby server will still perform disk writes
343 even though all users are read-only; no changes occur to the data
344 values themselves. Users will still write large sort temporary files
345 and re-generate relcache info files, so no part of the database is
346 truly read-only during hot standby mode. Note also that writes to
347 remote databases using dblink module, and other operations outside the
348 database using PL functions will still be possible, even though the
349 transaction is read-only locally.
351 The following types of administration commands are not accepted during
353 * Data Definition Language (DDL): e.g., CREATE INDEX
354 * Privilege and Ownership: GRANT, REVOKE, REASSIGN
355 * Maintenance commands: ANALYZE, VACUUM, CLUSTER, REINDEX
357 Again, note that some of these commands are actually allowed during
358 "read only" mode transactions on the primary.
360 As a result, you cannot create additional indexes that exist solely on
361 the standby, nor statistics that exist solely on the standby. If these
362 administration commands are needed, they should be executed on the
363 primary, and eventually those changes will propagate to the standby.
365 pg_cancel_backend() and pg_terminate_backend() will work on user
366 backends, but not the startup process, which performs recovery.
367 pg_stat_activity does not show recovering transactions as active. As a
368 result, pg_prepared_xacts is always empty during recovery. If you wish
369 to resolve in-doubt prepared transactions, view pg_prepared_xacts on
370 the primary and issue commands to resolve transactions there or resolve
371 them after the end of recovery.
373 pg_locks will show locks held by backends, as normal. pg_locks also
374 shows a virtual transaction managed by the startup process that owns
375 all AccessExclusiveLocks held by transactions being replayed by
376 recovery. Note that the startup process does not acquire locks to make
377 database changes, and thus locks other than AccessExclusiveLocks do not
378 show in pg_locks for the Startup process; they are just presumed to
381 The Nagios plugin check_pgsql will work, because the simple information
382 it checks for exists. The check_postgres monitoring script will also
383 work, though some reported values could give different or confusing
384 results. For example, last vacuum time will not be maintained, since no
385 vacuum occurs on the standby. Vacuums running on the primary do still
386 send their changes to the standby.
388 WAL file control commands will not work during recovery, e.g.,
389 pg_backup_start, pg_switch_wal etc.
391 Dynamically loadable modules work, including pg_stat_statements.
393 Advisory locks work normally in recovery, including deadlock detection.
394 Note that advisory locks are never WAL logged, so it is impossible for
395 an advisory lock on either the primary or the standby to conflict with
396 WAL replay. Nor is it possible to acquire an advisory lock on the
397 primary and have it initiate a similar advisory lock on the standby.
398 Advisory locks relate only to the server on which they are acquired.
400 Trigger-based replication systems such as Slony, Londiste and Bucardo
401 won't run on the standby at all, though they will run happily on the
402 primary server as long as the changes are not sent to standby servers
403 to be applied. WAL replay is not trigger-based so you cannot relay from
404 the standby to any system that requires additional database writes or
405 relies on the use of triggers.
407 New OIDs cannot be assigned, though some UUID generators may still work
408 as long as they do not rely on writing new status to the database.
410 Currently, temporary table creation is not allowed during read-only
411 transactions, so in some cases existing scripts will not run correctly.
412 This restriction might be relaxed in a later release. This is both an
413 SQL standard compliance issue and a technical issue.
415 DROP TABLESPACE can only succeed if the tablespace is empty. Some
416 standby users may be actively using the tablespace via their
417 temp_tablespaces parameter. If there are temporary files in the
418 tablespace, all active queries are canceled to ensure that temporary
419 files are removed, so the tablespace can be removed and WAL replay can
422 Running DROP DATABASE or ALTER DATABASE ... SET TABLESPACE on the
423 primary will generate a WAL entry that will cause all users connected
424 to that database on the standby to be forcibly disconnected. This
425 action occurs immediately, whatever the setting of
426 max_standby_streaming_delay. Note that ALTER DATABASE ... RENAME does
427 not disconnect users, which in most cases will go unnoticed, though
428 might in some cases cause a program confusion if it depends in some way
431 In normal (non-recovery) mode, if you issue DROP USER or DROP ROLE for
432 a role with login capability while that user is still connected then
433 nothing happens to the connected user — they remain connected. The user
434 cannot reconnect however. This behavior applies in recovery also, so a
435 DROP USER on the primary does not disconnect that user on the standby.
437 The cumulative statistics system is active during recovery. All scans,
438 reads, blocks, index usage, etc., will be recorded normally on the
439 standby. However, WAL replay will not increment relation and database
440 specific counters. I.e. replay will not increment pg_stat_all_tables
441 columns (like n_tup_ins), nor will reads or writes performed by the
442 startup process be tracked in the pg_statio_ views, nor will associated
443 pg_stat_database columns be incremented.
445 Autovacuum is not active during recovery. It will start normally at the
448 The checkpointer process and the background writer process are active
449 during recovery. The checkpointer process will perform restartpoints
450 (similar to checkpoints on the primary) and the background writer
451 process will perform normal block cleaning activities. This can include
452 updates of the hint bit information stored on the standby server. The
453 CHECKPOINT command is accepted during recovery, though it performs a
454 restartpoint rather than a new checkpoint.
456 26.4.4. Hot Standby Parameter Reference #
458 Various parameters have been mentioned above in Section 26.4.2 and
461 On the primary, the wal_level parameter can be used.
462 max_standby_archive_delay and max_standby_streaming_delay have no
463 effect if set on the primary.
465 On the standby, parameters hot_standby, max_standby_archive_delay and
466 max_standby_streaming_delay can be used.
470 There are several limitations of hot standby. These can and probably
471 will be fixed in future releases:
472 * Full knowledge of running transactions is required before snapshots
473 can be taken. Transactions that use large numbers of
474 subtransactions (currently greater than 64) will delay the start of
475 read-only connections until the completion of the longest running
476 write transaction. If this situation occurs, explanatory messages
477 will be sent to the server log.
478 * Valid starting points for standby queries are generated at each
479 checkpoint on the primary. If the standby is shut down while the
480 primary is in a shutdown state, it might not be possible to
481 re-enter hot standby until the primary is started up, so that it
482 generates further starting points in the WAL logs. This situation
483 isn't a problem in the most common situations where it might
484 happen. Generally, if the primary is shut down and not available
485 anymore, that's likely due to a serious failure that requires the
486 standby being converted to operate as the new primary anyway. And
487 in situations where the primary is being intentionally taken down,
488 coordinating to make sure the standby becomes the new primary
489 smoothly is also standard procedure.
490 * At the end of recovery, AccessExclusiveLocks held by prepared
491 transactions will require twice the normal number of lock table
492 entries. If you plan on running either a large number of concurrent
493 prepared transactions that normally take AccessExclusiveLocks, or
494 you plan on having one large transaction that takes many
495 AccessExclusiveLocks, you are advised to select a larger value of
496 max_locks_per_transaction, perhaps as much as twice the value of
497 the parameter on the primary server. You need not consider this at
498 all if your setting of max_prepared_transactions is 0.
499 * The Serializable transaction isolation level is not yet available
500 in hot standby. (See Section 13.2.3 and Section 13.4.1 for
501 details.) An attempt to set a transaction to the serializable
502 isolation level in hot standby mode will generate an error.