26.2. Log-Shipping Standby Servers

26.2.2. Standby Server Operation
26.2.3. Preparing the Primary for Standby Servers
26.2.4. Setting Up a Standby Server
26.2.5. Streaming Replication
26.2.6. Replication Slots
26.2.7. Cascading Replication
26.2.8. Synchronous Replication
26.2.9. Continuous Archiving in Standby

Continuous archiving can be used to create a high availability (HA)
cluster configuration with one or more standby servers ready to take
over operations if the primary server fails. This capability is widely
referred to as warm standby or log shipping.

The primary and standby servers work together to provide this
capability, though the servers are only loosely coupled. The primary
server operates in continuous archiving mode, while each standby server
operates in continuous recovery mode, reading the WAL files from the
primary. No changes to the database tables are required to enable this
capability, so it offers low administration overhead compared to some
other replication solutions. This configuration also has relatively low
performance impact on the primary server.

Directly moving WAL records from one database server to another is
typically described as log shipping. PostgreSQL implements file-based
log shipping by transferring WAL records one file (WAL segment) at a
time. WAL files (16MB) can be shipped easily and cheaply over any
distance, whether it be to an adjacent system, another system at the
same site, or another system on the far side of the globe. The
bandwidth required for this technique varies according to the
transaction rate of the primary server. Record-based log shipping is
more granular and streams WAL changes incrementally over a network
connection (see Section 26.2.5).

Log shipping is asynchronous, i.e., the WAL records are shipped after
transaction commit. As a result, there is a window for data loss should
the primary server suffer a catastrophic failure; transactions not yet
shipped will be lost. The size of the data loss window in file-based
log shipping can be limited by use of the archive_timeout parameter,
which can be set as low as a few seconds. However, such a low setting
will substantially increase the bandwidth required for file shipping.
Streaming replication (see Section 26.2.5) allows a much smaller window
of data loss.
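
To make the bandwidth cost of a low archive_timeout concrete, the following is a rough sketch of the arithmetic, not a measurement. It assumes the default 16MB segment size, that a segment closed early by a forced switch is still shipped at full size, that the server is active enough for every timeout to force a switch, and that archive_command does no compression. The function name is hypothetical.

```python
SEGMENT_MB = 16  # default WAL segment size mentioned above


def forced_switch_cost(archive_timeout_s, segment_mb=SEGMENT_MB):
    """Upper bound on shipped volume if every timeout forces a segment switch.

    Returns a tuple (MB per second, MB per day). This is an estimate under
    the assumptions stated in the text, not a figure reported by the server.
    """
    if archive_timeout_s <= 0:
        raise ValueError("archive_timeout must be positive for this estimate")
    switches_per_day = 86_400 / archive_timeout_s
    return segment_mb / archive_timeout_s, segment_mb * switches_per_day


per_second, per_day = forced_switch_cost(10)
print(f"{per_second:.1f} MB/s, {per_day:.0f} MB/day")
```

With a 10-second timeout this bound works out to 1.6 MB/s, which is why a setting of only a few seconds is expensive for file-based shipping.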

Recovery performance is sufficiently good that the standby will
typically be only moments away from full availability once it has been
activated. As a result, this is called a warm standby configuration,
which offers high availability. Restoring a server from an archived
base backup and rolling forward will take considerably longer, so that
technique only offers a solution for disaster recovery, not high
availability. A standby server can also be used for read-only queries,
in which case it is called a hot standby server. See Section 26.4 for
details.

It is usually wise to create the primary and standby servers so that
they are as similar as possible, at least from the perspective of the
database server. In particular, the path names associated with
tablespaces will be passed across unmodified, so both primary and
standby servers must have the same mount paths for tablespaces if that
feature is used. Keep in mind that if CREATE TABLESPACE is executed on
the primary, any new mount point needed for it must be created on the
primary and all standby servers before the command is executed.
Hardware need not be exactly the same, but experience shows that
maintaining two identical systems is easier than maintaining two
dissimilar ones over the lifetime of the application and system. In any
case, the hardware architecture must be the same: shipping from, say, a
32-bit to a 64-bit system will not work.

In general, log shipping between servers running different major
PostgreSQL release levels is not possible. It is the policy of the
PostgreSQL Global Development Group not to make changes to disk formats
during minor release upgrades, so it is likely that running different
minor release levels on primary and standby servers will work
successfully. However, no formal support for that is offered and you
are advised to keep primary and standby servers at the same release
level as much as possible. When updating to a new minor release, the
safest policy is to update the standby servers first, because a new
minor release is more likely to be able to read WAL files from a
previous minor release than vice versa.

26.2.2. Standby Server Operation

A server enters standby mode if a standby.signal file exists in the
data directory when the server is started.

In standby mode, the server continuously applies WAL received from the
primary server. The standby server can read WAL from a WAL archive (see
restore_command) or directly from the primary over a TCP connection
(streaming replication). The standby server will also attempt to
restore any WAL found in the standby cluster's pg_wal directory. That
typically happens after a server restart, when the standby replays WAL
again that was streamed from the primary before the restart, but you
can also manually copy files to pg_wal at any time to have them
replayed.

At startup, the standby begins by restoring all WAL available in the
archive location, calling restore_command. Once it reaches the end of
WAL available there and restore_command fails, it tries to restore any
WAL available in the pg_wal directory. If that fails, and streaming
replication has been configured, the standby tries to connect to the
primary server and start streaming WAL from the last valid record found
in archive or pg_wal. If that fails or streaming replication is not
configured, or if the connection is later disconnected, the standby
goes back to the first step and tries to restore the file from the
archive again. This loop of retries from the archive, pg_wal, and via
streaming replication goes on until the server is stopped or is
promoted.
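
The order in which the standby consults its WAL sources can be sketched as a small loop. This is an illustrative model only, not the server's implementation: the three source callables, the `source` helper, and the `max_cycles` bound are hypothetical stand-ins, and each callable returns None when it has nothing more to offer.

```python
from typing import Callable, Iterator, Optional

# A hypothetical WAL source: returns the next piece of WAL, or None when
# that source currently has nothing (more) to offer.
Source = Callable[[], Optional[str]]


def standby_wal_loop(restore_from_archive: Source,
                     read_from_pg_wal: Source,
                     stream_from_primary: Optional[Source],
                     max_cycles: int) -> Iterator[str]:
    """Yield WAL in the order described above: archive, pg_wal, streaming."""
    # The real server loops until stopped or promoted; the sketch is bounded.
    for _ in range(max_cycles):
        # Step 1: restore everything the archive can provide.
        while (wal := restore_from_archive()) is not None:
            yield f"archive:{wal}"
        # Step 2: fall back to whatever is already in pg_wal.
        while (wal := read_from_pg_wal()) is not None:
            yield f"pg_wal:{wal}"
        # Step 3: stream from the primary, if streaming is configured.
        if stream_from_primary is not None:
            while (wal := stream_from_primary()) is not None:
                yield f"stream:{wal}"
        # Stream ended or was never configured: go back to the archive.


def source(items):
    """Build a hypothetical source from a fixed list of WAL names."""
    it = iter(items)
    return lambda: next(it, None)


replayed = list(standby_wal_loop(source(["seg1", "seg2"]),
                                 source(["seg3"]),
                                 source(["rec4", "rec5"]),
                                 max_cycles=2))
print(replayed)
```

In the sketch, the second cycle finds every source empty and yields nothing, which corresponds to the standby retrying until new WAL appears.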

Standby mode is exited and the server switches to normal operation when
pg_ctl promote is run, or pg_promote() is called. Before failover, any
WAL immediately available in the archive or in pg_wal will be restored,
but no attempt is made to connect to the primary.

26.2.3. Preparing the Primary for Standby Servers

Set up continuous archiving on the primary to an archive directory
accessible from the standby, as described in Section 25.3. The archive
location should be accessible from the standby even when the primary is
down, i.e., it should reside on the standby server itself or another
trusted server, not on the primary server.

If you want to use streaming replication, set up authentication on the
primary server to allow replication connections from the standby
server(s); that is, create a role and provide a suitable entry or
entries in pg_hba.conf with the database field set to replication. Also
ensure max_wal_senders is set to a sufficiently large value in the
configuration file of the primary server. If replication slots will be
used, ensure that max_replication_slots is set sufficiently high as
well.

Take a base backup as described in Section 25.3.2 to bootstrap the
standby server.

26.2.4. Setting Up a Standby Server

To set up the standby server, restore the base backup taken from the
primary server (see Section 25.3.5). Create a file standby.signal in
the standby's cluster data directory. Set restore_command to a simple
command to copy files from the WAL archive. If you plan to have
multiple standby servers for high availability purposes, make sure that
recovery_target_timeline is set to latest (the default), to make the
standby server follow the timeline change that occurs at failover to
another standby.

restore_command should return immediately if the file does not exist;
the server will retry the command if necessary.

If you want to use streaming replication, fill in primary_conninfo with
a libpq connection string, including the host name (or IP address) and
any additional details needed to connect to the primary server. If the
primary needs a password for authentication, the password needs to be
specified in primary_conninfo as well.

If you're setting up the standby server for high availability purposes,
set up WAL archiving, connections and authentication like the primary
server, because the standby server will work as a primary server after
failover.

If you're using a WAL archive, its size can be minimized using the
archive_cleanup_command parameter to remove files that are no longer
required by the standby server. The pg_archivecleanup utility is
designed specifically to be used with archive_cleanup_command in
typical single-standby configurations; see pg_archivecleanup. Note,
however, that if you're using the archive for backup purposes, you need
to retain files needed to recover from at least the latest base backup,
even if they're no longer needed by the standby.

A simple example of configuration is:

    primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass options=''-c wal_sender_timeout=5000'''
    restore_command = 'cp /path/to/archive/%f %p'
    archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'

You can have any number of standby servers, but if you use streaming
replication, make sure you set max_wal_senders high enough in the
primary to allow them to be connected simultaneously.
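
The doubled and tripled quotes in the primary_conninfo example come from two layers of quoting: libpq quotes a keyword value that contains spaces, and the configuration file then doubles every single quote inside its own string. The sketch below reproduces that line; the helper names are hypothetical, and the backslash handling in `conf_string` is an assumption about configuration-file escaping that the example does not exercise.

```python
def conninfo_value(value: str) -> str:
    """Quote one libpq keyword value if it is empty or has spaces/quotes."""
    if value == "" or any(c.isspace() or c in "'\\" for c in value):
        # Inside a quoted libpq value, backslash and quote are backslash-escaped.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    return value


def conf_string(value: str) -> str:
    """Quote a string for the configuration file by doubling single quotes.

    Doubling backslashes as well is an assumption made for safety here.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def primary_conninfo_line(**params: str) -> str:
    """Build a primary_conninfo setting from keyword/value pairs."""
    conninfo = " ".join(f"{k}={conninfo_value(v)}" for k, v in params.items())
    return f"primary_conninfo = {conf_string(conninfo)}"


print(primary_conninfo_line(host="192.168.1.50", port="5432", user="foo",
                            password="foopass",
                            options="-c wal_sender_timeout=5000"))
```

Run as written, this prints the same primary_conninfo line as the example above, with `options` quoted once for libpq and once more for the configuration file.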

26.2.5. Streaming Replication

Streaming replication allows a standby server to stay more up-to-date
than is possible with file-based log shipping. The standby connects to
the primary, which streams WAL records to the standby as they're
generated, without waiting for the WAL file to be filled.

Streaming replication is asynchronous by default (see Section 26.2.8),
in which case there is a small delay between committing a transaction
in the primary and the changes becoming visible in the standby. This
delay is, however, much smaller than with file-based log shipping,
typically under one second assuming the standby is powerful enough to
keep up with the load. With streaming replication, archive_timeout is
not required to reduce the data loss window.

If you use streaming replication without file-based continuous
archiving, the server might recycle old WAL segments before the standby
has received them. If this occurs, the standby will need to be
reinitialized from a new base backup. You can avoid this by setting
wal_keep_size to a value large enough to ensure that WAL segments are
not recycled too early, or by configuring a replication slot for the
standby. If you set up a WAL archive that's accessible from the
standby, these solutions are not required, since the standby can always
use the archive to catch up, provided the archive retains enough
segments.
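
One rough way to pick a wal_keep_size is to multiply the WAL generation rate by the longest standby outage you want to survive. This rule of thumb is an assumption of this sketch rather than guidance from the text above, and the function name and sample figures are hypothetical.

```python
import math


def wal_keep_size_mb(wal_rate_mb_per_s: float, max_outage_s: float,
                     segment_mb: int = 16) -> int:
    """Round the WAL produced during an outage up to whole segments.

    Sizing rule assumed for illustration: rate * outage, in megabytes.
    """
    needed = wal_rate_mb_per_s * max_outage_s
    return math.ceil(needed / segment_mb) * segment_mb


# Hypothetical workload: 2.5 MB/s of WAL, standby may be away for 10 minutes.
print(wal_keep_size_mb(2.5, 600))
```

A replication slot avoids having to guess these figures, at the cost of unbounded retention unless it is capped (see Section 26.2.6).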

To use streaming replication, set up a file-based log-shipping standby
server as described in Section 26.2. The step that turns a file-based
log-shipping standby into a streaming replication standby is setting
primary_conninfo to point to the primary server. Set listen_addresses
and authentication options (see pg_hba.conf) on the primary so that the
standby server can connect to the replication pseudo-database on the
primary server (see Section 26.2.5.1).

On systems that support the keepalive socket option, setting
tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count
helps the primary promptly notice a broken connection.

Set the maximum number of concurrent connections from the standby
servers (see max_wal_senders for details).

When the standby is started and primary_conninfo is set correctly, the
standby will connect to the primary after replaying all WAL files
available in the archive. If the connection is established
successfully, you will see a walreceiver process in the standby, and a
corresponding walsender process in the primary.

26.2.5.1. Authentication

It is very important that the access privileges for replication be set
up so that only trusted users can read the WAL stream, because it is
easy to extract privileged information from it. Standby servers must
authenticate to the primary as an account that has the REPLICATION
privilege or a superuser. It is recommended to create a dedicated user
account with REPLICATION and LOGIN privileges for replication. While
REPLICATION privilege gives very high permissions, it does not allow
the user to modify any data on the primary system, which the SUPERUSER
privilege does.

Client authentication for replication is controlled by a pg_hba.conf
record specifying replication in the database field. For example, if
the standby is running on host IP 192.168.1.100 and the account name
for replication is foo, the administrator can add the following line to
the pg_hba.conf file on the primary:

    # Allow the user "foo" from host 192.168.1.100 to connect to the primary
    # as a replication standby if the user's password is correctly supplied.
    #
    # TYPE  DATABASE        USER            ADDRESS                 METHOD
    host    replication     foo             192.168.1.100/32        md5

The host name and port number of the primary, connection user name, and
password are specified in primary_conninfo. The password can also be
set in the ~/.pgpass file on the standby (specify replication in the
database field). For example, if the primary is running on host IP
192.168.1.50, port 5432, the account name for replication is foo, and
the password is foopass, the administrator can add the following line
to the postgresql.conf file on the standby:

    # The standby connects to the primary that is running on host 192.168.1.50
    # and port 5432 as the user "foo" whose password is "foopass".
    primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
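
For the ~/.pgpass alternative mentioned above, the entry uses colon-separated fields in the order host, port, database, user, password, with replication in the database field. The sketch below formats such an entry for the same sample values; the helper names are hypothetical.

```python
def pgpass_field(value: str) -> str:
    """Escape backslashes and colons, which are special in a password file."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def pgpass_line(host: str, port: int, user: str, password: str,
                database: str = "replication") -> str:
    """Format one password-file entry: host:port:database:user:password."""
    fields = (host, str(port), database, user, password)
    return ":".join(pgpass_field(f) for f in fields)


print(pgpass_line("192.168.1.50", 5432, "foo", "foopass"))
```

With the password kept in the password file, primary_conninfo can omit the password keyword. The file should be readable only by its owner, since libpq ignores a password file with looser permissions.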

26.2.5.2. Monitoring

An important health indicator of streaming replication is the amount of
WAL generated in the primary, but not yet applied in the standby. You
can calculate this lag by comparing the current WAL write location on
the primary with the last WAL location received by the standby. These
locations can be retrieved using pg_current_wal_lsn on the primary and
pg_last_wal_receive_lsn on the standby, respectively (see Table 9.97
and Table 9.98 for details). The last WAL receive location in the
standby is also displayed in the process status of the WAL receiver
process, displayed using the ps command (see Section 27.1 for details).

You can retrieve a list of WAL sender processes via the
pg_stat_replication view. Large differences between pg_current_wal_lsn
and the view's sent_lsn field might indicate that the primary server is
under heavy load, while differences between sent_lsn and
pg_last_wal_receive_lsn on the standby might indicate network delay, or
that the standby is under heavy load.

On a hot standby, the status of the WAL receiver process can be
retrieved via the pg_stat_wal_receiver view. A large difference between
pg_last_wal_replay_lsn and the view's flushed_lsn indicates that WAL is
being received faster than it can be replayed.
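
The comparisons above amount to subtracting two LSNs. An LSN is printed as two hexadecimal numbers separated by a slash, representing the high and low 32 bits of a 64-bit byte position. The sketch below does the subtraction client-side on already-fetched values; the function names and sample LSNs are hypothetical, and the server-side function pg_wal_lsn_diff performs the same calculation in SQL.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/3000060' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(ahead_lsn: str, behind_lsn: str) -> int:
    """Bytes of WAL between two positions (positive if ahead_lsn is later)."""
    return lsn_to_int(ahead_lsn) - lsn_to_int(behind_lsn)


# Hypothetical values: pg_current_wal_lsn() on the primary versus
# pg_last_wal_receive_lsn() on the standby.
print(lag_bytes("0/3000148", "0/3000060"))
```

The same helper applies to the other pairs mentioned above, for example flushed_lsn versus pg_last_wal_replay_lsn to estimate replay lag on a hot standby.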

26.2.6. Replication Slots

Replication slots provide an automated way to ensure that the primary
server does not remove WAL segments until they have been received by
all standbys, and that the primary does not remove rows which could
cause a recovery conflict even when the standby is disconnected.

In lieu of using replication slots, it is possible to prevent the
removal of old WAL segments using wal_keep_size, or by storing the
segments in an archive using archive_command or archive_library. A
disadvantage of these methods is that they often result in retaining
more WAL segments than required, whereas replication slots retain only
the number of segments known to be needed.

Similarly, hot_standby_feedback on its own, without also using a
replication slot, provides protection against relevant rows being
removed by vacuum, but provides no protection during any time period
when the standby is not connected.

Beware that replication slots can cause the server to retain so many
WAL segments that they fill up the space allocated for pg_wal.
max_slot_wal_keep_size can be used to limit the size of WAL files
retained by replication slots.

26.2.6.1. Querying and Manipulating Replication Slots

Each replication slot has a name, which can contain lower-case letters,
numbers, and the underscore character.
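
The naming rule can be checked before attempting to create a slot. The following sketch validates the character set stated above; the 63-character limit is an assumption (the usual identifier length limit of a default build) and is not taken from this section, and the function name is hypothetical.

```python
import re

# Lower-case ASCII letters, digits, and underscore, as stated above.
_SLOT_NAME = re.compile(r"[a-z0-9_]+")

# Assumption: the usual identifier length limit of a default build.
MAX_SLOT_NAME_LENGTH = 63


def is_valid_slot_name(name: str) -> bool:
    """Return True if name uses only the allowed characters and fits the limit."""
    return (len(name) <= MAX_SLOT_NAME_LENGTH
            and _SLOT_NAME.fullmatch(name) is not None)


for candidate in ("node_a_slot", "Node_A", "node-a"):
    print(candidate, is_valid_slot_name(candidate))
```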

Existing replication slots and their state can be seen in the
pg_replication_slots view.

Slots can be created and dropped either via the streaming replication
protocol (see Section 54.4) or via SQL functions (see Section 9.28.6).

26.2.6.2. Configuration Example

You can create a replication slot like this:

    postgres=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot');

    postgres=# SELECT slot_name, slot_type, active FROM pg_replication_slots;
      slot_name  | slot_type | active
    -------------+-----------+--------
     node_a_slot | physical  | f

To configure the standby to use this slot, primary_slot_name should be
configured on the standby. Here is a simple example:

    primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
    primary_slot_name = 'node_a_slot'

26.2.7. Cascading Replication

The cascading replication feature allows a standby server to accept
replication connections and stream WAL records to other standbys,
acting as a relay. This can be used to reduce the number of direct
connections to the primary and also to minimize inter-site bandwidth
overheads.

A standby acting as both a receiver and a sender is known as a
cascading standby. Standbys that are more directly connected to the
primary are known as upstream servers, while those standby servers
further away are downstream servers. Cascading replication does not
place limits on the number or arrangement of downstream servers, though
each standby connects to only one upstream server, which eventually
links to a single primary server.

A cascading standby sends not only WAL records received from the
primary but also those restored from the archive. So even if the
replication connection in some upstream connection is terminated,
streaming replication continues downstream for as long as new WAL
records are available.

Cascading replication is currently asynchronous. Synchronous
replication (see Section 26.2.8) settings have no effect on cascading
replication at present.

Hot standby feedback propagates upstream, whatever the cascaded
arrangement.

If an upstream standby server is promoted to become the new primary,
downstream servers will continue to stream from the new primary if
recovery_target_timeline is set to 'latest' (the default).

To use cascading replication, set up the cascading standby so that it
can accept replication connections (that is, set max_wal_senders and
hot_standby, and configure host-based authentication). You will also
need to set primary_conninfo in the downstream standby to point to the
cascading standby.

26.2.8. Synchronous Replication

PostgreSQL streaming replication is asynchronous by default. If the
primary server crashes then some transactions that were committed may
not have been replicated to the standby server, causing data loss. The
amount of data loss is proportional to the replication delay at the
time of failover.

Synchronous replication offers the ability to confirm that all changes
made by a transaction have been transferred to one or more synchronous
standby servers. This extends the standard level of durability offered
by a transaction commit. This level of protection is referred to as
2-safe replication in computer science theory, and group-1-safe
(group-safe and 1-safe) when synchronous_commit is set to remote_write.

When requesting synchronous replication, each commit of a write
transaction will wait until confirmation is received that the commit
has been written to the write-ahead log on disk of both the primary and
standby server. The only possibility that data can be lost is if both
the primary and the standby suffer crashes at the same time. This can
provide a much higher level of durability, though only if the sysadmin
is cautious about the placement and management of the two servers.
Waiting for confirmation increases the user's confidence that the
changes will not be lost in the event of server crashes, but it also
necessarily increases the response time for the requesting transaction.
The minimum wait time is the round-trip time between primary and
standby.

Read-only transactions and transaction rollbacks need not wait for
replies from standby servers. Subtransaction commits do not wait for
responses from standby servers, only top-level commits. Long-running
actions such as data loading or index building do not wait until the
very final commit message. All two-phase commit actions require commit
waits, including both prepare and commit.

A synchronous standby can be a physical replication standby or a
logical replication subscriber. It can also be any other physical or
logical WAL replication stream consumer that knows how to send the
appropriate feedback messages. Besides the built-in physical and
logical replication systems, this includes special programs such as
pg_receivewal and pg_recvlogical as well as some third-party
replication systems and custom programs. Check the respective
documentation for details on synchronous replication support.

26.2.8.1. Basic Configuration

Once streaming replication has been configured, configuring synchronous
replication requires only one additional configuration step:
synchronous_standby_names must be set to a non-empty value.
synchronous_commit must also be set to on, but since this is the
default value, typically no change is required. (See Section 19.5.1 and
Section 19.6.2.) This configuration will cause each commit to wait for
confirmation that the standby has written the commit record to durable
storage. synchronous_commit can be set by individual users, so it can
be configured in the configuration file, for particular users or
databases, or dynamically by applications, in order to control the
durability guarantee on a per-transaction basis.

After a commit record has been written to disk on the primary, the WAL
record is then sent to the standby. The standby sends reply messages
each time a new batch of WAL data is written to disk, unless
wal_receiver_status_interval is set to zero on the standby. In the case
that synchronous_commit is set to remote_apply, the standby sends reply
messages when the commit record is replayed, making the transaction
visible. If the standby is chosen as a synchronous standby, according
to the setting of synchronous_standby_names on the primary, the reply
messages from that standby will be considered along with those from
other synchronous standbys to decide when to release transactions
waiting for confirmation that the commit record has been received.
These parameters allow the administrator to specify which standby
servers should be synchronous standbys. Note that the configuration of
synchronous replication is mainly on the primary. Named standbys must
be directly connected to the primary; the primary knows nothing about
downstream standby servers using cascaded replication.

Setting synchronous_commit to remote_write will cause each commit to
wait for confirmation that the standby has received the commit record
and written it out to its own operating system, but not for the data to
be flushed to disk on the standby. This setting provides a weaker
guarantee of durability than on does: the standby could lose the data
in the event of an operating system crash, though not a PostgreSQL
crash. However, it's a useful setting in practice because it can
decrease the response time for the transaction. Data loss could only
occur if both the primary and the standby crash and the database of the
primary gets corrupted at the same time.

Setting synchronous_commit to remote_apply will cause each commit to
wait until the current synchronous standbys report that they have
replayed the transaction, making it visible to user queries. In simple
cases, this allows for load balancing with causal consistency.

Users will stop waiting if a fast shutdown is requested. However, as
when using asynchronous replication, the server will not fully shut
down until all outstanding WAL records are transferred to the currently
connected standby servers.

26.2.8.2. Multiple Synchronous Standbys

Synchronous replication supports one or more synchronous standby
servers; transactions will wait until all the standby servers which are
considered as synchronous confirm receipt of their data. The number of
synchronous standbys that transactions must wait for replies from is
specified in synchronous_standby_names. This parameter also specifies a
list of standby names and the method (FIRST or ANY) to choose
synchronous standbys from the listed ones.

The method FIRST specifies a priority-based synchronous replication and
makes transaction commits wait until their WAL records are replicated
to the requested number of synchronous standbys chosen based on their
priorities. The standbys whose names appear earlier in the list are
given higher priority and will be considered as synchronous. Other
standby servers appearing later in this list represent potential
synchronous standbys. If any of the current synchronous standbys
disconnects for whatever reason, it will be replaced immediately with
the next-highest-priority standby.

An example of synchronous_standby_names for priority-based multiple
synchronous standbys is:

    synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'

In this example, if four standby servers s1, s2, s3 and s4 are running,
the two standbys s1 and s2 will be chosen as synchronous standbys
because their names appear early in the list of standby names. s3 is a
potential synchronous standby and will take over the role of
synchronous standby when either of s1 or s2 fails. s4 is an
asynchronous standby since its name is not in the list.

The method ANY specifies a quorum-based synchronous replication and
makes transaction commits wait until their WAL records are replicated
to at least the requested number of synchronous standbys in the list.

An example of synchronous_standby_names for quorum-based multiple
synchronous standbys is:

    synchronous_standby_names = 'ANY 2 (s1, s2, s3)'

In this example, if four standby servers s1, s2, s3 and s4 are running,
transaction commits will wait for replies from at least any two
standbys of s1, s2 and s3. s4 is an asynchronous standby since its name
is not in the list.

The synchronous states of standby servers can be viewed using the
pg_stat_replication view.
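
The difference between FIRST and ANY can be illustrated with a small model of when a waiting commit would be released. This is a simplified sketch, not the server's logic: it handles only the two explicit forms shown in the examples above (no quoting, wildcards, or bare lists), and the function names and the `connected`/`acked` sets are hypothetical.

```python
import re
from typing import List, Set, Tuple


def parse_sync_names(setting: str) -> Tuple[str, int, List[str]]:
    """Parse 'FIRST n (a, b, ...)' or 'ANY n (a, b, ...)' into its parts."""
    m = re.fullmatch(r"\s*(FIRST|ANY)\s+(\d+)\s*\((.*)\)\s*", setting,
                     re.IGNORECASE)
    if m is None:
        raise ValueError(f"unsupported form in this sketch: {setting!r}")
    names = [n.strip() for n in m.group(3).split(",") if n.strip()]
    return m.group(1).upper(), int(m.group(2)), names


def commit_released(setting: str, connected: Set[str], acked: Set[str]) -> bool:
    """Model whether a waiting commit can be released.

    connected: standbys currently attached; acked: standbys that confirmed.
    """
    method, num, names = parse_sync_names(setting)
    candidates = [n for n in names if n in connected]
    if method == "FIRST":
        # Priority-based: the first `num` connected names must all confirm.
        sync = candidates[:num]
        return len(sync) == num and all(n in acked for n in sync)
    # Quorum-based: any `num` of the listed, connected standbys suffice.
    return sum(1 for n in candidates if n in acked) >= num


everyone = {"s1", "s2", "s3", "s4"}
print(commit_released("FIRST 2 (s1, s2, s3)", everyone, {"s1", "s3"}))
print(commit_released("ANY 2 (s1, s2, s3)", everyone, {"s1", "s3"}))
```

In the model, confirmations from s1 and s3 satisfy ANY 2 but not FIRST 2 while s2 is connected, because the priority-based method waits for s1 and s2 specifically.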

26.2.8.3. Planning for Performance

Synchronous replication usually requires carefully planned and placed
standby servers to ensure applications perform acceptably. Waiting
doesn't utilize system resources, but transaction locks continue to be
held until the transfer is confirmed. As a result, incautious use of
synchronous replication will reduce performance for database
applications because of increased response times and higher contention.

PostgreSQL allows the application developer to specify the durability
level required via replication. This can be specified for the system
overall, though it can also be specified for specific users or
connections, or even individual transactions.

For example, an application workload might consist of 10% of changes
that are important customer details, while 90% of changes are less
important data that the business can more easily survive losing, such
as chat messages between users.

With synchronous replication options specified at the application level
(on the primary) we can offer synchronous replication for the most
important changes, without slowing down the bulk of the total workload.
Application level options are an important and practical tool for
allowing the benefits of synchronous replication for high performance
applications.
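
One way an application might select the durability level per transaction is to issue SET LOCAL synchronous_commit at the start of each transaction. The sketch below only builds the statement list, so it can be handed to whichever database driver the application uses; the function name and the sample tables (`customers`, `chat_messages`) are hypothetical.

```python
from typing import Iterable, List

# Values accepted by synchronous_commit.
LEVELS = {"on", "off", "local", "remote_write", "remote_apply"}


def transaction_sql(statements: Iterable[str],
                    synchronous_commit: str) -> List[str]:
    """Wrap statements in a transaction with its own durability level."""
    if synchronous_commit not in LEVELS:
        raise ValueError(f"unknown synchronous_commit level: {synchronous_commit}")
    return [
        "BEGIN",
        # SET LOCAL lasts only until the end of the current transaction.
        f"SET LOCAL synchronous_commit TO '{synchronous_commit}'",
        *statements,
        "COMMIT",
    ]


# Important change: wait for the synchronous standby (the default level).
important = transaction_sql(
    ["UPDATE customers SET email = 'a@example.com' WHERE id = 1"], "on")
# Less important change: wait for the local flush only, not the standby.
chat = transaction_sql(
    ["INSERT INTO chat_messages (body) VALUES ('hi')"], "local")
print(chat)
```

Because the setting is scoped to the transaction, the bulk of the workload avoids the standby round trip while the important 10% keeps the stronger guarantee.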

Keep in mind that the network bandwidth must be higher than the rate at
which WAL data is generated.

26.2.8.4. Planning for High Availability

synchronous_standby_names specifies the number and names of synchronous
standbys that transaction commits made when synchronous_commit is set
to on, remote_apply or remote_write will wait for responses from. Such
transaction commits may never be completed if any one of the
synchronous standbys should crash.

The best solution for high availability is to ensure you keep as many
synchronous standbys as requested. This can be achieved by naming
multiple potential synchronous standbys using
synchronous_standby_names.

In a priority-based synchronous replication, the standbys whose names
appear earlier in the list will be used as synchronous standbys.
Standbys listed after these will take over the role of synchronous
standby if one of the current ones should fail.

In a quorum-based synchronous replication, all the standbys appearing
in the list will be used as candidates for synchronous standbys. Even
if one of them should fail, the other standbys will keep performing the
role of candidates for synchronous standby.

When a standby first attaches to the primary, it will not yet be
properly synchronized. This is described as catchup mode. Once the lag
between standby and primary reaches zero for the first time, we move to
real-time streaming state. The catch-up duration may be long
immediately after the standby has been created. If the standby is shut
down, then the catch-up period will increase according to the length of
time the standby has been down. The standby is only able to become a
synchronous standby once it has reached streaming state. This state can
be viewed using the pg_stat_replication view.

If the primary restarts while commits are waiting for acknowledgment,
those waiting transactions will be marked fully committed once the
primary database recovers. There is no way to be certain that all
standbys have received all outstanding WAL data at the time of the
crash of the primary. Some transactions may not show as committed on
the standby, even though they show as committed on the primary. The
guarantee we offer is that the application will not receive explicit
acknowledgment of the successful commit of a transaction until the WAL
data is known to be safely received by all the synchronous standbys.

If you really cannot keep as many synchronous standbys as requested,
then you should decrease the number of synchronous standbys that
transaction commits must wait for responses from in
synchronous_standby_names (or disable it) and reload the configuration
file on the primary server.

If the primary is isolated from the remaining standby servers, you
should fail over to the best candidate among those remaining standby
servers.

If you need to re-create a standby server while transactions are
waiting, make sure that the functions pg_backup_start() and
pg_backup_stop() are run in a session with synchronous_commit = off;
otherwise those requests will wait forever for the standby to appear.

26.2.9. Continuous Archiving in Standby

When continuous WAL archiving is used in a standby, there are two
different scenarios: the WAL archive can be shared between the primary
and the standby, or the standby can have its own WAL archive. When the
standby has its own WAL archive, set archive_mode to always, and the
standby will call the archive command for every WAL segment it
receives, whether it's by restoring from the archive or by streaming
replication. The shared archive can be handled similarly, but the
archive_command or archive_library must test if the file being archived
exists already, and if the existing file has identical contents. This
requires more care in the archive_command or archive_library, as it
must be careful not to overwrite an existing file with different
contents, but return success if exactly the same file is archived
twice. All of that must be done free of race conditions, in case two
servers attempt to archive the same file at the same time.
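
The requirements for a shared archive (never overwrite different contents, succeed on an identical duplicate, stay race-free) can be met in several ways. The following is one possible sketch, assuming the archive is a local or mounted directory that supports hard links; the function name is hypothetical and this is not a tool shipped with PostgreSQL. It stages a complete copy under a temporary name and then publishes it with a hard link, which fails if the final name already exists.

```python
import filecmp
import os
import shutil
import tempfile


def archive_wal(src: str, archive_dir: str, name: str) -> int:
    """Archive src as archive_dir/name.

    Returns 0 if the file was archived or an identical copy already exists,
    and 1 if a file with the same name but different contents exists.
    """
    dest = os.path.join(archive_dir, name)
    # Stage a complete, flushed copy next to the destination first, so the
    # final name never refers to a partially written file.
    fd, tmp = tempfile.mkstemp(dir=archive_dir, prefix=name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        try:
            # link() fails if dest exists, so two archivers cannot both win.
            os.link(tmp, dest)
            return 0
        except FileExistsError:
            # Already archived under this name: succeed only if identical.
            return 0 if filecmp.cmp(src, dest, shallow=False) else 1
    finally:
        # The staged name is no longer needed whether or not the link succeeded.
        os.unlink(tmp)
```

A small wrapper script could call this from archive_command, passing %p as `src` and %f as `name` and exiting with the returned status. The sketch leaves out flushing the directory entry itself, which a production script would likely want for durability.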

If archive_mode is set to on, the archiver is not enabled during
recovery or standby mode. If the standby server is promoted, it will
start archiving after the promotion, but will not archive any WAL or
timeline history files that it did not generate itself. To get a
complete series of WAL files in the archive, you must ensure that all
WAL is archived before it reaches the standby. This is inherently true
with file-based log shipping, as the standby can only restore files
that are found in the archive, but not if streaming replication is
enabled. When a server is not in recovery mode, there is no difference
between on and always modes.