5.12. Table Partitioning #

5.12.1. Overview
5.12.2. Declarative Partitioning
5.12.3. Partitioning Using Inheritance
5.12.4. Partition Pruning
5.12.5. Partitioning and Constraint Exclusion
5.12.6. Best Practices for Declarative Partitioning
PostgreSQL supports basic table partitioning. This section describes
why and how to implement partitioning as part of your database design.

5.12.1. Overview #
Partitioning refers to splitting what is logically one large table into
smaller physical pieces. Partitioning can provide several benefits:
* Query performance can be improved dramatically in certain
  situations, particularly when most of the heavily accessed rows of
  the table are in a single partition or a small number of
  partitions. Partitioning effectively substitutes for the upper tree
  levels of indexes, making it more likely that the heavily-used
  parts of the indexes fit in memory.
* When queries or updates access a large percentage of a single
  partition, performance can be improved by using a sequential scan
  of that partition instead of using an index, which would require
  random-access reads scattered across the whole table.
* Bulk loads and deletes can be accomplished by adding or removing
  partitions, if the usage pattern is accounted for in the
  partitioning design. Dropping an individual partition using DROP
  TABLE, or doing ALTER TABLE DETACH PARTITION, is far faster than a
  bulk operation. These commands also entirely avoid the VACUUM
  overhead caused by a bulk DELETE.
* Seldom-used data can be migrated to cheaper and slower storage
  media.
These benefits will normally be worthwhile only when a table would
otherwise be very large. The exact point at which a table will benefit
from partitioning depends on the application, although a rule of thumb
is that the size of the table should exceed the physical memory of the
database server.

PostgreSQL offers built-in support for the following forms of
partitioning:

Range Partitioning
    The table is partitioned into “ranges” defined by a key column
    or set of columns, with no overlap between the ranges of values
    assigned to different partitions. For example, one might
    partition by date ranges, or by ranges of identifiers for
    particular business objects. Each range's bounds are understood
    as being inclusive at the lower end and exclusive at the upper
    end. For example, if one partition's range is from 1 to 10, and
    the next one's range is from 10 to 20, then value 10 belongs to
    the second partition not the first.

List Partitioning
    The table is partitioned by explicitly listing which key
    value(s) appear in each partition.

Hash Partitioning
    The table is partitioned by specifying a modulus and a remainder
    for each partition. Each partition will hold the rows for which
    the hash value of the partition key divided by the specified
    modulus will produce the specified remainder.
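
As an illustration of the modulus/remainder rule, a minimal
hash-partitioned table might look like the following sketch (the table
and partition names are hypothetical, not part of this chapter's
running example):

CREATE TABLE orders (
    order_id        bigint not null,
    customer        text not null
) PARTITION BY HASH (order_id);

-- Four partitions; a row is stored in the partition whose remainder
-- equals hash(order_id) modulo 4.
CREATE TABLE orders_p0 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_p1 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_p2 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_p3 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 3);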
If your application needs to use other forms of partitioning not listed
above, alternative methods such as inheritance and UNION ALL views can
be used instead. Such methods offer flexibility but do not have some of
the performance benefits of built-in declarative partitioning.
5.12.2. Declarative Partitioning #

PostgreSQL allows you to declare that a table is divided into
partitions. The table that is divided is referred to as a partitioned
table. The declaration includes the partitioning method as described
above, plus a list of columns or expressions to be used as the
partition key.
The partitioned table itself is a “virtual” table having no storage of
its own. Instead, the storage belongs to partitions, which are
otherwise-ordinary tables associated with the partitioned table. Each
partition stores a subset of the data as defined by its partition
bounds. All rows inserted into a partitioned table will be routed to
the appropriate one of the partitions based on the values of the
partition key column(s). Updating the partition key of a row will cause
it to be moved into a different partition if it no longer satisfies the
partition bounds of its original partition.
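
For example, with range partitioning on a date column, an UPDATE that
changes the key can move a row between partitions. A minimal sketch
with hypothetical names:

CREATE TABLE events (ev_date date not null) PARTITION BY RANGE (ev_date);
CREATE TABLE events_2024 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE events_2025 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

INSERT INTO events VALUES ('2024-06-01');  -- routed to events_2024
UPDATE events SET ev_date = '2025-06-01'; -- the row moves to events_2025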
Partitions may themselves be defined as partitioned tables, resulting
in sub-partitioning. Although all partitions must have the same columns
as their partitioned parent, partitions may have their own indexes,
constraints and default values, distinct from those of other
partitions. See CREATE TABLE for more details on creating partitioned
tables and partitions.
It is not possible to turn a regular table into a partitioned table or
vice versa. However, it is possible to add an existing regular or
partitioned table as a partition of a partitioned table, or remove a
partition from a partitioned table turning it into a standalone table;
this can simplify and speed up many maintenance processes. See ALTER
TABLE to learn more about the ATTACH PARTITION and DETACH PARTITION
sub-commands.
Partitions can also be foreign tables, although considerable care is
needed because it is then the user's responsibility that the contents
of the foreign table satisfy the partitioning rule. There are some
other restrictions as well. See CREATE FOREIGN TABLE for more
information.
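
For instance, a remote partition might be declared roughly as follows,
as a sketch assuming a foreign server remote_srv and a partitioned
table logs already exist (both names are hypothetical):

CREATE FOREIGN TABLE logs_2024
    PARTITION OF logs
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')
    SERVER remote_srv;
-- It is up to you to ensure the remote table only ever holds rows
-- with dates inside this range; the partitioning rule is not enforced
-- on the remote side.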
5.12.2.1. Example #

Suppose we are constructing a database for a large ice cream company.
The company measures peak temperatures every day as well as ice cream
sales in each region. Conceptually, we want a table like:
CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
);
We know that most queries will access just the last week's, month's or
quarter's data, since the main use of this table will be to prepare
online reports for management. To reduce the amount of old data that
needs to be stored, we decide to keep only the most recent 3 years'
worth of data. At the beginning of each month we will remove the oldest
month's data. In this situation we can use partitioning to help us meet
all of our different requirements for the measurements table.
To use declarative partitioning in this case, use the following steps:
1. Create the measurement table as a partitioned table by specifying
   the PARTITION BY clause, which includes the partitioning method
   (RANGE in this case) and the list of column(s) to use as the
   partition key:
   CREATE TABLE measurement (
       city_id         int not null,
       logdate         date not null,
       peaktemp        int,
       unitsales       int
   ) PARTITION BY RANGE (logdate);
2. Create partitions. Each partition's definition must specify bounds
   that correspond to the partitioning method and partition key of the
   parent. Note that specifying bounds such that the new partition's
   values would overlap with those in one or more existing partitions
   will cause an error.
   Partitions thus created are in every way normal PostgreSQL tables
   (or, possibly, foreign tables). It is possible to specify a
   tablespace and storage parameters for each partition separately.
   For our example, each partition should hold one month's worth of
   data, to match the requirement of deleting one month's data at a
   time. So the commands might look like:
   CREATE TABLE measurement_y2006m02 PARTITION OF measurement
       FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

   CREATE TABLE measurement_y2006m03 PARTITION OF measurement
       FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

   ...

   CREATE TABLE measurement_y2007m11 PARTITION OF measurement
       FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');

   CREATE TABLE measurement_y2007m12 PARTITION OF measurement
       FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
       TABLESPACE fasttablespace;

   CREATE TABLE measurement_y2008m01 PARTITION OF measurement
       FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
       WITH (parallel_workers = 4)
       TABLESPACE fasttablespace;

   (Recall that adjacent partitions can share a bound value, since
   range upper bounds are treated as exclusive bounds.)
   If you wish to implement sub-partitioning, again specify the
   PARTITION BY clause in the commands used to create individual
   partitions, for example:
   CREATE TABLE measurement_y2006m02 PARTITION OF measurement
       FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
       PARTITION BY RANGE (peaktemp);

   After creating partitions of measurement_y2006m02, any data
   inserted into measurement that is mapped to measurement_y2006m02
   (or data that is directly inserted into measurement_y2006m02, which
   is allowed provided its partition constraint is satisfied) will be
   further redirected to one of its partitions based on the peaktemp
   column. The partition key specified may overlap with the parent's
   partition key, although care should be taken when specifying the
   bounds of a sub-partition such that the set of data it accepts
   constitutes a subset of what the partition's own bounds allow; the
   system does not try to check whether that's really the case.
   Inserting data into the parent table that does not map to one of
   the existing partitions will cause an error; an appropriate
   partition must be added manually.
   It is not necessary to manually create table constraints describing
   the partition boundary conditions for partitions. Such constraints
   will be created automatically.
3. Create an index on the key column(s), as well as any other indexes
   you might want, on the partitioned table. (The key index is not
   strictly necessary, but in most scenarios it is helpful.) This
   automatically creates a matching index on each partition, and any
   partitions you create or attach later will also have such an index.
   An index or unique constraint declared on a partitioned table is
   “virtual” in the same way that the partitioned table is: the actual
   data is in child indexes on the individual partition tables.
   CREATE INDEX ON measurement (logdate);
4. Ensure that the enable_partition_pruning configuration parameter is
   not disabled in postgresql.conf. If it is, queries will not be
   optimized as desired.

In the above example we would be creating a new partition each month,
so it might be wise to write a script that generates the required DDL
automatically.
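
Such a script could take many forms; one minimal sketch uses a DO
block and format() to build the current month's DDL (the naming scheme
mirrors the example above, but this block is our own illustration):

DO $$
DECLARE
    m date := date_trunc('month', now())::date;
BEGIN
    -- Build and run the CREATE TABLE for the current month's partition.
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS measurement_y%sm%s PARTITION OF measurement
             FOR VALUES FROM (%L) TO (%L)',
        to_char(m, 'YYYY'), to_char(m, 'MM'),
        m, (m + interval '1 month')::date);
END
$$;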
5.12.2.2. Partition Maintenance #

Normally the set of partitions established when initially defining the
table is not intended to remain static. It is common to want to remove
partitions holding old data and periodically add new partitions for new
data. One of the most important advantages of partitioning is precisely
that it allows this otherwise painful task to be executed nearly
instantaneously by manipulating the partition structure, rather than
physically moving large amounts of data around.
The simplest option for removing old data is to drop the partition that
is no longer necessary:
DROP TABLE measurement_y2006m02;

This can very quickly delete millions of records because it doesn't
have to individually delete every record. Note however that the above
command requires taking an ACCESS EXCLUSIVE lock on the parent table.
Another option that is often preferable is to remove the partition from
the partitioned table but retain access to it as a table in its own
right. This has two forms:
ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
ALTER TABLE measurement DETACH PARTITION measurement_y2006m02 CONCURRENTLY;

These allow further operations to be performed on the data before it is
dropped. For example, this is often a useful time to back up the data
using COPY, pg_dump, or similar tools. It might also be a useful time
to aggregate data into smaller formats, perform other data
manipulations, or run reports. The first form of the command requires
an ACCESS EXCLUSIVE lock on the parent table. Adding the CONCURRENTLY
qualifier as in the second form allows the detach operation to require
only SHARE UPDATE EXCLUSIVE lock on the parent table, but see ALTER
TABLE ... DETACH PARTITION for details on the restrictions.
Similarly we can add a new partition to handle new data. We can create
an empty partition in the partitioned table just as the original
partitions were created above:
CREATE TABLE measurement_y2008m02 PARTITION OF measurement
    FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
    TABLESPACE fasttablespace;
As an alternative to creating a new partition, it is sometimes more
convenient to create a new table separate from the partition structure
and attach it as a partition later. This allows new data to be loaded,
checked, and transformed prior to it appearing in the partitioned
table. Moreover, the ATTACH PARTITION operation requires only a SHARE
UPDATE EXCLUSIVE lock on the partitioned table rather than the ACCESS
EXCLUSIVE lock required by CREATE TABLE ... PARTITION OF, so it is more
friendly to concurrent operations on the partitioned table; see ALTER
TABLE ... ATTACH PARTITION for additional details. The CREATE TABLE ...
LIKE option can be helpful to avoid tediously repeating the parent
table's definition; for example:
CREATE TABLE measurement_y2008m02
    (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
    TABLESPACE fasttablespace;

ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
    CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );

\copy measurement_y2008m02 from 'measurement_y2008m02'
-- possibly some other data preparation work

ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
    FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
Note that when running the ATTACH PARTITION command, the table will be
scanned to validate the partition constraint while holding an ACCESS
EXCLUSIVE lock on that partition. As shown above, it is recommended to
avoid this scan by creating a CHECK constraint matching the expected
partition constraint on the table prior to attaching it. Once the
ATTACH PARTITION is complete, it is recommended to drop the
now-redundant CHECK constraint. If the table being attached is itself a
partitioned table, then each of its sub-partitions will be recursively
locked and scanned until either a suitable CHECK constraint is
encountered or the leaf partitions are reached.
Similarly, if the partitioned table has a DEFAULT partition, it is
recommended to create a CHECK constraint which excludes the
to-be-attached partition's constraint. If this is not done, the DEFAULT
partition will be scanned to verify that it contains no records which
should be located in the partition being attached. This operation will
be performed whilst holding an ACCESS EXCLUSIVE lock on the DEFAULT
partition. If the DEFAULT partition is itself a partitioned table, then
each of its partitions will be recursively checked in the same way as
the table being attached, as mentioned above.
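
For the February 2008 example above, that could look like the
following sketch, assuming the DEFAULT partition is named
measurement_default (a hypothetical name):

ALTER TABLE measurement_default ADD CONSTRAINT excl_y2008m02
    CHECK ( NOT (logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01') );

-- ... run the ATTACH PARTITION command as shown above ...

-- Once the attach completes, the constraint is no longer needed:
ALTER TABLE measurement_default DROP CONSTRAINT excl_y2008m02;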
As mentioned earlier, it is possible to create indexes on partitioned
tables so that they are applied automatically to the entire hierarchy.
This can be very convenient as not only will all existing partitions be
indexed, but any future partitions will be as well. However, one
limitation when creating new indexes on partitioned tables is that it
is not possible to use the CONCURRENTLY qualifier, which could lead to
long lock times. To avoid this, you can use CREATE INDEX ON ONLY the
partitioned table, which creates the new index marked as invalid,
preventing automatic application to existing partitions. Instead,
indexes can then be created individually on each partition using
CONCURRENTLY and attached to the partitioned index on the parent using
ALTER INDEX ... ATTACH PARTITION. Once indexes for all the partitions
are attached to the parent index, the parent index will be marked valid
automatically. Example:
CREATE INDEX measurement_usls_idx ON ONLY measurement (unitsales);

CREATE INDEX CONCURRENTLY measurement_usls_200602_idx
    ON measurement_y2006m02 (unitsales);
ALTER INDEX measurement_usls_idx
    ATTACH PARTITION measurement_usls_200602_idx;
This technique can be used with UNIQUE and PRIMARY KEY constraints too;
the indexes are created implicitly when the constraint is created.
Example:
ALTER TABLE ONLY measurement ADD UNIQUE (city_id, logdate);

ALTER TABLE measurement_y2006m02 ADD UNIQUE (city_id, logdate);
ALTER INDEX measurement_city_id_logdate_key
    ATTACH PARTITION measurement_y2006m02_city_id_logdate_key;
5.12.2.3. Limitations #

The following limitations apply to partitioned tables:
* To create a unique or primary key constraint on a partitioned
  table, the partition keys must not include any expressions or
  function calls and the constraint's columns must include all of the
  partition key columns. This limitation exists because the
  individual indexes making up the constraint can only directly
  enforce uniqueness within their own partitions; therefore, the
  partition structure itself must guarantee that there are not
  duplicates in different partitions. (A short sketch illustrating
  this rule follows this list.)
* Similarly an exclusion constraint must include all the partition
  key columns. Furthermore the constraint must compare those columns
  for equality (not e.g. &&). Again, this limitation stems from not
  being able to enforce cross-partition restrictions. The constraint
  may include additional columns that aren't part of the partition
  key, and it may compare those with any operators you like.
* BEFORE ROW triggers on INSERT cannot change which partition is the
  final destination for a new row.
* Mixing temporary and permanent relations in the same partition tree
  is not allowed. Hence, if the partitioned table is permanent, so
  must be its partitions and likewise if the partitioned table is
  temporary. When using temporary relations, all members of the
  partition tree have to be from the same session.
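
As a minimal sketch of the first limitation above: measurement is
partitioned by logdate, so any unique or primary key constraint must
include logdate. (These commands are illustrative, not part of the
running example.)

-- Allowed: the constraint's columns include the partition key column.
ALTER TABLE measurement ADD PRIMARY KEY (city_id, logdate);

-- Rejected with an error, because the partition key column logdate
-- is not part of the constraint:
-- ALTER TABLE measurement ADD PRIMARY KEY (city_id);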
Individual partitions are linked to their partitioned table using
inheritance behind-the-scenes. However, it is not possible to use all
of the generic features of inheritance with declaratively partitioned
tables or their partitions, as discussed below. Notably, a partition
cannot have any parents other than the partitioned table it is a
partition of, nor can a table inherit from both a partitioned table and
a regular table. That means partitioned tables and their partitions
never share an inheritance hierarchy with regular tables.

Since a partition hierarchy consisting of the partitioned table and its
partitions is still an inheritance hierarchy, tableoid and all the
normal rules of inheritance apply as described in Section 5.11, with a
few exceptions; most importantly:
* Partitions cannot have columns that are not present in the parent.
  It is not possible to specify columns when creating partitions with
  CREATE TABLE, nor is it possible to add columns to partitions
  after-the-fact using ALTER TABLE. Tables may be added as a
  partition with ALTER TABLE ... ATTACH PARTITION only if their
  columns exactly match the parent.
* Both CHECK and NOT NULL constraints of a partitioned table are
  always inherited by all its partitions; it is not allowed to create
  NO INHERIT constraints of those types. You cannot drop a constraint
  of those types if the same constraint is present in the parent
  table.
* Using ONLY to add or drop a constraint on only the partitioned
  table is supported as long as there are no partitions. Once
  partitions exist, using ONLY will result in an error for any
  constraints other than UNIQUE and PRIMARY KEY. Instead, constraints
  on the partitions themselves can be added and (if they are not
  present in the parent table) dropped.
* As a partitioned table does not have any data itself, attempts to
  use TRUNCATE ONLY on a partitioned table will always return an
  error.
5.12.3. Partitioning Using Inheritance #

While the built-in declarative partitioning is suitable for most common
use cases, there are some circumstances where a more flexible approach
may be useful. Partitioning can be implemented using table inheritance,
which allows for several features not supported by declarative
partitioning, such as:
* For declarative partitioning, partitions must have exactly the same
  set of columns as the partitioned table, whereas with table
  inheritance, child tables may have extra columns not present in the
  parent.
* Table inheritance allows for multiple inheritance.
* Declarative partitioning only supports range, list and hash
  partitioning, whereas table inheritance allows data to be divided
  in a manner of the user's choosing. (Note, however, that if
  constraint exclusion is unable to prune child tables effectively,
  query performance might be poor.)
5.12.3.1. Example #

This example builds a partitioning structure equivalent to the
declarative partitioning example above. Use the following steps:
1. Create the “root” table, from which all of the “child” tables will
   inherit. This table will contain no data. Do not define any check
   constraints on this table, unless you intend them to be applied
   equally to all child tables. There is no point in defining any
   indexes or unique constraints on it, either. For our example, the
   root table is the measurement table as originally defined:
   CREATE TABLE measurement (
       city_id         int not null,
       logdate         date not null,
       peaktemp        int,
       unitsales       int
   );
2. Create several “child” tables that each inherit from the root
   table. Normally, these tables will not add any columns to the set
   inherited from the root. Just as with declarative partitioning,
   these tables are in every way normal PostgreSQL tables (or foreign
   tables).
   CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
   CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
   ...
   CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
   CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
   CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
3. Add non-overlapping table constraints to the child tables to define
   the allowed key values in each.
   Typical examples would be:
   CHECK ( x = 1 )
   CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
   CHECK ( outletID >= 100 AND outletID < 200 )

   Ensure that the constraints guarantee that there is no overlap
   between the key values permitted in different child tables. A
   common mistake is to set up range constraints like:
   CHECK ( outletID BETWEEN 100 AND 200 )
   CHECK ( outletID BETWEEN 200 AND 300 )

   This is wrong since it is not clear which child table the key value
   200 belongs in. Instead, ranges should be defined in this style:
   CREATE TABLE measurement_y2006m02 (
       CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
   ) INHERITS (measurement);

   CREATE TABLE measurement_y2006m03 (
       CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
   ) INHERITS (measurement);

   ...

   CREATE TABLE measurement_y2007m11 (
       CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
   ) INHERITS (measurement);

   CREATE TABLE measurement_y2007m12 (
       CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
   ) INHERITS (measurement);

   CREATE TABLE measurement_y2008m01 (
       CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
   ) INHERITS (measurement);
4. For each child table, create an index on the key column(s), as well
   as any other indexes you might want.
   CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
   CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
   CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
   CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
   CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
5. We want our application to be able to say INSERT INTO measurement
   ... and have the data be redirected into the appropriate child
   table. We can arrange that by attaching a suitable trigger function
   to the root table. If data will be added only to the latest child,
   we can use a very simple trigger function:
   CREATE OR REPLACE FUNCTION measurement_insert_trigger()
   RETURNS TRIGGER AS $$
   BEGIN
       INSERT INTO measurement_y2008m01 VALUES (NEW.*);
       RETURN NULL;
   END;
   $$
   LANGUAGE plpgsql;
   After creating the function, we create a trigger which calls the
   trigger function:
   CREATE TRIGGER insert_measurement_trigger
       BEFORE INSERT ON measurement
       FOR EACH ROW EXECUTE FUNCTION measurement_insert_trigger();

   We must redefine the trigger function each month so that it always
   inserts into the current child table. The trigger definition does
   not need to be updated, however.
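   For example, once a measurement_y2008m02 child exists, the
   February redefinition would be a sketch like:
   CREATE OR REPLACE FUNCTION measurement_insert_trigger()
   RETURNS TRIGGER AS $$
   BEGIN
       -- Point at the current month's child table.
       INSERT INTO measurement_y2008m02 VALUES (NEW.*);
       RETURN NULL;
   END;
   $$
   LANGUAGE plpgsql;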
   We might want to insert data and have the server automatically
   locate the child table into which the row should be added. We could
   do this with a more complex trigger function, for example:
   CREATE OR REPLACE FUNCTION measurement_insert_trigger()
   RETURNS TRIGGER AS $$
   BEGIN
       IF ( NEW.logdate >= DATE '2006-02-01' AND
            NEW.logdate < DATE '2006-03-01' ) THEN
           INSERT INTO measurement_y2006m02 VALUES (NEW.*);
       ELSIF ( NEW.logdate >= DATE '2006-03-01' AND
               NEW.logdate < DATE '2006-04-01' ) THEN
           INSERT INTO measurement_y2006m03 VALUES (NEW.*);
       ...
       ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
               NEW.logdate < DATE '2008-02-01' ) THEN
           INSERT INTO measurement_y2008m01 VALUES (NEW.*);
       ELSE
           RAISE EXCEPTION 'Date out of range.  Fix the measurement_insert_trigger() function!';
       END IF;
       RETURN NULL;
   END;
   $$
   LANGUAGE plpgsql;
   The trigger definition is the same as before. Note that each IF
   test must exactly match the CHECK constraint for its child table.
   While this function is more complex than the single-month case, it
   doesn't need to be updated as often, since branches can be added in
   advance of being needed.

   Note
   In practice, it might be best to check the newest child first, if
   most inserts go into that child. For simplicity, we have shown the
   trigger's tests in the same order as in other parts of this
   example.
   A different approach to redirecting inserts into the appropriate
   child table is to set up rules, instead of a trigger, on the root
   table. For example:
   CREATE RULE measurement_insert_y2006m02 AS
   ON INSERT TO measurement WHERE
       ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
   DO INSTEAD
       INSERT INTO measurement_y2006m02 VALUES (NEW.*);
   ...
   CREATE RULE measurement_insert_y2008m01 AS
   ON INSERT TO measurement WHERE
       ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
   DO INSTEAD
       INSERT INTO measurement_y2008m01 VALUES (NEW.*);
   A rule has significantly more overhead than a trigger, but the
   overhead is paid once per query rather than once per row, so this
   method might be advantageous for bulk-insert situations. In most
   cases, however, the trigger method will offer better performance.
   Be aware that COPY ignores rules. If you want to use COPY to insert
   data, you'll need to copy into the correct child table rather than
   directly into the root. COPY does fire triggers, so you can use it
   normally if you use the trigger approach.
   Another disadvantage of the rule approach is that there is no
   simple way to force an error if the set of rules doesn't cover the
   insertion date; the data will silently go into the root table
   instead.
6. Ensure that the constraint_exclusion configuration parameter is not
   disabled in postgresql.conf; otherwise child tables may be accessed
   unnecessarily.

As we can see, a complex table hierarchy could require a substantial
amount of DDL. In the above example we would be creating a new child
table each month, so it might be wise to write a script that generates
the required DDL automatically.
5.12.3.2. Maintenance for Inheritance Partitioning #

To remove old data quickly, simply drop the child table that is no
longer necessary:
DROP TABLE measurement_y2006m02;

To remove the child table from the inheritance hierarchy but
retain access to it as a table in its own right:
ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
To add a new child table to handle new data, create an empty child
table just as the original children were created above:
CREATE TABLE measurement_y2008m02 (
    CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' )
) INHERITS (measurement);

Alternatively, one may want to create and populate the new child table
before adding it to the table hierarchy. This could allow data to be
loaded, checked, and transformed before being made visible to queries
on the parent table:
CREATE TABLE measurement_y2008m02
    (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
    CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
\copy measurement_y2008m02 from 'measurement_y2008m02'
-- possibly some other data preparation work
ALTER TABLE measurement_y2008m02 INHERIT measurement;
5.12.3.3. Caveats #

The following caveats apply to partitioning implemented using
inheritance:
* There is no automatic way to verify that all of the CHECK
  constraints are mutually exclusive. It is safer to create code that
  generates child tables and creates and/or modifies associated
  objects than to write each by hand.
* Indexes and foreign key constraints apply to single tables and not
  to their inheritance children, hence they have some caveats to be
  aware of.
* The schemes shown here assume that the values of a row's key
  column(s) never change, or at least do not change enough to require
  it to move to another partition. An UPDATE that attempts to do that
  will fail because of the CHECK constraints. If you need to handle
  such cases, you can put suitable update triggers on the child
  tables, but it makes management of the structure much more
  complicated.
* Manual VACUUM and ANALYZE commands will automatically process all
  inheritance child tables. If this is undesirable, you can use the
  ONLY keyword. A command like:
  ANALYZE ONLY measurement;
  will only process the root table.
* INSERT statements with ON CONFLICT clauses are unlikely to work as
  expected, as the ON CONFLICT action is only taken in case of unique
  violations on the specified target relation, not its child
  relations.
* Triggers or rules will be needed to route rows to the desired child
  table, unless the application is explicitly aware of the
  partitioning scheme. Triggers may be complicated to write, and will
  be much slower than the tuple routing performed internally by
  declarative partitioning.
5.12.4. Partition Pruning #

Partition pruning is a query optimization technique that improves
performance for declaratively partitioned tables. As an example:
SET enable_partition_pruning = on; -- the default
SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';

Without partition pruning, the above query would scan each of the
partitions of the measurement table. With partition pruning enabled,
the planner will examine the definition of each partition and prove
that the partition need not be scanned because it could not contain any
rows meeting the query's WHERE clause. When the planner can prove this,
it excludes (prunes) the partition from the query plan.
By using the EXPLAIN command and the enable_partition_pruning
configuration parameter, it's possible to show the difference between a
plan for which partitions have been pruned and one for which they have
not. A typical unoptimized plan for this type of table setup is:
SET enable_partition_pruning = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
                                    QUERY PLAN
-------------------------------------------------------------------------------
 Aggregate  (cost=188.76..188.77 rows=1 width=8)
   ->  Append  (cost=0.00..181.05 rows=3085 width=0)
         ->  Seq Scan on measurement_y2006m02  (cost=0.00..33.12 rows=617 width=0)
               Filter: (logdate >= '2008-01-01'::date)
         ->  Seq Scan on measurement_y2006m03  (cost=0.00..33.12 rows=617 width=0)
               Filter: (logdate >= '2008-01-01'::date)
         ->  Seq Scan on measurement_y2007m11  (cost=0.00..33.12 rows=617 width=0)
               Filter: (logdate >= '2008-01-01'::date)
         ->  Seq Scan on measurement_y2007m12  (cost=0.00..33.12 rows=617 width=0)
               Filter: (logdate >= '2008-01-01'::date)
         ->  Seq Scan on measurement_y2008m01  (cost=0.00..33.12 rows=617 width=0)
               Filter: (logdate >= '2008-01-01'::date)
Some or all of the partitions might use index scans instead of
full-table sequential scans, but the point here is that there is no
need to scan the older partitions at all to answer this query. When we
enable partition pruning, we get a significantly cheaper plan that will
deliver the same answer:
SET enable_partition_pruning = on;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
                                    QUERY PLAN
-------------------------------------------------------------------------------
 Aggregate  (cost=37.75..37.76 rows=1 width=8)
   ->  Seq Scan on measurement_y2008m01  (cost=0.00..33.12 rows=617 width=0)
         Filter: (logdate >= '2008-01-01'::date)
Note that partition pruning is driven only by the constraints defined
implicitly by the partition keys, not by the presence of indexes.
Therefore it isn't necessary to define indexes on the key columns.
Whether an index needs to be created for a given partition depends on
whether you expect that queries that scan the partition will generally
scan a large part of the partition or just a small part. An index will
be helpful in the latter case but not the former.
Partition pruning can be performed not only during the planning of a
given query, but also during its execution. This is useful as it can
allow more partitions to be pruned when clauses contain expressions
whose values are not known at query planning time, for example,
parameters defined in a PREPARE statement, using a value obtained from
a subquery, or using a parameterized value on the inner side of a
nested loop join. Partition pruning during execution can be performed
at any of the following times:
* During initialization of the query plan. Partition pruning can be
  performed here for parameter values which are known during the
  initialization phase of execution. Partitions which are pruned
  during this stage will not show up in the query's EXPLAIN or
  EXPLAIN ANALYZE. It is possible to determine the number of
  partitions which were removed during this phase by observing the
  “Subplans Removed” property in the EXPLAIN output. The query
  planner obtains locks for all partitions which are part of the
  plan. However, when the executor uses a cached plan, locks are only
  obtained on the partitions which remain after partition pruning
  done during the initialization phase of execution, i.e., the ones
  shown in the EXPLAIN output and not the ones referred to by the
  “Subplans Removed” property. (A short sketch of observing this
  follows this list.)
* During actual execution of the query plan. Partition pruning may
  also be performed here to remove partitions using values which are
  only known during actual query execution. This includes values from
  subqueries and values from execution-time parameters such as those
  from parameterized nested loop joins. Since the value of these
  parameters may change many times during the execution of the query,
  partition pruning is performed whenever one of the execution
  parameters being used by partition pruning changes. Determining if
  partitions were pruned during this phase requires careful
  inspection of the loops property in the EXPLAIN ANALYZE output.
  Subplans corresponding to different partitions may have different
  values for it depending on how many times each of them was pruned
  during execution. Some may be shown as (never executed) if they
  were pruned every time.
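
A minimal sketch of observing initialization-phase pruning with the
measurement table (the statement name and the use of plan_cache_mode
to force a generic plan are our own; the exact "Subplans Removed"
count depends on which partitions exist):

PREPARE latest (date) AS
    SELECT count(*) FROM measurement WHERE logdate >= $1;

-- Force a generic plan so the parameter value is only known at
-- executor startup, rather than being folded in at plan time.
SET plan_cache_mode = force_generic_plan;

EXPLAIN (ANALYZE) EXECUTE latest('2008-01-01');
-- The Append node should report something like "Subplans Removed: 4".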
Partition pruning can be disabled using the enable_partition_pruning
setting.
746 Constraint exclusion is a query optimization technique similar to
747 partition pruning. While it is primarily used for partitioning
748 implemented using the legacy inheritance method, it can be used for
749 other purposes, including with declarative partitioning.
751 Constraint exclusion works in a very similar way to partition pruning,
752 except that it uses each table's CHECK constraints — which gives it its
753 name — whereas partition pruning uses the table's partition bounds,
754 which exist only in the case of declarative partitioning. Another
755 difference is that constraint exclusion is only applied at plan time;
756 there is no attempt to remove partitions at execution time.
758 The fact that constraint exclusion uses CHECK constraints, which makes
759 it slow compared to partition pruning, can sometimes be used as an
760 advantage: because constraints can be defined even on
761 declaratively-partitioned tables, in addition to their internal
762 partition bounds, constraint exclusion may be able to elide additional
763 partitions from the query plan.
765 The default (and recommended) setting of constraint_exclusion is
766 neither on nor off, but an intermediate setting called partition, which
767 causes the technique to be applied only to queries that are likely to
768 be working on inheritance partitioned tables. The on setting causes the
769 planner to examine CHECK constraints in all queries, even simple ones
770 that are unlikely to benefit.
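
As an illustrative sketch of that interplay (the extra CHECK
constraint and query are our own, not part of the example above;
whether a partition is actually elided depends on the planner):

-- An additional CHECK constraint, beyond the partition bounds:
ALTER TABLE measurement_y2008m01
    ADD CONSTRAINT y2008m01_nonneg_sales CHECK (unitsales >= 0);

SET constraint_exclusion = on;  -- the default is 'partition'
-- Given the constraint above, constraint exclusion may let the
-- planner skip measurement_y2008m01 for this query:
EXPLAIN SELECT * FROM measurement WHERE unitsales < 0;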
The following caveats apply to constraint exclusion:
* Constraint exclusion is only applied during query planning, unlike
  partition pruning, which can also be applied during query
  execution.
* Constraint exclusion only works when the query's WHERE clause
  contains constants (or externally supplied parameters). For
  example, a comparison against a non-immutable function such as
  CURRENT_TIMESTAMP cannot be optimized, since the planner cannot
  know which child table the function's value might fall into at run
  time.
* Keep the partitioning constraints simple, else the planner may not
  be able to prove that child tables might not need to be visited.
  Use simple equality conditions for list partitioning, or simple
  range tests for range partitioning, as illustrated in the preceding
  examples. A good rule of thumb is that partitioning constraints
  should contain only comparisons of the partitioning column(s) to
  constants using B-tree-indexable operators, because only
  B-tree-indexable column(s) are allowed in the partition key.
* All constraints on all children of the parent table are examined
  during constraint exclusion, so large numbers of children are
  likely to increase query planning time considerably. So the legacy
  inheritance based partitioning will work well with up to perhaps a
  hundred child tables; don't try to use many thousands of children.
5.12.6. Best Practices for Declarative Partitioning #

The choice of how to partition a table should be made carefully, as the
performance of query planning and execution can be negatively affected
by poor design.

One of the most critical design decisions will be the column or columns
by which you partition your data. Often the best choice will be to
partition by the column or set of columns which most commonly appear in
WHERE clauses of queries being executed on the partitioned table. WHERE
clauses that are compatible with the partition bound constraints can be
used to prune unneeded partitions. However, you may be forced into
making other decisions by requirements for the PRIMARY KEY or a UNIQUE
constraint. Removal of unwanted data is also a factor to consider when
planning your partitioning strategy. An entire partition can be
detached fairly quickly, so it may be beneficial to design the
partition strategy in such a way that all data to be removed at once is
located in a single partition.
Choosing the target number of partitions that the table should be
divided into is also a critical decision to make. Not having enough
partitions may mean that indexes remain too large and that data
locality remains poor which could result in low cache hit ratios.
However, dividing the table into too many partitions can also cause
issues. Too many partitions can mean longer query planning times and
higher memory consumption during both query planning and execution, as
further described below. When choosing how to partition your table,
it's also important to consider what changes may occur in the future.
For example, if you choose to have one partition per customer and you
currently have a small number of large customers, consider the
implications if in several years you instead find yourself with a large
number of small customers. In this case, it may be better to choose to
partition by HASH and choose a reasonable number of partitions rather
than trying to partition by LIST and hoping that the number of
customers does not increase beyond what it is practical to partition
by.
Sub-partitioning can be useful to further divide partitions that are
expected to become larger than other partitions. Another option is to
use range partitioning with multiple columns in the partition key.
Either of these can easily lead to excessive numbers of partitions, so
restraint is advisable.
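
For instance, a two-column range key might look like the following
sketch (the names are hypothetical; rows are compared against the
bounds column by column, in order):

CREATE TABLE sales (
    customer_id     int not null,
    sale_date       date not null
) PARTITION BY RANGE (customer_id, sale_date);

-- One customer's 2024 data in its own partition.
CREATE TABLE sales_c1_2024 PARTITION OF sales
    FOR VALUES FROM (1, '2024-01-01') TO (1, '2025-01-01');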
It is important to consider the overhead of partitioning during query
planning and execution. The query planner is generally able to handle
partition hierarchies with up to a few thousand partitions fairly well,
provided that typical queries allow the query planner to prune all but
a small number of partitions. Planning times become longer and memory
consumption becomes higher when more partitions remain after the
planner performs partition pruning. Another reason to be concerned
about having a large number of partitions is that the server's memory
consumption may grow significantly over time, especially if many
sessions touch large numbers of partitions. That's because each
partition requires its metadata to be loaded into the local memory of
each session that touches it.
With data warehouse type workloads, it can make sense to use a larger
number of partitions than with an OLTP type workload. Generally, in
data warehouses, query planning time is less of a concern as the
majority of processing time is spent during query execution. With
either of these two types of workload, it is important to make the
right decisions early, as re-partitioning large quantities of data can
be painfully slow. Simulations of the intended workload are often
beneficial for optimizing the partitioning strategy. Never just assume
that more partitions are better than fewer partitions, nor vice-versa.