2 39.3. Materialized Views #
4 Materialized views in PostgreSQL use the rule system like views do, but
5 persist the results in a table-like form. The main differences between:
6 CREATE MATERIALIZED VIEW mymatview AS SELECT * FROM mytab;
9 CREATE TABLE mymatview AS SELECT * FROM mytab;
11 are that the materialized view cannot subsequently be directly updated
12 and that the query used to create the materialized view is stored in
13 exactly the same way that a view's query is stored, so that fresh data
14 can be generated for the materialized view with:
15 REFRESH MATERIALIZED VIEW mymatview;
17 The information about a materialized view in the PostgreSQL system
18 catalogs is exactly the same as it is for a table or view. So for the
19 parser, a materialized view is a relation, just like a table or a view.
20 When a materialized view is referenced in a query, the data is returned
21 directly from the materialized view, like from a table; the rule is
22 only used for populating the materialized view.
24 While access to the data stored in a materialized view is often much
25 faster than accessing the underlying tables directly or through a view,
26 the data is not always current; yet sometimes current data is not
27 needed. Consider a table which records sales:
28 CREATE TABLE invoice (
29 invoice_no integer PRIMARY KEY,
30 seller_no integer, -- ID of salesperson
31 invoice_date date, -- date of sale
32 invoice_amt numeric(13,2) -- amount of sale
35 If people want to be able to quickly graph historical sales data, they
36 might want to summarize, and they may not care about the incomplete
37 data for the current date:
38 CREATE MATERIALIZED VIEW sales_summary AS
42 sum(invoice_amt)::numeric(13,2) as sales_amt
44 WHERE invoice_date < CURRENT_DATE
49 CREATE UNIQUE INDEX sales_summary_seller
50 ON sales_summary (seller_no, invoice_date);
52 This materialized view might be useful for displaying a graph in the
53 dashboard created for salespeople. A job could be scheduled to update
54 the statistics each night using this SQL statement:
55 REFRESH MATERIALIZED VIEW sales_summary;
57 Another use for a materialized view is to allow faster access to data
58 brought across from a remote system through a foreign data wrapper. A
59 simple example using file_fdw is below, with timings, but since this is
60 using cache on the local system the performance difference compared to
61 access to a remote system would usually be greater than shown here.
62 Notice we are also exploiting the ability to put an index on the
63 materialized view, whereas file_fdw does not support indexes; this
64 advantage might not apply for other sorts of foreign data access.
67 CREATE EXTENSION file_fdw;
68 CREATE SERVER local_file FOREIGN DATA WRAPPER file_fdw;
69 CREATE FOREIGN TABLE words (word text NOT NULL)
71 OPTIONS (filename '/usr/share/dict/words');
72 CREATE MATERIALIZED VIEW wrd AS SELECT * FROM words;
73 CREATE UNIQUE INDEX wrd_word ON wrd (word);
74 CREATE EXTENSION pg_trgm;
75 CREATE INDEX wrd_trgm ON wrd USING gist (word gist_trgm_ops);
78 Now let's spell-check a word. Using file_fdw directly:
79 SELECT count(*) FROM words WHERE word = 'caterpiler';
86 With EXPLAIN ANALYZE, we see:
87 Aggregate (cost=21763.99..21764.00 rows=1 width=0) (actual time=188.180..188.1
89 -> Foreign Scan on words (cost=0.00..21761.41 rows=1032 width=0) (actual ti
90 me=188.177..188.177 rows=0.00 loops=1)
91 Filter: (word = 'caterpiler'::text)
92 Rows Removed by Filter: 479829
93 Foreign File: /usr/share/dict/words
94 Foreign File Size: 4953699
95 Planning time: 0.118 ms
96 Execution time: 188.273 ms
98 If the materialized view is used instead, the query is much faster:
99 Aggregate (cost=4.44..4.45 rows=1 width=0) (actual time=0.042..0.042 rows=1.00
101 -> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (
102 actual time=0.039..0.039 rows=0.00 loops=1)
103 Index Cond: (word = 'caterpiler'::text)
106 Planning time: 0.164 ms
107 Execution time: 0.117 ms
109 Either way, the word is spelled wrong, so let's look for what we might
110 have wanted. Again using file_fdw and pg_trgm:
111 SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
127 Limit (cost=11583.61..11583.64 rows=10 width=32) (actual time=1431.591..1431.5
128 94 rows=10.00 loops=1)
129 -> Sort (cost=11583.61..11804.76 rows=88459 width=32) (actual time=1431.589
130 ..1431.591 rows=10.00 loops=1)
131 Sort Key: ((word <-> 'caterpiler'::text))
132 Sort Method: top-N heapsort Memory: 25kB
133 -> Foreign Scan on words (cost=0.00..9672.05 rows=88459 width=32) (ac
134 tual time=0.057..1286.455 rows=479829.00 loops=1)
135 Foreign File: /usr/share/dict/words
136 Foreign File Size: 4953699
137 Planning time: 0.128 ms
138 Execution time: 1431.679 ms
140 Using the materialized view:
141 Limit (cost=0.29..1.06 rows=10 width=10) (actual time=187.222..188.257 rows=10
143 -> Index Scan using wrd_trgm on wrd (cost=0.29..37020.87 rows=479829 width=
144 10) (actual time=187.219..188.252 rows=10.00 loops=1)
145 Order By: (word <-> 'caterpiler'::text)
147 Planning time: 0.196 ms
148 Execution time: 198.640 ms
150 If you can tolerate periodic update of the remote data to the local
151 database, the performance benefit can be substantial.