begriffs open source - ai-pg/blob - full-docs/txt/rules-materializedviews.txt

   1
   2 39.3. Materialized Views #
   3
   4    Materialized views in PostgreSQL use the rule system like views do, but
   5    persist the results in a table-like form. The main differences between:
   6 CREATE MATERIALIZED VIEW mymatview AS SELECT * FROM mytab;
   7
   8    and:
   9 CREATE TABLE mymatview AS SELECT * FROM mytab;
  10
  11    are that the materialized view cannot subsequently be directly updated
  12    and that the query used to create the materialized view is stored in
  13    exactly the same way that a view's query is stored, so that fresh data
  14    can be generated for the materialized view with:
  15 REFRESH MATERIALIZED VIEW mymatview;
  16
  17    The information about a materialized view in the PostgreSQL system
  18    catalogs is exactly the same as it is for a table or view. So for the
  19    parser, a materialized view is a relation, just like a table or a view.
  20    When a materialized view is referenced in a query, the data is returned
  21    directly from the materialized view, like from a table; the rule is
  22    only used for populating the materialized view.
  23
  24    While access to the data stored in a materialized view is often much
  25    faster than accessing the underlying tables directly or through a view,
  26    the data is not always current; yet sometimes current data is not
  27    needed. Consider a table which records sales:
  28 CREATE TABLE invoice (
  29     invoice_no    integer        PRIMARY KEY,
  30     seller_no     integer,       -- ID of salesperson
  31     invoice_date  date,          -- date of sale
  32     invoice_amt   numeric(13,2)  -- amount of sale
  33 );
  34
  35    If people want to be able to quickly graph historical sales data, they
  36    might want to summarize, and they may not care about the incomplete
  37    data for the current date:
  38 CREATE MATERIALIZED VIEW sales_summary AS
  39   SELECT
  40       seller_no,
  41       invoice_date,
  42       sum(invoice_amt)::numeric(13,2) as sales_amt
  43     FROM invoice
  44     WHERE invoice_date < CURRENT_DATE
  45     GROUP BY
  46       seller_no,
  47       invoice_date;
  48
  49 CREATE UNIQUE INDEX sales_summary_seller
  50   ON sales_summary (seller_no, invoice_date);
  51
  52    This materialized view might be useful for displaying a graph in the
  53    dashboard created for salespeople. A job could be scheduled to update
  54    the statistics each night using this SQL statement:
  55 REFRESH MATERIALIZED VIEW sales_summary;
  56
  57    Another use for a materialized view is to allow faster access to data
  58    brought across from a remote system through a foreign data wrapper. A
  59    simple example using file_fdw is below, with timings, but since this is
  60    using cache on the local system the performance difference compared to
  61    access to a remote system would usually be greater than shown here.
  62    Notice we are also exploiting the ability to put an index on the
  63    materialized view, whereas file_fdw does not support indexes; this
  64    advantage might not apply for other sorts of foreign data access.
  65
  66    Setup:
  67 CREATE EXTENSION file_fdw;
  68 CREATE SERVER local_file FOREIGN DATA WRAPPER file_fdw;
  69 CREATE FOREIGN TABLE words (word text NOT NULL)
  70   SERVER local_file
  71   OPTIONS (filename '/usr/share/dict/words');
  72 CREATE MATERIALIZED VIEW wrd AS SELECT * FROM words;
  73 CREATE UNIQUE INDEX wrd_word ON wrd (word);
  74 CREATE EXTENSION pg_trgm;
  75 CREATE INDEX wrd_trgm ON wrd USING gist (word gist_trgm_ops);
  76 VACUUM ANALYZE wrd;
  77
  78    Now let's spell-check a word. Using file_fdw directly:
  79 SELECT count(*) FROM words WHERE word = 'caterpiler';
  80
  81  count
  82 -------
  83      0
  84 (1 row)
  85
  86    With EXPLAIN ANALYZE, we see:
  87  Aggregate  (cost=21763.99..21764.00 rows=1 width=0) (actual time=188.180..188.1
  88 81 rows=1.00 loops=1)
  89    ->  Foreign Scan on words  (cost=0.00..21761.41 rows=1032 width=0) (actual ti
  90 me=188.177..188.177 rows=0.00 loops=1)
  91          Filter: (word = 'caterpiler'::text)
  92          Rows Removed by Filter: 479829
  93          Foreign File: /usr/share/dict/words
  94          Foreign File Size: 4953699
  95  Planning time: 0.118 ms
  96  Execution time: 188.273 ms
  97
  98    If the materialized view is used instead, the query is much faster:
  99  Aggregate  (cost=4.44..4.45 rows=1 width=0) (actual time=0.042..0.042 rows=1.00
 100  loops=1)
 101    ->  Index Only Scan using wrd_word on wrd  (cost=0.42..4.44 rows=1 width=0) (
 102 actual time=0.039..0.039 rows=0.00 loops=1)
 103          Index Cond: (word = 'caterpiler'::text)
 104          Heap Fetches: 0
 105          Index Searches: 1
 106  Planning time: 0.164 ms
 107  Execution time: 0.117 ms
 108
 109    Either way, the word is spelled wrong, so let's look for what we might
 110    have wanted. Again using file_fdw and pg_trgm:
 111 SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
 112
 113      word
 114 ---------------
 115  cater
 116  caterpillar
 117  Caterpillar
 118  caterpillars
 119  caterpillar's
 120  Caterpillar's
 121  caterer
 122  caterer's
 123  caters
 124  catered
 125 (10 rows)
 126
 127  Limit  (cost=11583.61..11583.64 rows=10 width=32) (actual time=1431.591..1431.5
 128 94 rows=10.00 loops=1)
 129    ->  Sort  (cost=11583.61..11804.76 rows=88459 width=32) (actual time=1431.589
 130 ..1431.591 rows=10.00 loops=1)
 131          Sort Key: ((word <-> 'caterpiler'::text))
 132          Sort Method: top-N heapsort  Memory: 25kB
 133          ->  Foreign Scan on words  (cost=0.00..9672.05 rows=88459 width=32) (ac
 134 tual time=0.057..1286.455 rows=479829.00 loops=1)
 135                Foreign File: /usr/share/dict/words
 136                Foreign File Size: 4953699
 137  Planning time: 0.128 ms
 138  Execution time: 1431.679 ms
 139
 140    Using the materialized view:
 141  Limit  (cost=0.29..1.06 rows=10 width=10) (actual time=187.222..188.257 rows=10
 142 .00 loops=1)
 143    ->  Index Scan using wrd_trgm on wrd  (cost=0.29..37020.87 rows=479829 width=
 144 10) (actual time=187.219..188.252 rows=10.00 loops=1)
 145          Order By: (word <-> 'caterpiler'::text)
 146          Index Searches: 1
 147  Planning time: 0.196 ms
 148  Execution time: 198.640 ms
 149
 150    If you can tolerate periodic update of the remote data to the local
 151    database, the performance benefit can be substantial.