begriffs open source - ai-pg/blob - full-docs/src/sgml/html/queries-with.html

   1 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
   2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>7.8. WITH Queries (Common Table Expressions)</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="queries-values.html" title="7.7. VALUES Lists" /><link rel="next" href="datatype.html" title="Chapter 8. Data Types" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">7.8. <code class="literal">WITH</code> Queries (Common Table Expressions)</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="queries-values.html" title="7.7. VALUES Lists">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="queries.html" title="Chapter 7. Queries">Up</a></td><th width="60%" align="center">Chapter 7. Queries</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="datatype.html" title="Chapter 8. Data Types">Next</a></td></tr></table><hr /></div><div class="sect1" id="QUERIES-WITH"><div class="titlepage"><div><div><h2 class="title" style="clear: both">7.8. <code class="literal">WITH</code> Queries (Common Table Expressions) <a href="#QUERIES-WITH" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="queries-with.html#QUERIES-WITH-SELECT">7.8.1. <code class="command">SELECT</code> in <code class="literal">WITH</code></a></span></dt><dt><span class="sect2"><a href="queries-with.html#QUERIES-WITH-RECURSIVE">7.8.2. Recursive Queries</a></span></dt><dt><span class="sect2"><a href="queries-with.html#QUERIES-WITH-CTE-MATERIALIZATION">7.8.3. Common Table Expression Materialization</a></span></dt><dt><span class="sect2"><a href="queries-with.html#QUERIES-WITH-MODIFYING">7.8.4. Data-Modifying Statements in <code class="literal">WITH</code></a></span></dt></dl></div><a id="id-1.5.6.12.2" class="indexterm"></a><a id="id-1.5.6.12.3" class="indexterm"></a><p>
   3    <code class="literal">WITH</code> provides a way to write auxiliary statements for use in a
   4    larger query.  These statements, which are often referred to as Common
   5    Table Expressions or <acronym class="acronym">CTE</acronym>s, can be thought of as defining
   6    temporary tables that exist just for one query.  Each auxiliary statement
   7    in a <code class="literal">WITH</code> clause can be a <code class="command">SELECT</code>,
   8    <code class="command">INSERT</code>, <code class="command">UPDATE</code>, <code class="command">DELETE</code>,
   9    or <code class="command">MERGE</code>; and the
  10    <code class="literal">WITH</code> clause itself is attached to a primary statement that can
  11    also be a <code class="command">SELECT</code>, <code class="command">INSERT</code>, <code class="command">UPDATE</code>,
  12    <code class="command">DELETE</code>, or <code class="command">MERGE</code>.
  13   </p><div class="sect2" id="QUERIES-WITH-SELECT"><div class="titlepage"><div><div><h3 class="title">7.8.1. <code class="command">SELECT</code> in <code class="literal">WITH</code> <a href="#QUERIES-WITH-SELECT" class="id_link">#</a></h3></div></div></div><p>
  14    The basic value of <code class="command">SELECT</code> in <code class="literal">WITH</code> is to
  15    break down complicated queries into simpler parts.  An example is:
  16
  17 </p><pre class="programlisting">
  18 WITH regional_sales AS (
  19     SELECT region, SUM(amount) AS total_sales
  20     FROM orders
  21     GROUP BY region
  22 ), top_regions AS (
  23     SELECT region
  24     FROM regional_sales
  25     WHERE total_sales &gt; (SELECT SUM(total_sales)/10 FROM regional_sales)
  26 )
  27 SELECT region,
  28        product,
  29        SUM(quantity) AS product_units,
  30        SUM(amount) AS product_sales
  31 FROM orders
  32 WHERE region IN (SELECT region FROM top_regions)
  33 GROUP BY region, product;
  34 </pre><p>
  35
  36    which displays per-product sales totals in only the top sales regions.
  37    The <code class="literal">WITH</code> clause defines two auxiliary statements named
  38    <code class="structname">regional_sales</code> and <code class="structname">top_regions</code>,
  39    where the output of <code class="structname">regional_sales</code> is used in
  40    <code class="structname">top_regions</code> and the output of <code class="structname">top_regions</code>
  41    is used in the primary <code class="command">SELECT</code> query.
  42    This example could have been written without <code class="literal">WITH</code>,
  43    but we'd have needed two levels of nested sub-<code class="command">SELECT</code>s.  It's a bit
  44    easier to follow this way.
  45   </p></div><div class="sect2" id="QUERIES-WITH-RECURSIVE"><div class="titlepage"><div><div><h3 class="title">7.8.2. Recursive Queries <a href="#QUERIES-WITH-RECURSIVE" class="id_link">#</a></h3></div></div></div><p>
  46    <a id="id-1.5.6.12.6.2.1" class="indexterm"></a>
  47    The optional <code class="literal">RECURSIVE</code> modifier changes <code class="literal">WITH</code>
  48    from a mere syntactic convenience into a feature that accomplishes
  49    things not otherwise possible in standard SQL.  Using
  50    <code class="literal">RECURSIVE</code>, a <code class="literal">WITH</code> query can refer to its own
  51    output.  A very simple example is this query to sum the integers from 1
  52    through 100:
  53
  54 </p><pre class="programlisting">
  55 WITH RECURSIVE t(n) AS (
  56     VALUES (1)
  57   UNION ALL
  58     SELECT n+1 FROM t WHERE n &lt; 100
  59 )
  60 SELECT sum(n) FROM t;
  61 </pre><p>
  62
  63    The general form of a recursive <code class="literal">WITH</code> query is always a
  64    <em class="firstterm">non-recursive term</em>, then <code class="literal">UNION</code> (or
  65    <code class="literal">UNION ALL</code>), then a
  66    <em class="firstterm">recursive term</em>, where only the recursive term can contain
  67    a reference to the query's own output.  Such a query is executed as
  68    follows:
  69   </p><div class="procedure" id="id-1.5.6.12.6.3"><p class="title"><strong>Recursive Query Evaluation</strong></p><ol class="procedure" type="1"><li class="step"><p>
  70      Evaluate the non-recursive term.  For <code class="literal">UNION</code> (but not
  71      <code class="literal">UNION ALL</code>), discard duplicate rows.  Include all remaining
  72      rows in the result of the recursive query, and also place them in a
  73      temporary <em class="firstterm">working table</em>.
  74     </p></li><li class="step"><p>
  75      So long as the working table is not empty, repeat these steps:
  76     </p><ol type="a" class="substeps"><li class="step"><p>
  77        Evaluate the recursive term, substituting the current contents of
  78        the working table for the recursive self-reference.
  79        For <code class="literal">UNION</code> (but not <code class="literal">UNION ALL</code>), discard
  80        duplicate rows and rows that duplicate any previous result row.
  81        Include all remaining rows in the result of the recursive query, and
  82        also place them in a temporary <em class="firstterm">intermediate table</em>.
  83       </p></li><li class="step"><p>
  84        Replace the contents of the working table with the contents of the
  85        intermediate table, then empty the intermediate table.
  86       </p></li></ol></li></ol></div><div class="note"><h3 class="title">Note</h3><p>
  87     While <code class="literal">RECURSIVE</code> allows queries to be specified
  88     recursively, internally such queries are evaluated iteratively.
  89    </p></div><p>
  90    In the example above, the working table has just a single row in each step,
  91    and it takes on the values from 1 through 100 in successive steps.  In
  92    the 100th step, there is no output because of the <code class="literal">WHERE</code>
  93    clause, and so the query terminates.
  94   </p><p>
  95    Recursive queries are typically used to deal with hierarchical or
  96    tree-structured data.  A useful example is this query to find all the
  97    direct and indirect sub-parts of a product, given only a table that
  98    shows immediate inclusions:
  99
 100 </p><pre class="programlisting">
 101 WITH RECURSIVE included_parts(sub_part, part, quantity) AS (
 102     SELECT sub_part, part, quantity FROM parts WHERE part = 'our_product'
 103   UNION ALL
 104     SELECT p.sub_part, p.part, p.quantity * pr.quantity
 105     FROM included_parts pr, parts p
 106     WHERE p.part = pr.sub_part
 107 )
 108 SELECT sub_part, SUM(quantity) as total_quantity
 109 FROM included_parts
 110 GROUP BY sub_part
 111 </pre><p>
 112   </p><div class="sect3" id="QUERIES-WITH-SEARCH"><div class="titlepage"><div><div><h4 class="title">7.8.2.1. Search Order <a href="#QUERIES-WITH-SEARCH" class="id_link">#</a></h4></div></div></div><p>
 113     When computing a tree traversal using a recursive query, you might want to
 114     order the results in either depth-first or breadth-first order.  This can
 115     be done by computing an ordering column alongside the other data columns
 116     and using that to sort the results at the end.  Note that this does not
 117     actually control in which order the query evaluation visits the rows; that
 118     is as always in SQL implementation-dependent.  This approach merely
 119     provides a convenient way to order the results afterwards.
 120    </p><p>
 121     To create a depth-first order, we compute for each result row an array of
 122     rows that we have visited so far.  For example, consider the following
 123     query that searches a table <code class="structname">tree</code> using a
 124     <code class="structfield">link</code> field:
 125
 126 </p><pre class="programlisting">
 127 WITH RECURSIVE search_tree(id, link, data) AS (
 128     SELECT t.id, t.link, t.data
 129     FROM tree t
 130   UNION ALL
 131     SELECT t.id, t.link, t.data
 132     FROM tree t, search_tree st
 133     WHERE t.id = st.link
 134 )
 135 SELECT * FROM search_tree;
 136 </pre><p>
 137
 138     To add depth-first ordering information, you can write this:
 139
 140 </p><pre class="programlisting">
 141 WITH RECURSIVE search_tree(id, link, data, <span class="emphasis"><strong>path</strong></span>) AS (
 142     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>ARRAY[t.id]</strong></span>
 143     FROM tree t
 144   UNION ALL
 145     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>path || t.id</strong></span>
 146     FROM tree t, search_tree st
 147     WHERE t.id = st.link
 148 )
 149 SELECT * FROM search_tree <span class="emphasis"><strong>ORDER BY path</strong></span>;
 150 </pre><p>
 151    </p><p>
 152     In the general case where more than one field needs to be used to identify
 153     a row, use an array of rows.  For example, if we needed to track fields
 154     <code class="structfield">f1</code> and <code class="structfield">f2</code>:
 155
 156 </p><pre class="programlisting">
 157 WITH RECURSIVE search_tree(id, link, data, <span class="emphasis"><strong>path</strong></span>) AS (
 158     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>ARRAY[ROW(t.f1, t.f2)]</strong></span>
 159     FROM tree t
 160   UNION ALL
 161     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>path || ROW(t.f1, t.f2)</strong></span>
 162     FROM tree t, search_tree st
 163     WHERE t.id = st.link
 164 )
 165 SELECT * FROM search_tree <span class="emphasis"><strong>ORDER BY path</strong></span>;
 166 </pre><p>
 167    </p><div class="tip"><h3 class="title">Tip</h3><p>
 168      Omit the <code class="literal">ROW()</code> syntax in the common case where only one
 169      field needs to be tracked.  This allows a simple array rather than a
 170      composite-type array to be used, gaining efficiency.
 171     </p></div><p>
 172     To create a breadth-first order, you can add a column that tracks the depth
 173     of the search, for example:
 174
 175 </p><pre class="programlisting">
 176 WITH RECURSIVE search_tree(id, link, data, <span class="emphasis"><strong>depth</strong></span>) AS (
 177     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>0</strong></span>
 178     FROM tree t
 179   UNION ALL
 180     SELECT t.id, t.link, t.data, <span class="emphasis"><strong>depth + 1</strong></span>
 181     FROM tree t, search_tree st
 182     WHERE t.id = st.link
 183 )
 184 SELECT * FROM search_tree <span class="emphasis"><strong>ORDER BY depth</strong></span>;
 185 </pre><p>
 186
 187     To get a stable sort, add data columns as secondary sorting columns.
 188    </p><div class="tip"><h3 class="title">Tip</h3><p>
 189      The recursive query evaluation algorithm produces its output in
 190      breadth-first search order.  However, this is an implementation detail and
 191      it is perhaps unsound to rely on it.  The order of the rows within each
 192      level is certainly undefined, so some explicit ordering might be desired
 193      in any case.
 194     </p></div><p>
 195     There is built-in syntax to compute a depth- or breadth-first sort column.
 196     For example:
 197
 198 </p><pre class="programlisting">
 199 WITH RECURSIVE search_tree(id, link, data) AS (
 200     SELECT t.id, t.link, t.data
 201     FROM tree t
 202   UNION ALL
 203     SELECT t.id, t.link, t.data
 204     FROM tree t, search_tree st
 205     WHERE t.id = st.link
 206 ) <span class="emphasis"><strong>SEARCH DEPTH FIRST BY id SET ordercol</strong></span>
 207 SELECT * FROM search_tree ORDER BY ordercol;
 208
 209 WITH RECURSIVE search_tree(id, link, data) AS (
 210     SELECT t.id, t.link, t.data
 211     FROM tree t
 212   UNION ALL
 213     SELECT t.id, t.link, t.data
 214     FROM tree t, search_tree st
 215     WHERE t.id = st.link
 216 ) <span class="emphasis"><strong>SEARCH BREADTH FIRST BY id SET ordercol</strong></span>
 217 SELECT * FROM search_tree ORDER BY ordercol;
 218 </pre><p>
 219     This syntax is internally expanded to something similar to the above
 220     hand-written forms.  The <code class="literal">SEARCH</code> clause specifies whether
 221     depth- or breadth first search is wanted, the list of columns to track for
 222     sorting, and a column name that will contain the result data that can be
 223     used for sorting.  That column will implicitly be added to the output rows
 224     of the CTE.
 225    </p></div><div class="sect3" id="QUERIES-WITH-CYCLE"><div class="titlepage"><div><div><h4 class="title">7.8.2.2. Cycle Detection <a href="#QUERIES-WITH-CYCLE" class="id_link">#</a></h4></div></div></div><p>
 226    When working with recursive queries it is important to be sure that
 227    the recursive part of the query will eventually return no tuples,
 228    or else the query will loop indefinitely.  Sometimes, using
 229    <code class="literal">UNION</code> instead of <code class="literal">UNION ALL</code> can accomplish this
 230    by discarding rows that duplicate previous output rows.  However, often a
 231    cycle does not involve output rows that are completely duplicate: it may be
 232    necessary to check just one or a few fields to see if the same point has
 233    been reached before.  The standard method for handling such situations is
 234    to compute an array of the already-visited values.  For example, consider again
 235    the following query that searches a table <code class="structname">graph</code> using a
 236    <code class="structfield">link</code> field:
 237
 238 </p><pre class="programlisting">
 239 WITH RECURSIVE search_graph(id, link, data, depth) AS (
 240     SELECT g.id, g.link, g.data, 0
 241     FROM graph g
 242   UNION ALL
 243     SELECT g.id, g.link, g.data, sg.depth + 1
 244     FROM graph g, search_graph sg
 245     WHERE g.id = sg.link
 246 )
 247 SELECT * FROM search_graph;
 248 </pre><p>
 249
 250    This query will loop if the <code class="structfield">link</code> relationships contain
 251    cycles.  Because we require a <span class="quote">“<span class="quote">depth</span>”</span> output, just changing
 252    <code class="literal">UNION ALL</code> to <code class="literal">UNION</code> would not eliminate the looping.
 253    Instead we need to recognize whether we have reached the same row again
 254    while following a particular path of links.  We add two columns
 255    <code class="structfield">is_cycle</code> and <code class="structfield">path</code> to the loop-prone query:
 256
 257 </p><pre class="programlisting">
 258 WITH RECURSIVE search_graph(id, link, data, depth, <span class="emphasis"><strong>is_cycle, path</strong></span>) AS (
 259     SELECT g.id, g.link, g.data, 0,
 260       <span class="emphasis"><strong>false,
 261       ARRAY[g.id]</strong></span>
 262     FROM graph g
 263   UNION ALL
 264     SELECT g.id, g.link, g.data, sg.depth + 1,
 265       <span class="emphasis"><strong>g.id = ANY(path),
 266       path || g.id</strong></span>
 267     FROM graph g, search_graph sg
 268     WHERE g.id = sg.link <span class="emphasis"><strong>AND NOT is_cycle</strong></span>
 269 )
 270 SELECT * FROM search_graph;
 271 </pre><p>
 272
 273    Aside from preventing cycles, the array value is often useful in its own
 274    right as representing the <span class="quote">“<span class="quote">path</span>”</span> taken to reach any particular row.
 275   </p><p>
 276    In the general case where more than one field needs to be checked to
 277    recognize a cycle, use an array of rows.  For example, if we needed to
 278    compare fields <code class="structfield">f1</code> and <code class="structfield">f2</code>:
 279
 280 </p><pre class="programlisting">
 281 WITH RECURSIVE search_graph(id, link, data, depth, <span class="emphasis"><strong>is_cycle, path</strong></span>) AS (
 282     SELECT g.id, g.link, g.data, 0,
 283       <span class="emphasis"><strong>false,
 284       ARRAY[ROW(g.f1, g.f2)]</strong></span>
 285     FROM graph g
 286   UNION ALL
 287     SELECT g.id, g.link, g.data, sg.depth + 1,
 288       <span class="emphasis"><strong>ROW(g.f1, g.f2) = ANY(path),
 289       path || ROW(g.f1, g.f2)</strong></span>
 290     FROM graph g, search_graph sg
 291     WHERE g.id = sg.link <span class="emphasis"><strong>AND NOT is_cycle</strong></span>
 292 )
 293 SELECT * FROM search_graph;
 294 </pre><p>
 295   </p><div class="tip"><h3 class="title">Tip</h3><p>
 296     Omit the <code class="literal">ROW()</code> syntax in the common case where only one field
 297     needs to be checked to recognize a cycle.  This allows a simple array
 298     rather than a composite-type array to be used, gaining efficiency.
 299    </p></div><p>
 300    There is built-in syntax to simplify cycle detection.  The above query can
 301    also be written like this:
 302 </p><pre class="programlisting">
 303 WITH RECURSIVE search_graph(id, link, data, depth) AS (
 304     SELECT g.id, g.link, g.data, 1
 305     FROM graph g
 306   UNION ALL
 307     SELECT g.id, g.link, g.data, sg.depth + 1
 308     FROM graph g, search_graph sg
 309     WHERE g.id = sg.link
 310 ) <span class="emphasis"><strong>CYCLE id SET is_cycle USING path</strong></span>
 311 SELECT * FROM search_graph;
 312 </pre><p>
 313    and it will be internally rewritten to the above form.  The
 314    <code class="literal">CYCLE</code> clause specifies first the list of columns to
 315    track for cycle detection, then a column name that will show whether a
 316    cycle has been detected, and finally the name of another column that will track the
 317    path.  The cycle and path columns will implicitly be added to the output
 318    rows of the CTE.
 319   </p><div class="tip"><h3 class="title">Tip</h3><p>
 320     The cycle path column is computed in the same way as the depth-first
 321     ordering column show in the previous section.  A query can have both a
 322     <code class="literal">SEARCH</code> and a <code class="literal">CYCLE</code> clause, but a
 323     depth-first search specification and a cycle detection specification would
 324     create redundant computations, so it's more efficient to just use the
 325     <code class="literal">CYCLE</code> clause and order by the path column.  If
 326     breadth-first ordering is wanted, then specifying both
 327     <code class="literal">SEARCH</code> and <code class="literal">CYCLE</code> can be useful.
 328    </p></div><p>
 329    A helpful trick for testing queries
 330    when you are not certain if they might loop is to place a <code class="literal">LIMIT</code>
 331    in the parent query.  For example, this query would loop forever without
 332    the <code class="literal">LIMIT</code>:
 333
 334 </p><pre class="programlisting">
 335 WITH RECURSIVE t(n) AS (
 336     SELECT 1
 337   UNION ALL
 338     SELECT n+1 FROM t
 339 )
 340 SELECT n FROM t <span class="emphasis"><strong>LIMIT 100</strong></span>;
 341 </pre><p>
 342
 343    This works because <span class="productname">PostgreSQL</span>'s implementation
 344    evaluates only as many rows of a <code class="literal">WITH</code> query as are actually
 345    fetched by the parent query.  Using this trick in production is not
 346    recommended, because other systems might work differently.  Also, it
 347    usually won't work if you make the outer query sort the recursive query's
 348    results or join them to some other table, because in such cases the
 349    outer query will usually try to fetch all of the <code class="literal">WITH</code> query's
 350    output anyway.
 351   </p></div></div><div class="sect2" id="QUERIES-WITH-CTE-MATERIALIZATION"><div class="titlepage"><div><div><h3 class="title">7.8.3. Common Table Expression Materialization <a href="#QUERIES-WITH-CTE-MATERIALIZATION" class="id_link">#</a></h3></div></div></div><p>
 352    A useful property of <code class="literal">WITH</code> queries is that they are
 353    normally evaluated only once per execution of the parent query, even if
 354    they are referred to more than once by the parent query or
 355    sibling <code class="literal">WITH</code> queries.
 356    Thus, expensive calculations that are needed in multiple places can be
 357    placed within a <code class="literal">WITH</code> query to avoid redundant work.  Another
 358    possible application is to prevent unwanted multiple evaluations of
 359    functions with side-effects.
 360    However, the other side of this coin is that the optimizer is not able to
 361    push restrictions from the parent query down into a multiply-referenced
 362    <code class="literal">WITH</code> query, since that might affect all uses of the
 363    <code class="literal">WITH</code> query's output when it should affect only one.
 364    The multiply-referenced <code class="literal">WITH</code> query will be
 365    evaluated as written, without suppression of rows that the parent query
 366    might discard afterwards.  (But, as mentioned above, evaluation might stop
 367    early if the reference(s) to the query demand only a limited number of
 368    rows.)
 369   </p><p>
 370    However, if a <code class="literal">WITH</code> query is non-recursive and
 371    side-effect-free (that is, it is a <code class="literal">SELECT</code> containing
 372    no volatile functions) then it can be folded into the parent query,
 373    allowing joint optimization of the two query levels.  By default, this
 374    happens if the parent query references the <code class="literal">WITH</code> query
 375    just once, but not if it references the <code class="literal">WITH</code> query
 376    more than once.  You can override that decision by
 377    specifying <code class="literal">MATERIALIZED</code> to force separate calculation
 378    of the <code class="literal">WITH</code> query, or by specifying <code class="literal">NOT
 379    MATERIALIZED</code> to force it to be merged into the parent query.
 380    The latter choice risks duplicate computation of
 381    the <code class="literal">WITH</code> query, but it can still give a net savings if
 382    each usage of the <code class="literal">WITH</code> query needs only a small part
 383    of the <code class="literal">WITH</code> query's full output.
 384   </p><p>
 385    A simple example of these rules is
 386 </p><pre class="programlisting">
 387 WITH w AS (
 388     SELECT * FROM big_table
 389 )
 390 SELECT * FROM w WHERE key = 123;
 391 </pre><p>
 392    This <code class="literal">WITH</code> query will be folded, producing the same
 393    execution plan as
 394 </p><pre class="programlisting">
 395 SELECT * FROM big_table WHERE key = 123;
 396 </pre><p>
 397    In particular, if there's an index on <code class="structfield">key</code>,
 398    it will probably be used to fetch just the rows having <code class="literal">key =
 399    123</code>.  On the other hand, in
 400 </p><pre class="programlisting">
 401 WITH w AS (
 402     SELECT * FROM big_table
 403 )
 404 SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref
 405 WHERE w2.key = 123;
 406 </pre><p>
 407    the <code class="literal">WITH</code> query will be materialized, producing a
 408    temporary copy of <code class="structname">big_table</code> that is then
 409    joined with itself — without benefit of any index.  This query
 410    will be executed much more efficiently if written as
 411 </p><pre class="programlisting">
 412 WITH w AS NOT MATERIALIZED (
 413     SELECT * FROM big_table
 414 )
 415 SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref
 416 WHERE w2.key = 123;
 417 </pre><p>
 418    so that the parent query's restrictions can be applied directly
 419    to scans of <code class="structname">big_table</code>.
 420   </p><p>
 421    An example where <code class="literal">NOT MATERIALIZED</code> could be
 422    undesirable is
 423 </p><pre class="programlisting">
 424 WITH w AS (
 425     SELECT key, very_expensive_function(val) as f FROM some_table
 426 )
 427 SELECT * FROM w AS w1 JOIN w AS w2 ON w1.f = w2.f;
 428 </pre><p>
 429    Here, materialization of the <code class="literal">WITH</code> query ensures
 430    that <code class="function">very_expensive_function</code> is evaluated only
 431    once per table row, not twice.
 432   </p><p>
 433    The examples above only show <code class="literal">WITH</code> being used with
 434    <code class="command">SELECT</code>, but it can be attached in the same way to
 435    <code class="command">INSERT</code>, <code class="command">UPDATE</code>,
 436    <code class="command">DELETE</code>, or <code class="command">MERGE</code>.
 437    In each case it effectively provides temporary table(s) that can
 438    be referred to in the main command.
 439   </p></div><div class="sect2" id="QUERIES-WITH-MODIFYING"><div class="titlepage"><div><div><h3 class="title">7.8.4. Data-Modifying Statements in <code class="literal">WITH</code> <a href="#QUERIES-WITH-MODIFYING" class="id_link">#</a></h3></div></div></div><p>
 440     You can use data-modifying statements (<code class="command">INSERT</code>,
 441     <code class="command">UPDATE</code>, <code class="command">DELETE</code>, or
 442     <code class="command">MERGE</code>) in <code class="literal">WITH</code>.  This
 443     allows you to perform several different operations in the same query.
 444     An example is:
 445
 446 </p><pre class="programlisting">
 447 WITH moved_rows AS (
 448     DELETE FROM products
 449     WHERE
 450         "date" &gt;= '2010-10-01' AND
 451         "date" &lt; '2010-11-01'
 452     RETURNING *
 453 )
 454 INSERT INTO products_log
 455 SELECT * FROM moved_rows;
 456 </pre><p>
 457
 458     This query effectively moves rows from <code class="structname">products</code> to
 459     <code class="structname">products_log</code>.  The <code class="command">DELETE</code> in <code class="literal">WITH</code>
 460     deletes the specified rows from <code class="structname">products</code>, returning their
 461     contents by means of its <code class="literal">RETURNING</code> clause; and then the
 462     primary query reads that output and inserts it into
 463     <code class="structname">products_log</code>.
 464    </p><p>
 465     A fine point of the above example is that the <code class="literal">WITH</code> clause is
 466     attached to the <code class="command">INSERT</code>, not the sub-<code class="command">SELECT</code> within
 467     the <code class="command">INSERT</code>.  This is necessary because data-modifying
 468     statements are only allowed in <code class="literal">WITH</code> clauses that are attached
 469     to the top-level statement.  However, normal <code class="literal">WITH</code> visibility
 470     rules apply, so it is possible to refer to the <code class="literal">WITH</code>
 471     statement's output from the sub-<code class="command">SELECT</code>.
 472    </p><p>
 473     Data-modifying statements in <code class="literal">WITH</code> usually have
 474     <code class="literal">RETURNING</code> clauses (see <a class="xref" href="dml-returning.html" title="6.4. Returning Data from Modified Rows">Section 6.4</a>),
 475     as shown in the example above.
 476     It is the output of the <code class="literal">RETURNING</code> clause, <span class="emphasis"><em>not</em></span> the
 477     target table of the data-modifying statement, that forms the temporary
 478     table that can be referred to by the rest of the query.  If a
 479     data-modifying statement in <code class="literal">WITH</code> lacks a <code class="literal">RETURNING</code>
 480     clause, then it forms no temporary table and cannot be referred to in
 481     the rest of the query.  Such a statement will be executed nonetheless.
 482     A not-particularly-useful example is:
 483
 484 </p><pre class="programlisting">
 485 WITH t AS (
 486     DELETE FROM foo
 487 )
 488 DELETE FROM bar;
 489 </pre><p>
 490
 491     This example would remove all rows from tables <code class="structname">foo</code> and
 492     <code class="structname">bar</code>.  The number of affected rows reported to the client
 493     would only include rows removed from <code class="structname">bar</code>.
 494    </p><p>
 495     Recursive self-references in data-modifying statements are not
 496     allowed.  In some cases it is possible to work around this limitation by
 497     referring to the output of a recursive <code class="literal">WITH</code>, for example:
 498
 499 </p><pre class="programlisting">
 500 WITH RECURSIVE included_parts(sub_part, part) AS (
 501     SELECT sub_part, part FROM parts WHERE part = 'our_product'
 502   UNION ALL
 503     SELECT p.sub_part, p.part
 504     FROM included_parts pr, parts p
 505     WHERE p.part = pr.sub_part
 506 )
 507 DELETE FROM parts
 508   WHERE part IN (SELECT part FROM included_parts);
 509 </pre><p>
 510
 511     This query would remove all direct and indirect subparts of a product.
 512    </p><p>
 513     Data-modifying statements in <code class="literal">WITH</code> are executed exactly once,
 514     and always to completion, independently of whether the primary query
 515     reads all (or indeed any) of their output.  Notice that this is different
 516     from the rule for <code class="command">SELECT</code> in <code class="literal">WITH</code>: as stated in the
 517     previous section, execution of a <code class="command">SELECT</code> is carried only as far
 518     as the primary query demands its output.
 519    </p><p>
 520     The sub-statements in <code class="literal">WITH</code> are executed concurrently with
 521     each other and with the main query.  Therefore, when using data-modifying
 522     statements in <code class="literal">WITH</code>, the order in which the specified updates
 523     actually happen is unpredictable.  All the statements are executed with
 524     the same <em class="firstterm">snapshot</em> (see <a class="xref" href="mvcc.html" title="Chapter 13. Concurrency Control">Chapter 13</a>), so they
 525     cannot <span class="quote">“<span class="quote">see</span>”</span> one another's effects on the target tables.  This
 526     alleviates the effects of the unpredictability of the actual order of row
 527     updates, and means that <code class="literal">RETURNING</code> data is the only way to
 528     communicate changes between different <code class="literal">WITH</code> sub-statements and
 529     the main query.  An example of this is that in
 530
 531 </p><pre class="programlisting">
 532 WITH t AS (
 533     UPDATE products SET price = price * 1.05
 534     RETURNING *
 535 )
 536 SELECT * FROM products;
 537 </pre><p>
 538
 539     the outer <code class="command">SELECT</code> would return the original prices before the
 540     action of the <code class="command">UPDATE</code>, while in
 541
 542 </p><pre class="programlisting">
 543 WITH t AS (
 544     UPDATE products SET price = price * 1.05
 545     RETURNING *
 546 )
 547 SELECT * FROM t;
 548 </pre><p>
 549
 550     the outer <code class="command">SELECT</code> would return the updated data.
 551    </p><p>
 552     Trying to update the same row twice in a single statement is not
 553     supported.  Only one of the modifications takes place, but it is not easy
 554     (and sometimes not possible) to reliably predict which one.  This also
 555     applies to deleting a row that was already updated in the same statement:
 556     only the update is performed.  Therefore you should generally avoid trying
 557     to modify a single row twice in a single statement.  In particular avoid
 558     writing <code class="literal">WITH</code> sub-statements that could affect the same rows
 559     changed by the main statement or a sibling sub-statement.  The effects
 560     of such a statement will not be predictable.
 561    </p><p>
 562     At present, any table used as the target of a data-modifying statement in
 563     <code class="literal">WITH</code> must not have a conditional rule, nor an <code class="literal">ALSO</code>
 564     rule, nor an <code class="literal">INSTEAD</code> rule that expands to multiple statements.
 565    </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="queries-values.html" title="7.7. VALUES Lists">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="queries.html" title="Chapter 7. Queries">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="datatype.html" title="Chapter 8. Data Types">Next</a></td></tr><tr><td width="40%" align="left" valign="top">7.7. <code class="literal">VALUES</code> Lists </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Chapter 8. Data Types</td></tr></table></div></body></html>