<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>59.1. Sampling Method Support Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="tablesample-method.html" title="Chapter 59. Writing a Table Sampling Method" /><link rel="next" href="custom-scan.html" title="Chapter 60. Writing a Custom Scan Provider" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">59.1. Sampling Method Support Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="tablesample-method.html" title="Chapter 59. Writing a Table Sampling Method">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="tablesample-method.html" title="Chapter 59. Writing a Table Sampling Method">Up</a></td><th width="60%" align="center">Chapter 59. Writing a Table Sampling Method</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="custom-scan.html" title="Chapter 60. Writing a Custom Scan Provider">Next</a></td></tr></table><hr /></div><div class="sect1" id="TABLESAMPLE-SUPPORT-FUNCTIONS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">59.1. Sampling Method Support Functions <a href="#TABLESAMPLE-SUPPORT-FUNCTIONS" class="id_link">#</a></h2></div></div></div><p>
The TSM handler function returns a palloc'd <code class="type">TsmRoutine</code> struct
containing pointers to the support functions described below. Most of
the functions are required, but some are optional, and those pointers can
be NULL.
</p><pre class="programlisting">
void
SampleScanGetSampleSize (PlannerInfo *root,
                         RelOptInfo *baserel,
                         List *paramexprs,
                         BlockNumber *pages,
                         double *tuples);
</pre><p>
This function is called during planning. It must estimate the number of
relation pages that will be read during a sample scan, and the number of
tuples that will be selected by the scan. (For example, these might be
determined by estimating the sampling fraction, and then multiplying
the <code class="literal">baserel->pages</code> and <code class="literal">baserel->tuples</code>
numbers by that fraction, being sure to round the results to integral values.)
The <code class="literal">paramexprs</code> list holds the expression(s) that are
parameters to the <code class="literal">TABLESAMPLE</code> clause. It is recommended to
use <code class="function">estimate_expression_value()</code> to try to reduce these
expressions to constants, if their values are needed for estimation
purposes; but the function must provide size estimates even if they cannot
be reduced, and it should not fail even if the values appear invalid
(remember that they're only estimates of what the run-time values will be).
The <code class="literal">pages</code> and <code class="literal">tuples</code> parameters are outputs.
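</p><p>
The estimation step described above can be sketched as follows. This is a
stand-alone illustration and not code from any built-in sampling method:
<code class="literal">DemoBlockNumber</code>, <code class="literal">DEMO_DEFAULT_FRACTION</code>
and the <code class="literal">demo_*</code> functions are hypothetical stand-ins, and the
fallback value is an arbitrary choice made for the example.
</p><pre class="programlisting">
/*
 * Stand-alone sketch. DemoBlockNumber, DEMO_DEFAULT_FRACTION and the
 * demo_* functions are hypothetical stand-ins, not PostgreSQL declarations.
 */
typedef unsigned int DemoBlockNumber;

/* Arbitrary fallback used when no usable estimate is available. */
#define DEMO_DEFAULT_FRACTION 0.1

/*
 * Return a fraction that is safe to use for estimation. "known" is zero when
 * the parameter could not be reduced to a constant. Out-of-range and NaN
 * values fall back to the default instead of raising an error.
 */
static double
demo_usable_fraction(int known, double fraction)
{
    if (!known || !(fraction >= 0.0 && fraction <= 1.0))
        return DEMO_DEFAULT_FRACTION;
    return fraction;
}

/* Round a non-negative value to the nearest integral value. */
static double
demo_round(double value)
{
    return (double) (unsigned long long) (value + 0.5);
}

/* Scale the relation-wide counts and store them in the output parameters. */
static void
demo_get_sample_size(DemoBlockNumber rel_pages, double rel_tuples,
                     int known, double fraction,
                     DemoBlockNumber *pages, double *tuples)
{
    double      f = demo_usable_fraction(known, fraction);

    *pages = (DemoBlockNumber) demo_round(rel_pages * f);
    *tuples = demo_round(rel_tuples * f);
}
</pre><p>
With 1000 pages, 50000 tuples and a known fraction of 0.25, this sketch
reports 250 pages and 12500 tuples. An unknown or invalid fraction falls
back to the default, so the function still produces an estimate rather
than failing.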
</p><pre class="programlisting">
void
InitSampleScan (SampleScanState *node,
                int eflags);
</pre><p>
Initialize for execution of a SampleScan plan node.
This is called during executor startup.
It should perform any initialization needed before processing can start.
The <code class="structname">SampleScanState</code> node has already been created, but
its <code class="structfield">tsm_state</code> field is NULL.
The <code class="function">InitSampleScan</code> function can palloc whatever internal
state data is needed by the sampling method, and store a pointer to
it in <code class="literal">node->tsm_state</code>.
Information about the table to scan is accessible through other fields
of the <code class="structname">SampleScanState</code> node (but note that the
<code class="literal">node->ss.ss_currentScanDesc</code> scan descriptor is not set
up yet).
<code class="literal">eflags</code> contains flag bits describing the executor's
operating mode for this plan node.
</p><p>
When <code class="literal">(eflags & EXEC_FLAG_EXPLAIN_ONLY)</code> is true,
the scan will not actually be performed, so this function should only do
the minimum required to make the node state valid for <code class="command">EXPLAIN</code>
and <code class="function">EndSampleScan</code>.
</p><p>
This function can be omitted (set the pointer to NULL), in which case
<code class="function">BeginSampleScan</code> must perform all initialization needed
by the sampling method.
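</p><p>
A minimal sketch of this initialization pattern is shown below, under stated
assumptions. All <code class="literal">Demo*</code> and <code class="literal">DEMO_*</code> names are
hypothetical stand-ins for the real executor types, the flag value is assumed
for the example, and a single static slot stands in for palloc'd memory.
Allocating the state even in the <code class="command">EXPLAIN</code>-only case is one
possible way to keep the node valid, not a requirement.
</p><pre class="programlisting">
/*
 * Stand-alone sketch. Every Demo* and DEMO_* name is a hypothetical stand-in;
 * the flag value is assumed for the demo and not taken from the headers.
 */
#define DEMO_EXEC_FLAG_EXPLAIN_ONLY 0x0001

typedef struct DemoSamplerState
{
    unsigned int next_block;    /* scan position, meaningful once ready */
    int          ready;         /* set only when a real scan will run */
} DemoSamplerState;

typedef struct DemoScanNode
{
    void       *tsm_state;      /* NULL until the init function runs */
} DemoScanNode;

/* Stand-in for a zeroing allocator: hands out one static slot. */
static DemoSamplerState demo_slot[1];

static void *
demo_alloc_state(void)
{
    demo_slot[0].next_block = 0;
    demo_slot[0].ready = 0;
    return demo_slot;
}

static void
demo_init_sample_scan(DemoScanNode *node, int eflags)
{
    DemoSamplerState *state = demo_alloc_state();

    /* Always leave a valid state pointer behind for the end function. */
    node->tsm_state = state;

    /* For EXPLAIN without execution, stop after the minimum setup. */
    if (eflags & DEMO_EXEC_FLAG_EXPLAIN_ONLY)
        return;

    /* Work that only matters when tuples will actually be fetched. */
    state->ready = 1;
}
</pre><p>
In this sketch the state pointer is set in both modes, while the
scan-specific preparation is skipped when only an explanation is requested.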
</p><pre class="programlisting">
void
BeginSampleScan (SampleScanState *node,
                 Datum *params,
                 int nparams,
                 uint32 seed);
</pre><p>
Begin execution of a sampling scan.
This is called just before the first attempt to fetch a tuple, and
may be called again if the scan needs to be restarted.
Information about the table to scan is accessible through fields
of the <code class="structname">SampleScanState</code> node (but note that the
<code class="literal">node->ss.ss_currentScanDesc</code> scan descriptor is not set
up yet).
The <code class="literal">params</code> array, of length <code class="literal">nparams</code>, contains the
values of the parameters supplied in the <code class="literal">TABLESAMPLE</code> clause.
These will have the number and types specified in the sampling
method's <code class="literal">parameterTypes</code> list, and have been checked
to be not null.
<code class="literal">seed</code> contains a seed to use for any random numbers generated
within the sampling method; it is either a hash derived from the
<code class="literal">REPEATABLE</code> value if one was given, or the result
of <code class="literal">random()</code> if not.
</p><p>
This function may adjust the fields <code class="literal">node->use_bulkread</code>
and <code class="literal">node->use_pagemode</code>.
If <code class="literal">node->use_bulkread</code> is <code class="literal">true</code>, which it is by
default, the scan will use a buffer access strategy that encourages
recycling buffers after use. It might be reasonable to set this
to <code class="literal">false</code> if the scan will visit only a small fraction of the
table's pages.
If <code class="literal">node->use_pagemode</code> is <code class="literal">true</code>, which it is by
default, the scan will perform visibility checking in a single pass for
all tuples on each visited page. It might be reasonable to set this
to <code class="literal">false</code> if the scan will select only a small fraction of the
tuples on each visited page. That will result in fewer tuple visibility
checks being performed, though each one will be more expensive because it
will require more locking.
</p><p>
If the sampling method is
marked <code class="literal">repeatable_across_scans</code>, it must be able to
select the same set of tuples during a rescan as it did originally; that is,
a fresh call of <code class="function">BeginSampleScan</code> must lead to selecting the
same tuples as before (if the <code class="literal">TABLESAMPLE</code> parameters
and seed don't change).
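</p><p>
The sketch below illustrates two of the points above for a hypothetical
tuple-level sampler: resetting all random state from the seed on every begin
call, so that a rescan repeats the same choices, and adjusting the page-mode
flag from the sampling fraction. The <code class="literal">Demo*</code> types, the
xorshift generator and the threshold value are assumptions made for the
example, not PostgreSQL definitions.
</p><pre class="programlisting">
/*
 * Stand-alone sketch. The Demo* types, the generator and the threshold are
 * hypothetical choices for illustration only.
 */
#define DEMO_PAGEMODE_THRESHOLD 0.25    /* arbitrary demo value */

typedef struct DemoSamplerState
{
    unsigned int rng;           /* generator state, never zero */
    double       fraction;      /* fraction of tuples to select */
} DemoSamplerState;

typedef struct DemoScanNode
{
    int         use_bulkread;   /* nonzero by default */
    int         use_pagemode;   /* nonzero by default */
    void       *tsm_state;
} DemoScanNode;

/* Small deterministic generator (32-bit xorshift), used only for the demo. */
static unsigned int
demo_next_random(DemoSamplerState *state)
{
    unsigned int x = state->rng;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    state->rng = x;
    return x;
}

static void
demo_begin_sample_scan(DemoScanNode *node, double fraction, unsigned int seed)
{
    DemoSamplerState *state = (DemoSamplerState *) node->tsm_state;

    /* Reset everything derived from the seed, so a rescan repeats itself. */
    state->fraction = fraction;
    state->rng = (seed != 0) ? seed : 0x9E3779B9u;

    /* This sampler visits every page, so bulk reads stay enabled. */
    node->use_bulkread = 1;

    /* Check tuples one at a time when few are selected per page. */
    node->use_pagemode = (fraction >= DEMO_PAGEMODE_THRESHOLD);
}
</pre><p>
Calling <code class="literal">demo_begin_sample_scan</code> twice with the same fraction
and seed makes <code class="literal">demo_next_random</code> produce the same sequence both
times, which is the property a method marked
<code class="literal">repeatable_across_scans</code> has to provide.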
</p><pre class="programlisting">
BlockNumber
NextSampleBlock (SampleScanState *node, BlockNumber nblocks);
</pre><p>
Returns the block number of the next page to be scanned, or
<code class="literal">InvalidBlockNumber</code> if no pages remain to be scanned.
</p><p>
This function can be omitted (set the pointer to NULL), in which case
the core code will perform a sequential scan of the entire relation.
Such a scan can use synchronized scanning, so that the sampling method
cannot assume that the relation pages are visited in the same order on
each scan.
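</p><p>
When the function is provided, a block-level sampler might look roughly like
the following sketch. It walks the block numbers in ascending order and
selects each one with a fixed probability. The <code class="literal">Demo*</code> names,
the <code class="literal">DEMO_INVALID_BLOCK</code> sentinel and the xorshift generator are
hypothetical stand-ins chosen for the example.
</p><pre class="programlisting">
/*
 * Stand-alone sketch. The Demo* names, the sentinel and the generator are
 * hypothetical stand-ins, not PostgreSQL declarations.
 */
typedef unsigned int DemoBlockNumber;

#define DEMO_INVALID_BLOCK ((DemoBlockNumber) 0xFFFFFFFF)

typedef struct DemoBlockSampler
{
    DemoBlockNumber next_block; /* first block not yet considered */
    unsigned int    rng;        /* generator state, never zero */
    unsigned int    cutoff;     /* select a block when the draw is below this */
} DemoBlockSampler;

/* Small deterministic generator (32-bit xorshift), used only for the demo. */
static unsigned int
demo_draw(DemoBlockSampler *s)
{
    unsigned int x = s->rng;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    s->rng = x;
    return x;
}

/* Return the next selected block, or the sentinel when none remain. */
static DemoBlockNumber
demo_next_sample_block(DemoBlockSampler *s, DemoBlockNumber nblocks)
{
    while (s->next_block < nblocks)
    {
        DemoBlockNumber candidate = s->next_block++;

        if (demo_draw(s) < s->cutoff)
            return candidate;
    }
    return DEMO_INVALID_BLOCK;
}
</pre><p>
A cutoff of zero selects no blocks at all, while a cutoff close to the
largest unsigned value selects nearly every block. Each block is considered
exactly once, so repeated calls eventually return the sentinel.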
</p><pre class="programlisting">
OffsetNumber
NextSampleTuple (SampleScanState *node,
                 BlockNumber blockno,
                 OffsetNumber maxoffset);
</pre><p>
Returns the offset number of the next tuple to be sampled on the
specified page, or <code class="literal">InvalidOffsetNumber</code> if no tuples remain to
be sampled. <code class="literal">maxoffset</code> is the largest offset number in use
on the page.
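</p><p>
The per-page iteration can be sketched as follows, assuming a sampler that
selects each offset independently with a fixed probability. The
<code class="literal">Demo*</code> names, the two offset constants and the xorshift
generator are hypothetical stand-ins chosen for the example.
</p><pre class="programlisting">
/*
 * Stand-alone sketch. The Demo* names, constants and generator are
 * hypothetical stand-ins, not PostgreSQL declarations.
 */
typedef unsigned short DemoOffsetNumber;

#define DEMO_INVALID_OFFSET ((DemoOffsetNumber) 0)
#define DEMO_FIRST_OFFSET   ((DemoOffsetNumber) 1)

typedef struct DemoTupleSampler
{
    DemoOffsetNumber last_offset;   /* last offset returned on this page */
    unsigned int     rng;           /* generator state, never zero */
    unsigned int     cutoff;        /* select when the draw is below this */
} DemoTupleSampler;

/* Small deterministic generator (32-bit xorshift), used only for the demo. */
static unsigned int
demo_draw(DemoTupleSampler *s)
{
    unsigned int x = s->rng;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    s->rng = x;
    return x;
}

static DemoOffsetNumber
demo_next_sample_tuple(DemoTupleSampler *s, DemoOffsetNumber maxoffset)
{
    unsigned int off;

    /* Resume after the last offset returned, or start at the first one. */
    if (s->last_offset == DEMO_INVALID_OFFSET)
        off = DEMO_FIRST_OFFSET;
    else
        off = s->last_offset + 1u;

    for (; off <= maxoffset; off++)
    {
        if (demo_draw(s) < s->cutoff)
        {
            s->last_offset = (DemoOffsetNumber) off;
            return (DemoOffsetNumber) off;
        }
    }

    /* Page exhausted: reset so that the next page starts from the top. */
    s->last_offset = DEMO_INVALID_OFFSET;
    return DEMO_INVALID_OFFSET;
}
</pre><p>
The sketch never looks at whether an offset holds a valid tuple; it only
walks the range from the first offset to <code class="literal">maxoffset</code> and resets
its position once the page is exhausted.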
</p><div class="note"><h3 class="title">Note</h3><p>
<code class="function">NextSampleTuple</code> is not explicitly told which of the offset
numbers in the range <code class="literal">1 .. maxoffset</code> actually contain valid
tuples. This is not normally a problem since the core code ignores
requests to sample missing or invisible tuples; that should not result in
any bias in the sample. However, if necessary, the function can use
<code class="literal">node->donetuples</code> to examine how many of the tuples
it returned were valid and visible.
</p></div><div class="note"><h3 class="title">Note</h3><p>
<code class="function">NextSampleTuple</code> must <span class="emphasis"><em>not</em></span> assume
that <code class="literal">blockno</code> is the same page number returned by the most
recent <code class="function">NextSampleBlock</code> call. It was returned by some
previous <code class="function">NextSampleBlock</code> call, but the core code is allowed
to call <code class="function">NextSampleBlock</code> in advance of actually scanning
pages, so as to support prefetching. It is OK to assume that once
sampling of a given page begins, successive <code class="function">NextSampleTuple</code>
calls all refer to the same page until <code class="literal">InvalidOffsetNumber</code> is
returned.
</p></div><pre class="programlisting">
void
EndSampleScan (SampleScanState *node);
</pre><p>
End the scan and release resources. It is normally not important
to release palloc'd memory, but any externally-visible resources
should be cleaned up.
This function can be omitted (set the pointer to NULL) in the common
case where no such resources exist.
</p></div></body></html>