begriffs open source - ai-pg/blob - full-docs/txt/dynamic-trace.txt

   1
   2 27.5. Dynamic Tracing #
   3
   4    27.5.1. Compiling for Dynamic Tracing
   5    27.5.2. Built-in Probes
   6    27.5.3. Using Probes
   7    27.5.4. Defining New Probes
   8
   9    PostgreSQL provides facilities to support dynamic tracing of the
  10    database server. This allows an external utility to be called at
  11    specific points in the code and thereby trace execution.
  12
  13    A number of probes or trace points are already inserted into the source
  14    code. These probes are intended to be used by database developers and
  15    administrators. By default the probes are not compiled into PostgreSQL;
  16    the user needs to explicitly tell the configure script to make the
  17    probes available.
  18
  19    Currently, the DTrace utility is supported, which, at the time of this
  20    writing, is available on Solaris, macOS, FreeBSD, NetBSD, and Oracle
  21    Linux. The SystemTap project for Linux provides a DTrace equivalent and
  22    can also be used. Supporting other dynamic tracing utilities is
  23    theoretically possible by changing the definitions for the macros in
  24    src/include/utils/probes.h.
  25
  26 27.5.1. Compiling for Dynamic Tracing #
  27
  28    By default, probes are not available, so you will need to explicitly
  29    tell the configure script to make the probes available in PostgreSQL.
  30    To include DTrace support specify --enable-dtrace to configure. See
  31    Section 17.3.3.6 for further information.
  32
  33 27.5.2. Built-in Probes #
  34
  35    A number of standard probes are provided in the source code, as shown
  36    in Table 27.49; Table 27.50 shows the types used in the probes. More
  37    probes can certainly be added to enhance PostgreSQL's observability.
  38
  39    Table 27.49. Built-in DTrace Probes
  40    Name Parameters Description
  41    transaction-start (LocalTransactionId) Probe that fires at the start of
  42    a new transaction. arg0 is the transaction ID.
  43    transaction-commit (LocalTransactionId) Probe that fires when a
  44    transaction completes successfully. arg0 is the transaction ID.
  45    transaction-abort (LocalTransactionId) Probe that fires when a
  46    transaction completes unsuccessfully. arg0 is the transaction ID.
  47    query-start (const char *) Probe that fires when the processing of a
  48    query is started. arg0 is the query string.
  49    query-done (const char *) Probe that fires when the processing of a
  50    query is complete. arg0 is the query string.
  51    query-parse-start (const char *) Probe that fires when the parsing of a
  52    query is started. arg0 is the query string.
  53    query-parse-done (const char *) Probe that fires when the parsing of a
  54    query is complete. arg0 is the query string.
  55    query-rewrite-start (const char *) Probe that fires when the rewriting
  56    of a query is started. arg0 is the query string.
  57    query-rewrite-done (const char *) Probe that fires when the rewriting
  58    of a query is complete. arg0 is the query string.
  59    query-plan-start () Probe that fires when the planning of a query is
  60    started.
  61    query-plan-done () Probe that fires when the planning of a query is
  62    complete.
  63    query-execute-start () Probe that fires when the execution of a query
  64    is started.
  65    query-execute-done () Probe that fires when the execution of a query is
  66    complete.
  67    statement-status (const char *) Probe that fires anytime the server
  68    process updates its pg_stat_activity.status. arg0 is the new status
  69    string.
  70    checkpoint-start (int) Probe that fires when a checkpoint is started.
  71    arg0 holds the bitwise flags used to distinguish different checkpoint
  72    types, such as shutdown, immediate or force.
  73    checkpoint-done (int, int, int, int, int) Probe that fires when a
  74    checkpoint is complete. (The probes listed next fire in sequence during
  75    checkpoint processing.) arg0 is the number of buffers written. arg1 is
  76    the total number of buffers. arg2, arg3 and arg4 contain the number of
  77    WAL files added, removed and recycled respectively.
  78    clog-checkpoint-start (bool) Probe that fires when the CLOG portion of
  79    a checkpoint is started. arg0 is true for normal checkpoint, false for
  80    shutdown checkpoint.
  81    clog-checkpoint-done (bool) Probe that fires when the CLOG portion of a
  82    checkpoint is complete. arg0 has the same meaning as for
  83    clog-checkpoint-start.
  84    subtrans-checkpoint-start (bool) Probe that fires when the SUBTRANS
  85    portion of a checkpoint is started. arg0 is true for normal checkpoint,
  86    false for shutdown checkpoint.
  87    subtrans-checkpoint-done (bool) Probe that fires when the SUBTRANS
  88    portion of a checkpoint is complete. arg0 has the same meaning as for
  89    subtrans-checkpoint-start.
  90    multixact-checkpoint-start (bool) Probe that fires when the MultiXact
  91    portion of a checkpoint is started. arg0 is true for normal checkpoint,
  92    false for shutdown checkpoint.
  93    multixact-checkpoint-done (bool) Probe that fires when the MultiXact
  94    portion of a checkpoint is complete. arg0 has the same meaning as for
  95    multixact-checkpoint-start.
  96    buffer-checkpoint-start (int) Probe that fires when the buffer-writing
  97    portion of a checkpoint is started. arg0 holds the bitwise flags used
  98    to distinguish different checkpoint types, such as shutdown, immediate
  99    or force.
 100    buffer-sync-start (int, int) Probe that fires when we begin to write
 101    dirty buffers during checkpoint (after identifying which buffers must
 102    be written). arg0 is the total number of buffers. arg1 is the number
 103    that are currently dirty and need to be written.
 104    buffer-sync-written (int) Probe that fires after each buffer is written
 105    during checkpoint. arg0 is the ID number of the buffer.
 106    buffer-sync-done (int, int, int) Probe that fires when all dirty
 107    buffers have been written. arg0 is the total number of buffers. arg1 is
 108    the number of buffers actually written by the checkpoint process. arg2
 109    is the number that were expected to be written (arg1 of
 110    buffer-sync-start); any difference reflects other processes flushing
 111    buffers during the checkpoint.
 112    buffer-checkpoint-sync-start () Probe that fires after dirty buffers
 113    have been written to the kernel, and before starting to issue fsync
 114    requests.
 115    buffer-checkpoint-done () Probe that fires when syncing of buffers to
 116    disk is complete.
 117    twophase-checkpoint-start () Probe that fires when the two-phase
 118    portion of a checkpoint is started.
 119    twophase-checkpoint-done () Probe that fires when the two-phase portion
 120    of a checkpoint is complete.
 121    buffer-extend-start (ForkNumber, BlockNumber, Oid, Oid, Oid, int,
 122    unsigned int) Probe that fires when a relation extension starts. arg0
 123    contains the fork to be extended. arg1, arg2, and arg3 contain the
 124    tablespace, database, and relation OIDs identifying the relation. arg4
 125    is the ID of the backend which created the temporary relation for a
 126    local buffer, or INVALID_PROC_NUMBER (-1) for a shared buffer. arg5 is
 127    the number of blocks the caller would like to extend by.
 128    buffer-extend-done (ForkNumber, BlockNumber, Oid, Oid, Oid, int,
 129    unsigned int, BlockNumber) Probe that fires when a relation extension
 130    is complete. arg0 contains the fork to be extended. arg1, arg2, and
 131    arg3 contain the tablespace, database, and relation OIDs identifying
 132    the relation. arg4 is the ID of the backend which created the temporary
 133    relation for a local buffer, or INVALID_PROC_NUMBER (-1) for a shared
 134    buffer. arg5 is the number of blocks the relation was extended by, this
 135    can be less than the number in the buffer-extend-start due to resource
 136    constraints. arg6 contains the BlockNumber of the first new block.
 137    buffer-read-start (ForkNumber, BlockNumber, Oid, Oid, Oid, int) Probe
 138    that fires when a buffer read is started. arg0 and arg1 contain the
 139    fork and block numbers of the page. arg2, arg3, and arg4 contain the
 140    tablespace, database, and relation OIDs identifying the relation. arg5
 141    is the ID of the backend which created the temporary relation for a
 142    local buffer, or INVALID_PROC_NUMBER (-1) for a shared buffer.
 143    buffer-read-done (ForkNumber, BlockNumber, Oid, Oid, Oid, int, bool)
 144    Probe that fires when a buffer read is complete. arg0 and arg1 contain
 145    the fork and block numbers of the page. arg2, arg3, and arg4 contain
 146    the tablespace, database, and relation OIDs identifying the relation.
 147    arg5 is the ID of the backend which created the temporary relation for
 148    a local buffer, or INVALID_PROC_NUMBER (-1) for a shared buffer. arg6
 149    is true if the buffer was found in the pool, false if not.
 150    buffer-flush-start (ForkNumber, BlockNumber, Oid, Oid, Oid) Probe that
 151    fires before issuing any write request for a shared buffer. arg0 and
 152    arg1 contain the fork and block numbers of the page. arg2, arg3, and
 153    arg4 contain the tablespace, database, and relation OIDs identifying
 154    the relation.
 155    buffer-flush-done (ForkNumber, BlockNumber, Oid, Oid, Oid) Probe that
 156    fires when a write request is complete. (Note that this just reflects
 157    the time to pass the data to the kernel; it's typically not actually
 158    been written to disk yet.) The arguments are the same as for
 159    buffer-flush-start.
 160    wal-buffer-write-dirty-start () Probe that fires when a server process
 161    begins to write a dirty WAL buffer because no more WAL buffer space is
 162    available. (If this happens often, it implies that wal_buffers is too
 163    small.)
 164    wal-buffer-write-dirty-done () Probe that fires when a dirty WAL buffer
 165    write is complete.
 166    wal-insert (unsigned char, unsigned char) Probe that fires when a WAL
 167    record is inserted. arg0 is the resource manager (rmid) for the record.
 168    arg1 contains the info flags.
 169    wal-switch () Probe that fires when a WAL segment switch is requested.
 170    smgr-md-read-start (ForkNumber, BlockNumber, Oid, Oid, Oid, int) Probe
 171    that fires when beginning to read a block from a relation. arg0 and
 172    arg1 contain the fork and block numbers of the page. arg2, arg3, and
 173    arg4 contain the tablespace, database, and relation OIDs identifying
 174    the relation. arg5 is the ID of the backend which created the temporary
 175    relation for a local buffer, or INVALID_PROC_NUMBER (-1) for a shared
 176    buffer.
 177    smgr-md-read-done (ForkNumber, BlockNumber, Oid, Oid, Oid, int, int,
 178    int) Probe that fires when a block read is complete. arg0 and arg1
 179    contain the fork and block numbers of the page. arg2, arg3, and arg4
 180    contain the tablespace, database, and relation OIDs identifying the
 181    relation. arg5 is the ID of the backend which created the temporary
 182    relation for a local buffer, or INVALID_PROC_NUMBER (-1) for a shared
 183    buffer. arg6 is the number of bytes actually read, while arg7 is the
 184    number requested (if these are different it indicates a short read).
 185    smgr-md-write-start (ForkNumber, BlockNumber, Oid, Oid, Oid, int) Probe
 186    that fires when beginning to write a block to a relation. arg0 and arg1
 187    contain the fork and block numbers of the page. arg2, arg3, and arg4
 188    contain the tablespace, database, and relation OIDs identifying the
 189    relation. arg5 is the ID of the backend which created the temporary
 190    relation for a local buffer, or INVALID_PROC_NUMBER (-1) for a shared
 191    buffer.
 192    smgr-md-write-done (ForkNumber, BlockNumber, Oid, Oid, Oid, int, int,
 193    int) Probe that fires when a block write is complete. arg0 and arg1
 194    contain the fork and block numbers of the page. arg2, arg3, and arg4
 195    contain the tablespace, database, and relation OIDs identifying the
 196    relation. arg5 is the ID of the backend which created the temporary
 197    relation for a local buffer, or INVALID_PROC_NUMBER (-1) for a shared
 198    buffer. arg6 is the number of bytes actually written, while arg7 is the
 199    number requested (if these are different it indicates a short write).
 200    sort-start (int, bool, int, int, bool, int) Probe that fires when a
 201    sort operation is started. arg0 indicates heap, index or datum sort.
 202    arg1 is true for unique-value enforcement. arg2 is the number of key
 203    columns. arg3 is the number of kilobytes of work memory allowed. arg4
 204    is true if random access to the sort result is required. arg5 indicates
 205    serial when 0, parallel worker when 1, or parallel leader when 2.
 206    sort-done (bool, long) Probe that fires when a sort is complete. arg0
 207    is true for external sort, false for internal sort. arg1 is the number
 208    of disk blocks used for an external sort, or kilobytes of memory used
 209    for an internal sort.
 210    lwlock-acquire (char *, LWLockMode) Probe that fires when an LWLock has
 211    been acquired. arg0 is the LWLock's tranche. arg1 is the requested lock
 212    mode, either exclusive or shared.
 213    lwlock-release (char *) Probe that fires when an LWLock has been
 214    released (but note that any released waiters have not yet been
 215    awakened). arg0 is the LWLock's tranche.
 216    lwlock-wait-start (char *, LWLockMode) Probe that fires when an LWLock
 217    was not immediately available and a server process has begun to wait
 218    for the lock to become available. arg0 is the LWLock's tranche. arg1 is
 219    the requested lock mode, either exclusive or shared.
 220    lwlock-wait-done (char *, LWLockMode) Probe that fires when a server
 221    process has been released from its wait for an LWLock (it does not
 222    actually have the lock yet). arg0 is the LWLock's tranche. arg1 is the
 223    requested lock mode, either exclusive or shared.
 224    lwlock-condacquire (char *, LWLockMode) Probe that fires when an LWLock
 225    was successfully acquired when the caller specified no waiting. arg0 is
 226    the LWLock's tranche. arg1 is the requested lock mode, either exclusive
 227    or shared.
 228    lwlock-condacquire-fail (char *, LWLockMode) Probe that fires when an
 229    LWLock was not successfully acquired when the caller specified no
 230    waiting. arg0 is the LWLock's tranche. arg1 is the requested lock mode,
 231    either exclusive or shared.
 232    lock-wait-start (unsigned int, unsigned int, unsigned int, unsigned
 233    int, unsigned int, LOCKMODE) Probe that fires when a request for a
 234    heavyweight lock (lmgr lock) has begun to wait because the lock is not
 235    available. arg0 through arg3 are the tag fields identifying the object
 236    being locked. arg4 indicates the type of object being locked. arg5
 237    indicates the lock type being requested.
 238    lock-wait-done (unsigned int, unsigned int, unsigned int, unsigned int,
 239    unsigned int, LOCKMODE) Probe that fires when a request for a
 240    heavyweight lock (lmgr lock) has finished waiting (i.e., has acquired
 241    the lock). The arguments are the same as for lock-wait-start.
 242    deadlock-found () Probe that fires when a deadlock is found by the
 243    deadlock detector.
 244
 245    Table 27.50. Defined Types Used in Probe Parameters
 246           Type         Definition
 247    LocalTransactionId unsigned int
 248    LWLockMode         int
 249    LOCKMODE           int
 250    BlockNumber        unsigned int
 251    Oid                unsigned int
 252    ForkNumber         int
 253    bool               unsigned char
 254
 255 27.5.3. Using Probes #
 256
 257    The example below shows a DTrace script for analyzing transaction
 258    counts in the system, as an alternative to snapshotting
 259    pg_stat_database before and after a performance test:
 260 #!/usr/sbin/dtrace -qs
 261
 262 postgresql$1:::transaction-start
 263 {
 264       @start["Start"] = count();
 265       self->ts  = timestamp;
 266 }
 267
 268 postgresql$1:::transaction-abort
 269 {
 270       @abort["Abort"] = count();
 271 }
 272
 273 postgresql$1:::transaction-commit
 274 /self->ts/
 275 {
 276       @commit["Commit"] = count();
 277       @time["Total time (ns)"] = sum(timestamp - self->ts);
 278       self->ts=0;
 279 }
 280
 281    When executed, the example D script gives output such as:
 282 # ./txn_count.d `pgrep -n postgres` or ./txn_count.d <PID>
 283 ^C
 284
 285 Start                                          71
 286 Commit                                         70
 287 Total time (ns)                        2312105013
 288
 289 Note
 290
 291    SystemTap uses a different notation for trace scripts than DTrace does,
 292    even though the underlying trace points are compatible. One point worth
 293    noting is that at this writing, SystemTap scripts must reference probe
 294    names using double underscores in place of hyphens. This is expected to
 295    be fixed in future SystemTap releases.
 296
 297    You should remember that DTrace scripts need to be carefully written
 298    and debugged, otherwise the trace information collected might be
 299    meaningless. In most cases where problems are found it is the
 300    instrumentation that is at fault, not the underlying system. When
 301    discussing information found using dynamic tracing, be sure to enclose
 302    the script used to allow that too to be checked and discussed.
 303
 304 27.5.4. Defining New Probes #
 305
 306    New probes can be defined within the code wherever the developer
 307    desires, though this will require a recompilation. Below are the steps
 308    for inserting new probes:
 309     1. Decide on probe names and data to be made available through the
 310        probes
 311     2. Add the probe definitions to src/backend/utils/probes.d
 312     3. Include pg_trace.h if it is not already present in the module(s)
 313        containing the probe points, and insert TRACE_POSTGRESQL probe
 314        macros at the desired locations in the source code
 315     4. Recompile and verify that the new probes are available
 316
 317    Example:  Here is an example of how you would add a probe to trace all
 318    new transactions by transaction ID.
 319     1. Decide that the probe will be named transaction-start and requires
 320        a parameter of type LocalTransactionId
 321     2. Add the probe definition to src/backend/utils/probes.d:
 322 probe transaction__start(LocalTransactionId);
 323
 324        Note the use of the double underline in the probe name. In a DTrace
 325        script using the probe, the double underline needs to be replaced
 326        with a hyphen, so transaction-start is the name to document for
 327        users.
 328     3. At compile time, transaction__start is converted to a macro called
 329        TRACE_POSTGRESQL_TRANSACTION_START (notice the underscores are
 330        single here), which is available by including pg_trace.h. Add the
 331        macro call to the appropriate location in the source code. In this
 332        case, it looks like the following:
 333 TRACE_POSTGRESQL_TRANSACTION_START(vxid.localTransactionId);
 334
 335     4. After recompiling and running the new binary, check that your newly
 336        added probe is available by executing the following DTrace command.
 337        You should see similar output:
 338 # dtrace -ln transaction-start
 339    ID    PROVIDER          MODULE           FUNCTION NAME
 340 18705 postgresql49878     postgres     StartTransactionCommand transaction-start
 341 18755 postgresql49877     postgres     StartTransactionCommand transaction-start
 342 18805 postgresql49876     postgres     StartTransactionCommand transaction-start
 343 18855 postgresql49875     postgres     StartTransactionCommand transaction-start
 344 18986 postgresql49873     postgres     StartTransactionCommand transaction-start
 345
 346    There are a few things to be careful about when adding trace macros to
 347    the C code:
 348      * You should take care that the data types specified for a probe's
 349        parameters match the data types of the variables used in the macro.
 350        Otherwise, you will get compilation errors.
 351      * On most platforms, if PostgreSQL is built with --enable-dtrace, the
 352        arguments to a trace macro will be evaluated whenever control
 353        passes through the macro, even if no tracing is being done. This is
 354        usually not worth worrying about if you are just reporting the
 355        values of a few local variables. But beware of putting expensive
 356        function calls into the arguments. If you need to do that, consider
 357        protecting the macro with a check to see if the trace is actually
 358        enabled:
 359 if (TRACE_POSTGRESQL_TRANSACTION_START_ENABLED())
 360     TRACE_POSTGRESQL_TRANSACTION_START(some_function(...));
 361
 362        Each trace macro has a corresponding ENABLED macro.