<!-- $PostgreSQL$ -->

<chapter id="planner-stats-details">
 <title>How the Planner Uses Statistics</title>

 <para>
  This chapter builds on the material covered in <xref
  linkend="using-explain"> and <xref linkend="planner-stats"> to show some
  additional details about how the planner uses the
  system statistics to estimate the number of rows each part of a query might
  return.  This is a significant part of the planning process,
  providing much of the raw material for cost calculation.
 </para>

 <para>
  The intent of this chapter is not to document the code in detail,
  but to present an overview of how it works.
  This will perhaps ease the learning curve for someone who subsequently
  wishes to read the code.
 </para>

 <sect1 id="row-estimation-examples">
  <title>Row Estimation Examples</title>

  <indexterm zone="row-estimation-examples">
   <primary>row estimation</primary>
   <secondary>planner</secondary>
  </indexterm>

  <para>
   The examples shown below use tables in the <productname>PostgreSQL</>
   regression test database.
   The outputs shown are taken from version 8.3.
   The behavior of earlier (or later) versions might vary.
   Note also that since <command>ANALYZE</> uses random sampling
   while producing statistics, the results will change slightly after
   any new <command>ANALYZE</>.
  </para>

  <para>
   Let's start with a very simple query:

<programlisting>
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)
</programlisting>

   How the planner determines the cardinality of <structname>tenk1</structname>
   is covered in <xref linkend="planner-stats">, but is repeated here for
   completeness.  The number of pages and rows is looked up in
   <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';

 relpages | reltuples
----------+-----------
      358 |     10000
</programlisting>

   These numbers are current as of the last <command>VACUUM</> or
   <command>ANALYZE</> on the table.  The planner then fetches the
   actual current number of pages in the table (this is a cheap operation,
   not requiring a table scan).  If that is different from
   <structfield>relpages</structfield> then
   <structfield>reltuples</structfield> is scaled accordingly to
   arrive at a current number-of-rows estimate.  In this case the values
   are correct so the rows estimate is the same as
   <structfield>reltuples</structfield>.
  </para>
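The scaling rule just described can be sketched in a few lines of Python.  This is an illustrative approximation only; the function name and zero-page handling are invented for the example, not taken from the planner's C code:

```python
def estimate_rows(relpages, reltuples, current_pages):
    # Rows-per-page density as observed at the last VACUUM/ANALYZE...
    if relpages == 0:
        return reltuples
    density = reltuples / relpages
    # ...scaled by the table's current physical size.
    return round(density * current_pages)

# With the pg_class values from the example and an unchanged table,
# the estimate is simply reltuples:
estimate_rows(358, 10000, 358)
# Had the table grown to twice the pages, the row estimate would double.
```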
  <para>
   Let's move on to an example with a range condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=24.06..394.64 rows=1007 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for the operator
   <literal>&lt;</literal> in <structname>pg_operator</structname>.
   This is held in the column <structfield>oprrest</structfield>,
   and the entry in this case is <function>scalarltsel</function>.
   The <function>scalarltsel</function> function retrieves the histogram for
   <structfield>unique1</structfield> from
   <structname>pg_statistic</structname>.  For manual queries it is more
   convenient to look in the simpler <structname>pg_stats</structname>
   view:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='unique1';

                   histogram_bounds
------------------------------------------------------
 {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
</programlisting>

   Next the fraction of the histogram occupied by <quote>&lt; 1000</quote>
   is worked out.  This is the selectivity.  The histogram divides the range
   into equal frequency buckets, so all we have to do is locate the bucket
   that our value is in and count <emphasis>part</emphasis> of it and
   <emphasis>all</emphasis> of the ones before.  The value 1000 is clearly in
   the second bucket (993-1997).  Assuming a linear distribution of
   values inside each bucket, we can calculate the selectivity as:

<programlisting>
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
            = (1 + (1000 - 993)/(1997 - 993))/10
            = 0.100697
</programlisting>

   that is, one whole bucket plus a linear fraction of the second, divided by
   the number of buckets.  The estimated number of rows can now be calculated as
   the product of the selectivity and the cardinality of
   <structname>tenk1</structname>:

<programlisting>
rows = rel_cardinality * selectivity
     = 10000 * 0.100697
     = 1007  (rounding off)
</programlisting>
  </para>
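The bucket arithmetic above generalizes to any value and any histogram.  The following Python sketch illustrates the idea; it is not the planner's actual implementation, which also deals with non-numeric types, infinite bounds, and other complications:

```python
import bisect

def histogram_selectivity(bounds, value):
    """Fraction of rows with column < value, from an equal-frequency
    histogram, assuming a linear distribution inside each bucket."""
    num_buckets = len(bounds) - 1
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    # Locate the bucket containing the value...
    i = bisect.bisect_right(bounds, value) - 1
    # ...count all buckets before it, plus a linear fraction of it.
    frac = (value - bounds[i]) / (bounds[i + 1] - bounds[i])
    return (i + frac) / num_buckets

bounds = [0, 993, 1997, 3050, 4040, 5036, 5957, 7057, 8029, 9016, 9995]
sel = histogram_selectivity(bounds, 1000)   # (1 + 7/1004)/10, about 0.100697
rows = round(10000 * sel)                   # about 1007
```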
  <para>
   Next let's consider an example with an equality condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=30 width=244)
   Filter: (stringu1 = 'CRAAAA'::name)
</programlisting>

   Again the planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for <literal>=</literal>, which is
   <function>eqsel</function>.  For equality estimation the histogram is
   not useful; instead the list of <firstterm>most
   common values</> (<acronym>MCV</acronym>s) is used to determine the
   selectivity.  Let's have a look at the MCVs, with some additional columns
   that will be useful later:

<programlisting>
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

null_frac         | 0
n_distinct        | 676
most_common_vals  | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,JOAAAA,MCAAAA,NAAAAA,WGAAAA}
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003}
</programlisting>

   Since <literal>CRAAAA</> appears in the list of MCVs, the selectivity is
   merely the corresponding entry in the list of most common frequencies
   (<acronym>MCF</acronym>s):

<programlisting>
selectivity = mcf[3]
            = 0.003
</programlisting>

   As before, the estimated number of rows is just the product of this with the
   cardinality of <structname>tenk1</structname>:

<programlisting>
rows = 10000 * 0.003
     = 30
</programlisting>
  </para>
  <para>
   Now consider the same query, but with a constant that is not in the
   <acronym>MCV</acronym> list:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=15 width=244)
   Filter: (stringu1 = 'xxx'::name)
</programlisting>

   This is quite a different problem: how to estimate the selectivity when the
   value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
   The approach is to use the fact that the value is not in the list,
   combined with the knowledge of the frequencies for all of the
   <acronym>MCV</acronym>s:

<programlisting>
selectivity = (1 - sum(mcf))/(num_distinct - num_mcv)
            = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
                    0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
            = 0.0014559
</programlisting>

   That is, add up all the frequencies for the <acronym>MCV</acronym>s and
   subtract them from one, then
   divide by the number of <emphasis>other</emphasis> distinct values.
   This amounts to assuming that the fraction of the column that is not any
   of the MCVs is evenly distributed among all the other distinct values.
   Notice that there are no null values so we don't have to worry about those
   (otherwise we'd subtract the null fraction from the numerator as well).
   The estimated number of rows is then calculated as usual:

<programlisting>
rows = 10000 * 0.0014559
     = 15  (rounding off)
</programlisting>
  </para>
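This rule is easy to express directly.  The sketch below is illustrative only (the function name is invented, and the real code has additional clamps and edge cases):

```python
def eqsel_non_mcv(mcv_freqs, n_distinct, null_frac=0.0):
    """Selectivity of col = const when const is not in the MCV list:
    spread the non-MCV, non-null fraction of the population evenly
    over the remaining distinct values."""
    remaining = 1.0 - sum(mcv_freqs) - null_frac
    return remaining / (n_distinct - len(mcv_freqs))

# The stringu1 statistics from the example:
freqs = [0.00333333] + [0.003] * 9
sel = eqsel_non_mcv(freqs, 676)   # about 0.0014559
rows = round(10000 * sel)         # about 15
```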
  <para>
   The previous example with <literal>unique1 &lt; 1000</> was an
   oversimplification of what <function>scalarltsel</function> really does;
   now that we have seen an example of the use of MCVs, we can fill in some
   more detail.  The example was correct as far as it went, because since
   <structfield>unique1</> is a unique column it has no MCVs (obviously, no
   value is any more common than any other value).  For a non-unique
   column, there will normally be both a histogram and an MCV list, and
   <emphasis>the histogram does not include the portion of the column
   population represented by the MCVs</>.  We do things this way because
   it allows more precise estimation.  In this situation
   <function>scalarltsel</function> directly applies the condition (e.g.,
   <quote>&lt; 1000</>) to each value of the MCV list, and adds up the
   frequencies of the MCVs for which the condition is true.  This gives
   an exact estimate of the selectivity within the portion of the table
   that is MCVs.  The histogram is then used in the same way as above
   to estimate the selectivity in the portion of the table that is not
   MCVs, and then the two numbers are combined to estimate the overall
   selectivity.  For example, consider

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 &lt; 'IAAAAA';

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=3077 width=244)
   Filter: (stringu1 &lt; 'IAAAAA'::name)
</programlisting>

   We already saw the MCV information for <structfield>stringu1</>,
   and here is its histogram:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

                                histogram_bounds
--------------------------------------------------------------------------------
 {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,XLAAAA,ZZAAAA}
</programlisting>

   Checking the MCV list, we find that the condition <literal>stringu1 &lt;
   'IAAAAA'</> is satisfied by the first six entries and not the last four,
   so the selectivity within the MCV part of the population is

<programlisting>
selectivity = sum(relevant mcfs)
            = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
            = 0.01833333
</programlisting>

   Summing all the MCFs also tells us that the total fraction of the
   population represented by MCVs is 0.03033333, and therefore the
   fraction represented by the histogram is 0.96966667 (again, there
   are no nulls, else we'd have to exclude them here).  We can see
   that the value <literal>IAAAAA</> falls nearly at the end of the
   third histogram bucket.  Using some rather cheesy assumptions
   about the frequency of different characters, the planner arrives
   at the estimate 0.298387 for the portion of the histogram population
   that is less than <literal>IAAAAA</>.  We then combine the estimates
   for the MCV and non-MCV populations:

<programlisting>
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
            = 0.01833333 + 0.298387 * 0.96966667
            = 0.307669

rows        = 10000 * 0.307669
            = 3077  (rounding off)
</programlisting>

   In this particular example, the correction from the MCV list is fairly
   small, because the column distribution is actually quite flat (the
   statistics showing these particular values as being more common than
   others are mostly due to sampling error).  In a more typical case where
   some values are significantly more common than others, this complicated
   process gives a useful improvement in accuracy because the selectivity
   for the most common values is found exactly.
  </para>
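The combination of the exact MCV part with the histogram part can be sketched as follows.  This is a simplification: the 0.298387 histogram estimate is taken as a given from the text (it comes from the planner's character-frequency assumptions), Python string comparison stands in for <type>name</> ordering, and the function name is invented:

```python
def range_selectivity(mcv_vals, mcv_freqs, histogram_selectivity, limit):
    """Selectivity of col < limit: exact frequencies for the MCVs,
    plus a histogram estimate scaled to the non-MCV population."""
    # Exact part: sum the frequencies of MCVs satisfying the condition.
    mcv_sel = sum(f for v, f in zip(mcv_vals, mcv_freqs) if v < limit)
    # The histogram describes only the population not covered by MCVs.
    hist_fraction = 1.0 - sum(mcv_freqs)
    return mcv_sel + histogram_selectivity * hist_fraction

vals = ['EJAAAA', 'BBAAAA', 'CRAAAA', 'FCAAAA', 'FEAAAA',
        'GSAAAA', 'JOAAAA', 'MCAAAA', 'NAAAAA', 'WGAAAA']
freqs = [0.00333333] + [0.003] * 9
sel = range_selectivity(vals, freqs, 0.298387, 'IAAAAA')   # about 0.307669
rows = round(10000 * sel)                                  # about 3077
```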
  <para>
   Now let's consider a case with more than one
   condition in the <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000 AND stringu1 = 'xxx';

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=23.80..396.91 rows=1 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   Filter: (stringu1 = 'xxx'::name)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner assumes that the two conditions are independent, so that
   the individual selectivities of the clauses can be multiplied together:

<programlisting>
selectivity = selectivity(unique1 &lt; 1000) * selectivity(stringu1 = 'xxx')
            = 0.100697 * 0.0014559
            = 0.0001466

rows        = 10000 * 0.0001466
            = 1  (rounding off)
</programlisting>

   Notice that the number of rows estimated to be returned from the bitmap
   index scan reflects only the condition used with the index; this is
   important since it affects the cost estimate for the subsequent heap
   fetches.
  </para>
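In code, the independence assumption is nothing more than a product of the per-clause selectivities computed in the two preceding examples (the numbers below are simply reused from them):

```python
# Selectivities from the earlier examples:
sel_range = 0.100697    # unique1 < 1000, from the histogram
sel_eq = 0.0014559      # stringu1 = 'xxx', from the non-MCV equality rule

# Independence assumption: clause selectivities multiply.
selectivity = sel_range * sel_eq   # about 0.0001466
rows = round(10000 * selectivity)  # about 1
```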
  <para>
   Finally we will examine a query that involves a join:

<programlisting>
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;

                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=4.64..456.23 rows=50 width=488)
   -&gt;  Bitmap Heap Scan on tenk1 t1  (cost=4.64..142.17 rows=50 width=244)
         Recheck Cond: (unique1 &lt; 50)
         -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.63 rows=50 width=0)
               Index Cond: (unique1 &lt; 50)
   -&gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.00..6.27 rows=1 width=244)
         Index Cond: (t2.unique2 = t1.unique2)
</programlisting>

   The restriction on <structname>tenk1</structname>,
   <literal>unique1 &lt; 50</literal>,
   is evaluated before the nested-loop join.
   This is handled analogously to the previous range example.  This time the
   value 50 falls into the first bucket of the
   <structfield>unique1</structfield> histogram:

<programlisting>
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
            = (0 + (50 - 0)/(993 - 0))/10
            = 0.005035

rows        = 10000 * 0.005035
            = 50  (rounding off)
</programlisting>

   The restriction for the join is <literal>t2.unique2 = t1.unique2</>.
   The operator is just
   our familiar <literal>=</literal>, however the selectivity function is
   obtained from the <structfield>oprjoin</structfield> column of
   <structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
   <function>eqjoinsel</function> looks up the statistical information for both
   <structname>tenk2</structname> and <structname>tenk1</structname>:

<programlisting>
SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';

 tablename | null_frac | n_distinct | most_common_vals
-----------+-----------+------------+------------------
 tenk1     |         0 |         -1 |
 tenk2     |         0 |         -1 |
</programlisting>

   In this case there is no <acronym>MCV</acronym> information for
   <structfield>unique2</structfield> because all the values appear to be
   unique, so we use an algorithm that relies only on the number of
   distinct values for both relations together with their null fractions:

<programlisting>
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
            = (1 - 0) * (1 - 0) / max(10000, 10000)
            = 0.0001
</programlisting>

   That is, subtract the null fraction from one for each of the relations,
   and divide by the maximum of the numbers of distinct values.
   The number of rows
   that the join is likely to emit is calculated as the cardinality of the
   Cartesian product of the two inputs, multiplied by the
   selectivity:

<programlisting>
rows = (outer_cardinality * inner_cardinality) * selectivity
     = (50 * 10000) * 0.0001
     = 50
</programlisting>
  </para>
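The no-MCV join rule can be sketched as below.  Note that a <structfield>n_distinct</structfield> of -1 in <structname>pg_stats</structname> means <quote>all values distinct</quote>, i.e. equal to the row count, which is why 10000 is passed here; the function name is invented for illustration and the real <function>eqjoinsel</function> handles many more cases:

```python
def eqjoinsel_no_mcv(null_frac1, n_distinct1, null_frac2, n_distinct2):
    """Join selectivity for t1.c = t2.c when neither column has MCV
    statistics: the non-null fractions divided by the larger number
    of distinct values (min of reciprocals = 1/max)."""
    return (1.0 - null_frac1) * (1.0 - null_frac2) / max(n_distinct1, n_distinct2)

sel = eqjoinsel_no_mcv(0.0, 10000, 0.0, 10000)   # 1/10000 = 0.0001
# Cardinality of the Cartesian product, scaled by the selectivity:
rows = round(50 * 10000 * sel)                   # about 50
```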
  <para>
   Had there been MCV lists for the two columns,
   <function>eqjoinsel</function> would have used direct comparison of the MCV
   lists to determine the join selectivity within the part of the column
   populations represented by the MCVs.  The estimate for the remainder of the
   populations follows the same approach shown here.
  </para>

  <para>
   Notice that we showed <literal>inner_cardinality</> as 10000, that is,
   the unmodified size of <structname>tenk2</>.  It might appear from
   inspection of the <command>EXPLAIN</> output that the estimate of
   join rows comes from 50 * 1, that is, the number of outer rows times
   the estimated number of rows obtained by each inner indexscan on
   <structname>tenk2</>.  But this is not the case: the join relation size
   is estimated before any particular join plan has been considered.  If
   everything is working well then the two ways of estimating the join
   size will produce about the same answer, but due to roundoff error and
   other factors they sometimes diverge significantly.
  </para>

  <para>
   For those interested in further details, estimation of the size of
   a table (before any <literal>WHERE</> clauses) is done in
   <filename>src/backend/optimizer/util/plancat.c</filename>.  The generic
   logic for clause selectivities is in
   <filename>src/backend/optimizer/path/clausesel.c</filename>.  The
   operator-specific selectivity functions are mostly found
   in <filename>src/backend/utils/adt/selfuncs.c</filename>.
  </para>

 </sect1>

</chapter>