author Tom Lane 2009-07-16 20:55:44 +0000
committer Tom Lane 2009-07-16 20:55:44 +0000
commit f5bc74192d2ffb32952a06c62b3458d28ff7f98f
tree d582b83c6ba2ff21d2970660806d353c6ac496ee /doc
parent c43feefa806c81d68115ed03a7f723720cefad31
Make GEQO's planning deterministic by having it start from a predictable
random number seed each time.

This is how it used to work years ago, but we got rid of the seed reset
because it was resetting the main random() sequence and thus having
undesirable effects on the rest of the system. To fix, establish a private
random number state for each execution of geqo(), and initialize the state
using the new GUC variable geqo_seed. People who want to experiment with
different random searches can do so by changing geqo_seed, but you'll always
get the same plan for the same value of geqo_seed (if holding all other
planner inputs constant, of course).

The new state is kept in PlannerInfo by adding a "void *" field reserved for
use by join_search hooks. Most of the rather bulky code changes in this
commit are just arranging to pass PlannerInfo around to all the GEQO
functions (many of which formerly didn't receive it).

Andres Freund, with some editorialization by Tom
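
The idea of a private, reseedable random number state can be illustrated with
a minimal C sketch. This is not the code from this commit (the diff shown here
is limited to 'doc'); every name below (SketchRandomState, sketch_set_seed,
sketch_rand, sketch_random_tour) is hypothetical, and the generator is a
splitmix64 step chosen only for brevity. The sketch assumes a seed in the
range zero to one, as documented for geqo_seed, and shows that reseeding a
private state never touches the process-wide random() sequence and always
reproduces the same sequence of choices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical private generator state; stands in for per-run planner state. */
typedef struct SketchRandomState
{
    uint64_t    state;
} SketchRandomState;

/*
 * Initialize the private state from a seed in [0, 1] (the documented range
 * of geqo_seed). The mapping to an integer is an assumption of this sketch.
 */
static void
sketch_set_seed(SketchRandomState *rs, double seed)
{
    assert(seed >= 0.0 && seed <= 1.0);
    rs->state = (uint64_t) (seed * 4294967295.0);
}

/* Return a double in [0, 1) using one splitmix64 step on the private state. */
static double
sketch_rand(SketchRandomState *rs)
{
    uint64_t    z = (rs->state += 0x9E3779B97F4A7C15ULL);

    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z = z ^ (z >> 31);
    /* Keep the top 53 bits and scale by 2^53. */
    return (double) (z >> 11) / 9007199254740992.0;
}

/*
 * Fill tour[0..n-1] with a random permutation of 0..n-1 (Fisher-Yates),
 * standing in for one randomly chosen join order.
 */
static void
sketch_random_tour(SketchRandomState *rs, int *tour, int n)
{
    for (int i = 0; i < n; i++)
        tour[i] = i;
    for (int i = n - 1; i > 0; i--)
    {
        int         j = (int) (sketch_rand(rs) * (i + 1));  /* 0 <= j <= i */
        int         tmp = tour[i];

        tour[i] = tour[j];
        tour[j] = tmp;
    }
}
```

Two states seeded with the same value produce identical tours, while a
different seed starts the search from a different internal state. Because the
state lives in a caller-owned struct, nothing outside the search is affected
by the reseed.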
Diffstat (limited to 'doc')
-rw-r--r-- doc/src/sgml/config.sgml 20
-rw-r--r-- doc/src/sgml/geqo.sgml 21
2 files changed, 30 insertions, 11 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3b527a7ecbd..a86ba6089a4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.221 2009/07/03 19:14:25 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.222 2009/07/16 20:55:44 tgl Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@@ -2149,7 +2149,23 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
</para>
</listitem>
</varlistentry>
-
+
+ <varlistentry id="guc-geqo-seed" xreflabel="geqo_seed">
+ <term><varname>geqo_seed</varname> (<type>floating point</type>)</term>
+ <indexterm>
+ <primary><varname>geqo_seed</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Controls the initial value of the random number generator used
+ by GEQO to select random paths through the join order search space.
+ The value can range from zero (the default) to one. Varying the
+ value changes the set of join paths explored, and may result in a
+ better or worse best path being found.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
<sect2 id="runtime-config-query-other">
diff --git a/doc/src/sgml/geqo.sgml b/doc/src/sgml/geqo.sgml
index 2f680762c13..97961272a4a 100644
--- a/doc/src/sgml/geqo.sgml
+++ b/doc/src/sgml/geqo.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.40 2007/07/21 04:02:41 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.41 2009/07/16 20:55:44 tgl Exp $ -->
<chapter id="geqo">
<chapterinfo>
@@ -49,7 +49,7 @@
methods</firstterm> (e.g., nested loop, hash join, merge join in
<productname>PostgreSQL</productname>) to process individual joins
and a diversity of <firstterm>indexes</firstterm> (e.g.,
- B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
+ B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
access paths for relations.
</para>
@@ -88,8 +88,7 @@
<para>
The genetic algorithm (<acronym>GA</acronym>) is a heuristic optimization method which
- operates through
- nondeterministic, randomized search. The set of possible solutions for the
+ operates through randomized search. The set of possible solutions for the
optimization problem is considered as a
<firstterm>population</firstterm> of <firstterm>individuals</firstterm>.
The degree of adaptation of an individual to its environment is specified
@@ -116,7 +115,7 @@
According to the <systemitem class="resource">comp.ai.genetic</> <acronym>FAQ</acronym> it cannot be stressed too
strongly that a <acronym>GA</acronym> is not a pure random search for a solution to a
problem. A <acronym>GA</acronym> uses stochastic processes, but the result is distinctly
- non-random (better than random).
+ non-random (better than random).
</para>
<figure id="geqo-diagram">
@@ -260,9 +259,13 @@
<para>
This process is inherently nondeterministic, because of the randomized
choices made during both the initial population selection and subsequent
- <quote>mutation</> of the best candidates. Hence different plans may
- be selected from one run to the next, resulting in varying run time
- and varying output row order.
+ <quote>mutation</> of the best candidates. To avoid surprising changes
+ of the selected plan, each run of the GEQO algorithm restarts its
+ random number generator with the current <xref linkend="guc-geqo-seed">
+ parameter setting. As long as <varname>geqo_seed</> and the other
+ GEQO parameters are kept fixed, the same plan will be generated for a
+ given query (and other planner inputs such as statistics). To experiment
+ with different search paths, try changing <varname>geqo_seed</>.
</para>
</sect2>
@@ -330,7 +333,7 @@
url="news://comp.ai.genetic"></ulink>)
</para>
</listitem>
-
+
<listitem>
<para>
<ulink url="http://www.red3d.com/cwr/evolve.html">