author | Tom Lane | 2009-07-16 20:55:44 +0000 |
---|---|---|
committer | Tom Lane | 2009-07-16 20:55:44 +0000 |
commit | f5bc74192d2ffb32952a06c62b3458d28ff7f98f (patch) | |
tree | d582b83c6ba2ff21d2970660806d353c6ac496ee /doc | |
parent | c43feefa806c81d68115ed03a7f723720cefad31 (diff) |
Make GEQO's planning deterministic by having it start from a predictable
random number seed each time. This is how it used to work years ago, but
we got rid of the seed reset because it was resetting the main random()
sequence and thus having undesirable effects on the rest of the system.
To fix, establish a private random number state for each execution of
geqo(), and initialize the state using the new GUC variable geqo_seed.
People who want to experiment with different random searches can do so
by changing geqo_seed, but you'll always get the same plan for the same
value of geqo_seed (if holding all other planner inputs constant, of course).
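
The idea of a private, explicitly seeded generator can be sketched in C as follows. This is an illustrative sketch only, not the code from this commit: the names `SketchRandomState`, `sketch_set_seed`, and `sketch_rand` are hypothetical, and the generator is a plain 48-bit linear congruential generator (with the multiplier and increment of the POSIX `drand48` family) chosen for brevity. The point it shows is that the state lives in a caller-owned struct, so reseeding it cannot disturb the global `random()` sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-run generator state; NOT the struct used by the commit. */
typedef struct SketchRandomState
{
    uint64_t    x;              /* current 48-bit LCG state */
} SketchRandomState;

/*
 * Initialize the private state from a seed in [0, 1], the range the
 * documentation gives for geqo_seed.  The scaling scheme is an assumption
 * made for this sketch.
 */
static void
sketch_set_seed(SketchRandomState *state, double seed)
{
    assert(seed >= 0.0 && seed <= 1.0);
    /* Scale the seed onto the 48-bit state space (2^48 - 1 at seed = 1). */
    state->x = (uint64_t) (seed * 281474976710655.0);
}

/*
 * Advance the private state and return a value in [0, 1).
 * Only the caller's struct is touched; no global generator is involved.
 */
static double
sketch_rand(SketchRandomState *state)
{
    state->x = (state->x * 0x5DEECE66DULL + 0xBULL) & ((1ULL << 48) - 1);
    return (double) state->x / 281474976710656.0;   /* divide by 2^48 */
}
```

Two states initialized with the same seed produce identical sequences, which is the property that makes repeated planning runs reproducible.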

The new state is kept in PlannerInfo by adding a "void *" field reserved
for use by join_search hooks. Most of the rather bulky code changes in
this commit are just arranging to pass PlannerInfo around to all the GEQO
functions (many of which formerly didn't receive it).
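
The pattern of hanging per-search state off an opaque pointer and threading the planner context through every helper might look roughly like this. Everything here is a hypothetical reduction for illustration: `SketchPlannerInfo`, its `search_private` field, `SketchGeqoPrivate`, and the `sketch_*` functions are assumed names, not PostgreSQL's actual definitions, and the random tour built with a Fisher-Yates shuffle merely stands in for the real search.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for PlannerInfo. */
typedef struct SketchPlannerInfo
{
    int         num_relations;
    void       *search_private;    /* opaque slot owned by the active search */
} SketchPlannerInfo;

/* State that only the search code knows how to interpret. */
typedef struct SketchGeqoPrivate
{
    uint64_t    random_state;       /* 48-bit LCG state */
} SketchGeqoPrivate;

/* Every helper takes the planner context so it can reach the private state. */
static double
sketch_geqo_rand(SketchPlannerInfo *root)
{
    SketchGeqoPrivate *priv = (SketchGeqoPrivate *) root->search_private;

    priv->random_state =
        (priv->random_state * 0x5DEECE66DULL + 0xBULL) & ((1ULL << 48) - 1);
    return (double) priv->random_state / 281474976710656.0;    /* [0, 1) */
}

/* Random integer in [0, upper). */
static int
sketch_geqo_randint(SketchPlannerInfo *root, int upper)
{
    return (int) (sketch_geqo_rand(root) * upper);
}

/* One search run: seed fresh state, build a random tour, then detach. */
static void
sketch_geqo(SketchPlannerInfo *root, double seed, int *tour)
{
    SketchGeqoPrivate priv;
    int         i;

    priv.random_state = (uint64_t) (seed * 281474976710655.0);
    root->search_private = &priv;

    for (i = 0; i < root->num_relations; i++)
        tour[i] = i + 1;
    for (i = root->num_relations - 1; i > 0; i--)
    {
        int         j = sketch_geqo_randint(root, i + 1);
        int         tmp = tour[i];

        tour[i] = tour[j];
        tour[j] = tmp;
    }

    root->search_private = NULL;    /* do not leave a dangling pointer */
}
```

Because the state is created inside each run and reached only through the context pointer, two runs with the same seed and the same inputs walk through the same sequence of random choices.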

Andres Freund, with some editorialization by Tom
Diffstat (limited to 'doc')
-rw-r--r-- | doc/src/sgml/config.sgml | 20 |
-rw-r--r-- | doc/src/sgml/geqo.sgml | 21 |
2 files changed, 30 insertions, 11 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3b527a7ecbd..a86ba6089a4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.221 2009/07/03 19:14:25 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.222 2009/07/16 20:55:44 tgl Exp $ -->

 <chapter Id="runtime-config">
 <title>Server Configuration</title>
@@ -2149,7 +2149,23 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
 </para>
 </listitem>
 </varlistentry>
-
+
+ <varlistentry id="guc-geqo-seed" xreflabel="geqo_seed">
+ <term><varname>geqo_seed</varname> (<type>floating point</type>)</term>
+ <indexterm>
+ <primary><varname>geqo_seed</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Controls the initial value of the random number generator used
+ by GEQO to select random paths through the join order search space.
+ The value can range from zero (the default) to one. Varying the
+ value changes the set of join paths explored, and may result in a
+ better or worse best path being found.
+ </para>
+ </listitem>
+ </varlistentry>
+
 </variablelist>
 </sect2>
 <sect2 id="runtime-config-query-other">
diff --git a/doc/src/sgml/geqo.sgml b/doc/src/sgml/geqo.sgml
index 2f680762c13..97961272a4a 100644
--- a/doc/src/sgml/geqo.sgml
+++ b/doc/src/sgml/geqo.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.40 2007/07/21 04:02:41 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.41 2009/07/16 20:55:44 tgl Exp $ -->

 <chapter id="geqo">
 <chapterinfo>
@@ -49,7 +49,7 @@
 methods</firstterm> (e.g., nested loop, hash join, merge join in
 <productname>PostgreSQL</productname>) to process individual joins
 and a diversity of <firstterm>indexes</firstterm> (e.g.,
- B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
+ B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
 access paths for relations.
 </para>

@@ -88,8 +88,7 @@

 <para>
 The genetic algorithm (<acronym>GA</acronym>) is a heuristic optimization method which
- operates through
- nondeterministic, randomized search. The set of possible solutions for the
+ operates through randomized search. The set of possible solutions for the
 optimization problem is considered as a
 <firstterm>population</firstterm> of <firstterm>individuals</firstterm>.
 The degree of adaptation of an individual to its environment is specified
@@ -116,7 +115,7 @@
 According to the <systemitem class="resource">comp.ai.genetic</> <acronym>FAQ</acronym> it cannot be stressed too
 strongly that a <acronym>GA</acronym> is not a pure random search for a solution to a
 problem. A <acronym>GA</acronym> uses stochastic processes, but the result is distinctly
- non-random (better than random).
+ non-random (better than random).
 </para>

 <figure id="geqo-diagram">
@@ -260,9 +259,13 @@
 <para>
 This process is inherently nondeterministic, because of the randomized
 choices made during both the initial population selection and subsequent
- <quote>mutation</> of the best candidates. Hence different plans may
- be selected from one run to the next, resulting in varying run time
- and varying output row order.
+ <quote>mutation</> of the best candidates. To avoid surprising changes
+ of the selected plan, each run of the GEQO algorithm restarts its
+ random number generator with the current <xref linkend="guc-geqo-seed">
+ parameter setting. As long as <varname>geqo_seed</> and the other
+ GEQO parameters are kept fixed, the same plan will be generated for a
+ given query (and other planner inputs such as statistics). To experiment
+ with different search paths, try changing <varname>geqo_seed</>.
 </para>

 </sect2>
@@ -330,7 +333,7 @@
 url="news://comp.ai.genetic"></ulink>)
 </para>
 </listitem>
-
+
 <listitem>
 <para>
 <ulink url="http://www.red3d.com/cwr/evolve.html">