<!-- doc/src/sgml/tsearch2.sgml -->

<sect1 id="tsearch2">
 <title>tsearch2</title>

 <indexterm zone="tsearch2">
  <primary>tsearch2</primary>
 </indexterm>

 <para>
  The <application>tsearch2</> module provides backwards-compatible
  text search functionality for applications that used
  <application>tsearch2</> before text searching was integrated
  into core <productname>PostgreSQL</productname> in release 8.3.
 </para>

 <sect2>
  <title>Portability Issues</title>

  <para>
   Although the built-in text search features were based on
   <application>tsearch2</> and are largely similar to it,
   there are numerous small differences that will create portability
   issues for existing applications:
  </para>

  <itemizedlist mark="bullet">
   <listitem>
    <para>
     Some functions' names were changed, for example <function>rank</>
     to <function>ts_rank</>.
     The replacement <literal>tsearch2</literal> module
     provides aliases having the old names.
    </para>
   </listitem>

   <listitem>
    <para>
     The built-in text search data types and functions all exist within
     the system schema <literal>pg_catalog</>.  In an installation using
     <application>tsearch2</>, these objects would usually have been in
     the <literal>public</> schema, though some users chose to place them
     in a separate schema of their own.  Explicitly schema-qualified
     references to the objects will therefore fail in either case.
     The replacement <literal>tsearch2</literal> module
     provides alias objects that are stored in <literal>public</>
     (or another schema if necessary) so that such references will still work.
    </para>
   </listitem>

   <listitem>
    <para>
     There is no concept of a <quote>current parser</> or <quote>current
     dictionary</> in the built-in text search features, only of a current
     search configuration (set by the <varname>default_text_search_config</>
     parameter).  While the current parser and current dictionary were used
     only by functions intended for debugging, this might still pose
     a porting obstacle in some cases.
     The replacement <literal>tsearch2</literal> module emulates these
     additional state variables and provides backwards-compatible functions
     for setting and retrieving them.
    </para>
   </listitem>
  </itemizedlist>
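
  <para>
   As an illustration of the first two points, the following sketch assumes
   the replacement <literal>tsearch2</literal> module is installed in the
   <literal>public</> schema; the literal values are made up for the example.
   The first two queries rely on the module's alias, while the last one uses
   the built-in function directly:
  </para>

<programlisting>
-- Old tsearch2 spelling: resolved through the compatibility alias
SELECT rank('fat:1 cat:2'::tsvector, 'cat'::tsquery);

-- Explicitly schema-qualified old spelling: works only because the
-- alias objects live in public (assumed here)
SELECT public.rank('fat:1 cat:2'::tsvector, 'cat'::tsquery);

-- Built-in spelling: needs no compatibility module
SELECT pg_catalog.ts_rank('fat:1 cat:2'::tsvector, 'cat'::tsquery);
</programlisting>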

  <para>
   There are some issues that are not addressed by the replacement
   <literal>tsearch2</literal> module, and will therefore require
   application code changes in any case:
  </para>

  <itemizedlist mark="bullet">
   <listitem>
    <para>
     The old <function>tsearch2</> trigger function allowed items in its
     argument list to be names of functions to be invoked on the text data
     before it was converted to <type>tsvector</> format.  This was removed
     as being a security hole, since it was not possible to guarantee that
     the function invoked was the one intended.  If the data must be
     massaged before being indexed, the recommended approach is to write
     a custom trigger that does the work itself.
    </para>
   </listitem>

   <listitem>
    <para>
     Text search configuration information has been moved into core
     system catalogs that are noticeably different from the tables used
     by <application>tsearch2</>.  Any applications that examined
     or modified those tables will need adjustment.
    </para>
   </listitem>

   <listitem>
    <para>
     If an application used any custom text search configurations,
     those will need to be set up in the core
     catalogs using the new text search configuration SQL commands.
     The replacement <literal>tsearch2</literal> module offers limited
     support for this by making it possible to load an old set
     of <application>tsearch2</> configuration tables into
     <productname>PostgreSQL</productname> 8.3.  (Without the module,
     it is not possible to load the configuration data because values in the
     <type>regprocedure</> columns cannot be resolved to functions.)
     While those configuration tables won't actually <emphasis>do</>
     anything, at least their contents will be available to be consulted
     while setting up an equivalent custom configuration in 8.3.
    </para>
   </listitem>

   <listitem>
    <para>
     The old <function>reset_tsearch()</> and <function>get_covers()</>
     functions are not supported.
    </para>
   </listitem>

   <listitem>
    <para>
     The replacement <literal>tsearch2</literal> module does not define
     any alias operators, relying entirely on the built-in ones.
     This would only pose an issue if an application used explicitly
     schema-qualified operator names, which is very uncommon.
    </para>
   </listitem>
  </itemizedlist>
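
  <para>
   A custom trigger of the kind recommended above might look like the
   following minimal sketch.  The table <structname>documents</>, its
   columns <structfield>body</> and <structfield>tsv</>, the function and
   trigger names, and the choice of the <literal>english</> configuration
   are all hypothetical; the <quote>massaging</> step shown (treating
   underscores as word separators) merely stands in for whatever
   preprocessing an application needs:
  </para>

<programlisting>
-- Hypothetical table: documents(body text, tsv tsvector)
CREATE FUNCTION documents_tsv_update() RETURNS trigger AS $$
BEGIN
  -- Preprocess the text inside the trigger itself, instead of passing
  -- a function name to the old tsearch2 trigger function
  NEW.tsv := to_tsvector('pg_catalog.english'::regconfig,
                         replace(coalesce(NEW.body, ''), '_', ' '));
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER documents_tsv_trigger BEFORE INSERT OR UPDATE
    ON documents FOR EACH ROW EXECUTE PROCEDURE documents_tsv_update();
</programlisting>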

 </sect2>

 <sect2>
  <title>Converting a pre-8.3 Installation</title>

  <para>
   The recommended way to update a pre-8.3 installation that uses
   <application>tsearch2</> is:
  </para>

  <procedure>
   <step>
    <para>
     Make a dump from the old installation in the usual way,
     but be sure not to use the <literal>-c</> (<literal>--clean</>)
     option of <application>pg_dump</> or <application>pg_dumpall</>.
    </para>
   </step>

   <step>
    <para>
     In the new installation, create empty database(s) and install
     the replacement <literal>tsearch2</literal> module into each
     database that will use text search.  This must be done
     <emphasis>before</> loading the dump data!  If your old installation
     had the <application>tsearch2</> objects in a schema other
     than <literal>public</>, be sure to adjust the
     <command>CREATE EXTENSION</> command so that the replacement
     objects are created in that same schema.
    </para>
   </step>

   <step>
    <para>
     Load the dump data.  There will be quite a few errors reported
     due to failure to recreate the original <application>tsearch2</>
     objects.  These errors can be ignored, but this means you cannot
     restore the dump in a single transaction (e.g., you cannot use
     <application>pg_restore</>'s <option>-1</> switch).
    </para>
   </step>

   <step>
    <para>
     Examine the contents of the restored <application>tsearch2</>
     configuration tables (<structname>pg_ts_cfg</> and so on), and
     create equivalent built-in text search configurations as needed.
     You may drop the old configuration tables once you've extracted
     all the useful information from them.
    </para>
   </step>

   <step>
    <para>
     Test your application.
    </para>
   </step>
  </procedure>
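
  <para>
   The SQL side of the second and fourth steps might be sketched as follows.
   The schema name <literal>ts2</> and the configuration name
   <literal>my_config</> are hypothetical, and the mapping shown is only a
   placeholder; the real mappings must be derived from the contents of the
   restored configuration tables, which are assumed here to have been
   restored into <literal>public</>:
  </para>

<programlisting>
-- Step 2: run in each new, empty database before loading the dump
CREATE EXTENSION tsearch2;
-- If the old objects lived in another schema (hypothetical name ts2):
--   CREATE SCHEMA ts2;
--   CREATE EXTENSION tsearch2 WITH SCHEMA ts2;

-- Step 4: after loading the dump, inspect the old configuration data
SELECT * FROM public.pg_ts_cfg;

-- Then build an equivalent built-in configuration, here starting from
-- a copy of the standard english configuration
CREATE TEXT SEARCH CONFIGURATION public.my_config (
    COPY = pg_catalog.english
);
ALTER TEXT SEARCH CONFIGURATION public.my_config
    ALTER MAPPING FOR asciiword WITH english_stem;
</programlisting>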

  <para>
   At a later time you may wish to rename application references
   to the alias text search objects, so that you can eventually
   uninstall the replacement <literal>tsearch2</literal> module.
  </para>

 </sect2>

 <sect2>
  <title>References</title>
  <para>
   Tsearch2 Development Site
   <ulink url="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/"></ulink>
  </para>
 </sect2>

</sect1>