Lists: | pgsql-hackers |
---|
From: | Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Range phrase operator in tsquery |
Date: | 2018-04-27 11:03:07 |
Message-ID: | 20180427140307.038769f4@asp437-24-g082ur |
Hello hackers,
Currently, the phrase operator in Postgres FTS supports only an exact
match of the distance between two words. That is sufficient for searching
simple/exact phrases, but in some cases the exact distance is unknown and
we only want the words to be close enough. E.g. it may help to search for
phrases with additional words in the middle of the phrase ("long, narrow,
plastic brush" vs "long brush").
The proposed patch adds the ability to use ranges in the phrase operator
for such cases. A few examples:
'term1 <4,10> term2'::tsquery -- Distance between term1 and term2 is
                              -- at least 4 and no greater than 10
'term1 <,10> term2'::tsquery  -- Distance between term1 and term2 is
                              -- no greater than 10
'term1 <4,> term2'::tsquery   -- Distance between term1 and term2 is
                              -- at least 4
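To make the range semantics concrete, here is a minimal C sketch of the check that the examples above imply. It is not taken from the patch: the function name phrase_distance_matches and its parameters are hypothetical, and it assumes the distance is the position of term2 minus the position of term1, as with the existing <N> operator. Omitted bounds are not modeled here; both ends of the range are passed explicitly.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 * pos1/pos2 are the word positions of term1 and term2 in the document;
 * from/to are the inclusive bounds of 'term1 <from,to> term2'.
 */
static bool
phrase_distance_matches(int pos1, int pos2, int from, int to)
{
    int distance = pos2 - pos1;

    return distance >= from && distance <= to;
}
```

With "long, narrow, plastic brush", "long" is at position 1 and "brush" at position 4, so the distance is 3. phrase_distance_matches(1, 4, 1, 3) is true, while the exact-distance case phrase_distance_matches(1, 4, 1, 1) is false.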
In addition, a negative distance is supported and means the reverse order
of the words. For example:
'term1 <4,10> term2'::tsquery = 'term2 <-10,-4> term1'::tsquery
'term1 <,10> term2'::tsquery = 'term2 <-10,> term1'::tsquery
'term1 <4,> term2'::tsquery = 'term2 <,-4> term1'::tsquery
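The equivalences above amount to negating and swapping the bounds when the operands are swapped. A small C sketch of that rule follows, assuming both bounds are given explicitly; DistanceRange, mirror_range and range_contains are hypothetical names used only for illustration and are not from the patch.

```c
#include <stdbool.h>

/* Hypothetical closed range <from,to>; not the patch's representation. */
typedef struct DistanceRange
{
    int from;
    int to;
} DistanceRange;

/* Range to use after swapping the operands: <from,to> becomes <-to,-from>. */
static DistanceRange
mirror_range(DistanceRange r)
{
    DistanceRange m;

    m.from = -r.to;
    m.to = -r.from;
    return m;
}

/* True if distance lies within the inclusive range. */
static bool
range_contains(DistanceRange r, int distance)
{
    return distance >= r.from && distance <= r.to;
}
```

For any distance d, range_contains(r, d) and range_contains(mirror_range(r), -d) give the same answer, which is why 'term1 <4,10> term2' and 'term2 <-10,-4> term1' describe the same matches.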
Negative distance support was introduced so that it can be used for the
AROUND operator mentioned in websearch_to_tsquery[1]. In the web search
query language, AROUND(N) searches for words within a given distance N in
both the forward and backward directions, and it can be represented as
the <-N,N> range phrase operator.
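As a rough illustration of the AROUND(N) mapping, the C sketch below checks a pair of word positions against the <-N,N> range. The function around_matches is a hypothetical name, not part of the patch or of websearch_to_tsquery, and the distance is assumed to be the position of the second word minus the position of the first.

```c
#include <stdbool.h>

/*
 * Hypothetical check for AROUND(n) expressed as the range <-n,n>:
 * the second word may appear up to n positions before or after the first.
 */
static bool
around_matches(int pos1, int pos2, int n)
{
    int distance = pos2 - pos1;

    return distance >= -n && distance <= n;
}
```

Under these assumptions around_matches(3, 5, 2) and around_matches(5, 3, 2) are both true, since the order of the words does not matter, while around_matches(3, 6, 2) is false.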
--
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
0001-range_phrase_operator_v1.patch | text/x-patch | 42.8 KB |
From: | Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Range phrase operator in tsquery |
Date: | 2018-07-09 08:09:05 |
Message-ID: | 20180709150905.6a27d15c@asp437-ThinkPad-L380 |
Hello hackers,
An updated version of the patch is attached.
--
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
0001-range-phrase-operator-v2.patch | text/x-patch | 43.0 KB |
From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | a(dot)parfenov(at)postgrespro(dot)ru |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Range phrase operator in tsquery |
Date: | 2018-11-15 22:15:07 |
Message-ID: | CA+q6zcXyRY6kKpuXAJgvznyA0Jj3tOe=XkQ1n7e3XaE44ZW1Rw@mail.gmail.com |
> On Fri, 27 Apr 2018 at 13:03, Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru> wrote:
>
> Currently, the phrase operator in Postgres FTS supports only an exact
> match of the distance between two words. That is sufficient for searching
> simple/exact phrases, but in some cases the exact distance is unknown and
> we only want the words to be close enough. E.g. it may help to search for
> phrases with additional words in the middle of the phrase
Hi,
Thank you for the patch, it looks like a nice feature. A few questions:
+    if (!distance_from_set)
+    {
+        distance_from = distance_to < 0 ? MINENTRYPOS : 0;
+    }
+    if (!distance_to_set)
+    {
+        distance_to = distance_from < 0 ? 0 : MAXENTRYPOS;
+    }
Why use 0 here instead of MAXENTRYPOS/MINENTRYPOS? It looks a bit strange:
SELECT 'a <,-1000> b'::tsquery;
tsquery
------------------------
'a' <-16384,-1000> 'b'
(1 row)
SELECT 'a <,1000> b'::tsquery;
tsquery
------------------
'a' <0,1000> 'b'
(1 row)
Also, I wonder why LIMITPOS wasn't changed after MINENTRYPOS was introduced?
#define LIMITPOS(x) ( ( (x) >= MAXENTRYPOS ) ? (MAXENTRYPOS-1) : (x) )
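For readers following the first question, the quoted hunk can be tried in isolation. The C sketch below is not the patch itself: fill_range_defaults is a hypothetical wrapper around the two quoted statements, and MINENTRYPOS is assumed to be -MAXENTRYPOS, inferred from the -16384 in the output above. MAXENTRYPOS is defined as 1<<14 in PostgreSQL's ts_type.h.

```c
#include <stdbool.h>

#define MAXENTRYPOS (1 << 14)       /* 16384, as in PostgreSQL's ts_type.h */
#define MINENTRYPOS (-MAXENTRYPOS)  /* assumption, inferred from the output */

/*
 * Hypothetical wrapper around the quoted hunk: fill in whichever bound
 * was omitted. The caller must have initialized the bound that was set.
 */
static void
fill_range_defaults(bool from_set, bool to_set, int *from, int *to)
{
    if (!from_set)
        *from = (*to < 0) ? MINENTRYPOS : 0;
    if (!to_set)
        *to = (*from < 0) ? 0 : MAXENTRYPOS;
}
```

Under these assumptions, an omitted lower bound with an upper bound of -1000 becomes -16384, while an omitted lower bound with an upper bound of 1000 becomes 0 rather than MINENTRYPOS, which reproduces the asymmetry shown in the two queries.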
From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | a(dot)parfenov(at)postgrespro(dot)ru |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Range phrase operator in tsquery |
Date: | 2018-11-30 20:08:14 |
Message-ID: | CA+q6zcX2oiF5tGHAvbpn0_PcOC1d3iyb3MpveNqj3QqoBPYJ6A@mail.gmail.com |
Due to the lack of a response, I'm marking this as returned with feedback.