Re: Typo Patch

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: Typo Patch
Дата
Msg-id CAKFQuwZpDuPPBLy4KEyEt0xx_X_3Qp+tbq2zrkf-dZ0mmYdBVg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Typo Patch  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Tue, Jul 5, 2016 at 11:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Jul 2, 2016 at 12:17 PM, CharSyam <charsyam@gmail.com> wrote:
> I fixed typos. and attached patch for this.
> Thanks.
>
> I only changed comments only in src/backend/utils/adt/tsvector_op.c

Well, that's definitely a typo.  However, typo or no typo, it's hard
for me to figure out with any certainty what that sentence actually
means.
​​

Given that "curitem->qoperator.​distance" is constant.

Also, both loops go from low to high position (over the same scale)  with the inner loop stopping as soon as it moves beyond the position of the outer loop - namely that the left/inner item is always positioned to the left of the right/outer operand within the document being search.

Regardless of whether there is a match at this meeting point increasing the outer loop's position will not cause any of the previously checked (at the lower positions) inner loop items to match where they did not before.  I say this but I'm concerned that for sufficiently large values of curitem->qoperator.distance a given left operand could match multiple right operands since the distance is a maximum extent.

Thus, in the case of a match the current Lpos needs to be checked again during the subsequent iterator of the outer loop.

The "else if (Rpos - Lpos < distance) break" block though is oddly commented/written since distance is a maximum - shouldn't this be treated as a match as well?

Since, for sufficiently large values of "distance", a single left operand could cause multiple right operands to be matched when the less-than condition matches we need to leave Lpos alone and try the next Rpos against it.  For the greater than (default) and the equal cases we can skip rechecking the current Lpos.

The comment in question can be removed - we're not actually resetting anything here.  The use of LposStart is not really needed - just make sure to leave Lpos in the correct position (i.e. optional increment on all paths) and the inner while loop will do the correct thing.

It seems a bit odd to be keying off of the RIGHT operand and returning its position when left-to-right languages would consider the position of the leading word to be reported.

Code Comment Suggestion:

For each physically positioned right-side operand iterate over each instance of the left-side operand to locate one within the specified distance.  As soon as a match is found move onto the next right-operand and continue searching starting with the last checked left-side operand.  Note that for an exact distance match the current left-side operand can be skipped over.

For some graphical imagery consider how a Slinky operates.  Hold the left side, move the right side, release the left and let it collapse; repeat.  In this case, though, the collapsing can be stopped mid-stream since the distance check has an inequality.

The following shouldn't be taken as an actual patch submission but rather a codification of the above.

diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 242b7e1..fefaca5 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -1375,7 +1375,6 @@ TS_phrase_execute(QueryItem *curitem,
  ExecPhraseData Ldata = {0, false, NULL},
  Rdata = {0, false, NULL};
  WordEntryPos *Lpos,
-   *LposStart,
    *Rpos,
    *pos_iter = NULL;
 
@@ -1423,15 +1422,17 @@ TS_phrase_execute(QueryItem *curitem,
  * ExecPhraseData->data can point to the tsvector's WordEntryPosVector
  */
 
+ /*
+  * For each physically positioned right-side operand iterate over each
+  * instance of the left-side operand to locate one within the specified
+  * distance.  As soon as a match is found move onto the next right-operand
+  * and continue searching starting with the last checked left-side operand.
+  * Note that for an exact distance match the current left-side operand
+  * can be skipped over.
+  */
  Rpos = Rdata.pos;
- LposStart = Ldata.pos;
  while (Rpos < Rdata.pos + Rdata.npos)
  {
- /*
- * We need to check all possible distances, so reset Lpos
- * to guranteed not yet satisfied position.
- */
- Lpos = LposStart;
  while (Lpos < Ldata.pos + Ldata.npos)
  {
  if (WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) ==
@@ -1449,7 +1450,7 @@ TS_phrase_execute(QueryItem *curitem,
  * could not satisfy distance for any other right
  * position
  */
- LposStart = Lpos + 1;
+ Lpos++
  break;
  }
  else
@@ -1462,16 +1463,26 @@ TS_phrase_execute(QueryItem *curitem,
  }
 
  }
- else if (WEP_GETPOS(*Rpos) <= WEP_GETPOS(*Lpos) ||
- WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) <
+ else if (WEP_GETPOS(*Rpos) <= WEP_GETPOS(*Lpos)
+ {
+ /*
+ * No Increment - we are beyond the current right operand
+                     * so its possible this left operand could match the next right
+                     * operand.
+                     */
+ break;
+ }
+ else if (WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) <
  curitem->qoperator.distance)
  {
+ /* THIS SHOULD BE A MATCH? */
  /*
- * Go to the next Rpos, because Lpos is ahead or on less
- * distance than required by current operator
+ * No Increment - since we are within the distance specified
+ * it is possible that adding the additional distance to the next
+ * right operand will still leave us within range of this left operand
+ * and so we need to check it again.
  */
  break;
-
  }
 
  Lpos++;

David J.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: can we optimize STACK_DEPTH_SLOP
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Typo Patch