Re: check for null value before looking up the hash function

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: check for null value before looking up the hash function
Дата
Msg-id CAKFQuwbFBCk+_LF59TcF16KOz9u5qAbx2UtAuju2aFfFhXbMHg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: check for null value before looking up the hash function  (Ranier Vilela <ranier.vf@gmail.com>)
Список pgsql-hackers
On Sat, May 21, 2022 at 8:32 AM Ranier Vilela <ranier.vf@gmail.com> wrote:
Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <tomas.vondra@enterprisedb.com> escreveu:


On 5/21/22 15:06, Ranier Vilela wrote:
>>Zhihong Yu <zyu(at)yugabyte(dot)com> writes:
>>> I was looking at the code in hash_record()
>>> of src/backend/utils/adt/rowtypes.c
>>> It seems if nulls[i] is true, we don't need to look up the hash function.
>
>>I don't think this is worth changing. It complicates the logic,
>>rendering it unlike quite a few other functions written in the same
>>style. In cases where the performance actually matters, the hash
>>function is cached across multiple calls anyway. You might save
>>something if you have many calls in a query and not one of them
>>receives a non-null input, but how likely is that?
>
> I disagree.
> I think that is worth changing. The fact of complicating the logic
> is irrelevant.

That's a quite bold claim, and yet you haven't supported it by any
argument whatsoever. Trade-offs between complexity and efficiency are a
crucial development task, so complicating the logic clearly does matter.
What I meant is that complicating the logic in search of efficiency is worth it, and that's what everyone is looking for in this thread.
Likewise, not complicating the logic, losing a little bit of efficiency, applied to all the code, leads to a big loss of efficiency.
In other words, I never miss an opportunity to gain efficiency.


I disliked the fact that 80% of the patch was adding indentation.  Instead, remove indentation for the normal flow case and move the loop short-circuit to the top of the loop where it is traditionally found.

This seems like a win on both efficiency and complexity grounds.  Having the "/* see hash_array() */" comment and logic twice is a downside but a minor one that could be replaced with a function call if desired.

diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..0bc28d1742 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -1838,6 +1838,13 @@ hash_record(PG_FUNCTION_ARGS)
        TypeCacheEntry *typentry;
        uint32      element_hash;
 
+       if (nulls[i])
+       {
+           /* see hash_array() */
+           result = (result << 5) - result + 0;
+           continue;
+       }
+
        att = TupleDescAttr(tupdesc, i);
 
        if (att->attisdropped)
@@ -1860,24 +1867,16 @@ hash_record(PG_FUNCTION_ARGS)
            my_extra->columns[i].typentry = typentry;
        }
 
-       /* Compute hash of element */
-       if (nulls[i])
-       {
-           element_hash = 0;
-       }
-       else
-       {
-           LOCAL_FCINFO(locfcinfo, 1);
+       LOCAL_FCINFO(locfcinfo, 1);
 
-           InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
-                                    att->attcollation, NULL, NULL);
-           locfcinfo->args[0].value = values[i];
-           locfcinfo->args[0].isnull = false;
-           element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));
+       InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
+                                   att->attcollation, NULL, NULL);
+       locfcinfo->args[0].value = values[i];
+       locfcinfo->args[0].isnull = false;
+       element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));
 
-           /* We don't expect hash support functions to return null */
-           Assert(!locfcinfo->isnull);
-       }
+       /* We don't expect hash support functions to return null */
+       Assert(!locfcinfo->isnull);
 
        /* see hash_array() */
        result = (result << 5) - result + element_hash;

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ranier Vilela
Дата:
Сообщение: Re: check for null value before looking up the hash function
Следующее
От: Tom Lane
Дата:
Сообщение: Re: check for null value before looking up the hash function