On 11/27/17 14:17, Tomas Vondra wrote:
> Hi,
>
> On 11/27/2017 07:57 PM, tcook@blackducksoftware.com wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 14932
>> Logged by: Todd Cook
>> Email address: tcook@blackducksoftware.com
>> PostgreSQL version: 10.1
>> Operating system: CentOS Linux release 7.4.1708 (Core)
>> Description:
>>
>> It hangs on a table with 167834 rows, though it works fine with only 167833
>> rows. When it hangs, CTRL-C does not interrupt it, and the backend has to
>> be killed to stop it.
>>
>
> Can you share the query and data, so that we can reproduce the issue?
>
> Based on the stack traces this smells like a bug in the simplehash,
> introduced in PostgreSQL 10. Perhaps somewhere in tuplehash_grow(),
> which gets triggered for 167834 rows (but not for 167833).

FWIW, changing the guts of hashint8() to

+	if (val >= INT32_MIN && val <= INT32_MAX)
+		return hash_uint32((uint32) val);
+	else
+		return hash_any((unsigned char *) &val, sizeof(val));

allows us to process a full-sized data set of around 900 million rows. However,
memory usage seemed rather excessive: we could only run 7 of these jobs in
parallel on a 128GB system before the OOM killer kicked in, rather than the
usual 24. If there's any interest, I can try to measure exactly how excessive.
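
To illustrate why the change above might matter, here is a minimal standalone
sketch. It assumes the stock hashint8() folds the two 32-bit halves of the
value together with XOR before hashing, which is my recollection of the
PostgreSQL source rather than something stated in this thread. The helpers
mix32() and hash_bytes() are hypothetical stand-ins, not PostgreSQL's
hash_uint32() and hash_any(), and fold_hash() and patched_hash() are
illustrative names only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hash_uint32(): a generic 32-bit bit mixer. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Hypothetical stand-in for hash_any(): 32-bit FNV-1a over a byte buffer. */
static uint32_t hash_bytes(const unsigned char *p, size_t len)
{
    uint32_t h = 2166136261U;

    assert(p != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619U;
    }
    return h;
}

/* Assumed shape of the stock function: XOR-fold the halves, then hash. */
static uint32_t fold_hash(int64_t val)
{
    uint32_t lo = (uint32_t) val;
    uint32_t hi = (uint32_t) ((uint64_t) val >> 32);

    /* Negative values use ~hi so int32-range inputs hash like plain int4. */
    lo ^= (val >= 0) ? hi : ~hi;
    return mix32(lo);
}

/* Shape of the change above: keep int4 compatibility, else hash all 8 bytes. */
static uint32_t patched_hash(int64_t val)
{
    if (val >= INT32_MIN && val <= INT32_MAX)
        return mix32((uint32_t) val);
    return hash_bytes((const unsigned char *) &val, sizeof(val));
}
```

Under that assumption, any two values whose halves XOR to the same result
(for example 0x0000000100000001 and 0x0000000200000002, which both fold to 0)
collide in fold_hash() regardless of how good the final mixer is, while
patched_hash() sees all 8 bytes. Values inside the int32 range take the same
path in both functions, so they still hash like int4.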
-- todd