On Thu, Aug 3, 2017 at 8:49 AM, Daniel Verite <daniel@manitou-mail.org> wrote:
> With query #2 it ends up crashing after ~5hours and produces
> the log in log-valgrind-2.txt.gz with some other entries than
> case #1, but AFAICS still all about reading uninitialised values
> in space allocated by datumCopy().
Right. This part is really interesting to me:
==48827== Uninitialised value was created by a heap allocation
==48827== at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==48827== by 0x80B597: AllocSetAlloc (aset.c:771)
==48827== by 0x810ADC: palloc (mcxt.c:862)
==48827== by 0x72BFEF: datumCopy (datum.c:171)
==48827== by 0x81A74C: tuplesort_putdatum (tuplesort.c:1515)
==48827== by 0x5E91EB: advance_aggregates (nodeAgg.c:1023)
If you go to datum.c:171, you will see that it is the code path for
pass-by-reference datatypes that lack a varlena header. Text does have
a varlena header, though, so taking that path is clearly wrong. I
don't know how this actually happened, but working back through the
relevant tuplesort_begin_datum() caller, initialize_aggregate(), does
suggest some things. (tuplesort_begin_datum() is where datumTypeLen is
determined for the entire datum tuplesort.)
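As a rough illustration of why that branch is surprising for text, here
is a minimal, self-contained sketch of the dispatch that datumCopy()
performs on (typByVal, typLen). It is a simplified model and not the
PostgreSQL source: the CopyPath enum and the copy_path() and
non_varlena_copy_size() helpers are hypothetical names invented for this
example, and the handling of toasted or expanded varlena values is left
out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three branches; not PostgreSQL identifiers. */
typedef enum CopyPath
{
    COPY_BY_VALUE,          /* typByVal: the Datum itself holds the value */
    COPY_VARLENA,           /* typLen == -1: size is read from the varlena header */
    COPY_FIXED_OR_CSTRING   /* anything else: size comes from typLen or strlen() */
} CopyPath;

/*
 * Simplified model of the branch selection: pass-by-value first, then
 * varlena, then the remaining pass-by-reference types.
 */
CopyPath
copy_path(bool typByVal, int typLen)
{
    if (typByVal)
        return COPY_BY_VALUE;
    if (typLen == -1)
        return COPY_VARLENA;
    return COPY_FIXED_OR_CSTRING;
}

/*
 * Number of bytes the non-varlena branch would copy in this model:
 * typLen itself for fixed-length types, strlen() + 1 for cstring
 * (typLen == -2). The varlena header is never consulted here.
 */
size_t
non_varlena_copy_size(const char *data, int typLen)
{
    if (typLen > 0)
        return (size_t) typLen;
    return strlen(data) + 1;
}
```

With typLen == -1, which is what text should have, copy_path() never
returns COPY_FIXED_OR_CSTRING in this model. Reaching the non-varlena
branch for a text datum would therefore suggest that the tuplesort was
set up with some other typLen.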
I am once again only guessing, but I have to wonder if this is a
problem in commit b8d7f053. The basis of this suspicion is that the
problem seems likely to begin before tuplesort_begin_datum() is even
called. If the problem were within tuplesort, that could only be
because get_typlenbyval() gives wrong answers, which seems very
unlikely.
--
Peter Geoghegan