Thread: Query crash with 15.5 on debian bookworm/armv8

Query crash with 15.5 on debian bookworm/armv8

From
Clemens Eisserer
Date:
Hi,

I've just updated my raspberry pi 3 from postgresql-13.3 on
bullseye/armv6 to postgresql-15.5 on debian-bookworm/armv8.

However, after the upgrade, I experience reproducible crashes querying
the following table:

CREATE TABLE public.smartmeter (
   leistungsfaktor real,
   momentanleistung integer,
   spannungl1 real,
   spannungl2 real,
   spannungl3 real,
   stroml1 real,
   stroml2 real,
   stroml3 real,
   wirkenergien real,
   wirkenergiep real,
   ts timestamp with time zone NOT NULL
);
CREATE INDEX smartmeter_ts_idx ON public.smartmeter USING brin (ts);

with the following query:
SELECT floor(extract(epoch from ts)/60)*60 AS "time", AVG(spannungL1)
as l1, AVG(spannungL2) as l2, AVG(spannungL3) as l3 FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND
'2023-12-25T19:01:30.514Z' GROUP BY time order by time;

any ideas how to diagnose the issue further?
is this a known problem?

Thanks & best regards, Clemens

Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
0x0000007ff6eb7fe0 in __GI_epoll_pwait (epfd=4, events=0x5555ea2d20,
maxevents=1, timeout=timeout@entry=-1, set=set@entry=0x0) at
../sysdeps/unix/sysv/linux/epoll_pwait.c:40
40      ../sysdeps/unix/sysv/linux/epoll_pwait.c: No such file or directory.
(gdb) c
Continuing.

Program received signal SIGUSR1, User defined signal 1.
0x0000007ff6ea7f58 in __libc_pread64 (fd=25,
buf=buf@entry=0x7feb754880, count=count@entry=8192,
offset=offset@entry=16384) at ../sysdeps/unix/sysv/linux/pread64.c:25
25      ../sysdeps/unix/sysv/linux/pread64.c: No such file or directory.
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
(gdb) bt full
#0  0x0000007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#1  0x0000007fe59bb49c in llvm::raw_ostream::write(char const*,
unsigned long) () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#2  0x0000007fe6d71048 in
llvm::MCContext::createTempSymbol(llvm::Twine const&, bool) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#3  0x0000007fe6d713f0 in llvm::MCContext::createTempSymbol() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#4  0x0000007fe6d95c6c in
llvm::MCObjectStreamer::emitCFIEndProcImpl(llvm::MCDwarfFrameInfo&) ()
from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#5  0x0000007fe619f4c0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#6  0x0000007fe6180b6c in llvm::AsmPrinter::emitFunctionBody() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#7  0x0000007fe72a4ba4 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#8  0x0000007fe5d3122c in
llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#9  0x0000007fe5b14390 in
llvm::FPPassManager::runOnFunction(llvm::Function&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#10 0x0000007fe5b1af70 in
llvm::FPPassManager::runOnModule(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#11 0x0000007fe5b14d98 in
llvm::legacy::PassManagerImpl::run(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#12 0x0000007fe7187d70 in
llvm::orc::SimpleCompiler::operator()(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#13 0x0000007fe71dc138 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#14 0x0000007fe71dbf44 in
llvm::orc::IRCompileLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility,
std::default_delete<llvm::orc::MaterializationResponsibility> >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#15 0x0000007fe71dc634 in
llvm::orc::IRTransformLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility,
std::default_delete<llvm::orc::MaterializationResponsibility> >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#16 0x0000007fe71dc634 in
llvm::orc::IRTransformLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility,
std::default_delete<llvm::orc::MaterializationResponsibility> >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#17 0x0000007fe71e2648 in
llvm::orc::BasicIRLayerMaterializationUnit::materialize(std::unique_ptr<llvm::orc::MaterializationResponsibility,
std::default_delete<llvm::orc::MaterializationResponsibility> >) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#18 0x0000007fe7199c18 in llvm::orc::MaterializationTask::run() ()
from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#19 0x0000007fe71a4ea0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#20 0x0000007fe719bad0 in
llvm::orc::ExecutionSession::dispatchOutstandingMUs() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#21 0x0000007fe719ea84 in
llvm::orc::ExecutionSession::OL_completeLookup(std::unique_ptr<llvm::orc::InProgressLookupState,
std::default_delete<llvm::orc::InProgressLookupState> >,
std::shared_ptr<llvm::orc::AsynchronousSymbolQuery>,
std::function<void (llvm::DenseMap<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >,
llvm::DenseMapInfo<llvm::orc::JITDylib*, void>,
llvm::detail::DenseMapPair<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > const&)>)
() from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#22 0x0000007fe71ab544 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#23 0x0000007fe718dd90 in
llvm::orc::ExecutionSession::OL_applyQueryPhase1(std::unique_ptr<llvm::orc::InProgressLookupState,
std::default_delete<llvm::orc::InProgressLookupState> >, llvm::Error)
() from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#24 0x0000007fe718bf34 in
llvm::orc::ExecutionSession::lookup(llvm::orc::LookupKind,
std::vector<std::pair<llvm::orc::JITDylib*,
llvm::orc::JITDylibLookupFlags>,
std::allocator<std::pair<llvm::orc::JITDylib*,
llvm::orc::JITDylibLookupFlags> >
> const&, llvm::orc::SymbolLookupSet, llvm::orc::SymbolState, llvm::unique_function<void
(llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr,llvm::JITEvaluatedSymbol,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr,void>,
llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::JITEvaluatedSymbol> >
>)>, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >,
llvm::DenseMapInfo<llvm::orc::JITDylib*, void>,
llvm::detail::DenseMapPair<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > const&)>)
() from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#25 0x0000007fe719bce0 in
llvm::orc::ExecutionSession::lookup(std::vector<std::pair<llvm::orc::JITDylib*,
llvm::orc::JITDylibLookupFlags>,
std::allocator<std::pair<llvm::orc::JITDylib*,
llvm::orc::JITDylibLookupFlags> > > const&,
llvm::orc::SymbolLookupSet const&, llvm::orc::LookupKind, llvm::orc::SymbolState,
std::function<void (llvm::DenseMap<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >,
llvm::DenseMapInfo<llvm::orc::JITDylib*, void>,
llvm::detail::DenseMapPair<llvm::orc::JITDylib*,
llvm::DenseSet<llvm::orc::SymbolStringPtr,
llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > const&)>)
() from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)



Re: Query crash with 15.5 on debian bookworm/armv8

From
Adrian Klaver
Date:
On 12/25/23 13:01, Clemens Eisserer wrote:
> Hi,
> 
> I've just updated my raspberry pi 3 from postgresql-13.3 on
> bullseye/armv6 to postgresql-15.5 on debian-bookworm/armv8.

How did you upgrade?



-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Query crash with 15.5 on debian bookworm/armv8

From
Clemens Eisserer
Date:
Hi Adrian,

> How did you upgrade?

A fresh install based on the "Raspberry Pi OS Lite" image provided (based
on debian bookworm), migrating the data with pg_dumpall & psql -f.



Re: Query crash with 15.5 on debian bookworm/armv8

From
Adrian Klaver
Date:
On 12/25/23 13:51, Clemens Eisserer wrote:
> Hi Adrian,
> 
>> How did you upgrade?
> 
> A fresh install based on  "Raspberry Pi OS Lite" image provided (based

Does that install Postgres as part of the image or did you get it from 
somewhere else?

> on debian bookworm) with pg_dumpall & psql -f.
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Query crash with 15.5 on debian bookworm/armv8

From
Clemens Eisserer
Date:
> Does that install Postgres as part of the image or did you get it from
> somewhere else?

I installed it via "apt-get install postgresql" and it downloaded
postgresql-15_15.5-0+deb12u1_arm64.deb - which seems to be the current
package shipped with debian bookworm for arm64:
https://packages.debian.org/bookworm/arm64/postgresql-15/download

best regards, Clemens



Re: Query crash with 15.5 on debian bookworm/armv8

From
Adrian Klaver
Date:
On 12/25/23 13:51, Clemens Eisserer wrote:
> Hi Adrian,
> 
>> How did you upgrade?
> 
> A fresh install based on  "Raspberry Pi OS Lite" image provided (based
> on debian bookworm) with pg_dumpall & psql -f.
> 

Did you install the 32 or 64 bit version from here?:

https://www.raspberrypi.com/software/operating-systems/

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Query crash with 15.5 on debian bookworm/armv8

From
Tom Lane
Date:
Clemens Eisserer <linuxhippy@gmail.com> writes:
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
> (gdb) bt full
> #0  0x0000007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
> No symbol table info available.
> #1  0x0000007fe59bb49c in llvm::raw_ostream::write(char const*,
> unsigned long) () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
> No symbol table info available.

FWIW, since this crash is inside LLVM you could presumably dodge the bug
by setting "jit" to off.
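
A minimal sketch of that workaround, assuming a psql session with sufficient privileges. `jit` is a standard PostgreSQL setting; the `ALTER SYSTEM` step is one possible way to persist it and is not something prescribed in this thread:

```sql
-- Disable JIT for the current session only, then confirm the value.
SET jit = off;
SHOW jit;

-- Optionally persist it for the whole cluster (requires superuser),
-- then reload the configuration so new sessions pick it up.
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();
```

With `jit` off the executor does not hand expressions to LLVM, so the crashing code path should not be reached.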

As for an actual fix, perhaps a newer version of LLVM is needed?
I don't see a problem testing this query on my RPI with Ubuntu 23.10
(LLVM 16).

            regards, tom lane



Re: Query crash with 15.5 on debian bookworm/armv8

From
Clemens Eisserer
Date:
Hi Tom,

> FWIW, since this crash is inside LLVM you could presumably dodge the bug
> by setting "jit" to off.

Thanks, this indeed solved the crash.
Just to make sure this crash doesn't have anything to do with my
setup/config (I'd changed quite a few settings in postgresql.conf),
I gave it a try on a fresh bookworm install and it also crashed immediately.

> As for an actual fix, perhaps a newer version of LLVM is needed?
> I don't see a problem testing this query on my RPI with Ubuntu 23.10
> (LLVM 16).

I also gave Ubuntu 23.10 a try (15.4 built with llvm-15) and it worked
as expected, explain analyze even mentioned the JIT was active.
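
One way to check this, sketched under the assumption that the server was built with JIT support. `pg_jit_available()` and the `JIT:` section of `EXPLAIN (ANALYZE)` output are standard PostgreSQL features; the simplified query is only an illustration:

```sql
-- Returns true if a JIT provider can be loaded and jit is set to on.
SELECT pg_jit_available();

-- When JIT was used, the plan output ends with a "JIT:" section
-- listing the number of compiled functions and timing details.
EXPLAIN (ANALYZE)
SELECT AVG(spannungL1) FROM smartmeter;
```

A plan whose estimated cost stays below `jit_above_cost` will not show that section.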

I've filed a debian bug report with a link to this discussion and a
plea to build postgresql against llvm >= 15:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059476

To be honest I don't know why llvm-14 was chosen, as 15 is also
available in bookworm.

Thanks & best regards, Clemens



Re: Query crash with 15.5 on debian bookworm/armv8

From
Thomas Munro
Date:
On Wed, Dec 27, 2023 at 5:17 AM Clemens Eisserer <linuxhippy@gmail.com> wrote:
> > FWIW, since this crash is inside LLVM you could presumably dodge the bug
> > by setting "jit" to off.
>
> Thanks, this indeed solved the crash.
> Just to make sure this crash doesn't have anything to do with my
> setup/config (I'd changed quite a few settings in postgresql.conf),
> I gave it a try on a fresh bookworm install and it also crashed immediately.
>
> > As for an actual fix, perhaps a newer version of LLVM is needed?
> > I don't see a problem testing this query on my RPI with Ubuntu 23.10
> > (LLVM 16).
>
> I also gave Ubuntu 23.10 a try (15.4 built with llvm-15) and it worked
> as expected, explain analyze even mentioned the JIT was active.

I can't reproduce this on LLVM 14 on an aarch64 Mac FWIW (after
setting jit_*_cost to 0, as required since the table is empty).
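
The settings referred to here are presumably `jit_above_cost`, `jit_inline_above_cost` and `jit_optimize_above_cost`; that reading of `jit_*_cost` is an assumption. A sketch of forcing JIT compilation on a small or empty table before running the query from the original report:

```sql
-- Force JIT even for cheap plans (assumed expansion of jit_*_cost).
SET jit = on;
SET jit_above_cost = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;

-- Query from the original report; on an affected system this may crash the backend.
SELECT floor(extract(epoch from ts)/60)*60 AS "time",
       AVG(spannungL1) AS l1, AVG(spannungL2) AS l2, AVG(spannungL3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND '2023-12-25T19:01:30.514Z'
GROUP BY time ORDER BY time;
```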

> I've filed a debian bug report with a link to this discussion and a
> plea to build postgresql against llvm >= 15:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059476

I doubt they'll change that, and in any case we'll need to get to the
bottom of this.  Perhaps an assertion build of LLVM will fail in some
illuminating internal assertion?  Unfortunately it's a non-trivial
business to get a debug build of LLVM going (it takes oodles of disk
and CPU and a few confusing-to-me steps)...

. o O ( It would be wonderful if assertion-enabled packages were
readily available for a common platform like Debian.  I've finally
been spurred on to reach out to the maintainer of apt.llvm.org to ask
about that.  It'd also be very handy for automated next-version
monitoring. )



Re: Query crash with 15.5 on debian bookworm/armv8

From
Clemens Eisserer
Date:
Hi Thomas,

In case it is helpful for analyzing what's causing the crash, I've
uploaded the db dump I experience the crash on to:
https://drive.google.com/file/d/1H9Y3FaoBafakHwXhpT3s8NNQ1UNJLpJY/view?usp=sharing

The only steps I had to do to trigger the crash were:
- Start with a fresh raspberry pi os bookworm 64-bit image
- install postgresql (packages are pulled from debian and also match
debian's md5 sums so I guess there should be no difference caused by
using raspberry pi os base image)
- import the linked db export with psql -f (I had to generate de_AT locale first)
- execute the query

Best regards, Clemens

On Tue, Dec 26, 2023 at 23:16, Thomas Munro
<thomas.munro@gmail.com> wrote:
>
> On Wed, Dec 27, 2023 at 5:17 AM Clemens Eisserer <linuxhippy@gmail.com> wrote:
> > > FWIW, since this crash is inside LLVM you could presumably dodge the bug
> > > by setting "jit" to off.
> >
> > Thanks, this indeed solved the crash.
> > Just to make sure this crash doesn't have anything to do with my
> > setup/config (I'd changed quite a few settings in postgresql.conf),
> > > I gave it a try on a fresh bookworm install and it also crashed immediately.
> >
> > > As for an actual fix, perhaps a newer version of LLVM is needed?
> > > I don't see a problem testing this query on my RPI with Ubuntu 23.10
> > > (LLVM 16).
> >
> > I also gave Ubuntu 23.10 a try (15.4 built with llvm-15) and it worked
> > as expected, explain analyze even mentioned the JIT was active.
>
> I can't reproduce this on LLVM 14 on an aarch64 Mac FWIW (after
> setting jit_*_cost to 0, as required since the table is empty).
>
> > I've filed a debian bug report with a link to this discussion and a
> > plea to build postgresql against llvm >= 15:
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059476
>
> I doubt they'll change that, and in any case we'll need to get to the
> bottom of this.  Perhaps an assertion build of LLVM will fail in some
> illuminating internal assertion?  Unfortunately it's a non-trivial
> business to get a debug build of LLVM going (it takes oodles of disk
> and CPU and a few confusing-to-me steps)...
>
> . o O ( It would be wonderful if assertion-enabled packages were
> readily available for a common platform like Debian.  I've finally
> been spurred on to reach out to the maintainer of apt.llvm.org to ask
> about that.  It'd also be very handy for automated next-version
> monitoring. )