Discussion: [MASSMAIL]constant crashing


[MASSMAIL]constant crashing

From: jack
Date:
Hello,
I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the updating (UPDATE table SET field=UPPER(field), etc.).
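For concreteness, a minimal sketch of the workflow described above (the table name, column name, and file name are placeholders, not the actual schema):

```sql
-- Load one of the ~100 CSV files into the table (psql's client-side copy)
\copy mytable FROM 'data_001.csv' WITH (FORMAT csv)

-- update.sql: post-load cleanup of the kind described above
UPDATE mytable SET field = UPPER(field);
```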

The errors are all "Server closed the connection unexpectedly"

The problem is that the errors are never the same and occur at different times and at different steps in the process.
If I update the machine (apt update/upgrade), reboot and try again, the errors occur at different places.
The errors messages and logs are not helpful and I have not been able to determine the cause.

I decided to load the CSV files into temporary tables and then perform the updates on the temporary tables before moving the data to the main table.
This works for most of the tables, but there are always 1 or 2 that crash when I try to update them.
In my last attempt, all tables updated successfully except the last one, with about 100 million records.

I tried changing these settings to 60 minutes and it still crashes.
SET statement_timeout = '60min';
SET idle_in_transaction_session_timeout = '60min';

Then I tried to update the data in blocks of 100,000 records and it crashed 4 times on 4 different blocks.
So I narrowed the first crashed block down to a block of 10 records, until it crashed.
Then I updated each of the 10 records individually to identify the problematic record, but then all 10 records updated without crashing!
Pure insanity!
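One way to express those 100,000-record blocks is a keyed range; this sketch assumes a sequential integer id column, which the real table may not have:

```sql
-- Update rows in blocks of 100,000, assuming a sequential "id" column
UPDATE mytable SET field = UPPER(field) WHERE id > 0      AND id <= 100000;
UPDATE mytable SET field = UPPER(field) WHERE id > 100000 AND id <= 200000;
-- ...and so on, one block per statement
```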

I am using a dedicated machine: a 10-core i9 with 128 GB of RAM and 2 x 4 TB NVMe drives.
There is only 1 user on this stand-alone machine, me, and the machine is not connected to the internet or a network.
I have reinstalled Ubuntu 22.04 Server many times, wiping the disks and starting over.
Last attempts have been with Ubuntu 22.04.04 and PostgreSQL 16.
But the errors persist.

I have been working on this for just over 1 year now, documenting every step, and I am still unable to get this to work without it crashing somewhere along the way.
I am beginning to wonder if postgreSQL is bi-polar.

Any help would be greatly appreciated.
Thank you

Re: constant crashing

From: Adrian Klaver
Date:
On 4/14/24 07:24, jack wrote:
> Hello,
> I am trying to load about 1.4 billion records into a postgresql table.
> The data is in about 100 individual CSV files which I load individually 
> into the table using the COPY command.
> After the data is loaded I perform updates to the data, and this is when 
> I get errors. Some updates work, some crash.
> I am using psql and an SQL script file (update.sql) to perform the 
> updating (UPDATE table SET field=UPPER(field), etc.).
> 
> The errors are all "Server closed the connection unexpectedly"
> 


> I have been working on this for just over 1 year now, documenting every 
> step, and I am still unable to get this to work without it crashing 
> somewhere along the way.
> I am beginning to wonder if postgreSQL is bi-polar.

More information needed, to start:

1) Complete example of the code in the script.

2) Table definition(s) of those being copied into.

3) Sample of the data being copied.

4) The error message(s) generated.

5) Database locale
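For item 5, the locale settings can be read straight from the server, for example:

```sql
-- Per-database collation and ctype settings
SELECT datname, datcollate, datctype FROM pg_database;
```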

> 
> Any help would be greatly appreciated.
> Thank you

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: constant crashing

From: Tom Lane
Date:
jack <jack4pg@a7q.com> writes:
> I am trying to load about 1.4 billion records into a postgresql table.
> The data is in about 100 individual CSV files which I load individually into the table using the COPY command.
> After the data is loaded I perform updates to the data, and this is when I get errors. Some updates work, some crash.
> I am using psql and an SQL script file (update.sql) to perform the updating (UPDATE table SET field=UPPER(field), etc.).

> The errors are all "Server closed the connection unexpectedly"

What appears in the postmaster log when this happens?

It would also be instructive to run "top" while you trigger
the problem, and see if there is obvious distress like the
server process's memory consumption ballooning.

It smells like a resource-exhaustion problem.  Which probably
shouldn't be happening on a machine with 128G RAM, but maybe
something is misconfigured to give up way below that.  In any
case, you'll never find the cause from the client side.

> Last attempts have been with ubuntu 22.04.04 and postgreSQL 16.

16 dot which?
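The exact minor release can be read from the server itself:

```sql
SELECT version();       -- full version string
SHOW server_version;    -- just the version number
```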

            regards, tom lane



Re: constant crashing

From: David Rowley
Date:
On Mon, 15 Apr 2024 at 02:25, jack <jack4pg@a7q.com> wrote:
> Then I tried to update the data in blocks of 100,000 records and it crashed 4 times on 4 different blocks.
> So I updated the first crashed block down to the a block of 10 records, until it crashed.
> Then I updated each of the 10 records individually to identify the record that is problemantic, but then all 10
recordsupdated without crashing!
 
> Pure insanity!

> I have been working on this for just over 1 year now, documenting every step, and I am still unable to get this to
workwithout it crashing somewhere along the way.
 
> I am beginning to wonder if postgreSQL is bi-polar.

There are two things I can think of that might make PostgreSQL appear
"bi-polar" here.

1) The plan isn't the same when you update a small number of rows vs a
large number rows.  The planner may prefer an index depending on
what's in your WHERE clause.
2) JIT compilation is less likely with a small number of rows as the
plan will come out cheaper.

A stack trace as mentioned by Tom would be ideal, but if you can't get that...

I'd try:

a) Disable "jit" in postgresql.conf and reload the config file with
SELECT pg_reload_conf();
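If it helps, an alternative to hand-editing postgresql.conf is ALTER SYSTEM:

```sql
ALTER SYSTEM SET jit = off;   -- writes the setting to postgresql.auto.conf
SELECT pg_reload_conf();      -- reload the config without restarting
SHOW jit;                     -- confirm the new value
```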

b) Still crashing? Try dropping all indexes from the table before
performing the UPDATE.

c) Still crashing? Can you make it consistently crash with one
problematic CSV file and UPDATE statement?

Can you narrow the UPDATE's scope by forcing a TID Range scan,
starting with a WHERE clause such as:

WHERE ctid >= '(0,0)' AND ctid <= '(4294967294,0)';

then incrementally home in on the problematic record by "binary
searching" the table.  You can get the number of
blocks in the table with:

select pg_relation_size('your_table'::regclass) /
current_setting('block_size')::int;

replace the 4294967294 with the value from that and bisect the table
until you home in on the block with the issue.
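Putting those pieces together, one bisection step might look like this (the table and column names are placeholders, and 5000 stands in for half the block count reported by the first query):

```sql
-- Total number of blocks in the table
SELECT pg_relation_size('your_table'::regclass)
       / current_setting('block_size')::int AS nblocks;

-- Try the first half of the blocks; if this crashes, the bad row is
-- in this half, otherwise repeat on the second half, halving each time
UPDATE your_table
SET field = UPPER(field)
WHERE ctid >= '(0,0)' AND ctid < '(5000,0)';
```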

David