Discussion: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance


[BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
josef.machytka@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      14845
Logged by:          Josef Machytka
Email address:      josef.machytka@gmail.com
PostgreSQL version: 9.6.5
Operating system:   Debian 8
Description:

Hi,
I work from time to time with postgres_fdw and mysql_fdw and they both seem
to have the same problem - they use too much memory if I query huge remote
tables.

Since I had seen this behavior before in our monitoring, I decided to test
it on a test instance with a really huge amount of remote data that would
not fit into memory.

The behavior was as expected - PostgreSQL started to use more and more
memory until all available memory was used.

After that, in the first case PostgreSQL crashed (which was more or less OK
because it restarted), but in the second case the whole Debian instance
crashed and would not start again even after repeated attempts. I did not
fiddle with it and just recreated it, but I do not even want to imagine this
happening on some heavily used instance.

This behavior limits the usage of those FDWs very significantly. I cannot
allow other people to use foreign tables on the really huge tables we have,
and I have to check sizes before using any query which is expected to be
"heavy"... Of course I would love to hear from someone that I should not use
"heavy queries"... :-)

So is there a way to, for example, force the FDW to spill data into some
temp file instead of using so much memory? This behavior does not seem to
depend on the work_mem setting...

Thanks
Josef

(Config: PostgreSQL 9.6.5, latest versions of both FDWs, Debian 8)





Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Feike Steenbergen
Date:


On 6 October 2017 at 13:24, <josef.machytka@gmail.com> wrote:
> After that in first case postgresql crashed (which was more or less OK
> because it restarted) but in second case the whole Debian instance crashed

What causes the crash - is it the out-of-memory (OOM) killer?
How have you configured overcommit on your Debian instance?

https://www.postgresql.org/docs/current/static/kernel-resources.html#linux-memory-overcommit

If it is the OOM killer killing your process, you may want to disable overcommit. Doing that should cause the client to get a decent error message instead of the whole cluster being restarted.
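
For reference, a minimal sketch of that change (the overcommit_ratio value here is only illustrative; tune it to the machine's RAM and swap):

    # /etc/sysctl.conf - disallow memory overcommit (Linux)
    vm.overcommit_memory = 2
    # percentage of RAM (plus all swap) the kernel will hand out; tune per machine
    vm.overcommit_ratio = 80

    # load the new settings without rebooting:
    #   sysctl -p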

regards,

Feike

Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Fabrízio de Royes Mello
Date:

On Fri, Oct 6, 2017 at 8:24 AM, <josef.machytka@gmail.com> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      14845
> Logged by:          Josef Machytka
> Email address:      josef.machytka@gmail.com
> PostgreSQL version: 9.6.5
> Operating system:   Debian 8
> Description:
>
> Hi,
> I work from time to time with postgres_fdw and mysql_fdw and they both seem
> to have the same problem - they use too much memory if I query huge remote
> tables.
>
> Since I had seen this behavior before in our monitoring, I decided to test
> it on a test instance with a really huge amount of remote data that would
> not fit into memory.
>
> The behavior was as expected - PostgreSQL started to use more and more
> memory until all available memory was used.
>
> After that, in the first case PostgreSQL crashed (which was more or less OK
> because it restarted), but in the second case the whole Debian instance
> crashed and would not start again even after repeated attempts. I did not
> fiddle with it and just recreated it, but I do not even want to imagine this
> happening on some heavily used instance.
>

The OOM killer? What about your overcommit settings?

 
>
> This behavior limits the usage of those FDWs very significantly. I cannot
> allow other people to use foreign tables on the really huge tables we have,
> and I have to check sizes before using any query which is expected to be
> "heavy"... Of course I would love to hear from someone that I should not use
> "heavy queries"... :-)
>
> So is there a way to, for example, force the FDW to spill data into some
> temp file instead of using so much memory?


It would be nice if you could provide us with a reproducible test case, so we can try to figure out what's happening...

 
>
> This behavior does not seem to depend on the work_mem
> setting...
>

No, work_mem is used for sorting purposes.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Tom Lane
Date:
Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
>> I work from time to time with postgres_fdw and mysql_fdw and they both seem
>> to have the same problem - they use too much memory if I query huge remote
>> tables.

> It would be nice if you could provide us with a reproducible test case, so
> we can try to figure out what's happening...

Indeed.  I can't say about mysql_fdw, but postgres_fdw is only supposed to
fetch 100 rows at a time (cf fetch_size option), so it shouldn't run out
of memory just because the amount of data to be fetched is large.

One idea is that, because the planner lacks information about the remote
table, it picks some stupid plan that ends up blowing out memory.  In that
case, perhaps turning on use_remote_estimate would help.  But we're just
guessing here due to lack of useful detail.
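
For example, with a hypothetical foreign server "remote_pg" and foreign table
"remote_table" (placeholder names, not from the report), those options could
be set like so:

    -- have the planner ask the remote server for row estimates (remote EXPLAIN)
    ALTER SERVER remote_pg OPTIONS (ADD use_remote_estimate 'true');
    -- rows pulled per FETCH from the remote cursor; the default is 100
    ALTER FOREIGN TABLE remote_table OPTIONS (ADD fetch_size '1000');
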
        regards, tom lane



Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Josef Machytka
Date:
Hi,
yes, sorry, somehow I forgot the description of the task...

In the case which went wrong, I used postgres_fdw to compare data in the local and remote databases using "select all from remote except select all from local".

I selected the same table on remote and local; it has ~200M rows and a total size of ~20GB. I needed to see all differences because we get some erratic differences... Estimates from previous limited queries suggested that only approx. 1 to 3% of rows differ. So I decided to try to select them all and look for some patterns...
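
The shape of the query was essentially this (table names are placeholders):

    -- remote_table is a foreign table served by postgres_fdw
    SELECT * FROM remote_table
    EXCEPT
    SELECT * FROM local_table;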

The testing instance (on Google Compute Engine) had 4 CPUs and 26 GB of RAM. As for the OOM killer - I used the default setting on Debian 8 without any changes, so
/proc/sys/vm/overcommit_memory = 0
Monitoring was done by telegraf locally + InfluxDB + Grafana on another instance.

Nothing else was running on that instance, and the PostgreSQL cluster there contained only this one huge table.

After ~25 minutes of running, all memory was used and, as I mentioned, in the first case PostgreSQL crashed; in the second test (in which I lowered work_mem from 24MB to 8MB and increased shared_buffers to 8GB to see if it would help) the whole instance crashed and would not start anymore. The 500GB SSD disk was almost empty, so there were no problems with disk space.

Since I did not have time to fiddle with it, I just dropped the crashed instance and used Ansible to create a new one.

Thanks

On 6 October 2017 at 16:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> >> I work from time to time with postgres_fdw and mysql_fdw and they both seem
> >> to have the same problem - they use too much memory if I query huge remote
> >> tables.
>
> > It would be nice if you could provide us with a reproducible test case, so
> > we can try to figure out what's happening...
>
> Indeed.  I can't say about mysql_fdw, but postgres_fdw is only supposed to
> fetch 100 rows at a time (cf fetch_size option), so it shouldn't run out
> of memory just because the amount of data to be fetched is large.
>
> One idea is that, because the planner lacks information about the remote
> table, it picks some stupid plan that ends up blowing out memory.  In that
> case, perhaps turning on use_remote_estimate would help.  But we're just
> guessing here due to lack of useful detail.
>
>                         regards, tom lane

Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Tom Lane
Date:
Josef Machytka <josef.machytka@gmail.com> writes:
> In the case which went wrong, I used postgres_fdw to compare data in the
> local and remote databases using "select all from remote except select all
> from local".

I think my theory B probably applies then: without use_remote_estimate,
the planner would not guess that the remote table is huge, and that could
well allow it to pick a hashed EXCEPT implementation --- which will then
try to collect the entire remote table into an in-memory hash table.
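
One way to check whether that is happening is to inspect the plan before
running the full query (placeholder table names again):

    EXPLAIN (VERBOSE)
    SELECT * FROM remote_table
    EXCEPT
    SELECT * FROM local_table;
    -- a "HashSetOp Except" node collects its whole input into an in-memory
    -- hash table; a sort-based "SetOp Except" can spill its sorts to disk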

I don't know whether mysql_fdw has a comparable switch, but in any case
if it wasn't turned on then the same failure mode could apply to that FDW.

I wonder if we should rethink the default setting of use_remote_estimate
being "off".  Particularly as we keep on allowing postgres_fdw to consider
more different plan types, it seems like the potential for stupid plan
choices is getting worse all the time.
        regards, tom lane



Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Jeff Janes
Date:
On Fri, Oct 6, 2017 at 10:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josef Machytka <josef.machytka@gmail.com> writes:
> > In the case which went wrong, I used postgres_fdw to compare data in the
> > local and remote databases using "select all from remote except select all
> > from local".
>
> I think my theory B probably applies then: without use_remote_estimate,
> the planner would not guess that the remote table is huge, and that could
> well allow it to pick a hashed EXCEPT implementation --- which will then
> try to collect the entire remote table into an in-memory hash table.

Unfortunately, use_remote_estimate won't help in this particular case.  While it gets the estimate of the number of rows from the remote side, HashSetOp doesn't care about the estimated number of rows, only the estimated number of distinct rows.  And use_remote_estimate doesn't provide that, so the planner falls back on assuming (I think) 200 distinct rows.

What could help is running 'analyze' locally against the foreign table.  That can cause other problems, though.
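
For example (placeholder name; in postgres_fdw this fetches rows from the
remote table to build a local sample, so on a huge table it can take a long
time, and the statistics are not refreshed automatically afterwards):

    -- build local planner statistics for the foreign table
    ANALYZE remote_table;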
 
Cheers,

Jeff

Re: [BUGS] BUG #14845: postgres_fdw and mysql_fdw can cause crash of instance

From:
Josef Machytka
Date:


"What could help is running 'analyze' locally against the foreign table.  That can cause other problems, though."

Yes, this really helped - ANALYZE on the foreign table took ~45 minutes, but after that I was able to get results and memory was OK too. Only several hundred MB were used...