Discussion: max_stack_depth problem though query is substantially smaller


max_stack_depth problem though query is substantially smaller

From:
"Charles Clavadetscher"
Date:
Hello

We have a process in R which reads statistical raw data from a table and computes time series values from them.
The time series values are stored in an hstore field, with the date as the key and the computed value as the value.
The process writes the computed value into a temporary table and locks the corresponding row(s) of the target table for
update.
Finally the row(s) are updated if they already exist or inserted if they do not exist.

This process runs nightly and processes many GB of data without generating any error. Normally these are low-frequency
time series (monthly or quarterly data).

Now we have a daily time series with about 46'000 key/value pairs. In the near future this will increase to 86'000,
including data from previous years.

When R processes the daily time series we get a stack size exceeded error, followed by the hint to increase
max_stack_depth. My colleague, who wrote the R function and performed the test, read the docs and, in accordance with
ulimit -s, increased max_stack_depth to 7MB.

Here are the details of OS and PG:
OS: osx 10.10.5
PG: 9.3.3

ulimit -s = 8192

The resize did work, as "show max_stack_depth;" confirmed. After this change, however, the query fails with the same
error as before, just with the new limit of 7 MB.
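
For completeness, this is roughly how the limit was checked and raised for a session (a sketch; SET max_stack_depth
requires superuser rights, and on 9.3 a permanent change goes into postgresql.conf plus a reload):

-- check the active limit
SHOW max_stack_depth;

-- raise it for the current session; it must stay safely below the OS limit
-- reported by "ulimit -s" (8192 kB here), hence 7 MB
SET max_stack_depth = '7MB';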

The query itself was written to a file in order to verify its size. The size turned out to be 1.7MB, i.e. even below the
conservative default limit of 2 MB, let alone the 7 MB we configured.

Apart from the fact that we could consider using a different strategy to store time series, we would like to understand
what is causing the problem.

Here is the query as it looks in the R code:
sql_query_data <- sprintf("BEGIN;
                           CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore, ts_frequency integer) ON COMMIT DROP;
                           INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
                           LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;

                           UPDATE %s.timeseries_main
                           SET ts_data = ts_updates.ts_data
                           FROM ts_updates
                           WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;

                           INSERT INTO %s.timeseries_main
                           SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
                           FROM ts_updates
                           LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
                           WHERE %s.timeseries_main.ts_key IS NULL;
                           COMMIT;",
                           values, schema, schema, schema, schema, schema, schema, schema)

And here is how it looks at the end:

INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES ('somekey',hstore('1900-01-01','-0.395131869823009')||
                                                                        hstore('1900-01-02','-0.595131869823009')||
                                                                        hstore('1900-01-03','-0.395131869823009')||
                                                                        [...] 46'000 times
                                                                        hstore('1900-01-04','-0.395131869823009'),NULL);

The computer where my colleague made the test is local. There are no other concurrent users.
We thank you for hints on what the problem may be and/or how to investigate it further.

Please reply to all, as my colleague is not yet subscribed to the mailing list.

Regards,
Charles and Matthias




Re: max_stack_depth problem though query is substantially smaller

From:
Albe Laurenz
Date:
Charles Clavadetscher wrote:
> We have a process in R which reads statistical raw data from a table and computes time series values
> from them.
> The time series values are in a hstore field with the date as the key and the value as the value.
> The process writes the computed value into a temporary table and locks the corresponding row(s) of the
> target table for update.
> Finally the row(s) are updated if they already exist or inserted if they do not exist.
> 
> This process runs nightly and processes many GB of data without generating any error. Normally these
> are low frequency time series
> (monthly or quarterly data).
> 
> Now we have a daily time serie with about 46'000 key/value pairs. In near future this will increase to
> 86'000 including data from
> previous years.
> 
> When R processes the daily time serie we get a stack size exceeded error, followed by the hint to
> increase the max_stack_depth. My
> colleague, who wrote the R function and performed the test read the docs and increased, according to
> ulimit -s the max_stack_depth
> to 7MB.
> 
> Here the details of OS and PG:
> OS: osx 10.10.5
> PG: 9.3.3
> 
> ulimit -s = 8192
> 
> The resize did work as *show max_stack_depth;" has shown. After this change, however, the query states
> the same error as before,
> just with the new limit of 7 MB.
> 
> The query itself was written to a file in order to verify its size. The size turned out to be 1.7MB,
> i.e. even below the
> conservative default limit of 2 MB, yet alone substantially below 7 MB.
> 
> Apart from the fact that we could consider using a different strategy to store time series, we would
> like to understand what is
> causing the problem.
> 
> Here the query as it looks like in the R code:
> sql_query_data <- sprintf("BEGIN;
>                            CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore,
> ts_frequency integer) ON COMMIT DROP;
>                            INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
>                            LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;
> 
>                            UPDATE %s.timeseries_main
>                            SET ts_data = ts_updates.ts_data
>                            FROM ts_updates
>                            WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;
> 
>                            INSERT INTO %s.timeseries_main
>                            SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
>                            FROM ts_updates
>                            LEFT OUTER JOIN %s.timeseries_main ON (%s.timeseries_main.ts_key =
> ts_updates.ts_key)
>                            WHERE %s.timeseries_main.ts_key IS NULL;
>                            COMMIT;",
>                            values, schema, schema, schema, schema, schema, schema, schema)
> 
> And here is how it looks like at the end:
> 
> INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES ('somekey',hstore('1900-01-01','-
> 0.395131869823009')||
>                                                                         hstore('1900-01-02','-
> 0.595131869823009')||
>                                                                         hstore('1900-01-03','-
> 0.395131869823009')||
>                                                                         [...] 46'000 times
>                                                                         hstore('1900-01-04','-
> 0.395131869823009'),NULL);

I don't understand which query causes the error.
The queries you quote above are smaller than 1.7 MB...
You could log errors to find out which statement causes the error.

One idea would be to attach a debugger to the backend, set a breakpoint in check_stack_depth()
where the error is thrown, and take a stack trace when you hit the error.
Maybe that can show you what is going on.

Yours,
Laurenz Albe

Re: max_stack_depth problem though query is substantially smaller

From:
Tom Lane
Date:
"Charles Clavadetscher" <clavadetscher@swisspug.org> writes:
> When R processes the daily time serie we get a stack size exceeded
> error, followed by the hint to increase the max_stack_depth.

Postgres doesn't generally allocate large values on the stack, and I doubt
that R does either.  Almost certainly, what is causing this is not data
size per se but unreasonable call nesting depth in your R code.  You may
have a function that's actually in infinite recursion, or maybe it's
recursing to a depth governed by the number of data elements.  If so,
consider revising it into iteration with an explicitly-represented state
stack.

            regards, tom lane


Re: max_stack_depth problem though query is substantially smaller

From:
"Bannert Matthias"
Date:
Thanks for your reply. I do think it is rather a postgres than an R issue, here's why:

a) R simply puts an SQL string together. What Charles had posted was an excerpt of that string.
Basically we have 1.7 MB of that string. Everything else is equal just the hstore contains 40K key value pairs.

b) The error message clearly mentions max_stack_depth which is a postgres parameter.

c) If I just take that SQL string (only the first part of it, i.e. the create temp table and insert into part, w/o all
the update and join gibberish), put it into a .sql file and simply run it through a psql client like this:
\i myquery.sql

I get exactly the same error message (without any R involved at any stage)

psql:query.sql:3: ERROR:  stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack
depth limit is adequate.

d) I have run into quite a few R stack errors before and they look different... (C_STACK_SIZE)

conclusion:
We are running a simple insert. Nothing special except for the fact that the hstore has 40K key/value pairs. Could it be
that the indexing of that hstore gets kinda large and thus a query string of only 1.7 MB exceeds the stack?








Re: max_stack_depth problem though query is substantially smaller

From:
Tom Lane
Date:
"Bannert  Matthias" <bannert@kof.ethz.ch> writes:
> Thanks for your reply. I do think it is rather a postgres than an R issue, here's why:
> a) R simply puts an SQL string together. What Charles had posted was an excerpt of that string.
> Basically we have 1.7 MB of that string. Everything else is equal just the hstore contains 40K key value pairs.

Well, as a test I ran a query that included an hstore literal with 4
million key/value pairs (a bit shy of 70MB of query text).  I didn't see
any misbehavior on a machine with 2MB max_stack_depth.  So there's
something else going on in your situation.
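
For illustration, a comparably large single hstore value can be built server-side without any operator nesting at all,
e.g. with the two-array form of the hstore() constructor (just a sketch, not the exact test I ran; it produces one
value with roughly 42,000 daily pairs):

SELECT hstore(
           array_agg(d::date::text),                         -- keys: the dates
           array_agg(trunc(random()::numeric, 15)::text)     -- values: dummy numbers
       ) AS ts_data
FROM generate_series('1900-01-01'::timestamp,
                     '2016-01-01'::timestamp,
                     interval '1 day') AS d;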

I concur with the suggestion to try to get a stack backtrace from the
point of the error.  Setting a breakpoint at errfinish() is usually
an effective strategy when you know that the query will provoke a SQL
error report.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

            regards, tom lane


Re: max_stack_depth problem though query is substantially smaller

From:
Tom Lane
Date:
"Bannert  Matthias" <bannert@kof.ethz.ch> writes:
> [ very deep stack of parser transformExprRecurse calls ]

> #20137 0x00007fe7fb80ab8c in pg_analyze_and_rewrite (parsetree=parsetree@entry=0x7fe7fffdb2a0,
> query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES
> ('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
> paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at
> /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640

The SQL fragment we can see here suggests that your "40K entry hstore" is
getting built up by stringing together 40K hstore concatenation operators.
Don't do that.  Even without the parser stack depth issue, it's uselessly
inefficient.  I presume you're generating this statement mechanically,
not by hand, so you could equally well have the app emit

'1900-01-01 => -0.395131869823009, 1900-01-02 => -0.395131869823009, ...'::hstore

which would look like a single hstore literal to the parser, and be
processed much more quickly.
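
Concretely, the generated statement would then look something like this (same sample values as upthread, shortened to
three pairs):

INSERT INTO ts_updates(ts_key, ts_data, ts_frequency)
VALUES ('somekey',
        '1900-01-01 => -0.395131869823009,
         1900-01-02 => -0.595131869823009,
         1900-01-03 => -0.395131869823009'::hstore,
        NULL);

However many pairs that literal contains, the parser sees it as one constant, so the expression nesting stays flat.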

If you insist on emitting SQL statements that have operators nested
to such depths, then yes you'll need to increase max_stack_depth to
whatever it takes to allow it.

            regards, tom lane


Re: max_stack_depth problem though query is substantially smaller

From:
Tom Lane
Date:
"Bannert  Matthias" <bannert@kof.ethz.ch> writes:
> Fwiw, I was not stubbornly insisting on nesting operators. Actually I switched from "=>" to the hstore function cause
> a note in the manual said it was deprecated (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow I must
haveunderstand that note the wrong way.  
> What's your take on that operator being deprecated?

That's the old SQL operator (which is not even there anymore) that's
equivalent to the hstore(text,text) constructor function, ie
"text => text returning hstore".  It's quite a different concept
from the => notation inside an hstore literal.  That is:
    'foo'::text => 'bar'::text
is not like
    '"foo" => "bar"'::hstore
even though they have the same end result.
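
A small illustration (both forms yield the same one-pair hstore):

SELECT hstore('foo', 'bar');                               -- constructor function: "foo"=>"bar"
SELECT '"foo" => "bar"'::hstore;                           -- literal: the => is inside the quoted value
SELECT hstore('foo', 'bar') = '"foo" => "bar"'::hstore;    -- t

With thousands of pairs, only the literal form avoids one function or operator call per pair.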

            regards, tom lane


Re: max_stack_depth problem though query is substantially smaller

From:
"Bannert Matthias"
Date:
I guess you are right. I have narrowed the query down
to a simple CREATE TABLE, followed by an INSERT: one text field, one hstore field and an integer field.
No temporary table, no BEGIN etc. One record, yet the hstore has 40K kvp. No R involved.
And I still end up with the same error.
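
A scaled-down sketch of that statement (table name made up; the real insert has ~40'000 concatenated hstore() terms
instead of three):

CREATE TABLE ts_test(ts_key varchar, ts_data hstore, ts_frequency integer);

INSERT INTO ts_test(ts_key, ts_data, ts_frequency)
VALUES ('somekey',
        hstore('1900-01-01','-0.395131869823009') ||
        hstore('1900-01-02','-0.595131869823009') ||
        hstore('1900-01-03','-0.395131869823009'),
        NULL);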

Thanks for the pointer on generating a stack trace of the backend. I'll try to set that up and report what I find.






Re: max_stack_depth problem though query is substantially smaller

From:
"Bannert Matthias"
Date:
I am not sure if I did it the right way, cause it's my first time stack tracing, but I did get some more information.
Here's what I did.

1. Switched to ubuntu server 14.04 w/ postgres 9.3 server (in a VirtualBox VM)
2. set up postgres and made sure I was able to reproduce the same error.
3. followed the link to get gdb etc. going.
4. started a psql session and determined the pid using SELECT pg_backend_pid();
5. attached gdb to the backend: sudo gdb -p 1521
6. in gdb:
(gdb) set pagination off
(gdb) set logging file debuglog.txt
(gdb) set logging on
(gdb) break errfinish
(gdb) cont
7. in psql: \i query.sql
query.sql contains a create TABLE statement and the 1.7 MB INSERT with the 40K pairs hstore.

8. in gdb: ctrl + c
bt
detach
quit

Finally, that logfile has almost 4 MB, so I am not posting it here.
Yet, here are some lines that might be helpful... In case there is anything in particular I could look for, please let
me know...

Breakpoint 1, errfinish (dummy=0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/utils/error/elog.c:406
406    /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/utils/error/elog.c: No such file or
directory.
#0  errfinish (dummy=0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/utils/error/elog.c:406
#1  0x00007fe7fb809c67 in check_stack_depth () at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:3115
#2  0x00007fe7fb6b9177 in transformExprRecurse (pstate=pstate@entry=0x7fe7fffdb340, expr=expr@entry=0x7fe7ff8fc8d0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/parse_expr.c:132
#3  0x00007fe7fb6ba314 in transformAExprOp (a=0x7fe7ff8fcb88, pstate=0x7fe7fffdb340) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/parse_expr.c:907
#4  transformExprRecurse (pstate=pstate@entry=0x7fe7fffdb340, expr=expr@entry=0x7fe7ff8fcb88) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/parse_expr.c:223
#5  0x00007fe7fb6ba314 in transformAExprOp (a=0x7fe7ff8fce58, pstate=0x7fe7fffdb340) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/parse_expr.c:907 

What follows are tons of similar lines; only towards the end does it get more specific, i.e. mentioning the hstore:

#20133 0x00007fe7fb698d74 in transformInsertStmt (stmt=0x7fe7fffdb2a0, pstate=0x7fe7fffdb340) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/analyze.c:701
#20134 transformStmt (pstate=pstate@entry=0x7fe7fffdb340, parseTree=0x7fe7fffdb2a0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/analyze.c:225
#20135 0x00007fe7fb699e43 in transformTopLevelStmt (pstate=pstate@entry=0x7fe7fffdb340, parseTree=<optimized out>,
parseTree@entry=0x7fe7fffdb2a0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/analyze.c:207
#20136 0x00007fe7fb699ee9 in parse_analyze (parseTree=0x7fe7fffdb2a0, sourceText=0x7fe7fdf606b0 "INSERT INTO
ts_updates(ts_key, ts_data, ts_frequency) VALUES
('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
paramTypes=0x0, numParams=0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/parser/analyze.c:100
#20137 0x00007fe7fb80ab8c in pg_analyze_and_rewrite (parsetree=parsetree@entry=0x7fe7fffdb2a0,
query_string=query_string@entry=0x7fe7fdf606b0 "INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES
('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...,
paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:640
#20138 0x00007fe7fb80cff3 in exec_simple_query (query_string=0x7fe7fdf606b0 "INSERT INTO ts_updates(ts_key, ts_data,
ts_frequency) VALUES
('some_id.sector_all.news_all_d',hstore('1900-01-01','-0.395131869823009')||hstore('1900-01-02','-0.395131869823009')||hstore('1"...)
at /build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:1001
#20139 PostgresMain (argc=<optimized out>, argv=argv@entry=0x7fe7fdd26090, dbname=0x7fe7fdd25f40 "postgres",
username=<optimized out>) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/tcop/postgres.c:4072
#20140 0x00007fe7fb5fe424 in BackendRun (port=0x7fe7fdd69900) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:4177
#20141 BackendStartup (port=0x7fe7fdd69900) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:3840
#20142 ServerLoop () at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:1691
#20143 0x00007fe7fb7c6361 in PostmasterMain (argc=5, argv=<optimized out>) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/postmaster/postmaster.c:1315
#20144 0x00007fe7fb5ff0a3 in main (argc=5, argv=0x7fe7fdd25190) at
/build/postgresql-9.3-G1RSAD/postgresql-9.3-9.3.11/build/../src/backend/main/main.c:227
Detaching from program: /usr/lib/postgresql/9.3/bin/postgres, process 1521


Is this of any help at all? I really wonder how you can get almost 70 MB done and I can't even get 2-5 MB done. Maybe
you can share a brief part of your 70 MB query?

regards, matt bannert





Re: max_stack_depth problem though query is substantially smaller

From:
"Bannert Matthias"
Date:
Heureka! Thanks so much for your help, patience and the right hunch. Actually I am glad now that I ran into that stack
issue (and you), cause the entire thing is also much faster now.
I changed my app to emit strings like you suggested and it works, also with a smaller max_stack_depth.

Fwiw, I was not stubbornly insisting on nesting operators. Actually I switched from "=>" to the hstore function cause
a note in the manual said it was deprecated (http://www.postgresql.org/docs/9.0/static/hstore.html). Somehow I must
have understood that note the wrong way.
What's your take on that operator being deprecated?

regards, matt bannert


