Обсуждение: High-CPU consumption on information_schema (only) query
Hi,
An SQL (with only information_schema related JOINS) when triggered, runs with max CPU (and never ends - killed after 2 days).
- It runs similarly (very slow) on a replicated server that acts as a read-only slave.
- Top shows only postgres as hitting max CPU (nothing else). When query killed, CPU near 0%.
- When the DB is restored on a separate test server (with the exact postgresql.conf) the same query works fine.
- There is no concurrent usage on the replicated / test server (although the primary is a Production server and has concurrent users).
Questions:
- If this was a postgres bug or a configuration issue, query on the restored DB should have been slow too. Is there something very basic I am missing here?
If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant here. I amn't looking for a specific solution but what else should I be looking for here?
p.s.: All postgres servers are running the v9.3.10
-
robins
--
-
robins
<p dir="ltr"><p dir="ltr">On 8 Sep. 2016 7:38 am, "Robins Tharakan" <<a href="mailto:tharakan@gmail.com">tharakan@gmail.com</a>>wrote:<br /> ><br /> > Hi,<br /> ><br /> > An SQL(with only information_schema related JOINS) when triggered, runs with max CPU (and never ends - killed after 2 days).<br/> > - It runs similarly (very slow) on a replicated server that acts as a read-only slave.<br /> > - Topshows only postgres as hitting max CPU (nothing else). When query killed, CPU near 0%.<br /> > - When the DB is restoredon a separate test server (with the exact postgresql.conf) the same query works fine.<br /> > - There is no concurrentusage on the replicated / test server (although the primary is a Production server and has concurrent users).<br/> ><br /> > Questions:<br /> > - If this was a postgres bug or a configuration issue, query on the restoredDB should have been slow too. Is there something very basic I am missing here?<br /> ><br /> > If someone asksfor I could provide SQL + EXPLAIN, but it feels irrelevant here. I amn't looking for a specific solution but what elseshould I be looking for here? <p dir="ltr">Get a series of stack traces.<p dir="ltr">Perf with stack output would begood too.<p dir="ltr">You need debug info for both.
<p dir="ltr">On Wed, Sep 7, 2016 at 4:37 PM, Robins Tharakan <<a href="mailto:tharakan@gmail.com">tharakan@gmail.com</a>>wrote:<br /><blockquote><p dir="ltr">><br /></blockquote><pdir="ltr">> Hi,<br /> ><br /> > An SQL (with only information_schema related JOINS) when triggered,runs with max CPU (and never ends - killed after 2 days).<br /> > - It runs similarly (very slow) on a replicatedserver that acts as a read-only slave.<br /> > - Top shows only postgres as hitting max CPU (nothing else).When query killed, CPU near 0%.<br /> > - When the DB is restored on a separate test server (with the exact postgresql.conf)the same query works fine.<br /> > - There is no concurrent usage on the replicated / test server (althoughthe primary is a Production server and has concurrent users).<br /> ><br /> > Questions:<br />> - If this was a postgres bug or a configuration issue, query on the restored DB should have been slow too. Is theresomething very basic I am missing here?<br /> ><br /> > If someone asks for I could provide SQL + EXPLAIN, butit feels irrelevant here. I amn't looking for a specific solution but what else should I be looking for here? <br /><pdir="ltr">strace -ttt -T -y the process to see what system calls it is making. If it is not doing many systme calls,or they are uninformative, then attach the gdb debugger to it and periodically interrupt the process (ctrl c) and takea back trace (bt), then restart it (c) and repeat. If all the stack traces look similar, you will know where the timeis going. <br /> <br /> Cheers, <p dir="ltr">Jeff<br />
On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote: > If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant > here. Why is that? information_schema are normal sql queries, and some of them far from trivial. Andres
On Fri, 9 Sep 2016 at 09:39 Andres Freund <andres@anarazel.de> wrote:
On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote:Hi Andres,
> If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
> here.
Why is that? information_schema are normal sql queries, and some of them
far from trivial.
Andres
I completely agree. With 'irrelevant' I was only trying to imply that irrespective of the complexity of the query, a replicated box was seeing similar slowness whereas a Restored DB wasn't. It felt that the SQL itself isn't to blame here...
In effect, I was trying to ask if I am forgetting / missing something very obvious / important that could cause such an observation.
As others recommended, I am unable to have direct access to the production (master / slave) instances and so GDB / stack trace options are out of bounds at this time. I'll revert if I am able to do that.
-
thanks
robins
--
-
robins
Robins Tharakan <tharakan@gmail.com> writes: > I completely agree. With 'irrelevant' I was only trying to imply that > irrespective of the complexity of the query, a replicated box was seeing > similar slowness whereas a Restored DB wasn't. It felt that the SQL itself > isn't to blame here... Without having at least compared EXPLAIN outputs from the two boxes, you have no business jumping to that conclusion. If EXPLAIN does show different plans, my first instinct would be to wonder whether the pg_stats data is equally up-to-date on both boxes. regards, tom lane
Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.
If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.
regards, tom lane
Thanks. EXPLAIN plans were different but (don't have them now and) didn't know system catalogs were so severely affected by outdated statistics as well (which is why I was looking for any other reasons I might be missing). I reckon an ANALYSE; should solve this? ... Would get back if I have something else to offer.
-
robins
--
-
robins