Обсуждение: How to cancel a query if SIGINT does not work?

Поиск
Список
Период
Сортировка

How to cancel a query if SIGINT does not work?

От
Adam Witney
Дата:
Hi,

I have a query that appears to have hung somehow. I have tried sending a
SIGINT but this does not cancel it... What's the next step without taking
down the whole server?

Thanks

Adam


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Re: How to cancel a query if SIGINT does not work?

От
Tom Lane
Дата:
Adam Witney <awitney@sgul.ac.uk> writes:
> I have a query that appears to have hung somehow. I have tried sending a
> SIGINT but this does not cancel it... What's the next step without taking
> down the whole server?

In theory that should always work.  What PG version is this?  Can you
attach to the stuck backend with gdb and get a stack trace?

            regards, tom lane

Re: How to cancel a query if SIGINT does not work?

От
Adam Witney
Дата:
On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Adam Witney <awitney@sgul.ac.uk> writes:
>> I have a query that appears to have hung somehow. I have tried sending a
>> SIGINT but this does not cancel it... What's the next step without taking
>> down the whole server?
>
> In theory that should always work.  What PG version is this?  Can you
> attach to the stuck backend with gdb and get a stack trace?

select version();
                            version
---------------------------------------------------------------
 PostgreSQL 7.4.8 on i686-pc-linux-gnu, compiled by GCC 2.95.4

Am not particularly familiar with gdb, but if you tell me the steps I will
do it and report back

Thanks for your help

Adam


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Re: How to cancel a query if SIGINT does not work?

От
Tom Lane
Дата:
Adam Witney <awitney@sgul.ac.uk> writes:
> On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> In theory that should always work.  What PG version is this?  Can you
>> attach to the stuck backend with gdb and get a stack trace?

> Am not particularly familiar with gdb, but if you tell me the steps I will
> do it and report back

Think I've posted this before, but:

Determine PID of backend you are interested in, eg with ps

As postgres user, do
    $ gdb /path/to/postgres-executable PID-of-process
    gdb prints a bunch of junk
    gdb> bt
    ... copy and paste the output given here ...
    gdb> quit
    gdb: ok to detach? y
    $

If you get a pile of purely numeric output from bt, it's useless
(means you're running symbol-stripped executables).  I'm hoping
for at least some function names.

            regards, tom lane

Re: How to cancel a query if SIGINT does not work?

От
Adam Witney
Дата:
On 19/8/05 6:08 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Adam Witney <awitney@sgul.ac.uk> writes:
>> On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> In theory that should always work.  What PG version is this?  Can you
>>> attach to the stuck backend with gdb and get a stack trace?
>
>> Am not particularly familiar with gdb, but if you tell me the steps I will
>> do it and report back
>
> Think I've posted this before, but:
>
> Determine PID of backend you are interested in, eg with ps
>
> As postgres user, do
> $ gdb /path/to/postgres-executable PID-of-process
> gdb prints a bunch of junk
> gdb> bt
> ... copy and paste the output given here ...
> gdb> quit
> gdb: ok to detach? y
> $
>
> If you get a pile of purely numeric output from bt, it's useless
> (means you're running symbol-stripped executables).  I'm hoping
> for at least some function names.

(gdb) bt
#0  0x40193812 in recv () from /lib/libc.so.6
#1  0x081044d8 in secure_read ()
#2  0x081084c7 in pq_recvbuf ()
#3  0x0810853d in pq_getbyte ()
#4  0x080cb12b in CopyGetData ()
#5  0x080cb36e in CopyGetChar ()
#6  0x080cd9d7 in CopyReadLine ()
#7  0x080cce90 in CopyFrom ()
#8  0x080cbdea in DoCopy ()
#9  0x08157759 in ProcessUtility ()
#10 0x08156a1b in PortalRunUtility ()
#11 0x08156ae2 in PortalRunMulti ()
#12 0x0815667e in PortalRun ()
#13 0x08153843 in exec_simple_query ()
#14 0x08155ab9 in PostgresMain ()
#15 0x0813523d in BackendFork ()
#16 0x08134ad5 in BackendStartup ()
#17 0x08133439 in ServerLoop ()
#18 0x08132fa5 in PostmasterMain ()
#19 0x081097af in main ()
#20 0x400dc14f in __libc_start_main () from /lib/libc.so.6

Also, what's the best way to cancel the query now?

Thanks again for your help

adam


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Re: How to cancel a query if SIGINT does not work?

От
Martijn van Oosterhout
Дата:
On Fri, Aug 19, 2005 at 06:10:28PM +0100, Adam Witney wrote:
> > If you get a pile of purely numeric output from bt, it's useless
> > (means you're running symbol-stripped executables).  I'm hoping
> > for at least some function names.

[trace below]

That's not stuck, somebody has typed COPY FROM STDIN and not sent any
data (or not finished it). Should be able to Ctrl-C it though. Are you
sure you have the right backend?

Closing the frontend attached to it should do...

> (gdb) bt
> #0  0x40193812 in recv () from /lib/libc.so.6
> #1  0x081044d8 in secure_read ()
> #2  0x081084c7 in pq_recvbuf ()
> #3  0x0810853d in pq_getbyte ()
> #4  0x080cb12b in CopyGetData ()
> #5  0x080cb36e in CopyGetChar ()
> #6  0x080cd9d7 in CopyReadLine ()
> #7  0x080cce90 in CopyFrom ()
> #8  0x080cbdea in DoCopy ()
> #9  0x08157759 in ProcessUtility ()
> #10 0x08156a1b in PortalRunUtility ()
> #11 0x08156ae2 in PortalRunMulti ()
> #12 0x0815667e in PortalRun ()
> #13 0x08153843 in exec_simple_query ()
> #14 0x08155ab9 in PostgresMain ()
> #15 0x0813523d in BackendFork ()
> #16 0x08134ad5 in BackendStartup ()
> #17 0x08133439 in ServerLoop ()
> #18 0x08132fa5 in PostmasterMain ()
> #19 0x081097af in main ()
> #20 0x400dc14f in __libc_start_main () from /lib/libc.so.6
>
> Also, what's the best way to cancel the query now?

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: How to cancel a query if SIGINT does not work?

От
Tom Lane
Дата:
Adam Witney <awitney@sgul.ac.uk> writes:
> (gdb) bt
> #0  0x40193812 in recv () from /lib/libc.so.6
> #1  0x081044d8 in secure_read ()
> #2  0x081084c7 in pq_recvbuf ()
> #3  0x0810853d in pq_getbyte ()
> #4  0x080cb12b in CopyGetData ()
> #5  0x080cb36e in CopyGetChar ()
> #6  0x080cd9d7 in CopyReadLine ()
> #7  0x080cce90 in CopyFrom ()
> #8  0x080cbdea in DoCopy ()
> #9  0x08157759 in ProcessUtility ()
> #10 0x08156a1b in PortalRunUtility ()
> #11 0x08156ae2 in PortalRunMulti ()
> #12 0x0815667e in PortalRun ()
> #13 0x08153843 in exec_simple_query ()
> #14 0x08155ab9 in PostgresMain ()
> #15 0x0813523d in BackendFork ()
> #16 0x08134ad5 in BackendStartup ()
> #17 0x08133439 in ServerLoop ()
> #18 0x08132fa5 in PostmasterMain ()
> #19 0x081097af in main ()
> #20 0x400dc14f in __libc_start_main () from /lib/libc.so.6

OK, apparently it's waiting for data from the client during a COPY FROM.

Not sure if we should try to fix things so that a SIGINT will work
there --- seems like it'd be more likely to bollix the protocol state
than do anything very useful.  Something for the TODO list.

> Also, what's the best way to cancel the query now?

Kill the connected client.  The recv() should error out once the network
connection is dropped.

            regards, tom lane

Re: How to cancel a query if SIGINT does not work?

От
Adam Witney
Дата:
On 19/8/05 6:20 pm, "Martijn van Oosterhout" <kleptog@svana.org> wrote:

> On Fri, Aug 19, 2005 at 06:10:28PM +0100, Adam Witney wrote:
>>> If you get a pile of purely numeric output from bt, it's useless
>>> (means you're running symbol-stripped executables).  I'm hoping
>>> for at least some function names.
>
> [trace below]
>
> That's not stuck, somebody has typed COPY FROM STDIN and not sent any
> data (or not finished it). Should be able to Ctrl-C it though. Are you
> sure you have the right backend?
>
> Closing the frontend attached to it should do...

Ah.... Its a web application... I assumed that because the browser had been
closed that the query must have hung. But just restarted the apache server
and that cancelled it, so apache must have kept the query running despite
the user closing the browser.

Sorry I probably should have thought of that!

Thanks for your help Tom and Martijn

Adam


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Optimize a VIEW question

От
Bill Moseley
Дата:
I'm just starting out using VIEWs -- and mostly new to postgresql and
I'm trying to learn a little about reading the EXPLAIN ANALYZE output.

I have the following view setup and I'm wonder where to create
indexes, and mostly if I'm doing anything the incorrect or a very
expensive way.

    CREATE VIEW class_list
        (
            id, name, class_time, location, workshop, review_mode, workshop_group,
            location_name,
            address, city, state, zip,
            region, region_name
        )
    AS
        SELECT class.id, class.name, class.class_time, class.location, class.workshop,
               class.review_mode, class.workshop_group,
               location.name,
               address.id, address.city, address.state, address.zip,
               region.id, region.name

          FROM class, location, address, region

         WHERE class.location           = location.id
               AND location.address     = address.id
               AND location.region      = region.id;


I'm not clear about the Seq Scan below.  The region
table is quite small, so am I correct that is why the planner is doing
a seq scan on that table?

    \d region
                                 Table "public.region"
       Column   |  Type   |                       Modifiers
    ------------+---------+--------------------------------------------------------
     id         | integer | not null default nextval('public.region_id_seq'::text)
     active     | boolean | not null default true
     sort_order | integer | not null default 1
     name       | text    | not null
    Indexes:
        "region_pkey" primary key, btree (id)
        "region_name_key" unique, btree (name)


EXPLAIN ANALYZE select * from class_list where workshop = 28;
                                                                          QUERY PLAN
                                      

--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=51.78..93.07 rows=9 width=157) (actual time=1.306..1.468 rows=6 loops=1)
   ->  Hash Join  (cost=51.78..76.87 rows=8 width=129) (actual time=1.245..1.299 rows=6 loops=1)
         Hash Cond: ("outer".id = "inner".region)
         ->  Seq Scan on region  (cost=0.00..20.00 rows=1000 width=36) (actual time=0.016..0.027 rows=10 loops=1)
         ->  Hash  (cost=51.76..51.76 rows=8 width=97) (actual time=1.019..1.019 rows=0 loops=1)
               ->  Hash Join  (cost=26.68..51.76 rows=8 width=97) (actual time=0.201..1.007 rows=6 loops=1)
                     Hash Cond: ("outer".id = "inner"."location")
                     ->  Seq Scan on "location"  (cost=0.00..20.00 rows=1000 width=44) (actual time=0.014..0.694
rows=104loops=1) 
                     ->  Hash  (cost=26.66..26.66 rows=7 width=57) (actual time=0.150..0.150 rows=0 loops=1)
                           ->  Index Scan using class_workshop_index on "class"  (cost=0.00..26.66 rows=7 width=57)
(actualtime=0.057..0.137 rows=6 loops=1) 
                                 Index Cond: (workshop = 28)
   ->  Index Scan using address_pkey on address  (cost=0.00..2.01 rows=1 width=32) (actual time=0.013..0.015 rows=1
loops=6)
         Index Cond: ("outer".address = address.id)
 Total runtime: 1.853 ms
(14 rows)


By the way -- at one point I managed to hang postgresql (7.4.8-16 on
Debian Sid).  I have not been able to make it happen again, but it
seemed odd.

    (gdb) bt
    #0  0x081e51ee in tuplestore_gettuple ()
    #1  0x0810c7f0 in ExecMaterial ()
    #2  0x08102cb2 in ExecProcNode ()
    #3  0x0810d8d5 in ExecNestLoop ()
    #4  0x08102ceb in ExecProcNode ()
    #5  0x081093a4 in ExecAgg ()
    #6  0x08102c79 in ExecProcNode ()
    #7  0x08101ecc in ExecutorRun ()
    #8  0x0816f58b in PortalSetResultFormat ()
    #9  0x0816f8c7 in PortalRun ()
    #10 0x0816da9f in PostgresMain ()
    #11 0x08148b4e in ClosePostmasterPorts ()
    #12 0x0814a4e1 in PostmasterMain ()
    #13 0x0811c2e7 in main ()


--
Bill Moseley
moseley@hank.org