Обсуждение: Possible reasons for copy_from to never terminate?

Поиск
Список
Период
Сортировка

Possible reasons for copy_from to never terminate?

От
Дата:

The line of code I’m executing is

 

cursor.copy_from(io.StringIO('\n'.join(list_of_strings)), 'tablename')

 

Every once in a while the command seems to run forever, causing my server to crash.

 

I'm running this on Python 3 inside a Flask app being served by Waitress on Heroku. The source of the data is a POST request and consists of tab-separated timestamps and numbers. There is nothing different happening on the client side when this error happens. The format of the data is the same with every POST.

 

This is the error I get from Heroku when I finally manually restart the dyno:

 

2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] process 23344 acquired ShareLock on transaction 720433 after 59127607.154 ms

Query: COPY tablename FROM stdin WITH DELIMITER AS '    ' NULL AS '\N'

2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] connection to client lost

2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] duration: 59127623.384 ms  statement: COPY tablename FROM stdin WITH DELIMITER AS '   ' NULL AS '\N'

2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] could not send data to client: Connection timed out 2015-04-24T12:18:39Z app[postgres.13451]: [BLACK] unexpected EOF on client connection with an open transaction

 

So the copy command was running for 59 million milliseconds or 16 and a half hours before I finally killed it by restarting the dyno.

 

Any suggestions on where I should look for the cause? Alternatively, is it possible to simply put a timeout on the copy?

 

-Daniel

 

Re: Possible reasons for copy_from to never terminate?

От
Adrian Klaver
Дата:
On 04/24/2015 07:06 AM, Daniel.Straight@Intralox.com wrote:
> The line of code I’m executing is
>
> cursor.copy_from(io.StringIO('\n'.join(list_of_strings)), 'tablename')
>
> Every once in a while the command seems to run forever, causing my
> server to crash.

So how long does it 'usually' run?

>
> I'm running this on Python 3 inside a Flask app being served by Waitress
> on Heroku. The source of the data is a POST request and consists of
> tab-separated timestamps and numbers. There is nothing different
> happening on the client side when this error happens. The format of the
> data is the same with every POST.

Define client.

Are you talking about the Web client making the POST or the Flask app as
client to Postgres?

>
> This is the error I get from Heroku when I finally manually restart the
> dyno:
>
> 2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] process 23344 acquired
> ShareLock on transaction 720433 after 59127607.154 ms
>
> Query: COPY tablename FROM stdin WITH DELIMITER AS '    ' NULL AS '\N'
>
> 2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] connection to client lost
>
> 2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] duration: 59127623.384
> ms  statement: COPY tablename FROM stdin WITH DELIMITER AS '   ' NULL AS
> '\N'
>
> 2015-04-24T12:18:39Z app[postgres.23344]: [BLACK] could not send data to
> client: Connection timed out 2015-04-24T12:18:39Z app[postgres.13451]:
> [BLACK] unexpected EOF on client connection with an open transaction

This would Postgres complaining about losing connection to presumably
the code you show in the Flask app. Though it would be nice to know
where exactly the copy_from is in the Flask flow of things.

>
> So the copy command was running for 59 million milliseconds or 16 and a
> half hours before I finally killed it by restarting the dyno.
>
> Any suggestions on where I should look for the cause? Alternatively, is
> it possible to simply put a timeout on the copy?

I would say a more complete code snippet would help. Something
encompassing how the POST data gets into the list_of_strings and then
into the copy_from()
>
> -Daniel
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Possible reasons for copy_from to never terminate?

От
Дата:
> So how long does it 'usually' run?

Usual is best measured in ms. The amount of data is on the order of 100-1000 lines of 10 or so columns.

> Define client.
> Are you talking about the Web client making the POST or the Flask app as client to Postgres?

I'm talking about the web client making the POST request.

> This would Postgres complaining about losing connection to presumably the code you show
> in the Flask app. Though it would be nice to know where exactly the copy_from is in the Flask flow of things.

Right, it's losing connection when I restart, but my concern is how long it was connected in the first place.

> I would say a more complete code snippet would help. Something
> encompassing how the POST data gets into the list_of_strings and then
> into the copy_from()

Sure thing. I have a Flask route which accepts POST requests with a UTF-8 plain text body.
The body is tab-separated data, timestamps and numbers. I read the data line by line using
a map over flask.request.stream, split on tabs, reformat timestamps with isoformat so
Postgres will take them, re-join with tabs and then send it all to copy_from. Here's the code.
convert_timestamps just takes known timestamp columns and replaces them with equivalent
isoformat strings. get_db returns an open psycopg2 connection.
All of this works without issue >99% of the time.

def handle_post():
  lines = list(map(
    lambda line_data: '\t'.join(convert_timestamps(line_data)),
    map(
      lambda line: line.decode('utf-8').strip().split('\t'),
      flask.request.stream)))
  db = get_db()
  cursor = db.cursor()
  cursor.copy_from(io.StringIO('\n'.join(lines)), 'tablename')
  db.commit()
  cursor.close()
  return len(lines)

So copy_from should be getting data which is entirely in memory. I don't see how it
could never finish reading the data, which is what seems to be happening.

The incoming data is being sent over a slow connection, so sometimes it takes a while
to send, making the total server response time over a second, but it seems like by the
time copy_from is hit, it should definitely have all the data, so the slow connection
doesn't seem like it should be an issue here.

-Daniel


Re: Possible reasons for copy_from to never terminate?

От
Adrian Klaver
Дата:
On 04/24/2015 07:57 AM, Daniel.Straight@Intralox.com wrote:
>> So how long does it 'usually' run?
>
> Usual is best measured in ms. The amount of data is on the order of 100-1000 lines of 10 or so columns.
>
>> Define client.
>> Are you talking about the Web client making the POST or the Flask app as client to Postgres?
>
> I'm talking about the web client making the POST request.
>
>> This would Postgres complaining about losing connection to presumably the code you show
>> in the Flask app. Though it would be nice to know where exactly the copy_from is in the Flask flow of things.
>
> Right, it's losing connection when I restart, but my concern is how long it was connected in the first place.
>
>> I would say a more complete code snippet would help. Something
>> encompassing how the POST data gets into the list_of_strings and then
>> into the copy_from()
>
> Sure thing. I have a Flask route which accepts POST requests with a UTF-8 plain text body.
> The body is tab-separated data, timestamps and numbers. I read the data line by line using
> a map over flask.request.stream, split on tabs, reformat timestamps with isoformat so
> Postgres will take them, re-join with tabs and then send it all to copy_from. Here's the code.
> convert_timestamps just takes known timestamp columns and replaces them with equivalent
> isoformat strings. get_db returns an open psycopg2 connection.
> All of this works without issue >99% of the time.
>
> def handle_post():
>    lines = list(map(
>      lambda line_data: '\t'.join(convert_timestamps(line_data)),
>      map(
>        lambda line: line.decode('utf-8').strip().split('\t'),
>        flask.request.stream)))
>    db = get_db()
>    cursor = db.cursor()
>    cursor.copy_from(io.StringIO('\n'.join(lines)), 'tablename')
>    db.commit()
>    cursor.close()
>    return len(lines)

Not seeing anywhere in above where connection is closed. I do not know
Flask well enough to know if that is handled for you or not?

>
> So copy_from should be getting data which is entirely in memory. I don't see how it
> could never finish reading the data, which is what seems to be happening.

Maybe it is, but is not closing correctly. When I do this sort of thing
I explicitly close() the memory file.

>
> The incoming data is being sent over a slow connection, so sometimes it takes a while
> to send, making the total server response time over a second, but it seems like by the
> time copy_from is hit, it should definitely have all the data, so the slow connection
> doesn't seem like it should be an issue here.
>
> -Daniel
>


--
Adrian Klaver
adrian.klaver@aklaver.com