Discussion: Proposal to add connection request Wait-time in PSQL client.
Hello,

I have observed the following situation in PG 9.3beta1. Multiple psql clients are connected to the server; some of them are running a transaction and some are idle.

When one of the backends is killed or crashes (using kill -9 <backend-pid>), the connection reset attempt from the active clients (that is, those that were running a transaction when the crash happened) fails, because they make the attempt immediately, while the server is still in its startup phase.

I went through this and found the following:

1. When a backend crashes, the server goes into recovery mode, and it takes a little time before it returns to the normal state and accepts connections.
2. A busy client (one that was running a transaction before the crash) immediately tries to reconnect to the server, which is still in the startup phase, so it gets a negative reply and fails to reconnect.

So I thought that, before sending the reconnect request, the client needs to wait for the server to reach a state in which it can accept connections. The wait should have some timeout.

I am not sure whether this is the correct way to modify the code, or whether it has any other impact.

I tried making the client wait before sending the reconnect request to the server. For that, I added some sleep time for the client in src/bin/psql/common.c (that is, it changes things only for psql clients).

Please check the attached patch for the modification.

Regards,
Amul Sul
On 05/17/2013 08:22 AM, amul sul wrote:
> Hello,
>
> I have observed the following same situation in PG 9.3beta1
> Multiple PSQL clients are connected to server, some of them running transaction and some of them are idle state.
>
> When one of the backend is killed or crashed (using kill -9 <backend-pid>).
> The connection reset attempt from the active clients (that is, which were running a transaction and crashed in between) fails, since they immediately make the attempt while the server is in startup phase.

It isn't clear to me why this needs to be tackled in psql or the other clients.

Usually one has retry and back-off code in whatever's using the client - shell script using psql, Python program with psycopg2, Java program with PgJDBC, etc - that manages reconnection and retries.

If server restarts were routine it might make more sense to teach the client code about this - but they should not be. Your first problem is "killed using kill -9". You should not need to do that, nor should you be experiencing backend crashes. If you are, investigate the underlying cause and fix that.

I'm not a big fan of the idea of psql having its own internal retry loop.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
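For illustration, the kind of retry and back-off code described above could look roughly like the following sketch. It is not taken from the thread or from any client library: the helper name connect_with_backoff, its parameters, and the default delays are all hypothetical, and the connect argument stands for whatever call the application uses to open a connection.

```python
import time


def connect_with_backoff(connect, attempts=5, first_delay=0.5, max_delay=8.0,
                         retry_on=(OSError,), sleep=time.sleep):
    """Call connect() until it succeeds, doubling the delay after each failure.

    All names and defaults here are illustrative assumptions.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(delay)
            # Exponential back-off, capped so the wait stays bounded.
            delay = min(delay * 2, max_delay)
```

With psycopg2, one would presumably pass something like `lambda: psycopg2.connect(dsn)` as connect and `retry_on=(psycopg2.OperationalError,)`, since a server that is still starting up is reported to the client as a connection error.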
> It isn't clear to me why this needs to be tackled in psql or the other
> clients.

This case applies only to a psql client that has a running transaction which has not yet finished when some other backend suddenly crashes or is killed, and the server then restarts in recovery mode. The client immediately sends a connection reset request, but it may fail because the server has not yet come out of recovery and is not ready to accept connections.

So the idea is that, before reporting a connection failure, the client should keep retrying the connection reset for a bounded wait time.

> Usually one has retry and back-off code in whatever's using the client -
> shell script using psql, Python program with psycopg2, Java program with
> PgJDBC, etc - that manages reconnection and retries.

Yes, you are correct. But even in those scripts one needs to add a time interval and resend the request again and again. Instead of this, can we add a loop in the client code, so that it keeps retrying the connection request? That way, the client terminal won't hang up by throwing *The connection to the server was lost. Attempting reset: Failed. !*

Regards,
Amul Sul
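As a rough sketch of the bounded wait proposed here, the loop below keeps retrying a reset until it succeeds or a timeout expires. The attached patch is not reproduced in this thread, so this is not its code: reset_within, try_reset, and the timeout and interval defaults are hypothetical names and values, and try_reset stands for a single reset attempt that returns True on success.

```python
import time


def reset_within(try_reset, timeout=10.0, interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Retry try_reset() until it returns True or `timeout` seconds pass.

    Returns True if a reset succeeded, False if the deadline was reached.
    All names and defaults are illustrative assumptions.
    """
    deadline = clock() + timeout
    while True:
        if try_reset():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Deadline reached: report failure as the client does today.
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

The difference from the current behaviour is only that the failure message would appear after the timeout rather than after the first failed attempt.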
On 05/19/2013 11:41 AM, amul sul wrote:
> in a way, we client terminal wont hangup by throwing *The connection to the server was lost. Attempting reset: Failed. !*

The thing is that this just should not be a routine occurrence. It's a minor irritation to me when debugging sometimes, but it's not something that you should be encountering in production. If you are, changing psql's reconnect behaviour is not necessarily the best solution. I'd try to work out why you're having so many unrecoverable disconnects and restarts and fix that instead.

Making psql smarter about reconnecting to solve server crashes feels a bit like carrying a bunch of spare fuel cans in the back of a car for when you run out of fuel. It's not necessarily a terrible idea, but you should really never need it either.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
>> in a way, we client terminal wont hangup by throwing *The connection to
>> the server was lost. Attempting reset: Failed. !*
>
> The thing is that this just should not be a routine occurrence. It's a
> minor irritation to me when debugging sometimes, but it's not something
> that you should be encountering in production. If you are, changing
> psql's reconnect behaviour is not necessarily the best solution. I'd try
> to work out why you're having so many unrecoverable disconnects and
> restarts and fix that instead.

Yes, I think so. This is a very rare case and won't have any effect in production. Initially, I thought there would be no harm in the psql client waiting a few seconds until the server has recovered properly and is ready to accept connections.

Anyway, I will follow up on your suggestion. Thank you for sharing your concerns and explaining what is actually needed.

Regards,
Amul Sul