Re: [GENERAL] Service outage: each postgres process use the exactamount of the configured work_mem [(AUTO-RE)SOLVED...]

Поиск
Список
Период
Сортировка
От Moreno Andreo
Тема Re: [GENERAL] Service outage: each postgres process use the exactamount of the configured work_mem [(AUTO-RE)SOLVED...]
Дата
Msg-id 186b6b6a-2a2d-3241-f70d-09e59712750a@evolu-s.it
обсуждение исходный текст
Ответ на Re: [GENERAL] Service outage: each postgres process use the exactamount of the configured work_mem  (Igor Neyman <ineyman@perceptron.com>)
Список pgsql-general
Il 14/04/2017 21:15, Igor Neyman ha scritto:

 

From: Moreno Andreo [mailto:moreno.andreo@evolu-s.it]
Sent: Friday, April 14, 2017 3:11 PM
To: Igor Neyman <ineyman@perceptron.com>; Melvin Davidson <melvin6925@gmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Service outage: each postgres process use the exact amount of the configured work_mem

 

Attention: This email was sent from someone outside of Perceptron. Always exercise caution when opening attachments or clicking links from unknown senders or when receiving unexpected emails.

 

Il 14/04/2017 21:06, Igor Neyman ha scritto:

 

From: Moreno Andreo [mailto:moreno.andreo@evolu-s.it]
Sent: Friday, April 14, 2017 3:00 PM
To: Igor Neyman <ineyman@perceptron.com>; Melvin Davidson <melvin6925@gmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Service outage: each postgres process use the exact amount of the configured work_mem

 

Attention: This email was sent from someone outside of Perceptron. Always exercise caution when opening attachments or clicking links from unknown senders or when receiving unexpected emails.

 

Il 14/04/2017 20:40, Igor Neyman ha scritto:

 

E. How many users were connected when the problem occurred?

About 350 connections

Thanks

______________________________________________________________________________________

 

Probably that is your problem, if you don’t have connection pooler.

I’d recommend to start with PgBouncer, very light-weight and easy to configure.

 

Regards,

Igor Neyman

 

I'm building such a system, but I don't think that's the problem because now, with 30 active connections, it still does not work (actually it's extremely slow).

Both servers are far from being under heavy load (so they were before...)

Having total of 350 connections and only 30 of them being active, is the exact case where connection pooler helps.

Igor Neyman

 

Sorry, I misexplained.
4 hours ago we were in rush hour and the server was running with its usual load (graphs do not show spikes in server load), and we had about 350 connections.
To troubleshoot the whole thing, I needed to restart the servers, in the meantime customers disconnected.
Now I have 30 of them (most of them alive), select * from pg_stat_activity shows 30 connections.
That's why I don't think it's a connection # problem.
____________________________________________________________________________________________________________________

 

Right.

But you had 300 connections when the problem occurred, and leads me to believe that pgBouncer is needed.

 

Igor Neyman

 

For whom may be in interest.....
After creating another VM for the Windows instance, using a snapshot from past night (and choosing a good client to test... our customer's has some other problems, I think........) everything worked.
No lags, no delays.
Fast as usual.
So we switched back to the original server.
Everything worked.

This is the WORST situation.
Honestly, I didn't understand which server and module was the problem, neither did figure out the cause. I found *nothing* in logs.
It can happen again.
And I'll be in the same situation.

I need to hurry up with the new structure (cluster primary/standby with pgpool II for pooling and load balancing and barman for backup), that is growing in the test lab.

Being on Google Cloud Platform, this makes me suspect of some "strange" problem (next Tuesday I'll ask them) on their structure.... but on the other side I think they'd be alerting us on some issues they're experiencing.

We designed this application 6 years ago, in a hurry (long story short: we needed to build a product to be *absolutely* sold before fall 2012, and we started in July 2011) and went a bit superficial with some modules of the program. One of them was database design. I was using Postgresql since 2005, but I never used it in scenarios that go over having 2 or 3 databases that collect logs and do some reporting, so I was lacking experience in *battle* scenarios like this, so I went with a design in which every customer has its own role, which makes him authenticate, and with this role he can connect to his database and to some of his colleagues' (imagine a group of 10... if everyone connects to everyone, it's 100 connections!). Now we're updating this, so only one role is used for everyone, and authentication is made in other ways, and I think that in this situation pgbouncer will help a little, but with the new design a pooler will lend us a great hand.

Thanks for your patience
Happy Easter
Moreno.

В списке pgsql-general по дате отправления:

Предыдущее
От: Osahon Oduware
Дата:
Сообщение: Re: [GENERAL] Error During PostGIS Build From Source on Linux
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: [GENERAL] Error During PostGIS Build From Source on Linux