Обсуждение: Re: Question on Hot Standby in PostgreSQL

Поиск
Список
Период
Сортировка

Re: Question on Hot Standby in PostgreSQL

От
"Williams, Alex"
Дата:
Hi,

I have a question concerning using the Postgres Hot Standby feature. We
are currently using that feature to sync any changes from one master to
one slave. The slave role is as a reporting db server with queries 
constantly and concurrently running (some in ms, some in seconds, 
some in minutes,)

We have adjusted these two params to allow long queries on the hot standby:


max_standby_archive_delay = -1  # max delay before canceling queries
max_standby_streaming_delay = -1        # max delay before canceling
queries

And looking at an archived mail question similar to ours:

response:

It looks like WAL updates cannot be applied while queries are running.
I've tested that to be true. In our case, the reporting server has a 
good chance of not being synced for hours or possible longer because 
of all the queries that are running. Is it possible to run long running 
queries while having WAL updates applied to the slave?

I understand the concept of preventing WAL updates not being applied to
slaves while queries are running. However, I thought with the use of MVCC,
an active query on the slave (long running, 30 seconds+) can run reading
from row xid versions/snapshot, while the WAL update is being applied, so
subsequent queries will get the WAL updates when that WAL transaction is
committed. I haven't fully digested the MVCC model used in PostgreSQL yet
just my assumption -- that even if a table is dropped / truncated during a
WAL update, the current running query should still work as it's using a
version/snapshot of the table(s) it's querying?

Is there any way (even with a third party extension) we can sync slaves
from a master and have those updates from the master be applied to the
slave right away while letting queries of any execution time continue to
run till they complete on the standby/slave? If Hot Standby can't do that,
what would you recommend for this situation? Our scenario is that we are
hitting postgres with queries constantly and concurrently running (some in
ms, some in seconds, some in minutes,) leaving almost no time for a WAL
update to be applied.

Any help would be greatly appreciated.

Thanks,

Alex

Re: Question on Hot Standby in PostgreSQL

От
Scott Ribe
Дата:
On Nov 20, 2015, at 12:50 PM, Williams, Alex <awilliams@teamdrg.com> wrote:
>
> However, I thought with the use of MVCC,
> an active query on the slave (long running, 30 seconds+) can run reading
> from row xid versions/snapshot...

The problem is that the master could well make changes that would remove rows in that snapshot because the master does
notknow anything about the rows that need preserving for the replica's query. 

That said, IIRC there may be something in 9.4 or 9.5 which provides for the replica to send that info to the server in
orderto coordinate things, but I haven't used that and so cannot offer more. 

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
https://www.linkedin.com/in/scottribe/
(303) 722-0567 voice







Re: Question on Hot Standby in PostgreSQL

От
Yaser Raja
Дата:


On Mon, Nov 23, 2015 at 3:24 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
On Nov 20, 2015, at 12:50 PM, Williams, Alex <awilliams@teamdrg.com> wrote:
>
> However, I thought with the use of MVCC,
> an active query on the slave (long running, 30 seconds+) can run reading
> from row xid versions/snapshot...

The problem is that the master could well make changes that would remove rows in that snapshot because the master does not know anything about the rows that need preserving for the replica's query.


hot_standby_feedback addresses this problem. 

Here is the detail from the PG docs:

"Specifies whether or not a hot standby will send feedback to the primary or upstream standby about queries currently executing on the standby. This parameter can be used to eliminate query cancels caused by cleanup records, but can cause database bloat on the primary for some workloads. Feedback messages will not be sent more frequently than once per wal_receiver_status_interval. The default value is off. This parameter can only be set in the postgresql.conf file or on the server command line."

http://www.postgresql.org/docs/9.3/static/runtime-config-replication.html


Note that this will only avoid the conflicts caused by cleanup records (which normally is the majority of conflicts). 


 

That said, IIRC there may be something in 9.4 or 9.5 which provides for the replica to send that info to the server in order to coordinate things, but I haven't used that and so cannot offer more.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
https://www.linkedin.com/in/scottribe/
(303) 722-0567 voice







--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Question on Hot Standby in PostgreSQL

От
"Williams, Alex"
Дата:
Thank you both, I really appreciate your responses, and hopefully this may help others who may run into this.

I noticed on the link to that ref. it was 9.3, we have 9.2 (which can use that param too,) but I didn't realize I was looking at the 9.0 docs and didn't see that param there.

Concerning these two params below, when using hot_standby_feedback,  what would you set them if you had the situation where you had queries constantly running on the standby, some quick and some long, practically all day/night, everyday?

max_standby_archive_delay (integer)
max_standby_streaming_delay (integer)

[http://www.postgresql.org/docs/9.2/static/hot-standby.html]

Remedial possibilities exist if the number of standby-query cancellations is found to be unacceptable. The first option is to set the parameter hot_standby_feedback, which prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur. If you do this, you should note that this will delay cleanup of dead rows on the primary, which may result in undesirable table bloat. However, the cleanup situation will be no worse than if the standby queries were running directly on the primary server, and you are still getting the benefit of off-loading execution onto the standby. max_standby_archive_delay must be kept large in this case, because delayed WAL files might already contain entries that conflict with the desired standby queries.

Should I still keep them as -1 (infinite)? If I do, will the WAL updates still be applied while current queries are running, for example:

1. Query1 starts SELECT (10 minute query) @ 10:00:00 AM (current row version xid's)
2. The WAL update is applied to the standby (2 seconds) starts at 10:00:08 AM, completes at 10:00:10 AM
3. Query2 starts SELECT @ 10:00:15 AM (starts 5 seconds after the previous WAL update was applied; will it get the new data -- row versions xids -- just applied from the WAL update from #2? It looks like the xids are preserved on the primary when that param is set: hot_standby_feedback, but just want to confirm, that while the primary maybe more bloated now with dead rows, the new and existing (which could be dead on the primary)  are there for the standby to see?)

And sorry, just one last question (I'll be going over the docs thoroughly today and will be testing these in two VM's before I do anything in production) and that is: for the vacuuming of dead rows on the master, I assume that will now occur once there are no queries running on the standby? We have a lot of cpu power/space/mem, so that won't be an issue, but I'm just curious what will trigger that, I just assume no active queries running would do that -- pls let me know if I'm wrong.

I've seen this, but prefer not to set that because it may still kill long running queries on the standby.
"Another option is to increase vacuum_defer_cleanup_age on the primary server, so that dead rows will not be cleaned up as quickly as they normally would be. This will allow more time for queries to execute before they are canceled on the standby, without having to set a high max_standby_streaming_delay. However it is difficult to guarantee any specific execution-time window with this approach, since vacuum_defer_cleanup_age is measured in transactions executed on the primary server."

Again,  thank you both.

Alex

From: Yaser Raja <yrraja@gmail.com>
Date: Monday, November 23, 2015 4:08 PM
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: "Williams, Alex" <awilliams@teamdrg.com>, "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org>
Subject: Re: [ADMIN] Question on Hot Standby in PostgreSQL



On Mon, Nov 23, 2015 at 3:24 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
On Nov 20, 2015, at 12:50 PM, Williams, Alex <awilliams@teamdrg.com> wrote:
>
> However, I thought with the use of MVCC,
> an active query on the slave (long running, 30 seconds+) can run reading
> from row xid versions/snapshot...

The problem is that the master could well make changes that would remove rows in that snapshot because the master does not know anything about the rows that need preserving for the replica's query.


hot_standby_feedback addresses this problem. 

Here is the detail from the PG docs:

"Specifies whether or not a hot standby will send feedback to the primary or upstream standby about queries currently executing on the standby. This parameter can be used to eliminate query cancels caused by cleanup records, but can cause database bloat on the primary for some workloads. Feedback messages will not be sent more frequently than once per wal_receiver_status_interval. The default value is off. This parameter can only be set in the postgresql.conf file or on the server command line."

http://www.postgresql.org/docs/9.3/static/runtime-config-replication.html


Note that this will only avoid the conflicts caused by cleanup records (which normally is the majority of conflicts). 


 

That said, IIRC there may be something in 9.4 or 9.5 which provides for the replica to send that info to the server in order to coordinate things, but I haven't used that and so cannot offer more.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
https://www.linkedin.com/in/scottribe/
(303) 722-0567 voice







--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Question on Hot Standby in PostgreSQL

От
Albe Laurenz
Дата:
Alex Williams wrote:
> Concerning these two params below, when using hot_standby_feedback,  what would you set them if you
> had the situation where you had queries constantly running on the standby, some quick and some long,
> practically all day/night, everyday?
> 
> max_standby_archive_delay (integer)
> max_standby_streaming_delay (integer)

> Should I still keep them as -1 (infinite)?

I'd say set them both to 0.

If you have hot_standby_feedback=on, the primary will not remove any rows still
needed on the standby, so there should never be a need to delay WAL application.

The drawback is that you may get considerable bloat on the primary if many
DELETE/UPDATE operation are going on and queries on the standby take a long time.

Yours,
Laurenz Albe

Re: Question on Hot Standby in PostgreSQL

От
Yaser Raja
Дата:
On Wed, Nov 25, 2015 at 3:43 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Alex Williams wrote:
> Concerning these two params below, when using hot_standby_feedback,  what would you set them if you
> had the situation where you had queries constantly running on the standby, some quick and some long,
> practically all day/night, everyday?
>
> max_standby_archive_delay (integer)
> max_standby_streaming_delay (integer)

> Should I still keep them as -1 (infinite)?

I'd say set them both to 0.

If you have hot_standby_feedback=on, the primary will not remove any rows still
needed on the standby, so there should never be a need to delay WAL application.

hot_standby_feedback will significantly reduce the conflicts but it will not totally eliminate them. For example if a new column is added to a table, any query using this table on slave will be cancelled to apply this change.

IF query cancellation is totally not acceptable then you will have to set these delays to -1.

 

The drawback is that you may get considerable bloat on the primary if many
DELETE/UPDATE operation are going on and queries on the standby take a long time.

Yours,
Laurenz Albe

Re: Question on Hot Standby in PostgreSQL

От
"Williams, Alex"
Дата:
Hi Laurenz and Raja,

Thank you both for your feedback and insight, I really appreciate it. I think I'll leave it with -1 (since we may add columns in the future and other possible DDL changes,) so we'll schedule downtime on the standby at least 1x a week for cleaning up the master.

Again, I thank everyone for your help, this really helped my situation.

Happy Thanksgiving and Best Regards,

Alex

P.S. If anyone is ever in NJ, let me know, lunch is on me :)

From: Yaser Raja <yrraja@gmail.com>
Date: Wednesday, November 25, 2015 3:57 PM
To: Albe Laurenz <laurenz.albe@wien.gv.at>
Cc: "Williams, Alex" <awilliams@teamdrg.com>, Scott Ribe <scott_ribe@elevated-dev.com>, "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org>
Subject: Re: [ADMIN] Question on Hot Standby in PostgreSQL

On Wed, Nov 25, 2015 at 3:43 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Alex Williams wrote:
> Concerning these two params below, when using hot_standby_feedback,  what would you set them if you
> had the situation where you had queries constantly running on the standby, some quick and some long,
> practically all day/night, everyday?
>
> max_standby_archive_delay (integer)
> max_standby_streaming_delay (integer)

> Should I still keep them as -1 (infinite)?

I'd say set them both to 0.

If you have hot_standby_feedback=on, the primary will not remove any rows still
needed on the standby, so there should never be a need to delay WAL application.

hot_standby_feedback will significantly reduce the conflicts but it will not totally eliminate them. For example if a new column is added to a table, any query using this table on slave will be cancelled to apply this change.

IF query cancellation is totally not acceptable then you will have to set these delays to -1.

 

The drawback is that you may get considerable bloat on the primary if many
DELETE/UPDATE operation are going on and queries on the standby take a long time.

Yours,
Laurenz Albe

[ADMIN] Re: Question on installing pgadmin4 1.2 - htmlmin conflict withdjango.

От
"Williams, Alex"
Дата:
Hi,

I'm currently using CentOS 7 (CentOS Linux release 7.3.1611 (Core))

I was able to install pgadmin4 1.1 but it has a lot of bugs,  to the
point where I had to stop using it, esp. because it locks up when a
connection is lost and I'm forced to shut down pgadmin4 losing my queries.

I'm currently using the free trial of Navicat, but I noticed with a
recent yum update, pgadmin4 1.2 is available, but when trying to update
it, I get a conflict with htmlmin library:

Has anyone else got this and how did you resolve it?

Many thanks in advance!

Alex

Dependencies Resolved


=========================================================================================================================================================================================================================================
 Package
Arch
Version
Repository                                        Size

=========================================================================================================================================================================================================================================
Updating:
 firefox
x86_64
45.7.0-2.el7.centos
updates                                           77 M
 openssl
x86_64
1:1.0.1e-60.el7_3.1
updates                                          713 k
 openssl-devel
x86_64
1:1.0.1e-60.el7_3.1
updates                                          1.2 M
 openssl-libs
x86_64
1:1.0.1e-60.el7_3.1
updates                                          959 k
 pgadmin4-v1
x86_64
1.2-3.rhel7
pgdg95                                           105 k
 pgadmin4-v1-docs
noarch
1.2-3.rhel7
pgdg95                                            34 M
 pgadmin4-v1-web
noarch
1.2-3.rhel7
pgdg95                                           4.1 M
 postgresql-jdbc
noarch
42.0.0-1.rhel7
pgdg95                                           508 k
 python-blinker
noarch
1.4-1.rhel7
pgdg95                                           106 k
 sshpass
x86_64
1.06-1.el7
epel                                              21 k
Installing for dependencies:
 python-flask-htmlmin
noarch
1.2-3.rhel7
pgdg95                                           7.3 k
 python-htmlmin
noarch
0.1.10-5.gitcc611c3.rhel7
pgdg95                                            40 k

Transaction Summary

=========================================================================================================================================================================================================================================
Install              ( 2 Dependent packages)
Upgrade  10 Packages

Total size: 118 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test


Transaction check error:
  file /usr/lib/python2.7/site-packages/htmlmin/__init__.py from install
of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts with file
from package django-htmlmin-0.10.0-1.rhel7.noarch
  file /usr/lib/python2.7/site-packages/htmlmin/__init__.pyc from
install of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts
with file from package django-htmlmin-0.10.0-1.rhel7.noarch
  file /usr/lib/python2.7/site-packages/htmlmin/__init__.pyo from
install of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts
with file from package django-htmlmin-0.10.0-1.rhel7.noarch
  file /usr/lib/python2.7/site-packages/htmlmin/middleware.py from
install of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts
with file from package django-htmlmin-0.10.0-1.rhel7.noarch
  file /usr/lib/python2.7/site-packages/htmlmin/middleware.pyc from
install of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts
with file from package django-htmlmin-0.10.0-1.rhel7.noarch
  file /usr/lib/python2.7/site-packages/htmlmin/middleware.pyo from
install of python-htmlmin-0.1.10-5.gitcc611c3.rhel7.noarch conflicts
with file from package django-htmlmin-0.10.0-1.rhel7.noarch

Error Summary
-------------



Re: [ADMIN] Re: Question on installing pgadmin4 1.2 - htmlminconflict with django.

От
Devrim Gündüz
Дата:
Hi,

On Wed, 2017-02-22 at 20:08 +0000, Williams, Alex wrote:
> nection is lost and I'm forced to shut down pgadmin4 losing my queries.
>
> I'm currently using the free trial of Navicat, but I noticed with a
> recent yum update, pgadmin4 1.2 is available, but when trying to update
> it, I get a conflict with htmlmin library:

Can you please remove django-htmlmin prior to update? Sorry about that, but
there is no clear upgrade path now :(

Regards,
--
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR

Вложения

Re: [ADMIN] Re: Question on installing pgadmin4 1.2 - htmlminconflict with django.

От
"Williams, Alex"
Дата:
Hi Devrim.

Thanks, that worked! I was concerned about losing my setup connections
but they are still there after installing 1.2.

By removing django-htmlmin, it also uninstalled pgadmin4 1.1 (lost the
exact dependencies of what was removed, as I did a yum list installed
right after and couldn't scroll back enough to copy/paste here.)

With that said, I did the following which worked:

yum remove django-htmlmin

then

yum install pgadmin4-v1.x86_64

Thanks again!!

Alex


On 02/22/2017 03:44 PM, Devrim Gündüz wrote:
> Hi,
>
> On Wed, 2017-02-22 at 20:08 +0000, Williams, Alex wrote:
>> nection is lost and I'm forced to shut down pgadmin4 losing my queries.
>>
>> I'm currently using the free trial of Navicat, but I noticed with a
>> recent yum update, pgadmin4 1.2 is available, but when trying to update
>> it, I get a conflict with htmlmin library:
> Can you please remove django-htmlmin prior to update? Sorry about that, but
> there is no clear upgrade path now :(
>
> Regards,




Re: [ADMIN] Re: Question on installing pgadmin4 1.2 - htmlminconflict with django.

От
Devrim Gündüz
Дата:
Hi Alex,

On Wed, 2017-02-22 at 21:05 +0000, Williams, Alex wrote:
> Thanks, that worked!

You are welcome.

> I was concerned about losing my setup connections but they are still there
> after installing 1.2.

They are kept under ~/.pgadmin directory, so no issues.

> By removing django-htmlmin, it also uninstalled pgadmin4 1.1 (lost the
> exact dependencies of what was removed, as I did a yum list installed
> right after and couldn't scroll back enough to copy/paste here.)

<snip>

It probably removed all Python related dependencies of pgadmin4, but as you
reinstalled 1.2, they are installed again. Hopefully this will be the last
breakage.

Regards,
--
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR

Вложения

Hi,


Does anyone know if the check boxes in the result set introduced in 1.5 in the column names can be removed? They push down the column names to the next line, and get clipped by field container, where it's barely readable, sometimes not possible at all to read.


I checked the settings, but there are no options to do so.


Also, does anyone know when we will be able to copy the result-set rows from the result-set pane?


Thanks,


Alex