Thread: Performance degradation in commit 6150a1b0

Performance degradation in commit 6150a1b0

From

Amit Kapila

Date:

25 February 2016, 10:26:44

Over the past few weeks, we have been seeing performance degradation in read-only benchmarks on high-end machines. My colleague Mithun first tried reverting commit ac1d794, which was previously reported [1] to degrade performance in HEAD on high-end machines, but the degradation remained. We then did some profiling and found that it is mainly caused by the spinlock taken via pin/unpin buffer. Next we tried reverting commit 6150a1b0, which recently changed the structures in that area, and with that patch reverted we no longer see any degradation in performance. An important point is that the degradation doesn't occur on every run, but if the tests are repeated two or three times, it is easily visible.

m/c details

IBM POWER-8

24 cores, 192 hardware threads

RAM - 492GB

Non-default postgresql.conf settings-

shared_buffers=16GB

max_connections=200

min_wal_size=15GB

max_wal_size=20GB

checkpoint_timeout=900

maintenance_work_mem=1GB

checkpoint_completion_target=0.9

scale_factor - 300

Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002 at a client count of 64; at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1, it goes down to 200807. These numbers are the median of three 15-minute pgbench read-only tests. Similar data is seen when we revert the patch on the latest commit. We have yet to perform a detailed analysis of why commit 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 led to the degradation, but any ideas are welcome.
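For scale, the two medians quoted above work out to a drop of roughly 57%. A minimal sketch of that arithmetic follows; the variable and function names are illustrative, and the figures are taken as-is from this message, which does not state their unit.

```python
# Median pgbench read-only results at 64 clients, as quoted in this message.
before = 469002   # at commit 43cd468c
after = 200807    # at commit 6150a1b0


def relative_drop(old, new):
    """Return the fractional decrease going from old to new."""
    return (old - new) / old


drop = relative_drop(before, after)
print(f"throughput fell by {drop:.1%}")  # prints "throughput fell by 57.2%"
```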

[1] - http://www.postgresql.org/message-id/CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Performance degradation in commit 6150a1b0

From

Simon Riggs

Date:

25 February 2016, 21:09:03

On 24 February 2016 at 23:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

> From past few weeks, we were facing some performance degradation in the read-only performance bench marks in high-end machines. My colleague Mithun, has tried by reverting commit ac1d794 which seems to degrade the performance in HEAD on high-end m/c's as reported previously[1], but still we were getting degradation, then we have done some profiling to see what has caused it and we found that it's mainly caused by spin lock when called via pin/unpin buffer and then we tried by reverting commit 6150a1b0 which has recently changed the structures in that area and it turns out that reverting that patch, we don't see any degradation in performance. The important point to note is that the performance degradation doesn't occur every time, but if the tests are repeated twice or thrice, it is easily visible.

Not seen that on the original patch I posted. 6150a1b0 contains multiple changes to the lwlock structures, one written by me, others by Andres.

Perhaps we should revert that patch and re-apply the various changes in multiple commits so we can see the differences.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Performance degradation in commit 6150a1b0

From

Amit Kapila

Date:

26 February 2016, 05:42:52

On Thu, Feb 25, 2016 at 11:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

> On 24 February 2016 at 23:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > From past few weeks, we were facing some performance degradation in the read-only performance bench marks in high-end machines. My colleague Mithun, has tried by reverting commit ac1d794 which seems to degrade the performance in HEAD on high-end m/c's as reported previously[1], but still we were getting degradation, then we have done some profiling to see what has caused it and we found that it's mainly caused by spin lock when called via pin/unpin buffer and then we tried by reverting commit 6150a1b0 which has recently changed the structures in that area and it turns out that reverting that patch, we don't see any degradation in performance. The important point to note is that the performance degradation doesn't occur every time, but if the tests are repeated twice or thrice, it is easily visible.
>
> Not seen that on the original patch I posted. 6150a1b0 contains multiple changes to the lwlock structures, one written by me, others by Andres.
>
> Perhaps we should revert that patch and re-apply the various changes in multiple commits so we can see the differences.

Yes, that's one choice; the other is that we narrow down the root cause of the problem locally and then try to address it. The last time a similar issue came up on the list, the agreement [1] was to note it in the PostgreSQL 9.6 open items and then work on it. For this problem, we haven't got to the root cause yet, so we can try to investigate it. If nobody else steps up to reproduce and look into the problem in a few days, I will look into it.

[1] - http://www.postgresql.org/message-id/CA+TgmoYjYqegXzrBizL-Ov7zDsS=GavCnxYnGn9WZ1S=rP8DaA@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Performance degradation in commit 6150a1b0

From

Simon Riggs

Date:

26 February 2016, 18:11:08

On 25 February 2016 at 18:42, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Thu, Feb 25, 2016 at 11:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On 24 February 2016 at 23:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > From past few weeks, we were facing some performance degradation in the read-only performance bench marks in high-end machines. My colleague Mithun, has tried by reverting commit ac1d794 which seems to degrade the performance in HEAD on high-end m/c's as reported previously[1], but still we were getting degradation, then we have done some profiling to see what has caused it and we found that it's mainly caused by spin lock when called via pin/unpin buffer and then we tried by reverting commit 6150a1b0 which has recently changed the structures in that area and it turns out that reverting that patch, we don't see any degradation in performance. The important point to note is that the performance degradation doesn't occur every time, but if the tests are repeated twice or thrice, it is easily visible.
> >
> > Not seen that on the original patch I posted. 6150a1b0 contains multiple changes to the lwlock structures, one written by me, others by Andres.
> >
> > Perhaps we should revert that patch and re-apply the various changes in multiple commits so we can see the differences.
>
> Yes, thats one choice, other is locally we can narrow down the root cause of problem and then try to address the same. Last time similar issue came up on list, agreement [1] was to note down it in PostgreSQL 9.6 open items and then work on it. I think for this problem, we haven't got to the root cause of problem, so we can try to investigate it. If nobody else steps up to reproduce and look into problem, in few days, I will look into it.
>
> [1] - http://www.postgresql.org/message-id/CA+TgmoYjYqegXzrBizL-Ov7zDsS=GavCnxYnGn9WZ1S=rP8DaA@mail.gmail.com

Don't understand this. If a problem is caused by one of two things, first you check one, then the other.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

26 February 2016, 18:57:23

On Fri, Feb 26, 2016 at 8:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Don't understand this. If a problem is caused by one of two things, first
> you check one, then the other.

I don't quite understand how you think that patch can be decomposed
into multiple, independent changes.  It was one commit because every
change in there is interdependent with every other one, at least as
far as I can see.  I don't really understand how you'd split it up, or
what useful information you'd hope to gain from testing a split patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

26 February 2016, 22:12:14

Hi,

On 2016-02-25 12:56:39 +0530, Amit Kapila wrote:
> From past few weeks, we were facing some performance degradation in the
> read-only performance bench marks in high-end machines.  My colleague
> Mithun, has tried by reverting commit ac1d794 which seems to degrade the
> performance in HEAD on high-end m/c's as reported previously[1], but still
> we were getting degradation, then we have done some profiling to see what
> has caused it  and we found that it's mainly caused by spin lock when
> called via pin/unpin buffer and then we tried by reverting commit 6150a1b0
> which has recently changed the structures in that area and it turns out
> that reverting that patch, we don't see any degradation in performance.
> The important point to note is that the performance degradation doesn't
> occur every time, but if the tests are repeated twice or thrice, it
> is easily visible.

> m/c details
> IBM POWER-8
> 24 cores,192 hardware threads
> RAM - 492GB
> 
> Non-default postgresql.conf settings-
> shared_buffers=16GB
> max_connections=200
> min_wal_size=15GB
> max_wal_size=20GB
> checkpoint_timeout=900
> maintenance_work_mem=1GB
> checkpoint_completion_target=0.9
> 
> scale_factor - 300
> 
> Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002 at
> 64-client count and then at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1, it
> went down to 200807.  This performance numbers are median of 3 15-min
> pgbench read-only tests.  The similar data is seen even when we revert the
> patch on latest commit.  We have yet to perform detail analysis as to why
> the commit 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 lead to degradation,
> but any ideas are welcome.

Ugh. Especially the varying performance is odd. Does it vary between
restarts, or is it just happenstance?  If it's the former, we might be
dealing with some alignment issues.

If not, I wonder if the issue is massive buffer header contention. As
POWER is an LL/SC architecture, acquiring the content lock might
interrupt buffer spinlock acquisition and vice versa.
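
The LL/SC concern can be illustrated with a toy model. On load-linked/store-conditional hardware, a store-conditional fails if the reserved memory was written since the matching load-linked. The sketch below assumes the reservation covers a whole cache line that holds both a spinlock word and a content-lock word; the class and field names are made up for illustration, and real hardware is considerably more subtle.

```python
class CacheLine:
    """Toy model of one LL/SC reservation granule shared by several words."""

    def __init__(self, **words):
        self.words = dict(words)
        self.reserved_by = set()  # CPUs whose reservation is still valid

    def load_linked(self, cpu, name):
        """Read a word and start a reservation for this CPU."""
        self.reserved_by.add(cpu)
        return self.words[name]

    def store(self, name, value):
        """Any store to the line invalidates every outstanding reservation."""
        self.words[name] = value
        self.reserved_by.clear()

    def store_conditional(self, cpu, name, value):
        """Write only if this CPU's reservation survived; report success."""
        if cpu not in self.reserved_by:
            return False
        self.store(name, value)
        return True


line = CacheLine(spinlock=0, content_lock=0)

# CPU 0 starts to take the spinlock ...
line.load_linked(0, "spinlock")
# ... CPU 1 writes the content lock, which lives in the same line ...
line.store("content_lock", 1)
# ... so CPU 0's store-conditional fails and it has to retry.
first_try = line.store_conditional(0, "spinlock", 1)

line.load_linked(0, "spinlock")
second_try = line.store_conditional(0, "spinlock", 1)
print(first_try, second_try)  # prints "False True"
```

Under heavy contention each failed store-conditional means another trip around the retry loop, which is the kind of mutual interference suspected here.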

Does applying the patch from
http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
change the picture?

Regards,

Andres

Re: Performance degradation in commit 6150a1b0

From

Amit Kapila

Date:

27 February 2016, 06:55:23

On Sat, Feb 27, 2016 at 12:41 AM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2016-02-25 12:56:39 +0530, Amit Kapila wrote:
> > From past few weeks, we were facing some performance degradation in the
> > read-only performance bench marks in high-end machines. My colleague
> > Mithun, has tried by reverting commit ac1d794 which seems to degrade the
> > performance in HEAD on high-end m/c's as reported previously[1], but still
> > we were getting degradation, then we have done some profiling to see what
> > has caused it and we found that it's mainly caused by spin lock when
> > called via pin/unpin buffer and then we tried by reverting commit 6150a1b0
> > which has recently changed the structures in that area and it turns out
> > that reverting that patch, we don't see any degradation in performance.
> > The important point to note is that the performance degradation doesn't
> > occur every time, but if the tests are repeated twice or thrice, it
> > is easily visible.
>
> > m/c details
> > IBM POWER-8
> > 24 cores,192 hardware threads
> > RAM - 492GB
> >
> > Non-default postgresql.conf settings-
> > shared_buffers=16GB
> > max_connections=200
> > min_wal_size=15GB
> > max_wal_size=20GB
> > checkpoint_timeout=900
> > maintenance_work_mem=1GB
> > checkpoint_completion_target=0.9
> >
> > scale_factor - 300
> >
> > Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002 at
> > 64-client count and then at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1, it
> > went down to 200807. This performance numbers are median of 3 15-min
> > pgbench read-only tests. The similar data is seen even when we revert the
> > patch on latest commit. We have yet to perform detail analysis as to why
> > the commit 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 lead to degradation,
> > but any ideas are welcome.
>
> Ugh. Especially the varying performance is odd. Does it vary between
> restarts, or is it just happenstance? If it's the former, we might be
> dealing with some alignment issues.
>

It varies between restarts.

>
> If not, I wonder if the issue is massive buffer header contention. As a
> LL/SC architecture acquiring the content lock might interrupt buffer
> spinlock acquisition and vice versa.
>
> Does applying the patch from http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
> change the picture?
>

Not tried, but if this is an alignment issue as you suspect above, does it still make sense to try this out?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

27 February 2016, 06:56:53

On February 26, 2016 7:55:18 PM PST, Amit Kapila <amit.kapila16@gmail.com> wrote:
>Not tried, but if this is alignment issue as you are suspecting above, then does it make sense to try this out?

It's the other theory I had. And it's additionally useful testing regardless of this regression...

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Re: Performance degradation in commit 6150a1b0

From

Ashutosh Sharma

Date:

23 March 2016, 11:29:26

Hi All,

I have been working on this issue for the last few days, trying to find the probable reasons for the performance degradation at commit 6150a1b0. After going through Andres's patch for moving the buffer I/O and content locks out of the main tranche, the following two things come to mind.

1. The content lock is no longer a pointer in the BufferDesc structure; instead, it is embedded as an LWLock structure. This increases the overall structure size from 64 bytes to 80 bytes. To investigate this, I reverted the content-lock changes from commit 6150a1b0 and took at least 10 readings. With this change, the overall performance is similar to what was observed before commit 6150a1b0.

2. The BufferDesc structure padding is 64 bytes, whereas the PG CACHE LINE ALIGNMENT is 128 bytes. After changing the BufferDesc structure padding size to 128 bytes, along with the changes mentioned in point #1 above, the overall performance is again similar to what was observed before commit 6150a1b0.
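
To make the two observations above concrete, here is a rough model of the layout. It assumes the descriptors sit in a contiguous array that starts on a cache-line boundary, that each array element is a C union of the struct with a pad array (so its size is the larger of the two, rounded up to 8-byte alignment), and that a cache line is 128 bytes. `padded_stride` and `straddlers` are hypothetical helper names, not PostgreSQL functions.

```python
def padded_stride(struct_size, pad_to, align=8):
    """Size of a C union {struct; char pad[pad_to];}, rounded up to align."""
    size = max(struct_size, pad_to)
    return (size + align - 1) // align * align


def straddlers(stride, line=128, count=1024):
    """Count array elements whose first and last byte fall in different lines."""
    return sum(
        1
        for i in range(count)
        if (i * stride) // line != (i * stride + stride - 1) // line
    )


scenarios = {
    "64-byte struct, 64-byte pad": padded_stride(64, 64),
    "80-byte struct, 64-byte pad": padded_stride(80, 64),
    "80-byte struct, 128-byte pad": padded_stride(80, 128),
}
for label, stride in scenarios.items():
    print(f"{label}: stride {stride}, {straddlers(stride)} of 1024 cross a line")
```

In this model a 64-byte stride packs two descriptors into each line without any crossing, an 80-byte stride makes half of the descriptors cross a line boundary, and a 128-byte stride gives each descriptor a line of its own.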

Please have a look at the attached test report, which contains the performance test results for all the scenarios discussed above, and let me know your thoughts.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com


Attachments

Performance_Results.xlsx

Re: Performance degradation in commit 6150a1b0

From

Amit Kapila

Date:

25 March 2016, 06:59:42

On Wed, Mar 23, 2016 at 1:59 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi All,
>
> I have been working on this issue for last few days trying to investigate what could be the probable reasons for Performance degradation at commit 6150a1b0. After going through Andres patch for moving buffer I/O and content lock out of Main Tranche, the following two things come into my
> mind.
>
> 1. Content Lock is no more used as a pointer in BufferDesc structure instead it is included as LWLock structure. This basically increases the overall structure size from 64bytes to 80 bytes. Just to investigate on this, I have reverted the changes related to content lock from commit 6150a1b0 and taken at least 10 readings and with this change i can see that the overall performance is similar to what it was observed earlier i.e. before commit 6150a1b0.
>
> 2. Secondly, i can see that the BufferDesc structure padding is 64 bytes however the PG CACHE LINE ALIGNMENT is 128 bytes. Also, after changing the BufferDesc structure padding size to 128 bytes along with the changes mentioned in above point #1, I see that the overall performance is again similar to what is observed before commit 6150a1b0.
>
> Please have a look into the attached test report that contains the performance test results for all the scenarios discussed above and let me know your thoughts.
>

So changing the content lock back to an LWLock* in BufferDesc restores the performance, which indicates that growing BufferDesc beyond 64 bytes caused the regression on this platform. I think it is worth trying the patch [1] suggested by Andres, as it will reduce the size of BufferDesc, which may bring back the performance. Could you try that?

[1] - http://www.postgresql.org/message-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

25 March 2016, 10:08:46

On 2016-03-25 09:29:34 +0530, Amit Kapila wrote:
> > 2. Secondly, i can see that the BufferDesc structure padding is 64 bytes
> > however the PG CACHE LINE ALIGNMENT is 128 bytes. Also, after changing the
> > BufferDesc structure padding size to 128 bytes along with the changes
> > mentioned in above point #1, I see that the overall performance is again
> > similar to what is observed before commit 6150a1b0.

That makes sense, as it restores alignment.

> So this indicates that changing back content lock as LWLock* in BufferDesc
> brings back the performance which indicates that increase in BufferDesc
> size to more than 64bytes on this platform has caused regression.  I think
> it is worth trying the patch [1] as suggested by Andres as that will reduce
> the size of BufferDesc which can bring back the performance.  Can you once
> try the same?
> 
> [1] -
> http://www.postgresql.org/message-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com

Yes please. I'll try to review that once more ASAP.


Regards,

Andres

Re: Performance degradation in commit 6150a1b0

From

Ashutosh Sharma

Date:

26 March 2016, 19:01:58

Hi,

I am getting some reject files while trying to apply "pinunpin-cas-5.patch", which is attached to this thread:

http://www.postgresql.org/message-id/CAPpHfdsRoT1JmsnRnCCqpNZEU9vUT7TX6B-N1wyOuWWfhD6F+g@mail.gmail.com

Note: I am applying this patch on top of commit "6150a1b08a9fe7ead2b25240be46dddeae9d98e1".

With Regards,

Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com


Re: Performance degradation in commit 6150a1b0

From

Ashutosh Sharma

Date:

27 March 2016, 00:04:37

Hi,

As mentioned in my earlier mail, I was not able to apply pinunpin-cas-5.patch on commit 6150a1b0, so I applied it on the latest commit instead, which worked. I have now taken performance readings at the latest commit, 76281aa9, with and without pinunpin-cas-5.patch, and my observations are as follows:

1. With pinunpin-cas-5.patch applied on commit 76281aa9, performance still lags the expected performance by 2-3%.

2. Without pinunpin-cas-5.patch, performance at commit 76281aa9 lags the expected performance by 50-60%.

Note: Here, the expected performance is the performance observed before commit 6150a1b0 with ac1d794 reverted.

Please refer to the attached performance report sheet for more insights.

With Regards,

Ashutosh Sharma

EnterpriseDB: http://www.enterprisedb.com


Attachments

Performance_Results_with_pinunpin_cas_changes.xlsx

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

27 March 2016, 15:15:56

Hi,

On 2016-03-27 02:34:32 +0530, Ashutosh Sharma wrote:
> As mentioned in my earlier mail i was not able to apply
> pinunpin-cas-5.patch on commit 6150a1b0,

That's not surprising; that's pretty old.

> therefore i thought of applying it on the latest commit and i was
> able to do it successfully. I have now taken the performance readings
> at latest commit i.e. 76281aa9 with and without applying
> pinunpin-cas-5.patch and my observations are as follows,
>
> 1. I can still see that the current performance lags by 2-3% from the
> expected performance when pinunpin-cas-5.patch is applied on the commit
> 76281aa9.
> 2. When pinunpin-cas-5.patch is ignored and performance is measured at
> commit 76281aa9 the overall performance lags by 50-60% from the expected
> performance.
>
> Note: Here, the expected performance is the performance observed before
> commit 6150a1b0 when ac1d794 is reverted.

Thanks for doing these benchmarks. What's the performance if you revert
6150a1b0 on top of a recent master? There've been a lot of other patches
influencing performance since 6150a1b0, so minor performance differences
aren't necessarily meaningful; especially when that older version then
had other patches reverted.

Thanks,

Andres

Re: Performance degradation in commit 6150a1b0

From

Ashutosh Sharma

Date:

29 March 2016, 15:30:54

Hi,

I am unable to revert 6150a1b0 on top of a recent commit in the master branch. It seems that some recent commit depends on 6150a1b0.

With Regards,

Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com


Re: Performance degradation in commit 6150a1b0

From

Noah Misch

Date:

31 March 2016, 08:11:04

On Sun, Mar 27, 2016 at 02:15:50PM +0200, Andres Freund wrote:
> On 2016-03-27 02:34:32 +0530, Ashutosh Sharma wrote:
> > As mentioned in my earlier mail i was not able to apply
> > *pinunpin-cas-5.patch* on commit *6150a1b0,
> 
> That's not surprising; that's pretty old.
> 
> > *therefore i thought of applying it on the latest commit and i was
> > able to do it successfully. I have now taken the performance readings
> > at latest commit i.e. *76281aa9* with and without applying
> > *pinunpin-cas-5.patch* and my observations are as follows,
> >
> 
> > 1. I can still see that the current performance lags by 2-3% from the
> > expected performance when *pinunpin-cas-5.patch *is applied on the commit
> > 
> > *76281aa9.*
> > 2. When *pinunpin-cas-5.patch *is ignored and performance is measured at
> > commit *76281aa9 *the overall performance lags by 50-60% from the expected
> > performance.
> > 
> > *Note:* Here, the expected performance is the performance observed before
> > commit *6150a1b0 *when* ac1d794 *is reverted.
> 
> Thanks for doing these benchmarks. What's the performance if you revert
> 6150a1b0 on top of a recent master? There've been a lot of other patches
> influencing performance since 6150a1b0, so minor performance differences
> aren't necessarily meaningful; especially when that older version then
> had other patches reverted.

[This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item.  Andres,
since you committed the patch believed to have created it, you own this open
item.  If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this.  Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution.  Please
present, within 72 hours, a plan to fix the defect within seven days of this
message.  Thanks.

Re: Performance degradation in commit 6150a1b0

From

Noah Misch

Date:

31 March 2016, 08:16:39

On Thu, Mar 31, 2016 at 01:10:56AM -0400, Noah Misch wrote:
> On Sun, Mar 27, 2016 at 02:15:50PM +0200, Andres Freund wrote:
> > On 2016-03-27 02:34:32 +0530, Ashutosh Sharma wrote:
> > > As mentioned in my earlier mail i was not able to apply
> > > *pinunpin-cas-5.patch* on commit *6150a1b0,
> > 
> > That's not surprising; that's pretty old.
> > 
> > > *therefore i thought of applying it on the latest commit and i was
> > > able to do it successfully. I have now taken the performance readings
> > > at latest commit i.e. *76281aa9* with and without applying
> > > *pinunpin-cas-5.patch* and my observations are as follows,
> > >
> > 
> > > 1. I can still see that the current performance lags by 2-3% from the
> > > expected performance when *pinunpin-cas-5.patch *is applied on the commit
> > > 
> > > *76281aa9.*
> > > 2. When *pinunpin-cas-5.patch *is ignored and performance is measured at
> > > commit *76281aa9 *the overall performance lags by 50-60% from the expected
> > > performance.
> > > 
> > > *Note:* Here, the expected performance is the performance observed before
> > > commit *6150a1b0 *when* ac1d794 *is reverted.
> > 
> > Thanks for doing these benchmarks. What's the performance if you revert
> > 6150a1b0 on top of a recent master? There've been a lot of other patches
> > influencing performance since 6150a1b0, so minor performance differences
> > aren't necessarily meaningful; especially when that older version then
> > had other patches reverted.
> 
> [This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 9.6 open item.  Andres,
> since you committed the patch believed to have created it, you own this open
> item.  If that responsibility lies elsewhere, please let us know whose
> responsibility it is to fix this.  Since new open items may be discovered at
> any time and I want to plan to have them all fixed well in advance of the ship
> date, I will appreciate your efforts toward speedy resolution.  Please
> present, within 72 hours, a plan to fix the defect within seven days of this
> message.  Thanks.

My attribution above was incorrect.  Robert Haas is the committer and owner of
this one.  I apologize.

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

31 March 2016, 10:51:24


On March 31, 2016 7:16:33 AM GMT+02:00, Noah Misch <noah@leadboat.com> wrote:
>On Thu, Mar 31, 2016 at 01:10:56AM -0400, Noah Misch wrote:
>> On Sun, Mar 27, 2016 at 02:15:50PM +0200, Andres Freund wrote:
>> > On 2016-03-27 02:34:32 +0530, Ashutosh Sharma wrote:
>> > > As mentioned in my earlier mail i was not able to apply
>> > > *pinunpin-cas-5.patch* on commit *6150a1b0,
>> > 
>> > That's not surprising; that's pretty old.
>> > 
>> > > *therefore i thought of applying it on the latest commit and i was
>> > > able to do it successfully. I have now taken the performance readings
>> > > at latest commit i.e. *76281aa9* with and without applying
>> > > *pinunpin-cas-5.patch* and my observations are as follows,
>> > >
>> >
>> > > 1. I can still see that the current performance lags by 2-3% from the
>> > > expected performance when *pinunpin-cas-5.patch *is applied on the commit
>> > >
>> > > *76281aa9.*
>> > > 2. When *pinunpin-cas-5.patch *is ignored and performance is measured at
>> > > commit *76281aa9 *the overall performance lags by 50-60% from the expected
>> > > performance.
>> > >
>> > > *Note:* Here, the expected performance is the performance observed before
>> > > commit *6150a1b0 *when* ac1d794 *is reverted.
>> >
>> > Thanks for doing these benchmarks. What's the performance if you revert
>> > 6150a1b0 on top of a recent master? There've been a lot of other patches
>> > influencing performance since 6150a1b0, so minor performance differences
>> > aren't necessarily meaningful; especially when that older version then
>> > had other patches reverted.
>>
>> [This is a generic notification.]
>>
>> The above-described topic is currently a PostgreSQL 9.6 open item.  Andres,
>> since you committed the patch believed to have created it, you own this open
>> item.  If that responsibility lies elsewhere, please let us know whose
>> responsibility it is to fix this.  Since new open items may be discovered at
>> any time and I want to plan to have them all fixed well in advance of the ship
>> date, I will appreciate your efforts toward speedy resolution.  Please
>> present, within 72 hours, a plan to fix the defect within seven days of this
>> message.  Thanks.
>
>My attribution above was incorrect.  Robert Haas is the committer and owner of
>this one.  I apologize.

Fine in this case, I guess. I've posted a proposal nearby either way; it appears to be a !x86 problem.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

31 March 2016, 13:43:25

On Thu, Mar 31, 2016 at 3:51 AM, Andres Freund <andres@anarazel.de> wrote:
>>My attribution above was incorrect.  Robert Haas is the committer and owner of
>>this one.  I apologize.
>
> Fine in this case I guess. I've posted a proposal nearby either way, it appears to be a !x86 problem.

To which proposal are you referring?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Performance degradation in commit 6150a1b0

From

Andres Freund

Date:

31 March 2016, 13:45:29

On 2016-03-31 06:43:19 -0400, Robert Haas wrote:
> On Thu, Mar 31, 2016 at 3:51 AM, Andres Freund <andres@anarazel.de> wrote:
> >>My attribution above was incorrect.  Robert Haas is the committer and owner of
> >>this one.  I apologize.
> >
> > Fine in this case I guess. I've posted a proposal nearby either way, it appears to be a !x86 problem.
> 
> To which proposal are you referring?

1) in http://www.postgresql.org/message-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

31 March 2016, 13:51:05

On Thu, Mar 31, 2016 at 6:45 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-03-31 06:43:19 -0400, Robert Haas wrote:
>> On Thu, Mar 31, 2016 at 3:51 AM, Andres Freund <andres@anarazel.de> wrote:
>> >>My attribution above was incorrect.  Robert Haas is the committer and owner of
>> >>this one.  I apologize.
>> >
>> > Fine in this case I guess. I've posted a proposal nearby either way, it appears to be a !x86 problem.
>>
>> To which proposal are you referring?
>
> 1) in http://www.postgresql.org/message-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de

OK.  So, Noah, my proposed strategy is to wait and see if Andres can
make that work, and if not, then revisit the issue of what to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Performance degradation in commit 6150a1b0

From

Tom Lane

Date:

31 March 2016, 17:13:52

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 31, 2016 at 6:45 AM, Andres Freund <andres@anarazel.de> wrote:
>> On 2016-03-31 06:43:19 -0400, Robert Haas wrote:
>>> To which proposal are you referring?

>> 1) in http://www.postgresql.org/message-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de

> OK.  So, Noah, my proposed strategy is to wait and see if Andres can
> make that work, and if not, then revisit the issue of what to do.

I thought that proposal had already crashed and burned, on the grounds
that byte-size spinlocks require instructions that many PPC machines
don't have.
        regards, tom lane

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

13 April 2016, 00:36:12

On Thu, Mar 31, 2016 at 10:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Mar 31, 2016 at 6:45 AM, Andres Freund <andres@anarazel.de> wrote:
>>> On 2016-03-31 06:43:19 -0400, Robert Haas wrote:
>>>> To which proposal are you referring?
>
>>> 1) in http://www.postgresql.org/message-id/20160328130904.4mhugvkf4f3wg4qb@awork2.anarazel.de
>
>> OK.  So, Noah, my proposed strategy is to wait and see if Andres can
>> make that work, and if not, then revisit the issue of what to do.
>
> I thought that proposal had already crashed and burned, on the grounds
> that byte-size spinlocks require instructions that many PPC machines
> don't have.

So the current status of this issue is:

1. Andres committed a patch (008608b9d51061b1f598c197477b3dc7be9c4a64)
to reduce the size of an LWLock by an amount equal to the size of a
mutex (modulo alignment).

2. Andres also committed a patch
(48354581a49c30f5757c203415aa8412d85b0f70) to remove the spinlock from
a BufferDesc, which also reduces its size, I think, because it
replaces members of types BufFlags (2 bytes), uint8, slock_t, and
unsigned with a single member of type pg_atomic_uint32.
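
To illustrate point 2, here is a minimal sketch of how a reference count, a usage count, and flag bits can share one 32-bit word, so that a single atomic integer can stand in for several separate fields. The bit widths (18/4/10) and all names below are assumptions chosen for illustration, and a plain Python integer stands in for the atomic variable, so this shows only the packing, not the concurrency.

```python
# Illustrative bit layout for a packed 32-bit buffer state word.
# The widths and names are assumptions made for this sketch.
REFCOUNT_BITS = 18
USAGECOUNT_BITS = 4
FLAG_BITS = 10

REFCOUNT_MASK = (1 << REFCOUNT_BITS) - 1
USAGECOUNT_SHIFT = REFCOUNT_BITS
USAGECOUNT_MASK = ((1 << USAGECOUNT_BITS) - 1) << USAGECOUNT_SHIFT
FLAG_SHIFT = REFCOUNT_BITS + USAGECOUNT_BITS
FLAG_MASK = ((1 << FLAG_BITS) - 1) << FLAG_SHIFT


def pack_state(refcount, usage_count, flags):
    """Combine the three logical fields into one 32-bit integer."""
    if not 0 <= refcount <= REFCOUNT_MASK:
        raise ValueError("refcount does not fit in its bit field")
    if not 0 <= usage_count < (1 << USAGECOUNT_BITS):
        raise ValueError("usage_count does not fit in its bit field")
    if not 0 <= flags < (1 << FLAG_BITS):
        raise ValueError("flags do not fit in their bit field")
    return refcount | (usage_count << USAGECOUNT_SHIFT) | (flags << FLAG_SHIFT)


def unpack_state(state):
    """Split a packed word back into (refcount, usage_count, flags)."""
    return (
        state & REFCOUNT_MASK,
        (state & USAGECOUNT_MASK) >> USAGECOUNT_SHIFT,
        (state & FLAG_MASK) >> FLAG_SHIFT,
    )


def pin(state):
    """Return the state with the reference count incremented by one."""
    if state & REFCOUNT_MASK == REFCOUNT_MASK:
        raise OverflowError("refcount field is already at its maximum")
    # The refcount occupies the lowest bits, so adding 1 touches only that field.
    return state + 1
```

Because the reference count sits in the low bits, a pin is a single add on the packed word; in a real lock-free implementation that add would be done with an atomic operation (for example a compare-and-swap loop) rather than on a plain integer.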

The reason why these changes are relevant is because Andres thought
the observed regression might be related to the BufferDesc growing to
more than 64 bytes on POWER, which in turn could cause buffer
descriptors to get split across cache lines.  However, in the
meantime, I did some performance tests on the same machine that Amit
used for testing in the email that started this thread:

http://www.postgresql.org/message-id/CA+TgmoZJdA6K7-17K4A48rVB0UPR98HVuaNcfNNLrGsdb1uChg@mail.gmail.com

The upshot of that is that (1) the performance degradation I saw was
significant but smaller than what Amit reported in the OP, and (2) it
looked like the patches Andres gave me to test at the time got
performance back to about the same level we were at before 6150a1b0.
So there's room for optimism that this is fixed, but perhaps some
retesting is in order, since what was committed was, I think, not
identical to what I tested.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Performance degradation in commit 6150a1b0

From

Noah Misch

Date:

13 April 2016, 05:30:21

On Tue, Apr 12, 2016 at 05:36:07PM -0400, Robert Haas wrote:
> So the current status of this issue is:
> 
> 1. Andres committed a patch (008608b9d51061b1f598c197477b3dc7be9c4a64)
> to reduce the size of an LWLock by an amount equal to the size of a
> mutex (modulo alignment).
> 
> 2. Andres also committed a patch
> (48354581a49c30f5757c203415aa8412d85b0f70) to remove the spinlock from
> a BufferDesc, which also reduces its size, I think, because it
> replaces members of types BufFlags (2 bytes), uint8, slock_t, and
> unsigned with a single member of type pg_atomic_uint32.
> 
> The reason why these changes are relevant is because Andres thought
> the observed regression might be related to the BufferDesc growing to
> more than 64 bytes on POWER, which in turn could cause buffer
> descriptors to get split across cache lines.  However, in the
> meantime, I did some performance tests on the same machine that Amit
> used for testing in the email that started this thread:
> 
> http://www.postgresql.org/message-id/CA+TgmoZJdA6K7-17K4A48rVB0UPR98HVuaNcfNNLrGsdb1uChg@mail.gmail.com
> 
> The upshot of that is that (1) the performance degradation I saw was
> significant but smaller than what Amit reported in the OP, and (2) it
> looked like the patches Andres gave me to test at the time got
> performance back to about the same level we were at before 6150a1b0.
> So there's room for optimism that this is fixed, but perhaps some
> retesting is in order, since what was committed was, I think, not
> identical to what I tested.

That sounds like this open item is ready for CLOSE_WAIT status; is it?

If someone does retest this, it would be informative to see how the system
performs with 6150a1b0 reverted.  Your testing showed performance of 6150a1b0
alone and of 6150a1b0 plus predecessors of 008608b and 4835458.  I don't
recall seeing figures for 008608b + 4835458 - 6150a1b0, though.

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

13 April 2016, 06:40:50

On Tue, Apr 12, 2016 at 10:30 PM, Noah Misch <noah@leadboat.com> wrote:
> That sounds like this open item is ready for CLOSE_WAIT status; is it?

I just retested this on power2.  Here are the results.  I retested
3fed4174 and 6150a1b0 plus master as of deb71fa9.  5-minute pgbench -S
runs, scale factor 300, with predictable prewarming to minimize
variation, as well as numactl --interleave.  Each result is a median
of three.

clients   3fed4174        6150a1b0        master
1         13701.014931    13669.626916    19685.571089
8         126676.357079   125239.911105   122940.079404
32        323989.685428   338638.095126   333656.861590
64        495434.372578   457794.475129   493034.922791
128       376412.090366   363157.294391   625498.280370

On this test 8, 32, and 64 clients are coming out about the same as
3fed4174, but 1 client and 128 clients are dramatically improved with
current master.  The 1-client result is a lot more surprising than the
128-client result; I don't know what's going on there.  But anyway I
don't see a regression here.
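
For readers who want the relative differences rather than raw numbers, the following sketch computes the percentage change of each build against 3fed4174. The figures are copied from the results in this message; the function and variable names are made up for the example.

```python
# Median results from this message, keyed by client count.
RESULTS = {
    1: {"3fed4174": 13701.014931, "6150a1b0": 13669.626916, "master": 19685.571089},
    8: {"3fed4174": 126676.357079, "6150a1b0": 125239.911105, "master": 122940.079404},
    32: {"3fed4174": 323989.685428, "6150a1b0": 338638.095126, "master": 333656.861590},
    64: {"3fed4174": 495434.372578, "6150a1b0": 457794.475129, "master": 493034.922791},
    128: {"3fed4174": 376412.090366, "6150a1b0": 363157.294391, "master": 625498.280370},
}


def percent_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def compare(results, baseline="3fed4174"):
    """Map each client count to the percent change of every non-baseline build."""
    return {
        clients: {
            name: percent_change(row[baseline], value)
            for name, value in row.items()
            if name != baseline
        }
        for clients, row in results.items()
    }


if __name__ == "__main__":
    for clients, changes in compare(RESULTS).items():
        line = ", ".join(f"{name}: {delta:+.1f}%" for name, delta in changes.items())
        print(f"{clients:>3} clients: {line}")
```

Run this way, the 64-client figure for 6150a1b0 comes out several percent below 3fed4174, while master is within about 3% of 3fed4174 at 8, 32, and 64 clients and well above it at 1 and 128 clients.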

So, yes, I would say this should go to CLOSE_WAIT at this point,
unless Amit or somebody else turns up further evidence of a continuing
issue here.

Random points of possible interest:

1. During a 128-client run, top shows about 45% user time, 10% system
time, 45% idle.

2. About 3 minutes into a 128-client run, perf looks like this
(substantially abridged):
   3.55%  postgres  postgres      [.] GetSnapshotData
   2.15%  postgres  postgres      [.] LWLockAttemptLock
          |--32.82%-- LockBuffer
          |          |--48.59%-- _bt_relandgetbuf
          |          |--44.07%-- _bt_getbuf
          |--29.81%-- ReadBuffer_common
          |--23.88%-- GetSnapshotData
          |--5.30%-- LockAcquireExtended
   2.12%  postgres  postgres      [.] LWLockRelease
   2.02%  postgres  postgres      [.] _bt_compare
   1.88%  postgres  postgres      [.] hash_search_with_hash_value
          |--47.21%-- BufTableLookup
          |--10.93%-- LockAcquireExtended
          |--5.43%-- GetPortalByName
          |--5.21%-- ReadBuffer_common
          |--4.68%-- RelationIdGetRelation
   1.87%  postgres  postgres      [.] AllocSetAlloc
   1.42%  postgres  postgres      [.] PinBuffer.isra.3
   0.96%  postgres  libc-2.17.so  [.] __memcpy_power7
   0.89%  postgres  postgres      [.] UnpinBuffer.constprop.7
   0.80%  postgres  postgres      [.] PostgresMain
   0.80%  postgres  postgres      [.] pg_encoding_mbcliplen
   0.71%  postgres  postgres      [.] hash_any
   0.62%  postgres  postgres      [.] AllocSetFree
   0.59%  postgres  postgres      [.] palloc
   0.57%  postgres  libc-2.17.so  [.] _int_free

A context-switch profile, somewhat amazingly, shows no context
switches for anything other than waiting on client read, implying that
performance is entirely constrained by memory bandwidth and CPU speed,
not lock contention.

> If someone does retest this, it would be informative to see how the system
> performs with 6150a1b0 reverted.  Your testing showed performance of 6150a1b0
> alone and of 6150a1b0 plus predecessors of 008608b and 4835458.  I don't
> recall seeing figures for 008608b + 4835458 - 6150a1b0, though.

That revert isn't trivial: even what exactly that would mean at this
point is somewhat subjective.  I'm also not sure there is much point.
6150a1b08a9fe7ead2b25240be46dddeae9d98e1 was written in such a way
that only platforms with single-byte spinlocks were going to have a
BufferDesc that fits into 64 bytes, which in retrospect was a bit
short-sighted.  Because the changes that were made to get it back down
to 64 bytes might also have other performance-relevant consequences,
it's a bit hard to be sure that that was the precise thing that caused
the regression.  And of course there was a flurry of other commits going
in at the same time, some even on related topics, which further adds
to the difficulty of pinpointing this precisely.  All that is a bit
unfortunate in some sense, but I think we're just going to have to
keep moving forward and hope for the best.
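
The size argument above can be sketched with `ctypes`, which lays out structures using the platform's C alignment rules. The field lists below are an approximation written for this example, not copied from the PostgreSQL headers; the spinlock is modeled as either a 1-byte or a 4-byte integer, and pointers are modeled as 64-bit integers, so the sketch assumes a typical 64-bit ABI.

```python
import ctypes

# The 64-byte limit discussed in this thread.
BOUNDARY = 64


def make_descriptor(spinlock_type):
    """Build a hypothetical buffer-descriptor layout for a given spinlock type."""

    class LWLock(ctypes.Structure):
        # Approximate lock layout: spinlock, tranche id, state, wait-list links.
        _fields_ = [
            ("mutex", spinlock_type),
            ("tranche", ctypes.c_uint16),
            ("state", ctypes.c_uint32),
            ("wait_head", ctypes.c_uint64),  # stands in for a 64-bit pointer
            ("wait_tail", ctypes.c_uint64),  # stands in for a 64-bit pointer
        ]

    class BufferDesc(ctypes.Structure):
        _fields_ = [
            ("tag", ctypes.c_uint32 * 5),
            ("flags", ctypes.c_uint16),
            ("usage_count", ctypes.c_uint8),
            ("buf_hdr_lock", spinlock_type),
            ("refcount", ctypes.c_uint32),
            ("wait_backend_pid", ctypes.c_int32),
            ("buf_id", ctypes.c_int32),
            ("free_next", ctypes.c_int32),
            ("content_lock", LWLock),
        ]

    return BufferDesc


def count_straddling(desc_size, n, boundary=BOUNDARY):
    """Count array elements that cross a boundary-sized block.

    Assumes the array itself starts on a boundary.
    """
    count = 0
    for i in range(n):
        first_block = (i * desc_size) // boundary
        last_block = (i * desc_size + desc_size - 1) // boundary
        if first_block != last_block:
            count += 1
    return count


if __name__ == "__main__":
    for name, lock_type in (("1-byte", ctypes.c_uint8), ("4-byte", ctypes.c_uint32)):
        size = ctypes.sizeof(make_descriptor(lock_type))
        crossing = count_straddling(size, 1000)
        print(f"{name} spinlock: {size} bytes, {crossing} of 1000 cross a boundary")
```

Under these assumptions the 1-byte variant fills exactly 64 bytes, while the 4-byte variant picks up alignment padding in two places and grows past 64 bytes, so every element of an array of such descriptors spans two 64-byte blocks.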

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Performance degradation in commit 6150a1b0

From

Noah Misch

Date:

14 April 2016, 06:10:51

On Tue, Apr 12, 2016 at 11:40:43PM -0400, Robert Haas wrote:
> On Tue, Apr 12, 2016 at 10:30 PM, Noah Misch <noah@leadboat.com> wrote:
> > That sounds like this open item is ready for CLOSE_WAIT status; is it?
> 
> I just retested this on power2.

> So, yes, I would say this should go to CLOSE_WAIT at this point,
> unless Amit or somebody else turns up further evidence of a continuing
> issue here.

Thanks for testing again.

> > If someone does retest this, it would be informative to see how the system
> > performs with 6150a1b0 reverted.  Your testing showed performance of 6150a1b0
> > alone and of 6150a1b0 plus predecessors of 008608b and 4835458.  I don't
> > recall seeing figures for 008608b + 4835458 - 6150a1b0, though.
> 
> That revert isn't trivial: even what exactly that would mean at this
> point is somewhat subjective.  I'm also not sure there is much point.
> 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 was written in such a way
> that only platforms with single-byte spinlocks were going to have a
> BufferDesc that fits into 64 bytes, which in retrospect was a bit
> short-sighted.  Because the changes that were made to get it back down
> to 64 bytes might also have other performance-relevant consequences,
> it's a bit hard to be sure that that was the precise thing that caused
> the regression.  And of course there was a flurry of other commits going
> in at the same time, some even on related topics, which further adds
> to the difficulty of pinpointing this precisely.  All that is a bit
> unfortunate in some sense, but I think we're just going to have to
> keep moving forward and hope for the best.

I can live with that.

Re: Performance degradation in commit 6150a1b0

From

Amit Kapila

Date:

14 April 2016, 06:22:31

On Wed, Apr 13, 2016 at 9:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Apr 12, 2016 at 10:30 PM, Noah Misch <noah@leadboat.com> wrote:
> > That sounds like this open item is ready for CLOSE_WAIT status; is it?
>
> I just retested this on power2. Here are the results. I retested
> 3fed4174 and 6150a1b0 plus master as of deb71fa9. 5-minute pgbench -S
> runs, scale factor 300, with predictable prewarming to minimize
> variation, as well as numactl --interleave. Each result is a median
> of three.
>
> 1 client: 3fed4174 = 13701.014931, 6150a1b0 = 13669.626916, master =
> 19685.571089
> 8 clients: 3fed4174 = 126676.357079, 6150a1b0 = 125239.911105, master
> = 122940.079404
> 32 clients: 3fed4174 = 323989.685428, 6150a1b0 = 338638.095126, master
> = 333656.861590
> 64 clients: 3fed4174 = 495434.372578, 6150a1b0 = 457794.475129, master
> = 493034.922791
> 128 clients: 3fed4174 = 376412.090366, 6150a1b0 = 363157.294391,
> master = 625498.280370
>
> On this test 8, 32, and 64 clients are coming out about the same as
> 3fed4174, but 1 client and 128 clients are dramatically improved with
> current master. The 1-client result is a lot more surprising than the
> 128-client result; I don't know what's going on there. But anyway I
> don't see a regression here.
>
> So, yes, I would say this should go to CLOSE_WAIT at this point,
> unless Amit or somebody else turns up further evidence of a continuing
> issue here.
>

Yes, I also think that this particular issue can be closed. However, I feel that the run-to-run performance variation is still present, as I never needed to prewarm or do anything else to get consistent results during my work on 9.5 or early 9.6. Also, Andres, Alexander, and I are looking into a similar observation (run-to-run performance variation) in a nearby thread [1].

[1] - http://www.postgresql.org/message-id/20160412160246.nyzil35w3wein5fm@alap3.anarazel.de

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Performance degradation in commit 6150a1b0

From

Robert Haas

Date:

14 April 2016, 21:09:57

On Wed, Apr 13, 2016 at 11:22 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yes, I also think that this particular issue can be closed.  However I felt
> that the observation related to performance variation is still present as I
> never need to perform prewarm or anything else to get consistent results
> during my work in 9.5 or early 9.6.  Also, Andres, Alexander and myself are
> working on similar observation (run-to-run performance variation) in a
> nearby thread [1].

Yeah.  My own measurements do not seem to support the idea that the
variance recently increased, but I haven't tested incredibly widely.
It may be that whatever is causing the variance is something that used
to be hidden by locking bottlenecks and now no longer is.
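
One simple way to put a number on run-to-run variance is to report the spread of repeated runs next to their median. The sketch below is only an illustration of that idea; the function name and the choice of (max - min) / median as the spread measure are assumptions, not something used in this thread.

```python
import statistics


def summarize_runs(values):
    """Return (median, spread_percent) for repeated benchmark runs.

    spread_percent is the max-min range expressed as a percentage of the median.
    """
    if len(values) < 2:
        raise ValueError("need at least two runs to measure variation")
    median = statistics.median(values)
    spread_percent = 100.0 * (max(values) - min(values)) / median
    return median, spread_percent
```

Calling `summarize_runs` on the three results of each configuration would show whether a difference between two builds is larger than the spread within either of them.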

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
