Thread: Optimizing TransactionIdIsCurrentTransactionId()

Optimizing TransactionIdIsCurrentTransactionId()

From
Simon Riggs
Date:
TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.

A patch for this is attached.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise

Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Robert Haas
Date:
On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.
>
> A patch for this is attached.

It might be an idea to first call TransactionIdIsNormal(xid), then
GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
that we don't bother with GetTopTransactionIdIfAny() when
!TransactionIdIsNormal(xid).

But it's also not clear to me whether this is actually a win. You're
doing an extra TransactionIdIsNormal() test to sometimes avoid a
GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty
cheap, but GetTopTransactionIdIfAny() isn't all that expensive either,
and adding more branches costs something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Tomas Vondra
Date:
On Thu, Dec 19, 2019 at 02:27:01PM -0500, Robert Haas wrote:
>On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>> TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.
>>
>> A patch for this is attached.
>
>It might be an idea to first call TransactionIdIsNormal(xid), then
>GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
>that we don't bother with GetTopTransactionIdIfAny() when
>!TransactionIdIsNormal(xid).
>
>But it's also not clear to me whether this is actually a win. You're
>doing an extra TransactionIdIsNormal() test to sometimes avoid a
>GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty
>cheap, but GetTopTransactionIdIfAny() isn't all that expensive either,
>and adding more branches costs something.
>

I think "optimization" patches should generally come with some sort of
quantification of the gains - e.g. a benchmark with somewhat realistic
workload (but even synthetic is better than nothing). Or at least some
explanation *why* it's going to be an improvement.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Simon Riggs
Date:
On Thu, 19 Dec 2019 at 19:27, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>> TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.
>>
>> A patch for this is attached.
>
> It might be an idea to first call TransactionIdIsNormal(xid), then
> GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
> that we don't bother with GetTopTransactionIdIfAny() when
> !TransactionIdIsNormal(xid).
>
> But it's also not clear to me whether this is actually a win. You're
> doing an extra TransactionIdIsNormal() test to sometimes avoid a
> GetTopTransactionIdIfAny() test.

That's not the point of the patch.

If the TopTransactionId is not assigned, we can leave the whole function more quickly, not just avoid a test.

Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise

Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Robert Haas
Date:
On Fri, Dec 20, 2019 at 12:46 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> If the TopTransactionId is not assigned, we can leave the whole function more quickly, not just avoid a test.

Those things are not really any different from each other. You leave
the function when you've done all the necessary tests....

> Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.

With regard to this point, I second Tomas's comments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Simon Riggs
Date:
On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.
>
> With regard to this point, I second Tomas's comments.

I also agree with Tomas' comments. I am explaining *why* it will be an improvement, expanding on my earlier notes.

This function is called extremely frequently in query processing and is fairly efficient. I'm pointing out cases where making it even quicker makes sense.

The TopXid is assigned in very few calls. Write transactions perform searching before the xid is assigned, so UPDATE and DELETE transactions will call this with TopXid unassigned in many small transactions, e.g. simple pgbench. In almost all read-only cases and especially on standby nodes there will be no TopXid assigned, so I estimate that 90-99% of calls will be made with TopXid invalid. In this case it makes a great deal of sense to have a fastpath out of this function, by testing TransactionIdIsNormal(topxid).

I also now notice that on entry the xid provided is hardly ever InvalidTransactionId. Once, it might have been called repeatedly with FrozenTransactionId, but that is no longer the case since we no longer reset the xid on freezing. So the test for TransactionIdIsNormal(xid) appears to need rethinking since it is now mostly redundant.

So if adding a test is considered heavy, I would drop the test for TransactionIdIsNormal(xid) and replace it with a test for TransactionIdIsNormal(topxid).

Such a frequently used function is worth discussing, just as we previously optimised TransactionIdIsInProgress() and MVCC visibility routines, where we discussed what the most common routes through the functions were before deciding how to optimize them.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise

Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> With regard to this point, I second Tomas's comments.

> I also agree with Tomas' comments. I am explaining *why* it will be an
> improvement, expanding on my earlier notes.
> This function is called extremely frequently in query processing and is
> fairly efficient. I'm pointing out cases where making it even quicker makes
> sense.

I think the point is that you haven't demonstrated that this particular
patch makes it quicker.

            regards, tom lane



Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Simon Riggs
Date:
On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>> On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
>>> With regard to this point, I second Tomas's comments.
>
>> I also agree with Tomas' comments. I am explaining *why* it will be an
>> improvement, expanding on my earlier notes.
>> This function is called extremely frequently in query processing and is
>> fairly efficient. I'm pointing out cases where making it even quicker makes
>> sense.
>
> I think the point is that you haven't demonstrated that this particular
> patch makes it quicker.

Not yet, but I was trying to agree what an appropriate test would be before running it. 

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise

Re: Optimizing TransactionIdIsCurrentTransactionId()

From
Tomas Vondra
Date:
On Fri, Dec 20, 2019 at 05:57:55PM +0000, Simon Riggs wrote:
>On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Simon Riggs <simon@2ndquadrant.com> writes:
>> > On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> With regard to this point, I second Tomas's comments.
>>
>> > I also agree with Tomas' comments. I am explaining *why* it will be an
>> > improvement, expanding on my earlier notes.
>> > This function is called extremely frequently in query processing and is
>> > fairly efficient. I'm pointing out cases where making it even quicker makes
>> > sense.
>>
>> I think the point is that you haven't demonstrated that this particular
>> patch makes it quicker.
>>
>
>Not yet, but I was trying to agree what an appropriate test would be before
>running it.
>

Isn't that a bit backwards? I mean, we usually identify opportunities
for optimizations by observing poor performance with a workload, which
means that workload can serve as a test. Of course, it's possible to
notice an opportunity by eye-balling the code, but you've already said
this is supposed to improve read-only transactions.

I've actually tried to measure if/how this affects performance using a
simple read-only pgbench:

   pgbench -S -M prepared -T 60 test

I did this with a long-running transaction to prevent hint bits from
getting set. But I've not measured any difference in performance. So
either this improves a different workload, or maybe I'm doing something
silly that makes the patch irrelevant.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services