Обсуждение: [WIP] showing index maintenance on EXPLAIN
Hi, This patch implements $subject only when ANALYZE and VERBOSE are on. I made it that way because for years nobody seemed interested in this info (at least no one did it) so i decided that maybe is to much information for most people (actually btree indexes are normally very fast). But because we have GiST and GIN this became an interested piece of data (there are other cases even when using btree). Current patch doesn't have docs yet i will add them soon. -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
Вложения
On 05/07/2014 05:00 PM, Jaime Casanova wrote: > Hi, > > This patch implements $subject only when ANALYZE and VERBOSE are on. > I made it that way because for years nobody seemed interested in this > info (at least no one did it) so i decided that maybe is to much > information for most people (actually btree indexes are normally very > fast). What's "index maintenance"? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, May 7, 2014 at 7:03 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 05/07/2014 05:00 PM, Jaime Casanova wrote: >> Hi, >> >> This patch implements $subject only when ANALYZE and VERBOSE are on. >> I made it that way because for years nobody seemed interested in this >> info (at least no one did it) so i decided that maybe is to much >> information for most people (actually btree indexes are normally very >> fast). > > What's "index maintenance"? > Maybe is not the right term... i mean: the time that take to update indexes on an INSERT/UPDATE operation -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > Hi, > > This patch implements $subject only when ANALYZE and VERBOSE are on. > I made it that way because for years nobody seemed interested in this > info (at least no one did it) so i decided that maybe is to much > information for most people (actually btree indexes are normally very > fast). Why to capture only for Index Insert/Update and not for Read; is it because Read will be always fast ot implementation complexity? Why not similar timings for heap? Why can't we print when only Analyze is used with Explain, the execution time is printed with Analyze option? Could you please tell in what all kind of scenario's, do you expect it to be useful? One I could think is that if there are multiple indexes on a table and user wants to find out if any particular index is consuming more time. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: >> Hi, >> >> This patch implements $subject only when ANALYZE and VERBOSE are on. >> I made it that way because for years nobody seemed interested in this >> info (at least no one did it) so i decided that maybe is to much >> information for most people (actually btree indexes are normally very >> fast). > > > Why to capture only for Index Insert/Update and not for Read; is it > because Read will be always fast ot implementation complexity? > EXPLAIN ANALYZE already shows that on any SELECT that uses an index in some way. Or are you thinking on something else? > Why not similar timings for heap? > well "actual time" shows us total time of the operation so just deducting the time spent on triggers, indexes and planning seems like a way to get "heap modification time". yes, maybe we still need some additional data. for example, i could want to know how much time we spent extending a relation. > Why can't we print when only Analyze is used with Explain, the > execution time is printed with Analyze option? > i'm not sure the info is useful for everyone, i'm not opposed to show it all the time though > Could you please tell in what all kind of scenario's, do you expect it > to be useful? > One I could think is that if there are multiple indexes on a table and user > wants to find out if any particular index is consuming more time. > exactly my use case. consider this plan (we spent 78% of the time updating the index uniq_idx_on_text): QUERY PLAN --------------------------------------------------------------------------------------------Insert on public.t1 (actual time=0.540..0.540rows=0 loops=1) -> Result (actual time=0.046..0.049 rows=1 loops=1) Output: <some long data here>Indexuniq_idx_on_text on t1: time=0.421 rows=1Index t1_pkey on t1: time=0.027 rows=1Total runtime: 0.643 ms (6 rows) so i want to answer questions like, how much an index is hurting write performance? once i know that i can look for alternative solutions. In that vein, it was interesting to see how fastupdate affect performance in a GIN index using gin_trgm_ops (5 times slower with fastupdate=off) (fastupdate=on) Index idx_ggin on t1: time=0.418 rows=1 (fastupdate=off) Index idx_ggin on t1: time=2.205 rows=1 this is not different to showing trigger time info, which we already do -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: >>> Hi, >>> >>> This patch implements $subject only when ANALYZE and VERBOSE are on. >>> I made it that way because for years nobody seemed interested in this >>> info (at least no one did it) so i decided that maybe is to much >>> information for most people (actually btree indexes are normally very >>> fast). >> >> >> Why to capture only for Index Insert/Update and not for Read; is it >> because Read will be always fast ot implementation complexity? >> > > EXPLAIN ANALYZE already shows that on any SELECT that uses an index in > some way. Or are you thinking on something else? > >> Why not similar timings for heap? >> > > well "actual time" shows us total time of the operation so just > deducting the time spent on triggers, indexes and planning seems like > a way to get "heap modification time". > > yes, maybe we still need some additional data. for example, i could > want to know how much time we spent extending a relation. > >> Why can't we print when only Analyze is used with Explain, the >> execution time is printed with Analyze option? >> > > i'm not sure the info is useful for everyone, i'm not opposed to show > it all the time though > >> Could you please tell in what all kind of scenario's, do you expect it >> to be useful? >> One I could think is that if there are multiple indexes on a table and user >> wants to find out if any particular index is consuming more time. >> > > exactly my use case. consider this plan (we spent 78% of the time > updating the index uniq_idx_on_text): > > QUERY PLAN > -------------------------------------------------------------------------------------------- > Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1) > -> Result (actual time=0.046..0.049 rows=1 loops=1) > Output: <some long data here> > Index uniq_idx_on_text on t1: time=0.421 rows=1 > Index t1_pkey on t1: time=0.027 rows=1 > Total runtime: 0.643 ms > (6 rows) I would have expected the information about index maintenance times to be associated with the Insert node, not the plan overall. IIUC, you could have more than one such node if, for example, there are writeable CTEs involved. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, May 8, 2014 at 10:41 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: >> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >>> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: >>>> Hi, >>>> >>>> This patch implements $subject only when ANALYZE and VERBOSE are on. >>>> I made it that way because for years nobody seemed interested in this >>>> info (at least no one did it) so i decided that maybe is to much >>>> information for most people (actually btree indexes are normally very >>>> fast). >>> >>> > I would have expected the information about index maintenance times to > be associated with the Insert node, not the plan overall. IIUC, you > could have more than one such node if, for example, there are > writeable CTEs involved. > i followed how trigger info is showed in explain. I can try to change it, if that is what most people prefer. QUERY PLAN ---------------------------------------------------------------------------------------------------------------------Insert onpgbench_accounts (actual time=0.249..0.249 rows=0 loops=1) CTE results -> Insert on pgbench_accounts pgbench_accounts_1(actual time=0.152..0.159 rows=1 loops=1) -> Result (actual time=0.003..0.004 rows=1 loops=1) -> CTE Scan on results(actual time=0.165..0.174 rows=1 loops=1)Trigger trg1 on pgbench_accounts: time=0.033 calls=1Trigger trg1 on pgbench_accounts:time=0.059 calls=1Total runtime: 0.377 ms (8 rows) -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Thu, May 8, 2014 at 12:01 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> Why to capture only for Index Insert/Update and not for Read; is it >> because Read will be always fast ot implementation complexity? >> > > EXPLAIN ANALYZE already shows that on any SELECT that uses an index in > some way. Or are you thinking on something else? postgres=# explain analyze select * from t1 where c1 > 50000 and c1 <60000; QUERY PLAN -------------------------------------------------------------------------------- ----------------------------------------Index Scan using idx1_t1 on t1 (cost=0.29..983.57 rows=10014 width=508) (actua l time=0.033..11.826 rows=9999 loops=1) Index Cond: ((c1 > 50000) AND (c1 < 60000))Planning time: 2.001 msExecution time:18.486 ms (4 rows) Are you referring actual time in above print? The actual time is node execution time which in above kind of cases will be: scanning the index + scanning the heap. I think it is not same what you are planning to show for Insert/Update case. >> Why not similar timings for heap? >> > > well "actual time" shows us total time of the operation so just > deducting the time spent on triggers, indexes and planning seems like > a way to get "heap modification time". planning time doesn't include parse time, so above calculation might not give time spent in heap during statement execution. >> Why can't we print when only Analyze is used with Explain, the >> execution time is printed with Analyze option? > > i'm not sure the info is useful for everyone, i'm not opposed to show > it all the time though Okay, no problem. I think it can be done based on what most people expect. >> Could you please tell in what all kind of scenario's, do you expect it >> to be useful? >> One I could think is that if there are multiple indexes on a table and user >> wants to find out if any particular index is consuming more time. >> > > exactly my use case. consider this plan (we spent 78% of the time > updating the index uniq_idx_on_text): I think this will be useful for such cases. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Robert Haas wrote > On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova < > jaime@ > > wrote: >> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila < > amit.kapila16@ > > wrote: >>> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova < > jaime@ > > wrote: >> >> QUERY PLAN >> -------------------------------------------------------------------------------------------- >> Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1) >> -> Result (actual time=0.046..0.049 rows=1 loops=1) >> Output: > <some long data here> >> Index uniq_idx_on_text on t1: time=0.421 rows=1 >> Index t1_pkey on t1: time=0.027 rows=1 >> Total runtime: 0.643 ms >> (6 rows) > > I would have expected the information about index maintenance times to > be associated with the Insert node, not the plan overall. IIUC, you > could have more than one such node if, for example, there are > writeable CTEs involved. Even with multiple nodes I would think that typically each node would focus on a different table and so listing everything, by table, at the bottom of the explain would not cause that much of an issue. Even if a table gets hit from two CTEs, and thus have their calls and times added together, the relative (per call) numbers overall and comparative numbers between indexes on the same table would remain constant. And if you are testing indexes you would likely want to keep the queries the same. Not having to scan through and handle multiple locations for index timings will make interpretation, parsing, and presentation much easier. It wouldn't preclude having total timings and node timings in the future if the demand is sufficient. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/WIP-showing-index-maintenance-on-EXPLAIN-tp5803106p5803337.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 05/08/2014 08:04 AM, Jaime Casanova wrote: > Maybe is not the right term... i mean: the time that take to update > indexes on an INSERT/UPDATE operation That'd be a big plus IMO, especially for expensive GiST or GIN index updates. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 8 May 2014 01:00, Jaime Casanova <jaime@2ndquadrant.com> wrote:
Hi,
This patch implements $subject only when ANALYZE and VERBOSE are on.
I made it that way because for years nobody seemed interested in this
info (at least no one did it) so i decided that maybe is to much
information for most people (actually btree indexes are normally very
fast).
But because we have GiST and GIN this became an interested piece of
data (there are other cases even when using btree).
Current patch doesn't have docs yet i will add them soon.
+1
I've asked for something like this previously: http://www.postgresql.org/message-id/CAA-aLv6O7QKdtZvacDJ+R_YjuLGCQTGzj6JXZTPmCnnxroX=7g@mail.gmail.com
Thom
On Thu, May 8, 2014 at 10:44 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, May 8, 2014 at 12:01 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote: >> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >>> >>> Why to capture only for Index Insert/Update and not for Read; is it >>> because Read will be always fast ot implementation complexity? >>> >> EXPLAIN ANALYZE already shows that on any SELECT that uses an index in >> some way. Or are you thinking on something else? > [...] > > Are you referring actual time in above print? > > The actual time is node execution time which in above kind of cases will > be: scanning the index + scanning the heap. I think it is not same what > you are planning to show for Insert/Update case. > ah! good point! my current case is because of write performance, but will look at it a little -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Thu, May 8, 2014 at 10:00 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > Hi, > > This patch implements $subject only when ANALYZE and VERBOSE are on. > I made it that way because for years nobody seemed interested in this > info (at least no one did it) so i decided that maybe is to much > information for most people (actually btree indexes are normally very > fast). > > But because we have GiST and GIN this became an interested piece of > data (there are other cases even when using btree). > > Current patch doesn't have docs yet i will add them soon. This patch provides the Insertion time of an index operation. This information is useful to the administrator for reorganizing the indexes based on the insertion time. Quick review: Applies to Head. Regress test is passed. Coding is fine. Minor comments: There is no need of printing the index insertion time as zero in case of hot update operations. Please correct the same. Add the documentation changes. Regards, Hari Babu Fujitsu Australia