Обсуждение: Auto-analyse on insert operations

Поиск
Список
Период
Сортировка

Auto-analyse on insert operations

От
Bertrand Roos
Дата:
Hello,

I try to configure auto-analyse task with postgresql 9.4.
I have the following configuration (default configuration):
track_counts = on
autovacuum = on
log_autovacuum_min_duration = -1
autovacuum_max_workers = 3
autovacuum_naptime = 300s
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.2
autovacuum_freeze_max_age = 200000000
autovacuum_multixact_freeze_max_age = 400000000
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1

With this configuration, I can observe that some tables are
auto-analysed, but some others are not. Even if there are millions of
insert operations on an empty table (all tables are in cluster mode).
In fact it seems that tables with update operations are the only ones
that are auto-analysed.
I'm quite suprised because the documentation says that daemon check the
count of insert, update and delete operations.
What could it be the reason ? Why tables which have only update
operation, aren't analysed ?
Are update operations really taken into account ?

Thanks,

Bertrand



Re: Auto-analyse on insert operations

От
Adrian Klaver
Дата:
On 11/04/2015 05:32 AM, Bertrand Roos wrote:
> Hello,
>
> I try to configure auto-analyse task with postgresql 9.4.
> I have the following configuration (default configuration):
> track_counts = on
> autovacuum = on
> log_autovacuum_min_duration = -1
> autovacuum_max_workers = 3
> autovacuum_naptime = 300s
> autovacuum_vacuum_threshold = 50
> autovacuum_analyze_threshold = 50
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_analyze_scale_factor = 0.2
> autovacuum_freeze_max_age = 200000000
> autovacuum_multixact_freeze_max_age = 400000000
> autovacuum_vacuum_cost_delay = 20ms
> autovacuum_vacuum_cost_limit = -1
>
> With this configuration, I can observe that some tables are
> auto-analysed, but some others are not. Even if there are millions of
> insert operations on an empty table (all tables are in cluster mode).
> In fact it seems that tables with update operations are the only ones
> that are auto-analysed.
> I'm quite suprised because the documentation says that daemon check the
> count of insert, update and delete operations.
> What could it be the reason ? Why tables which have only update
> operation, aren't analysed ?
> Are update operations really taken into account ?

Somewhere you switched actions. You start by saying updated tables are
being analyzed and end by saying they are not, or are the last sentences
supposed to refer to inserts?

>
> Thanks,
>
> Bertrand
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Auto-analyse on insert operations

От
Bill Moran
Дата:
On Wed, 4 Nov 2015 14:32:37 +0100
Bertrand Roos <bertrand.roos@areal.fr> wrote:
>
> I try to configure auto-analyse task with postgresql 9.4.
> I have the following configuration (default configuration):
> track_counts = on
> autovacuum = on
> log_autovacuum_min_duration = -1
> autovacuum_max_workers = 3
> autovacuum_naptime = 300s
> autovacuum_vacuum_threshold = 50
> autovacuum_analyze_threshold = 50
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_analyze_scale_factor = 0.2
> autovacuum_freeze_max_age = 200000000
> autovacuum_multixact_freeze_max_age = 400000000
> autovacuum_vacuum_cost_delay = 20ms
> autovacuum_vacuum_cost_limit = -1
>
> With this configuration, I can observe that some tables are
> auto-analysed, but some others are not. Even if there are millions of
> insert operations on an empty table (all tables are in cluster mode).
> In fact it seems that tables with update operations are the only ones
> that are auto-analysed.
> I'm quite suprised because the documentation says that daemon check the
> count of insert, update and delete operations.
> What could it be the reason ? Why tables which have only update
> operation, aren't analysed ?
> Are update operations really taken into account ?

Given that autoanalyze is pretty critical to the way the system functions,
it's unlikely that it just doesn't work (someone else would have noticed).

A more likely scenario is that you've found some extremely obscure edge
case. If that's the case, you're going to have to give very specific
details as to how you're testing it before anyone is liable to be able
to help you.

I get the impression that you're somewhat new to Postgres, in which case
it's very likely that the problem is that you're not testing the situation
correctly. In that case, we're going to need specific details on how you're
observing that tables are or are not being analysed.

As a wild-guess theory: the process that does the analyze only wakes up
to check tables every 5 minutes (based on the config you show) ... so are
you doing the inserts then checking the table without leaving enough time
in between for the system to wake up and notice the change?

--
Bill Moran


Re: Auto-analyse on insert operations

От
Bertrand Roos
Дата:
Le 04/11/2015 14:45, Adrian Klaver a écrit :
> On 11/04/2015 05:32 AM, Bertrand Roos wrote:
>> Hello,
>>
>> I try to configure auto-analyse task with postgresql 9.4.
>> I have the following configuration (default configuration):
>> track_counts = on
>> autovacuum = on
>> log_autovacuum_min_duration = -1
>> autovacuum_max_workers = 3
>> autovacuum_naptime = 300s
>> autovacuum_vacuum_threshold = 50
>> autovacuum_analyze_threshold = 50
>> autovacuum_vacuum_scale_factor = 0.2
>> autovacuum_analyze_scale_factor = 0.2
>> autovacuum_freeze_max_age = 200000000
>> autovacuum_multixact_freeze_max_age = 400000000
>> autovacuum_vacuum_cost_delay = 20ms
>> autovacuum_vacuum_cost_limit = -1
>>
>> With this configuration, I can observe that some tables are
>> auto-analysed, but some others are not. Even if there are millions of
>> insert operations on an empty table (all tables are in cluster mode).
>> In fact it seems that tables with update operations are the only ones
>> that are auto-analysed.
>> I'm quite suprised because the documentation says that daemon check the
>> count of insert, update and delete operations.
>> What could it be the reason ? Why tables which have only update
>> operation, aren't analysed ?
>> Are update operations really taken into account ?
>
> Somewhere you switched actions. You start by saying updated tables are
> being analyzed and end by saying they are not, or are the last
> sentences supposed to refer to inserts?
>
>>
>> Thanks,
>>
>> Bertrand
>>
>>
>>
>
>
Yes the last sentences referred to inserts.
To make it clear, tables with all kinds of operations are auto-analysed.
Table with only insert operations are never auto-analysed.


Re: Auto-analyse on insert operations

От
Adrian Klaver
Дата:
On 11/04/2015 06:35 AM, Bertrand Roos wrote:
> Le 04/11/2015 14:45, Adrian Klaver a écrit :
>> On 11/04/2015 05:32 AM, Bertrand Roos wrote:
>>> Hello,
>>>
>>> I try to configure auto-analyse task with postgresql 9.4.
>>> I have the following configuration (default configuration):
>>> track_counts = on
>>> autovacuum = on
>>> log_autovacuum_min_duration = -1
>>> autovacuum_max_workers = 3
>>> autovacuum_naptime = 300s
>>> autovacuum_vacuum_threshold = 50
>>> autovacuum_analyze_threshold = 50
>>> autovacuum_vacuum_scale_factor = 0.2
>>> autovacuum_analyze_scale_factor = 0.2
>>> autovacuum_freeze_max_age = 200000000
>>> autovacuum_multixact_freeze_max_age = 400000000
>>> autovacuum_vacuum_cost_delay = 20ms
>>> autovacuum_vacuum_cost_limit = -1
>>>
>>> With this configuration, I can observe that some tables are
>>> auto-analysed, but some others are not. Even if there are millions of
>>> insert operations on an empty table (all tables are in cluster mode).
>>> In fact it seems that tables with update operations are the only ones
>>> that are auto-analysed.
>>> I'm quite suprised because the documentation says that daemon check the
>>> count of insert, update and delete operations.
>>> What could it be the reason ? Why tables which have only update
>>> operation, aren't analysed ?
>>> Are update operations really taken into account ?
>>
>> Somewhere you switched actions. You start by saying updated tables are
>> being analyzed and end by saying they are not, or are the last
>> sentences supposed to refer to inserts?
>>
>>>
>>> Thanks,
>>>
>>> Bertrand
>>>
>>>
>>>
>>
>>
> Yes the last sentences referred to inserts.
> To make it clear, tables with all kinds of operations are auto-analysed.
> Table with only insert operations are never auto-analysed.

How are you determining this?

>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Auto-analyse on insert operations

От
Bertrand Roos
Дата:

Le 04/11/2015 14:55, Bill Moran a écrit :
> On Wed, 4 Nov 2015 14:32:37 +0100
> Bertrand Roos <bertrand.roos@areal.fr> wrote:
>> I try to configure auto-analyse task with postgresql 9.4.
>> I have the following configuration (default configuration):
>> track_counts = on
>> autovacuum = on
>> log_autovacuum_min_duration = -1
>> autovacuum_max_workers = 3
>> autovacuum_naptime = 300s
>> autovacuum_vacuum_threshold = 50
>> autovacuum_analyze_threshold = 50
>> autovacuum_vacuum_scale_factor = 0.2
>> autovacuum_analyze_scale_factor = 0.2
>> autovacuum_freeze_max_age = 200000000
>> autovacuum_multixact_freeze_max_age = 400000000
>> autovacuum_vacuum_cost_delay = 20ms
>> autovacuum_vacuum_cost_limit = -1
>>
>> With this configuration, I can observe that some tables are
>> auto-analysed, but some others are not. Even if there are millions of
>> insert operations on an empty table (all tables are in cluster mode).
>> In fact it seems that tables with update operations are the only ones
>> that are auto-analysed.
>> I'm quite suprised because the documentation says that daemon check the
>> count of insert, update and delete operations.
>> What could it be the reason ? Why tables which have only update
>> operation, aren't analysed ?
>> Are update operations really taken into account ?
> Given that autoanalyze is pretty critical to the way the system functions,
> it's unlikely that it just doesn't work (someone else would have noticed).
>
> A more likely scenario is that you've found some extremely obscure edge
> case. If that's the case, you're going to have to give very specific
> details as to how you're testing it before anyone is liable to be able
> to help you.
>
> I get the impression that you're somewhat new to Postgres, in which case
> it's very likely that the problem is that you're not testing the situation
> correctly. In that case, we're going to need specific details on how you're
> observing that tables are or are not being analysed.
>
> As a wild-guess theory: the process that does the analyze only wakes up
> to check tables every 5 minutes (based on the config you show) ... so are
> you doing the inserts then checking the table without leaving enough time
> in between for the system to wake up and notice the change?
>
Thanks for your answer Bill.
Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
something wrong. But I did my test on a more than 1 day duration, so
it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
seconds during 36 hours).
I can't give all the details of this test because it is to complicated
with triggers and partman (and your objective is not to solve
configuration issues of others).

In fact, I was telling the question because I have read on some forums
that the auto vacuum deamon only count dead tuple so only update and
delete operations can cause the scheduling of auto-analyse.
So if it's the case it perfectly explain why my test case doesn't work.
But in the other hand the documentation says that delete, update and
insert operations are counted.
Is it an know issue that insert operations are not counted for the
trigger of auto-analyse ?

If it's not, I can try to reproduce this weird behaviour with a simpler
test and give you all the details of the test.

Bertrand




Re: Auto-analyse on insert operations

От
Tom Lane
Дата:
Bertrand Roos <bertrand.roos@areal.fr> writes:
> In fact, I was telling the question because I have read on some forums
> that the auto vacuum deamon only count dead tuple so only update and
> delete operations can cause the scheduling of auto-analyse.

That might have been true years ago, don't remember at the moment.
But in modern PG versions auto-vacuum is driven by the estimated number
of dead tuples while auto-analyze is driven by the total count of
inserts+updates+deletes.  It's easy to show in a standalone experiment
that auto-analyze *will* run against a table that's only had inserts.

With the numbers you're showing, auto-analyze should trigger once the
table gets to 20% new tuples.  It would be interesting to see the
pg_stat_all_tables values for one of your problematic tables.

            regards, tom lane


Re: Auto-analyse on insert operations

От
Adrian Klaver
Дата:
On 11/04/2015 07:43 AM, Bertrand Roos wrote:
>
>
> Le 04/11/2015 14:55, Bill Moran a écrit :
>> On Wed, 4 Nov 2015 14:32:37 +0100
>> Bertrand Roos <bertrand.roos@areal.fr> wrote:
>>> I try to configure auto-analyse task with postgresql 9.4.
>>> I have the following configuration (default configuration):
>>> track_counts = on
>>> autovacuum = on
>>> log_autovacuum_min_duration = -1
>>> autovacuum_max_workers = 3
>>> autovacuum_naptime = 300s
>>> autovacuum_vacuum_threshold = 50
>>> autovacuum_analyze_threshold = 50
>>> autovacuum_vacuum_scale_factor = 0.2
>>> autovacuum_analyze_scale_factor = 0.2
>>> autovacuum_freeze_max_age = 200000000
>>> autovacuum_multixact_freeze_max_age = 400000000
>>> autovacuum_vacuum_cost_delay = 20ms
>>> autovacuum_vacuum_cost_limit = -1
>>>
>>> With this configuration, I can observe that some tables are
>>> auto-analysed, but some others are not. Even if there are millions of
>>> insert operations on an empty table (all tables are in cluster mode).
>>> In fact it seems that tables with update operations are the only ones
>>> that are auto-analysed.
>>> I'm quite suprised because the documentation says that daemon check the
>>> count of insert, update and delete operations.
>>> What could it be the reason ? Why tables which have only update
>>> operation, aren't analysed ?
>>> Are update operations really taken into account ?
>> Given that autoanalyze is pretty critical to the way the system
>> functions,
>> it's unlikely that it just doesn't work (someone else would have
>> noticed).
>>
>> A more likely scenario is that you've found some extremely obscure edge
>> case. If that's the case, you're going to have to give very specific
>> details as to how you're testing it before anyone is liable to be able
>> to help you.
>>
>> I get the impression that you're somewhat new to Postgres, in which case
>> it's very likely that the problem is that you're not testing the
>> situation
>> correctly. In that case, we're going to need specific details on how
>> you're
>> observing that tables are or are not being analysed.
>>
>> As a wild-guess theory: the process that does the analyze only wakes up
>> to check tables every 5 minutes (based on the config you show) ... so are
>> you doing the inserts then checking the table without leaving enough time
>> in between for the system to wake up and notice the change?
>>
> Thanks for your answer Bill.
> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
> something wrong. But I did my test on a more than 1 day duration, so
> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
> seconds during 36 hours).
> I can't give all the details of this test because it is to complicated
> with triggers and partman (and your objective is not to solve
> configuration issues of others).
>
> In fact, I was telling the question because I have read on some forums
> that the auto vacuum deamon only count dead tuple so only update and
> delete operations can cause the scheduling of auto-analyse.
> So if it's the case it perfectly explain why my test case doesn't work.
> But in the other hand the documentation says that delete, update and
> insert operations are counted.
> Is it an know issue that insert operations are not counted for the
> trigger of auto-analyse ?

No, see below:

http://www.postgresql.org/docs/9.4/interactive/routine-vacuuming.html#AUTOVACUUM

23.1.3. Updating Planner Statistics

"The autovacuum daemon, if enabled, will automatically issue ANALYZE
commands whenever the content of a table has changed sufficiently.
However, administrators might prefer to rely on manually-scheduled
ANALYZE operations, particularly if it is known that update activity on
a table will not affect the statistics of "interesting" columns. The
daemon schedules ANALYZE strictly as a function of the number of rows
inserted or updated; it has no knowledge of whether that will lead to
meaningful statistical changes."


Is the partman you refer to this?:

https://github.com/keithf4/pg_partman

Can you give an outline view of what you are doing and how you are
determining the status of analyze?

>
> If it's not, I can try to reproduce this weird behaviour with a simpler
> test and give you all the details of the test.
>
> Bertrand
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Auto-analyse on insert operations

От
Bill Moran
Дата:
On Wed, 4 Nov 2015 16:43:57 +0100
Bertrand Roos <bertrand.roos@areal.fr> wrote:
>
> Le 04/11/2015 14:55, Bill Moran a écrit :
> > On Wed, 4 Nov 2015 14:32:37 +0100
> > Bertrand Roos <bertrand.roos@areal.fr> wrote:
> >> I try to configure auto-analyse task with postgresql 9.4.
> >> I have the following configuration (default configuration):
> >> track_counts = on
> >> autovacuum = on
> >> log_autovacuum_min_duration = -1
> >> autovacuum_max_workers = 3
> >> autovacuum_naptime = 300s
> >> autovacuum_vacuum_threshold = 50
> >> autovacuum_analyze_threshold = 50
> >> autovacuum_vacuum_scale_factor = 0.2
> >> autovacuum_analyze_scale_factor = 0.2
> >> autovacuum_freeze_max_age = 200000000
> >> autovacuum_multixact_freeze_max_age = 400000000
> >> autovacuum_vacuum_cost_delay = 20ms
> >> autovacuum_vacuum_cost_limit = -1
> >>
> >> With this configuration, I can observe that some tables are
> >> auto-analysed, but some others are not. Even if there are millions of
> >> insert operations on an empty table (all tables are in cluster mode).
> >> In fact it seems that tables with update operations are the only ones
> >> that are auto-analysed.
> >> I'm quite suprised because the documentation says that daemon check the
> >> count of insert, update and delete operations.
> >> What could it be the reason ? Why tables which have only update
> >> operation, aren't analysed ?
> >> Are update operations really taken into account ?
> > Given that autoanalyze is pretty critical to the way the system functions,
> > it's unlikely that it just doesn't work (someone else would have noticed).
> >
> > A more likely scenario is that you've found some extremely obscure edge
> > case. If that's the case, you're going to have to give very specific
> > details as to how you're testing it before anyone is liable to be able
> > to help you.
> >
> > I get the impression that you're somewhat new to Postgres, in which case
> > it's very likely that the problem is that you're not testing the situation
> > correctly. In that case, we're going to need specific details on how you're
> > observing that tables are or are not being analysed.
> >
> > As a wild-guess theory: the process that does the analyze only wakes up
> > to check tables every 5 minutes (based on the config you show) ... so are
> > you doing the inserts then checking the table without leaving enough time
> > in between for the system to wake up and notice the change?
> >
> Thanks for your answer Bill.
> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
> something wrong. But I did my test on a more than 1 day duration, so
> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
> seconds during 36 hours).
> I can't give all the details of this test because it is to complicated
> with triggers and partman (and your objective is not to solve
> configuration issues of others).

Others have answered some of your other questions, so I'll just throw
out another possibility: have the per-table analyze settings been altered
on the table(s) that are behaving badly? See
http://www.postgresql.org/docs/9.4/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS

Attaching the output of
pg_dump -s -t $table_name -U postgres $database_name
will probably go a long way toward getting more targeted assistance.
(substitute the actual database name, and the name of a table that is
giving you trouble)

In addition, the output of
SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name';
(Again, substitute an actual table name that's giving you trouble,
preferrably the same table as from the pg_dump)

--
Bill Moran


Re: Auto-analyse on insert operations

От
Bertrand Roos
Дата:

Le 04/11/2015 16:57, Adrian Klaver a écrit :
> On 11/04/2015 07:43 AM, Bertrand Roos wrote:
>>
>>
>> Le 04/11/2015 14:55, Bill Moran a écrit :
>>> On Wed, 4 Nov 2015 14:32:37 +0100
>>> Bertrand Roos <bertrand.roos@areal.fr> wrote:
>>>> I try to configure auto-analyse task with postgresql 9.4.
>>>> I have the following configuration (default configuration):
>>>> track_counts = on
>>>> autovacuum = on
>>>> log_autovacuum_min_duration = -1
>>>> autovacuum_max_workers = 3
>>>> autovacuum_naptime = 300s
>>>> autovacuum_vacuum_threshold = 50
>>>> autovacuum_analyze_threshold = 50
>>>> autovacuum_vacuum_scale_factor = 0.2
>>>> autovacuum_analyze_scale_factor = 0.2
>>>> autovacuum_freeze_max_age = 200000000
>>>> autovacuum_multixact_freeze_max_age = 400000000
>>>> autovacuum_vacuum_cost_delay = 20ms
>>>> autovacuum_vacuum_cost_limit = -1
>>>>
>>>> With this configuration, I can observe that some tables are
>>>> auto-analysed, but some others are not. Even if there are millions of
>>>> insert operations on an empty table (all tables are in cluster mode).
>>>> In fact it seems that tables with update operations are the only ones
>>>> that are auto-analysed.
>>>> I'm quite suprised because the documentation says that daemon check
>>>> the
>>>> count of insert, update and delete operations.
>>>> What could it be the reason ? Why tables which have only update
>>>> operation, aren't analysed ?
>>>> Are update operations really taken into account ?
>>> Given that autoanalyze is pretty critical to the way the system
>>> functions,
>>> it's unlikely that it just doesn't work (someone else would have
>>> noticed).
>>>
>>> A more likely scenario is that you've found some extremely obscure edge
>>> case. If that's the case, you're going to have to give very specific
>>> details as to how you're testing it before anyone is liable to be able
>>> to help you.
>>>
>>> I get the impression that you're somewhat new to Postgres, in which
>>> case
>>> it's very likely that the problem is that you're not testing the
>>> situation
>>> correctly. In that case, we're going to need specific details on how
>>> you're
>>> observing that tables are or are not being analysed.
>>>
>>> As a wild-guess theory: the process that does the analyze only wakes up
>>> to check tables every 5 minutes (based on the config you show) ...
>>> so are
>>> you doing the inserts then checking the table without leaving enough
>>> time
>>> in between for the system to wake up and notice the change?
>>>
>> Thanks for your answer Bill.
>> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
>> something wrong. But I did my test on a more than 1 day duration, so
>> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
>> seconds during 36 hours).
>> I can't give all the details of this test because it is to complicated
>> with triggers and partman (and your objective is not to solve
>> configuration issues of others).
>>
>> In fact, I was telling the question because I have read on some forums
>> that the auto vacuum deamon only count dead tuple so only update and
>> delete operations can cause the scheduling of auto-analyse.
>> So if it's the case it perfectly explain why my test case doesn't work.
>> But in the other hand the documentation says that delete, update and
>> insert operations are counted.
>> Is it an know issue that insert operations are not counted for the
>> trigger of auto-analyse ?
>
> No, see below:
>
> http://www.postgresql.org/docs/9.4/interactive/routine-vacuuming.html#AUTOVACUUM
>
>
> 23.1.3. Updating Planner Statistics
>
> "The autovacuum daemon, if enabled, will automatically issue ANALYZE
> commands whenever the content of a table has changed sufficiently.
> However, administrators might prefer to rely on manually-scheduled
> ANALYZE operations, particularly if it is known that update activity
> on a table will not affect the statistics of "interesting" columns.
> The daemon schedules ANALYZE strictly as a function of the number of
> rows inserted or updated; it has no knowledge of whether that will
> lead to meaningful statistical changes."
>
>
> Is the partman you refer to this?:
>
> https://github.com/keithf4/pg_partman
>
> Can you give an outline view of what you are doing and how you are
> determining the status of analyze?
>
>>
>> If it's not, I can try to reproduce this weird behaviour with a simpler
>> test and give you all the details of the test.
>>
>> Bertrand
>>
>>
>>
>>
>
>
I have done multiple tests trying to reproduce my issue and finally find
a partial explanation.

My non-working use case is :
With a database with configuration by default (autovacuum=on,
autovacuum_max_workers=3 ...)
A table ('table_name') partioned with partman with a child table per
month (@Adrian Klaver, yes it's https://github.com/keithf4/pg_partman)
A remote deamon that calls an analyze every day on the table 'table_name'
A remote application that calls multiple stored procedures doing an
insert operation on 'table_name' (785 inserts each 30s), with no delete,
no update operation
This stored procedures are sometimes blocked by a short lock (they are
doing some update operations on a cache table of 'table_name')

It happened that :
The current child table has no statistics :
- SELECT * FROM pg_stat WHERE tablename = '$table_name_month'; => empty row
- SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name_month';
=> statistics are good (coherent with values I should have) but I saw no
value for last_autoanalyze or last_analyze (even if the test was running
for a few days and the table start with no data and grow to 6 million rows)

The behaviour of the auto-analyze I have observed :
Insert operations on a table trigger auto-analyze (with track_counts=on).
Stored procedure on a table trigger auto-analyze (even if
track_functions=off)
Insert operations on a partitioned table trigger auto-analyze.
Insert operations on a daily analyze table may not trigger auto-analyze,
because an analyze reset the count of operation for auto-analyze.
An analyze on a partitioned table update statistics of the parent table
but not for children tables.
An analyze on a partitioned table do not reset the count of operations
for auto-analyze of children tables.

I may not be able yet to explain the reason why the auto-analyze is
rarely not triggered.
But at least I found a way to fix pg_stat of the current table.

@Bill Moran, Are you sure that a dump of the table could help you ? The
values of the table are correct, the main issue is that statistics are
becoming invalid and so postgres is choosing dummy request paths.

Bertrand



Re: Auto-analyse on insert operations

От
Adrian Klaver
Дата:
On 11/06/2015 09:13 AM, Bertrand Roos wrote:
>
>
> Le 04/11/2015 16:57, Adrian Klaver a écrit :
>> On 11/04/2015 07:43 AM, Bertrand Roos wrote:
>>>
>>>
>>> Le 04/11/2015 14:55, Bill Moran a écrit :
>>>> On Wed, 4 Nov 2015 14:32:37 +0100
>>>> Bertrand Roos <bertrand.roos@areal.fr> wrote:
>>>>> I try to configure auto-analyse task with postgresql 9.4.
>>>>> I have the following configuration (default configuration):
>>>>> track_counts = on
>>>>> autovacuum = on
>>>>> log_autovacuum_min_duration = -1
>>>>> autovacuum_max_workers = 3
>>>>> autovacuum_naptime = 300s
>>>>> autovacuum_vacuum_threshold = 50
>>>>> autovacuum_analyze_threshold = 50
>>>>> autovacuum_vacuum_scale_factor = 0.2
>>>>> autovacuum_analyze_scale_factor = 0.2
>>>>> autovacuum_freeze_max_age = 200000000
>>>>> autovacuum_multixact_freeze_max_age = 400000000
>>>>> autovacuum_vacuum_cost_delay = 20ms
>>>>> autovacuum_vacuum_cost_limit = -1
>>>>>
>>>>> With this configuration, I can observe that some tables are
>>>>> auto-analysed, but some others are not. Even if there are millions of
>>>>> insert operations on an empty table (all tables are in cluster mode).
>>>>> In fact it seems that tables with update operations are the only ones
>>>>> that are auto-analysed.
>>>>> I'm quite suprised because the documentation says that daemon check
>>>>> the
>>>>> count of insert, update and delete operations.
>>>>> What could it be the reason ? Why tables which have only update
>>>>> operation, aren't analysed ?
>>>>> Are update operations really taken into account ?
>>>> Given that autoanalyze is pretty critical to the way the system
>>>> functions,
>>>> it's unlikely that it just doesn't work (someone else would have
>>>> noticed).
>>>>
>>>> A more likely scenario is that you've found some extremely obscure edge
>>>> case. If that's the case, you're going to have to give very specific
>>>> details as to how you're testing it before anyone is liable to be able
>>>> to help you.
>>>>
>>>> I get the impression that you're somewhat new to Postgres, in which
>>>> case
>>>> it's very likely that the problem is that you're not testing the
>>>> situation
>>>> correctly. In that case, we're going to need specific details on how
>>>> you're
>>>> observing that tables are or are not being analysed.
>>>>
>>>> As a wild-guess theory: the process that does the analyze only wakes up
>>>> to check tables every 5 minutes (based on the config you show) ...
>>>> so are
>>>> you doing the inserts then checking the table without leaving enough
>>>> time
>>>> in between for the system to wake up and notice the change?
>>>>
>>> Thanks for your answer Bill.
>>> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
>>> something wrong. But I did my test on a more than 1 day duration, so
>>> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
>>> seconds during 36 hours).
>>> I can't give all the details of this test because it is to complicated
>>> with triggers and partman (and your objective is not to solve
>>> configuration issues of others).
>>>
>>> In fact, I was telling the question because I have read on some forums
>>> that the auto vacuum deamon only count dead tuple so only update and
>>> delete operations can cause the scheduling of auto-analyse.
>>> So if it's the case it perfectly explain why my test case doesn't work.
>>> But in the other hand the documentation says that delete, update and
>>> insert operations are counted.
>>> Is it an know issue that insert operations are not counted for the
>>> trigger of auto-analyse ?
>>
>> No, see below:
>>
>> http://www.postgresql.org/docs/9.4/interactive/routine-vacuuming.html#AUTOVACUUM
>>
>>
>> 23.1.3. Updating Planner Statistics
>>
>> "The autovacuum daemon, if enabled, will automatically issue ANALYZE
>> commands whenever the content of a table has changed sufficiently.
>> However, administrators might prefer to rely on manually-scheduled
>> ANALYZE operations, particularly if it is known that update activity
>> on a table will not affect the statistics of "interesting" columns.
>> The daemon schedules ANALYZE strictly as a function of the number of
>> rows inserted or updated; it has no knowledge of whether that will
>> lead to meaningful statistical changes."
>>
>>
>> Is the partman you refer to this?:
>>
>> https://github.com/keithf4/pg_partman
>>
>> Can you give an outline view of what you are doing and how you are
>> determining the status of analyze?
>>
>>>
>>> If it's not, I can try to reproduce this weird behaviour with a simpler
>>> test and give you all the details of the test.
>>>
>>> Bertrand
>>>
>>>
>>>
>>>
>>
>>
> I have done multiple tests trying to reproduce my issue and finally find
> a partial explanation.
>
> My non-working use case is :
> With a database with configuration by default (autovacuum=on,
> autovacuum_max_workers=3 ...)
> A table ('table_name') partioned with partman with a child table per
> month (@Adrian Klaver, yes it's https://github.com/keithf4/pg_partman)
> A remote deamon that calls an analyze every day on the table 'table_name'

So the above is a manual analyze run by another program correct?

> A remote application that calls multiple stored procedures doing an
> insert operation on 'table_name' (785 inserts each 30s), with no delete,
> no update operation

So a batch of 785 inserts over 30 seconds?

> This stored procedures are sometimes blocked by a short lock (they are
> doing some update operations on a cache table of 'table_name')

I not following the above.

You have another table that 'shadows' the 'table_name' table?

>
> It happened that :
> The current child table has no statistics :
> - SELECT * FROM pg_stat WHERE tablename = '$table_name_month'; => empty row

You are actually talking about pg_stats above and below correct?

> - SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name_month';
> => statistics are good (coherent with values I should have) but I saw no
> value for last_autoanalyze or last_analyze (even if the test was running
> for a few days and the table start with no data and grow to 6 million rows)

So what user are you running the above as and is it one that has
permissions on 'table_name'?

>
> The behaviour of the auto-analyze I have observed :
> Insert operations on a table trigger auto-analyze (with track_counts=on).
> Stored procedure on a table trigger auto-analyze (even if
> track_functions=off)

Well:

http://www.postgresql.org/docs/9.4/static/runtime-config-statistics.html#GUC-TRACK-FUNCTIONS

  track_functions (enum)

     Enables tracking of function call counts and time used. Specify pl
to track only procedural-language functions, all to also track SQL and C
language functions. The default is none, which disables function
statistics tracking. Only superusers can change this setting.

So it tracks(or not) function usage not what the function does. If the
function is doing INSERT/UPDATE/DELETEs then that is covered by
track_counts.

> Insert operations on a partitioned table trigger auto-analyze.
> Insert operations on a daily analyze table may not trigger auto-analyze,
> because an analyze reset the count of operation for auto-analyze.
> An analyze on a partitioned table update statistics of the parent table
> but not for children tables.
> An analyze on a partitioned table do not reset the count of operations
> for auto-analyze of children tables.

I do not do enough partitioning to be of much help here. One thing I
would say is that it might be helpful to see the schema partman sets up
when it does the partitioning.

>
> I may not be able yet to explain the reason why the auto-analyze is
> rarely not triggered.
> But at least I found a way to fix pg_stat of the current table.
>
> @Bill Moran, Are you sure that a dump of the table could help you ? The
> values of the table are correct, the main issue is that statistics are
> becoming invalid and so postgres is choosing dummy request paths.
>
> Bertrand
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com