Re: [PATCH] Proposal to Enable/Disable Index using ALTER INDEX

Поиск
Список
Период
Сортировка
От Shayon Mukherjee
Тема Re: [PATCH] Proposal to Enable/Disable Index using ALTER INDEX
Дата
Msg-id 009A812B-5C47-469E-9C4F-A3547C849149@gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Proposal to Enable/Disable Index using ALTER INDEX  (Robert Treat <rob@xzilla.net>)
Список pgsql-hackers

> On Jul 15, 2025, at 1:58 PM, Robert Treat <rob@xzilla.net> wrote:
>
> On Tue, Jul 15, 2025 at 8:19 AM Shayon Mukherjee <shayonj@gmail.com> wrote:
>> On Jun 23, 2025, at 10:14 AM, Robert Treat <rob@xzilla.net> wrote:
>> Glad to hear you are still interested, slightly disheartened by the
>> general lack of concern around operational safety in this thread. I
>> actually think what you have done covers a lot of the ground for
>> multiple implementations, so I'm optimistic we can get something for
>> 19.
>>
>> Just for my own learning and mental model - what would be a good way to understand the change that wasn’t
operationallysafe? 
>>
>
> Generally speaking, the two biggest factors for operational safety are
> the ability to slowly ramp up changes in a controlled fashion, and
> conversely the ability to quickly reverse them. On its surface, the
> ALTER feature looks like it passes both of these tests because (in
> simple cases) it appears better than drop/create index process alone;
> indeed, the ability to "turn off" an index before dropping it feels
> like a slower roll out than dropping it, and the ability to "turn it
> back on" seems like a much quicker reversal than having to recreate
> the index. Our problem is that this only gives the appearance of
> safety without having provided any significant improvement in system
> safety, especially in more complex and/or demanding setups. With
> regards to roll out specifically, the ALTER method is no safer than
> drop index because both use DDL which means they are both open to
> blocking or being blocked by conflicting queries, which increase
> operational risk within the system. Similarly, the nature of the DDL
> change also requires that all sessions be impacted everywhere at once;
> there is no way to slowly roll the change to some segment of the
> database or some specific workload within the system. So it fails the
> first test. With regards to the ability to quickly reverse the change,
> it does beat the need to rebuild an index, but that only helps in a
> very small subset of the typical use cases for this feature; ie where
> you are concerned that your server might get "swamped" by poorly
> performing queries while the index rebuilds. But that's a pretty low
> level version of the problem; on very busy systems and/or system with
> delicately balanced buffer caching, even a small pause measured in
> seconds could be enough to bring a system down, and again our use of
> DDL opens us up to delays from conflicting queries, untimely
> wraparound vacuums, concurrent WAL traffic in the case of wanting to
> do this across replica trees (which you can't not do). So we generally
> fail the second test for a large portion of the use cases involved.
> And maybe that would be ok if we didn't have a way to solve this
> problem that doesn't fail these tests, but we do, which is through
> using a GUC.
>
>> I was thinking about this some more over the weekend and it does seem
>> like you can't get away from doing something with DDL; even though it
>> is the wrong mental model... like when your AC is running but you
>> don't think it is cool enough, so you turn it down farther, as if it
>> would blow colder air... but that isn't how AC actually work... it
>> seems you can't eliminate the desire for this mental model entirely.
>> Which to be clear, I am not against, it's just a bad tool for the hard
>> cases, but not in every case. Anyway, if I were picking this up, I
>> would separate out the two ideas; as I laid out in my email to David,
>> the GUC solution can stand on it's own without the DDL implementation,
>> and I would do that first, and then add a simplified DDL
>> implementation after the fact. Of course it could be done the other
>> way around, but I think you're more likely to land on the correct GUC
>> implementation if it isn't mixed up with DDL, and the best way to
>> assure that is by not having the DDL for the initial patch. Just my
>> .02, but happy to help spec it out further.
>>
>>
>> I am happy to split this into two, however I think starting with GUC first may not achieve a lot of cases that David
andI were talking about earlier in the thread, perhaps? Where, if you want quick feedback without needing to make
application/ session / connection level changes (i.e GUC) then you can quickly do it via the ALTER statement. Happy to
redothe patch and just keep ALTER for v1 accordingly, if it still makes sense. 
>>
>
> I think it is much more the other way around; the GUC handles far more
> of the potential use cases that you might want to use the ALTER for,
> and the ALTER clearly falls short of what the GUC can do. (Side note,
> remember you can modify the GUC at the database level. And if you
> really want to get ambitious, GUCs can be extended to work through
> ALTER TABLE).
>
>> Would folks have any preference between the two approaches?
>>
>
> Contrary to how it sounds, I'm not actually opposed to having both :-)
> But I am very concerned that an implementation which does ALTER first
> sets a sort of anchoring bias that would affect how the GUC feature
> gets implemented, which is how I suspect Oracle ended up with their
> crappy implementation. I don't think this happens in reverse; ie. the
> GUC first implementation handles most of the heavy lifting so the
> ALTER only needs to cover the suite spot of the use cases that it can
> actually help with.
>
>
>


I know the thread has gotten quite long and I'm a little late to the party, but thank you for taking the time to walk
methrough the operational safety considerations you were referencing. I have a much better mental model of what you
meannow. 

Just like David mentioned earlier in this thread, I was approaching this problem from the perspective of being 99%
certainthe index isn't used, and thinking that a DDL to enable/disable indexes without needing a rebuild would be
useful.However, I understand your point that not everyone may approach using the DDL in the same way, or may not have
donetheir due diligence, which could negatively impact busy systems. Though I think that's partly the responsibility of
theDBA/developer as well - for example, we have `pg_stat_user_indexes`, and I'd typically use the stats in that table
tounderstand index usage before trying to disable it. At the same time, I do agree with your point about making common
workflowseasy and safe for users. There is def something there. 

The whole thread has made it very clear that there are pros and cons to all the approaches mentioned so far, and I'll
refrainfrom extending this tangent any further :D. I'll reply on the main thread with some thoughts additional
thoughts.I just wanted to express my appreciation for you taking the time to walk me through your perspective in more
depth.

P.S. I think we're approaching the 11-month anniversary of this thread (started on September 9, 2024, by yours truly)
andI am very humbled, and impressed by the community's rigor here :). I personally would love to see something land in
core,and I'm confident we'll get there. 

Thanks
Shayon


В списке pgsql-hackers по дате отправления: