Re: A worst case for qsort
От | Robert Haas |
---|---|
Тема | Re: A worst case for qsort |
Дата | |
Msg-id | CA+TgmoaFYNL0N3QJK=Tj9nbe3Usg=jYAvrziJZy298R=tC7gFA@mail.gmail.com обсуждение исходный текст |
Ответ на | A worst case for qsort (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: A worst case for qsort
(John Cochran <j69cochran@gmail.com>)
Re: A worst case for qsort (Peter Geoghegan <pg@heroku.com>) |
Список | pgsql-hackers |
On Tue, Aug 5, 2014 at 8:15 PM, Peter Geoghegan <pg@heroku.com> wrote: > "The adversarial method works for almost any polymorphic program > recognizable as quicksort. The subject quicksort may copy values at > will, or work with lists rather than arrays. It may even pick the > pivot at random. The quicksort will be vulnerable provided only that > it satisfies some mild assumptions that are met by every > implementation I have seen". > > IMHO, while worst case performance is a very useful tool for analyzing > algorithms (particularly their worst case time complexity), a worst > case should be put in its practical context. For example, if we had > reason to be concerned about *adversarial* inputs, I think that there > is a good chance that our qsort() actually would be problematic to the > point of driving us to prefer some generally slower alternative. I completely agree with this, and I think everyone else does, too. Where we don't all necessarily agree is which worst cases are realistic and which worst cases are simply pathological cases with which we need not be concerned in practice. For example, when I proposed the patch to use MVCC catalog snapshots, Andres invented a test case which I thought was far more brutal than anything likely to be encountered in the real world. But Andres didn't agree; he thought the test case was realistic. So, I worked on the patch some more and came up with a solution that performs acceptably even if those brutal conditions are encountered in the world. As a result, the patch as committed is better than the one originally submitted. I could certainly have spent more time debating whether that effort was worthwhile, and maybe I would have won the argument, but it was a better of use of that time to instead try to improve the patch. So here. You may not agree that the mitigation strategies for which others are asking for are worthwhile, but you can't expect everyone else to agree with your assessment of which cases are likely to occur in practice. The case of a cohort of strings to be sorted which share a long fixed prefix and have different stuff at the end does not seem particularly pathological to me. It doesn't, in other words, require an adversary: some real-world data sets will look like that. I will forebear repeating examples I've given before, but I've seen that kind of thing more than once in real data sets that people (well, me, anyway) actually wanted to put into a PostgreSQL database. So I'm personally of the opinion that the time you've put into trying to protect against those cases is worthwhile. I understand that you may disagree with that, and that's fine: we're not all going to agree on everything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: