Re: new heapcheck contrib module
| От | Peter Geoghegan | 
|---|---|
| Тема | Re: new heapcheck contrib module | 
| Дата | |
| Msg-id | CAH2-WznaU6HcahLV4Hg-DnhEmW8DuSdYfn3vfWXoj3Me9jq=sQ@mail.gmail.com обсуждение исходный текст | 
| Ответ на | Re: new heapcheck contrib module (Robert Haas <robertmhaas@gmail.com>) | 
| Ответы | Re: new heapcheck contrib module | 
| Список | pgsql-hackers | 
On Wed, May 13, 2020 at 12:22 PM Robert Haas <robertmhaas@gmail.com> wrote: > I think this is a good summary of the problems in this area. On the > one hand, I think it's hideous that we sanity check user input to > death, but blindly trust the bytes on disk to the point of seg > faulting if they're wrong. The idea that int4 + int4 has to have > overflow checking because otherwise a user might be sad when they get > a negative result from adding two negative numbers, while at the same > time supposing that the same user will be unwilling to accept the > performance hit to avoid crashing if they have a bad tuple, is quite > suspect in my mind. The overflow checking is also expensive, but we do > it because it's the right thing to do, and then we try to minimize the > overhead. It is unclear to me why we shouldn't also take that approach > with bytes that come from disk. In particular, using Assert() checks > for such things instead of elog() is basically Assert(there is no such > thing as a corrupted database). I think that it depends. It's nice to be able to add an Assert() without really having to worry about the overhead at all. I sometimes call relatively expensive functions in assertions. For example, there is an assert that calls _bt_compare() within _bt_check_unique() that I added at one point -- it caught a real bug a few weeks later. You could always be doing more. In general we don't exactly trust the bytes blindly. I've found that corrupting tuples in a creative way with pg_hexedit doesn't usually result in a segfault. Sometimes we'll do things like display NULL values when heap line pointers are corrupt, which isn't as good as an error but is still okay. We ought to protect against Murphy, not Machiavelli. ISTM that access method code naturally evolves towards avoiding the most disruptive errors in the event of real world corruption, in particular avoiding segfaulting. It's very hard to prove that, though. Do you recall seeing corruption resulting in segfaults in production? I personally don't recall seeing that. If it happened, the segfaults themselves probably wouldn't be the main concern. > On the other hand, that problem is clearly way above this patch's pay > grade. There's a lot of stuff all over the code base that would have > to be changed to fix it. It can't be done as an incidental thing as > part of this patch or any other. It's a massive effort unto itself. We > need to somehow draw a clean line between what this patch does and > what it does not do, such that the scope of this patch remains > something achievable. Otherwise, we'll end up with nothing. I can easily come up with an adversarial input that will segfault a backend, even amcheck, but it'll be somewhat contrived. It's hard to fool amcheck currently because it doesn't exactly trust line pointers. But I'm sure I could get the backend to segfault amcheck if I tried. I'd probably try to play around with varlena headers. It would require a certain amount of craftiness. It's not exactly clear where you draw the line here. And I don't think that the line will be very clearly defined, in the end. It'll be something that is subject to change over time, as new information comes to light. I think that it's necessary to accept a certain amount of ambiguity here. -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: