Adding facility for injection points (or probe points?) for more advanced tests
От | Michael Paquier |
---|---|
Тема | Adding facility for injection points (or probe points?) for more advanced tests |
Дата | |
Msg-id | ZTiV8tn_MIb_H2rE@paquier.xyz обсуждение исходный текст |
Ответы |
Re: Adding facility for injection points (or probe points?) for more advanced tests
Re: Adding facility for injection points (or probe points?) for more advanced tests Re: Adding facility for injection points (or probe points?) for more advanced tests Re: Adding facility for injection points (or probe points?) for more advanced tests |
Список | pgsql-hackers |
Hi all, I don't remember how many times in the last few years when I've had to hack the backend to produce a test case that involves a weird race condition across multiple processes running in the backend, to be able to prove a point or just test a fix (one recent case: 2b8e5273e949). Usually, I come to hardcoding stuff for the following situations: - Trigger a PANIC, to force recovery. - A FATAL, to take down a session, or just an ERROR. - palloc() failure injection. - Sleep to slow down a code path. - Pause and release with condition variable. And, while that's helpful to prove a point on a thread, nothing comes out of it in terms of regression test coverage in the tree because these tests are usually too slow and expensive, as they usually rely on hardcoded timeouts. So that's pretty much attempting to emulate what one would do with a debugger in a predictable way, without the manual steps because human hands don't scale well. The reason behind that is of course more advanced testing, to be able to expand coverage when we have weird and non-deterministic race issues to deal with, and the code becoming more complex every year makes that even harder. Fault and failure injection in specific paths comes into mind, additionally, particularly if you manage complex projects based on Postgres. So, please find attached a patch set that introduces an in-core facility to be able to set what I'm calling here an "injection point", that consists of being able to register in shared memory a callback that can be run within a defined location of the code. It means that it is not possible to trigger a callback before shared memory is set, but I've faced far more the case where I wanted to trigger something after shmem is set anyway. Persisting an injection point across restarts is also possible by adding some through an extension's shmem hook, as we do for custom LWLocks for example, as long as a library is loaded. This will remind a bit of what Alexander Korotkov has proposed here: https://www.postgresql.org/message-id/CAPpHfdtSEOHX8dSk9Qp%2BZ%2B%2Bi4BGQoffKip6JDWngEA%2Bg7Z-XmQ%40mail.gmail.com Also, this is much closee to what Craig Ringer is mentioning here, where it is named probe points, but I am using a minimal design that allows to achieve the same: https://www.postgresql.org/message-id/CAPpHfdsn-hzneYNbX4qcY5rnwr-BA1ogOCZ4TQCKQAw9qa48kA%40mail.gmail.com A difference is that I don't really see a point in passing to the callback triggered an area of data coming from the hash table itself, as at the end a callback could just refer to an area in shared memory or a static set of variables depending on what it wants, with one or more injection points (say a location to set a state, and a second to check it). So, at the end, the problem comes down in my opinion to two things: - Possibility to trigger a condition defined by some custom code, in the backend (core code or even out-of-core). - Possibility to define a location in the code where a named point would be checked. 0001 introduces three APIs to create, run, and drop injection points: +extern void InjectionPointCreate(const char *name, + InjectionPointCallback callback); +extern void InjectionPointRun(const char *name); +extern void InjectionPointDrop(const char *name); Then one just needs to add a macro like that to trigger the callback registered in the code to test: INJECTION_POINT_RUN("String"); So the footprint in the core tree is not zero, but it is as minimal as it can be. I have added some documentation to explain that, as well. I am not wedded to the name proposed in the patch, so if you feel there is better, feel free to propose ideas. This facility is hidden behind a specific configure/Meson switch, making it a no-op by default: --enable-injection-points -Dinjection_points={ true | false } 0002 is a test module to test these routines, that I have kept a maximum simple to ease review of the basics proposed here. This could be extended further to propose more default modes with TAP tests on its own, as I don't see a real point in having the SQL bits or some common callbacks (like for the PANIC or the FATAL cases) in core. Thoughts and comments are welcome. -- Michael
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: John NaylorДата:
Сообщение: Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound
Следующее
От: Amul SulДата:
Сообщение: Re: Adding facility for injection points (or probe points?) for more advanced tests