Re: PostgreSQL and Real Application Testing (RAT)

From: Nikolay Samokhvalov
Subject: Re: PostgreSQL and Real Application Testing (RAT)
Date:
Msg-id: CANNMO+K8TL8SMTwQZ0JBBMb1JX9TX=kTp1b+eUTzmktTcLW-gw@mail.gmail.com
In reply to: PostgreSQL and Real Application Testing (RAT)  (ROS Didier <didier.ros@edf.fr>)
Responses: Re: PostgreSQL and Real Application Testing (RAT)  (Jaime Casanova <jaime.casanova@2ndquadrant.com>)
List: pgsql-hackers
On Tue, Aug 27, 2019 at 3:47 AM ROS Didier <didier.ros@edf.fr> wrote:

Hi

In my business, one of the things blocking the migration from Oracle to PostgreSQL is not having an equivalent of Oracle Real Application Testing.

This product captures the workload in production and replays it in a test environment.

This makes it possible to assess the impact of a migration to a newer version, the creation of an index, etc.

Is there an equivalent in the PostgreSQL community?

If not, do you think it's technically possible to build one?

Who would be interested in this project?


Replaying the workload may or may not fit your case well.

There are several major difficulties if you want to replay workload:

1) How to "record" the workload. You need to write all your queries to the Postgres log. Three problems here:
  1a) pgreplay expects log_statement = 'all', while you might prefer dealing with log_min_duration_statement instead. This is a minor issue, though, and quite easy to solve with preprocessing.
  1b) under heavy load, log_min_duration_statement = 0 (or log_statement = 'all') will lead to performance degradation or even downtime. Possible solutions: write the log to memory, or don't write it to disk at all and instead send it over the network.
  1c) ideally, recording just the queries is not enough. To replay the workload "as is", we need to replay queries with known plans. There is no easy solution to this problem in the Postgres ecosystem yet.
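For reference, the logging part of item 1 boils down to a couple of postgresql.conf settings; a minimal fragment might look like this (a sketch, not a production recommendation -- see 1b for the risks):

```ini
# Log every completed statement with its duration (dangerous under heavy load, see 1b)
log_min_duration_statement = 0
# Alternatively, the setting pgreplay expects:
# log_statement = 'all'
log_line_prefix = '%m [%p] %u@%d '   # include timestamp, pid, user, database
```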

A couple of additional points regarding items 1b and 1c. In Postgres 12, there is a cool new capability: sampling for query logging, implemented by Adrien Nayrat (https://commitfest.postgresql.org/20/1691/). With this, it will be possible to fully log, say, 5% of all transactions and use that for replaying. Moreover, with auto_explain, it will be possible to have the plans! Open questions: (a) how to determine whether N% is enough, and (b) how to replay with the specified plans. [If anyone is interested in working in this direction, please reach out to me.]
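A sketch of what such sampled logging could look like on Postgres 12, combining the new sampling GUC with auto_explain to capture plans (the 0.05 rate is just the 5% example from above):

```ini
# Postgres 12+: fully log a random 5% sample of transactions
log_transaction_sample_rate = 0.05

# auto_explain records the actual plans of logged queries
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 0
auto_explain.log_format = 'json'
```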

2) Issues with replaying itself. I can highlight at least two problems here:
  2a) pgreplay might not be enough for your workload; it doesn't scale well. If interested, look at its analog written in Go, https://github.com/gocardless/pgreplay-go, though this is quite a young project.
  2b) Postgres logs have millisecond precision at best (if you switch from %t to %m in log_line_prefix), and this might not be enough. There is a patch adding microsecond precision from David Fetter (https://www.postgresql.org/message-id/flat/20181023185050.GE6049%40fetter.org), but that conversation hasn't yet led to a commit.
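To illustrate 2b: the timestamp precision comes down to which escape you use in log_line_prefix, and milliseconds is currently the best the server offers out of the box:

```ini
# Two alternatives for the timestamp escape:
log_line_prefix = '%t [%p] '   # %t: second precision -- too coarse for replay
log_line_prefix = '%m [%p] '   # %m: millisecond precision, the current maximum
```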

Another approach you might be interested in is workload simulation. This is what we (Postgres.ai) now use most of the time when building "lab" environments for our clients. The idea is as follows:
- carefully analyze the workload using pg_stat_statements (here, our open-source tool "postgres-checkup", https://gitlab.com/postgres-ai/postgres-checkup, might be helpful; see the reports in section K),
- take the most resource-consuming query groups (top N ordered by total_time),
- create a set of files with those statements, with randomly filled parameters (this won't work in many cases; I discuss the restrictions below),
- feed the workload files to pgbench using multiple -f options, with balancing (-f filename@XX, where XX is to be taken from the pg_stat_statements analysis; this time, "calls" and each group's ratio in the whole workload are needed -- again, postgres-checkup can help here),
- run, analyze, and compare behavior.
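The weighting step above can be sketched in a few lines. The file names and the helper below are illustrative, but the -f filename@weight syntax is pgbench's built-in script weighting:

```python
# Sketch: derive pgbench -f weights from pg_stat_statements "calls" counts.
# Input: (workload_file, calls) pairs for the Top-N query groups
# (hypothetical file names; any pg_stat_statements export works the same way).

def pgbench_weights(groups, scale=100):
    """Return 'file@weight' arguments whose weights mirror the calls ratio."""
    total = sum(calls for _, calls in groups)
    args = []
    for filename, calls in groups:
        # round to a percentage, but never drop a group entirely
        weight = max(1, round(scale * calls / total))
        args.append(f"{filename}@{weight}")
    return args

groups = [
    ("q1_select_user.sql", 70000),
    ("q2_update_order.sql", 25000),
    ("q3_report.sql", 5000),
]

flags = " ".join(f"-f {arg}" for arg in pgbench_weights(groups))
print(f"pgbench -c 16 -T 300 {flags} mydb")
# -> pgbench -c 16 -T 300 -f q1_select_user.sql@70 -f q2_update_order.sql@25 -f q3_report.sql@5 mydb
```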

The restrictions of this approach are obvious:
- it doesn't work well if most of your transactions have multiple statements,
- in many cases, randomization is hard (it's not obvious how to organize it; a synthetic approach is far from the real data distribution in storage and in the workload; etc.),
- the approach requires a significant amount of manual effort.

However, "workload simulation" is extremely helpful in many cases, especially for change management. It doesn't require anything that might negatively affect your production workload, and it relies on pgbench (or any other tool of your choice), which is reliable, has great features, and scales well.

You might be interested in looking at the tool we built to conduct large numbers of DB experiments, Nancy CLI (https://gitlab.com/postgres-ai/nancy). It supports both the "workload replay" method (with pgreplay) and "workload simulation" (with pgbench). PM me if you're interested in discussing the details.

Thanks,
Nik

