Re: Deletion Challenge

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: Deletion Challenge
Дата
Msg-id 5667829F.1010600@aklaver.com
обсуждение исходный текст
Ответ на Deletion Challenge  (Berend Tober <btober@computer.org>)
Ответы Re: Deletion Challenge  (Berend Tober <btober@computer.org>)
Re: Deletion Challenge  (Benjamin Smith <lists@benjamindsmith.com>)
Список pgsql-general
On 12/05/2015 08:08 AM, Berend Tober wrote:
> /*
>
> Deletion Challenge
>
> I want to delete all but the most recent transaction, per person, from a
> table that records a transaction history because at some point the
> transaction history grows large enough to adversely effect performance,
> and also becomes less relevant for retention.
>
> I have devised a way to accomplish this, but it is a 'two-stage'
> approach: that is, it requires two delete statements. I would like to
> know if there is a way to do it in a single statement.
>
> Bonus challenge: Same question, except preserving the most recent N, for
> N > 1, rows for each person so that a short history is retained after
> the deletion.
>
> I have included below an annotated test case and my current solution for
> the N = 1 case.
>
> */
>
> DROP TABLE IF EXISTS cash_journal;
>
>
> CREATE TABLE cash_journal (
>      click bigint NOT NULL,
>      cash_journal_id bigint NOT NULL,
>      fairian_id bigint NOT NULL,
>      debit double precision,
>      credit double precision,
>      balance real DEFAULT 0,
>      description text
> );
>
> COMMENT ON COLUMN cash_journal.click        IS 'Time of transaction.';
> COMMENT ON COLUMN cash_journal.cash_journal_id    IS 'Sequence of
> transaction within current click.';
> COMMENT ON COLUMN cash_journal.fairian_id    IS 'Fairian account
> effected.';
> COMMENT ON COLUMN cash_journal.debit        IS 'Account balance increase
> amount.';
> COMMENT ON COLUMN cash_journal.credit        IS 'Account balance
> decrease amount.';
> COMMENT ON COLUMN cash_journal.balance        IS 'Account balance, per
> Fairian running total.';
> COMMENT ON COLUMN cash_journal.description    IS 'Transaction
> description.';
>
> /*
>
> Below is some sample data, listed in the click/sequence order that the
> data would actually be entered. That is, the 'click' column represents
> advancing time, and within each click, transactions are sequenced by the
> 'cash_journal_id' column. Note there are some missing cash_journal_id
> sequence numbers. This is an artifact of having presented here only
> an illustrative sample. Generally, within each click, the sequence
> would start at one and increment uniformly by one for each new row
> in the same click, and then reset to one for the next click. The
> missing increments in the sample data should not make any difference
> in the solution.
>
> The 'balance' column is a per-player running total, which is a
> deliberate denormalization. It is calculated in a before insert trigger
> by starting with the per-player previous balance, and then adding
> the new row debit, if any, and subtracting the new row credit, if any.
>
> Note, not all Fairians will have a transaction in every click, but any
> number of Fairians may have multiple transactions in any click.
>
> */
>
> copy cash_journal
> (click,cash_journal_id,fairian_id,debit,credit,balance,description) from
> stdin;
> 36    3    7    0    0    0    Initial cash balance
> 36    4    8    0    0    0    Initial cash balance
> 36    5    9    0    0    0    Initial cash balance
> 36    14    18    0    0    0    initial cash balance
> 37    5    7    9    \N    9    Ratified contract fa35e192121eab
> 37    7    8    8    \N    8    Ratified contract f1abd670358e03
> 37    9    9    7    \N    7    Ratified contract 1574bddb75c78a
> 411    1    25    0    0    0    Initial cash balance
> 411    2    25    1000    \N    1000    Issued bond 7719a1c782a1ba
> 412    1    7    5    \N    14    Sold food quantity 7 units.
> 412    2    25    \N    5    995    Bought food quantity 7 units.
> 413    1    25    \N    995    0    Redeemed bond 7719a1c782a1ba
> \.
>
>
> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
>
> /*
>
> The sample starting data is shown here in order by Fairian so that it is
> perhaps easier to see what is happening for each player. Note that the
> result of the deletion should be the last row for each player.
>
>   click | cash_journal_id | fairian_id | debit | credit | balance
> |           description
> -------+-----------------+------------+-------+--------+---------+----------------------------------
>
>      36 |               3 |          7 |     0 |      0 |       0 |
> Initial cash balance
>      37 |               5 |          7 |     9 |        |       9 |
> Ratified contract fa35e192121eab
>     412 |               1 |          7 |     5 |        |      14 | Sold
> food quantity 7 units.
>      36 |               4 |          8 |     0 |      0 |       0 |
> Initial cash balance
>      37 |               7 |          8 |     8 |        |       8 |
> Ratified contract f1abd670358e03
>      36 |               5 |          9 |     0 |      0 |       0 |
> Initial cash balance
>      37 |               9 |          9 |     7 |        |       7 |
> Ratified contract 1574bddb75c78a
>      36 |              14 |         18 |     0 |      0 |       0 |
> initial cash balance
>     411 |               1 |         25 |     0 |      0 |       0 |
> Initial cash balance
>     411 |               2 |         25 |  1000 |        |    1000 |
> Issued bond 7719a1c782a1ba
>     412 |               2 |         25 |       |      5 |     995 |
> Bought food quantity 7 units.
>     413 |               1 |         25 |       |    995 |       0 |
> Redeemed bond 7719a1c782a1ba
> (12 rows)
>
> */
>
>
> /*
>
> Here is the current, two-stage solution in use. Is there a way to do it
> with a single statement?
>
> Can you create a solution that retains an arbitrarily specified number
> of rows per player?
>
> */
> BEGIN;
>
> WITH max_click AS (
>    SELECT
>      cash_journal.fairian_id,
>      max(cash_journal.click) AS click
>      FROM cash_journal
>      GROUP BY cash_journal.fairian_id
>      )
>    delete from cash_journal j
>      using max_click b
>      where j.fairian_id = b.fairian_id
>      and j.click        < b.click;
>
> WITH max_journal_id AS (
>    SELECT
>      cash_journal.fairian_id,
>      cash_journal.click,
>      max(cash_journal.cash_journal_id) AS cash_journal_id
>      FROM cash_journal
>      GROUP BY cash_journal.fairian_id, cash_journal.click
>      )
>    delete from cash_journal j
>       using max_journal_id b
>       where j.fairian_id    = b.fairian_id
>       and j.click           = b.click
>       and j.cash_journal_id < b.cash_journal_id;
>
> COMMIT;
>
> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
>
> /*
>
>   click | cash_journal_id | fairian_id | debit | credit | balance
> |           description
> -------+-----------------+------------+-------+--------+---------+----------------------------------
>
>     412 |               1 |          7 |     5 |        |      14 | Sold
> food quantity 7 units.
>      37 |               7 |          8 |     8 |        |       8 |
> Ratified contract f1abd670358e03
>      37 |               9 |          9 |     7 |        |       7 |
> Ratified contract 1574bddb75c78a
>      36 |              14 |         18 |     0 |      0 |       0 |
> initial cash balance
>     413 |               1 |         25 |       |    995 |       0 |
> Redeemed bond 7719a1c782a1ba
> (5 rows)
>
>
> */
>
>


test=> delete from cash_journal where ARRAY[click, cash_journal_id] NOT in (select max(ARRAY[click,cash_journal_id])
fromcash_journal group by fairian_id); 
DELETE 7

test=> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
 click | cash_journal_id | fairian_id | debit | credit | balance |           description
-------+-----------------+------------+-------+--------+---------+----------------------------------
   412 |               1 |          7 |     5 |        |      14 | Sold food quantity 7 units.
    37 |               7 |          8 |     8 |        |       8 | Ratified contract f1abd670358e03
    37 |               9 |          9 |     7 |        |       7 | Ratified contract 1574bddb75c78a
    36 |              14 |         18 |     0 |      0 |       0 | initial cash balance
   413 |               1 |         25 |       |    995 |       0 | Redeemed bond 7719a1c782a1ba
(5 rows)


--
Adrian Klaver
adrian.klaver@aklaver.com


В списке pgsql-general по дате отправления:

Предыдущее
От: Steve Crawford
Дата:
Сообщение: Re: Deletion Challenge
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: bdr manual cleanup required