Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

Поиск

Список

Период

Сортировка

От	David G. Johnston
Тема	Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3
Дата	3 июня 2016 г. 01:10:18
Msg-id	CAKFQuwZvt1DFKAKTRtsKEGrCFV+=wfcuavaKpGTeKoWfGfLTtA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3 (Patrick Baker <patrickbakerbr@gmail.com>)
Список	pgsql-general

Дерево обсуждения

On Thu, Jun 2, 2016 at 5:03 PM, Patrick Baker <patrickbakerbr@gmail.com> wrote:

2016-06-03 2:10 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:
On Thu, Jun 2, 2016 at 1:04 AM, Patrick Baker <patrickbakerbr@gmail.com> wrote:

It's all working, except the LIMIT... if possible can you please give me an example of that LIMIT in some of those queries?

You also should use ORDER BY when using LIMIT and OFFSET; though depending on the setup it could be omitted. Usually as long as the second execution cannot select any of the records the first execution touched you can choose a random quantity. But if you want random then using OFFSET is pointless.

SELECT *
FROM generate_series(1, 10)
ORDER BY 1
LIMIT 5
OFFSET 3

generate_series
----------------------
4
5
6
7
8

You are going to have difficultly finding people willing to help when you cannot put together a self-contained and syntax error free example (I think the last one is...) of what you want to do. The PostgreSQL parser is very good at reading code and telling you what it doesn't like. I'm not inclined to spend time reading queries that obviously cannot run and point out those same problems. If you can a particular error you don't understand I'll be happy to try and explain what it is trying to tell you.

You probably need to reformulate your update to read:

UPDATE tbl
FROM (
SELECT 50 RECORDS
) src
WHERE src = tbl;

And ensure that the 50 being selected each time through are a different 50.

Writeable CTEs will probably help here.

https://www.postgresql.org/docs/current/static/queries-with.html

David J.

Hi David.

The SQLs inside the function works....

Really? You seem to have lost your FOR loop for starters, and your RETURN statement, and a semi-colon after END, and I doubt crtRow.file_id works, should I go on...so, yes, you can run the four individual SQL statements correctly but the function itself is bogus.

I'm just having problem about limiting the query to the number of rows I want, and also, to teach the update SQL to only touch the records the other SQLs inside the function have touched.

This is the function updated:

CREATE or REPLACE FUNCTION function_data_1(rows integer)
RETURNS INTEGER AS $$

declare
completed integer;
offset_num integer;
crtRow record;

BEGIN
offset_num = 0;

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
);

UPDATE table2_y_b t2 SET segment_data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t2.migrated = 0
AND
t2.file_id = o1.file_id
);

UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND migrated = 0;

UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;

END

$$ language 'plpgsql';

- As you can see, the first insert, inserts data into a new table from another select. This query must be limited by the number of rows I'll provide when calling the function; example:

select function_data_1(5000);
select function_data_1(60000);
select function_data_1(15000);

- The first update, copies the BLOBS from the original_table1_b table into the new one (as above). Here, I also need the query knows to only touch those records that have been touched by the above query.

- The second update, set the table2_y_b.migrated column from 0 to 1, telling me that, that record has been touched by the query. So the next call ( select function_data_1(60000); ) will already know that it does not need to touch that record; example:

WHERE
t2.migrated = 0

- The third and last update, deletes (set the blobs column as null) the blobs that have already been touched by the above queries.... Still.. don't know how to tell postgres to only touches the rows that have been touched by the above queries....

Here's a fish - though you will still need to clean it.

This is not tested, and I haven't ever build this exact query for real, but it should work in theory...

--assumes that to be migrated records have previously had their migrated flag set to 0

function name (number_of_rows_to_process integer)

LANGUAGE sql -- this no longer requires procedural logic so no need for plpgsql

RETURNS SETOF bigint --returns the affected ids

AS $$

WITH the_records_I_want_to_affect AS (

-- pick N records to process

SELECT id, ...

FROM source_tbl

WHERE migrated = 0

ORDER BY ...

LIMIT number_of_rows_to_process -- your function argument goes here

FOR UPDATE

migrate_the_data AS (

-- place a copy of them into the archive table

INSERT INTO migration_table

SELECT id, ...

FROM the_records_I_want_to_affect

RETURNING *

mark_as_migrated AS (

-- mark them as having been archived and nullify the blob data

UPDATE source_tbl

SET migrated = 1, data = null

FROM migrate_the_data recs

WHERE recs.id = source_tbl.id

RETURNING source_tbl.id

)

SELECT id FROM mark_as_migrated;

I am sure a fully working version of this idiom in present in one and more places on the internet. Feel free to search out fully working examples with additional commentary.

You can make a FOR loop version of this work, and had to many years ago before writable CTEs were implemented.

David J.

В списке pgsql-general по дате отправления:

Предыдущее

От: Kevin Grittner
Дата: 03 июня 2016 г., 00:44:11
Сообщение: Re: dumb question

Следующее

От: "David G. Johnston"
Дата: 03 июня 2016 г., 01:19:45
Сообщение: Re: dumb question

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

Предыдущее

Следующее