Re: pt_toast table seems to be
From | Rui DeSousa |
---|---|
Subject | Re: pt_toast table seems to be |
Date | |
Msg-id | 79C6454F-E7F3-4AB9-A329-BECB41835BD4@icloud.com |
In response to | pt_toast table seems to be (Jorge Daniel <elgaita@hotmail.com>) |
List | pgsql-admin |
Do you mean the Postgres instance is killed or the docker instance is killed? For OOM, what's the memory configuration and how much swap is allocated?

I would look at your disk subsystem from the perspective of Postgres: when it issues a sync(), does your setup honor it? If your docker instance is crashing, then it seems like sync() might not be honored, which would lead to the corruption issues you're describing.
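As a quick sanity check along these lines (a sketch; these are standard Postgres settings, but whether they were changed in this deployment is an assumption), it is worth confirming that durability-related settings were not disabled, since turning any of them off on unreliable storage can produce exactly this kind of TOAST corruption after a crash:

```sql
-- All three should normally be "on" in production.
SHOW fsync;              -- whether Postgres forces writes to stable storage
SHOW full_page_writes;   -- protects against torn pages after a crash
SHOW synchronous_commit; -- whether commits wait for WAL flush
```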
On Jan 3, 2018, at 10:25 AM, Jorge Daniel <elgaita@hotmail.com> wrote:

Hi guys,

This is my first post to the community, so my apologies in advance about the formalities.

In a production db, after several OOM-killer events on the Postgres running in a docker container (now fixed), we are left with a kind of corrupted pg_toast 😐:

2017-11-29 23:46:13.147 PST rhost=10.149.54.5(54750) app=PostgreSQL JDBC Driver:user=veon:db=veon:ERROR: unexpected chunk number 0 (expected 1) for toast value 17143928 in pg_toast_77809
2017-11-29 23:46:13.147 PST rhost=10.149.54.5(54750) app=PostgreSQL JDBC Driver:user=veon:db=veon:STATEMENT: SELECT "id", "needs_pumping", "needs_lucene", "surveybatchid", "ip_address", "is_open_proxy", "is_partial", "cookies", "social_media_source", "social_media_review_id", "social_media_pull_id", "social_media_pull_id_long", "source_config_unitid", "source_config_source_id", "updated", "external_id_printable_coder", "unique_reusable_id", "invite_errmsg", "reminder_errmsg", "last_seen_pagename", "last_submitted_pagename", "has_content_fields", "logid", "alertid", "language_name", "e_status", "e_reminder_status", "e_lastupdated", "e_creationdate", "e_sampleddate", "e_responsedate", "e_invitationdate", "reminderdate", "e_delivereddate", "e_remindeddate", "e_accepteddate", "e_initialfinishdate", "e_ta_completed_date", "e_expirationdate", "e_survey_method", "e_survey_source", "e_survey_type", "parse_status", "tagging_attempts", "e_optout", "e_bounce_category", "feed_fileid", "feed_file_recordid", "startdate", "e_lastname", "e_firstname", "e_address", "e_address2", "e_city", "e_state", "e_postalcode", "e_phone", "e_email", "is_mobile", "is_mobile_first", "is_mobile_finished", "is_cookie_confirmation_needed", "exclude_from_problem_tracker", "is_invitation_error", "page_number_seen_last", "page_number_seen_highest", "pages_submitted", "pages_validation_failed", "last_submit_date", "user_agent", "jsonb_fields", "thrift", "episodeid", "e_unitid", "e_committed_survey_specid", "customer_veonid", "xmin" FROM "survey_prior" WHERE survey_veon.surveyid = ANY($1)

The usual fix is:

REINDEX TABLE pg_toast_77809;

After this we can select * from survey_prior with no problem, and we can dump the entire table if we wish.

Then the jobs resume and we're happy until we hit the same error again. This occurs randomly; we suspect an UPDATE on a certain row is creating the corruption.

We can't reproduce the ERROR because we can't do a full scan of the table while the error is still alive (the on-call guys apply the fix immediately to resume production).

So how can we discover the bad rows and expose them, or how do we check the consistency of the pg_toast table?

We need this evidence to ask for a window and a complete dump/restore that will wipe out this problem.

Thx in advance,
Jorge Daniel Fernandez
In the pgsql-admin list, by date of posting: