Re: bug or lacking doc hint

Поиск

Список

Период

Сортировка

От	Marc Millas
Тема	Re: bug or lacking doc hint
Дата	26 июня 2023 г. 23:21:31
Msg-id	CADX_1aZoTkpo24cFNtkSqAAPEV3Dm8tAzWw9eWJqeKyAERZ8nw@mail.gmail.com обсуждение исходный текст
Ответ на	Re: bug or lacking doc hint (Ron <ronljohnsonjr@gmail.com>)
Ответы	Re: bug or lacking doc hint
Список	pgsql-general

Дерево обсуждения

On Mon, Jun 26, 2023 at 4:05 PM Ron <ronljohnsonjr@gmail.com> wrote:

On 6/26/23 07:22, Marc Millas wrote:

On Mon, Jun 26, 2023 at 5:47 AM Avin Kavish <avin@baseboard.ai> wrote:
Sounds like the problem you are having is, the server is running out of temporary resources for the operation that users are trying to do. So according to Tom, on the postgres side, the operation cannot be optimized further.

I think you have few choices here,
- See if increasing the resources of the server will allow them to run the operation
- Ask users not to do that operation
- Use a extension like citus to scale horizontally

But I'm thinking why a massively inefficient join is needed in the first place. Shouldn't joins be for following keyed relationships. So ideally a unique indexed column, but at the very least an indexed column. Why is a join required on a dynamically calculated substring? Can it be made into a static computed value and indexed? Substring sounds like an op that should be in the filter stage.

Can you describe your data model? Maybe we can give some specific advice.

There is a set of big tables containing fine grain health data. The DB is accessed by people doing research on various kind of sicknesses. So, by nature, all columns (I mean ALL) can be used for every kind of SQL including, obviously, lots of intricated joins.

This sounds like the kind of problem normally solved by data warehouses. Is your schema designed like a DW, or is it in 3NF?

it's, indeed, some kind of dwh.

but it's neither a star nor a snowflake .at least not used like those standard schemas.

in one of the big tables (10 billions+ rows), there is around 60 columns, describing one event: some guy have had a given sickness, got a given medoc etc

The pb is that its not one simple event with a set of dimensions, the people using that db are NOT looking for an event according to various criterias, they are looking for correlations between each of the 60+ columns.

As a consequence very few indexes are used as most requests end in some kind of huge sequential reads.

The machine was built for this and perform well, but some requests are posing pb and we must find solutions/workaround.

one of the users did rewrite the request using a select distinct matched with left join(s) and table.a is not null set of conditions.

looks crazy, but does work. I'll get the request tomorrow.

--
Born in Arizona, moved to Babylonia.

Marc MILLAS

В списке pgsql-general по дате отправления:

Предыдущее

От: Peter Geoghegan
Дата: 26 июня 2023 г., 20:59:00
Сообщение: Re: Large pkey index on insert-only table

Следующее

От: B M
Дата: 27 июня 2023 г., 02:48:39
Сообщение: Large scale reliable software system

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: bug or lacking doc hint

Предыдущее

Следующее