Re: Sequence vs UUID

Поиск

Список

Период

Сортировка

От	Miles Elam
Тема	Re: Sequence vs UUID
Дата	29 января 2023 г. 06:28:43
Msg-id	CAALojA98Jf6M1NFmKy5qqOyydU6-eCNE8rY2Yg+W+Qp9P0jqDA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Sequence vs UUID (Ron <ronljohnsonjr@gmail.com>)
Ответы	Re: Sequence vs UUID (Ron <ronljohnsonjr@gmail.com>)
Список	pgsql-general

Дерево обсуждения

On Sat, Jan 28, 2023 at 6:02 PM Ron <ronljohnsonjr@gmail.com> wrote:
>
> Type 4 UUIDs are sub-optimal for big table because cache hit rates drop through the floor.
>
> This lesson was burned into my psyche waaaay back in the Clinton administration.  It was my task to speed up a five
hourbatch job which read input records from a flat file, did some validations and then inserted them.  Simply sorting
theinput file on the primary key fields -- we used natural keys, not synthetics -- dropped the run time to two hours.
(VMSSORT saved the day, because you could tell it the sort order you wanted; thus, I could preserve the header record
atthe top of the file, and the trailer record at the end of the file without jumping through a bunch of hoops.) 

This can be mitigated with judicious use of a sequence at the front of
the uuidv4.

    https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/

More effort than just calling the built-in gen_random_uuid() or
equivalent in app code, but a substantial performance gain for your
effort.

    https://github.com/tvondra/sequential-uuids

And in a managed environment where you can't install custom
extensions, a fairly simple function with divide on unix epoch seconds
combined with a call to overlay(...) should suffice performance-wise.
At 60 seconds, this will loop every 45 days or so, and you can choose
how much "sequentialness" works for you, from 1 to 4 bytes at the
expense of pseudo-randomness.

-----------------------------

-- Generate time interval UUID
CREATE OR REPLACE FUNCTION gen_interval_uuid(interval_seconds int4 =
60, block_num_bytes int2 = 2)
RETURNS uuid LANGUAGE sql VOLATILE PARALLEL SAFE AS $$
  SELECT encode(
    overlay(
      -- convert the uuid to byte array
      uuid_send(gen_random_uuid())
      -- use only the bottom bytes
      PLACING substring(
          int4send((extract(epoch FROM now()) / interval_seconds)::int4)
          FROM (5 - block_num_bytes)
      )
      -- place at the front two bytes of the uuid
      FROM 1
    )
    -- convert the resulting byte array to hex for conversion to uuid
    , 'hex')::uuid
  WHERE interval_seconds > 0 AND block_num_bytes BETWEEN 1 AND 4
$$;

-----------------------------

Technically affecting the v4 spec. You could always convert to a
UUIDv8, which is the intention behind that new version even though the
standard hasn't been ratified yet.

Cheers,

Miles Elam

В списке pgsql-general по дате отправления:

Предыдущее

От: Ron
Дата: 29 января 2023 г., 05:01:50
Сообщение: Re: Sequence vs UUID

Следующее

От: Ron
Дата: 29 января 2023 г., 07:02:36
Сообщение: Re: Sequence vs UUID

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Sequence vs UUID

Предыдущее

Следующее