Re: [GENERAL] Large data and slow queries

Поиск
Список
Период
Сортировка
От vinny
Тема Re: [GENERAL] Large data and slow queries
Дата
Msg-id 2d52de6778a7e907db41267d8e96d373@xs4all.nl
обсуждение исходный текст
Ответ на Re: [GENERAL] Large data and slow queries  (Samuel Williams <space.ship.traveller@gmail.com>)
Ответы Re: [GENERAL] Large data and slow queries  (John R Pierce <pierce@hogranch.com>)
Список pgsql-general
On 2017-04-19 07:04, Samuel Williams wrote:
> Thanks John. Yes, you are absolutely right, you want the index to be
> bottom heavy so you can cull as much as possible at the top. I'm
> familiar with that, once implementing a brute-force sudoku solver, it
> has the same principle.
>
> I've been working on this all afternoon. By reducing the longitude,
> latitude columns to float4, in my test cases, I found about 50%
> improvement in performance. It may also use less space. So part of the
> problem was my choice of data type. We've computed that float4 has a
> worst case precision of about 1.6m which we are okay with for
> analytics data.
>
> Another option we may consider is using a (signed) integer - e.g.
> longitude = 180*(v/2^31) and latitude = 180*(v/2^31) as this has a
> uniform error across all points, but it's a bit more cumbersome to
> handle. Is there a rational datatype in postgres which works like
> this?
>
>
>
> On 19 April 2017 at 16:42, John R Pierce <pierce@hogranch.com> wrote:
>> On 4/18/2017 9:01 PM, Samuel Williams wrote:
>>>
>>> We want the following kinds of query to be fast:
>>>
>>> SELECT ... AND (latitude > -37.03079375089291 AND latitude <
>>> -36.67086424910709 AND longitude > 174.6307139779924 AND longitude <
>>> 175.0805140220076);
>>
>>
>>
>> I wonder if GIST would work better if you use the native POINT type,
>> and
>> compared it like
>>
>> mypoint <@ BOX
>> '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709
>> ))'
>>
>> with a gist index on mypoint...
>>
>> but, it all hinges on which clauses in your query are most selective,
>> thats
>> where you want an index.
>>
>> --
>> john r pierce, recycling bits in santa cruz
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general

Did that 50% performance gain come from just the datatype, or that fact
that the index became smaller?

Given the number of records, my first thought was either partitioning or
partial-indexes.
The fewer rows are in the index, the quicker it will be to check,
and it's not a lot of work to create separate indexes for lat/long
ranges or dates.



В списке pgsql-general по дате отправления:

Предыдущее
От: Tom DalPozzo
Дата:
Сообщение: Re: [GENERAL] tuple statistics update
Следующее
От: Tom DalPozzo
Дата:
Сообщение: Re: [GENERAL] tuple statistics update