Thread: BUG #19114: ORDER BY ASC is tampering result when calculating distance btw vectors
BUG #19114: ORDER BY ASC is tampering result when calculating distance btw vectors
From
PG Bug reporting form
Date:
The following bug has been logged on the website:
Bug reference: 19114
Logged by: Naveen Krishna S
Email address: naveenkrishna.s@sky.uk
PostgreSQL version: 14.13
Operating system: Mac OS
Description:
In a table with a vector column,
SELECT embedding <=> CAST('[0.01, 0.23, -0.1,..]' as vector)
AS distance
FROM my_table
WHERE TRUE
order by distance desc
LIMIT 100;
returns 100 records, whereas
SELECT embedding <=> CAST('[0.01, 0.23, -0.1,..]' as vector)
AS distance
FROM my_table
WHERE TRUE
order by distance asc
LIMIT 100;
returns 40 records for one specific embedding, and always fewer than the
limit for any query embedding. Why is this?
I have also noticed that if I use NULLS FIRST or COALESCE(distance, 99999999999),
the query returns the requested number of records. But when I tried to list
the records whose distance is null, there were none.
Re: BUG #19114: ORDER BY ASC is tampering result when calculating distance btw vectors
From
Thomas Munro
Date:
On Mon, Nov 17, 2025 at 7:14 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
> 40 records
Sounds like pgvector[1]?
"Why are there less results for a query after adding an HNSW index?
Results are limited by the size of the dynamic candidate list
(hnsw.ef_search), which is 40 by default. ...."
[1] https://github.com/pgvector/pgvector
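If this is the pgvector HNSW case described in the quoted FAQ entry, the cap can be raised for the current session before running the query. The following is a minimal sketch, not a confirmed fix for this report: it assumes the table and column names from the original report (my_table, embedding) and keeps the vector literal elided as it was posted, so the full query vector has to be substituted before it will run.

```sql
-- Assumption: my_table.embedding has a pgvector HNSW index on it.
-- hnsw.ef_search is the setting named in the quoted FAQ (default 40).
-- Raise it to at least the LIMIT for this session only.
SET hnsw.ef_search = 100;

-- Same query as in the report; the vector literal is elided as posted
-- and must be replaced with the complete query vector.
SELECT embedding <=> CAST('[0.01, 0.23, -0.1,..]' AS vector)
AS distance
FROM my_table
ORDER BY distance ASC
LIMIT 100;
```

pgvector 0.8.0 and later also document iterative index scans (for example SET hnsw.iterative_scan = strict_order;), which keep scanning the index until enough rows are found. Whether that applies here depends on the installed extension version.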
Re: [EXTERNAL] Re: BUG #19114: ORDER BY ASC is tampering result when calculating distance btw vectors
From
"S, Naveen Krishna (Development Engineer 3)"
Date:
Hi Thomas,
Very good day!
Following your pointer, I can confirm the following:
- The issue occurs when an HNSW index is present.
- The issue is not reproducible with vector extension 0.8.0.
- I have tried multiple m and ef_construction pairs, ranging from (16,200) to (64,512), and none of them helps.
May I know where I can find the release notes for version 0.8.1 of the vector extension?
Also, please feel free to share any suggestions for further testing.
Thanks and regards,
Naveen Krishna
From: Thomas Munro <thomas.munro@gmail.com>
Date: Monday, 17 November 2025 at 1:21 PM
To: S, Naveen Krishna (Development Engineer 3) <naveenkrishna.s@sky.uk>, pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: [EXTERNAL] Re: BUG #19114: ORDER BY ASC is tampering result when calculating distance btw vectors
On Mon, Nov 17, 2025 at 7:14 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
> 40 records
Sounds like pgvector[1]?
"Why are there less results for a query after adding an HNSW index?
Results are limited by the size of the dynamic candidate list
(hnsw.ef_search), which is 40 by default. ...."
[1] https://urldefense.com/v3/__https://github.com/pgvector/pgvector__;!!IlCVUJ0!pgEkbcfGC71pECmG2FAxk-UVuOknYxbjtoQO3yaM1g09YxwPgPSDF9XXWQgCAxG3rmlFOkoesem_tILjQC_EsxIduQ$
On Wed, Nov 19, 2025 at 10:55 PM S, Naveen Krishna (Development Engineer 3)
<naveenkrishna.s@sky.uk> wrote:
> The issue is occurring when HNSW indexing is done.
> The issue is not replicable with vector extension 0.8.0
> I have tried with multiple m and ef_construction pairs ranging from (16,200) to (64,512) and nothing is working.
>
> May I know where could I see the release updates on 0.8.1 of the vector extension?
> Also please feel free to share your thoughts for me to test more on the same.
Hi Naveen,
pgvector is an independent project. If you think it's misbehaving you should probably raise an issue here (after reading all the documentation):
https://github.com/pgvector/pgvector/issues
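Before filing an issue there, it may help to collect a few facts. In pgvector, m and ef_construction are index build options, while hnsw.ef_search is a query-time setting, which would explain why rebuilding with different (m, ef_construction) pairs did not change the row count. The pgvector documentation also states that an index is only used when ORDER BY is the bare distance operator in ascending order with a LIMIT, which is consistent with the DESC, NULLS FIRST and COALESCE variants returning all 100 rows. A rough diagnostic sketch, again assuming the names from the report and with the vector literal left elided as posted:

```sql
-- Which pgvector version is actually installed in this database?
SELECT extversion FROM pg_extension WHERE extname = 'vector';

-- Ascending order: expected to show an index scan on the HNSW index
-- if the index is being used (substitute the full query vector).
EXPLAIN
SELECT embedding <=> CAST('[0.01, 0.23, -0.1,..]' AS vector) AS distance
FROM my_table
ORDER BY distance ASC
LIMIT 100;

-- Descending order: expected to show a sequential scan plus a sort,
-- because the index does not serve descending distance order.
EXPLAIN
SELECT embedding <=> CAST('[0.01, 0.23, -0.1,..]' AS vector) AS distance
FROM my_table
ORDER BY distance DESC
LIMIT 100;
```

If the ascending plan uses the index and the descending plan does not, the difference in row counts comes from the approximate index search rather than from ORDER BY itself.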