Обсуждение: SQL queries not matching on certain fields

Поиск
Список
Период
Сортировка

SQL queries not matching on certain fields

От
Felix Ableitner
Дата:

Hello,

I'm having a very strange problem with the Postgres database for my website. Some SQL queries are not matching on certain fields. I am running these commands via the psql command.

Here is a query that works as expected:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='emma'; id   | preferredUsername
-------+-------------------48952 | emma58672 | emma
(2 rows)

The following query should work as well, because the username exists. But in fact, it consistently returns nothing:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='mailab';id | preferredUsername
----+-------------------
     (0 rows)

There are some workarounds which fix the WHERE statement, all of the following work as expected:

SELECT id, "preferredUsername" FROM actor WHERE trim("preferredUsername")='mailab';

SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername" ILIKE 'mailab'; 

SELECT id, "preferredUsername" FROM actor WHERE md5("preferredUsername")=md5('mailab'); 


Now you might think that there is something wrong with the encoding, or the field contains some extra whitespace. But SHOW SERVER_ENCODING and SHOW SERVER_ENCODING show UTF8. Length and m5 sum are also exactly as expected. And I checked the individual bytes with get_byte(), all of them are in the range 97-122.

About the setup, this is Postgres 10 running in Docker, on an Ubuntu VPS (see below for all versions etc). I had this problem before on the same setup, so I did an export to text file with pg_dump, and imported into a completely new database with psql. That fixed the problem for a few days, but it came back soon after.

The problem only seems to affect one or two specific columns, and only a few specific rows in those columns. Most other rows work normally. Affected columns also randomly start working again after a few days, and other columns get affected. I havent noticed any kind of pattern.

You can find the table definition here: https://gitlab.com/snippets/1840320

Version info:

Postgres Docker Image: postgres:10.7-alpine
Docker version: 18.09.2
OS: Ubuntu 18.04.2

Please tell me if you have any idea how to fix or debug this. I already asked multiple people, and no one has a clue what is going on.

Best,
Felix Ableitner

Re: SQL queries not matching on certain fields

От
Steve Atkins
Дата:

> On Apr 3, 2019, at 2:06 PM, Felix Ableitner <me@nutomic.com> wrote:
>
> Hello,
>
> I'm having a very strange problem with the Postgres database for my website. Some SQL queries are not matching on
certainfields. I am running these commands via the psql command. 
>
> Here is a query that works as expected:
>
> # SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='emma';
>   id   | preferredUsername
> -------+-------------------
>  48952 | emma
>  58672 | emma
> (2 rows)
>
> The following query should work as well, because the username exists. But in fact, it consistently returns nothing:
>
> # SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='mailab';
>  id | preferredUsername
> ----+-------------------
>
>       (0 rows)
>
> There are some workarounds which fix the WHERE statement, all of the following work as expected:
>
> SELECT id, "preferredUsername" FROM actor WHERE trim("preferredUsername")='mailab';
>
>
>
> SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername" ILIKE 'mailab';
>
> SELECT id, "preferredUsername" FROM actor WHERE md5("preferredUsername")=md5('mailab');
>
>
> Now you might think that there is something wrong with the encoding, or the field contains some extra whitespace. But
SHOWSERVER_ENCODING and SHOW SERVER_ENCODING show UTF8. Length and m5 sum are also exactly as expected. And I checked
theindividual bytes with get_byte(), all of them are in the range 97-122. 
>
> About the setup, this is Postgres 10 running in Docker, on an Ubuntu VPS (see below for all versions etc). I had this
problembefore on the same setup, so I did an export to text file with pg_dump, and imported into a completely new
databasewith psql. That fixed the problem for a few days, but it came back soon after. 
>
> The problem only seems to affect one or two specific columns, and only a few specific rows in those columns. Most
otherrows work normally. Affected columns also randomly start working again after a few days, and other columns get
affected.I havent noticed any kind of pattern. 
>
> You can find the table definition here: https://gitlab.com/snippets/1840320

You can use "explain" to see what plan is being used for the query, but I'm betting that it's using an index on
preferredUsername.Your workarounds won't use that index, they'll scan the table. 

If that index is corrupted it could cause the symptoms you're seeing. You can use "reindex" to rebuild the index from
scratchand see if it fixes it but corrupted indexes aren't normal, and the issue seems to be recurring. On physical
hardwareI'd be wondering about filesystem corruption and taking a good look at my system logs. On someone else's VPS
youdon't have the same visibility, but I'd still check the system logs for issues. 

Cheers,
  Steve

>
> Version info:
>
> Postgres Docker Image: postgres:10.7-alpine
> Docker version: 18.09.2
> OS: Ubuntu 18.04.2
>
> Please tell me if you have any idea how to fix or debug this. I already asked multiple people, and no one has a clue
whatis going on. 
>
> Best,
> Felix Ableitner
>




Re: SQL queries not matching on certain fields

От
Torsten Förtsch
Дата:
Broken index?

I had a similar problem a while ago. We were streaming a DB from a Debian box to an alpine docker image. The underlying system libraries were a little different and that resulted in broken index behavior.

On Wed, Apr 3, 2019 at 3:06 PM Felix Ableitner <me@nutomic.com> wrote:

Hello,

I'm having a very strange problem with the Postgres database for my website. Some SQL queries are not matching on certain fields. I am running these commands via the psql command.

Here is a query that works as expected:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='emma'; id   | preferredUsername
-------+-------------------48952 | emma58672 | emma
(2 rows)

The following query should work as well, because the username exists. But in fact, it consistently returns nothing:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='mailab';id | preferredUsername
----+-------------------
     (0 rows)

There are some workarounds which fix the WHERE statement, all of the following work as expected:

SELECT id, "preferredUsername" FROM actor WHERE trim("preferredUsername")='mailab';

SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername" ILIKE 'mailab'; 

SELECT id, "preferredUsername" FROM actor WHERE md5("preferredUsername")=md5('mailab'); 


Now you might think that there is something wrong with the encoding, or the field contains some extra whitespace. But SHOW SERVER_ENCODING and SHOW SERVER_ENCODING show UTF8. Length and m5 sum are also exactly as expected. And I checked the individual bytes with get_byte(), all of them are in the range 97-122.

About the setup, this is Postgres 10 running in Docker, on an Ubuntu VPS (see below for all versions etc). I had this problem before on the same setup, so I did an export to text file with pg_dump, and imported into a completely new database with psql. That fixed the problem for a few days, but it came back soon after.

The problem only seems to affect one or two specific columns, and only a few specific rows in those columns. Most other rows work normally. Affected columns also randomly start working again after a few days, and other columns get affected. I havent noticed any kind of pattern.

You can find the table definition here: https://gitlab.com/snippets/1840320

Version info:

Postgres Docker Image: postgres:10.7-alpine
Docker version: 18.09.2
OS: Ubuntu 18.04.2

Please tell me if you have any idea how to fix or debug this. I already asked multiple people, and no one has a clue what is going on.

Best,
Felix Ableitner

Sv: SQL queries not matching on certain fields

От
Andreas Joseph Krogh
Дата:
På onsdag 03. april 2019 kl. 15:06:03, skrev Felix Ableitner <me@nutomic.com>:

Hello,

I'm having a very strange problem with the Postgres database for my website. Some SQL queries are not matching on certain fields. I am running these commands via the psql command.

Here is a query that works as expected:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='emma';  id   | preferredUsername
-------+------------------- 48952 | emma 58672 | emma
(2 rows)

The following query should work as well, because the username exists. But in fact, it consistently returns nothing:

# SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='mailab'; id | preferredUsername
----+-------------------
      (0 rows)

There are some workarounds which fix the WHERE statement, all of the following work as expected:

SELECT id, "preferredUsername" FROM actor WHERE trim("preferredUsername")='mailab';

SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername" ILIKE 'mailab'; 

SELECT id, "preferredUsername" FROM actor WHERE md5("preferredUsername")=md5('mailab'); 

   

Now you might think that there is something wrong with the encoding, or the field contains some extra whitespace. But SHOW SERVER_ENCODING and SHOW SERVER_ENCODING show UTF8. Length and m5 sum are also exactly as expected. And I checked the individual bytes with get_byte(), all of them are in the range 97-122.

About the setup, this is Postgres 10 running in Docker, on an Ubuntu VPS (see below for all versions etc). I had this problem before on the same setup, so I did an export to text file with pg_dump, and imported into a completely new database with psql. That fixed the problem for a few days, but it came back soon after.

The problem only seems to affect one or two specific columns, and only a few specific rows in those columns. Most other rows work normally. Affected columns also randomly start working again after a few days, and other columns get affected. I havent noticed any kind of pattern.

You can find the table definition here: https://gitlab.com/snippets/1840320

Version info:

Postgres Docker Image: postgres:10.7-alpine
Docker version: 18.09.2
OS: Ubuntu 18.04.2

Please tell me if you have any idea how to fix or debug this. I already asked multiple people, and no one has a clue what is going on.

Best,
Felix Ableitner

 
Does disabling index-scan make a difference?
 
SET enable_indexscan to off;
 
How about dumping the relevant data and reloading it into another similar (but smaller) table, can you reproduce it then?
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Вложения