Обсуждение: Create temp table query hangs

Поиск
Список
Период
Сортировка

Create temp table query hangs

От
Novak Ivan
Дата:

Greetings,

 

We’ve a hanging query which creates a temp table.

 

Some context:

  • The application transforms from one data model to another one (to a canonical datamodel). The source tables for these transformations are on one schema of the postgresDB (we call it LandingZone) and the target tables of these transformations are in a different schema of the exact same postgresDB (we call it the CDM schema).

 

Our problem in summary:

  • a query for creating temporary table on target schema is hanging forever and we are out of ideas why.

The query is not generated by our own "mapper" tool that we wrote. We don't use any third party product here (except JOOQ library for communicating to DB)

 

When we try to create a (temporary) table with one statement, and that query takes forever (even after >24h the query is still there). In the Postgres Database we see three identical active PIDs for the same query (with same starttimestamp). But we only sent the statement once to the DB. See screenshot below.

 

The query hangs when creating of temporary tables (see below) and also creating normal tables.

The query hangs using our go-to library for Database queries (JOOQ) and also when we do it manually (we rewrote the implementation using java JDBC) -> so it is not dependent on that library

The query hangs also after updating the default JDBC drivers to newest version (42.2.19)

The query does NOT hang when running on local development environment (not containerized) -> we see 3 PIDs in database (which is strange), but the query finishes after some seconds

The query does NOT hang when running directly on the database using DBeaver, i.e. not using java -> we see 3 PIDs in database (which is strange), but the query finishes after some seconds

This doesn't happen every time we create a table. For source tables that have no new data (or are empty) the query completes. Otherwise it doesn’t seem to be volume dependent (small vs large vs huge volume)

It does consistently happen when creating temp table selecting on specific tables that have some volume.

  

In all environments (ours or clients) the Postgres DB are on own VM and not containerized.

On local dev environment (noncontainerized) the query does not hang. It only hangs when using a container environment (local docker or openshift)

 

The process of a delta load is as follows (for each source/target table combination):

1) get max "lastupdate" timestamp from target table

2) delete data on target that has been replaced (i.e. data that was marked for deletion since "lastupdate")

3) create temporary_table with new data from the source table (new means greater than "lastupdate"; this is defined in the subselect)

4) insert into target table as select * from temporary_table

 

The query on step3 is the one that hangs

 

Content of the query:

The "create table as select from" has a small to large resultset (can be 0 or millions of rows). The select statement has 2 joins with the same table (a code table to resolve some code values for later WHERE clauses) and an additional condition in the WHERE clause with a subselect inside (the subselect selects from the same table as the parent select. It checks if this record is in a list of "to be updated" records. i.e. for a "delta" load).

 

Postgres version: PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), 64-bit

Postgres JDBC driver version: 42.2.19

Java version: 11.0.9.1

 

  • All those parameters are the same whether we run the java code containerized or not containerized.

 

Attachments:

  • The query that is hanging
  • Excerpt of the sessions showing the 3 PIDs (see screenshot below as well)

 

Any help is appreciated,

 

Ivan 

 

 

Вложения

Re: Create temp table query hangs

От
Ron
Дата:
On 4/13/21 1:26 PM, Novak Ivan wrote:

Greetings,

 

We’ve a hanging query which creates a temp table.

 

Some context:

  • The application transforms from one data model to another one (to a canonical datamodel). The source tables for these transformations are on one schema of the postgresDB (we call it LandingZone) and the target tables of these transformations are in a different schema of the exact same postgresDB (we call it the CDM schema).

 

Our problem in summary:

  • a query for creating temporary table on target schema is hanging forever and we are out of ideas why.

The query is not generated by our own "mapper" tool that we wrote. We don't use any third party product here (except JOOQ library for communicating to DB)

 

When we try to create a (temporary) table with one statement, and that query takes forever (even after >24h the query is still there). In the Postgres Database we see three identical active PIDs for the same query (with same starttimestamp). But we only sent the statement once to the DB. See screenshot below.

 

The query hangs when creating of temporary tables (see below) and also creating normal tables.

The query hangs using our go-to library for Database queries (JOOQ) and also when we do it manually (we rewrote the implementation using java JDBC) -> so it is not dependent on that library

The query hangs also after updating the default JDBC drivers to newest version (42.2.19)

The query does NOT hang when running on local development environment (not containerized) -> we see 3 PIDs in database (which is strange), but the query finishes after some seconds

The query does NOT hang when running directly on the database using DBeaver, i.e. not using java -> we see 3 PIDs in database (which is strange), but the query finishes after some seconds

This doesn't happen every time we create a table. For source tables that have no new data (or are empty) the query completes. Otherwise it doesn’t seem to be volume dependent (small vs large vs huge volume)

It does consistently happen when creating temp table selecting on specific tables that have some volume.


Something similar happens to us when running a long script full of DDL from JDBC on Postgresql 9.6.6.

--
Angular momentum makes the world go 'round.

Re: Create temp table query hangs

От
Tom Lane
Дата:
Novak Ivan <ivan.novak@adcubum.com> writes:
>   *   a query for creating temporary table on target schema is hanging forever and we are out of ideas why.

Looking at wait_event as well as wait_event_type in pg_stat_activity
might give some more clues.

> When we try to create a (temporary) table with one statement, and that query takes forever (even after >24h the query
isstill there). In the Postgres Database we see three identical active PIDs for the same query (with same
starttimestamp).But we only sent the statement once to the DB. See screenshot below. 

I think this is just a parallelized query with two worker processes.

Were the query's source tables freshly created or freshly populated?
If so, you might need to issue ANALYZE commands on them before you
start the query itself, to make sure the query planner has valid
statistics to work with.  It's possible that the problem is just a
very dumb choice of query plan due to lack of stats.

Have you tried comparing EXPLAIN output for this query across the
scenarios where it works well and where it doesn't?  If it is a
bad-plan problem then there'd be obvious differences in the shape
of the plan.

            regards, tom lane



postgresql in docker to improve security

От
"Nguyen, Long (IM&T, St. Lucia)"
Дата:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.

Re: postgresql in docker to improve security

От
Kashif Zeeshan
Дата:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.

Re: postgresql in docker to improve security

От
"gunnar.wagner"
Дата:
interesting points @Kashif.

On the other hand I often, that containers are by design ephemeral  and tend to crash. This would be a threat to data integrity (allegedly more then running in a VM i.e.).

Admittedly the environment I am working in is not very open to, nor experienced with container-technology in general, so these claims might be based on hearsay and those issues might not be actual problems any more these days.

Any thoughts on that?


Also I made a mark in my mind head thad podman, by design, was just a little more secure then Docker. I think it was due to the fact Containers can run without the need of requiring root privileges for the user running a podman container.




On 5/3/24 5:23 AM, Kashif Zeeshan wrote:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.


--

Gunnar Wagner | Jahnstr. 5, 19386 Lübz | mob +49.176.7080.9090

Re: postgresql in docker to improve security

От
vrms
Дата:
interesting points @Kashif.

On the other hand I often, that containers are by design ephemeral  and tend to crash. This would be a threat to data integrity (allegedly more then running in a VM i.e.).

Admittedly the environment I am working in is not very open to, nor experienced with container-technology in general, so these claims might be based on hearsay and those issues might not be actual problems any more these days.

Any thoughts on that?


Also I made a mark in my mind head thad podman, by design, was just a little more secure then Docker. I think it was due to the fact Containers can run without the need of requiring root privileges for the user running a podman container.




On 5/3/24 5:23 AM, Kashif Zeeshan wrote:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.




On 5/3/24 5:23 AM, Kashif Zeeshan wrote:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.

Re: postgresql in docker to improve security

От
Kashif Zeeshan
Дата:


On Fri, May 3, 2024 at 10:14 AM vrms <vrms@netcologne.de> wrote:
interesting points @Kashif.

On the other hand I often, that containers are by design ephemeral  and tend to crash. This would be a threat to data integrity (allegedly more then running in a VM i.e.).
Yes that's true, but for that we have K8 and which can  automate the recovery process.

Admittedly the environment I am working in is not very open to, nor experienced with container-technology in general, so these claims might be based on hearsay and those issues might not be actual problems any more these days.
Yes I agree, the technology is changing rapidly but there are still loopholes and what we can do is to avoid as many risks as possible as nothing is 100% secure. 

Any thoughts on that?


Also I made a mark in my mind head thad podman, by design, was just a little more secure then Docker. I think it was due to the fact Containers can run without the need of requiring root privileges for the user running a podman container.




On 5/3/24 5:23 AM, Kashif Zeeshan wrote:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.




On 5/3/24 5:23 AM, Kashif Zeeshan wrote:
Hi

Yes docker container improves the security and following are the ways it does.
1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.
2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.
3. You can manage the access privileges of the users that run container
4. Docker containers use namespaces for process isolation and security.

Regards
Kashif Zeeshan
Bitnine Global

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.

AW: postgresql in docker to improve security

От
"Dischner, Anton"
Дата:

Hi all,

 

i know of an installation where virtual Servers are running under VMWare.

On a VM there is docker running which does virtualisation again.

Then there is running a kubernetes cluster.

In my eyes this is complete insane.

Configuration complexity explodes.

Resources are wasted.

Kubernetes cluster are meant do run on dozens if not hundreds of real hardware. If one failes from hardware there should be no problem.

 

IMHO If you do a proper installation on the OS you will need no docker, kubernetes and so on.

 

Look for hardening you OS if you have increased security needs.

Take this as a beginning:

https://tuxcare.com/blog/linux-system-hardening-top-10-security-tips/

 

Please someone correct me if there are reasons against my opinion,

 

Best

 

Von: Kashif Zeeshan <kashi.zeeshan@gmail.com>
Gesendet: Freitag, 3. Mai 2024 07:18
An: vrms <vrms@netcologne.de>
Cc: pgsql-admin@lists.postgresql.org
Betreff: Re: postgresql in docker to improve security

 

 

 

On Fri, May 3, 2024 at 10:14 AM vrms <vrms@netcologne.de> wrote:

interesting points @Kashif.

On the other hand I often, that containers are by design ephemeral  and tend to crash. This would be a threat to data integrity (allegedly more then running in a VM i.e.).

Yes that's true, but for that we have K8 and which can  automate the recovery process.


Admittedly the environment I am working in is not very open to, nor experienced with container-technology in general, so these claims might be based on hearsay and those issues might not be actual problems any more these days.

Yes I agree, the technology is changing rapidly but there are still loopholes and what we can do is to avoid as many risks as possible as nothing is 100% secure. 


Any thoughts on that?


Also I made a mark in my mind head thad podman, by design, was just a little more secure then Docker. I think it was due to the fact Containers can run without the need of requiring root privileges for the user running a podman container.



On 5/3/24 5:23 AM, Kashif Zeeshan wrote:

Hi

 

Yes docker container improves the security and following are the ways it does.

1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.

2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.

3. You can manage the access privileges of the users that run container

4. Docker containers use namespaces for process isolation and security.

 

Regards

Kashif Zeeshan

Bitnine Global

 

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.

 

 

On 5/3/24 5:23 AM, Kashif Zeeshan wrote:

Hi

 

Yes docker container improves the security and following are the ways it does.

1. Isolation : When you run postgres in a container, you are isolating it from host os and other containers so it limits the attack surface.

2. Port mapping : By mapping only the necessary container port and allowing access only using that port limits the attack surface.

3. You can manage the access privileges of the users that run container

4. Docker containers use namespaces for process isolation and security.

 

Regards

Kashif Zeeshan

Bitnine Global

 

On Fri, May 3, 2024 at 3:44 AM Nguyen, Long (IM&T, St. Lucia) <Long.Nguyen@csiro.au> wrote:

Good day. This is a general db question.

 

I start exploring containerisation and start learning docker.  Would having postgresql in docker improve security in the sense that users could only access to the db through the port mapped to the environment outside of docker, and if they somehow are able to hack and access outside the db, the access is limited within the container not the OS that host the container.

 

Thanks.