Discussion: OpenMP in PostgreSQL-8.4.0
Hi,
I am trying to run PostgreSQL functions with threads by using OpenMP. I tried to parallelize the slot_deform_tuple function (src/backend/access/common/heaptuple.c) and added the lines below to the code.
#pragma omp parallel
{
#pragma omp sections
{
#pragma omp section
values[attnum] = fetchatt(thisatt, tp + off);
#pragma omp section
off = att_addlength_pointer(off, thisatt->attlen, tp + off);
}
}
During ./configure I saw the following informational message for heaptuple.c:
"OpenMP defined section was parallelized."
This is the configure command that I ran:
./configure CC="/path/to/icc -openmp" CFLAGS="-O2" --prefix=/path/to/pgsql --bindir=/path/to/pgsql/bin --datadir=/path/to/pgsql/share --sysconfdir=/path/to/pgsql/etc --libdir=/path/to/pgsql/lib --includedir=/path/to/pgsql/include --mandir=/path/to/pgsql/man --with-pgport=65432 --with-readline --without-zlib
After configure I ran gmake and gmake install, and I saw "PostgreSQL installation complete."
I then ran initdb with the command below:
/path/to/pgsql/bin/initdb -D /path/to/pgsql/data
I get the following error:
The files belonging to this database system will be owned by user "reydan.cankur".
This user must also own the server process.
The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".
fixing permissions on existing directory /path/to/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in /path/to/pgsql/data/base/1 ... FATAL: could not create unique index "pg_type_typname_nsp_index"
DETAIL: Table contains duplicated values.
child process exited with exit code 1
initdb: removing contents of data directory "/path/to/pgsql/data"
I cannot see the connection between the initdb failure and the change that I made.
I would appreciate your help in solving this issue.
Thanks in advance,
Reydan
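One way to look at the two OpenMP sections above, offered as an assumption rather than a diagnosis of the initdb failure: the first statement reads `off` and the second overwrites it, so the result depends on which one runs first. The sketch below is not PostgreSQL code. `step_result`, `run_a_then_b`, and `run_b_then_a` are hypothetical names, and plain integers stand in for the tuple pointer arithmetic. It only shows that the two orderings disagree about where the value is fetched from.

```c
#include <assert.h>

/* Hypothetical miniature of the two statements from the post:
 *   A: values[attnum] = fetchatt(thisatt, tp + off);
 *   B: off = att_addlength_pointer(off, thisatt->attlen, tp + off);
 * A reads `off`; B overwrites it. */
typedef struct {
    int fetched_at;   /* offset that statement A read from */
    int off;          /* offset left behind by statement B */
} step_result;

/* A before B: the order the serial code relies on. */
step_result run_a_then_b(int off, int attlen)
{
    step_result r;
    assert(attlen > 0);
    r.fetched_at = off;       /* A sees the old offset */
    r.off = off + attlen;     /* B advances it afterwards */
    return r;
}

/* B before A: one ordering that two unsynchronized sections may produce. */
step_result run_b_then_a(int off, int attlen)
{
    step_result r;
    assert(attlen > 0);
    r.off = off + attlen;     /* B advances first */
    r.fetched_at = r.off;     /* A now reads from the wrong position */
    return r;
}
```

With a starting offset of 8 and a length of 4, the first ordering fetches at 8 and the second at 12. If attribute values were read from the wrong position while the system catalogs are being built, damaged or repeated catalog rows would be one plausible outcome, but that link is a guess.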
Reydan Cankur <reydan.cankur@gmail.com> writes:
> I am trying to run PostgreSQL functions with threads by using OpenMP.

This is pretty much doomed to failure. It's *certainly* doomed to failure if you just hack up one area of the source code without dealing with the backend's general lack of support for threading.

regards, tom lane
You mean that the backend does not support threading and everything I try is useless?
Is there a way to overcome this issue?
Is there anything I can adjust in the backend to enable threading?
Is there any documentation you can advise?

Best Regards,
Reydan

On Nov 28, 2009, at 6:42 PM, Tom Lane wrote:
> Reydan Cankur <reydan.cankur@gmail.com> writes:
>> I am trying to run PostgreSQL functions with threads by using OpenMP.
>
> This is pretty much doomed to failure. It's *certainly* doomed to
> failure if you just hack up one area of the source code without
> dealing with the backend's general lack of support for threading.
>
> regards, tom lane
Reydan Cankur wrote:
> You mean that the backend does not support threading and everything I try
> is useless?
> Is there a way to overcome this issue?
> Is there anything I can adjust in the backend to enable threading?
> Is there any documentation you can advise?

Uh, "no" to all those questions. We offer client-side threading, but not in the server.

> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
So I am trying to understand: can anyone rewrite some functions in PostgreSQL with OpenMP in order to increase performance? Does this work?

On Nov 29, 2009, at 3:05 PM, Bruce Momjian wrote:
> Reydan Cankur wrote:
>> You mean that the backend does not support threading and everything I try
>> is useless?
>> Is there a way to overcome this issue?
>> Is there anything I can adjust in the backend to enable threading?
>> Is there any documentation you can advise?
>
> Uh, "no" to all those questions. We offer client-side threading, but
> not in the server.
Reydan Cankur <reydan.cankur@gmail.com> writes:
> So I am trying to understand: can anyone rewrite some functions in
> PostgreSQL with OpenMP in order to increase performance?
> Does this work?

Not without doing a truly vast amount of infrastructure work first. Infrastructure work that, by and large, would add cycles and lose performance. So by the time you got to the point of being able to do micro-optimizations like parallelizing individual functions, you'd have dug a pretty large performance hole that you'd have to climb out of before showing any net benefit for all this work.

If you search the PG archives for discussions of threading you should find lots and lots of prior material.

regards, tom lane
On Sun, Nov 29, 2009 at 1:24 PM, Reydan Cankur <reydan.cankur@gmail.com> wrote:
> So I am trying to understand: can anyone rewrite some functions in
> PostgreSQL with OpenMP in order to increase performance?
> Does this work?

Well, you have to check the code path you're parallelizing for any function calls which might manipulate any data structures, and protect those data structures with locks. That will be a huge job and will introduce extra overhead.

If you try to find code which does nothing like that, you'll be limited to a few low-level pieces of code, because Postgres goes to great lengths to be generic and allow user-configurable code in lots of places. To give one example, the natural place to introduce parallelism would be in the sorting routines -- but the comparison routine is a data-type-specific function that users can specify at the SQL level and is allowed to do almost anything.

Then you'll have to worry about things like signal handlers. Anything big enough to be worth parallelizing is going to have a CHECK_FOR_INTERRUPTS in it, and you'll have to make sure the interrupt gets received and processed correctly, cancelling all threads and throwing an error properly.

Come to think of it, you'll have to handle PG_TRY() and PG_THROW() properly. That will mean that if an error occurs in any thread, you have to make sure that you kill all the threads that have been spawned in that PG_TRY block and throw the correct error up.

Incidentally, I doubt heap_deformtuple is suitable for parallelization. It loops over the tuple, and the processing for each field depends completely on the previous one. When you have that kind of chained dependency, adding threads doesn't help. You need a loop somewhere where each iteration of the loop can be processed independently. You might find such loops in the executor for things like hash joins or nested loops. But they will definitely involve user-defined functions and even I/O for each iteration of the loop, so you'll definitely have to take precautions against the usual multi-threading dangers.

--
greg
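The difference between a chained dependency and independent iterations can be shown with a small stand-alone sketch. None of this is PostgreSQL code: `walk_offsets`, `scale_all`, and the array of field lengths are hypothetical stand-ins. The first function has the same shape as tuple deforming, where each field starts where the previous one ended. The second has iterations that do not depend on each other, which is the kind of loop an OpenMP `parallel for` can split across threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for tuple deforming: field i starts where
 * field i-1 ended, so pass i needs the result of pass i-1. This
 * loop-carried dependency on `off` rules out a plain parallel for. */
size_t walk_offsets(const size_t *lens, size_t n, size_t *starts)
{
    size_t off = 0;
    assert(lens != NULL && starts != NULL);
    for (size_t i = 0; i < n; i++) {
        starts[i] = off;   /* reads the offset produced by the previous pass */
        off += lens[i];    /* writes the offset consumed by the next pass */
    }
    return off;
}

/* Independent iterations: out[i] depends only on in[i]. The pragma takes
 * effect when the file is built with OpenMP enabled and is ignored
 * otherwise; the result is the same either way. */
void scale_all(const int *in, int *out, int n, int factor)
{
    assert(in != NULL && out != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        out[i] = in[i] * factor;
}
```

Even in the second case the loop body here is trivially safe only because it touches nothing but its own array slot. A loop body that calls user-defined functions or does I/O, as described above, would still need the precautions mentioned.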
Sounds more like a school project than a proper performance question.

On 11/28/09, Reydan Cankur <reydan.cankur@gmail.com> wrote:
> Hi,
>
> I am trying to run postgresql functions with threads by using OpenMP.
> I tried to parallelize slot_deform_tuple function(src/backend/access/
> common/heaptuple.c) and added below lines to the code.
> [...]