Обсуждение: Proposed patch: remove hard-coded limit MAX_ALLOCATED_DESCS
Recently we had a gripe about how you can't read very many files concurrently with contrib/file_fdw: http://www.postgresql.org/message-id/OF419B5767.8A3C9ADB-ON85257B79.005491E9-85257B79.0054F6F1@isn.rtss.qc.ca The reason for this is that file_fdw goes through the COPY code, which uses AllocateFile(), which has a wired-in assumption that not very many files need to be open concurrently. I thought for a bit about trying to get COPY to use a "virtual" file descriptor such as is provided by the rest of fd.c, but that didn't look too easy, and in any case probably nobody would be excited about adding additional overhead to the COPY code path. What seems more practical is to relax the hard-coded limit on the number of files concurrently open through AllocateFile(). Now, we could certainly turn that array into some open-ended list structure, but that still wouldn't let us have an arbitrary number of files open, because at some point we'd run into platform-specific EMFILES or ENFILES limits on the number of open file descriptors. In practice we want to keep it under the max_safe_fds limit that fd.c goes to a lot of trouble to determine. So it seems like the most useful compromise is to keep the allocatedDescs[] array data structure as-is, but allow its size to depend on max_safe_fds. In the attached proposed patch I limit it to max_safe_fds / 2, so that there's still a reasonable number of FDs available for fd.c's main pool of virtual FDs. On typical modern platforms that should be at least a couple of hundred, thus making the effective limit about an order of magnitude more than it is currently (32). Barring objections or better ideas, I'd like to back-patch this as far as 9.1 where file_fdw was introduced. regards, tom lane
Вложения
Le samedi 8 juin 2013 19:06:58, Tom Lane a écrit : > Recently we had a gripe about how you can't read very many files > concurrently with contrib/file_fdw: > http://www.postgresql.org/message-id/OF419B5767.8A3C9ADB-ON85257B79.005491E > 9-85257B79.0054F6F1@isn.rtss.qc.ca > > The reason for this is that file_fdw goes through the COPY code, which > uses AllocateFile(), which has a wired-in assumption that not very many > files need to be open concurrently. I thought for a bit about trying to > get COPY to use a "virtual" file descriptor such as is provided by the > rest of fd.c, but that didn't look too easy, and in any case probably > nobody would be excited about adding additional overhead to the COPY > code path. What seems more practical is to relax the hard-coded limit > on the number of files concurrently open through AllocateFile(). Now, > we could certainly turn that array into some open-ended list structure, > but that still wouldn't let us have an arbitrary number of files open, > because at some point we'd run into platform-specific EMFILES or ENFILES > limits on the number of open file descriptors. In practice we want to > keep it under the max_safe_fds limit that fd.c goes to a lot of trouble > to determine. So it seems like the most useful compromise is to keep > the allocatedDescs[] array data structure as-is, but allow its size to > depend on max_safe_fds. In the attached proposed patch I limit it to > max_safe_fds / 2, so that there's still a reasonable number of FDs > available for fd.c's main pool of virtual FDs. On typical modern > platforms that should be at least a couple of hundred, thus making the > effective limit about an order of magnitude more than it is currently > (32). > > Barring objections or better ideas, I'd like to back-patch this as far > as 9.1 where file_fdw was introduced. I just wonder about this statement that you removed: * Since we don't want to encourage heavy use of those functions, *it seems OK to put a pretty small maximum limit on the number of * simultaneously allocated descs. Is it now encouraged to use those functions, or at least that it seems less 'scary' than in the past ? -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Cédric Villemain <cedric@2ndquadrant.com> writes: > I just wonder about this statement that you removed: > * Since we don't want to encourage heavy use of those functions, > * it seems OK to put a pretty small maximum limit on the number of > * simultaneously allocated descs. > Is it now encouraged to use those functions, or at least that it seems less > 'scary' than in the past ? Well, we could put back some weaker form of the statement, but I wasn't sure how to word it. regards, tom lane
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>> I just wonder about this statement that you removed: <p style=" margin-top:0px; margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > *Since we don't want to encourage heavy use of those functions,<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > * it seems OK to puta pretty small maximum limit on the number of<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > * simultaneously allocated descs.<p style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> > Is it now encouraged to use those functions, or at leastthat it seems less <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> > 'scary' than in the past ?<p style=" margin-top:0px; margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>Well, we could put back some weaker form of the statement, but I wasn't<p style=" margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>sure how to word it.<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">I'm not sure of the 'expected'problems...<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;text-indent:0px; -qt-user-state:0;">The only thing I can think of at the moment is the time spent to close<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;-qt-user-state:0;">file descriptors on abort/commit.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">I'm not sure of expected valueof "max_safe_fds". Your patch now initialize <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">with 5 slots instead of 10, if max_safe_fds is largemaybe it is better to <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;text-indent:0px; -qt-user-state:0;">double the size each time we need instead of jumping directly to thelargest <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;-qt-user-state:0;">size ?<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">-- <p style=" margin-top:0px; margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">Cédric Villemain+33 (0)6 20 30 22 52<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;text-indent:0px; -qt-user-state:0;">http://2ndQuadrant.fr/<p style=" margin-top:0px; margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">PostgreSQL:Support 24x7 - Développement, Expertise et Formation<p style="-qt-paragraph-type:empty; margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; ">
Cédric Villemain <cedric@2ndquadrant.com> writes: > I'm not sure of expected value of "max_safe_fds". Your patch now initialize > with 5 slots instead of 10, if max_safe_fds is large maybe it is better to > double the size each time we need instead of jumping directly to the largest > size ? I don't see the point particularly. At the default value of max_files_per_process (1000), the patch can allocate at most 500 array elements, which on a 64-bit machine would probably be 16 bytes each or 8KB total. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Recently we had a gripe about how you can't read very many files > concurrently with contrib/file_fdw: > http://www.postgresql.org/message-id/OF419B5767.8A3C9ADB-ON85257B79.005491E9-85257B79.0054F6F1@isn.rtss.qc.ca Ouch, that's pretty bad. > Barring objections or better ideas, I'd like to back-patch this as far > as 9.1 where file_fdw was introduced. +1. This would bump the limit to something like 300 instead of 10, right? That seems pretty reasonable. Thanks, Stephen
Le samedi 8 juin 2013 23:27:16, Tom Lane a écrit : > Cédric Villemain <cedric@2ndquadrant.com> writes: > > I'm not sure of expected value of "max_safe_fds". Your patch now > > initialize with 5 slots instead of 10, if max_safe_fds is large maybe it > > is better to double the size each time we need instead of jumping > > directly to the largest size ? > > I don't see the point particularly. At the default value of > max_files_per_process (1000), the patch can allocate at most 500 array > elements, which on a 64-bit machine would probably be 16 bytes each > or 8KB total. The point was only if the original comment was still relevant. It seems it is just not accurate anymore. With patch I can union 492 csv tables instead of 32 with beta1. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation