Обсуждение: pg_execute_from_file, patch v10
Hi, The other infrastructure patch that has been mark ready for commit then commented further upon by Tom is $subject, even if the function provided as been renamed to pg_execute_sql_file(). Please find attached the newer version that fixes Tom concerns, removing the VARIADIC forms of the functions (those placeholders idea). The git tree already contains a fixed extension code, but the next patch for that one will have to wait some more (psql refactoring). Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Вложения
On Sun, Dec 12, 2010 at 06:08, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > The other infrastructure patch that has been mark ready for commit then > commented further upon by Tom is $subject, even if the function provided > as been renamed to pg_execute_sql_file(). > > Please find attached the newer version that fixes Tom concerns, removing > the VARIADIC forms of the functions (those placeholders idea). I think the version is almost OK, but I have a couple of comments: - Why do you need directory_fctx in genfile.h ? - It might be reasonable to have 3 and 1 arguments version of pg_read_file. i.e, (path, offset, size) and (path). Two argsversion (path, offset) doesn't seem to be so useful. In addition, CREATE EXTENSION will always call it with offset=0,no? - We don't need some of added #include "utils/array.h" anymore. -- Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > I think the version is almost OK, but I have a couple of comments: > - Why do you need directory_fctx in genfile.h ? I then use it in extension.c, this way: typedef struct extension_fctx {directory_fctx dir;ExtensionList *installed; } extension_fctx; > - It might be reasonable to have 3 and 1 arguments version of pg_read_file. > i.e, (path, offset, size) and (path). Two args version (path, offset) > doesn't seem to be so useful. In addition, CREATE EXTENSION will always > call it with offset=0, no? Depending on the 'relocatable' property, we now do either of those calls: execute_sql_file(get_extension_absolute_path(control->script), pg_encoding_to_char(encoding)); read_text_file_with_endoding(filename, pg_encoding_to_char(encoding)); So we're using the internal forms only here, and we can propose whatever API we find best. Reading through the end of the file seems common enough, but I agree I would prefer reading the whole file here if I had to pick only one. > - We don't need some of added #include "utils/array.h" anymore. Ah yes, true. Do you want another patch version from me? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Dec 13, 2010 at 9:36 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Do you want another patch version from me? I'm looking at this patch and I'm confused. Why do we need this at all? pg_read_binary_file() seems like it might be useful to somebody, but I don't see what it has to do with extensions. And the rest of this doesn't appear to provide any new functionality. The extension mechanism hardly needs SQL-callable functions. As a side note, this comment almost makes sense, but not quite: + /* Abuse knowledge that we're bytea and text are both varlena */ ...but my real question is why any of this is necessary at all and what it has to do with extensions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote: > I'm looking at this patch and I'm confused. Why do we need this at > all? pg_read_binary_file() seems like it might be useful to somebody, > but I don't see what it has to do with extensions. And the rest of > this doesn't appear to provide any new functionality. The extension > mechanism hardly needs SQL-callable functions. Hmm, I've expected that the EXTENSION patch would use the SQL functions like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but it actually uses internal functions and nested DirectFunctionCalls. So, the most important part of this patch is allowing to read any files in the server file system. The current pg_read_file() allows to read only files under $PGDATA and pg_log. However, the interface of current pg_read_file() is mis-designed to read files in multi-byte encoding because 1. The encoding must be same with the server encoding. 2. Users need to specifycorrect offset in the file not to split multi-byte characters. So, it'd be better to improve pg_read_file() aside from EXTENSION anyway. I think pg_read_whole_binary_file() is one of the solutions for the issue. -- Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote: >> I'm looking at this patch and I'm confused. Why do we need this at >> all? pg_read_binary_file() seems like it might be useful to somebody, >> but I don't see what it has to do with extensions. And the rest of >> this doesn't appear to provide any new functionality. The extension >> mechanism hardly needs SQL-callable functions. > Hmm, I've expected that the EXTENSION patch would use the SQL functions > like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but > it actually uses internal functions and nested DirectFunctionCalls. > So, the most important part of this patch is allowing to read any > files in the server file system. The current pg_read_file() allows > to read only files under $PGDATA and pg_log. Has anyone thought twice about the security implications of that? Not to mention that in most cases, the very last thing we want is to have to specify an exact full path? I think we'd be better off insisting that the extension files be under sharedir or some such place. In any case, I concur with what I gather Robert is thinking, which is that there is no good reason to be exposing any of this at the SQL level. regards, tom lane
On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > Hmm, I've expected that the EXTENSION patch would use the SQL functions > like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but > it actually uses internal functions and nested DirectFunctionCalls. > So, the most important part of this patch is allowing to read any > files in the server file system. The current pg_read_file() allows > to read only files under $PGDATA and pg_log. As Tom says, this is clearly not going to fly on security grounds. > However, the interface of current pg_read_file() is mis-designed > to read files in multi-byte encoding because > 1. The encoding must be same with the server encoding. > 2. Users need to specify correct offset in the file > not to split multi-byte characters. > So, it'd be better to improve pg_read_file() aside from EXTENSION anyway. > I think pg_read_whole_binary_file() is one of the solutions for the issue. I don't have any problem with a separate patch to try to improve some of these issues, but this is supposedly part of the extensions work, yet (1) most of what's here has little to do with extensions and (2) extensions don't need this stuff exposed at the SQL level anyway. I'm inclined to mark this patch as Returned with Feedback. The portions of this patch that are trying to fix multi-byte encoding issues can be submitted as a separate patch that just does that. Whatever material here is relevant to extensions can either be resubmitted with all of these other things taken out, or just get absorbed into the main extensions patch if (as I suspect) there isn't enough there to warrant a separate patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro >> So, the most important part of this patch is allowing to read any >> files in the server file system. The current pg_read_file() allows >> to read only files under $PGDATA and pg_log. > > As Tom says, this is clearly not going to fly on security grounds. If it's a security hole, lo_import() should be also a hole because we can use lo_import() and SELECT * FROM pg_largeobject for the same purpose... > I don't have any problem with a separate patch to try to improve some > of these issues, but this is supposedly part of the extensions work, > yet (1) most of what's here has little to do with extensions and (2) > extensions don't need this stuff exposed at the SQL level anyway. I'm > inclined to mark this patch as Returned with Feedback. If so, I'm not sure why we need to split the EXTENSION patch into sub pieces. In my understanding, we did it because the sub pieces are also useful in standalone. The requirement for the pieces was changed and extended in discussions, but I hope the change will not be the reason to reject the patch. -- Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote: >> As Tom says, this is clearly not going to fly on security grounds. > If it's a security hole, lo_import() should be also a hole > because we can use lo_import() and SELECT * FROM pg_largeobject > for the same purpose... lo_import is superuser-only. If we design this feature so that it will forever have to be superuser-only, to get a behavior that I think we don't even *want*, I believe we're making a serious error. regards, tom lane
On Mon, Dec 13, 2010 at 10:21 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: >> I don't have any problem with a separate patch to try to improve some >> of these issues, but this is supposedly part of the extensions work, >> yet (1) most of what's here has little to do with extensions and (2) >> extensions don't need this stuff exposed at the SQL level anyway. I'm >> inclined to mark this patch as Returned with Feedback. > > If so, I'm not sure why we need to split the EXTENSION patch into sub pieces. > In my understanding, we did it because the sub pieces are also useful in > standalone. The requirement for the pieces was changed and extended in > discussions, but I hope the change will not be the reason to reject the patch. Well, I think it is best when a patch has just one purpose. This seems to be sort of an odd hodge-podge of things. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: > lo_import is superuser-only. If we design this feature so that it will > forever have to be superuser-only, to get a behavior that I think we > don't even *want*, I believe we're making a serious error. CREATE EXTENSION and pg_read_file() is also superuser-only, no? -- Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> lo_import is superuser-only. If we design this feature so that it will >> forever have to be superuser-only, to get a behavior that I think we >> don't even *want*, I believe we're making a serious error. > CREATE EXTENSION and pg_read_file() is also superuser-only, no? CREATE EXTENSION will be superuser to start with, no doubt, but I think we'll someday want to allow it to database owners, just as happened with CREATE LANGUAGE. Let's not build it on top of operations that inherently involve security problems, especially when there's no need to. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Has anyone thought twice about the security implications of that? > Not to mention that in most cases, the very last thing we want is to > have to specify an exact full path? Well, the security is left same as before, superuser only. And Itagaki showed that superuser are allowed to read any file anywhere already, so we didn't change anything here. > I think we'd be better off insisting that the extension files be under > sharedir or some such place. That's the case, but the rework of genfile.c is more general than just support for extension, or I wouldn't have been asked for a separate patch, would I? > In any case, I concur with what I gather Robert is thinking, which is > that there is no good reason to be exposing any of this at the SQL level. That used to be done this way, you know, in versions between 0 and 6 of the patch. Starting at version 7, the underlyiong facilities have been splitted and exposed, because of the file encoding and server encoding issues reported by Itagaki. I propose that more than 2 of you guys get in agreement on what the good specs are and wake me up after that so that I spawn the right version of the patch, and if necessary, revise it. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Tom Lane <tgl@sss.pgh.pa.us> writes: > CREATE EXTENSION will be superuser to start with, no doubt, but I think > we'll someday want to allow it to database owners, just as happened with > CREATE LANGUAGE. Let's not build it on top of operations that > inherently involve security problems, especially when there's no need > to. That boils down to moving the superuser() test in the right functions, it's now in the innermost facility to read files. If you have something precise enough for me to work on it, please say, but I guess you'd spend less time making the copy/paste in the code rather than in the mail. That schedule optimisation is for you to make, though. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Robert Haas <robertmhaas@gmail.com> writes: > Well, I think it is best when a patch has just one purpose. This > seems to be sort of an odd hodge-podge of things. The purpose here is clean-up the existing pg_read_file() facility so that it's easy to build pg_execute_sql_file() on top of it. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Dec 14, 2010 at 18:01, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> In any case, I concur with what I gather Robert is thinking, which is >> that there is no good reason to be exposing any of this at the SQL level. > > That used to be done this way, you know, in versions between 0 and 6 of > the patch. Starting at version 7, the underlyiong facilities have been > splitted and exposed, because of the file encoding and server encoding > issues reported by Itagaki. I'm confused which part of the patch is the point of the discussion. 1. Relax pg_read_file() to be able to read any files.2. pg_read_binary_file() 3. pg_execute_sql_string/file() As I pointed out, 1 is reasonable as long as we restrict the usage only to superuser. If we think it is a security hole, there are the same issue in lo_import() and COPY FROM by superuser. 2 is a *fix* for the badly-designed pg_read_file() interface. It should have returned bytea rather than text. 3 could simplify later EXTENSION patches, but it might not be a large help because we can just use SPI_exec() instead of them if we write codes with C. I think the most useful parts of the patch is reading a whole file with encoding, i.e., 1 and 2. -- Itagaki Takahiro
On Tue, Dec 14, 2010 at 11:48 AM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > I'm confused which part of the patch is the point of the discussion. > 1. Relax pg_read_file() to be able to read any files. > 2. pg_read_binary_file() > 3. pg_execute_sql_string/file() > > As I pointed out, 1 is reasonable as long as we restrict the usage > only to superuser. If we think it is a security hole, there are > the same issue in lo_import() and COPY FROM by superuser. > > 2 is a *fix* for the badly-designed pg_read_file() interface. > It should have returned bytea rather than text. > > 3 could simplify later EXTENSION patches, but it might not be > a large help because we can just use SPI_exec() instead of them > if we write codes with C. I think the most useful parts of the > patch is reading a whole file with encoding, i.e., 1 and 2. So there are really four changes in here, right? 1. Relax pg_read_file() to be able to read any files. 2. pg_read_binary_file() 3. pg_execute_sql_string()/file() 4. ability to read a file in a given encoding (rather than the client encoding) I think I agree that #1 doesn't open any security hole that doesn't exist already. We have no similar check for COPY, and both are superuser-only. I also see that this is useful for the extensions work, if that code wants to internally DirectFunctionCall to pg_read_file. I think #2 might be a nice thing to have, but I'm not sure what it has to do with extensions. I don't see why we need #3. I think #4 is useful. I am not clear whether it is needed for the extension stuff or not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > So there are really four changes in here, right? > 1. Relax pg_read_file() to be able to read any files. > 2. pg_read_binary_file() > 3. pg_execute_sql_string()/file() > 4. ability to read a file in a given encoding (rather than the client encoding) > I think I agree that #1 doesn't open any security hole that doesn't > exist already. That function would never have been accepted into core at all without a locked-down range of how much of the filesystem it would let you get at. There is nothing whatsoever in the extensions proposal that justifies dropping that restriction. If you want to put it up as a separately proposed, separately justified patch, go ahead ... but I'll vote against it even then. (I will also point out that on SELinux-based systems, relaxing the restriction would be completely useless anyway.) > I think #2 might be a nice thing to have, but I'm not sure what it has > to do with extensions. Agreed. There might be some use for #4 in connection with extensions, but I don't see that #2 is related. BTW, it appears to me that pg_read_file expects server encoding not client encoding. Minor detail only, but let's be clear what it is we're talking about. regards, tom lane
On Tue, Dec 14, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> So there are really four changes in here, right? > >> 1. Relax pg_read_file() to be able to read any files. >> 2. pg_read_binary_file() >> 3. pg_execute_sql_string()/file() >> 4. ability to read a file in a given encoding (rather than the client encoding) > >> I think I agree that #1 doesn't open any security hole that doesn't >> exist already. > > That function would never have been accepted into core at all without a > locked-down range of how much of the filesystem it would let you get at. I have some angst about opening it up wide, but I'm also having a hard time seeing what problem it creates that you can't already create with COPY FROM or lo_import(). >> I think #2 might be a nice thing to have, but I'm not sure what it has >> to do with extensions. > > Agreed. There might be some use for #4 in connection with extensions, > but I don't see that #2 is related. > > BTW, it appears to me that pg_read_file expects server encoding not > client encoding. Minor detail only, but let's be clear what it is > we're talking about. OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tom Lane <tgl@sss.pgh.pa.us> writes: > Robert Haas <robertmhaas@gmail.com> writes: >> So there are really four changes in here, right? > >> 1. Relax pg_read_file() to be able to read any files. >> 2. pg_read_binary_file() >> 3. pg_execute_sql_string()/file() >> 4. ability to read a file in a given encoding (rather than the client encoding) > >> I think I agree that #1 doesn't open any security hole that doesn't >> exist already. > > That function would never have been accepted into core at all without a > locked-down range of how much of the filesystem it would let you get at. Ok. Previously pg_read_file() only allows absolute file names that point into DataDir or into Log_directory. It used not to work in the first versions of the extension's patch, but with the current code, the check passes on a development install here: extension.c is only giving genfile.c absolute file names. Please note that debian will default to have DataDir in a different place than the sharepath: http://packages.debian.org/sid/amd64/postgresql-contrib-9.0/filelist PGDATA: /var/lib/postgresql/9.1/main sharepath: /usr/share/postgresql/9.1/contrib libdir: /usr/lib/postgresql/9.1/lib So I'm not sure how if it will play nice with such installs, or if there's already some genfile.c patching on debian. >> I think #2 might be a nice thing to have, but I'm not sure what it has >> to do with extensions. > > Agreed. There might be some use for #4 in connection with extensions, > but I don't see that #2 is related. Well, in fact, the extension's code is using either execute_sql_file() or read_text_file_with_endoding() then @extschema@ replacement then execute_sql_string(), all those functions called directly thanks to #include "utils/genfile.h". No DirectFunctionCall'ing, we can easily remove SQL callable forms. So what we need is 2, 3 and 4 (because 4 builds on 2). > BTW, it appears to me that pg_read_file expects server encoding not > client encoding. Minor detail only, but let's be clear what it is > we're talking about. Hence the refactoring in the patch. Ask Itagaki for details with funny environments using some file encoding that does not exists in the server yet ain't client_encoding and can't be. I didn't follow the use case in details, but he was happy with the current way of doing things and not with any previous one. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Dec 15, 2010 at 04:39, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Well, in fact, the extension's code is using either execute_sql_file() > or read_text_file_with_endoding() then @extschema@ replacement then > execute_sql_string(), all those functions called directly thanks to > #include "utils/genfile.h". No DirectFunctionCall'ing, we can easily > remove SQL callable forms. > > So what we need is 2, 3 and 4 (because 4 builds on 2). No, 3 is not needed. You can use SPI_exec() directly instead of exporting execute_sql_string(). -- Itagaki Takahiro
On Wed, Dec 15, 2010 at 03:42, Robert Haas <robertmhaas@gmail.com> wrote: >>> I think #2 might be a nice thing to have, but I'm not sure what it has >>> to do with extensions. >> >> Agreed. There might be some use for #4 in connection with extensions, >> but I don't see that #2 is related. >> >> BTW, it appears to me that pg_read_file expects server encoding not >> client encoding. Minor detail only, but let's be clear what it is >> we're talking about. EXTENSION will use #2 with convert_from() for $4 like this: Datum sql = replace( convert_from(pg_read_binary_file($path), $encoding), '@extschema@', $schema);SPI_exec(TextDatumGetCString(sql)); I think it is a more flexible solution than adding 'encoding' parameter to pg_read_file(). -- Itagaki Takahiro
On Tue, Dec 14, 2010 at 9:25 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > On Wed, Dec 15, 2010 at 03:42, Robert Haas <robertmhaas@gmail.com> wrote: >>>> I think #2 might be a nice thing to have, but I'm not sure what it has >>>> to do with extensions. >>> >>> Agreed. There might be some use for #4 in connection with extensions, >>> but I don't see that #2 is related. >>> >>> BTW, it appears to me that pg_read_file expects server encoding not >>> client encoding. Minor detail only, but let's be clear what it is >>> we're talking about. > > EXTENSION will use #2 with convert_from() for $4 like this: > > Datum sql = replace( > convert_from(pg_read_binary_file($path), $encoding), > '@extschema@', $schema); > SPI_exec(TextDatumGetCString(sql)); > > I think it is a more flexible solution than adding 'encoding' > parameter to pg_read_file(). It seems like pg_read_binary_file() is good to have regardless of whatever else we decide to do here. Should we pull that part out and commit it separately? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Dec 15, 2010 at 12:20, Robert Haas <robertmhaas@gmail.com> wrote: > It seems like pg_read_binary_file() is good to have regardless of > whatever else we decide to do here. Should we pull that part out and > commit it separately? OK, I'll do that, but I have some questions:#1 Should we add 'whole' versions of read functions in Dimitri's work?#2 Shouldwe allow additional directories? In the discussion, no restriction seems to be a bad idea. But EXTENSION requires to read PGSHARE or some system directories? #2 can be added separately from the first change, but I'd like to add #1 at the same time if required. Or, if we're planning not to use pg_read_file functions in the EXTENSION patch, we don't need #2 anyway. -- Itagaki Takahiro
On Tue, Dec 14, 2010 at 10:43 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote: > On Wed, Dec 15, 2010 at 12:20, Robert Haas <robertmhaas@gmail.com> wrote: >> It seems like pg_read_binary_file() is good to have regardless of >> whatever else we decide to do here. Should we pull that part out and >> commit it separately? > > OK, I'll do that, but I have some questions: > #1 Should we add 'whole' versions of read functions in Dimitri's work? > #2 Should we allow additional directories? In the discussion, > no restriction seems to be a bad idea. But EXTENSION requires > to read PGSHARE or some system directories? > > #2 can be added separately from the first change, > but I'd like to add #1 at the same time if required. > > Or, if we're planning not to use pg_read_file functions in the > EXTENSION patch, we don't need #2 anyway. I think it's still unclear what we want to do about #2, so let's focus on the parts we are most certain about first. The whole-file versions seem like a good idea - my only hesitation is, I'm not sure why we didn't include that functionality originally. It seems obviously useful, so does that mean that it was omitted on purpose for some reason? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Dec 15, 2010 at 12:55, Robert Haas <robertmhaas@gmail.com> wrote: >>> It seems like pg_read_binary_file() is good to have regardless of >>> whatever else we decide to do here. Should we pull that part out and >>> commit it separately? >> > The whole-file versions seem like a good idea - my only hesitation is, > I'm not sure why we didn't include that functionality originally. It > seems obviously useful, so does that mean that it was omitted on > purpose for some reason? I applied the attached patch extracted from Dimitri's work. One difference is 'offset' argument is removed from 'whole' mode. So, we'll have (path, offset, length) and (path) versions. Checking with convert_and_check_filename is left as-is. -- Itagaki Takahiro
Вложения
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes: > I applied the attached patch extracted from Dimitri's work. Thanks! I'm working on next extension's patch, merging now. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support