Re: Let file_fdw access COPY FROM PROGRAM

Поиск
Список
Период
Сортировка
От Corey Huinker
Тема Re: Let file_fdw access COPY FROM PROGRAM
Дата
Msg-id CADkLM=fw2AvNKxbMs3yNNC1CN8_5=2MOY3X_sjVjqRdWeXV6qA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Let file_fdw access COPY FROM PROGRAM  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: Let file_fdw access COPY FROM PROGRAM  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
On Tue, Sep 6, 2016 at 9:46 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2016/09/07 3:12, Corey Huinker wrote:
> On Fri, Sep 2, 2016 at 5:07 AM, Amit Langote wrote:
>> I am not familiar with win32 stuff too, so I don't have much to say about
>> that.  Maybe someone else can chime in to help with that.
>
> The regressions basically *can't* test this because we'd need a shell
> command we know works on any architecture.

OK.

Well...maybe not, depending on what Craig and other can do to educate me about the TAP tests.
 
>
> Changing table-level options requires superuser privileges, for security
> reasons: only a superuser should be able to determine which file is read,
> or which program is run. In principle non-superusers could be allowed to
> change the other options, but that's not supported at present.

Hmm, just a little modification would make it better for me:

... for security reasons.  For example, only superuser should be able to
determine which file read or which program is run.

Although that could be just me.

In this case "determine" is unclear whether a non-superuser can set the program to be run, or is capable of knowing which program is set to be run by the fdw.

We may want some more opinions on what is the most clear.
 

>> @@ -632,11 +671,18 @@ fileBeginForeignScan(ForeignScanState *node, int
>> eflags)
>>       * Create CopyState from FDW options.  We always acquire all columns,
>> so
>>       * as to match the expected ScanTupleSlot signature.
>>       */
>> -    cstate = BeginCopyFrom(node->ss.ss_currentRelation,
>> -                           filename,
>> -                           false,
>> -                           NIL,
>> -                           options);
>> +    if(filename)
>> +        cstate = BeginCopyFrom(node->ss.ss_currentRelation,
>> +                               filename,
>> +                               false,
>> +                               NIL,
>> +                               options);
>> +    else
>> +        cstate = BeginCopyFrom(node->ss.ss_currentRelation,
>> +                               program,
>> +                               true,
>> +                               NIL,
>> +                               options)
>>
>> Like I mentioned above, if there was a is_program Boolean flag instead of
>> separate filename and program, this would be just:
>>
>> +    cstate = BeginCopyFrom(node->ss.ss_currentRelation,
>> +                           filename,
>> +                           is_program,
>> +                           NIL,
>> +                           options);
>>
>> That would get rid of if-else blocks here and a couple of other places.
>
> It would, pushing the complexity out to the user. Which could be fine, but
> if we do that then "filename" is a misnomer.

This is an internal state so I'm not sure how this would be pushing
complexity out to the user.  Am I perhaps misreading what you said?

Indeed, it is internal state. Maybe we rename the variable file_command or something.
 

What a user sees is that there are two separate foreign table options -
filename and program.  That we handle them internally using a string to
identify either file or program and a Boolean flag to show which of the
two is just an internal implementation detail.

COPY does it that way internally and I just saw that psql's \copy does it
the same way too.  In the latter's case, following is the options struct
(src/bin/psql/copy.c):

struct copy_options
{
    char       *before_tofrom;  /* COPY string before TO/FROM */
    char       *after_tofrom;   /* COPY string after TO/FROM filename */
    char       *file;           /* NULL = stdin/stdout */
    bool        program;        /* is 'file' a program to popen? */
    bool        psql_inout;     /* true = use psql stdin/stdout */
    bool        from;           /* true = FROM, false = TO */
};

But as you said above, this could be deferred to the committer.

Yeah, and that made for zero storage savings: a char pointer which is never assigned a string takes up as much space as a 4-byte-aligned boolean. And the result is that "file" really means program, which I found slightly awkward.


>> diff --git a/contrib/file_fdw/input/file_fdw.source
>> b/contrib/file_fdw/input/file_fdw.source
>> index 685561f..eae34a4 100644
>> --- a/contrib/file_fdw/input/file_fdw.source
>> +++ b/contrib/file_fdw/input/file_fdw.source
>>
>> You forgot to include expected output diffs.
>
> Having regression tests for this is extremely problematic, because the
> program invoked would need to be an invokable command on any architecture
> supported by postgres. I'm pretty sure no such command exists.

Craig and Michael elsewhere suggested something about adding TAP tests. If
that helps the situation, maybe you could.

Yeah, I need to educate myself about those.
 

I will mark the CF entry status to "Waiting on author" till you post a new
patch.


Thanks.
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: Let file_fdw access COPY FROM PROGRAM
Следующее
От: Corey Huinker
Дата:
Сообщение: Re: Let file_fdw access COPY FROM PROGRAM