Re: making the backend's json parser work in frontend code
От | David Steele |
---|---|
Тема | Re: making the backend's json parser work in frontend code |
Дата | |
Msg-id | 3649c0bb-a4f2-e59b-3322-ea740a179b2f@pgmasters.net обсуждение исходный текст |
Ответ на | Re: making the backend's json parser work in frontend code (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: making the backend's json parser work in frontend code
|
Список | pgsql-hackers |
Hi Robert, On 1/17/20 2:33 PM, Robert Haas wrote: > On Fri, Jan 17, 2020 at 12:36 PM David Steele <david@pgmasters.net> wrote: > >> I used the callbacks because that's the first method I found but it >> seems like json_lex() might be easier to use in practice. > > Ugh, really? That doesn't seem like it would be nice at all. I guess it's a matter of how you want to structure the code. >> I think it's an issue that the entire string must be passed to the lexer >> at once. That will not be great for large manifests. However, I don't >> think it will be all that hard to implement an optional "want more" >> callback in the lexer so JSON data can be fed in from the file in chunks. > > I thought so initially, but now I'm not so sure. The thing is, you > actually need all the manifest data in memory at once anyway, or so I > think. You're essentially doing a "full join" between the contents of > the manifest and the contents of the file system, so you've got to > scan one (probably the filesystem) and then mark entries in the other > (probably the manifest) used as you go. Yeah, having a copy of the manifest in memory is the easiest way to do validation, but I think you'd want it in a structured format. We parse the file part of the manifest into a sorted struct array which we can then do binary searches on by filename. >> So, that just leaves ereport() as the largest remaining issue? I'll >> look at that today and Tuesday and see what I can work up. > > PFA my work on that topic. As compared with my previous patch series, > the previous 0001 is dropped and what are now 0001 and 0002 are the > same as patches from the previous series. 0003 and 0004 are aiming > toward getting rid of ereport() and, I believe, show a plausible > strategy for so doing. There are, possibly, things not to like here, > and it's certainly incomplete, but I think I kinda like this > direction. Comments appreciated. > > 0003 nukes lex_accept(), inlining the logic into callers. I found that > the refactoring I wanted to do in 0004 was pretty hard without this, > and it turns out to save code, so I think this is a good idea > independently of anything else. No arguments here. > 0004 adjusts many functions in jsonapi.c to return a new enumerated > type, JsonParseErrorType, instead of directly doing ereport(). It adds > a new function that takes this value and a lexing context and throws > an error. The JSON_ESCAPING_INVALID case is wrong and involves a gross > hack, but that's fixable with another field in the lexing context. > More work is needed to really bring this up to scratch, but the idea > is to make this code have a soft dependency on ereport() rather than a > hard one. My first reaction was that if we migrated ereport() first it would make this all so much easier. Now I'm no so sure. Having a general json parser in libcommon that is not tied into a specific error handling/logging system actually sounds like a really nice thing to have. If we do migrate ereport() the user would always have the choice to call throw_a_json_error() if they wanted to. There's also a bit of de-duplication of error messages, which is nice, especially in the case JSON_ESCAPING_INVALID. And I agree that this case can be fixed with another field in the lexer -- or at least so it seems to me. Though, throw_a_json_error() is not my favorite name. Perhaps json_ereport()? Regards, -- -David david@pgmasters.net
В списке pgsql-hackers по дате отправления: