Discussion: How to return a large String with C
Hi all,

I want to write a function in C which retrieves a large string from a
table, does some work on it and returns the result to the surrounding
SELECT. (e.g. SELECT my_c_func(text);)

So far I have been successfully doing calls to SPI, select the data from the
table and return it. However, this works only with string not larger than
page size of char[8192].
The strings I expect are much longer and this causes the backend to crash.

Printing the string via elog shows the correct content, returning the
string to the postmaster makes the backend crash.

How do I do this correctly?

The code snippet which works looks like this:

PG_FUNCTION_INFO_V1(my_c_func);

Datum
my_c_func (PG_FUNCTION_ARGS)
(...)
    char buf[8192];

    if (ret > 0 && SPI_tuptable != NULL)
    {
        TupleDesc tupdesc = SPI_tuptable->tupdesc;
        SPITupleTable *tuptable = SPI_tuptable;
        int i,j;

        for (i = 0; i < proc; i++)
        {
            HeapTuple tuple = tuptable->vals[i];

            for (i = 1, buf[0] = 0; i <= tupdesc->natts; i++)
            {
                snprintf(buf + strlen (buf), sizeof(buf) - strlen(buf), " %s%s",
                         SPI_getvalue(tuple, tupdesc, i),
                         (i == tupdesc->natts) ? " " : " |");
                appendStringInfo(&result_buf, SPI_getvalue(tuple, tupdesc, i));
            }
        }
    }
    SPI_finish();
    (... do some work here ...)
    elog(INFO, "Content: %s <==###", result_buf.data);
    PG_RETURN_TEXT_P(GET_TEXT((char *)query_buf.data));
(...)
Stefan Niantschur <sniantschur@web.de> writes:
> So far I have been successfully doing calls to SPI, select the data from the
> table and return it. However, this works only with string not larger than
> page size of char[8192].
> The strings I expect are much longer and this causes the backend to crash.

Hardly surprising when you're printing the string into a fixed-size 8K buffer.
The buffer overflow is smashing the stack, in particular the function's
return address.

regards, tom lane
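To illustrate the alternative to a fixed-size buffer, here is a minimal plain-C sketch of a buffer that grows on append, so its capacity never has to be guessed in advance. It is only an analogy: GrowBuf and its functions are hypothetical names invented for this example, they use malloc/realloc rather than the backend's allocator, and they are not PostgreSQL's StringInfo.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; a plain-C stand-in, not PostgreSQL's StringInfo. */
typedef struct GrowBuf
{
    char   *data;   /* NUL-terminated once initialised */
    size_t  len;    /* bytes used, excluding the terminator */
    size_t  cap;    /* bytes allocated */
} GrowBuf;

/* Allocate a small initial buffer. Returns 1 on success, 0 on failure. */
int growbuf_init(GrowBuf *b)
{
    assert(b != NULL);
    b->cap = 16;
    b->len = 0;
    b->data = malloc(b->cap);
    if (b->data == NULL)
        return 0;
    b->data[0] = '\0';
    return 1;
}

/* Append s, doubling the capacity as often as needed. Returns 1 on success. */
int growbuf_append(GrowBuf *b, const char *s)
{
    size_t n, need, cap;
    char  *p;

    assert(b != NULL && b->data != NULL && s != NULL);
    n = strlen(s);
    need = b->len + n + 1;          /* +1 for the terminator */
    if (need > b->cap)
    {
        cap = b->cap;
        while (cap < need)
            cap *= 2;
        p = realloc(b->data, cap);
        if (p == NULL)
            return 0;               /* old buffer is still valid */
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n + 1);   /* copies the terminator too */
    b->len += n;
    return 1;
}

/* Release the buffer. */
void growbuf_free(GrowBuf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

With this shape, appending 100000 bytes works the same way as appending 10; the only failure mode is running out of memory, which the return value reports.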
Tom Lane wrote:
> Stefan Niantschur <sniantschur@web.de> writes:
>> So far I have been successfully doing calls to SPI, select the data from the
>> table and return it. However, this works only with string not larger than
>> page size of char[8192].
>> The strings I expect are much longer and this causes the backend to crash.
>
> Hardly surprising when you're printing the string into a fixed-size 8K buffer.
> The buffer overflow is smashing the stack, in particular the function's
> return address.

He also uses the variable "i" in *both* parts of his nested loop.

Stefan, you should probably pick up a C programming book before going
too much further with this.

Colin
On Sun, 17 Feb 2008 09:17:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stefan Niantschur <sniantschur@web.de> writes:
> > So far I have been successfully doing calls to SPI, select the data
> > from the table and return it. However, this works only with string
> > not larger than page size of char[8192].
> > The strings I expect are much longer and this causes the backend to
> > crash.
>
> Hardly surprising when you're printing the string into a fixed-size
> 8K buffer. The buffer overflow is smashing the stack, in particular
> the function's return address.
>
> regards, tom lane

Yes, I know, but the backend does not allow for a bigger buffer. Trying
to use an 80K buffer (char[81920]) did not work and returns:

INFO:  string-size : 48015
INFO:  +++++++++++++++++++++++++++
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The surprising thing is that the data can be displayed using elog but
not returned with a string pointer.

Is there any good example which I could read?

Best Regards,
Stefan
> Yes, I know, but the backend does not allow for a bigger buffer. Trying
> to use a 80K (char[81920])buffer did not work and returns:
>
> INFO:  string-size : 48015
> INFO:  +++++++++++++++++++++++++++
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> The surprising thing is that the data can be displayed using elog but
> not returned with a string pointer.

You cannot just return a pointer to a stack buffer (one local to the
function): its storage is released when the function returns. You must
explicitly palloc the required chunk of memory.
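A small plain-C sketch of that point, using malloc as a stand-in for palloc (an assumption made so the example compiles outside the backend; heap_copy and format_row are hypothetical names): the local array is used only as scratch space, and what gets returned is a heap copy that outlives the call.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * WRONG, shown only as a comment: buf lives in the stack frame of the
 * call, so the returned pointer dangles as soon as the function returns.
 *
 *   char *broken(void) { char buf[32]; strcpy(buf, "hi"); return buf; }
 */

/* Return a heap copy of src that the caller owns; NULL if out of memory. */
char *heap_copy(const char *src)
{
    size_t n;
    char  *dst;

    assert(src != NULL);
    n = strlen(src) + 1;            /* include the terminator */
    dst = malloc(n);
    if (dst != NULL)
        memcpy(dst, src, n);
    return dst;
}

/* Format into a local scratch buffer, then hand back a heap copy of it. */
char *format_row(const char *a, const char *b)
{
    char local[64];

    assert(a != NULL && b != NULL);
    snprintf(local, sizeof(local), "%s | %s", a, b);  /* truncates safely */
    return heap_copy(local);
}
```

The caller frees the result with free(). In a backend function the equivalent allocation would come from palloc, and the memory context it is made in then decides how long the result survives.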
Stefan Niantschur <sniantschur@web.de> writes:
> On Sun, 17 Feb 2008 09:17:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hardly surprising when you're printing the string into a fixed-size
>> 8K buffer. The buffer overflow is smashing the stack, in particular
>> the function's return address.

> Yes, I know, but the backend does not allow for a bigger buffer. Trying
> to use a 80K (char[81920])buffer did not work and returns:

So you've got some other bug in code you didn't show us. It's highly
unlikely that you wouldn't be able to allocate an 80K buffer.
(Whether that's big enough for your data even yet is a separate question.)

What I was wondering was why you even bothered with the char[] buffer,
when it looked like the actually useful return value was being
accumulated in an expansible StringInfo buffer.

regards, tom lane
On Sun, 17 Feb 2008 14:28:18 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stefan Niantschur <sniantschur@web.de> writes:
> > On Sun, 17 Feb 2008 09:17:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hardly surprising when you're printing the string into a fixed-size
> >> 8K buffer. The buffer overflow is smashing the stack, in particular
> >> the function's return address.
>
> > Yes, I know, but the backend does not allow for a bigger buffer.
> > Trying to use a 80K (char[81920])buffer did not work and returns:
>
> So you've got some other bug in code you didn't show us. It's highly
> unlikely that you wouldn't be able to allocate an 80K buffer.
> (Whether that's big enough for your data even yet is a separate
> question.)
>
> What I was wondering was why you even bothered with the char[] buffer,
> when it looked like the actually useful return value was being
> accumulated in an expansible StringInfo buffer.
>
> regards, tom lane
>

Please find below the most recent code snippet. It is mainly based on
examples from the pg documentation:

-------8< -------
#define GET_TEXT(cstrp) DatumGetTextP(DirectFunctionCall1(textin, CStringGetDatum(cstrp)))
(...)
    char *cres;
    int ret;
(...)
    if ((ret = SPI_connect()) < 0)
    {
        elog(ERROR, "get_info: SPI_connect returned %d", ret);
    }

    proc = SPI_processed;

    initStringInfo(&result_buf);
    if (ret > 0 && SPI_tuptable != NULL)
    {
        TupleDesc tupdesc = SPI_tuptable->tupdesc;
        SPITupleTable *tuptable = SPI_tuptable;
        int i,k;

        for (i = 0; i < proc; i++)
        {
            HeapTuple tuple = tuptable->vals[i];

            for (k = 1; k <= tupdesc->natts; k++)
            {
                cres = SPI_getvalue(tuple, tupdesc, k);
                appendStringInfo(&result_buf, SPI_getvalue(tuple, tupdesc, k));
                elog(INFO, "info: %s", cres);
            }
        }
    }
    SPI_finish();
    elog(INFO, "---");
    PG_RETURN_TEXT_P(GET_TEXT(result_buf.data));
------->8 -------

I still have the problem that I can use the C function via
select from within psql if the result is not too long.

If the result is a really long string, then I see this:

server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The problem is not the query via SPI (this works and I can see the
result in elog). The issue I have is that I cannot see the string in
psql when it is a long string. For short strings it does the trick.
Even if I use a (char *) and return it, this only works for short
strings.

As short strings are not the problem, what do I need to do to get a
long string handed back to my select in psql?

Best Regards,
Stefan
On Sun, 17 Feb 2008 14:28:18 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stefan Niantschur <sniantschur@web.de> writes:
> > On Sun, 17 Feb 2008 09:17:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hardly surprising when you're printing the string into a fixed-size
> >> 8K buffer. The buffer overflow is smashing the stack, in particular
> >> the function's return address.
>
> > Yes, I know, but the backend does not allow for a bigger buffer.
> > Trying to use a 80K (char[81920])buffer did not work and returns:
>
> So you've got some other bug in code you didn't show us. It's highly
> unlikely that you wouldn't be able to allocate an 80K buffer.
> (Whether that's big enough for your data even yet is a separate
> question.)
>
> What I was wondering was why you even bothered with the char[] buffer,
> when it looked like the actually useful return value was being
> accumulated in an expansible StringInfo buffer.
>
> regards, tom lane
>

Hi all,

now, after some days of intensive brainwork, I could solve the problem
with a slight change in the code.

It turned out that palloc() did not reliably work for my purpose. So
before calling SPI_finish() I am doing the following:

    text *out = (text *) SPI_palloc(strlen(cres) + VARHDRSZ);
    memcpy(VARDATA(out), cres, strlen(cres));
    VARATT_SIZEP(out) = strlen(cres) + VARHDRSZ;
    SPI_finish();

which allocates the needed memory in the upper execution context. This
allows for passing the really long string back to the select statement
without crashing the database.

If there is a better way to do this, please let me know.

Best Regards,
Stefan
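The three lines before SPI_finish() build a length-prefixed value: a header that records the total size, followed by the string bytes without a terminating NUL. The toy model below shows that layout in plain C. ToyText, TOY_HDRSZ and the two functions are hypothetical names for this sketch only; it uses malloc and a simple 4-byte header, and it is not PostgreSQL's actual text representation or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy length-prefixed string: a size header followed by the payload. */
typedef struct ToyText
{
    uint32_t total_size;   /* header bytes + payload bytes */
    char     data[];       /* payload, NOT NUL-terminated */
} ToyText;

/* Size of the header that precedes the payload. */
#define TOY_HDRSZ offsetof(ToyText, data)

/* Build a ToyText from a C string; the caller frees it with free(). */
ToyText *toy_text_from_cstring(const char *s)
{
    size_t   n;
    ToyText *t;

    assert(s != NULL);
    n = strlen(s);
    t = malloc(TOY_HDRSZ + n);       /* no room for a terminator: none is stored */
    if (t == NULL)
        return NULL;
    t->total_size = (uint32_t) (TOY_HDRSZ + n);
    memcpy(t->data, s, n);           /* copy payload bytes only */
    return t;
}

/* Number of payload bytes, recovered from the header. */
size_t toy_text_payload_len(const ToyText *t)
{
    assert(t != NULL);
    return t->total_size - TOY_HDRSZ;
}
```

Because the length lives in the header, a reader of such a value must never call strlen() on the payload; it has to use the recorded size instead.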
Stefan Niantschur <sniantschur@web.de> writes:
> Please find below the most recent code snippet. It is mainly based on
> examples from the pg documentation:

> -------8< -------
> #define GET_TEXT(cstrp) DatumGetTextP(DirectFunctionCall1(textin,
> CStringGetDatum(cstrp)))
> (...)
> char *cres;
> int ret;
> (...)
> if ((ret = SPI_connect()) < 0)
> {
>   elog(ERROR, "get_info: SPI_connect returned %d", ret);
> }

> proc = SPI_processed;

> initStringInfo(&result_buf);
> if (ret > 0 && SPI_tuptable != NULL)
> {
>   TupleDesc tupdesc = SPI_tuptable->tupdesc;
>   SPITupleTable *tuptable = SPI_tuptable;
>   int i,k;
>   for (i = 0; i < proc; i++)
>   {
>     HeapTuple tuple = tuptable->vals[i];
>     for (k = 1; k <= tupdesc->natts; k++)
>     {
>       cres = SPI_getvalue(tuple, tupdesc, k);
>       appendStringInfo(&result_buf, SPI_getvalue(tuple, tupdesc, k));
>       elog(INFO, "info: %s", cres);
>     }
>   }
> }
> SPI_finish();
> elog(INFO, "---");
> PG_RETURN_TEXT_P(GET_TEXT(result_buf.data));
> ------->8 -------

> I still have the problem that I can use the C function via
> select from within psql if the result is not too long.

I don't think length has anything to do with it; rather the problem is
that you're trying to return data that's already been pfree'd when you
did SPI_finish().

Probably the simplest fix for this case is to put the initStringInfo
call before SPI_connect, so that the result_buf.data buffer exists
in the outer function context and not in the temporary SPI context.

BTW, it is strongly advisable to do development/testing of C code
in a backend that's been built with --enable-cassert. Had you done
so, this mistake would have been much more obvious.

regards, tom lane
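The lifetime rule described here can be pictured with a toy model in plain C: two arenas, one standing in for the temporary SPI context and one for the outer function context. Everything below (Arena, arena_init, arena_alloc, arena_strdup, arena_destroy) is hypothetical and written only for this sketch; real PostgreSQL memory contexts are considerably more elaborate. The point is just that a result must live in the outer arena before the inner one is torn down.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy bump allocator: everything allocated from it dies together. */
typedef struct Arena
{
    char   *base;   /* one malloc'd block */
    size_t  used;   /* bytes handed out so far */
    size_t  cap;    /* size of the block */
} Arena;

/* Returns 1 on success, 0 if the block could not be allocated. */
int arena_init(Arena *a, size_t cap)
{
    assert(a != NULL && cap > 0);
    a->base = malloc(cap);
    a->used = 0;
    a->cap = (a->base != NULL) ? cap : 0;
    return a->base != NULL;
}

/* Hand out n bytes (byte-aligned only), or NULL if the arena is full. */
void *arena_alloc(Arena *a, size_t n)
{
    void *p;

    assert(a != NULL && a->base != NULL);
    if (n > a->cap - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += n;
    return p;
}

/* Copy a C string into the arena; the copy lives as long as the arena. */
char *arena_strdup(Arena *a, const char *s)
{
    size_t n;
    char  *p;

    assert(s != NULL);
    n = strlen(s) + 1;
    p = arena_alloc(a, n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Free the whole arena; every pointer handed out from it is now invalid. */
void arena_destroy(Arena *a)
{
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

In this model, a string created with arena_strdup(&inner, ...) is gone after arena_destroy(&inner), whereas a copy made into the outer arena beforehand remains valid. That mirrors the two fixes discussed in the thread: build the result in the outer context from the start, or copy it there before the temporary context is released.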
On Mon, 18 Feb 2008 14:15:14 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stefan Niantschur <sniantschur@web.de> writes:
> > Please find below the most recent code snippet. It is mainly based
> > on examples from the pg documentation:
>
> > -------8< -------
> > #define GET_TEXT(cstrp) DatumGetTextP(DirectFunctionCall1(textin,
> > CStringGetDatum(cstrp)))
> > (...)
> > char *cres;
> > int ret;
> > (...)
> > if ((ret = SPI_connect()) < 0)
> > {
> >   elog(ERROR, "get_info: SPI_connect returned %d", ret);
> > }
>
> > proc = SPI_processed;
>
> > initStringInfo(&result_buf);
> > if (ret > 0 && SPI_tuptable != NULL)
> > {
> >   TupleDesc tupdesc = SPI_tuptable->tupdesc;
> >   SPITupleTable *tuptable = SPI_tuptable;
> >   int i,k;
> >   for (i = 0; i < proc; i++)
> >   {
> >     HeapTuple tuple = tuptable->vals[i];
> >     for (k = 1; k <= tupdesc->natts; k++)
> >     {
> >       cres = SPI_getvalue(tuple, tupdesc, k);
> >       appendStringInfo(&result_buf, SPI_getvalue(tuple, tupdesc, k));
> >       elog(INFO, "info: %s", cres);
> >     }
> >   }
> > }
> > SPI_finish();
> > elog(INFO, "---");
> > PG_RETURN_TEXT_P(GET_TEXT(result_buf.data));
> > ------->8 -------
>
> > I still have the problem that I can use the C function via
> > select from within psql if the result is not too long.
>
> I don't think length has anything to do with it; rather the problem is
> that you're trying to return data that's already been pfree'd when you
> did SPI_finish().
>
> Probably the simplest fix for this case is to put the initStringInfo
> call before SPI_connect, so that the result_buf.data buffer exists
> in the outer function context and not in the temporary SPI context.
>
> BTW, it is strongly advisable to do development/testing of C code
> in a backend that's been built with --enable-cassert. Had you done
> so, this mistake would have been much more obvious.
>
> regards, tom lane

That was also one of my attempts. I even changed my code from calling
initStringInfo to initialising the buffer with

    StringInfo result_buf = makeStringInfo();

right at the beginning of my function, before anything else happened.
According to the documentation, this should palloc the buffer.

However, I reproducibly ended up with crashes when the string length
exceeded a certain size. So SPI_palloc() helped me to solve the problem.

Regards,
Stefan