Re: Atomic access to large arrays
От | Oleg Bartunov |
---|---|
Тема | Re: Atomic access to large arrays |
Дата | |
Msg-id | Pine.LNX.4.64.0907220910220.8065@sn.sai.msu.ru обсуждение исходный текст |
Ответ на | Atomic access to large arrays ("Victor de Buen (Bayes)" <vdebuen@bayesinf.com>) |
Список | pgsql-performance |
Victor, Just wondering why do you use array ? Oleg On Wed, 22 Jul 2009, Victor de Buen (Bayes) wrote: > Hi > > I'm storing historical meteorological gridded data from GFS ( > http://www.nco.ncep.noaa.gov/pmb/products/gfs/) into an array field in a > table like this: > > CREATE TABLE grid_f_data_i2 ( > //Specifies the variable and other features of data > id_inventory integer REFERENCES grid_d_inventory(id_inventory), > //A new grid is available each 3 hours since 5 years ago > dh_date timestamp, > //Values are scaled to be stored as signed integers of 2 bytes > vl_grid smallint[361][720], > CONSTRAINT meteo_f_gfs_tmp PRIMARY KEY > (co_inventory, dh_date) > ); > > Dimensions of each value of field vl_grid are (lat:361 x lon:720 = 259920 > cells} for a grid of 0.5 degrees (about each 55 Km) around the world. So, > vl_grid[y][x] stores the value at dh_date of a meteorological variable > specified by id_inventory in the geodesic point > > latitude = -90 + y*0.5 > longitude = x*0.5 > > The reverse formula for the closest point in the grid of an arbitary > geodesic point will be > > y = Round((latitude+90) * 2 > x = Round(longitude*2) > > Field vl_grid is stored in the TOAST table and has a good compression level. > PostgreSql is the only one database that is able to store this huge amount > of data in only 34 GB of disk. It's really great system. Queries returning > big rectangular areas are very fast, but the target of almost all queries is > to get historical series for a geodesic point > > SELECT dh_date, vl_grid[123][152] > FROM grid_f_data_i2 > WHERE id_inventory = 6 > ORDER BY dh_date > > In this case, atomic access to just a cell of each one of a only few > thousands of rows becomes too slow. > > Please, could somebody answer some of these questions? > > - It's posible to tune some TOAST parameters to get faster atomic access > to large arrays? > > > - Using "EXTERNAL" strategy for storing TOAST-able columns could solve > the problem? > > > - Atomic access will be faster if table stores vectors for data in the > same parallel instead of matrices of global data? > CREATE TABLE grid_f_data_i2 ( > //Specifies the variable and other features of data > id_inventory integer REFERENCES grid_d_inventory(id_inventory), > //A new grid is available each 3 hours since 5 years ago > dh_date timestamp, > // nu_parallel = y = Round((latitude+90) * 2 > smallint nu_parallel, > //Values are scaled to be stored as signed integers of 2 bytes > vl_parallel smallint[], > CONSTRAINT meteo_f_gfs_tmp PRIMARY KEY > (co_inventory, nu_parallel, dh_date) > ); > > - There is another faster solution? > > Thanks in advance and best regards > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
В списке pgsql-performance по дате отправления: