Обсуждение: column limit

Поиск
Список
Период
Сортировка

column limit

От
"Isaac Ben"
Дата:
Hi,
I'm trying to create a table with 20,000 columns of type int2, but I
keep getting the error message that the limit is 1600.  According to
this message http://archives.postgresql.org/pgsql-admin/2001-01/msg00199.php
it can be increased, but only up to about 6400.  Can anyone tell me
how to get 20,000 columns?

Thanks,
IB

Re: column limit

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/25/07 09:34, Isaac Ben wrote:
> Hi,
> I'm trying to create a table with 20,000 columns of type int2, but I
> keep getting the error message that the limit is 1600.  According to
> this message
> http://archives.postgresql.org/pgsql-admin/2001-01/msg00199.php
> it can be increased, but only up to about 6400.  Can anyone tell me
> how to get 20,000 columns?

Why the heck do you need 20 *thousand* columns?

Assuming, though, that you know what you're doing, and the design
isn't horribly botched, then arrays might be what you want.

Normalizing the table might be better, and vertically partitioning
it would be a big performance win if you don't need all 20
*thousand* columns at the same time.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFuNE0S9HxQb37XmcRAoxxAKCLrX0WaekNH8N8ghAzMkhgMmZ43ACg7F0K
0pqcprs/suZ/1xmK73PAdOE=
=WLdL
-----END PGP SIGNATURE-----

Re: column limit

От
"Isaac Ben"
Дата:
Hi,
Sorry, I forgot to post back to the list instead of just replying
individual responders.

The data is gene expression data with 20,000 dimensions. Part of the
project I'm working on is to discover what dimensions are truly
independent.   But to start with I need to have
all of the data available in a master table to do analysis on.  After
the analysis I hope to derive subsets of much lower dimensionality.

IB

Isaac Ben Jeppsen

On 1/25/07, David Brain <dbrain@bandwidth.com> wrote:
> Hi,
>
> Seeing as how no one has asked this question yet - I have to ask, why do
> you need 20,000 columns? I'm sure I'm not the only one who is curious.
>
> It's hard to think of a situation where this couldn't be solved by using
> a schema with parent/child tables to have 20000 rows per record rather
> than 20000 columns.
>
> David.
>
> Isaac Ben wrote:
> > Hi,
> > I'm trying to create a table with 20,000 columns of type int2, but I
> > keep getting the error message that the limit is 1600.  According to
> > this message
> > http://archives.postgresql.org/pgsql-admin/2001-01/msg00199.php
> > it can be increased, but only up to about 6400.  Can anyone tell me
> > how to get 20,000 columns?
> >
> > Thanks,
> > IB
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> >       subscribe-nomail command to majordomo@postgresql.org so that your
> >       message can get through to the mailing list cleanly
>
>
> --
> David Brain - bandwidth.com
> dbrain@bandwidth.com
> 919.297.1078
>

Re: column limit

От
Bruno Wolff III
Дата:
On Thu, Jan 25, 2007 at 08:34:08 -0700,
  Isaac Ben <ib.zero@gmail.com> wrote:
> Hi,
> I'm trying to create a table with 20,000 columns of type int2, but I
> keep getting the error message that the limit is 1600.  According to
> this message http://archives.postgresql.org/pgsql-admin/2001-01/msg00199.php
> it can be increased, but only up to about 6400.  Can anyone tell me
> how to get 20,000 columns?

Can you explain what you are really trying to do? It is unlikely that using
20000 columns is the best way to solve your problem. If we know what you are
really trying to do we may be able to make some other suggestions.
One thing you might start looking at is using an array or arrays.

Re: column limit

От
Martijn van Oosterhout
Дата:
On Thu, Jan 25, 2007 at 10:47:50AM -0700, Isaac Ben wrote:
> The data is gene expression data with 20,000 dimensions. Part of the
> project I'm working on is to discover what dimensions are truly
> independent.   But to start with I need to have
> all of the data available in a master table to do analysis on.  After
> the analysis I hope to derive subsets of much lower dimensionality.

Even if you managed to hack the server enough to make that work (which
is debatable) performance is going to suck. The system is simply not
setup for that kind of thing. Use either arrays, or normalize the data
into a seperate table.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: column limit

От
Bruno Wolff III
Дата:
On Thu, Jan 25, 2007 at 10:47:50 -0700,
  Isaac Ben <ib.zero@gmail.com> wrote:
>
> The data is gene expression data with 20,000 dimensions. Part of the
> project I'm working on is to discover what dimensions are truly
> independent.   But to start with I need to have
> all of the data available in a master table to do analysis on.  After
> the analysis I hope to derive subsets of much lower dimensionality.

Are you actually planning to do the analysis in Postgres? This doesn't seem
like a real good fit for that kind of task. (Though I haven't played with
the R stuff, and that might be good for doing that kind of analysis.)

If you do put this in postgres, it seems the two most natural things are
to use arrays to store the dimension values or to have table with a key
of the gene and the dimension and have another column with the value of
that dimension for that gene.

Re: column limit

От
"Isaac Ben"
Дата:
On 1/26/07, Bruno Wolff III <bruno@wolff.to> wrote:
> On Thu, Jan 25, 2007 at 10:47:50 -0700,
>   Isaac Ben <ib.zero@gmail.com> wrote:
> >
> > The data is gene expression data with 20,000 dimensions. Part of the
> > project I'm working on is to discover what dimensions are truly
> > independent.   But to start with I need to have
> > all of the data available in a master table to do analysis on.  After
> > the analysis I hope to derive subsets of much lower dimensionality.
>
> Are you actually planning to do the analysis in Postgres? This doesn't seem
> like a real good fit for that kind of task. (Though I haven't played with
> the R stuff, and that might be good for doing that kind of analysis.)

I plan on accessing the data with postgres via python and R. The main
reason for putting the data in postgres is that postgres handles large
data sets well and it will allow me to pull subsets easily if slowly.

>
> If you do put this in postgres, it seems the two most natural things are
> to use arrays to store the dimension values or to have table with a key
> of the gene and the dimension and have another column with the value of
> that dimension for that gene.

Yeah, I received a tip from someone regarding the use of arrays, and I
think that I will be using that.  Thanks for the tips.

IB

Re: column limit

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/26/07 13:37, Isaac Ben wrote:
> On 1/26/07, Bruno Wolff III <bruno@wolff.to> wrote:
>> On Thu, Jan 25, 2007 at 10:47:50 -0700, Isaac Ben
>> <ib.zero@gmail.com> wrote:
[snip]
>
> I plan on accessing the data with postgres via python and R. The
> main reason for putting the data in postgres is that postgres
> handles large data sets well and it will allow me to pull subsets
> easily if slowly.

I wonder if sed/grep/awk (or, just perl) could rapidly do your row
and column pre-filtering?

>> If you do put this in postgres, it seems the two most natural
>> things are to use arrays to store the dimension values or to
>> have table with a key of the gene and the dimension and have
>> another column with the value of that dimension for that gene.
>
> Yeah, I received a tip from someone regarding the use of arrays,
> and I think that I will be using that.  Thanks for the tips.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFulz7S9HxQb37XmcRArBQAKCZc1Eusg/HtsdMKs8A6z8MTT6FgACg1GuU
yOjqrCxi8CIPX3rCjrDcX6U=
=0xY2
-----END PGP SIGNATURE-----