Обсуждение: Table design for basic user management

Поиск
Список
Период
Сортировка

Table design for basic user management

От
Thomas Løcke
Дата:
Hey all,

I'm in the process of building a web-application, where some basic user management is necessary. This is not the first time I've done this, but those I've made earlier have all been for internal use (LAN access only, very limited amount of users). This is going to be a public application, so naturally I'm a bit more concerned about performance, security and scalability. The application will open up with ~10.000 users, each doing 1-2 logins per day (0800-1700 weekdays only).

A login basically consists of a few simple steps:

*A username/password lookup.
*Decide users permissions based on her 'group' settings.
*Log stuff like last login and visits.

For some parts of the application, further information is necessary. This will be fetched from one or more 'user_data' tables only when necessary.

I haven't decided on the hash algorithm for passwords yet, which explains the rather large varchar column for the 'user' table.

Outside of the initial 10.000 users, we expect a limited amount of new users registering, probably no more than 5 per weekday.

I've more or less always used SQLite for our internal web applications, because it is simple, easy and low-maintenance. But for this application, I'd rather go with something more "solid", and because we've been running a bunch of internal Wikis on PostgreSQL without a hitch for years, I decided to stay with that.

I've come up with an initial design for this database:  http://pastebin.com/f5255453e

Are there any glaring problems with this design?

As far as I can see, PostgreSQL handles indexes automatically on primary and unique columns. Is this assumption correct?

Also, will I gain anything from using functions to read/write data, or is it mostly a matter of convenience and simplified SQL?

The application will be done in PHP using PDO_PGSQL. We're running PostgreSQL 8.38, and plan on sticking with that for a while yet.

As you might have guessed from this post, and from my design, I'm no database expert. The task of solving this has fallen on me, not because I'm good at it, but because I'm the best of the worst here at our offices. Any and all advice is very much appreciated. Currently I'm glued to the PostgreSQL manual, trying to suck up as much knowledge as possible. But there's a lot of words in there, and sometimes a few pointers from experienced users is just what the doctor ordered.  :o)

Regards,
Thomas

Re: Table design for basic user management

От
Jasen Betts
Дата:
On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote:

> --0015174c3520dbeb790477aff96b
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hey all,
>
> I'm in the process of building a web-application, where some basic user
> management is necessary. This is not the first time I've done this, but
> those I've made earlier have all been for internal use (LAN access only,
> very limited amount of users). This is going to be a public application, so
> naturally I'm a bit more concerned about performance, security and
> scalability. The application will open up with ~10.000 users, each doing 1-2
> logins per day (0800-1700 weekdays only).
>
> A login basically consists of a few simple steps:
>
> *A username/password lookup.
> *Decide users permissions based on her 'group' settings.
> *Log stuff like last login and visits.

20000/11h  about 1 every 2 seconds

should not be a problem.

> For some parts of the application, further information is necessary. This
> will be fetched from one or more 'user_data' tables only when necessary.
>
> I haven't decided on the hash algorithm for passwords yet, which explains
> the rather large varchar column for the 'user' table.

use text instead of varchar unless you need to enforce a size limit
on the field. postgresql has a md5 hashing function built in. which is
good unless you need something more secure.

> Outside of the initial 10.000 users, we expect a limited amount of new users
> registering, probably no more than 5 per weekday.

> I've come up with an initial design for this database:
> http://pastebin.com/f5255453e

lines 4-19 do the same as:

CREATE TABLE "user" (
    id SERIAL PRIMARY KEY
    username character varying(50) NOT NULL UNIQUE,
    password character varying(256) NOT NULL
    );

I would be inclined to use text instead of varchar, and possibly call
the last column password_hash


I would default lastvisit to now() also,
that makes updating slightly simpler.

 UPDATE log SET lastvisit=DEFAULT;

timestamp with timezone is better suited to real world times unless
your server is running localtime=UTC (OTOH you say noone will be using
your server during the DST changes so it may not matter) there is not
performance penalty, timestamp with timezone is internally stored as
UTC time.

I'd also consider merging this data with the user table.

CREATE TABLE log (
    userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE,
    registered timestamp  DEFAULT now() NOT NULL,
    lastvisit  timestamp  NOT NULL,
    visits integer DEFAULT 0 NOT NULL
);

then you can log-in with a single query,

update user set lastvisit=default where username='fred' and
password=md5('salt'||'password') returning id;

most of the traffc will be on the index to the member table so you
could also merge the userdata table too with taking a large performance hit.

> As far as I can see, PostgreSQL handles indexes automatically on primary and
> unique columns. Is this assumption correct?

yes.

> Also, will I gain anything from using functions to read/write data, or is it
> mostly a matter of convenience and simplified SQL?

not performance.

> The application will be done in PHP using PDO_PGSQL.

I cant's comment my PHP is worse than your PG DDL. when I do I just
use vendor specific postgres functions:
(http://www.php.net/manual/en/book.pgsql.php), but if you're familiar
with PDO i see no reason why not.


> We're running PostgreSQL 8.38, and plan on sticking with that for a
> while yet.

no such version. ITYM 8.3.8

> As you might have guessed from this post, and from my design, I'm no
> database expert. The task of solving this has fallen on me, not because I'm
> good at it, but because I'm the best of the worst here at our offices. Any
> and all advice is very much appreciated. Currently I'm glued to the
> PostgreSQL manual, trying to suck up as much knowledge as possible. But
> there's a lot of words in there, and sometimes a few pointers from
> experienced users is just what the doctor ordered.  :o)

what you have looks basically good. I think text would be more apropriate
everywhere you are using varchar, also timestamp with timezone instead of
without especially if it's available between 2 and 3 am on sundays.


Re: Table design for basic user management

От
Michael Wood
Дата:
2009/11/7 Jasen Betts <jasen@xnet.co.nz>:
> On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote:
[...]
>> I've come up with an initial design for this database:
>> http://pastebin.com/f5255453e
[...]
> CREATE TABLE log (
>    userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE,
>    registered timestamp  DEFAULT now() NOT NULL,
>    lastvisit  timestamp  NOT NULL,
>    visits integer DEFAULT 0 NOT NULL
> );
>
> then you can log-in with a single query,
>
> update user set lastvisit=default where username='fred' and
> password=md5('salt'||'password') returning id;
[...]

I agree with using a salt, but you appear to be advocating a fixed
salt for everyone?  Normally the salt is stored along with the
password hash, so you'd need one query to retrieve the salt and
another query to calculate the hash and compare it to the stored hash.
 Something like this:

select salt from "user" where username='fred';
update "user" set lastvisit=default where username='fred' and
password=md5(salt||'password');

>> As far as I can see, PostgreSQL handles indexes automatically on primary and
>> unique columns. Is this assumption correct?
[...]

Yes, but foreign keys are not automatically indexed.  So, e.g. your
log table has a userid column that references "user"(id).  You would
have to create the index on log.userid yourself if that's what you
wanted.

--
Michael Wood <esiotrot@gmail.com>

Re: Table design for basic user management

От
Jasen Betts
Дата:
On 2009-11-08, Michael Wood <esiotrot@gmail.com> wrote:
> 2009/11/7 Jasen Betts <jasen@xnet.co.nz>:
>> On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote:
> [...]
>>> I've come up with an initial design for this database:
>>> http://pastebin.com/f5255453e
> [...]
>> CREATE TABLE log (
>>    userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE,
>>    registered timestamp  DEFAULT now() NOT NULL,
>>    lastvisit  timestamp  NOT NULL,
>>    visits integer DEFAULT 0 NOT NULL
>> );
>>
>> then you can log-in with a single query,
>>
>> update user set lastvisit=default where username='fred' and
>> password=md5('salt'||'password') returning id;
> [...]
>
> I agree with using a salt, but you appear to be advocating a fixed
> salt for everyone?

it's better than nothing, but you're right a variable salt is even
better.

with no salt at all you can sometimes google the MD5 to "decrypt" it!

> Normally the salt is stored along with the
> password hash, so you'd need one query to retrieve the salt and
> another query to calculate the hash and compare it to the stored hash.