Обсуждение: Table design for basic user management
Hey all,
I'm in the process of building a web-application, where some basic user management is necessary. This is not the first time I've done this, but those I've made earlier have all been for internal use (LAN access only, very limited amount of users). This is going to be a public application, so naturally I'm a bit more concerned about performance, security and scalability. The application will open up with ~10.000 users, each doing 1-2 logins per day (0800-1700 weekdays only).
A login basically consists of a few simple steps:
*A username/password lookup.
*Decide users permissions based on her 'group' settings.
*Log stuff like last login and visits.
For some parts of the application, further information is necessary. This will be fetched from one or more 'user_data' tables only when necessary.
I haven't decided on the hash algorithm for passwords yet, which explains the rather large varchar column for the 'user' table.
Outside of the initial 10.000 users, we expect a limited amount of new users registering, probably no more than 5 per weekday.
I've more or less always used SQLite for our internal web applications, because it is simple, easy and low-maintenance. But for this application, I'd rather go with something more "solid", and because we've been running a bunch of internal Wikis on PostgreSQL without a hitch for years, I decided to stay with that.
I've come up with an initial design for this database: http://pastebin.com/f5255453e
Are there any glaring problems with this design?
As far as I can see, PostgreSQL handles indexes automatically on primary and unique columns. Is this assumption correct?
Also, will I gain anything from using functions to read/write data, or is it mostly a matter of convenience and simplified SQL?
The application will be done in PHP using PDO_PGSQL. We're running PostgreSQL 8.38, and plan on sticking with that for a while yet.
As you might have guessed from this post, and from my design, I'm no database expert. The task of solving this has fallen on me, not because I'm good at it, but because I'm the best of the worst here at our offices. Any and all advice is very much appreciated. Currently I'm glued to the PostgreSQL manual, trying to suck up as much knowledge as possible. But there's a lot of words in there, and sometimes a few pointers from experienced users is just what the doctor ordered. :o)
Regards,
Thomas
I'm in the process of building a web-application, where some basic user management is necessary. This is not the first time I've done this, but those I've made earlier have all been for internal use (LAN access only, very limited amount of users). This is going to be a public application, so naturally I'm a bit more concerned about performance, security and scalability. The application will open up with ~10.000 users, each doing 1-2 logins per day (0800-1700 weekdays only).
A login basically consists of a few simple steps:
*A username/password lookup.
*Decide users permissions based on her 'group' settings.
*Log stuff like last login and visits.
For some parts of the application, further information is necessary. This will be fetched from one or more 'user_data' tables only when necessary.
I haven't decided on the hash algorithm for passwords yet, which explains the rather large varchar column for the 'user' table.
Outside of the initial 10.000 users, we expect a limited amount of new users registering, probably no more than 5 per weekday.
I've more or less always used SQLite for our internal web applications, because it is simple, easy and low-maintenance. But for this application, I'd rather go with something more "solid", and because we've been running a bunch of internal Wikis on PostgreSQL without a hitch for years, I decided to stay with that.
I've come up with an initial design for this database: http://pastebin.com/f5255453e
Are there any glaring problems with this design?
As far as I can see, PostgreSQL handles indexes automatically on primary and unique columns. Is this assumption correct?
Also, will I gain anything from using functions to read/write data, or is it mostly a matter of convenience and simplified SQL?
The application will be done in PHP using PDO_PGSQL. We're running PostgreSQL 8.38, and plan on sticking with that for a while yet.
As you might have guessed from this post, and from my design, I'm no database expert. The task of solving this has fallen on me, not because I'm good at it, but because I'm the best of the worst here at our offices. Any and all advice is very much appreciated. Currently I'm glued to the PostgreSQL manual, trying to suck up as much knowledge as possible. But there's a lot of words in there, and sometimes a few pointers from experienced users is just what the doctor ordered. :o)
Regards,
Thomas
On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote: > --0015174c3520dbeb790477aff96b > Content-Type: text/plain; charset=ISO-8859-1 > > Hey all, > > I'm in the process of building a web-application, where some basic user > management is necessary. This is not the first time I've done this, but > those I've made earlier have all been for internal use (LAN access only, > very limited amount of users). This is going to be a public application, so > naturally I'm a bit more concerned about performance, security and > scalability. The application will open up with ~10.000 users, each doing 1-2 > logins per day (0800-1700 weekdays only). > > A login basically consists of a few simple steps: > > *A username/password lookup. > *Decide users permissions based on her 'group' settings. > *Log stuff like last login and visits. 20000/11h about 1 every 2 seconds should not be a problem. > For some parts of the application, further information is necessary. This > will be fetched from one or more 'user_data' tables only when necessary. > > I haven't decided on the hash algorithm for passwords yet, which explains > the rather large varchar column for the 'user' table. use text instead of varchar unless you need to enforce a size limit on the field. postgresql has a md5 hashing function built in. which is good unless you need something more secure. > Outside of the initial 10.000 users, we expect a limited amount of new users > registering, probably no more than 5 per weekday. > I've come up with an initial design for this database: > http://pastebin.com/f5255453e lines 4-19 do the same as: CREATE TABLE "user" ( id SERIAL PRIMARY KEY username character varying(50) NOT NULL UNIQUE, password character varying(256) NOT NULL ); I would be inclined to use text instead of varchar, and possibly call the last column password_hash I would default lastvisit to now() also, that makes updating slightly simpler. UPDATE log SET lastvisit=DEFAULT; timestamp with timezone is better suited to real world times unless your server is running localtime=UTC (OTOH you say noone will be using your server during the DST changes so it may not matter) there is not performance penalty, timestamp with timezone is internally stored as UTC time. I'd also consider merging this data with the user table. CREATE TABLE log ( userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE, registered timestamp DEFAULT now() NOT NULL, lastvisit timestamp NOT NULL, visits integer DEFAULT 0 NOT NULL ); then you can log-in with a single query, update user set lastvisit=default where username='fred' and password=md5('salt'||'password') returning id; most of the traffc will be on the index to the member table so you could also merge the userdata table too with taking a large performance hit. > As far as I can see, PostgreSQL handles indexes automatically on primary and > unique columns. Is this assumption correct? yes. > Also, will I gain anything from using functions to read/write data, or is it > mostly a matter of convenience and simplified SQL? not performance. > The application will be done in PHP using PDO_PGSQL. I cant's comment my PHP is worse than your PG DDL. when I do I just use vendor specific postgres functions: (http://www.php.net/manual/en/book.pgsql.php), but if you're familiar with PDO i see no reason why not. > We're running PostgreSQL 8.38, and plan on sticking with that for a > while yet. no such version. ITYM 8.3.8 > As you might have guessed from this post, and from my design, I'm no > database expert. The task of solving this has fallen on me, not because I'm > good at it, but because I'm the best of the worst here at our offices. Any > and all advice is very much appreciated. Currently I'm glued to the > PostgreSQL manual, trying to suck up as much knowledge as possible. But > there's a lot of words in there, and sometimes a few pointers from > experienced users is just what the doctor ordered. :o) what you have looks basically good. I think text would be more apropriate everywhere you are using varchar, also timestamp with timezone instead of without especially if it's available between 2 and 3 am on sundays.
2009/11/7 Jasen Betts <jasen@xnet.co.nz>: > On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote: [...] >> I've come up with an initial design for this database: >> http://pastebin.com/f5255453e [...] > CREATE TABLE log ( > userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE, > registered timestamp DEFAULT now() NOT NULL, > lastvisit timestamp NOT NULL, > visits integer DEFAULT 0 NOT NULL > ); > > then you can log-in with a single query, > > update user set lastvisit=default where username='fred' and > password=md5('salt'||'password') returning id; [...] I agree with using a salt, but you appear to be advocating a fixed salt for everyone? Normally the salt is stored along with the password hash, so you'd need one query to retrieve the salt and another query to calculate the hash and compare it to the stored hash. Something like this: select salt from "user" where username='fred'; update "user" set lastvisit=default where username='fred' and password=md5(salt||'password'); >> As far as I can see, PostgreSQL handles indexes automatically on primary and >> unique columns. Is this assumption correct? [...] Yes, but foreign keys are not automatically indexed. So, e.g. your log table has a userid column that references "user"(id). You would have to create the index on log.userid yourself if that's what you wanted. -- Michael Wood <esiotrot@gmail.com>
On 2009-11-08, Michael Wood <esiotrot@gmail.com> wrote: > 2009/11/7 Jasen Betts <jasen@xnet.co.nz>: >> On 2009-11-06, Thomas Løcke <thomas.granvej6@gmail.com> wrote: > [...] >>> I've come up with an initial design for this database: >>> http://pastebin.com/f5255453e > [...] >> CREATE TABLE log ( >> userid integer PRIMARY KEY REFERENCES REFERENCES user(id) ON DELETE CASCADE, >> registered timestamp DEFAULT now() NOT NULL, >> lastvisit timestamp NOT NULL, >> visits integer DEFAULT 0 NOT NULL >> ); >> >> then you can log-in with a single query, >> >> update user set lastvisit=default where username='fred' and >> password=md5('salt'||'password') returning id; > [...] > > I agree with using a salt, but you appear to be advocating a fixed > salt for everyone? it's better than nothing, but you're right a variable salt is even better. with no salt at all you can sometimes google the MD5 to "decrypt" it! > Normally the salt is stored along with the > password hash, so you'd need one query to retrieve the salt and > another query to calculate the hash and compare it to the stored hash.