Re: Implementing "thick"/"fat" databases

Поиск

Список

Период

Сортировка

От	Sim Zacks
Тема	Re: Implementing "thick"/"fat" databases
Дата	26 июля 2011 г. 05:05:17
Msg-id	4E2E74F2.7010404@compulab.co.il обсуждение исходный текст
Ответ на	Re: Implementing "thick"/"fat" databases (Chris Travers <chris.travers@gmail.com>)
Ответы	Re: Implementing "thick"/"fat" databases
Список	pgsql-general

Дерево обсуждения

On 07/25/2011 06:24 PM, Chris Travers wrote:

On Sun, Jul 24, 2011 at 11:53 PM, Sim Zacks <sim@compulab.co.il> wrote:

The goal is to make our system client agnostic, Most of our GUI is written
in wxpython, we also have some web functions and even a barcode terminal
function, written in C#. We would like to use an application server, so that
all the code is run on the server but we don't want to be web-based. I don't
want to have the same business logic written into various clients as that
causes maintenance headaches.

Ok.  So we are talking about a multi-application database, and you are
trying to provide some degree of consistency in business logic across
the software applications (and hence uses of your data).  So far, so
good.  Agreed to put anything that might need to be in common.

The way that postgresql works is that each session is its own process. That
means that from the server perspective (at least in most ways), my plpython
function is not connected to any other call on the database. My goal is to
allow any client to request functionality and have it execute the same way
every time.

So.... I assume that means using Pl/Mono to make your barcode stuff
work in the db too?

No need for PL/Mono or any other client specific language. The GUI should be dumb, so all I really need to program design is the interface and input output methods. When you push a button, it should call the appropriate function. The functions on the barcode terminal (which is a Windows Mobile platform) can also be run on the client application written in wxpython. Keeping as much as possible off of the client allows me to share the work and be sure that both clients do the exact same thing when the user pushes the button.

There is always functionality that needs to be on the client, however in my experience with multi-client applications, if it is business logic you will end up duplicating it and then needing to maintain 2 copies of the same thing.

Data logic is built in by use of constraints and triggers and some
functions, business logic is built only into functions.

I guess I am using "data logic" more broadly than you.    Any select,
update, or insert statement against a relation (except, of course,
simple function calls) is part of that data logic, and there are
supporting dependencies too, such as security and permissions.

IOW, I think a lot of 'business logic' is data logic.  I separate this
with what you actually use the data for.

IOW, Generate a report with the following format is data logic (even
perhaps, and email it to the head of product ordering), but use this
report to decide what inventory to order is application logic (you
probably want people making these decisions).

Similarly "Generate me a list of invoices to be paid in the following
format": is data logic, but selecting those invoices to be paid,
entering the data, determining what bank account you want to pay them
from etc. is application logic.  Recording the payments is again data
logic.  Application logic might be better thought of as workflow
logic.

So much of human interaction is definition of terms. Now I understand what you mean by data logic.

To me data logic is strictly whether the data will be considered corrupt if the database allows an action to occur. For example, not enforcing relationships. If you erase the parent the child has no meaning and therefore you have data in the system which is suspect.
As my db is designed with the business rules in mind, some of the business logic is included in the data logic, for example, 3 tables must be updated at the same time in order for the transaction to be considered valid. This is taken care of either through a function or triggers.

An example of Business Logic is when my stock is updated to 0, I want the record deleted.
This is done through a rule that on update that changes the value to 0 delete instead.

In other words, our postgresql server is a hybrid database/application
server.

Right.  One thing we are seeing in LedgerSMB as we go through this is
that the database does indeed become an application server.  The areas
which are taking the new approach are mostly SQL and HTML templates
(in TemplateToolkit), while the Perl code is shrinking pretty fast.

I guess what I am wondering is if you are using it as an application
server, is there a benefit to trying to put everything in the
database?  Or just those parts which necessarily follow from the
database independent of workflow, etc or are likely to be common
across diverse applications hitting your server?

If so, what is it?

The benefits of putting everything into the database is having one platform to maintain. Also, anything that can talk to the database can call the functions, so we have a defined interface for all our functions. If I wanted to use JBOSS, for example, how would my wxpython or my c# app talk to it?
IOW, I did not find a middleware that was more appropriate for a multi-client/multi-platform environment then the database server.
Another big advantage of using the database for business logic, is that the plpython functions can be called from other functions. For example, in the python prompt function I posted earlier, I can have an plpgsql function say
if python_prompt('Are you sure?"') then
process
end if

It can also say
insert into table X with a trigger on the table that calls a send email function.

I have a database function that builds an HTML page and FTPs it to our website. This gives me a number of benefits: a) I don't have to worry about the ftp client version on the client or connection issues, firewalls, etc. b) it can be done with mutliple clients without changing the code, c) I don't have to worry about path issues.

All I need is to know that my server can do it and that my clients can connect to my server.

As I mentioned, I am still playing around and testing this, and have
The concept is that each client has a SocketServer thread that listens on a
specific port (8080 in this case).

Ok, so just pointing out here that if you go this route you are
starting to break the client-agnosticism requirement, unless I
misunderstand what you are trying to do :-).

Any client that has the ability to listen to a port will be able to use this functionality.
In C# for example, I would use System.Net.Sockets and the TcpListener class
In Python I would use the socketserver module

My database function is called python_prompt pasted below.
It has a timeout of 2 minutes after which the function errors out with a
timeout. The user will get a message that the function did not complete. The
important thing is that it has a timeout. If the timeout is 20 minutes, that
is also ok. Depending on the query that calls this function, there do not
have to be any locks being used and as this is its own process and it is
basically sleeping while waiting for the answer, it doesn't use resources
(aside from a connection and the work mem in the calling query, of course).

Now, if for sake of argument this happens after an insert or update,
that means things are locked for that time, right?  This would be more
likely to cause deadlocks and if the person steps up entering the
data, there would be a rollback, right?

Yes. This is the type of functionality that must be used smartly.

Using this, a plpgsql query can prompt the user, "You are about to use more
stock then you have, are you sure you want to do this?"

CREATE OR REPLACE FUNCTION python_prompt(v_question text)
 RETURNS boolean AS
$BODY$
import time
import socket
ipaddr=plpy.execute("select inet_client_addr()")[0]["inet_client_addr"]
HOST, PORT = str(ipaddr), 8080
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(120)
sock.connect((HOST, PORT))
sock.send(v_question + "\n")
ans=sock.recv(1024)
sock.close()
if ans=="yes":
   return true
else:
   return false
$BODY$
 LANGUAGE plpythonu VOLATILE;

I guess I am wondering what you get by requiring that a client listens
on port 8080 AND knows how to communicate with the server on it that
you don't get from assuming that the client can manage the workflow
(breaking the routines into two separate ones and prompting for input
between).  I also think there is a better way to solve this problem
(see below).

Breaking it into 2 functions is not always practical because you lose your context (all variables etc..). It is not all done in a single transaction which means when you start the second function you would then have to verify that the work that is there is actually what you put there and nobody modified it.

What we do today (the reason I am playing with this interactivity) is have the database return an error code, which the client understands to be a question. It then prompts the user and the query is sent again with the answer as a parameter. It works but it is ugly (IMO).

I haven't fully decided to use this yet, but one of my goals is to make my
GUI as dumb as possible.

Probably a personal preference but I don't like to try to push
something as far as possible :-)

Anyway, here's an alternative to your interactive query approach, the
one we use for LedgerSMB user creation for 1.3.  it';s the "if you
don't tell me to do it anyway, I am raising an exception" approach.

Basically the problem is that db users are used to enforce application
permissions, and db users are cluster-wide while application users are
specific to a database.  So what happens when you are running two
companies (which have to have complete data separation and different
permissions) but a single user account is needed to access both of
them?

What we decided to do is have a stored procedure which creates the
user, sets the password (valid for 1 day by default, until changed by
the user, at which point it is valid for an admin-specified time).

Now, one parameter to the function is whether this is an import of an
existing db role or not (i.e. not == creation of new user/role).  In
the event you specify that this is an import, if you also specify a
password, you get an error unconditionally.  Don't want to change
password on import.  However, if you don't specify that this is an
import and the role exists you get a different exception.  Exception
strings in this case are short but descriptive ('No password allowed,'
and 'Duplicate user'), and these are handed back up to the application
to process.  The application then can flag that this role exists, and
the user of the software can resubmit as an import.  Now there's no
reason the application if it wasn't a web-based app couldn't just pop
up a confirmation box saying "This role exists.  Do you want to
import?  Yes/no" and handle it there.

Similarly if you add a do_anyway arg to your invoice item query, you
can raise an exception "Negative Inventory" and the application can
handle that, flag the user, and let the user decide whether to do it
anyway.

It seems to me that as you have timeouts, the application has to be
prepared to retry functions anyway.

Also one huge cost to putting too much interactive logic in the
database in some environments.  I haven't yet found a decent way of
localizing strings in database queries.  That means that if you want
to support multiple languages, it's a LOT easier to deal with these
strings if they are in the client codebase than if they are in the
server codebase and interpolated in the client (gettext throws errors
with variable interpolation if you identify the string on the server
and mix localization of client-side things with server-side strings).
I suppose you can patch gettext to ignore these errors.......

Best Wishes,
Chris Travers

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Implementing "thick"/"fat" databases