Question about where to deploy the business logic for data processing

From
Nim Li
Date:
Hello.

We have a PostgreSQL database with many tables, as well as foreign tables, dblink, triggers, functions, indexes, etc., for managing the business logic of the data within the database.  We also have a custom table for tracking slowly changing dimensions (type 2).

Currently we are looking into using TypeORM (from the NestJS framework) to connect to the database and create a back end that provides web services.  Some reasons for using TypeORM are that it can update the database schema without any SQL code, works very well with Git, etc.  And from what I am reading, Git seems to work better with TypeORM than with handling individual batch files of SQL code (I still need to find out more about this).  Yet I do not think the ORM concept deals with database-specific features, such as dblink and/or trigger functions, etc., which handle the business logic or any ETL automation within the database itself (I should read more about this as well.)
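
For concreteness, here is a rough sketch of what I understand a TypeORM setup to look like (the entity and migration below are purely illustrative, not our real schema).  As far as I can tell, a migration can also carry raw SQL, so something like a trigger function could at least live in the same Git history as the entity code:

// research-sample.entity.ts -- illustrative entity; TypeORM derives the table DDL from this
import { Entity, PrimaryGeneratedColumn, Column } from 'typeorm';

@Entity('research_sample')
export class ResearchSample {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column({ type: 'text' })
  label!: string;

  @Column({ type: 'timestamptz', nullable: true })
  updatedAt!: Date | null;
}

// 1686300000000-AddUpdatedAtTrigger.ts -- illustrative migration; raw SQL (here a trigger
// function) is versioned in Git right alongside the entity code
import { MigrationInterface, QueryRunner } from 'typeorm';

export class AddUpdatedAtTrigger1686300000000 implements MigrationInterface {
  public async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`
      CREATE OR REPLACE FUNCTION research_sample_touch() RETURNS trigger AS $$
      BEGIN
        NEW."updatedAt" := now();
        RETURN NEW;
      END;
      $$ LANGUAGE plpgsql`);
    await queryRunner.query(`
      CREATE TRIGGER research_sample_touch_trg
        BEFORE UPDATE ON research_sample
        FOR EACH ROW EXECUTE FUNCTION research_sample_touch()`);
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`DROP TRIGGER IF EXISTS research_sample_touch_trg ON research_sample`);
    await queryRunner.query(`DROP FUNCTION IF EXISTS research_sample_touch()`);
  }
}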

Anyway, in our team discussion, I was told that in modern programming practice the world is moving away from deploying programming logic within the database (e.g., by using PL/SQL).  Instead, the proper way is said to be to deploy all the programming logic in the framework that is used to connect to the database, such as NestJS in our case.  So all we would need in the database is the schema (managed by the ORM), and we should move all the existing business logic (currently managed by things like database triggers, functions, dblink, etc.) into TypeScript code within the NestJS framework.

I wonder if anyone in the community has gone through changes like this?  I mean ... moving the business logic from PL/SQL within the database into code in the NestJS framework, and relying only on TypeORM to manage updates to the database without any SQL code?  Any thoughts about such a change?

Thank you!! 

Re: Question about where to deploy the business logic for data processing

From
Rob Sargent
Date:

> On Jun 8, 2023, at 8:21 PM, Nim Li <mr.nim.li@gmail.com> wrote:
>
> [...]

You're riding a pendulum which has swung too far.
In any organization of even minimal complexity, the physical data model and the deployable business model are never well aligned: the usages of the data are in a different dimension than the storage and maintenance of the data.  I've not heard of TypeORM, but on this list ORMs are notorious for generating poorly performing queries.  The notion that application programming will replace database triggers is ludicrous.




Re: Question about where to deploy the business logic for data processing

From
Michael Nolan
Date:
Clearly I'm a 73-year-old dinosaur, because I believe in having the business logic in the database wherever possible.  But the development projects I've been around lately aren't using triggers at all.  (And it should not surprise anyone, certainly not me, that consistency of data enforcement is an ongoing issue in these projects.)

Mike Nolan

On Fri, Jun 9, 2023 at 10:06 AM Rob Sargent <robjsargent@gmail.com> wrote:
>
> [...]



Re: Question about where to deploy the business logic for data processing

From
Lorusso Domenico
Date:
Hmm, we need to start from two concepts:
  1. Competence
  2. Network lag
Competence: usually programmers aren't skilled enough in the architecture and the actual needs of each layer.
This is a problem, because programmers often try to solve things with what they already know (e.g., performing joins in Java...).

A correct design requires identifying at least the data logic, the process logic, the business logic, and the presentation logic.

One of the most important goals of the data logic is to ensure the correctness of the data from as many points of view as possible (covering all of them is impossible).

That involves:
  • audit information
  • bitemporal management
  • strict definition and verification of data (foreign keys, checks, compatibility management)
  • consistent replication of data for different usages
  • isolation of access according to actual needs
  • design
So an application that requires changing the data model does not seem to be well designed...

Network lag
The first problem is latency: we must minimize the movement of data over the network.
This means, for example, creating a service that allows the caller to choose only the information it needs.
But it also means getting all the information needed in a single call, designing asynchronous services, and caching data physically near the front end or the middle layer.

Based on these two concepts, I suggest:
  • develop the data logic near or inside the database;
  • design powerful and addictive APIs;
  • don't allow the business logic to change the model;
  • organize/copy data into jsonb with a strong JSON Schema to provide coherence through every layer (see the sketch after this list);
  • ensure a system that grants ACID properties to your process.
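
For example, a minimal sketch of that jsonb + JSON Schema idea on the NestJS side (the schema, field names, and payload shape here are purely illustrative), using Ajv to validate a payload before it is written to a jsonb column:

// validate-payload.ts -- illustrative only: check a payload against a JSON Schema before
// storing it in a jsonb column, so every layer agrees on the shape of the data.
import Ajv, { JSONSchemaType } from 'ajv';

interface MeasurementPayload {
  subjectId: string;
  takenAt: string;   // ISO timestamp
  value: number;
}

const measurementSchema: JSONSchemaType<MeasurementPayload> = {
  type: 'object',
  properties: {
    subjectId: { type: 'string', minLength: 1 },
    takenAt: { type: 'string' },
    value: { type: 'number' },
  },
  required: ['subjectId', 'takenAt', 'value'],
  additionalProperties: false,
};

const ajv = new Ajv();
const validateMeasurement = ajv.compile(measurementSchema);

export function assertValidMeasurement(payload: unknown): MeasurementPayload {
  if (!validateMeasurement(payload)) {
    // Ajv collects all violations on validateMeasurement.errors
    throw new Error('Invalid measurement: ' + ajv.errorsText(validateMeasurement.errors));
  }
  return payload;
}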


On Fri, Jun 9, 2023 at 5:22 AM Nim Li <mr.nim.li@gmail.com> wrote:
[...]



--
Domenico L.

a history book is enough to impress people for half an hour;
I tried to learn the Treccani by heart... [F.d.A.]

Re: Question about where to deploy the business logic for data processing

From
Nim Li
Date:
Hello,

Thank you so so much for all the feedback so far.  :D

About this comment:

> "... an application that requires changing the data model does not seem to be well designed...don't allow model change by the business logic..."

I work in a science research faculty.  When researchers start a project, they don't necessarily get the full picture of what they are hoping to achieve (yet they may have some ideas about the starting point that allow them to move forward).  By the time they see 40 percent of what they have done, they may start to have different thoughts and move in a different direction, or in some cases they may spin it off into something different after a certain period of time.  Coming with my Agile development mindset into the research area, it is common for me to see users changing their requirements and expectations, with the same buckets for the data.  Yes, there is quite a lot of work to keep the researchers happy.  ;-)

I suppose when there is a specific end goal to achieve for a project, a more specific design can be more feasible based on that goal.  But when the end goal is not necessarily clear, and/or changeable, I am not exactly clear how we may draw a black-and-white line to determine whether a design is good or not (... and for how long...)

I imagine one option may be to put less logic and fewer restrictions on the data side, which allows the researchers to have more flexibility on their end.  But this may not always be feasible due to the specific protocol of a study.  Perhaps there are some other approaches and/or principles to deal with situations like mine?

My major focus is still on getting more opinions about where to implement the business logic for data processing ... but if you have any thoughts about the design, I would love to hear those as well.

Thank you so so much for sharing!

On Fri, Jun 9, 2023 at 12:35 PM Lorusso Domenico <domenico.l76@gmail.com> wrote:
[...]

Re: Question about where to deploy the business logic for data processing

From
Ron
Date:
You can be sure that banks and academic research projects have different needs.  Heck, your university's class scheduling software has different needs from the research problems that you support.

The bottom line is that putting all of the "business" logic in TypeORM locks you into using an ORM, while putting as much "business" logic in the database (as stored procedures, triggers, foreign keys, etc.) doesn't.  Parts of the application can be in Java, some in JS, C, C++, Rust, Perl, even COBOL.
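
To make that concrete (the function name below is hypothetical): whichever language a client is written in, it just calls the same database function, e.g. from Node/TypeScript with the plain pg driver:

import { Client } from 'pg';

// Hypothetical example: the business rule lives in the database as promote_sample(id).
// A Java, Python, or even COBOL client would issue exactly the same call.
export async function promoteSample(sampleId: number): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    await client.query('SELECT promote_sample($1)', [sampleId]);
  } finally {
    await client.end();
  }
}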

On the other hand, putting so much logic into the database essentially locks you into that RDBMS.


On 6/9/23 13:36, Nim Li wrote:
[...]

--
Born in Arizona, moved to Babylonia.

Re: Question about where to deploy the business logic for data processing

From
Guyren Howe
Date:
People change applications and programming languages all the time.

But change the database? Particularly away from Postgres, which is, for nearly any purpose, clearly the best SQL database available?

You have to pick one. Heck, write your triggers and stored procedures in Python and you can change to SQL Server, or in Java and you have the option of Oracle.

There is never a good reason to use MySQL. :-)

Guyren G Howe
On Jun 9, 2023 at 13:39 -0700, Ron <ronljohnsonjr@gmail.com> wrote:
[...]

Re: Question about where to deploy the business logic for data processing

From
Adrian Klaver
Date:
On 6/9/23 11:36, Nim Li wrote:
> [...]

Seems to me you are looking for a two-part setup:

1) An experimentation playground where ideas and processes can be tested out in a more free-form manner.  Some example software I have used or experimented with that can fill that role:

Pandas
https://pandas.pydata.org/

Duckdb
https://duckdb.org/

Polars
https://pola-rs.github.io/polars-book/

2) Once something that resembles a solid plan has been developed, then move to Postgres (or not).


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Question about where to deploy the business logic for data processing

From
Michael Nolan
Date:
You're gonna lock yourself into SOMETHING; that's why there are still thousands of COBOL programs being maintained.

Mike Nolan

On Fri, Jun 9, 2023 at 3:39 PM Ron <ronljohnsonjr@gmail.com> wrote:
>
> [...]



Re: Question about where to deploy the business logic for data processing

From
Lorusso Domenico
Date:
Hi Nim,
well, this is a very particular scenario.
In a few words, these projects will never go live for production purposes; they exist just to verify some hypotheses.

In this case it could be acceptable to generate the schema on the fly, but it isn't easy to automate every aspect related to optimization (partitioning, indexes, and so on).

Coming to your last question, where to put the data-manipulation logic: again, in this case minimizing LAN traffic could be your main goal, and that means putting the logic inside the DB.


On Fri, Jun 9, 2023 at 6:34 PM Lorusso Domenico <domenico.l76@gmail.com> wrote:
[...]


--
Domenico L.

a history book is enough to impress people for half an hour;
I tried to learn the Treccani by heart... [F.d.A.]

Re: Question about where to deploy the business logic for data processing

From
Merlin Moncure
Date:


On Thu, Jun 8, 2023 at 10:22 PM Nim Li <mr.nim.li@gmail.com> wrote:
I wonder if anyone in the community has gone through changes like this?  I mean ... moving the business logic from PL/SQL within the database into code in the NestJS framework, and relying only on TypeORM to manage updates to the database without any SQL code?  Any thoughts about such a change?

Heads up: this is something of a religious database debate in the industry, and you are asking a bunch of database guys what they think about this, so their biases will show in their answers.

Having said that, your developers are utterly, completely wrong.  This is classic "my technology good, your technology bad", and most of the reasons given to migrate the stack boil down to "I don't know SQL and will do absolutely anything to avoid learning it", to the point of rewriting the entire freaking project in (wait for it) JavaScript, which might very well be the worst possible language for data management.

The arguments supplied are tautological: "SQL is bad because you have to write SQL, which is bad", except for the laughably incorrect "SQL can't be checked into Git".  Guess what, it can (try git add my_func.sql), and there are many techniques to deal with this.
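
For instance, one minimal technique along those lines (the file layout and names are invented for illustration): keep each function or trigger definition in its own .sql file under Git, write them as CREATE OR REPLACE, and re-apply the directory on every deploy:

// apply-sql.ts -- illustrative sketch only: re-apply version-controlled SQL on deploy
import { readFileSync, readdirSync } from 'fs';
import { join } from 'path';
import { Client } from 'pg';

async function main(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    // Each file is written as CREATE OR REPLACE FUNCTION ..., so re-running is safe.
    const dir = 'db/functions';
    for (const file of readdirSync(dir).filter((f) => f.endsWith('.sql')).sort()) {
      await client.query(readFileSync(join(dir, file), 'utf8'));
    }
  } finally {
    await client.end();
  }
}

main().catch((err) => { console.error(err); process.exit(1); });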

Now, database deployments are a very complex topic which doesn't go away when you use an ORM; in fact, it often gets worse.  There are tools which can generate change scripts from database A to A'; are there tools to do that for NestJS object models?  Is there automatic dependency tracking for them?  Next thing you know, they will be moving all your primary keys to GUIDs ("scaling problem, solved!") and whining about database performance when you actually get some users.

WHY is writing SQL so bad?  Is it slower?  Faster?  Better supported?  PL/pgSQL is very well supported and draws from a larger talent pool than "NestJS".  Suppose you want to mix some Python or enterprise Java into your application stack.  What then?

ORMs are famously brittle and will often break if any interaction with the database does not itself go through the ORM, meaning you will be writing and deploying programs to do simple tasks.  They are slow, discourage strong data modelling, interact with the database inefficiently, and do not manage concurrent access to data well.
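
As a concrete (and hypothetical) illustration of the "interact with the database inefficiently" point, here is the 1+N round-trip pattern that per-row relation loading tends to degenerate into, next to the single join you would write by hand; table and column names are made up:

import { DataSource } from 'typeorm';

// Assumes hypothetical tables study(id, name) and sample(id, label, study_id).
// The first function issues 1 + N queries; the second does the same work in one round trip.
export async function loadStudiesNPlusOne(ds: DataSource) {
  const studies = await ds.query('SELECT id, name FROM study');            // 1 query
  for (const study of studies) {
    study.samples = await ds.query(
      'SELECT id, label FROM sample WHERE study_id = $1', [study.id]);     // + N queries
  }
  return studies;
}

export async function loadStudiesOneQuery(ds: DataSource) {
  return ds.query(`
    SELECT s.id, s.name, sa.id AS sample_id, sa.label
    FROM study s
    LEFT JOIN sample sa ON sa.study_id = s.id
    ORDER BY s.id`);
}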

merlin