Re: The tragedy of SQL

Поиск
Список
Период
Сортировка
От Arnold Morein
Тема Re: The tragedy of SQL
Дата
Msg-id 93BF97E0-CB26-44D4-9B98-83BAD7076AF4@mac.com
обсуждение исходный текст
Ответ на Re: The tragedy of SQL  (Ron <ronljohnsonjr@gmail.com>)
Список pgsql-general
My $5.00 on this discussion (quite fascinating and nostalgic actually). This is a little long granted, but it was fun.

As someone who cut their IT in general and programmer teeth in particular on the HP 3000 platform, in the time (and a land) BEFORE SQL, where we had only a single database engine to choose from (IMAGE, a network database), where you accessed data with its hashed key via a strict read/insert/lock before (update/delete) API or the KSAM file engine for non-database work that didn’t have high transaction volume (again, also only via a strict API), SQL at the time was nothing short of a miraculous boon!

None of the above were Relational. Nor was there anything like SQL in the business market. At the time, it was mostly found in academia and some fringe/bleeding edge companies.

Though I understand the concept of the request that started the thread, what irks me is when people make statements like the “tragedy of SQL” or anything else that equates SQL with the concept of a RDBMS. The two are not the same thing.

Let us not forget that SQL is an acronym for Structured Query Language and it is indeed just that: a language and a tool to be used in retrieving and manipulating data organized in the form of a tabular data stream; how it is managed by the underlying engine is irrelevant. By the time I was able to study it in college (after several years of only being to access data via a program I had to write utilizing engine APIs) I found it a great tool for a programmer and yes, even the odd accountant that had some technical aptitude. At the time, a basic SQL interpreter exceeded some of the best "report writer” tools of its day and unlike those tools, you could use SQL to actually change data in the underlying data store. (It’s Dark Majik, Moriarty!) In fact in many ways, it wasn’t marketed as a API abstraction layer (as it has turned into of late) but as a separate add-on tool for a given flavor of file/database engine.

Back on the HP 3000 (a 16-bit machine at that point), the only languages for use (that I had access to) were COBOL, Fortran, and (new at the time) Pascal. Having to write a quick and dirty application just to list data on the screen was required and overly tedious. When SQL became available as a layer between the user and the data store, it actually reduced the programmer workload as simple inquiries could be done by technical if not programming staff. (Also, when migrating from one propriety data source to another, SQL could be exported from one system and executed in another for import.)

On the other hand, from a programmer’s stand point back then, the introduction of the RELATIONAL database, and the cool stuff the engine would do for me with a one time declaration and no coding was amazing and again, a time saver. That there was a SQL interpreter option for the engine was icing on the cake!

HP would later release a SQL interpreter for its then TurboIMAGE and later ALLBASE products. The current CONNX product (now owned by SoftwareAG) provides SQL based access to Adabas data. IBM provides a product to provide access to VSAM data via SQL. Again: an add-on module to a product that wasn’t originally built to a be a RDBMS. 

I think the IT industry did a bit of a disservice to itself, SQL, and the RDBMS concept by coupling them so tightly. (Or maybe they just grew together organically?) The concept of a QL, whether verbose or terse, that allows a human with only a little training to write statements that gives them access to the data they need was a great idea. It still is. But, the plethora of other languages (for data access or other arcane activities) that have appeared in the field over the last 10-15 years is well, boggling; and personally at times seem like a rework only because the author didn’t like the language’s syntax or simply wasn’t a touch typist. (Man am I glad I took typing in high school!)

As stated, SQL is a language/tool that is simply one way to interact with a file/database engine. Though these days, I suspect many vendors prefer to use SQL as the only language they offer because SQL has such a broad use and most people have had to learn it to some degree or other in their collegiate and professional careers. What amazes me is that so many companies and products have gone to such lengths to allow the “simple” language of SQL to become the de facto language to interact with their file/database engines; sometimes not even publishing a non-SQL API at all. (All you get is a C/JDBC driver. Enjoy!)

So back to the thread, given the extensibility of PostgreSQL, I think the concept of a different QL to access the data stored there (or anywhere else for that matter) may be of great use to a specific audience, depending on the environment (read tool) where it is used.  (The biology reference was brilliant: one of my COBOL teachers digressed once that one of the features of the language at its outset was its ability to manipulate multidimensional arrays of a certain depth in order to facilitate work in biology dealing with the tree of life.)

But just where SQL may not be the best way to access a given dataset a certain way, neither is a RDBMS necessarily the best place to store the data of a certain type or structure. Just as Postgre has helped expand the range of data types that can be stored within it, the advent of the NoSQL databases has shown that not every size fits all. (Yet again.)

On the other topic, as a programmer in the business field, I must say that the ORM has saved me years worth of hours of work, especially when coupled with an object-oriented language. Being able to create a hierarchy of extensible and thus reusable classes that allow me to abstract and template my data model has allowed me to get the lower-level tedium of application development done and out of the way much more rapidly than in the past. 

Perfect example: I once had to create an ESB process that synchronized data between an Adabas file on a mainframe, and nearly 30 separate instances of the same FoxPro (FP) for Windows database (no, not Visual FoxPro, go even further back). The only way I was able to do it was a product and its JDBC driver that turned basic SQL into Adabas API calls and the ODBC driver that spoke SQL to FP. The key was a yet another product that ran on Windows 32-bit and would work with the 16-bit FoxPro for Windows driver, turning the Windows workstation into a FP database server. Using the Hibernate ORM and the OO Java language, I was able to model both databases once, connect to all of the instances simultaneously, and even override the generated SQL which was too robust for the FP interpreter with “native” FPSQL and got the job done. How’s that for convoluted?! NONE of that would have been so easy (if not impossible) without the standard that is SQL.

Today, my cocktail of choice is Java with the Hibernate ORM and given that Hibernate supports SQL, HQL (Hibernate Query Language) and the EJB-QL, I think its safe to say that there are a lot of options out there to be had. In fact, the Hibernate folks have extended the original ORM with modules that allow programmers to use the same or similar Hibernate API against many SQL and NoSQL data stores with several more supported by the developer community; and they are open to more. This shows that the concept of the ORM (which started as a bridge between the object oriented and relational worlds) can be utilized in an agnostic way for accessing data from different underlying data stores, using a common/similar API paradigm. They even extended it to work with the Apache Lucene full text index engine (which is neither relational nor what I would call a database, in the traditional use of the word)!

Though I understand the tendency of people from certain disciplines to want to extend their “native language” into the computer realm so that they can describe or access their data using their native nomenclature, I think its safe to say that such a practice can have long term drawbacks. Such obscure constructs have a tendency to wither on the vine as the user audience is narrow with little chance (or reason) for adoption outside of the discipline. Then you end up with code that is so arcane no one can understand let alone maintain it.

Lastly, though the verbosity of languages like COBOL over C cannot be denied, there is something to say about a language whose programs can be read in a cursory manner and still allow the reader to understand the logic, even if only at a high level. (Dare I utter the phrase, “self-documenting”? An intended “feature” of the COBOL language.) I spent 7 years working exclusively with COBOL ’85 (BTW, its definition was last revised in 2014 - that’s 55 years of evolution since its debut in 1959!) and though I liked its “document” like format (was a full-time word processor in an earlier phase of my adult life so I could touch type), I preferred Pascal (less verbose and cleaner (at least until Java came along)).

When I look at a language’s syntax, not only its readability but the ability to easily comprehend what a statement means or what it will do is a critical factor. Not only for its use in a class room but in the real world. If I have to compile a statement in my head, read it symbol by symbol (assembling a stack in my puny brain as I go), having to worry about the presence or absence of a single character that could cancel or otherwise mutate the statement … BOOM! Stack Fault! (I’m looking at you C, or worse: Apple’s Objective-C (puke, cough, gag).) The concept of KISS should be an axiom but I think KISF (Keep It Straight Forward) is just as important. Because no matter how cool a terse line of code may look to you or a co-worker, the poor bastard that comes along later on who has to analyze what you did and document what you didn’t will be cursing your name and progeny.

A language like SQL is unique among its cousins like COBOL and C. It is English-like (which C ain’t), relatively easy to understand, the end result of a where clause is (mostly) easy to comprehend before it is executed, and (despite the annoying dialect differences) easy to understand and write.

I think a community research project to define a new query language and its syntax would be a fun exercise, even if its result is simply an interpreter that calls an API to talk SQL (or Adabas, iODBC, DB2, IMS, Rocket D3, etc.) at an engine. The result might conclude that the concept of the relational model doesn’t fit the underlying needs of the data, but something else does; or something new needs to be invented to store and retrieve the data. (Don’t forget, there’s this thing called “embedded SQL” which is a great idea and helps reduce work on the developer [but an ORM is a better solution].) 

Oh, wait: wasn’t that the impetus for companies that now rule the data world? :)

So, in closing, let’s not beat on SQL too much, ok? Yes, it has its limitations, but I think it is more than incredibly powerful in most of its capabilities. It was designed to fit a need of the time based on logic that was again based on data constructs of the time. And from an interpreter no less, not a compiled language! And though relational databases may no longer be the current “fad” in certain industries or communities, I think the fact that they are so wide spread and continuing to evolve based on our needs should be celebrated, not derided as old or decrepit technology. Especially engines like PostgreSQL (unlike certain others) which is so extensible, I think its coders deserve an award. (Oh wait, they have already received some!)

Peace and Long Life, ya’ll.

P.S.: Yes, the ALTER statement would get you an F in class and possibly fired from a job back in the day. Talk about spaghetti code! It still amazes me that someone allowed that into the language, let alone figured out how to make it work.

P.P.S: Wanna talk about longevity? Check out the wiki page on the HP 3000. HP couldn’t kill it! The user base wouldn’t let go. Now the hardware has been virtualized by at least 2 companies, so the software investment lives on on commodity (read cheap) HW. Can’t say that about “traditional” mainframes! Long Live MPE!

В списке pgsql-general по дате отправления:

Предыдущее
От: Yi Sun
Дата:
Сообщение: pg_upgrade problem as locale difference in data centers
Следующее
От: Stacey Haysler
Дата:
Сообщение: Proposed French Translation of Code of Conduct Policy