Discussion: DB benchmark and pg config file help
Hello List,

Not sure to which list I should post (gray lines, and all that), so point me in the right direction if'n it's a problem.

I am in the process of learning some of the art/science of benchmarking. Given novnov's recent post about the comparison of MS SQL vs PostgreSQL, I felt it time to do a benchmark comparison of sorts for myself . . . more for me and the benchmark learning process than the DB's, but I'm interested in DB's in general, so it's a good fit. (If I find anything interesting/new, I will of course share the results.)

Given that, I don't know what I'm doing. :| It seems initially that to do it properly, I have to pick some sort of focus. In other words, shall I benchmark from a standpoint of ACID compliance? Shall I benchmark with functionality in mind? Ease of use/setup? Speed? The latter seems to be done most widely/often, so I suspect it's the easiest standpoint from which to work. Thus, for my initial foray into benchmarking, I'll probably start there. (Unless of course, in any of your wisdom, you can point me in a better direction.)

From my less-than-one-month-of-Postgres-list-lurking, I think I need to be aware of at /least/ these items for my benchmarks (in no particular order):

* overall speed (obvious)
* mitigating factors
  - DB fits entirely in memory or not (page faults)
  - DB size
  - DB versions
* DB non-SELECT performance. A common point I see in comparisons of MySQL and PostgreSQL is that MySQL is much faster. However, I rarely see anything other than comparison of SELECT.
* Query complexity (e.g. criteria, {,inner,outer}-joins)
  ex. SELECT * FROM aTable;
  vs
  SELECT FUNC( var ), ... FROM tables WHERE x IN (<list>) OR y BETWEEN a AND b ...
* Queries against tables/columns of varying data types. (BOOLEAN, SMALLINT, TEXT, VARCHAR, etc.)
* Queries against tables with/out constraints
* Queries against tables with/out triggers {post,pre}-{non,}SELECT
* Transactions
* Individual and common functions (common use, not necessarily common name, e.g. SUBSTRING/SUBSTR, MAX, COUNT, ORDER BY w/{,o} LIMIT).
* Performance under load (e.g. 1, 10, 100 concurrent users),
  - need to delineate how DB's handle concurrent queries against the same tuples AND against different tuples/tables.
* Access method (e.g. Thru C libs, via PHP/Postgres libs, apache/web, command line and stdin scripts)

# I don't currently have access to a RAID setup, so this will all have to be on a single hard drive for now. Perhaps later I can procure more hardware/situations with which to test.

Clearly, this is only a small portion of what I should be aware of when I'm benchmarking different DB's in terms of speed/performance, and already it's feeling daunting. Feel free to add any/all items about which I'm not thinking.

The other thing: as I'm still a bit of a noob, all my use of the Postgres DB has been -- for the most part -- with the stock configuration. Since I'm planning to run these tests on the same hardware, I can pseudo-rule out hardware-related differences in the results. However, I'm hoping that I can give my stats/assumptions to the list and someone would give me a configuration file that would /most likely/ be best? I can search the documentation/archives, but I'm hoping to get a head start and tweak from there.

Any and all advice would be /much/ appreciated!

Kevin
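For the "performance under load (e.g. 1, 10, 100 concurrent users)" item in the list above, a timing harness might look roughly like the sketch below. This is only an illustration, not anything proposed in the thread: the function names (`time_query`, `time_under_load`, `make_sqlite_runner`) are hypothetical, and an in-memory SQLite table from Python's standard library stands in for the real databases so the sketch runs without a server. To benchmark PostgreSQL or another DB, the runner would presumably be swapped for one that opens a connection through that DB's own driver.

```python
import sqlite3
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def time_query(run_query, repeats=5):
    """Return the wall-clock duration (seconds) of each call to run_query."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def time_under_load(make_runner, clients, repeats=5):
    """Run `clients` workers concurrently; each builds its own runner."""
    def worker(_):
        # The runner (and its connection) is created inside the worker
        # thread, so setup time is excluded from the measured durations.
        return time_query(make_runner(), repeats)

    with ThreadPoolExecutor(max_workers=clients) as pool:
        per_client = list(pool.map(worker, range(clients)))
    return [d for durations in per_client for d in durations]


def make_sqlite_runner():
    # Stand-in only (an assumption for this sketch): a private in-memory
    # SQLite table per client. Table and column names are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a_table (x INTEGER, y TEXT)")
    conn.executemany(
        "INSERT INTO a_table VALUES (?, ?)",
        [(i, str(i)) for i in range(1000)],
    )
    return lambda: conn.execute(
        "SELECT COUNT(*) FROM a_table WHERE x BETWEEN 100 AND 200"
    ).fetchall()


if __name__ == "__main__":
    for clients in (1, 10, 100):
        timings = time_under_load(make_sqlite_runner, clients)
        print(clients, "clients: median", statistics.median(timings), "s")
```

Because each stand-in client here gets a private in-memory database, this measures only the harness overhead and single-client query cost; contention on the same tuples, as the list item asks about, would only show up once all clients connect to one shared server.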
On 1/17/07, Kevin Hunter <hunteke@earlham.edu> wrote:
> Hello List,
>
> Not sure to which list I should post (gray lines, and all that), so
> point me in the right direction if'n it's a problem.
>
> I am in the process of learning some of the art/science of benchmarking.
> Given novnov's recent post about the comparison of MS SQL vs
> PostgreSQL, I felt it time to do a benchmark comparison of sorts for
> myself . . . more for me and the benchmark learning process than the
> DB's, but I'm interested in DB's in general, so it's a good fit. (If I
> find anything interesting/new, I will of course share the results.)

Just remember that all the major commercial databases have anti-benchmark clauses in their license agreements. So, if you decide to publish your results (especially in a formal benchmark), you can't mention the big boys by name. [yes this is cowardice]

merlin
On 19 Jan 2007 at 8:45a -0500, Merlin Moncure wrote:
> On 1/17/07, Kevin Hunter [hunteke∈earlham.edu] wrote:
>> I am in the process of learning some of the art/science of benchmarking.
>> Given novnov's recent post about the comparison of MS SQL vs
>> PostgreSQL, I felt it time to do a benchmark comparison of sorts for
>> myself . . . more for me and the benchmark learning process than the
>> DB's, but I'm interested in DB's in general, so it's a good fit. (If I
>> find anything interesting/new, I will of course share the results.)
>
> Just remember that all the major commercial databases have
> anti-benchmark clauses in their license agreements. So, if you decide
> to publish your results (especially in a formal benchmark), you can't
> mention the big boys by name. [yes this is cowardice]

"Anti-benchmark clauses in the license agreements"?!? Cowardice indeed!

<wry_look>So, by implication, I should do my benchmarking with "borrowed" copies, right? No sale, no agreement . . . </wry_look>

Seriously though, that would have bitten me. Thank you, I did not know that. Does that mean that I can't publish the results outside of my work/research/personal unit at all? Or do I just need to obscure about which DB I'm talking? (Like Vendor {1,2,3,...} Product).

Appreciatively,

Kevin
On Fri, Jan 19, 2007 at 09:05:35 -0500, Kevin Hunter <hunteke@earlham.edu> wrote:
>
> Seriously though, that would have bitten me. Thank you, I did not know
> that. Does that mean that I can't publish the results outside of my
> work/research/personal unit at all? Or do I just need to obscure about
> which DB I'm talking? (Like Vendor {1,2,3,...} Product).

Check with your lawyer. Depending on where you are, those clauses may not even be valid.
On 19 Jan 2007 at 10:56a -0600, Bruno Wolff III wrote:
> On Fri, Jan 19, 2007 at 09:05:35 -0500,
> Kevin Hunter <hunteke@earlham.edu> wrote:
>> Seriously though, that would have bitten me. Thank you, I did not know
>> that. Does that mean that I can't publish the results outside of my
>> work/research/personal unit at all? Or do I just need to obscure about
>> which DB I'm talking? (Like Vendor {1,2,3,...} Product).
>
> Check with your lawyer. Depending on where you are, those clauses may not even
> be valid.

<grins />

/me = student => no money . . . lawyer? You /are/ my lawyers. ;)

Well, sounds like America's legal system/red tape will at least slow my efforts against the non-open source DBs, until I get a chance to find out for sure.

I really do appreciate the warnings/heads ups.

Kevin

BTW: I'm currently located in Richmond, IN, USA. A pin for someone's map. :)