Обсуждение: Proposal for Updating CRC32C with AVX-512 Algorithm.

Поиск
Список
Период
Сортировка

Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
Hi,

Comparing the current SSE4.2 implementation of the CRC32C algorithm in Postgres, to an optimized AVX-512 algorithm [0]
weobserved significant gains. The result was a ~6.6X average multiplier of increased performance measured on 3
differentIntel products. Details below. The AVX-512 algorithm in C is a port of the ISA-L library [1] assembler code. 

Workload call size distribution details (write heavy):
   * Average was approximately around 1,010 bytes per call
   * ~80% of the calls were under 256 bytes
   * ~20% of the calls were greater than or equal to 256 bytes up to the max buffer size of 8192

The 256 bytes is important because if the buffer is smaller, it makes sense fallback to the existing implementation.
Thisis because the AVX-512 algorithm needs a minimum of 256 bytes to operate. 

Using the above workload data distribution,
at 0%    calls < 256 bytes, a 841% improvement on average for crc32c functionality was observed.
at 50%   calls < 256 bytes, a 758% improvement on average for crc32c functionality was observed.
at 90%   calls < 256 bytes, a 44% improvement on average for crc32c functionality was observed.
at 97.6% calls < 256 bytes, the workload's crc32c performance breaks-even.
at 100%  calls < 256 bytes, a 14% regression is seen when using AVX-512 implementation.

The results above are averages over 3 machines, and were measured on: Intel Saphire Rapids bare metal, and using EC2 on
AWScloud: Intel Saphire Rapids (m7i.2xlarge) and Intel Ice Lake (m6i.2xlarge). 

Summary Data (Saphire Rapids bare metal, AWS m7i-2xl, and AWS m6i-2xl):
+---------------------+-------------------+-------------------+-------------------+--------------------+
| Rates in Bytes/us   |     Bare Metal    |    AWS m6i-2xl    |   AWS m7i-2xl     |                    |
| (Larger is Better)  +---------+---------+---------+---------+---------+---------+ Overall Multiplier |
|                     | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 |                    |
+---------------------+---------+---------+---------+---------+---------+---------+--------------------+
| Numbers 256-8192    |  12,046 |  83,196 |   7,471 |  39,965 |  11,867 |  84,589 |        6.62        |
+---------------------+---------+---------+---------+---------+---------+---------+--------------------+
| Numbers 64 - 255    |  16,865 |  15,909 |   9,209 |   7,363 |  12,496 |  10,046 |        0.86        |
+---------------------+---------+---------+---------+---------+---------+---------+--------------------+
                                                    |  Weighted Multiplier [*]    |        1.44        |
                                                    +-----------------------------+--------------------+
There was no evidence of AVX-512 frequency throttling from perf data, which stayed steady during the test.

Feedback on this proposed improvement is appreciated. Some questions:
1) This AVX-512 ISA-L derived code uses BSD-3 license [2]. Is this compatible with the PostgreSQL License [3]? They
bothappear to be very permissive licenses, but I am not an expert on licenses.  
2) Is there a preferred benchmark I should run to test this change?

If licensing is a non-issue, I can post the initial patch along with my Postgres benchmark function patch for further
review.

Thanks,
Paul

[0] https://www.researchgate.net/publication/263424619_Fast_CRC_computation#full-text
[1] https://github.com/intel/isa-l
[2] https://opensource.org/license/bsd-3-clause
[3] https://opensource.org/license/postgresql

[*] Weights used were 90% of requests less than 256 bytes, 10% greater than or equal to 256 bytes.



RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
Hi, forgive the top-post but I have not seen any response to this post?

Thanks,
Paul

> -----Original Message-----
> From: Amonson, Paul D
> Sent: Wednesday, May 1, 2024 8:56 AM
> To: pgsql-hackers@lists.postgresql.org
> Cc: Nathan Bossart <nathandbossart@gmail.com>; Shankaran, Akash
> <akash.shankaran@intel.com>
> Subject: Proposal for Updating CRC32C with AVX-512 Algorithm.
>
> Hi,
>
> Comparing the current SSE4.2 implementation of the CRC32C algorithm in
> Postgres, to an optimized AVX-512 algorithm [0] we observed significant
> gains. The result was a ~6.6X average multiplier of increased performance
> measured on 3 different Intel products. Details below. The AVX-512 algorithm
> in C is a port of the ISA-L library [1] assembler code.
>
> Workload call size distribution details (write heavy):
>    * Average was approximately around 1,010 bytes per call
>    * ~80% of the calls were under 256 bytes
>    * ~20% of the calls were greater than or equal to 256 bytes up to the max
> buffer size of 8192
>
> The 256 bytes is important because if the buffer is smaller, it makes sense
> fallback to the existing implementation. This is because the AVX-512 algorithm
> needs a minimum of 256 bytes to operate.
>
> Using the above workload data distribution,
> at 0%    calls < 256 bytes, a 841% improvement on average for crc32c
> functionality was observed.
> at 50%   calls < 256 bytes, a 758% improvement on average for crc32c
> functionality was observed.
> at 90%   calls < 256 bytes, a 44% improvement on average for crc32c
> functionality was observed.
> at 97.6% calls < 256 bytes, the workload's crc32c performance breaks-even.
> at 100%  calls < 256 bytes, a 14% regression is seen when using AVX-512
> implementation.
>
> The results above are averages over 3 machines, and were measured on: Intel
> Saphire Rapids bare metal, and using EC2 on AWS cloud: Intel Saphire Rapids
> (m7i.2xlarge) and Intel Ice Lake (m6i.2xlarge).
>
> Summary Data (Saphire Rapids bare metal, AWS m7i-2xl, and AWS m6i-2xl):
> +---------------------+-------------------+-------------------+-------------------+---------
> -----------+
> | Rates in Bytes/us   |     Bare Metal    |    AWS m6i-2xl    |   AWS m7i-2xl     |
> |
> | (Larger is Better)  +---------+---------+---------+---------+---------+---------+
> Overall Multiplier |
> |                     | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 |
> |
> +---------------------+---------+---------+---------+---------+---------+---------+-------
> -------------+
> | Numbers 256-8192    |  12,046 |  83,196 |   7,471 |  39,965 |  11,867 |
> 84,589 |        6.62        |
> +---------------------+---------+---------+---------+---------+---------+---------+-------
> -------------+
> | Numbers 64 - 255    |  16,865 |  15,909 |   9,209 |   7,363 |  12,496 |
> 10,046 |        0.86        |
> +---------------------+---------+---------+---------+---------+---------+---------+-------
> -------------+
>                                                     |  Weighted Multiplier [*]    |        1.44        |
>                                                     +-----------------------------+--------------------+
> There was no evidence of AVX-512 frequency throttling from perf data, which
> stayed steady during the test.
>
> Feedback on this proposed improvement is appreciated. Some questions:
> 1) This AVX-512 ISA-L derived code uses BSD-3 license [2]. Is this compatible
> with the PostgreSQL License [3]? They both appear to be very permissive
> licenses, but I am not an expert on licenses.
> 2) Is there a preferred benchmark I should run to test this change?
>
> If licensing is a non-issue, I can post the initial patch along with my Postgres
> benchmark function patch for further review.
>
> Thanks,
> Paul
>
> [0]
> https://www.researchgate.net/publication/263424619_Fast_CRC_computati
> on#full-text
> [1] https://github.com/intel/isa-l
> [2] https://opensource.org/license/bsd-3-clause
> [3] https://opensource.org/license/postgresql
>
> [*] Weights used were 90% of requests less than 256 bytes, 10% greater than
> or equal to 256 bytes.



Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Daniel Gustafsson
Дата:
> On 17 May 2024, at 18:21, Amonson, Paul D <paul.d.amonson@intel.com> wrote:

> Hi, forgive the top-post but I have not seen any response to this post?

The project is currently in feature-freeze in preparation for the next major
release so new development and ideas are not the top priority right now.
Additionally there is a large developer meeting shortly which many are busy
preparing for.  Excercise some patience, and I'm sure there will be follow-ups
to this once development of postgres v18 picks up.

--
Daniel Gustafsson




RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
> The project is currently in feature-freeze in preparation for the next major
> release so new development and ideas are not the top priority right now.
> Additionally there is a large developer meeting shortly which many are busy
> preparing for.  Excercise some patience, and I'm sure there will be follow-ups
> to this once development of postgres v18 picks up.

Thanks, understood.

I had our OSS internal team, who are experts in OSS licensing, review possible conflicts between the PostgreSQL license
andthe BSD-Clause 3-like license for the CRC32C AVX-512 code, and they found no issues. Therefore, including the new
licenseinto the PostgreSQL codebase should be acceptable. 

I am attaching the first official patches. The second patch is a simple test function in PostgreSQL SQL, which I used
fortesting and benchmarking. It will not be merged. 

Code Structure Question: While working on this code, I noticed overlaps with runtime CPU checks done in the previous
POPCNTmerged code. I was considering that these checks should perhaps be formalized and consolidated into a single
source/headerfile pair. If this is desirable, where should I place these files? Should it be in "src/port" where they
areused, or in "src/common" where they are available to all (not just the "src/port" tree)? 

Thanks,
Paul


Вложения

Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Tom Lane
Дата:
"Amonson, Paul D" <paul.d.amonson@intel.com> writes:
> I had our OSS internal team, who are experts in OSS licensing, review possible conflicts between the PostgreSQL
licenseand the BSD-Clause 3-like license for the CRC32C AVX-512 code, and they found no issues. Therefore, including
thenew license into the PostgreSQL codebase should be acceptable. 

Maybe you should get some actual lawyers to answer this type of
question.  The Chromium license this code cites is 3-clause-BSD
style, which is NOT compatible: the "advertising" clause is
significant.

In any case, writing copyright notices that are pointers to
external web pages is not how it's done around here.  We generally
operate on the assumption that the Postgres source code will
outlive any specific web site.  Dead links to incidental material
might be okay, but legally relevant stuff not so much.

            regards, tom lane



Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Bruce Momjian
Дата:
On Wed, Jun 12, 2024 at 02:08:02PM -0400, Tom Lane wrote:
> "Amonson, Paul D" <paul.d.amonson@intel.com> writes:
> > I had our OSS internal team, who are experts in OSS licensing, review possible conflicts between the PostgreSQL
licenseand the BSD-Clause 3-like license for the CRC32C AVX-512 code, and they found no issues. Therefore, including
thenew license into the PostgreSQL codebase should be acceptable.
 
> 
> Maybe you should get some actual lawyers to answer this type of
> question.  The Chromium license this code cites is 3-clause-BSD
> style, which is NOT compatible: the "advertising" clause is
> significant.
> 
> In any case, writing copyright notices that are pointers to
> external web pages is not how it's done around here.  We generally
> operate on the assumption that the Postgres source code will
> outlive any specific web site.  Dead links to incidental material
> might be okay, but legally relevant stuff not so much.

Agreed.  The licenses are compatible in the sense that they can be
combined to create a unified work, but they cannot be combined without
modifying the license of the combined work.  You would need to combine
the Postgres and Chrome license for this, and I highly doubt we are
going to be modifying the Postgres for this.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Andres Freund
Дата:
Hi,

I'm wonder if this isn't going in the wrong direction. We're using CRCs for
something they're not well suited for in my understanding - and are paying a
reasonably high price for it, given that even hardware accelerated CRCs aren't
blazingly fast.

CRCs are used for things like ethernet, iSCSI because they are good at
detecting the kinds of errors encountered, namely short bursts of
bitflips. And the covered data is limited to a fairly small limit.

Which imo makes CRCs a bad choice for WAL. For one, we don't actually expect a
short burst of bitflips, the most likely case is all bits after some point
changing (because only one part of the record made it to disk). For another,
WAL records are *not* limited to a small size, and if anything error detection
becomes more important with longer records (they're likely to be span more
pages / segments).


It's hard to understand, but a nonetheless helpful page is
https://users.ece.cmu.edu/~koopman/crc/crc32.html which lists properties for
crc32c:
https://users.ece.cmu.edu/~koopman/crc/c32/0x8f6e37a0_len.txt
which lists
(0x8f6e37a0; 0x11edc6f41) <=> (0x82f63b78; 0x105ec76f1)
{2147483615,2147483615,5243,5243,177,177,47,47,20,20,8,8,6,6,1,1}| gold | (*op) iSCSI; CRC-32C; CRC-32/4
 

This cryptic notion AFAIU indicates that for our polynomial we can detect 2bit
errors up to a length of 2147483615 bytes, 3 bit errors up to 2147483615, 3
and 4 bit errors up to 5243, 5 and 6 bit errors up to 177, 7/8 bit errors up
to 47.

IMO for our purposes just about all errors are going to be at least at sector
boundaries, i.e. 512 bytes and thus are at least 8 bit large. At that point we
are only guaranteed to find a single-byte error (it'll be common to have
much more) up to a lenght of 47bits. Which isn't a useful guarantee.


With that I perhaps have established that CRC guarantees aren't useful for us.
But not yet why we should use something else: Given that we already aren't
relying on hard guarantees, we could instead just use a fast hash like xxh3.
https://github.com/Cyan4973/xxHash which is fast both for large and small
amounts of data.


Greetings,

Andres Freund



Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Andres Freund
Дата:
Hi,

On 2024-05-01 15:56:08 +0000, Amonson, Paul D wrote:
> Comparing the current SSE4.2 implementation of the CRC32C algorithm in
> Postgres, to an optimized AVX-512 algorithm [0] we observed significant
> gains. The result was a ~6.6X average multiplier of increased performance
> measured on 3 different Intel products. Details below. The AVX-512 algorithm
> in C is a port of the ISA-L library [1] assembler code.
>
> Workload call size distribution details (write heavy):
>    * Average was approximately around 1,010 bytes per call
>    * ~80% of the calls were under 256 bytes
>    * ~20% of the calls were greater than or equal to 256 bytes up to the max buffer size of 8192

This is extremely workload dependent, it's not hard to find workloads with
lots of very small record and very few big ones...  What you observed might
have "just" been the warmup behaviour where more full page writes have to be
written.

There a very frequent call computing COMP_CRC32C over just 20 bytes, while
holding a crucial lock.  If we were to do introduce something like this
AVX-512 algorithm, it'd probably be worth to dispatch differently in case of
compile-time known small lengths.


How does the latency of the AVX-512 algorithm compare to just using the CRC32C
instruction?


FWIW, I tried the v2 patch on my Xeon Gold 5215 workstation, and dies early on
with SIGILL:

Program terminated with signal SIGILL, Illegal instruction.
#0  0x0000000000d5946c in _mm512_clmulepi64_epi128 (__A=..., __B=..., __C=0)
    at /home/andres/build/gcc/master/install/lib/gcc/x86_64-pc-linux-gnu/15/include/vpclmulqdqintrin.h:42
42      return (__m512i) __builtin_ia32_vpclmulqdq_v8di ((__v8di)__A,
(gdb) bt
#0  0x0000000000d5946c in _mm512_clmulepi64_epi128 (__A=..., __B=..., __C=0)
    at /home/andres/build/gcc/master/install/lib/gcc/x86_64-pc-linux-gnu/15/include/vpclmulqdqintrin.h:42
#1  pg_comp_crc32c_avx512 (crc=<optimized out>, data=<optimized out>, length=<optimized out>)
    at ../../../../../home/andres/src/postgresql/src/port/pg_crc32c_avx512.c:163
#2  0x0000000000819343 in ReadControlFile () at
../../../../../home/andres/src/postgresql/src/backend/access/transam/xlog.c:4375
#3  0x000000000081c4ac in LocalProcessControlFile (reset=<optimized out>) at
../../../../../home/andres/src/postgresql/src/backend/access/transam/xlog.c:4817
#4  0x0000000000a8131d in PostmasterMain (argc=argc@entry=85, argv=argv@entry=0x341b08f0)
    at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:902
#5  0x00000000009b53fe in main (argc=85, argv=0x341b08f0) at
../../../../../home/andres/src/postgresql/src/backend/main/main.c:197


Cascade lake doesn't have vpclmulqdq, so we shouldn't be getting here...

This is on an optimied build with meson, with -march=native included in
c_flags.

Relevant configure output:

Checking if "XSAVE intrinsics without -mxsave" : links: NO (cached)
Checking if "XSAVE intrinsics with -mxsave" : links: YES (cached)
Checking if "AVX-512 popcount without -mavx512vpopcntdq -mavx512bw" : links: NO (cached)
Checking if "AVX-512 popcount with -mavx512vpopcntdq -mavx512bw" : links: YES (cached)
Checking if "_mm512_clmulepi64_epi128 ... with -msse4.2 -mavx512vl -mvpclmulqdq" : links: YES
Checking if "x86_64: popcntq instruction" compiles: YES (cached)


Greetings,

Andres Freund



RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
> -----Original Message-----
> From: Andres Freund <andres@anarazel.de>
> Sent: Wednesday, June 12, 2024 1:12 PM
> To: Amonson, Paul D <paul.d.amonson@intel.com>

> FWIW, I tried the v2 patch on my Xeon Gold 5215 workstation, and dies early
> on with SIGILL:

Nice catch!!!  I was testing the bit for the vpclmulqdq in EBX instead of the correct ECX register. New Patch attached.
Iadded defines to make that easier to see those types of bugs rather than a simple index number. I double checked the
othersas well. 

Paul


Вложения

RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
> This is extremely workload dependent, it's not hard to find workloads with
> lots of very small record and very few big ones...  What you observed might
> have "just" been the warmup behaviour where more full page writes have to
> be written.

Can you tell me how to avoid capturing this "warm-up" so that the numbers are more accurate?

> There a very frequent call computing COMP_CRC32C over just 20 bytes, while
> holding a crucial lock.  If we were to do introduce something like this
> AVX-512 algorithm, it'd probably be worth to dispatch differently in case of
> compile-time known small lengths.

So are you suggesting that we be able to directly call into the 64/32 bit based algorithm directly from these known
smallbyte cases in the code? I think that we can do that with a separate API being exposed. 

> How does the latency of the AVX-512 algorithm compare to just using the
> CRC32C instruction?

I think I need more information on this one as I am not sure I understand the use case? The same function pointer
indirectmethods are used with or without the AVX-512 algorithm? 

Paul




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Alvaro Herrera
Дата:
On 2024-Jun-12, Amonson, Paul D wrote:

> +/*-------------------------------------------------------------------------
> + *
> + * pg_crc32c_avx512.c
> + *      Compute CRC-32C checksum using Intel AVX-512 instructions.
> + *
> + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
> + * Portions Copyright (c) 1994, Regents of the University of California
> + * Portions Copyright (c) 2024, Intel(r) Corporation
> + *
> + * IDENTIFICATION
> + *      src/port/pg_crc32c_avx512.c
> + *
> + *-------------------------------------------------------------------------
> + */

Hmm, I wonder if the "(c) 2024 Intel" line is going to bring us trouble.
(I bet it's not really necessary anyway.)

> +/*******************************************************************
> + * pg_crc32c_avx512(): compute the crc32c of the buffer, where the
> + * buffer length must be at least 256, and a multiple of 64. Based
> + * on:
> + *
> + * "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
> + * Instruction"
> + *  V. Gopal, E. Ozturk, et al., 2009,
> + *  https://www.researchgate.net/publication/263424619_Fast_CRC_computation#full-text
> + *
> + * This Function:
> + * Copyright 2017 The Chromium Authors
> + * Copyright (c) 2024, Intel(r) Corporation
> + *
> + * Use of this source code is governed by a BSD-style license that can be
> + * found in the Chromium source repository LICENSE file.
> + * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/LICENSE
> + */

And this bit doesn't look good.  The LICENSE file says:

> // Redistribution and use in source and binary forms, with or without
> // modification, are permitted provided that the following conditions are
> // met:
> //
> //    * Redistributions of source code must retain the above copyright
> // notice, this list of conditions and the following disclaimer.
> //    * Redistributions in binary form must reproduce the above
> // copyright notice, this list of conditions and the following disclaimer
> // in the documentation and/or other materials provided with the
> // distribution.
> //    * Neither the name of Google LLC nor the names of its
> // contributors may be used to endorse or promote products derived from
> // this software without specific prior written permission.

The second clause essentially says we would have to add a page to our
"documentation and/or other materials" with the contents of the license
file.

There's good reasons for UCB to have stopped using the old BSD license,
but apparently Google (or more precisely the Chromium authors) didn't
get the memo.


Our fork distributors spent a lot of time scouring out source cleaning
up copyrights, a decade ago or two.  I bet they won't be happy to see
this sort of thing crop up now.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No nos atrevemos a muchas cosas porque son difíciles,
pero son difíciles porque no nos atrevemos a hacerlas" (Séneca)



RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
> Hmm, I wonder if the "(c) 2024 Intel" line is going to bring us trouble.
> (I bet it's not really necessary anyway.)

Our lawyer agrees, copyright is covered by the "PostgreSQL Global Development Group" copyright line as a contributor.

> And this bit doesn't look good.  The LICENSE file says:
...
> > //    * Redistributions in binary form must reproduce the above
> > // copyright notice, this list of conditions and the following
> > disclaimer // in the documentation and/or other materials provided
> > with the // distribution.
...
> The second clause essentially says we would have to add a page to our
> "documentation and/or other materials" with the contents of the license file.

According to one of Intel’s lawyers, 55 instances of this clause was found when they searched in the PostgreSQL
repository.Therefore, I assume that this obligation has either been satisfied or determined not to apply, given that
thesecond BSD clause already appears in the PostgreSQL source tree. I might have misunderstood the concern, but the
lawyerbelieves this is a non-issue. Could you please provide more clarifying details about the concern?
 

Thanks,
Paul


Вложения

Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Bruce Momjian
Дата:
On Tue, Jun 18, 2024 at 05:14:08PM +0000, Amonson, Paul D wrote:
> > And this bit doesn't look good.  The LICENSE file says:
> ...
> > > //    * Redistributions in binary form must reproduce the above
> > > // copyright notice, this list of conditions and the following
> > > disclaimer // in the documentation and/or other materials provided
> > > with the // distribution.
> ...
> > The second clause essentially says we would have to add a page to our
> > "documentation and/or other materials" with the contents of the license file.
> 
> According to one of Intel’s lawyers, 55 instances of this clause was found when they searched in the PostgreSQL
repository.Therefore, I assume that this obligation has either been satisfied or determined not to apply, given that
thesecond BSD clause already appears in the PostgreSQL source tree. I might have misunderstood the concern, but the
lawyerbelieves this is a non-issue. Could you please provide more clarifying details about the concern?
 

Yes, I can confirm that:

    grep -Rl 'Redistributions in binary form must reproduce' . | wc -l

reports 54;  file list attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Bruce Momjian
Дата:
On Tue, Jun 18, 2024 at 01:20:50PM -0400, Bruce Momjian wrote:
> On Tue, Jun 18, 2024 at 05:14:08PM +0000, Amonson, Paul D wrote:
> > > And this bit doesn't look good.  The LICENSE file says:
> > ...
> > > > //    * Redistributions in binary form must reproduce the above
> > > > // copyright notice, this list of conditions and the following
> > > > disclaimer // in the documentation and/or other materials provided
> > > > with the // distribution.
> > ...
> > > The second clause essentially says we would have to add a page to our
> > > "documentation and/or other materials" with the contents of the license file.
> > 
> > According to one of Intel’s lawyers, 55 instances of this clause was found when they searched in the PostgreSQL
repository.Therefore, I assume that this obligation has either been satisfied or determined not to apply, given that
thesecond BSD clause already appears in the PostgreSQL source tree. I might have misunderstood the concern, but the
lawyerbelieves this is a non-issue. Could you please provide more clarifying details about the concern?
 
> 
> Yes, I can confirm that:
> 
>     grep -Rl 'Redistributions in binary form must reproduce' . | wc -l
> 
> reports 54;  file list attached.

I am somewhat embarrassed by this since we made the Intel lawyers find
something that was in our own source code.

First, the "advertizing clause" in the 4-clause license:

     3. All advertising materials mentioning features or use of this
    software must display the following acknowledgement: This product
    includes software developed by the University of California,
    Berkeley and its contributors.

and was disavowed by Berkeley on July 22nd, 1999:

    https://elrc-share.eu/static/metashare/licences/BSD-3-Clause.pdf

While the license we are concerned about does not have this clause, it
does have:

     2. Redistributions in binary form must reproduce the above
    copyright notice, this list of conditions and the following
    disclaimer in the documentation and/or other materials provided
    with the distribution.

I assume that must also include the name of the copyright holder.

I think that means we need to mention The Regents of the University of
California in our copyright notice, which we do.  However several
non-Regents of the University of California copyright holder licenses
exist in our source tree, and accepting this AVX-512 patch would add
another one.  Specifically, I see existing entries for:

    Aaron D. Gifford
    Board of Trustees of the University of Illinois
    David Burren
    Eric P. Allman
    Jens Schweikhardt
    Marko Kreen
    Sun Microsystems, Inc.
    WIDE Project
    
Now, some of these are these names plus Berkeley, and some are just the
names above.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Bruce Momjian
Дата:
On Tue, Jun 18, 2024 at 02:00:34PM -0400, Bruce Momjian wrote:
> While the license we are concerned about does not have this clause, it
> does have:
> 
>      2. Redistributions in binary form must reproduce the above
>     copyright notice, this list of conditions and the following
>     disclaimer in the documentation and/or other materials provided
>     with the distribution.
> 
> I assume that must also include the name of the copyright holder.
> 
> I think that means we need to mention The Regents of the University of
> California in our copyright notice, which we do.  However several
> non-Regents of the University of California copyright holder licenses
> exist in our source tree, and accepting this AVX-512 patch would add
> another one.  Specifically, I see existing entries for:
> 
>     Aaron D. Gifford
>     Board of Trustees of the University of Illinois
>     David Burren
>     Eric P. Allman
>     Jens Schweikhardt
>     Marko Kreen
>     Sun Microsystems, Inc.
>     WIDE Project
>     
> Now, some of these are these names plus Berkeley, and some are just the
> names above.

In summary, either we are doing something wrong in how we list
copyrights in our documentation, or we don't need to make any changes for
this Intel patch.

Our license is at:

    https://www.postgresql.org/about/licence/

The Intel copyright in the source code is:

     * Copyright 2017 The Chromium Authors
     * Copyright (c) 2024, Intel(r) Corporation
     *
     * Use of this source code is governed by a BSD-style license that can be
     * found in the Chromium source repository LICENSE file.
     * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/LICENSE

and the URL contents are:

    // Copyright 2015 The Chromium Authors
    //
    // Redistribution and use in source and binary forms, with or without
    // modification, are permitted provided that the following conditions are
    // met:
    //
    //    * Redistributions of source code must retain the above copyright
    // notice, this list of conditions and the following disclaimer.
    //    * Redistributions in binary form must reproduce the above
    // copyright notice, this list of conditions and the following disclaimer
    // in the documentation and/or other materials provided with the
    // distribution.
    //    * Neither the name of Google LLC nor the names of its
    // contributors may be used to endorse or promote products derived from
    // this software without specific prior written permission.
    //
    // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Google LLC is added to clause three, and I assume Intel is also covered
by this because it is considered "the names of its contributors", maybe?

It would be good to know exactly what, if any, changes the Intel lawyers
want us to make to our license if we accept this patch.

There are also different versions of clause three in our source tree.
The Postgres license only lists the University of California in our
equivalent of clause three, meaning that there are three-clause BSD
licenses in our source tree that reference entities that we don't
reference in the Postgres license.  Oddly, the Postgres license doesn't
even disclaim warranties for the PostgreSQL Global Development Group,
only for Berkeley.

An even bigger issue is that we are distributing 3-clause BSD licensed
software under the Postgres license, which is not the 3-clause BSD
license.  I think we were functioning under the assuption that the
licenses are compatibile, so can be combined, which is true, but I don't
think we can assume the individual licenses can be covered by our one
license, can we?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
"Amonson, Paul D"
Дата:
> It would be good to know exactly what, if any, changes the Intel lawyers want
> us to make to our license if we accept this patch.

I asked about this and there is nothing Intel requires here license wise. They believe that there is nothing wrong with
includingClause-3 BSD like licenses under the PostgreSQL license. They only specified that for the source file, the
applyinglicense need to be present either as a link (which was previously discouraged in this thread) or the full text.
Pleasenote that I checked and for this specific Chromium license there is not SPDX codename so the entire text is
required.

Thanks,
Paul




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

От
Bruce Momjian
Дата:
On Tue, Jun 25, 2024 at 05:41:12PM +0000, Amonson, Paul D wrote:
> > It would be good to know exactly what, if any, changes the Intel
> > lawyers want us to make to our license if we accept this patch.
>
> I asked about this and there is nothing Intel requires here license
> wise. They believe that there is nothing wrong with including Clause-3
> BSD like licenses under the PostgreSQL license. They only specified
> that for the source file, the applying license need to be present
> either as a link (which was previously discouraged in this thread)
> or the full text. Please note that I checked and for this specific
> Chromium license there is not SPDX codename so the entire text is
> required.

Okay, that is very interesting.  Yes, we will have no problem
reproducing the exact license text in the source code.  I think we can
remove the license issue as a blocker for this patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.