On Wed, Jul 29, 2020 at 5:29 PM Martijn van Oosterhout
<kleptog@gmail.com> wrote:
>
>
> On Wed, 29 Jul 2020 at 15:40, Julien Rouhaud <rjuju123@gmail.com> wrote:
>>
>> On Wed, Jul 29, 2020 at 2:42 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>> >
>> >
>> > Or, we should extend the existing query normalization to handle also DDL?
>>
>> +1, introducing DDL normalization seems like a better way to go in the
>> long run. Defining what should and shouldn't be normalized can be
>> tricky though.
>
>
> In principle, the only thing that really needs to be normalised is SAVEPOINT/CURSOR names which are essentially
randomstrings which have no effect on the result. Most other stuff is material to the query.
>
> That said, I think "aggregate by tag" has value all by itself. Being able to collapse all CREATE TABLES into a single
linecan be useful in some situations.
There's at least PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK
PREPARED that should be normalized too. I also don't think that we
really want to have different entries for begin / Begin / BEGIN /
bEgin and similar for many other commands, as the hash is computed
based on the query text.