Thread: [GENERAL] UTF-8 on Postgres wire protocol

[GENERAL] UTF-8 on Postgres wire protocol

From
Rui Pacheco
Date:
I’m toying around with the wire protocol and came across something I don’t understand.

I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the
list of columns, and while it’s fairly easy to identify the column with the name “id”, I’m not sure how to identify the
other column:

So this would be the ID column:

  […]
  [7] = 0x69
  [8] = 0x64
  [9] = 0x00
  [10] = 0x00
  [11] = 0x00
  [12] = 0x4f
  [13] = 0x08
  [14] = 0x00
  [15] = 0x01
  [16] = 0x00
  [17] = 0x00
  [18] = 0x00
  [19] = 0x17
  [20] = 0x00
  [21] = 0x04
  [22] = 0xff
  [23] = 0xff
  [24] = 0xff
  [25] = 0xff
  [26] = 0x00
  [27] = 0x00
  […]

  And this señor:
  [47] = 0x01
  [48] = 0x03
  [49] = 0x00
  [50] = 0x00
  [51] = 0x73
  [52] = 0x65
  [53] = 0xc3
  [54] = 0xb1
  [55] = 0x6f
  [56] = 0x72
  [57] = 0x00
  [58] = 0x00
  [59] = 0x00
  [60] = 0x4f
  [61] = 0x08
  [62] = 0x00
  [63] = 0x08
  [64] = 0x00
  [65] = 0x00
  [66] = 0x04
  [67] = 0x13
  [68] = 0xff
  [69] = 0xff
  [70] = 0x00
  [71] = 0x00
  […]

What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the
column name begins and ends?

Re: [GENERAL] UTF-8 on Postgres wire protocol

From
Michael Paquier
Date:
On Thu, Dec 22, 2016 at 8:25 AM, Rui Pacheco <rui.pacheco@gmail.com> wrote:
> I’m toying around with the wire protocol and came across something I don’t understand.
>
> I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the
> list of columns, and while it’s fairly easy to identify the column with the name “id”, I’m not sure how to identify the
> other column:
>
> So this would be the ID column:
>
>   […]
>   [7] = 0x69
>   [8] = 0x64

Yes this one maps to "id".

>   And this señor:
>   [47] = 0x01
>   [48] = 0x03
>   [49] = 0x00
>   [50] = 0x00

The string is from here...

>   [51] = 0x73
>   [52] = 0x65
>   [53] = 0xc3
>   [54] = 0xb1
>   [55] = 0x6f
>   [56] = 0x72

To here. And then señor ends.
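
As a small illustration of why the name takes six bytes on the wire, here is a sketch that decodes the quoted bytes [51] to [56], plus the 0x00 at [57] from the original dump. It assumes the session's client_encoding is UTF8, which the 0xc3 0xb1 pair for "ñ" suggests but the thread does not state.

```python
# Bytes [51]..[56] from the dump, followed by the 0x00 found at [57].
raw = bytes([0x73, 0x65, 0xC3, 0xB1, 0x6F, 0x72, 0x00])

# The name runs up to, but not including, the first NUL byte.
end = raw.index(0)

# Assumption: the bytes are UTF-8 (client_encoding = UTF8).
name = raw[:end].decode("utf-8")

print(name)
print(end, "bytes on the wire for", len(name), "characters")
```

The byte count and the character count differ because "ñ" is encoded as two bytes, so a parser should look for the terminating zero byte rather than count characters.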

> What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the
> column name begins and ends?

I am not sure which message you used to query them, but the answer you
are looking for is most likely here:
https://www.postgresql.org/docs/9.6/static/protocol-message-formats.html
https://www.postgresql.org/docs/9.6/static/protocol-message-types.html
If you are looking for a reliable way to re-implement the frontend-side
protocol parsing, following those docs is the way to go.
--
Michael
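
For reference, the dump looks like a RowDescription ('T') message, although the thread never confirms which message was captured. In the message-formats page linked above, each field in that message is a NUL-terminated name followed by 18 fixed bytes: table OID (Int32), attribute number (Int16), type OID (Int32), type size (Int16), type modifier (Int32) and format code (Int16). Under that reading, the four bytes before "señor" appear to be the tail of the preceding field (the last two bytes of its type modifier and its two-byte format code), and the name ends at the 0x00 at [57]. The sketch below is a minimal illustration, not a full client; `parse_field` and `FIELD_TAIL` are hypothetical names, and the sample bytes are [7] to [27] from the original post.

```python
import struct

# Fixed-size part that follows each field name, in network byte order:
# table OID (Int32), attribute number (Int16), type OID (Int32),
# type size (Int16), type modifier (Int32), format code (Int16) = 18 bytes.
FIELD_TAIL = struct.Struct("!IhIhih")


def parse_field(buf, offset, encoding="utf-8"):
    """Parse one field description at `offset`; return (field, next_offset).

    Hypothetical helper; `encoding` is assumed to match client_encoding.
    """
    end = buf.index(b"\x00", offset)  # the name is NUL-terminated
    name = buf[offset:end].decode(encoding)
    table_oid, attnum, type_oid, type_size, typmod, fmt = FIELD_TAIL.unpack_from(
        buf, end + 1
    )
    field = {
        "name": name,
        "table_oid": table_oid,
        "attnum": attnum,
        "type_oid": type_oid,
        "type_size": type_size,
        "typmod": typmod,
        "format": fmt,
    }
    # The next field's name starts right after this field's format code.
    return field, end + 1 + FIELD_TAIL.size


# Bytes [7]..[27] from the original post (the "id" column).
id_bytes = bytes.fromhex("696400" "00004f08" "0001" "00000017" "0004" "ffffffff" "0000")
field, next_offset = parse_field(id_bytes, 0)
print(field, next_offset)
```

On these bytes the type OID comes out as 23 (int4), the type size as 4, the type modifier as -1 and the format code as 0 (text). In a complete parser the loop would start after the message type byte, the Int32 length and the Int16 field count, and would call `parse_field` once per field, feeding each returned offset into the next call.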