Discussion: iterating over DictRow
Dear all,

I cannot currently wrap my head around why I am seeing this:

    2020-09-23 23:30:23 gmConnectionPool.py::<module>() #87): psycopg2 module version: 2.8.5 (dt dec pq3 ext)
    2020-09-23 23:30:23 gmConnectionPool.py::<module>() #88): PostgreSQL via DB-API module "<module 'psycopg2' from '/usr/lib/python3/dist-packages/psycopg2/__init__.py'>":API level 2.0, thread safety 2, parameter style "pyformat"
    2020-09-23 23:30:23 gmConnectionPool.py::<module>() #89): libpq version (compiled in): 120002
    2020-09-23 23:30:23 gmConnectionPool.py::<module>() #90): libpq version (loaded now) : 120004
    ...
    2020-09-23 23:30:28 gmConnectionPool.py::__log_on_first_contact() #445): heed Prime Directive
    2020-09-23 23:30:28 gmConnectionPool.py::__log_on_first_contact() #457): PostgreSQL version (numeric): 11.7
    ...
    2020-09-23 23:31:02 gmMacro.py::_get_variant_diagnoses() #2442): [[260, 'L: Bewegungsapparat', False, 'B', 'clin.health_issue'], [260, 'K: aHT/HI', False, 'B', 'clin.health_issue'], [260, 'D: Verdauung', False, None, 'clin.health_issue']]
    2020-09-23 23:31:02 gmMacro.py::_get_variant_diagnoses() #2443): <class 'psycopg2.extras.DictRow'>
    2020-09-23 23:31:02 gmMacro.py::_escape_dict() #2851): 260
    2020-09-23 23:31:02 gmMacro.py::__getitem__() #880): placeholder handling error: diagnoses:: \item %(diagnosis)s::
    Traceback (most recent call last):
      File "/home/ncq/Projekte/gm/git/gnumed/gnumed/Gnumed/wxpython/gmMacro.py", line 869, in __getitem__
        val = handler(data = options)
      File "/home/ncq/Projekte/gm/git/gnumed/gnumed/Gnumed/wxpython/gmMacro.py", line 2445, in _get_variant_diagnoses
        return '\n'.join(template % self._escape_dict(dx, none_string = '?', bool_strings = [_('yes'), _('no')]) for dx in selected)
      File "/home/ncq/Projekte/gm/git/gnumed/gnumed/Gnumed/wxpython/gmMacro.py", line 2445, in <genexpr>
        return '\n'.join(template % self._escape_dict(dx, none_string = '?', bool_strings = [_('yes'), _('no')]) for dx in selected)
      File "/home/ncq/Projekte/gm/git/gnumed/gnumed/Gnumed/wxpython/gmMacro.py", line 2852, in _escape_dict
        val = the_dict[field]
      File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 169, in __getitem__
        return super(DictRow, self).__getitem__(x)
    IndexError: list index out of range

This line logs a list of DictRow's:

    2020-09-23 23:31:02 gmMacro.py::_get_variant_diagnoses() #2442): [[260, 'L: Bewegungsapparat', False, 'B', 'clin.health_issue'], [260, 'K: aHT/HI', False, 'B', 'clin.health_issue'], [260, 'D: Verdauung', False, None, 'clin.health_issue']]

as evidenced here:

    2020-09-23 23:31:02 gmMacro.py::_get_variant_diagnoses() #2443): <class 'psycopg2.extras.DictRow'>

The logging code:

    _log.debug('%s', selected)
    _log.debug('%s', type(selected[0]))

The code then iterates over the list of DictRow's with a
generator expression like so:

    return '\n'.join(template % self._escape_dict(dx, none_string = '?', bool_strings = [_('yes'), _('no')]) for dx in selected)

where _escape_dict() does this:

    def _escape_dict(self, the_dict=None, date_format='%Y %b %d %H:%M', none_string='', bool_strings=None):
        data = {}
        for field in the_dict:
            _log.debug('%s', field)
            val = the_dict[field]
            if val is None:
                ...
            if isinstance(val, bool):
                ...
            if isinstance(val, datetime.datetime):
                ...
            if self.__esc_style in ['latex', 'tex']:
                ...
            elif self.__esc_style in ['xetex', 'xelatex']:
                ...
        return data

Iterating over the_dict should work for lists (produces
values) OR dicts (produces keys, under py3). It seems as if
the line

    for field in the_dict:    # I hoped to iterate over the keys = column names

somehow "decides": "currently we are treating the_dict as a
list" (after all, it is a DictRow, which _can_ be treated as a
list), which then "turns" the dict-style access in

    val = the_dict[field]

into a list access.

Somehow I have a feeling that the type of "field" -- it being
an integer (namely, a primary key) -- forces

    the_dict[field]

to attempt a list-style, index-based access (which fails, due
to there not being 260 columns in the SQL query result :-)

Solutions that come to mind:

_Must_ I use RealDictCursor() to safely avoid this trap ?

(or else dict()ify DictRow's as needed)

Any insights to be had ?

Thanks,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On 9/23/20 2:54 PM, Karsten Hilbert wrote:
> Solutions that come to mind:
>
> _Must_ I use RealDictCursor() to safely avoid this trap ?
>
> (or else dict()ify DictRow's as needed)
>
> Any insights to be had ?

Maybe?:

    import psycopg2
    from psycopg2.extras import DictCursor

    con = psycopg2.connect("dbname=production host=localhost user=postgres", cursor_factory=DictCursor)
    cur = con.cursor()
    cur.execute("select * from cell_per")
    the_dict = cur.fetchall()

    # Example 1
    for row in the_dict:
        for fld in row:
            print(fld)

This returns just the field values:

    ...
    4
    HERB 3.5
    19
    None
    2020-09-22 17:34:31
    None
    postgres
    herb
    none
    HR3

    # Example 2
    for row in the_dict:
        for fld in dict(row):
            print(fld, row[fld])

This shows the field name and the field value:

    ...
    line_id 4
    category HERB 3.5
    cell_per 19
    ts_insert None
    ts_update 2020-09-22 17:34:31
    user_insert None
    user_update postgres
    plant_type herb
    season none
    short_category HR3

--
Adrian Klaver
adrian.klaver@aklaver.com
Since DictRow's can be indexed by either the field name or the field index (either record["ID"] or record[0]) then I think what it "should" be is a little ambiguous. But you can always either cast it to a dict, or iterate over .keys() if you want the field names.

    for field in dict(the_dict):
        ...

or

    for field in the_dict.keys():
        ...
On 9/24/20 7:53 AM, David Raymond wrote:
> But you can always either cast it to a dict, or iterate over .keys() if you want the field names.
>
>     for field in dict(the_dict):
>         ...
> or
>     for field in the_dict.keys():
>         ...

Except I think the_dict is actually the_list.

I was just about to post a reply to my post saying that my use of the_dict was incorrect as I was actually iterating over a list.

--
Adrian Klaver
On Thu, Sep 24, 2020 at 02:53:40PM +0000, David Raymond wrote:
> Since DictRow's can be indexed by either the field name or
> the field index (either record["ID"] or record[0]) then I
> think what it "should" be is a little ambiguous.

I eventually thought so. I was, however, wondering whether I
should have _expected_ iteration over a DictRow to be a list
iteration.

Either of those

>     for field in dict(the_dict):
>     for field in the_dict.keys():

works, however (or RealDictCursor, for that matter).

Thanks all,
Karsten
On 9/24/20 2:11 PM, Karsten Hilbert wrote:
> I eventually thought so. I was, however, wondering whether I
> should have _expected_ iteration over a DictRow to be a list
> iteration.

Maybe. The issue is as below. Thanks to Maurice Meyer's answer to this SO question:

https://stackoverflow.com/questions/63200437/psycopg2-extras-dictrow-behaves-differently-using-for-vs-next/63200557#63200557

for pointing me in the right direction.

https://github.com/psycopg/psycopg2/blob/fbba461052ae6ebc43167ab69ad91cadb7914c83/lib/extras.py

    class DictRow(list):
        ...
        def __getitem__(self, x):
            if not isinstance(x, (int, slice)):
                x = self._index[x]
            return super(DictRow, self).__getitem__(x)

So if the value passed to __getitem__() is an integer or a slice, it does a list index.

So:

    con = psycopg2.connect("dbname=production host=localhost user=postgres", cursor_factory=DictCursor)
    cur = con.cursor()
    cur.execute("select * from cell_per")
    rs = cur.fetchall()

    type(rs)
    list

    r0 = rs[0]
    type(r0)
    psycopg2.extras.DictRow

    for fld in r0.__iter__():
        print(fld)

    5
    H PREM 3.5
    18
    None
    2004-06-02 15:11:26
    None
    postgres
    herb
    none
    HP3

What you were trying:

    for fld in r0:
        print(r0[fld])

    KeyError: 'H PREM 3.5'

So the 5 works because there are at least 6 items in the record. The 'H PREM 3.5' fails because it is a value, not a key.

    cur.execute("select cell_per from cell_per")
    rs = cur.fetchall()
    r0 = rs[0]
    r0
    [18]

    for fld in r0:
        print(r0[fld])

    IndexError: list index out of range

This fails as there are not 19 items in the DictRow.

> Either of those
>
>>     for field in dict(the_dict):
>>     for field in the_dict.keys():
>
> work, however (or RealDictCursor, for that matter).

--
Adrian Klaver
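The dispatch described above can be reproduced without a database. What follows is only a minimal sketch: MiniDictRow is a hypothetical stand-in that copies the quoted __getitem__ logic and is not the real psycopg2 class, and every column name except "diagnosis" (which appears in the template in the original log) is made up for illustration.

```python
class MiniDictRow(list):
    """Hypothetical stand-in for psycopg2.extras.DictRow (illustration only)."""

    def __init__(self, columns, values):
        super().__init__(values)
        # Maps column name -> position, playing the role of DictRow._index.
        self._index = {name: pos for pos, name in enumerate(columns)}

    def __getitem__(self, x):
        # Same dispatch as the quoted psycopg2 code: only keys that are
        # neither int nor slice are translated through the column index.
        if not isinstance(x, (int, slice)):
            x = self._index[x]
        return super().__getitem__(x)


# Column names other than "diagnosis" are assumptions, not from the thread.
row = MiniDictRow(
    ["pk", "diagnosis", "flag", "code", "source"],
    [260, "K: aHT/HI", False, "B", "clin.health_issue"],
)

first_value = next(iter(row))   # iteration is list iteration: yields values
by_name = row["diagnosis"]      # a str key is looked up via _index

try:
    row[first_value]            # the int 260 is treated as a list position
    outcome = "no error"
except IndexError as exc:
    outcome = f"IndexError: {exc}"

print(first_value, by_name, outcome)
```

Because the first value in the row happens to be an integer, it is silently reinterpreted as a position, which matches the IndexError in the original traceback.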
Adrian,

thanks for tracing my misunderstanding. I should have
confirmed my suspicion here:

> https://github.com/psycopg/psycopg2/blob/fbba461052ae6ebc43167ab69ad91cadb7914c83/lib/extras.py

> class DictRow(list):

>     def __getitem__(self, x):
>         if not isinstance(x, (int, slice)):
>             x = self._index[x]
>         return super(DictRow, self).__getitem__(x)
>
> So if the value passed to __getitem__() is an integer or a slice, it does a list
> index.

Indeed. I wonder whether that should be mentioned in the
psycopg2 docs somewhere as it might be considered to violate
the Principle Of Least Astonishment.

Or rather, this issue seems to be unfortunate fallout from python3:

In py2 one *had* to do DictRow.keys() to iterate over the
keys. In py3

    for key in DictRow:

is the suggested idiom for that which, however, iterates over
DictRow as a list (as it always did).

.keys() still exists on dicts in py3 (and is not
deprecated to my knowledge) but now returns a view object
(dict_keys, that is) rather than a list, which brings with it
its own set of issues (dict and keys "list" are not
independent objects anymore).

So, neither using py2's

    for key in DictRow.keys():

under py3 nor changing to py3's

    for key in DictRow:    # beep: variable wrongly named

leads to fully equivalent code. So this is a py2/py3 Gotcha
in psycopg2.

Not that I complain, but worth a mention in the DictRow docs
somewhere ?

Karsten
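The remark that a dict and its keys are no longer independent objects under py3 can be shown with a plain dict. This is only a small sketch using the standard library; the key names are made up, and it says nothing about what DictRow.keys() itself returns.

```python
row_as_dict = {"pk": 260, "diagnosis": "K: aHT/HI"}

keys_view = row_as_dict.keys()   # dict_keys: a live view onto the dict
keys_copy = list(row_as_dict)    # independent snapshot, like py2's keys()

# Adding a key afterwards shows up in the view but not in the snapshot.
row_as_dict["source"] = "clin.health_issue"
view_len = len(keys_view)
copy_len = len(keys_copy)

# Changing the dict's size while looping over the view is an error in py3.
error_name = None
try:
    for key in row_as_dict.keys():
        row_as_dict[key + "_escaped"] = None
except RuntimeError as exc:
    error_name = type(exc).__name__

print(type(keys_view).__name__, view_len, copy_len, error_name)
```

Wrapping the view in list() restores the py2 behaviour of working on an independent copy.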
On 9/25/20 12:48 AM, Karsten Hilbert wrote:
> So, neither using py2's
>
>     for key in DictRow.keys():
>
> under py3 nor changing to py3's
>
>     for key in DictRow:    # beep: variable wrongly named
>
> leads to fully equivalent code. So this is a py2/py3 Gotcha
> in psycopg2.

Well you can do, borrowing from previous example:

    for ky in r0._index:
        print(ky)

    line_id
    category
    cell_per
    ts_insert
    ts_update
    user_insert
    user_update
    plant_type
    season
    short_category

    for ky in r0._index:
        print(r0[ky])

    5
    H PREM 3.5
    18
    None
    2004-06-02 15:11:26
    None
    postgres
    herb
    none
    HP3

Where _index is a substitute for *.keys().

--
Adrian Klaver
On Fri, Sep 25, 2020 at 09:06:43AM -0700, Adrian Klaver wrote:
> Well you can do, borrowing from previous example:
>
>     for ky in r0._index:
>         print(ky)
>
>     for ky in r0._index:
>         print(r0[ky])
>
> Where _index is a substitute for *.keys().

Sure, there's a number of solutions to my immediate problem,
the most fitting of which is

    for key in dict(DictRow):

That's the best fit because my

    def _escape_dict(the_dict, ...):

was inaptly named. It should have been (and now is)

    def _escape_dict_like(dict_like, ...):

within which

    dict(dict_like)

is quite the thing to do, although having to make a duck out of
something which already nearly quacks like one is somewhat
unfortunate.

Karsten
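The dict(dict_like) approach could look roughly like the sketch below. This is not the actual GNUmed code: escape_dict_like is a simplified hypothetical function, RowLike is a made-up list-based stand-in for a DictRow, and the column names are illustrative. The only assumption about the input is that it offers keys() and name-based __getitem__, which is what dict() needs.

```python
class RowLike(list):
    """Hypothetical list-based row with keys(), standing in for a DictRow."""

    def __init__(self, columns, values):
        super().__init__(values)
        self._columns = list(columns)

    def keys(self):
        return iter(self._columns)

    def __getitem__(self, x):
        # Names are translated to positions; ints and slices pass through.
        if isinstance(x, str):
            x = self._columns.index(x)
        return super().__getitem__(x)


def escape_dict_like(dict_like, none_string=""):
    """Sketch: turn any mapping-like row into a plain dict of strings."""
    data = {}
    # dict() uses keys() plus name-based lookup, so the loop below is over
    # column names even when the input iterates like a list.
    for field, val in dict(dict_like).items():
        data[field] = none_string if val is None else str(val)
    return data


from_row = escape_dict_like(
    RowLike(["pk", "diagnosis", "code"], [260, "D: Verdauung", None]),
    none_string="?",
)
from_dict = escape_dict_like({"pk": 260, "code": None}, none_string="?")
print(from_row)
print(from_dict)
```

Converting once at the top of the function keeps the rest of the body free of any knowledge about which row type it was handed.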
On 9/25/20 2:16 PM, Karsten Hilbert wrote:
> Sure, there's a number of solutions to my immediate problem,
> the fitting of which is
>
>     for key in dict(DictRow):

I'm pretty sure DictRow has had the same behavior for some time so:

Are you migrating from Python 2?

Or what changed that made this show up?

--
Adrian Klaver
On Fri, Sep 25, 2020 at 03:19:12PM -0700, Adrian Klaver wrote:
> I'm pretty sure DictRow has had the same behavior for some time so:
>
> Are you migrating from Python 2?

I did, not too long ago.

> Or what changed that made this show up?

At the very core ? My grandpa was admitted to a hospital :-)

Karsten
On 9/25/20 3:25 PM, Karsten Hilbert wrote:
>> Are you migrating from Python 2?
>
> I did not too long ago.

So that explains the error that popped up.

>> Or what changed that made this show up?
>
> At the very core ? My grandpa was admitted to a hospital :-)

Sorry to hear that, it's a trial.

--
Adrian Klaver