Thread: Contributing with code

Contributing with code

From
Antonio Belloni
Date:
Hi,

This is my first post on the list. My name is Antonio. I am a CS grad student, and my field of study is databases and information retrieval. To gain some practical knowledge, I've been studying the PostgreSQL codebase for a while.

Now I would like to contribute some code, and I've chosen the following item from the TODO list:

Allow reporting of which objects are in which tablespaces

This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.

The item suggests using pg_tablespace_databases to discover which databases use a specific tablespace, and then connecting to each of those databases to find the objects stored in that tablespace.
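The two-step flow described by the TODO item can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing tool: `objects_in_tablespace`, `Relation`, and the `connect` callback are hypothetical names, `connect(dbname)` is assumed to return a DB-API 2.0 connection, and the `%s` placeholders assume a driver that uses that parameter style (psycopg2 does). The queries use only the `pg_class`, `pg_namespace`, `pg_database`, and `pg_tablespace` catalogs plus `pg_tablespace_databases`.

```python
from collections import namedtuple

# Hypothetical result record for this sketch.
Relation = namedtuple("Relation", "database schema name kind")

# Step 1: databases that have at least one object in the named tablespace.
DATABASES_SQL = """
SELECT d.datname
FROM pg_database AS d
WHERE d.datallowconn
  AND d.oid IN (SELECT pg_tablespace_databases(t.oid)
                FROM pg_tablespace AS t
                WHERE t.spcname = %s)
ORDER BY d.datname
"""

# Step 2 (run inside each database): relations with storage that live in the
# named tablespace. reltablespace = 0 means "the database's default tablespace".
RELATIONS_SQL = """
SELECT n.nspname, c.relname, c.relkind
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
JOIN pg_database AS d ON d.datname = current_database()
JOIN pg_tablespace AS t
  ON t.oid = CASE WHEN c.reltablespace = 0
                  THEN d.dattablespace
                  ELSE c.reltablespace END
WHERE t.spcname = %s
  AND c.relkind IN ('r', 'i', 'S', 't', 'm')
ORDER BY n.nspname, c.relname
"""


def _fetch(connect, dbname, sql, params):
    """Open a connection, run one query, and always close the connection."""
    conn = connect(dbname)
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        conn.close()


def objects_in_tablespace(connect, tablespace, maintenance_db="postgres"):
    """Yield a Relation for each object found in `tablespace`, database by database."""
    rows = _fetch(connect, maintenance_db, DATABASES_SQL, (tablespace,))
    for (dbname,) in rows:
        for schema, name, kind in _fetch(connect, dbname, RELATIONS_SQL, (tablespace,)):
            yield Relation(dbname, schema, name, kind)
```

With psycopg2 installed, `connect` could be something like `lambda db: psycopg2.connect(dbname=db)`. The caller still needs permission to connect to every database involved, which is part of what makes the item awkward.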

I checked the code of pg_tablespace_databases, defined in src/backend/utils/adt/misc.c, and saw that it uses a much simpler approach: it just reads the tablespace directory and returns the names of the subdirectories that represent database OIDs.

Although the function works as expected, I can see some issues not addressed in the code:

- It does not check for permissions. Any user can execute it;
- It does not check if the platform supports symlinks, which can cause an error because the function is trying to follow the links defined in base/pg_tblspc.

I could use the same approach and write a function that goes down one more level in the directory structure and finds the objects' OIDs inside each database directory, but I don't know if this is the best way to do it.
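A rough sketch of that directory-walking idea is shown below. It is not the C code from misc.c, just an illustration in Python with hypothetical function names. It assumes `location` is the directory holding one subdirectory per database (for example `base/` for the default tablespace), that database directories are named by their numeric OID, and that relation files are named `<relfilenode>` with an optional `_fsm`, `_vm`, or `_init` suffix and an optional `.N` segment number. Note that what it finds are relfilenodes, which still have to be resolved to object names inside the owning database.

```python
import os
import re

# "<relfilenode>", optionally followed by a fork suffix and/or a segment
# number, e.g. "16385", "16385_fsm", "16385.1". Other files (PG_VERSION,
# temporary-relation files such as "t3_16450") are ignored by this sketch.
_REL_FILE = re.compile(r"([0-9]+)(?:_(?:fsm|vm|init))?(?:\.[0-9]+)?")
_ALL_DIGITS = re.compile(r"[0-9]+")


def database_oids(location):
    """Return the OIDs of databases that have a non-empty directory here."""
    oids = []
    with os.scandir(location) as entries:
        for entry in entries:
            if not entry.is_dir():
                continue
            if not _ALL_DIGITS.fullmatch(entry.name):
                continue  # not a database directory
            if not os.listdir(entry.path):
                continue  # empty: assume nothing from that database lives here
            oids.append(int(entry.name))
    return sorted(oids)


def relfilenodes(location, db_oid):
    """Go one level deeper: distinct relfilenodes in one database directory."""
    nodes = set()
    for name in os.listdir(os.path.join(location, str(db_oid))):
        match = _REL_FILE.fullmatch(name)
        if match:
            nodes.add(int(match.group(1)))
    return sorted(nodes)
```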

Could someone please give me feedback and help me with this topic?

Regards,
Antonio Belloni

Re: Contributing with code

From
Peter Eisentraut
Date:
On 12/27/17 15:18, Antonio Belloni wrote:
> I checked the code of pg_tablespace_databases, defined in
> src/backend/utils/adt/misc.c, and see that it uses a much simpler
> approach : It just reads the tablespaces directories and return the name
> of the directories that represents databases OIDs. 
> 
> Although the function works as expected, I  can see some issues not
> addressed in the code :
> 
> - It does not check for permissions. Any user can execute it;
> - It does not check if the platform supports symlinks, which can cause
> an error because the function is trying to follow the links defined in
> base/pg_tblspc.
> 
> I could use the same approach and write a function that goes down one
> more level in the directory structure and find the objects' OIDs inside
> each database directory, but I don't know if this is the better way to
> do that.

The information of what object is in what tablespace already exists in
the system catalogs, so I don't think we need new ways to discover that
information.  What the todo item referred to, I think, was having a way
to discover that across database boundaries.  But I think that problem
is unsolvable by design, because you can't look into other databases.
Looking into the file system to discover the OIDs of objects might work,
but then you can't do anything with those OIDs without having access to
the respective database to resolve them.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Contributing with code

From
Craig Ringer
Date:
On 28 December 2017 at 23:05, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 12/27/17 15:18, Antonio Belloni wrote:
> > I checked the code of pg_tablespace_databases, defined in
> > src/backend/utils/adt/misc.c, and see that it uses a much simpler
> > approach : It just reads the tablespaces directories and return the name
> > of the directories that represents databases OIDs.
> >
> > Although the function works as expected, I  can see some issues not
> > addressed in the code :
> >
> > - It does not check for permissions. Any user can execute it;
> > - It does not check if the platform supports symlinks, which can cause
> > an error because the function is trying to follow the links defined in
> > base/pg_tblspc.
> >
> > I could use the same approach and write a function that goes down one
> > more level in the directory structure and find the objects' OIDs inside
> > each database directory, but I don't know if this is the better way to
> > do that.
>
> The information of what object is in what tablespace already exists in
> the system catalogs, so I don't think we need new ways to discover that
> information.  What the todo item referred to, I think, was having a way
> to discover that across database boundaries.  But I think that problem
> is unsolvable by design, because you can't look into other databases.
> Looking into the file system to discover the OIDs of objects might work,
> but then you can't do anything with those OIDs without having access to
> the respective database to resolve them.

Well, it's arguably solvable, if the solution is to allow at least limited cross-database access to catalog relations.

I don't see any compelling reason we can't do it for catalogs, because:

- Catalog oids are compiled in and the same across all DBs;

- Catalog structure is also compiled in and the same across all DBs, using Form_ structs and constants usable with fastgetattr

- We already have two relmapper instances (shared and per-db), so we should be able to instantiate one for the other DB's per-db relmapper and read its relation mapping to get oid-to-relfilenode mappings for nailed relations.

- Non-shared non-nailed relfilenodes could be discovered by reading the other DB's pg_class. You couldn't use relcache lookups, but that shouldn't matter.

The same is true for catalog indexes.
 
We can safely assume that all type oids, etc, that appear in catalog relations are the same in our connected DB as the other DB, so it's safe to use the syscache when reading "foreign" catalog relations.

It's nowhere near as practical to read *user* tables from another DB, since we'd need a way to switch the active syscache and relcache, among numerous other things. But all the restrictions we place on catalogs mean I think cross-db catalog reads should actually be practical, at least as far as reading the other db's pg_class and relmapper goes. Most of the juicy bits are hardcoded.

AFAICS it'd be necessary to extend the relmapper interface to let you instantiate new maps by dboid and pass them explicitly to the various lookup functions.  Relmapper invalidations would be a problem: I don't see how it'd be practical to handle them, since you'd have to be able to register as interested in another DB's invals somehow. And you'd have no locking. So you could probably only use it for things like this: mapping relfilenodes to oids/relnames for other DBs.

Worth the effort? Probably not. But potentially fun.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Contributing with code

From
Robert Haas
Date:
On Thu, Dec 28, 2017 at 5:42 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Well, it's arguably solveable, if the solution is to allow at least limited
> cross-database access to catalog relations.

I think this is a bad idea.  It's bound to add complexity and
fragility to the system and I don't think it's worth it.

Also, let's delete the TODO list.  People keep using it as a source of
project ideas, and that's bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


TODO list (was Re: Contributing with code)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Also, let's delete the TODO list.  People keep using it as a source of
> project ideas, and that's bad.

If we're not going to maintain/curate it properly, I agree it's not
worth keeping it around.  But I'd rather see somebody put some effort
into it ...

            regards, tom lane


Re: TODO list (was Re: Contributing with code)

From
Stephen Frost
Date:
Tom, Robert, all,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Also, let's delete the TODO list.  People keep using it as a source of
> > project ideas, and that's bad.
>
> If we're not going to maintain/curate it properly, I agree it's not
> worth keeping it around.  But I'd rather see somebody put some effort
> into it ...

I don't see anything particularly wrong with the specific item chosen
off of the todo list- it'd be nice to be able to tell what objects exist
in a tablespace even if they're in other databases.

The todo entry even talks about why it's difficult to do and what the
expected way to go about doing it is (that is, connect to each database
that has objects in the tablespace and query it to find out what's in
the tablespace).  Craig's suggestion is an interesting alternative way
though and I'm not sure that it'd be all that bad, but it would be
limited to catalog tables.

If we'd extend the system to allow transparent usage of postgres_fdw to
access other databases which are part of the same cluster, then this
could end up being much simpler (eg: select * from
otherdatabase.pg_catalog.pg_class ...).

Thanks!

Stephen

Re: TODO list (was Re: Contributing with code)

From
"David G. Johnston"
Date:
On Sun, Dec 31, 2017 at 11:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Also, let's delete the TODO list.  People keep using it as a source of
> > project ideas, and that's bad.
>
> If we're not going to maintain/curate it properly, I agree it's not
> worth keeping it around.  But I'd rather see somebody put some effort
> into it ...

It probably needs three sub-sections.  First, the raw ideas put forth by people not capable of implementation but needing capabilities; these get moved to one of two sections: ideas that have gotten some attention by core that have merit but don't have development interest presently; and ones like this that have gotten some attention and that core doesn't feel would be worth maintaining even if someone was willing to develop it.  We already have this in practice but maybe a bit more formality would help.

I'm not seeing that having it, even if incorrect, does harm.  Those who would develop from it should be using it as a conversation starter and even if it is discovered that the feature is no longer desirable or applicable that conversation will have been worthwhile to that person and probably many others browsing these mailing lists.  I do think that all of the commit-fest managers (and release note writers/reviewers) for the release year could be asked/reminded to at least skim over the ToDo list at the end of their session and see whether anything was addressed by one of the commits that went in during their tenure and remove (or update) them.

A thread started by "hey, let's talk about this ToDo list item" is one that reinforces the value of having such a list - and at worst should result in that particular entry being updated in some fashion.

David J.

Re: Contributing with code

From
Peter Geoghegan
Date:
On Sun, Dec 31, 2017 at 6:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Also, let's delete the TODO list.  People keep using it as a source of
> project ideas, and that's bad.

It would be great if the TODO list was deleted IMV.

Choosing the right project is one of the hardest and most important
parts of successfully contributing to PostgreSQL. It's usually
essential to understand in detail why the thing that you're thinking
of working on doesn't already exist. The TODO list seems to suggest
almost the opposite, and as such is a trap for inexperienced hackers.

-- 
Peter Geoghegan


Re: TODO list (was Re: Contributing with code)

From
Peter Geoghegan
Date:
On Sun, Dec 31, 2017 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we're not going to maintain/curate it properly, I agree it's not
> worth keeping it around.  But I'd rather see somebody put some effort
> into it ...

If somebody was going to resolve to put some effort into maintaining
it to a high standard then it probably would have happened already.
The fact that it hasn't happened tells us plenty.

-- 
Peter Geoghegan


Re: Contributing with code

From
Craig Ringer
Date:
On 1 January 2018 at 03:27, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sun, Dec 31, 2017 at 6:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Also, let's delete the TODO list.  People keep using it as a source of
> > project ideas, and that's bad.
>
> It would be great if the TODO list was deleted IMV.
>
> Choosing the right project is one of the hardest and most important
> parts of successfully contributing to PostgreSQL. It's usually
> essential to understand in detail why the thing that you're thinking
> of working on doesn't already exist. The TODO list seems to suggest
> almost the opposite, and as such is a trap for inexperienced hackers.

I don't entirely agree. It's a useful place to look for "are there other things related to the thing I'm contemplating doing" and "has anyone tried this? where did they get stuck?"

I'd rather rename it the "stuck, hard and abandoned projects list" ;) 

For example I try to keep track of protocol-related stuff there, so that if/when we do a real protocol revision we don't fail to consider things that have come up and since been forgotten.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: TODO list (was Re: Contributing with code)

From
Robert Haas
Date:
On Sun, Dec 31, 2017 at 2:31 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sun, Dec 31, 2017 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we're not going to maintain/curate it properly, I agree it's not
>> worth keeping it around.  But I'd rather see somebody put some effort
>> into it ...
>
> If somebody was going to resolve to put some effort into maintaining
> it to a high standard then it probably would have happened already.
> The fact that it hasn't happened tells us plenty.

+1, and well said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: TODO list (was Re: Contributing with code)

From
Robert Haas
Date:
On Sun, Dec 31, 2017 at 1:51 PM, Stephen Frost <sfrost@snowman.net> wrote:
> The todo entry even talks about why it's difficult to do and what the
> expected way to go about doing it is (that is, connect to each database
> that has objects in the tablespace and query it to find out what's in
> the tablespace).  Craig's suggestion is an interesting alternative way
> though and I'm not sure that it'd be all that bad, but it would be
> limited to catalog tables.

I think it'd be pretty bad.  There's nothing in the system that
actually guarantees that the system catalog structure matches across
every database.  Of course, if you change structural properties, then
the system will probably crash, but attstorage and attacl values could
be different, as could relfilenode, relpages, reltuples,
relallvisible, relfrozenxid, relminmxid, and relacl.  I don't think
it's wise to let this work on the theory that none of that stuff
matters.  Even if that's true, or can be made true with a crowbar,
it's a fragile assumption that might turn false in the future.

> If we'd extend the system to allow transparent usage of postgres_fdw to
> access other databases which are part of the same cluster, then this
> could end up being much simpler (eg: select * from
> otherdatabase.pg_catalog.pg_class ...).

It would probably be better to use background workers for this than
postgres_fdw to avoid for example making sure you can authenticate,
but even then this is a pretty significant body of work for what I'd
consider a fairly marginal benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: TODO list (was Re: Contributing with code)

From
Robert Haas
Date:
On Sun, Dec 31, 2017 at 2:02 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> It probably needs three sub-sections.  First, the raw ideas put forth by
> people not capable of implementation but needing capabilities; these get
> moved to one of two sections: ideas that have gotten some attention by core
> that have merit but don't have development interest presently; and ones like
> this that have gotten some attention and that core doesn't feel would be
> worth maintaining even if someone was willing to develop it.  We already
> have this in practice but maybe a bit more formality would help.
>
> I'm not seeing that having it, even if incorrect, does harm.

It causes people to waste time developing features we don't want.

It also has a note at the top saying we think it's complete, but we
don't think that, or I don't think it anyway.

It's basically disinformation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: TODO list (was Re: Contributing with code)

From
Stephen Frost
Date:
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Dec 31, 2017 at 1:51 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > The todo entry even talks about why it's difficult to do and what the
> > expected way to go about doing it is (that is, connect to each database
> > that has objects in the tablespace and query it to find out what's in
> > the tablespace).  Craig's suggestion is an interesting alternative way
> > though and I'm not sure that it'd be all that bad, but it would be
> > limited to catalog tables.
>
> I think it'd be pretty bad.  There's nothing in the system that
> actually guarantees that the system catalog structure matches across
> every database.  Of course, if you change structural properties, then
> the system will probably crash, but attstorage and attacl values could
> be different, as could relfilenode, relpages, reltuples,
> relallvisible, relfrozenxid, relminmxid, and relacl.  I don't think
> it's wise to let this work on the theory that none of that stuff
> matters.  Even if that's true, or can be made true with a crowbar,
> it's a fragile assumption that might turn false in the future.

I'm surprised to hear it described as fragile when I would certainly
expect a huge amount of push-back from the developer community if
someone suggested making template1 have a different catalog structure
from the other databases for some reason.  This seems more like an
unwritten rule to me than a 'it happens to work that way' kind of thing.

> > If we'd extend the system to allow transparent usage of postgres_fdw to
> > access other databases which are part of the same cluster, then this
> > could end up being much simpler (eg: select * from
> > otherdatabase.pg_catalog.pg_class ...).
>
> It would probably be better to use background workers for this than
> postgres_fdw to avoid for example making sure you can authenticate,
> but even then this is a pretty significant body of work for what I'd
> consider a fairly marginal benefit.

We could address the authentication issue, I believe, internally such
that users wouldn't have to actually worry about it (perhaps a special
entry in pg_hba.conf and something done through shared memory), which
would definitely be part of the point- what I was trying to get at above
is that we could possibly solve this in a much more general way by
supporting, more-or-less transparently, cross-database queries, which
would be a terribly useful feature, imv.

I agree that it'd be a significant body of work, but we would gain a
great deal more than just the ability to query the catalog of other
databases and that seems quite worthwhile to me.

Thanks!

Stephen

Re: Contributing with code

From
Peter Eisentraut
Date:
On 12/31/17 22:43, Craig Ringer wrote:
> I'd rather rename it the "stuck, hard and abandoned projects list" ;) 

That might actually be useful.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: TODO list (was Re: Contributing with code)

From
"Joshua D. Drake"
Date:
On 01/02/2018 11:17 AM, Robert Haas wrote:
> On Sun, Dec 31, 2017 at 2:31 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Sun, Dec 31, 2017 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> If we're not going to maintain/curate it properly, I agree it's not
>>> worth keeping it around.  But I'd rather see somebody put some effort
>>> into it ...
>> If somebody was going to resolve to put some effort into maintaining
>> it to a high standard then it probably would have happened already.
>> The fact that it hasn't happened tells us plenty.
> +1, and well said.

O.k. what does it tell us though? Is it a resource issue? Is it a 
barrier of entry issue? What does deleting it solve? What problems (and 
there is a very large obvious one) are caused by deleting it?

Right now, the TODO list is the "only" portal to "potential" things we 
"might" want. If we delete it we are just creating yet another barrier 
of entry to potential contribution. I think we need to consider an 
alternative solution because of that.

Thanks,

JD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****



Re: Contributing with code

From
Christopher Browne
Date:
On 2 January 2018 at 17:52, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 12/31/17 22:43, Craig Ringer wrote:
>> I'd rather rename it the "stuck, hard and abandoned projects list" ;)
>
> That might actually be useful.

Yep, agreed.  Though it might do better to describe it in *slightly*
more positive terms, and/or to explain the matter in a bit more
detail...

"TODO notes...   This page has collected together a variety of
proposed projects to enhance PostgreSQL.  Newcomers should beware; the
fact that these proposals have not been implemented should be taken to
indicate that they represent approaches that are difficult, have
gotten stuck, or even that have been abandoned due to having drawbacks
exceeding the imagined advantages.  Be sure to discuss proposals on
the pgsql-hackers list before getting too far into implementation,
otherwise there is considerable risk of wasting effort on a patch that
is likely to be rejected."

-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: TODO list (was Re: Contributing with code)

From
Patrick Krecker
Date:
On Tue, Jan 2, 2018 at 3:42 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> On 01/02/2018 11:17 AM, Robert Haas wrote:
>>
>> On Sun, Dec 31, 2017 at 2:31 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>>>
>>> On Sun, Dec 31, 2017 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>
>>>> If we're not going to maintain/curate it properly, I agree it's not
>>>> worth keeping it around.  But I'd rather see somebody put some effort
>>>> into it ...
>>>
>>> If somebody was going to resolve to put some effort into maintaining
>>> it to a high standard then it probably would have happened already.
>>> The fact that it hasn't happened tells us plenty.
>>
>> +1, and well said.
>
>
> O.k. what does it tell us though? Is it a resource issue? Is it a barrier of
> entry issue? What does deleting it solve? What problems (and there is a very
> large obvious one) are caused by deleting it?
>
> Right now, the TODO list is the "only" portal to "potential" things we
> "might" want. If we delete it we are just creating yet another barrier of
> entry to potential contribution. I think we need to consider an alternative
> solution because of that.
>
> Thanks,
>
> JD
>
>
> --
> Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
>
> PostgreSQL centered full stack support, consulting and development.
> Advocate: @amplifypostgres || Learn: https://postgresconf.org
> *****     Unless otherwise stated, opinions are my own.   *****
>
>

As a person looking to become a postgres contributor, perhaps I can
offer some perspective on this. I think there is value in providing
*some* starting point for new contributors in the form of concrete
problems to solve. The value I hope to extract from the time spent on
my first feature comes mostly from the learning experience and not
from the acceptance of the feature itself. I would not be upset if my
work was never accepted as long as I understand why. I expect most
people picking features at random from a TODO list would have a
similar outlook on their first contribution.


Re: Contributing with code

From
Noah Misch
Date:
On Tue, Jan 02, 2018 at 05:52:37PM -0500, Peter Eisentraut wrote:
> On 12/31/17 22:43, Craig Ringer wrote:
> > I'd rather rename it the "stuck, hard and abandoned projects list" ;) 
> 
> That might actually be useful.

+1.  When I do refer to a TODO entry, it's usually because the entry bears a
list of threads illustrating the difficulty of a problem.  I like the project
having such lists, but "TODO" is a bad heading for them.


Re: TODO list (was Re: Contributing with code)

From
Stephen Frost
Date:
Greetings,

* Patrick Krecker (pkrecker@gmail.com) wrote:
> As a person looking to become a postgres contributor, perhaps I can
> offer some perspective on this. I think there is value in providing
> *some* starting point for new contributors in the form of concrete
> problems to solve. The value I hope to extract from the time spent on
> my first feature comes mostly from the learning experience and not
> from the acceptance of the feature itself. I would not be upset if my
> work was never accepted as long as I understand why. I expect most
> people picking features at random from a TODO list would have a
> similar outlook on their first contribution.

I really appreciate this view-point on it and think that it's definitely
a good one to have, though I don't expect everyone to share it.

While I tend to agree that the TODO list is valuable, what I don't think
it has that is really key is any notion around 'level of difficulty' or
'who you can talk to about trying to address this item'.

We had a rather successful Google Summer of Code in 2017 and I don't
think that any of the projects accepted were ever on the TODO list.
Running GSoC last year, and reading through all of the documentation
provided by Google about how to run GSoC successfully and what they're
looking for in terms of a project list, made me realize that the areas I
mention above are really key to success.

The projects proposed through GSoC are, necessarily, constrained to what
can be done in a single summer (or, at least, serious progress being
made), but that's actually a good thing because it means that the
projects are sized well- large enough to be interesting and serious
features while small enough for someone relatively new to the community.
The "easy" things tend to be done by regular hackers, the really hard
stuff is too big for a beginner to start with.

To that end, let me just say that I strongly encourage anyone who is
interested in hacking on PG to please feel free to review what's on the
GSoC 2018 page and feel free to reach out to me if you're interested in
working on it, even if you aren't planning to participate in GSoC 2018.
We can come up with other things for GSoC students to work on, I
believe, even if half of the items currently listed there get picked up
by non-GSoC individuals.

For the regular hackers, when you see something that's really hard on
the TODO list, please mark it as such instead of just deciding that the
list is useless.  If we could quantify the level of effort required,
that'd be even better.  Ultimately, I think it'd be fantastic to have a
wiki page for each item on the TODO list that describes the project
itself, what's been discussed (with links to those discussions) and what
areas could use more exploration.

The way forward, at least from my perspective, is to try and improve the
list, not to throw it away.  I'd love to, one day, be able to go through
the TODO list and find the links to GSoC-sized projects to build the
yearly GSoC page with instead of having to make it an independent
effort.

Lastly, as a reminder, we will be submitting for GSoC 2018 soon, so
please put your GSoC-sized ideas on the GSoC 2018 page:

https://wiki.postgresql.org/wiki/GSoC_2018

and feel free to put them on the TODO list too...

Thanks!

Stephen

Re: Contributing with code

From
Stephen Frost
Date:
Noah, all,

* Noah Misch (noah@leadboat.com) wrote:
> On Tue, Jan 02, 2018 at 05:52:37PM -0500, Peter Eisentraut wrote:
> > On 12/31/17 22:43, Craig Ringer wrote:
> > > I'd rather rename it the "stuck, hard and abandoned projects list" ;) 
> >
> > That might actually be useful.
>
> +1.  When I do refer to a TODO entry, it's usually because the entry bears a
> list of threads illustrating the difficulty of a problem.  I like the project
> having such lists, but "TODO" is a bad heading for them.

Renaming the list is certainly an idea that I could get behind, though I
agree with Chris that we could keep it a bit more positive.  I also
agree that the TODO list tends towards projects that are stuck and hard,
which is why I actually think it wouldn't be that hard to go through and
mark the really hard things as really hard or even create an independent
page for them as I suggested elsewhere on this thread, because (at least
from my perception of it- which could be wrong) the overall list
doesn't actually change that much (see above wrt "stuck, hard and
abandoned" comment).  If we could see our way forward to really making
it clear that these things are stuck, hard or abandoned then maybe we
can make room for new projects to go on the list that are of reasonable
size for newcomers to the project.

Thanks!

Stephen

Re: TODO list (was Re: Contributing with code)

From
David Rowley
Date:
On 3 January 2018 at 13:12, Patrick Krecker <pkrecker@gmail.com> wrote:
> As a person looking to become a postgres contributor, perhaps I can
> offer some perspective on this. I think there is value in providing
> *some* starting point for new contributors in the form of concrete
> problems to solve. The value I hope to extract from the time spent on
> my first feature comes mostly from the learning experience and not
> from the acceptance of the feature itself. I would not be upset if my
> work was never accepted as long as I understand why. I expect most
> people picking features at random from a TODO list would have a
> similar outlook on their first contribution.

I agree with this. It was about 10 years ago when I first looked at
the TODO list and thought "I could do that!", so I did. I got lucky
and it was accepted, but without that list, I'd probably never have
written the patch and probably not be here today.

I'd say, anyone who looks at the TODO list, picks something and
implements it is doing it for a personal challenge and maybe a foot in
a door.  We should not make doing that any harder for people.

For all the others who are implementing things because they need
PostgreSQL to do that, they've not used the TODO list for that, so they
are not a concern of this thread.

I think the warning that's on the TODO list [1] is a great idea.
Perhaps it should be bigger or more verbose.

[1] https://wiki.postgresql.org/wiki/Todo

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: TODO list (was Re: Contributing with code)

From
Vik Fearing
Date:
On 01/03/2018 03:50 AM, David Rowley wrote:
> On 3 January 2018 at 13:12, Patrick Krecker <pkrecker@gmail.com> wrote:
>> As a person looking to become a postgres contributor, perhaps I can
>> offer some perspective on this. I think there is value in providing
>> *some* starting point for new contributors in the form of concrete
>> problems to solve. The value I hope to extract from the time spent on
>> my first feature comes mostly from the learning experience and not
>> from the acceptance of the feature itself. I would not be upset if my
>> work was never accepted as long as I understand why. I expect most
>> people picking features at random from a TODO list would have a
>> similar outlook on their first contribution.
>
> I agree with this. It was about 10 years ago when I first looked at
> the TODO list and thought "I could do that!", so I did. I got lucky
> and it was accepted, but without that list, I'd probably never have
> written the patch and probably not be here today.

I, too, came in through the TODO list.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: TODO list (was Re: Contributing with code)

From
Jeff Janes
Date:
On Tue, Jan 2, 2018 at 2:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Dec 31, 2017 at 2:02 PM, David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> > It probably needs three sub-sections.  First the raw ideas put forth by
> > people not capable of implementation but needing capabilities; these get
> > moved to one of two sections: ideas that have gotten some attention by core
> > that have merit but don't have development interest presently; and one like
> > this that have gotten some attention and that core doesn't feel would be
> > worth maintaining even if someone was willing to develop it.  We already
> > have this in practice but maybe a bit more formality would help.
> >
> > I'm not seeing that having it, even if incorrect, does harm.
>
> It causes people to waste time developing features we don't want.

We don't want them at all, or we just don't want a naive implementation of them?

If we don't want them at all, then surely we should remove those items.  Or move them to the bottom of the page, where there is a section just for such things.  That way people can at least see that it has been considered and rejected.  And for things that we do want, it is nice to have links to the emails of where it was discussed/attempted before.  This is especially useful because searching the archives is very inefficient due to nearly every word you want to search on being a stop word or a very common word with many meanings.  It is much easier to find it on the todo list if it is there than to search the archives.
 

> It also has a note at the top saying we think it's complete, but we
> don't think that, or I don't think it anyway.

Yeah, I don't care for that, either.

Cheers,

Jeff

Re: TODO list (was Re: Contributing with code)

From
Jeff Janes
Date:
On Tue, Jan 2, 2018 at 6:42 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> On 01/02/2018 11:17 AM, Robert Haas wrote:
> > On Sun, Dec 31, 2017 at 2:31 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> > > On Sun, Dec 31, 2017 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > If we're not going to maintain/curate it properly, I agree it's not
> > > > worth keeping it around.  But I'd rather see somebody put some effort
> > > > into it ...
> > > If somebody was going to resolve to put some effort into maintaining
> > > it to a high standard then it probably would have happened already.
> > > The fact that it hasn't happened tells us plenty.
> > +1, and well said.
>
> O.k. what does it tell us though? Is it a resource issue? Is it a barrier of entry issue?

Lack of ownership/ruthlessness.  While I can edit it to remove items that don't seem desirable (or comprehensible, or whatever) I'm not likely to do so, unless I'm the one who added it in the first place. Maybe it made more sense or was more important to someone else, like the person who added it.  At one time many of the items didn't have links to the relevant email discussions (or more detailed wiki pages of their own), so those would have been good targets for purging but I think Bruce hunted down and added links for most of them.

Another problem is that wikimedia doesn't have a "git blame" like feature.  I've been frustrated before trying to figure out who added an item and when, so I could research it a bit more. 
 
> What does deleting it solve? What problems (and there is a very large obvious one) are caused by deleting it?
>
> Right now, the TODO list is the "only" portal to "potential" things we "might" want. If we delete it we are just creating yet another barrier of entry to potential contribution. I think we need to consider an alternative solution because of that.

There are various "roadmaps" floating around (wiki and elsewhere), but they aren't very prominent or easy to find.  They seem to mostly be minutes from meetings, but you wouldn't know to look for them if you weren't at the meeting.

Cheers,

Jeff

Re: TODO list (was Re: Contributing with code)

From
Alvaro Herrera
Date:
I think deleting the TODO list is a bad idea -- it contains very useful
pointers to previous discussion on hard topics.  

Jeff Janes wrote:
> On Tue, Jan 2, 2018 at 2:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> > It also has a note at the top saying we think it's complete, but we
> > don't think that, or I don't think it anyway.
> 
> Yeah, I don't care for that, either.

This text was added by [1] as saying:
    > This list contains '''some known PostgreSQL bugs and feature
    > requests''' and we hope it contains all such
(before this, it said "all known Pg bugs" which seemed too optimistic,
so the correction was in a good direction) and later amended by [2] to
the current wording ("This list contains '''known PostgreSQL bugs and
feature requests''' and we hope it is complete").  I think [1]'s wording
was okay, but the other one not so much.

I rewrote the text.  I thought about presenting my draft here first, but
hey, it's a wiki.

[1] https://wiki.postgresql.org/index.php?title=Todo&diff=17840&oldid=17669
[2] https://wiki.postgresql.org/index.php?title=Todo&diff=17961&oldid=17877

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: TODO list (was Re: Contributing with code)

From
"Joshua D. Drake"
Date:
On 01/03/2018 07:49 AM, Jeff Janes wrote:
>
> > O.k. what does it tell us though? Is it a resource issue? Is it a
> > barrier of entry issue?
>
> Lack of ownership/ruthlessness.  While I can edit it to remove items 
> that don't seem desirable (or comprehensible, or whatever) I'm not 
> likely to do so, unless I'm the one who added it in the first place. 
> Maybe it made more sense or was more important to someone else, like 
> the person who added it.  At one time many of the items didn't have 
> links to the relevant email discussions (or more detailed wiki pages 
> of their own), so those would have been good targets for purging but I 
> think Bruce hunted down and added links for most of them.
>
> Another problem is that wikimedia doesn't have a "git blame" like 
> feature.  I've been frustrated before trying to figure out who added 
> an item and when, so I could research it a bit more.

It seems to me that this is rather easily solved with an issue tracker 
or enhancement to the commitfest app. With an issue tracker we just set 
a status of "wishlist" and then set up a simple public report that 
allows people to see wishlist items. Plus we get the benefit of an issue 
tracker.

The commitfest app could work in a similar fashion. The commitfest app 
could be replaced by an issue tracker too, but we as a community tend to 
suffer from NIH, so it might be less of a battle to just adjust the 
commitfest app.

Either one of these options allows consolidation of tools as well as 
information portals, both of which would help the community.

Heck we could go a step further and actually allow (authenticated) 
voting on various features. This would provide the community the ability 
to more easily interact with -hackers on various features that would be 
desirable. (I am not suggesting that the voting dictate our direction 
but to allow feedback on new and interesting ideas that show community 
support)

Of course that takes resources. There have been discussions of getting 
an issue tracker in the past but the priority has appeared to drop to 
the wayside.

Thanks,

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****



Re: TODO list (was Re: Contributing with code)

From
Peter Eisentraut
Date:
On 1/3/18 11:10, Alvaro Herrera wrote:
> This text was added by [1] as saying:
>     > This list contains '''some known PostgreSQL bugs and feature
>     > requests''' and we hope it contains all such
> (before this, it said "all known Pg bugs" which seemed too optimistic,
> so the correction was in a good direction) and later amended by [2] to
> the current wording ("This list contains '''known PostgreSQL bugs and
> feature requests''' and we hope it is complete").  I think [1]'s wording
> was okay, but the other one not so much.
> 
> I rewrote the text.  I thought about presenting my draft here first, but
> hey, it's a wiki.

I went through it just now as well.  I have cleaned up the formatting a
bit and removed some entries that were done.

It's not bad really, as long as the intro text is more appropriate.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: TODO list (was Re: Contributing with code)

From
Alvaro Herrera
Date:
Joshua D. Drake wrote:

> Heck we could go a step further and actually allow (authenticated) voting on
> various features. This would provide the community the ability to more
> easily interact with -hackers on various features that would be desirable.

There's already https://postgresql.uservoice.com/forums/21853-general
which seems to work pretty well.  Here's the list of completed items:
https://postgresql.uservoice.com/forums/21853-general?status_id=124172

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: TODO list (was Re: Contributing with code)

From
"Joshua D. Drake"
Date:
On 01/03/2018 09:00 AM, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
> There's already https://postgresql.uservoice.com/forums/21853-general
> which seems to work pretty well.  Here's the list of completed items:
> https://postgresql.uservoice.com/forums/21853-general?status_id=124172

Well that's interesting, I have never heard of that. Is that something 
we want to promote more?

JD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****



Re: Contributing with code

From
Antonio Belloni
Date:
Sorry, but I did not want to start a flame war against the TODO list with my first message. In all the other open source projects I have contributed code to, the TODO list is always a starting point for newcomers. There's no explicit message in the PostgreSQL TODO list saying that the projects there are hard, stuck or undesirable. So it's very confusing to a newbie who just wants to help and tries to learn something in the process.

And now I don't know if I should continue to work on the issue from my first message and post my ideas on the list, or if I should find other ways to contribute, for example, fixing bugs from the bug list.

Regards,
Antonio Belloni


Atenciosamente,
Antonio Belloni
abelloni@rioservice.com
+55 21 3083-1939
+55 21 99327-0200
 
RIO SERVICE | Tecnologia em Movimento
Av. Pastor Martin Luther King Jr. 126 - Grupo 465
Centro Empresarial Shopping Nova América
http://www.rioservice.com


2018-01-03 0:47 GMT-02:00 Stephen Frost <sfrost@snowman.net>:
> Noah, all,
>
> * Noah Misch (noah@leadboat.com) wrote:
> > On Tue, Jan 02, 2018 at 05:52:37PM -0500, Peter Eisentraut wrote:
> > > On 12/31/17 22:43, Craig Ringer wrote:
> > > > I'd rather rename it the "stuck, hard and abandoned projects list" ;)
> > >
> > > That might actually be useful.
> >
> > +1.  When I do refer to a TODO entry, it's usually because the entry bears a
> > list of threads illustrating the difficulty of a problem.  I like the project
> > having such lists, but "TODO" is a bad heading for them.
>
> Renaming the list is certainly an idea that I could get behind, though I
> agree with Chris that we could keep it a bit more positive.  I also
> agree that the TODO list tends towards projects that are stuck and hard,
> which is why I actually think it wouldn't be that hard to go through and
> mark the really hard things as really hard or even create an independent
> page for them as I suggested elsewhere on this thread, because (at least
> from my perception of it- which could be wrong) the overall list
> doesn't actually change that much (see above wrt "stuck, hard and
> abandoned" comment).  If we could see our way forward to really making
> it clear that these things are stuck, hard or abandoned then maybe we
> can make room for new projects to go on the list that are of reasonable
> size for newcomers to the project.
>
> Thanks!
>
> Stephen

Re: Contributing with code

From
Alvaro Herrera
Date:
Antonio Belloni wrote:
> Sorry, but I did not want to start a flame war against the TODO list with
> my first message. In all the other open source projects I have contributed
> code to, the TODO list is always a starting point for newcomers. There's no
> explicit message in the PostgreSQL TODO list saying that the projects there
> are hard, stuck or undesirable. So it's very confusing to a newbie who just
> wants to help and tries to learn something in the process.

Yeah.  In this project, the tendency is that when things are
straightforward, they get implemented rather than documented.  So there
are no tasks that are known and also easy, because they are taken as
soon as somebody thinks of them.  They only get documented once someone
tries to implement one and realizes it was not as easy as it seemed.

> And now I don't know if I should continue to work on the issue from my first
> message and post my ideas on the list, or if I should find other ways to
> contribute, for example, fixing bugs from the bug list.

Fixing bugs is a very good way to learn the system.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services