Thread: Mentors needed urgently for SoC & PostgreSQL Student Internships

Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Josh Berkus
Date:
All,

Due to budget constraints, Google needed to cut 50 projects from the 
Summer of Code this year.  We were one of the projects cut (although we 
can re-apply next year).

However, that doesn't mean we won't be working with students this year.

1) Portland State University has generously offered to host 1 or 2 
PostgreSQL-based projects in their GSoC complement.

2) We can fund two student internships out of PostgreSQL money.  This 
would also allow us to fund some projects which don't fit within the 
parameters of GSoC (wrong schedule, high school student, docs or 
infrastructure rather than code, etc.).

What this all hinges on is getting some really solid mentors who have 
projects they'd like students to work on, and can commit unconditionally 
to having 5 hours a week or more, over a 3-month period, to work with 
the student.

If we don't get at least 4 solid mentor-volunteers by next week, I'll 
drop the whole idea and we'll stop doing internships or SoC entirely.

Note that since the Internships are not required to be project code, we 
can also take student projects to contribute to our WWW infrastructure 
and other areas the project needs some work.

--Josh Berkus  for the Core Team


Re: Mentors needed urgently for SoC & PostgreSQL Student Internships

From
"David E. Wheeler"
Date:
On Mar 25, 2009, at 2:39 PM, Josh Berkus wrote:

> Note that since the Internships are not required to be project code,  
> we can also take student projects to contribute to our WWW  
> infrastructure and other areas the project needs some work.

God, could someone do the module thing? :->

I'd be happy to mentor in whatever way my limited knowledge would  
help. I wouldn't be of much help to someone working on backend code,  
however.

Best,

David


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Gabriele Bartolini
Date:
Ciao Josh,

Josh Berkus wrote:
> What this all hinges on is getting some really solid mentors who have 
> projects they'd like students to work on, and can commit 
> unconditionally to having 5 hours a week or more, over a 3-month 
> period, to work with the student.
Thanks for letting us know. However, for this year we (as 2ndQuadrant) 
have just planned to collaborate with some Italian universities, 
starting with the University of Pisa (I spoke to their IT students last 
Monday). I don't think we can dedicate more time to mentoring in the 
short term (that's a pity, I know). :(

However, thanks again for keeping us informed.

Ciao,
Gabriele

-- 
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it



Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Josh Berkus
Date:
> Due to budget constraints, Google needed to cut 50 projects from the
> Summer of Code this year. We were one of the projects cut (although we
> can re-apply next year).

Leslie at Google has asked me to clarify this.  We *also* made a mistake 
on our application which disqualified us.

--Josh


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
"David E. Wheeler"
Date:
On Mar 29, 2009, at 10:08 PM, Josh Berkus wrote:

>
>> Due to budget constraints, Google needed to cut 50 projects from the
>> Summer of Code this year. We were one of the projects cut (although
>> we can re-apply next year).
>
> Leslie at Google has asked me to clarify this.  We *also* made a  
> mistake on our application which disqualified us.

What mistake was that??

D



Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Steven Lembark
Date:
>> Note that since the Internships are not required to be project code,
>> we can also take student projects to contribute to our WWW
>> infrastructure and other areas the project needs some work.

Would introducing a Duration (i.e., time-series
à la Date et al.) data type be considered useful?

I'd be happy to mentor someone doing it instead of
having to write the entire thing myself.

--
Steven Lembark                                            85-09 90th St.
Workhorse Computing                                 Woodhaven, NY, 11421
lembark@wrkhors.com                                      +1 888 359 3508


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
"David E. Wheeler"
Date:
On Apr 2, 2009, at 7:20 AM, Steven Lembark wrote:

>>> Note that since the Internships are not required to be project code,
>>> we can also take student projects to contribute to our WWW
>>> infrastructure and other areas the project needs some work.
>
> Would introducing a Duration (i.e., time-series
> à la Date et al.) data type be considered useful?
>
> I'd be happy to mentor someone doing it instead of
> having to write the entire thing myself.

+1

David


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Josh Berkus
Date:
On 4/2/09 8:48 AM, David E. Wheeler wrote:
> On Apr 2, 2009, at 7:20 AM, Steven Lembark wrote:
>
>>>> Note that since the Internships are not required to be project code,
>>>> we can also take student projects to contribute to our WWW
>>>> infrastructure and other areas the project needs some work.
>>
>> Would introducing a Duration (i.e., time-series
>> à la Date et al.) data type be considered useful?

Jeff Davis has already done a lot of this work; it's on pgFoundry somewhere.

--Josh


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Heikki Linnakangas
Date:
Josh Berkus wrote:
> On 4/2/09 8:48 AM, David E. Wheeler wrote:
>> On Apr 2, 2009, at 7:20 AM, Steven Lembark wrote:
>>
>>>>> Note that since the Internships are not required to be project code,
>>>>> we can also take student projects to contribute to our WWW
>>>>> infrastructure and other areas the project needs some work.
>>>
>>> Would introducing a Duration (i.e., time-series
>>> à la Date et al.) data type be considered useful?
> 
> Jeff Davis has already done a lot of this work; it's on pgFoundry 
> somewhere.

The data type itself is quite trivial. It's all the operators that are 
more difficult to implement, and also immensely useful. That part is 
still incomplete. I'd recommend a book called Temporal Data and the 
Relational Model by C.J. Date, Hugh Darwen and Nikos Lorentzos for 
anyone interested in this topic. That book gives a guideline on how the 
data type and operators should behave.

I'd love to see that implemented. I volunteer to mentor if someone wants 
to tackle it.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Jeff Davis
Date:
On Thu, 2009-04-02 at 21:19 +0300, Heikki Linnakangas wrote:
> The data type itself is quite trivial. It's all the operators that are 
> more difficult to implement, and also immensely useful. That part is 
> still incomplete.

Can you please let me know what you find lacking (note: the SVN repo is
the most current one)?

I've implemented a pretty standard set of operators, and a GiST opclass
to make things like overlaps, etc., indexable.

I have not yet implemented temporal join.
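
For readers unfamiliar with what a "standard set of operators" on an interval type usually covers, here is a minimal Python sketch. It is not the pgFoundry PERIOD code: the class name `Period`, the method names, and the choice of integer endpoints with half-open [start, end) semantics are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """Half-open interval [start, end). Hypothetical stand-in, not the PERIOD type's API."""
    start: int
    end: int

    def __post_init__(self):
        if self.start >= self.end:
            raise ValueError("start must be before end")

    def overlaps(self, other):
        # Two half-open intervals share a point iff each starts before the other ends.
        return self.start < other.end and other.start < self.end

    def contains(self, other):
        # True when `other` lies entirely inside this interval.
        return self.start <= other.start and other.end <= self.end

    def adjacent(self, other):
        # One interval ends exactly where the other begins.
        return self.end == other.start or other.end == self.start

    def intersect(self, other):
        # The shared sub-interval, or None when the intervals do not overlap.
        if not self.overlaps(other):
            return None
        return Period(max(self.start, other.start), min(self.end, other.end))
```

Predicates like `overlaps` and `contains` are the kind of thing an index opclass would make searchable; the sketch only shows their semantics, not how an index would evaluate them.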

> I'd recommend a book called Temporal Data and the 
> Relational Model by C.J. Date, Hugh Darwen and Nikos Lorentzos for 
> anyone interested in this topic. That book gives a guideline on how the 
> data type and operators should behave.

Agreed! That is a _very_ good book, and it's what I based my PERIOD type
on (I used to call it t_interval because I agree with Date that's a
better word -- but the conflict with SQL was too great so I changed it).

> I'd love to see that implemented. I volunteer to mentor if someone wants 
> to tackle it.

A big open question is whether we do new syntax, and if so, what. A lot
of the literature for temporal types out there (from people basing their
suggestions on SQL, like Snodgrass, et al., not C.J. Date) suggests
syntax extensions which seem pretty specialized and unnecessary to me,
but perhaps convenient.

The only thing I really think needs better syntax is a constructor that
can easily represent [ ), [ ], ( ), ( ] -- i.e. inclusive/exclusive.
Right now I have 4 functions to do that, but it's awkward and overly
verbose.

In a related topic, an index that can implement a non-overlapping
constraint is important to temporal databases. I have done some
implementation work on this already, based on my proposal here:

http://archives.postgresql.org//pgsql-hackers/2008-06/msg00404.php

and I have adjusted my design to address some of the concerns Tom brings
up here:

http://archives.postgresql.org//pgsql-hackers/2008-06/msg00427.php

I already have some code written, so if anyone else is thinking of
working on this please contact me first. I will post my progress in the
next couple weeks.

Regards,
	Jeff Davis








Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Heikki Linnakangas
Date:
Jeff Davis wrote:
> On Thu, 2009-04-02 at 21:19 +0300, Heikki Linnakangas wrote:
>> The data type itself is quite trivial. It's all the operators that are 
>> more difficult to implement, and also immensely useful. That part is 
>> still incomplete.
> 
> Can you please let me know what you find lacking (note: the SVN repo is
> the most current one)?
> 
> I've implemented a pretty standard set of operators, and a GiST opclass
> to make things like overlaps, etc., indexable.
> 
> I have not yet implemented temporal join.

That, and temporal union and difference. You have a union operator, but 
that's not enough for a temporal union, as in:

SELECT 'foo', (10, 20) as when
UNION temporal on when -- imaginary syntax..
SELECT 'foo', (15, 30) as when

->

'foo', (10, 30)
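
The behaviour of the imaginary syntax above can be sketched outside the database. The following Python is only an illustration of the semantics, under two assumptions: rows are modelled as `(key, (start, end))` tuples, and intervals that touch end-to-start are merged as well as those that overlap. The function name `temporal_union` is hypothetical.

```python
from collections import defaultdict


def temporal_union(*relations):
    """Union rows of (key, (start, end)), coalescing overlapping or touching intervals per key."""
    by_key = defaultdict(list)
    for relation in relations:
        for key, (start, end) in relation:
            by_key[key].append((start, end))

    result = []
    for key in sorted(by_key):
        merged = []
        for start, end in sorted(by_key[key]):
            if merged and start <= merged[-1][1]:
                # Overlaps (or touches) the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        result.extend((key, interval) for interval in merged)
    return result


print(temporal_union([("foo", (10, 20))], [("foo", (15, 30))]))
```

A plain UNION would return both input rows unchanged because they are distinct; the merging step is what makes the union temporal.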


Also, it would be nice to generalize the thing so that it works not only 
with intervals of time, but also floats, integers, numerics etc. The 
concept of an interval is not really tied to timestamps, even though 
that's probably the most common use case in the business world.

>> I'd love to see that implemented. I volunteer to mentor if someone wants 
>> to tackle it.
> 
> A big open question is whether we do new syntax, and if so, what. A lot
> of the literature for temporal types out there (from people basing their
> suggestions on SQL, like Snodgrass, et al., not C.J. Date) suggests
> syntax extensions which seem pretty specialized and unnecessary to me,
> but perhaps convenient.

I can't imagine how you would implement temporal joins and unions 
without syntax extensions. If there is a way, that would be great, 
because that might allow us to implement them without backend changes.

> The only thing I really think needs better syntax is a constructor that
> can easily represent [ ), [ ], ( ), ( ] -- i.e. inclusive/exclusive.
> Right now I have 4 functions to do that, but it's awkward and overly
> verbose.

Can't the input function handle those? Or you could have just one 
constructor with an extra argument indicating whether each end of the 
range is exclusive or inclusive.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
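
The single-constructor idea suggested above (one extra argument saying whether each end is inclusive or exclusive) might look like the following Python sketch. The names `Range` and `make_range`, and the two-character bounds string such as `"[)"`, are assumptions for illustration and not an existing API. The endpoints are ordinary values rather than a string to parse, and any orderable type works, which also touches on the point about not tying the concept to timestamps.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Range:
    """Hypothetical range value over any orderable type."""
    lower: Any
    upper: Any
    lower_inc: bool
    upper_inc: bool

    def contains_point(self, value):
        above = value >= self.lower if self.lower_inc else value > self.lower
        below = value <= self.upper if self.upper_inc else value < self.upper
        return above and below


def make_range(lower, upper, bounds="[)"):
    """Build a Range; `bounds` is one of "[)", "[]", "()", "(]"."""
    if bounds not in ("[)", "[]", "()", "(]"):
        raise ValueError("bounds must be one of '[)', '[]', '()', '(]'")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return Range(lower, upper, bounds[0] == "[", bounds[1] == "]")


# Endpoints can be computed values, not only literals.
r = make_range(datetime(2009, 1, 1), datetime.now(), "(]")
print(r.contains_point(datetime(2009, 1, 1)))  # lower bound is exclusive here
```

One function with a bounds argument replaces four separately named constructors, at the cost of validating the argument at call time.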


Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Jeff Davis
Date:
On Thu, 2009-04-02 at 21:58 +0300, Heikki Linnakangas wrote:
> > I have not yet implemented temporal join.
> 
> That, and temporal union and difference. You have a union operator, but 
> that's not enough for a temporal union, as in:

Ok, so you were talking about the relational operators, not interval
predicates or interval operators. I agree that the relational operators
are non-trivial.

> Also, it would be nice to generalize the thing so that it works not only 
> with intervals of time, but also floats, integers, numerics etc. The 
> concept of an interval is not really tied to timestamps, even though 
> that's probably the most common use case in the business world.

Yeah. I thought about how to do that with typmod, but it doesn't allow
storing an entire OID for the constituent types. It may be possible to
work around that.

> > A big open question is whether we do new syntax, and if so, what. A lot
> > of the literature for temporal types out there (from people basing their
> > suggestions on SQL, like Snodgrass, et al., not C.J. Date) suggests
> > syntax extensions which seem pretty specialized and unnecessary to me,
> > but perhaps convenient.
> 
> I can't imagine how you would implement temporal joins and unions 
> without syntax extensions. If there is a way, that would be great, 
> because that might allow us to implement them without backend changes.

I still didn't know you were talking about relational operators at that
point. Temporal join, union, difference, and also probably table logs
all require syntax (not "require" maybe, but it would help a lot).

The unnecessary syntax I was referring to is the SQL-ish syntax
suggested by Snodgrass, et al, which involves words for things like
"overlaps", which we really don't need.

> > The only thing I really think needs better syntax is a constructor that
> > can easily represent [ ), [ ], ( ), ( ] -- i.e. inclusive/exclusive.
> > Right now I have 4 functions to do that, but it's awkward and overly
> > verbose.
> 
> Can't the input function handle those? Or you could have just one 
> constructor with an extra argument indicating whether each end of the 
> range is exclusive or inclusive.

Constructing from a single string is easy. What happens when you want to
say ( 2009-01-01, now() ], or pass a timestamptz from a table?  Ideas
welcome.

Regards,
	Jeff Davis



Re: [HACKERS] Mentors needed urgently for SoC & PostgreSQL Student Internships

From
Robert Haas
Date:
On Thu, Apr 2, 2009 at 2:58 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Also, it would be nice to generalize the thing so that it works not only
> with intervals of time, but also floats, integers, numerics etc. The concept
> of an interval is not really tied to timestamps, even though that's probably
> the most common use case in the business world.

Suddenly this thread has my undivided attention.

A does-not-overlap operator would be awesome.   A does-not-overlap
index on a column whose value is a range would be awesome beyond
words.

As a simple example, consider an application whose job is to allocate
subnets out of some larger IP block.  Today, I typically handle cases
of this type by defining triggers that generate the ends of the range
and all the intermediate values and insert them into a side table with
a unique index.  It's really the pits, and unworkable for large
ranges.

...Robert
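
To make the subnet example concrete, here is a minimal in-memory Python sketch of a does-not-overlap check that compares only range endpoints, instead of expanding every intermediate value into a side table. It is not an index implementation: the class `NonOverlappingRanges` and the helper `subnet_bounds` are hypothetical names, and ranges are assumed to be half-open integer intervals [start, end).

```python
import bisect
import ipaddress


def subnet_bounds(cidr):
    """Return a subnet as a half-open integer range [first address, last address + 1)."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.broadcast_address) + 1


class NonOverlappingRanges:
    """Keeps half-open ranges sorted by start and rejects any overlapping insert."""

    def __init__(self):
        self._starts = []
        self._ends = []

    def add(self, start, end):
        if start >= end:
            raise ValueError("empty range")
        i = bisect.bisect_right(self._starts, start)
        # Stored ranges never overlap, so only the two neighbours can conflict.
        if i > 0 and self._ends[i - 1] > start:
            raise ValueError("overlaps an existing range")
        if i < len(self._starts) and self._starts[i] < end:
            raise ValueError("overlaps an existing range")
        self._starts.insert(i, start)
        self._ends.insert(i, end)


alloc = NonOverlappingRanges()
alloc.add(*subnet_bounds("10.0.0.0/24"))
alloc.add(*subnet_bounds("10.0.1.0/24"))
try:
    alloc.add(*subnet_bounds("10.0.0.128/25"))
except ValueError as exc:
    print("rejected:", exc)
```

The work per check depends on the number of allocated ranges, not on how many addresses each range spans, which is the property the side-table approach lacks for large ranges.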