Re: PostgreSQL Developer meeting minutes up

From: Markus Wanner
Subject: Re: PostgreSQL Developer meeting minutes up
Msg-id: 4A241340.9080903@bluegap.ch
In reply to: Re: PostgreSQL Developer meeting minutes up  (Aidan Van Dyk <aidan@highrise.ca>)
Responses: Re: PostgreSQL Developer meeting minutes up  (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: PostgreSQL Developer meeting minutes up  (Marko Kreen <markokr@gmail.com>)
List: pgsql-hackers

Hi,

a newish conversion with cvs2git is available to check here:

  git://www.bluegap.ch/

(it's not incremental and will only stay up for a few days)


For everybody interested, please check the committer names and emails.
I'm missing the names and email addresses for these committers:

    'barry' : ('barry??', ''),
    'dennis' : ('Dennis??', ''),
    'inoue' : ('inoue??', ''),
    'jurka' : ('jurka??', ''),
    'pjw' : ('pjw??', ''),

And I'm guessing that 'peter' is the same as 'petere':

    'peter' : ('Peter Eisentraut (?)', 'peter_e@gmx.net'),
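
To make the open entries easy to spot, here is a minimal sketch that lists the CVS user names still lacking an email address or carrying a question mark in the name. The dictionary is a shortened excerpt of the attached author_transforms mapping, and the helper name unresolved_authors is made up for this example.

```python
# Shortened excerpt of the author_transforms mapping from the attached config.
author_transforms = {
    'barry': ('barry??', ''),
    'petere': ('Peter Eisentraut', 'peter_e@gmx.net'),
    'peter': ('Peter Eisentraut (?)', 'peter_e@gmx.net'),
    'tgl': ('Tom Lane', 'tgl@sss.pgh.pa.us'),
}


def unresolved_authors(transforms):
    """Return CVS user names whose entry has no email or a '?' in the name.

    Hypothetical helper, not part of cvs2git.
    """
    return sorted(
        user
        for user, (name, email) in transforms.items()
        if not email or '?' in name
    )


print(unresolved_authors(author_transforms))
```

Run against the full mapping, this should report exactly the entries that still need confirmation.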


I've compared all branch heads and all tags with a cvs checkout. The
only differences are keyword expansion errors. Most commonly the RCS
version "1.1" is used in the resulting git repository, instead of
version "1.1.1.1". This also leads to getting dates wrong ($Date keyword).
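
One way to keep such keyword differences out of the comparison is to collapse expanded keywords on both sides before diffing. The following is only a sketch of that idea, not what I ran for the comparison above: the keyword list is the usual CVS set plus $PostgreSQL$, the function names are invented here, and the sample lines are made up. Multi-line $Log$ expansions are not handled.

```python
import re

# Keywords that CVS expands by default, plus the project-specific one.
KEYWORDS = ('Author', 'Date', 'Header', 'Id', 'Locker', 'Log', 'Name',
            'RCSfile', 'Revision', 'Source', 'State', 'PostgreSQL')

# Matches an expanded keyword such as "$Revision: 1.1 $" on a single line.
_KEYWORD_RE = re.compile(r'\$(' + '|'.join(KEYWORDS) + r'):[^$\n]*\$')


def collapse_keywords(text):
    """Turn '$Keyword: value $' into '$Keyword$' (hypothetical helper)."""
    return _KEYWORD_RE.sub(r'$\1$', text)


def same_apart_from_keywords(cvs_text, git_text):
    """True if two file contents differ only in keyword expansion."""
    return collapse_keywords(cvs_text) == collapse_keywords(git_text)


# Made-up sample lines showing the 1.1 vs. 1.1.1.1 mismatch.
cvs_line = ' * $Revision: 1.1.1.1 $\n'
git_line = ' * $Revision: 1.1 $\n'
print(same_apart_from_keywords(cvs_line, git_line))
```

Files that still differ after this step would then be the ones worth looking at by hand.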

I'm unsure how to test Tom's requirement that every commit and its
log message is included in the resulting git repository. Feel free to
clone and inspect the mentioned git repository and propose improvements
to the cvs2git options used.
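
A possible starting point for the log message part, as a rough sketch only: collect the log messages from both sides and report those CVS messages that show up in no git commit. How the two lists are extracted (for example from cvs log and git log output) is left open here, the function names are invented, and the sample messages are made up. Note that this checks messages only; it says nothing about whether the file contents of each commit are right.

```python
def normalize(message):
    """Collapse whitespace so equivalent messages compare equal."""
    return ' '.join(message.split())


def missing_log_messages(cvs_messages, git_messages):
    """Return normalized CVS log messages found in no git commit message.

    Hypothetical helper. Several CVS file revisions sharing one message
    map to a single git commit, so this compares sets, not counts.
    """
    seen = {normalize(m) for m in git_messages}
    wanted = {normalize(m) for m in cvs_messages}
    return sorted(wanted - seen)


# Made-up sample data.
cvs_messages = ['Fix typo in comment.\n', 'Fix typo in comment.\n',
                'Add regression test.\n']
git_messages = ['Fix typo in  comment.']
print(missing_log_messages(cvs_messages, git_messages))
```

An empty result would be necessary but not sufficient for Tom's requirement.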

Aidan Van Dyk wrote:
> Yes, but the point is you want an exact replica of CVS right?  You're
> git repo should have $PostgreSQL$ and the cvs export/checkout (you do
> use -kk right) should also have $PostgreSQL$.

No, I'm testing against cvs checkout, as that's what everybody is used to.

> But it's important, because on *some* files you *do* want expanded
> "keywords" (like the $OpenBSD ... Exp $.  One of the reasons pg CVS went
> to the $PostgreSQL$ keyword (I'm guessing) was so they could explictly
> de-couple them from other keywords that they didn't want munging on.

I don't care half as much about the keyword expansion stuff - that's
doomed to disappear anyway.

What I'm much more interested in is correctness WRT historic contents,
i.e. that git log, git blame, etc. deliver correct results. That's
certainly harder to check.

In my experience, cvs2svn (or cvs2git) does a pretty decent job at that,
even in case of some corruptions. Plus it offers lots of options to
fine-tune the conversion; see the attached configuration I've used.

> So, I wouldn't consider any conversion good unless it had all these:
>
> As well as stuff like:
>     parsecvs-master:src/backend/access/index/genam.c: *       $PostgreSQL$

I disagree here and find it more convenient for the git repository to
keep the "old" RCS versions - as in the source tarballs that got (and
still get) shipped. Just before switching over to git one can (and
should, IMO) remove these tags to avoid confusion.

Regards

Markus Wanner
# (Be in -*- mode: python; coding: utf-8 -*- mode.)

import re

from cvs2svn_lib import config
from cvs2svn_lib import changeset_database
from cvs2svn_lib.common import CVSTextDecoder
from cvs2svn_lib.log import Log
from cvs2svn_lib.project import Project
from cvs2svn_lib.git_revision_recorder import GitRevisionRecorder
from cvs2svn_lib.git_output_option import GitRevisionMarkWriter
from cvs2svn_lib.git_output_option import GitOutputOption
from cvs2svn_lib.revision_manager import NullRevisionRecorder
from cvs2svn_lib.revision_manager import NullRevisionExcluder
from cvs2svn_lib.fulltext_revision_recorder \
     import SimpleFulltextRevisionRecorderAdapter
from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader
from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader
from cvs2svn_lib.checkout_internal import InternalRevisionRecorder
from cvs2svn_lib.checkout_internal import InternalRevisionExcluder
from cvs2svn_lib.checkout_internal import InternalRevisionReader
from cvs2svn_lib.symbol_strategy import AllBranchRule
from cvs2svn_lib.symbol_strategy import AllTagRule
from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule
from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule
from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule
from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule
from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule
from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule
from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule
from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform
from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform
from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform
from cvs2svn_lib.property_setters import AutoPropsPropertySetter
from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter
from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter
from cvs2svn_lib.property_setters import CVSRevisionNumberSetter
from cvs2svn_lib.property_setters import DefaultEOLStyleSetter
from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter
from cvs2svn_lib.property_setters import ExecutablePropertySetter
from cvs2svn_lib.property_setters import KeywordsPropertySetter
from cvs2svn_lib.property_setters import MimeMapper
from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter

Log().log_level = Log.NORMAL
ctx.revision_recorder = SimpleFulltextRevisionRecorderAdapter(
    CVSRevisionReader(cvs_executable=r'cvs'),
    GitRevisionRecorder('cvs2git-tmp/git-blob.dat'),
    )

ctx.revision_excluder = NullRevisionExcluder()

ctx.revision_reader = None

ctx.sort_executable = r'sort'

ctx.trunk_only = False

ctx.cvs_author_decoder = CVSTextDecoder(
    ['ascii', 'latin1'],
    )
ctx.cvs_log_decoder = CVSTextDecoder(
    ['ascii', 'latin1'],
    )
ctx.cvs_filename_decoder = CVSTextDecoder(
    ['ascii', 'latin1'],
    )

ctx.initial_project_commit_message = (
    'Standard project directories initialized by cvs2git.'
    )

ctx.post_commit_message = (
    'This commit was generated by cvs2git to track changes on a CVS '
    'vendor branch.'
    )

ctx.symbol_commit_message = (
    "This commit was manufactured by cvs2git to create %(symbol_type)s "
    "'%(symbol_name)s'."
    )

ctx.decode_apple_single = False

ctx.symbol_info_filename = None

global_symbol_strategy_rules = [
    ExcludeTrivialImportBranchRule(),
    UnambiguousUsageRule(),
    BranchIfCommitsRule(),
    HeuristicStrategyRule(),

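    # Fallbacks for symbols that are still ambiguous. Rules are tried in
    # order and the first one that reaches a decision wins, so AllTagRule
    # below is never reached once AllBranchRule is active.
    # Convert all ambiguous symbols as branches:
    AllBranchRule(),
    # Convert all ambiguous symbols as tags:
    AllTagRule(),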

    # The last rule is here to choose the preferred parent of branches
    # and tags, that is, the line of development from which the symbol
    # sprouts.
    HeuristicPreferredParentRule(),
    ]

ctx.username = 'cvs2git'

ctx.svn_property_setters.extend([
    CVSBinaryFileEOLStyleSetter(),
    CVSBinaryFileDefaultMimeTypeSetter(),
    DefaultEOLStyleSetter(None),
    SVNBinaryFileKeywordsPropertySetter(),
    KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE),
    ExecutablePropertySetter(),
    ])

ctx.tmpdir = r'cvs2git-tmp'

ctx.cross_project_commits = False
ctx.cross_branch_commits = False
ctx.keep_cvsignore = True

ctx.retain_conflicting_attic_files = True

author_transforms={

    'adunstan' : ('Andrew Dunstan', 'andrew@dunslane.net'),
    'alvherre' : ('Alvaro Herrera', 'alvherre@commandprompt.com'),
    'barry' : ('barry??', ''),
    'bryanh' : ('Bryan Henderson', 'bryanh@giraffe.netgate.net'),
    'darcy' : ('D\'Arcy J.M. Cain', 'darcy@druid.net'),
    'dennis' : ('Dennis??', ''),
    'heikki' : ('Heikki Linnakangas', 'heikki.linnakangas@enterprisedb.com'),
    'inoue' : ('inoue??', ''),
    'ishii' : ('Tatsuo Ishii', 'ishii@sraoss.co.jp'),
    'joe' : ('Joe Conway', 'mail@joeconway.com'),
    'jurka' : ('jurka??', ''),
    'meskes' : ('Michael Meskes', 'meskes@postgresql.org'),
    'mha': ('Magnus Hagander', 'magnus@hagander.net'),
    'momjian' : ('Bruce Momjian', 'bruce@momjian.us'),
    'neilc' : ('Neil Conway', 'neil.conway@gmail.com'),
    'petere' : ('Peter Eisentraut', 'peter_e@gmx.net'),
    'peter' : ('Peter Eisentraut (?)', 'peter_e@gmx.net'),
    'pjw' : ('pjw??', ''),
    'scrappy' : ('Marc G. Fournier', 'scrappy@postgresql.org'),
    'teodor' : ('Teodor Sigaev', 'teodor@sigaev.ru'),
    'tgl' : ('Tom Lane', 'tgl@sss.pgh.pa.us'),
    'vadim' : ('Vadim B. Mikheev', 'vadim4o@yahoo.com'),
    'wieck' : ('Jan Wieck', 'JanWieck@yahoo.com'),

    'cvs2git' : ('cvs2git', 'admin@postgresql.org'),
    }

# This is the main option that causes cvs2svn to output to git rather
# than Subversion:
ctx.output_option = GitOutputOption(
    'cvs2git-tmp/git-dump.dat',
    GitRevisionMarkWriter(),
    max_merges=None,
    author_transforms=author_transforms,
    )

run_options.profiling = False


changeset_database.use_mmap_for_cvs_item_to_changeset_table = True

run_options.set_project(
    r'../postgresql.org/pgsql',

    symbol_transforms=[
        ReplaceSubstringsSymbolTransform('\\','/'),
        NormalizePathsSymbolTransform(),
        ],

    symbol_strategy_rules=global_symbol_strategy_rules,
    )

