Diffstat (limited to 'README.dsc-import')
-rw-r--r-- README.dsc-import 292
1 files changed, 292 insertions, 0 deletions
diff --git a/README.dsc-import b/README.dsc-import
new file mode 100644
index 0000000..f5bb0bd
--- /dev/null
+++ b/README.dsc-import
@@ -0,0 +1,292 @@
+From ijackson Mon Sep 26 15:37:19 +0100 2016
+X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil]
+ [nil "Monday" "26" "September" "2016" "15:37:19" "+0100" "Ian Jackson" "ijackson@chiark.greenend.org.uk" nil nil "Intent to commit craziness - source package unpacking" "^From:" nil nil "9" nil nil nil nil nil nil nil nil nil nil]
+ nil)
+X-Mozilla-Status: 0001
+X-Mozilla-Status2: 00000000
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Message-ID: <22505.12959.668142.478444@chiark.greenend.org.uk>
+X-Mailer: VM 8.2.0b under 24.4.1 (i586-pc-linux-gnu)
+From: Ian Jackson <ijackson@chiark.greenend.org.uk>
+To: debian-dpkg@lists.debian.org,
+ Guido Guenther <agx@debian.org>,
+ Bernhard R. Link <brlink@debian.org>,
+ vcs-pkg-discuss@lists.alioth.debian.org
+Subject: Intent to commit craziness - source package unpacking
+Date: Mon, 26 Sep 2016 15:37:19 +0100
+
+tl;dr:
+
+ * dpkg developers, please tell me whether I am making assumptions
+ that are likely to become false. Particularly, on the behaviour of
+ successive runs of dpkg-source --before-build with successively
+ longer series files.
+
+ * git-buildpackage and git-dpm developers, please point me to
+ information about what metadata to put into the commit message for
+ a git commit which represents a dpkg-source quilt patch. I would
+ like these commits to be as convenient for gbp and git-dpm users as
+ possible.
+
+
+Hi.
+
+Currently when dgit needs to import a .dsc into git, it just uses
+dpkg-source -x, and git-add. The result is a single commit where the
+package springs into existence fully formed. This is not as good as
+it could be. I would like to represent (in the git pseudohistory) the
+way that the resulting tree is constructed from the input objects.
+
+In particular, I would like to: represent the input tarballs as a
+commit each (which all get merged together as if by git merge -s
+subtree), and for quilt packages, each patch as a commit. But I want
+to avoid (as much as possible) reimplementing the package extraction
+algorithm in dpkg-source.
+
+dpkg-source does not currently provide interfaces that look like they
+are intended for what I want to do. And dgit wants to work with old
+versions of dpkg, so I don't want to block on getting such interfaces
+added (even supposing that a sane interface could be designed, which
+is doubtful).
+
+So I intend to do as follows. (Please hold your nose.)
+
+* dgit will untar each input tarball (other than the Debian tarball).
+
+ This will be done by scanning the .dsc for things whose names look
+ like (compressed) tarballs, and using the interfaces provided by
+ Dpkg::Compression to get at the tarball.
+
+ Each input tarball unpack will be done separately, and will be
+ followed by git-add and git-write-tree, to obtain a git tree object
+ corresponding to the tarball contents.
+
+ That tree object will be made into a commit object with no parents.
+ (The package changelog will be searched for the earliest version
+ with the right upstream version component, and the information found
+ there used for the commit object's metadata.)
+
+* dgit will then run dpkg-source -x --skip-patches.
+
+ Again, git plumbing will be used to make this into a tree and a
+ commit. The commit will have as parents all the tarball commits
+ previously mentioned. The metadata will come from the .dsc and/or
+ the final changelog entry.
+
+* dgit will look to see if the package is `3.0 (quilt)' and if so
+ whether it has a series file. (dgit already rejects packages with
+ distro-specific series files, so we need worry only about a single
+ debian/patches/series file.)
+
+ If there is a series file, dgit will read it into memory. It will
+ then iterate over the series file, and each time:
+ - write into its playground a series file containing one
+ more non-comment non-empty line than previously
+ - run dpkg-source --before-build (which will apply that
+ additional patch)
+ - make git tree and commit objects, using the metadata from
+ the relevant patch file to make the commit (if available)
+ - each commit object has as its parent the preceding commit
+ (either the previous patch's commit, or the commit resulting
+ from dpkg-source -x --skip-patches)
+
+ After this the series file has been completely rewritten.
+
+* dgit will then run one final invocation of dpkg-source
+ --before-build. This ought not to produce any changes, but if
+ it does, they will be represented as another commit.
+
+* As currently, there will be a final no-change-to-the-tree
+ pseudomerge commit which stitches the package into the relevant dgit
+ suite branch; ie something that looks as if it was made with git
+ merge -s ours.
+
+* As currently, dgit will take steps so that none of the git trees
+ discussed above contain a .pc directory.
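The first step above (scanning the .dsc for names that look like compressed tarballs, other than the Debian tarball) could look roughly like the sketch below. It is illustrative only and is not dgit's code: the function name `input_tarballs`, both regular expressions, and the choice to read only the `Files:` field are assumptions made for this example.

```python
import re

# Assumed pattern: a name "looks like a (compressed) tarball" if it ends in
# .tar, optionally followed by a common compression suffix.
TARBALL_RE = re.compile(r"\.tar(?:\.(?:gz|bz2|xz|lzma))?$")
# Assumed pattern for the Debian tarball, which is skipped at this stage.
DEBIAN_TARBALL_RE = re.compile(r"\.debian\.tar(?:\.\w+)?$")


def input_tarballs(dsc_text):
    """Return names from the .dsc Files field that look like input tarballs."""
    names = []
    in_files = False
    for line in dsc_text.splitlines():
        if line.startswith("Files:"):
            in_files = True
            continue
        if not in_files:
            continue
        if not line.startswith(" "):
            # A line that is not indented starts the next field.
            in_files = False
            continue
        fields = line.split()
        # Each Files entry has the form: checksum size filename
        if len(fields) != 3:
            continue
        name = fields[2]
        if TARBALL_RE.search(name) and not DEBIAN_TARBALL_RE.search(name):
            names.append(name)
    return names
```

Each returned name would then be unpacked separately and turned into a parentless commit, as described in the step itself.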
+
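The series-file iteration described above can be sketched as a generator that yields each successively longer series file. This is a minimal illustration, not dgit's implementation: the name `series_prefixes` is hypothetical, and treating a line as a comment when it starts with `#` is an assumption. Running dpkg-source --before-build and making the commit after each step are left out.

```python
def series_prefixes(series_text):
    """Yield successive series-file contents, each with one more patch line.

    Blank lines and comment lines (assumed to start with '#') are carried
    along with the next patch line but do not trigger a new prefix.
    """
    kept = []
    for line in series_text.splitlines():
        kept.append(line)
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # One more non-comment non-empty line than the previous prefix.
            yield "\n".join(kept) + "\n"
    # Comments or blank lines after the last patch line are never emitted.
```

For a series file with N patch lines this yields N prefixes, and the last one contains every patch in the original file.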
+
+This has the following properties:
+
+* Each input tarball is represented by a different commit; in usual
+ cases these commits will be the same for every upload of the same
+ upstream version.
+
+* For `3.0 (quilt)' each patch's changes to the upstream files appear
+ as a single git commit (as does the effect of the debian tarball).
+ For `1.0' non-native, the effect of the diff is represented as a
+ commit. So eg `git blame' will show synthetic commits corresponding
+ to the correct parts of the input source package.
+
+* It is possible to `git-cherry-pick' etc. commits representing `3.0
+ (quilt)' patches. It is even possible to fish out the patch stack
+ as a git branch and rebase it elsewhere etc., since the patch stack is
+ represented as a contiguous series of commits which make only the
+ relevant upstream changes.
+
+* Every orig tarball in the source package is decompressed twice, but
+ disk space for only one extra copy of its unpacked contents is
+ needed. (The converse would be possible in principle but would be
+ very hard to arrange with the current interfaces provided by the
+ various tools.)
+
+* No back doors into the innards of dpkg-source (nor changes to
+ dpkg-dev) are required.
+
+* dgit does grow a dependency on Dpkg::Compression.
+
+* Knowledge of the source format embedded in dgit is restricted to
+ iterating over tarballs and manipulating debian/patches/series,
+ which dgit already does.
+
+* dgit now depends on dpkg-source --before-build idempotently applying
+ patches as they successively appear on debian/patches/series.
+
+* Perhaps the git commits generated by dgit to represent patches can
+ be made to round-trip nicely into tools like git-dpm and
+ git-buildpackage.
+
+ I have found the information about tags in gbp-dch(1), but that
+ doesn't seem like it's applicable.
+
+ I have also found the information about tags in gbp-pq(1). From
+ that it looks like I ought to generate "Gbp-Pq: Name" and "Gbp-Pq:
+ Topic".
+
+* The scheme I describe avoids introducing a dependency from dgit to
+ git-buildpackage. I might be able to replace the
+ successive-patch-application part with an appropriate invocation of
+ gbp-pq. Would that be better ?
+
+ Bear in mind that because the output of gbp-pq import doesn't
+ contain debian/patches, I would need to rewrite its output (perhaps
+ with git-filter-branch).
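As a sketch of the "Gbp-Pq: Name" and "Gbp-Pq: Topic" tags mentioned above, the function below builds the tag lines for one patch. It is illustrative only: the name `gbp_pq_tags` is hypothetical, and deriving the topic from the patch's subdirectory under debian/patches is an assumption about how the tags would be filled in.

```python
import posixpath


def gbp_pq_tags(patch_path):
    """Return Gbp-Pq tag lines for a patch path relative to debian/patches."""
    # Assumption: a patch in a subdirectory gets that directory as its topic.
    topic, name = posixpath.split(patch_path)
    tags = []
    if topic:
        tags.append("Gbp-Pq: Topic " + topic)
    tags.append("Gbp-Pq: Name " + name)
    return tags
```

The resulting lines would be appended to the commit message of the commit that represents the patch.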
+
+
+Comments welcome. Please be quick - this is very close to the top of
+my dgit todo list.
+
+
+Thanks,
+Ian.
+
+
+--
+Ian Jackson <ijackson@chiark.greenend.org.uk> These opinions are my own.
+
+If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
+a private address which bypasses my fierce spamfilter.
+
+From ijackson Wed Sep 28 10:50:49 +0100 2016
+X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil nil nil nil nil nil nil nil]
+ [nil "Wednesday" "28" "September" "2016" "10:50:49" "+0100" "Ian Jackson" "ijackson@chiark.greenend.org.uk" "<22507.37497.633622.843659@chiark.greenend.org.uk>" nil "Re: Intent to commit craziness - source package unpacking" "^From:" nil nil "9" nil nil nil nil nil nil nil nil nil nil]
+ nil)
+X-Mozilla-Status: 0003
+X-Mozilla-Status2: 00000000
+MIME-Version: 1.0
+Content-Type: text/plain; charset=iso-8859-1
+Content-Transfer-Encoding: quoted-printable
+Message-ID: <22507.37497.633622.843659@chiark.greenend.org.uk>
+In-Reply-To: <20160928010117.nqe2prbsbaqkbjza@gaara.hadrons.org>
+References: <22505.12959.668142.478444@chiark.greenend.org.uk>
+ <20160928010117.nqe2prbsbaqkbjza@gaara.hadrons.org>
+X-Mailer: VM 8.2.0b under 24.4.1 (i586-pc-linux-gnu)
+From: Ian Jackson <ijackson@chiark.greenend.org.uk>
+To: Guillem Jover <guillem@debian.org>
+Cc: debian-dpkg@lists.debian.org,
+ Guido Guenther <agx@debian.org>,
+ "Bernhard R. Link" <brlink@debian.org>,
+ vcs-pkg-discuss@lists.alioth.debian.org
+Subject: Re: Intent to commit craziness - source package unpacking
+Date: Wed, 28 Sep 2016 10:50:49 +0100
+
+Guillem Jover writes ("Re: Intent to commit craziness - source package unpacking"):
+> On Mon, 2016-09-26 at 15:37:19 +0100, Ian Jackson wrote:
+> > tl;dr:
+> >
+> > * dpkg developers, please tell me whether I am making assumptions
+> > that are likely to become false. Particularly, on the behaviour of
+> > successive runs of dpkg-source --before-build with successively
+> > longer series files.
+>
+> For format «3.0 (quilt)», that seems fine, to the point I'm fine even
+> documenting this, which I can probably do for 1.18.11.
+
+Great.
+
+> For other formats, such as «2.0», I don't think that's true, but I
+> assume you don't care about that one anyway. But just mentioning
+> because this behavior is probably format-specific. For «2.0» I
+> think it could be fixed, and should not be too hard (not sure if it's
+> worth it though).
+
+I think the right approach is perhaps to use --skip-patches and
+--before-build only with 3.0 (quilt). That would leave 2.0 (or
+other strange or future formats) producing a correct (although
+possibly sub-optimal) import.
+
+> > dpkg-source does not currently provide interfaces that look like they
+> > are intended for what I want to do. And dgit wants to work with old
+> > versions of dpkg, so I don't want to block on getting such interfaces
+> > added (even supposing that a sane interface could be designed, which
+> > is doubtful).
+>
+> Even then I'm still interested in a description of what you'd need
+> ideally, to take into account when having a pass at cleaning up that
+> part of the interface. I think you could be interested in a cleaner
+> Dpkg::Source::* hierarchy, for the mid/long-term?
+
+For `3.0 (quilt)' explicit interfaces for applying and unapplying
+individual patches would help. But really IMO such an interface ought
+to be exposed on the command line rather than (or as well as) via a
+Perl module.
+
+Beyond that I find it hard to see what could make dgit's life easier.
+Since dgit wants to construct a commit graph representing the source
+package's innards, unless dpkg-source explicitly provides an interface
+along those lines ("please output a graph of unpacked source tree
+states and corresponding commit messages") dgit is still going to have
+to know specially about most of the source package formats.
+
+> > * dgit will untar each input tarball (other than the Debian tarball).
+> >
+> > This will be done by scanning the .dsc for things whose names look
+> > like (compressed) tarballs, and using the interfaces provided by
+> > Dpkg::Compression to get at the tarball.
+>
+> Hmm, Dpkg::Source::Archive is currently private, but I might have a
+> look at making it public if that would be helpful here.
+
+I think the amount of logic I would have to replicate is minimal.
+
+> > * As currently, dgit will take steps so that none of the git trees
+> > discussed above contain a .pc directory.
+>
+> As long as the directory does not disappear from the working tree,
+> that should work.
+
+Right, indeed it won't.
+
+Thanks for your comments. I feel unblocked :-).
+
+Ian.
+
+--
+Ian Jackson <ijackson@chiark.greenend.org.uk> These opinions are my own.
+
+If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
+a private address which bypasses my fierce spamfilter.
+