We would like to: represent the input tarballs as a commit each (which
all get merged together as if by git merge -s subtree), and for quilt
packages, each patch as a commit.  But we want to avoid (as much as
possible) reimplementing the package extraction algorithm in
dpkg-source.

dpkg-source does not currently provide interfaces that look like they
are intended for what dgit wants to do.  And dgit wants to work with
old versions of dpkg, so I have implemented the following algorithm
rather than wait for such interfaces to be added (even supposing that
a sane interface could be designed, which is doubtful):

* dgit will untar each input tarball.

  This will be done by scanning the .dsc for things whose names look
  like (compressed) tarballs, and using the interfaces provided by
  Dpkg::Compression to get at the tarball.

  Each input tarball unpack will be done separately, and will be
  followed by git add and git write-tree, to obtain a git tree object
  corresponding to the tarball contents.

  That tree object will be made into a commit object with no parents.
  (The package changelog will be searched for the earliest version
  with the right upstream version component, and the information found
  there used for the commit object's metadata.)

* For `3.0 (quilt)', dgit will run
    dpkg-source -x --skip-patches

  git plumbing will be used to make the result into a tree and a
  commit.  The commit will have as parents all the tarballs previously
  mentioned.  The main orig tarball will be the leftmost parent and
  the debian tarball the rightmost parent.  The metadata will come
  from the .dsc and/or the final changelog entry.

  dgit will then run dpkg-source --before-build and record the
  resulting tree, too.

  Then, dgit will switch back to the patches-unapplied version and use
  `gbp pq import' (in the private working area) to turn the
  patches-unapplied tree into a patches-applied one.

  Finally dgit will check that the gbp pq generated patches-applied
  version has the same git tree object as the one generated by
  dpkg-source --before-build.

* For source formats other than `3.0 (quilt)', dgit will simply run
    dpkg-source -x

  Again, it will make that into a tree and a commit.

* For source formats with only a single file entry in the .dsc, the
  (one) tarball is not imported separately (since its tree object
  would be the same as the extracted object), and the commit of the
  dpkg-source -x output has no parents.

* As currently, there will be a final no-change-to-the-tree
  pseudomerge commit which stitches the package into the relevant dgit
  suite branch.  (By `pseudomerge' we mean something that looks as if
  it was made with git merge -s ours.)

* As currently, dgit will take steps so that none of the git trees
  discussed above contain a .pc directory.
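
The tarball-selection and parent-ordering rules above can be sketched
as follows.  This is an illustration in Python, not dgit's own code
(dgit is written in Perl and uses Dpkg::Compression).  The filename
patterns, the function name, and the choice to keep additional orig
tarballs in .dsc order are assumptions made for the example.

```python
import re

# Assumed pattern for "looks like a (compressed) tarball".
# Signature files such as foo.orig.tar.gz.asc do not match.
TARBALL_RE = re.compile(r"\.tar(\.(gz|bz2|xz|lzma))?$")


def tarballs_to_import(dsc_filenames):
    """Return the tarballs to import as parentless commits, in the
    order they would appear as parents of the dpkg-source -x commit.

    dsc_filenames is the list of file names found in the .dsc.
    """
    # With a single file entry the tarball is not imported separately.
    if len(dsc_filenames) <= 1:
        return []

    tarballs = [f for f in dsc_filenames if TARBALL_RE.search(f)]

    def rank(name):
        if re.search(r"\.orig\.tar(\.|$)", name):
            return 0  # main orig tarball: leftmost parent
        if re.search(r"\.debian\.tar(\.|$)", name):
            return 2  # debian tarball: rightmost parent
        return 1      # anything else (assumed: keep .dsc order)

    # sorted() is stable, so equal ranks keep their .dsc order.
    return sorted(tarballs, key=rank)
```

For a hypothetical package foo with an extra orig-docs component,
the main orig tarball would come first and the debian tarball last,
whatever order the .dsc lists them in.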


This has the following properties:

* Each input tarball is represented by a different commit; in usual
  cases these commits will be the same for every upload of the same
  upstream version.

* For `3.0 (quilt)' each patch's changes to the upstream files appear
  as a single git commit (as is the effect of the debian tarball);
  also, there is a commit object whose tree is just the debian/
  directory, which might well be the same as certain debian-only git
  workflow trees.

* For `1.0' non-native, the effect of the diff is represented as a
  commit.  So eg `git blame' will show synthetic commits corresponding
  to the correct parts of the input source package.

* It is possible to `git cherry-pick' etc. commits representing `3.0
  (quilt)' patches.  It is even possible to fish out the patch stack as
  a git branch and rebase it elsewhere etc., since the patch stack is
  represented as a contiguous series of commits which make only the
  relevant upstream changes.

* Every orig tarball in the source package is decompressed twice, but
  disk space for only one extra copy of its unpacked contents is
  needed.  (The converse would be possible in principle but would be
  very hard to arrange with the current interfaces provided by the
  various tools.)

* No back doors into the innards of dpkg-source (nor changes to
  dpkg-dev) are required.

* dgit does grow a dependency on git-buildpackage.

* Knowledge of the source format embedded in dgit is restricted to
  some relatively straightforward processing of filenames found in
  .dsc files.

* dgit now depends on dpkg-source -x --skip-patches followed by
  dpkg-source --before-build being the same as dpkg-source -x
  (for `3.0 (quilt)').
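
The tarball commits can stay the same across uploads because their
metadata comes from the earliest changelog entry with the matching
upstream version, which does not change when later revisions are
added.  A minimal sketch of that lookup in Python follows.  It is
not dgit's implementation: the function names are hypothetical, and
the changelog is assumed to be given as a list of version strings,
newest first, as they appear in debian/changelog.

```python
def upstream_version(version):
    """Return the upstream component of a Debian version string,
    i.e. the version without its epoch and Debian revision."""
    # The epoch, if any, ends at the first colon.
    if ":" in version:
        version = version.split(":", 1)[1]
    # The Debian revision, if any, starts at the last hyphen.
    if "-" in version:
        version = version.rsplit("-", 1)[0]
    return version


def earliest_with_upstream(changelog_versions, upstream):
    """Return the oldest changelog version whose upstream component
    equals upstream, or None if there is no such entry.

    changelog_versions is assumed to be ordered newest first.
    """
    matches = [v for v in changelog_versions
               if upstream_version(v) == upstream]
    return matches[-1] if matches else None
```

Adding a newer entry at the top of the list leaves the result
unchanged, which is why repeated uploads of one upstream version
would share the same tarball commits in the usual case.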