author | Ian Jackson <ijackson@chiark.greenend.org.uk> | 2024-05-21 18:57:10 +0100 |
---|---|---|
committer | Ian Jackson <ijackson@chiark.greenend.org.uk> | 2024-05-21 18:57:10 +0100 |
commit | 3704eb1397d27dbd25f19d4c7345ba6e2edf5aa1 (patch) | |
tree | 5f8322b4789518cfcbb0d2ee9b9873b8ccaa6731 | |
parent | 54292ce163ee23e35c941ad49e5bd28ae74cf711 (diff) |
Copy tag2upload DESGIN.txt from wip branch
This is tag2upload/DESIGN.txt from wip.tag2upl-draft
commitid 2c10eb1f1b3d11e6c428d15ea2fd30aca7a3c90d.
It's been reviewed there.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
-rw-r--r-- | TAG2UPLOAD-DESIGN.txt | 273 |
1 files changed, 273 insertions, 0 deletions
diff --git a/TAG2UPLOAD-DESIGN.txt b/TAG2UPLOAD-DESIGN.txt
new file mode 100644
index 0000000..6dbfaec
--- /dev/null
+++ b/TAG2UPLOAD-DESIGN.txt
@@ -0,0 +1,273 @@
+TAG-TO-UPLOAD - DEBIAN - SERVICE DESIGN / DEPLOYMENT PLAN
+=========================================================
+
+Overall structure and dataflow
+------------------------------
+
+ * Uploader (DD or DM) makes signed git tag (containing metadata
+   forming instructions to tag2upload service)
+
+ * Uploader pushes said tag to salsa. [1]
+
+ * salsa sends webhook to tag2upload service.
+
+ * tag2upload service
+    : provides an HTTPS service accessible to salsa
+    : fishes url and tag name out of webhook json
+    : checks to see if the tag is at all relevant
+    : retrieves tag data (git shallow clone)
+    ! verifies signature on the tag
+    ! parses the tag metadata
+    ! checks that salsa repo url is basically sane
+    ! checks to see if signed by DD, or DM for appropriate package
+    - obtains relevant git history
+    - obtains, if applicable, orig tarball from archive
+    - makes source package
+    # signs source package and "canonical view" git tag
+    - pushes history and both tags to dgit-repos git server
+    - uploads source package to archive
+    ! reports activities by email
+    : shows status of package building to enquirers via www
+
+ * archive publishes package as normal
+
+[1] In principle other git servers would be possible but it would have
+to be restricted to ones where we can either avoid, or stop, them
+being used as a channel for a DoS attack against the tag2upload
+service.
+
+Privsep
+-------
+
+The tag2upload service will have to have a signing key that can upload
+source packages to the archive.
+
+We do not want that signing key to be abused. In particular, even
+though it will be in a hardware token we want to avoid giving
+unrestricted access to use that key, to code which itself has a large
+attack surface. In particular, source package construction is very
+complex.
+
+So there will be a privilege separation arrangement, as described
+above. Different tasks run in different security contexts:
+
+ : runs on the Manager, which is web-accessible and
+   not trusted very much
+
+ ! is fully trusted and has access to the signing key
+
+ - runs in the discardable VM or container, controlled by `!'
+
+ # is achieved by the `dgit rpush' protocol, where the trusted
+   (invoking, signing) part offers a restricted signing oracle to
+   the less-trusted (building) part.
+
+   The signing oracle will check that the files to be signed are
+   roughly in the right form and that they name the right source
+   package. It will construct the "canonical view" git tag itself
+   from metadata provided by the building part.
+
+   The signing oracle has the information from the now-verified git
+   tag (since it is operating in the context of a particular request)
+   and will only sign for the same source package and version.
+
+Service architecture
+--------------------
+
+I propose the following architecture for the tag2upload service.
+
+There are three systems involved:
+
+I. Manager (`:`)
+
+Hardly trusted.
+
+ * Database (sqlite) containing queue, and historical data.
+
+ * Conventional webserver offering TLS and using Let's Encrypt.
+
+ * Manager daemon.
+
+Manager daemon has the following tasks:
+
+ * Web-service-style "application server" written in some scripting
+   language listens on a local TCP port, handles HTTP connections
+   proxied by the webserver.
+
+ * Receives webhook requests.
+   Checks that the calling IP address is salsa's.
+   Parses the JSON. Checks tag name to see if it seems of interest.
+   If so, fetches the actual tag data (git shallow clone)
+   and sees if it looks plausible, and if so, stores it in the db.
+   If an Oracle client is waiting, feeds it the tag and url.
+
+ * Server for very simple protocol, used by Oracle to obtain work to do.
+   Accessed via ssh with restricted key (`ssh ... nc`).
+
+ * Manager daemon web service also offers basic query API
+   and web pages showing recent activity, for human tracking.
+   (To all comers.)
+
+II. Oracle (`!`)
+
+Trusted to use the signing key. (Key itself is in a hardware token.)
+Not exposed to source package contents. Not exposed to the web.
+Not exposed via the git protocol, not even as a client.
+
+ * Uses ssh to connect to manager's simple Oracle protocol port.
+   Manager sends Oracle the signed tag, and repository URL.
+
+ * Sends an email saying what it is about to process.
+   (We do this in the Oracle so that less-trusted components
+   don't get to hide their misbehaviours by not sending reports.)
+
+ * Checks that the tag is signed by someone in the keyring
+   (and that it uses a good enough hash function).
+   (Oracle has a copy of the keyrings and dm allow list.)
+
+ * Parses the tag to find the metadata including
+   source package name, target suite, and version.
+   Checks that the signer is authorised for this package.
+
+ * Checks that the source repository URL is basically sane.
+   (But does not access it - the Builder does that, below.)
+
+ * Arranges that the Builder is reset (see below).
+
+ * ssh's to the Builder to have the Builder fetch the git data.
+
+ * Runs dgit rpush, specifying the package, version and
+   target suite on the command line. Target host is the Builder.
+   (We use the existing dgit rpush signing oracle protocol.)
+
+ * Sends an email saying what it did.
+
+ * Reports the outcome (success/failure) and a summary line
+   to the Manager via the still-open manager protocol connection.
+
+III. Builder (`-`)
+
+Does the actual source package conversion.
+Largely trusts the Oracle.
+Trusted as to source package contents, but not otherwise.
+
+Oracle can reset this. So it is a VM or a chroot.
+We propose to use the same schroot configuration as for a buildd,
+subject to consultation with DSA as to the best approach.
+
+ * On instructions from the Oracle (via incoming ssh):
+
+   - Fetches the git objects for the maintainer's tag from Salsa.
+   - Fetches the git objects for the existing canonical view
+     from the dgit-repos git server.
+   - Fetches necessary origs from the archive.
+   - Converts the git history to the canonical form (treesame to
+     the source package) by adding necessary synthetic commits.
+   - Builds the source package.
+   - Uses the rpush protocol to obtain signed git tag
+     (on the canonical git form)
+     and signed .dsc and .changes.
+   - Pushes the git objects to the dgit-repos server.
+   - Uploads the .dsc and .changes to the archive.
+
+ * Packet filter limiting outgoing connections to salsa,
+   dgit-repos, and the Debian archive.
+   Incoming connections come only from the Oracle.
+
+Reproducibility, metadata and auditing
+--------------------------------------
+
+The trusted part of the tag2upload service will keep some logs,
+particularly of each tag it is told about and what the disposition of
+that was, and when it was retried.
+
+Also, it will send the following information to a public mailing list:
+ - The tag object data for any tag it decides to process,
+   before it passes it to the VM.
+ - A report (more or less, a shell transcript)
+   of each processing attempt
+ - The list will also be the public email address of the
+   tag2upload robot's signing key
+
+The generated .dscs will contain additional fields
+
+  Git-Tag-Tagger: Firstname Surname <email@address>
+
+    "tagger" line from the git tag converted to deb822 format
+
+  Git-Tag-Info: tag=<tagobjid> fp=<fingerprint>
+
+    <tagobjid> is the git object ID of the tag object
+    (if someone wants to obtain referenced git objects,
+    they can be found on the dgit-repos git server)
+
+    <fingerprint> is the "fingerprint_in_hex" from the VALIDSIG line
+    in the gpgv output.
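As a rough illustration of how a consumer of a generated .dsc might read the Git-Tag-Info field described above, here is a minimal Python sketch. The helper name `parse_git_tag_info` is hypothetical and not part of dgit. The sketch assumes the value is a whitespace-separated list of key=value items containing exactly `tag` and `fp`, both hexadecimal, which is inferred from the template `tag=<tagobjid> fp=<fingerprint>`; the real field syntax may differ.

```python
import re

# Assumption: both the tag object ID and the fingerprint are plain hex strings.
_HEX = re.compile(r"[0-9A-Fa-f]+")


def parse_git_tag_info(value: str) -> dict:
    """Split a Git-Tag-Info value into its 'tag' and 'fp' components.

    Hypothetical helper for illustration only; raises ValueError on
    anything that does not match the assumed format.
    """
    fields = {}
    for item in value.split():
        key, sep, val = item.partition("=")
        if not sep or key in fields:
            raise ValueError(f"malformed Git-Tag-Info item: {item!r}")
        fields[key] = val
    if set(fields) != {"tag", "fp"}:
        raise ValueError(f"unexpected Git-Tag-Info keys: {sorted(fields)}")
    for key in ("tag", "fp"):
        if not _HEX.fullmatch(fields[key]):
            raise ValueError(f"{key} is not hexadecimal: {fields[key]!r}")
    return fields
```

The object ID and fingerprint passed in would come from the .dsc itself; rejecting unknown or duplicate keys is a deliberately strict choice for the sketch, not something the design text requires.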
+
+This additional metadata is needed to be able to tell by looking at
+the .dsc who the original uploader was (which might be different to
+the maintainer, in the sponsorship case). (Programs which use the
+uploader signature identity will send mails to the mailing list
+mentioned above, until they have been updated. This is not desirable
+but not a blocker for deployment.)
+
+The generated .changes will contain copies of the two .dsc fields
+above.
+
+The upload will contain a .source_buildinfo. This will list the
+versions of the software running in the Builder, which is primarily what
+controls the generated .dsc.
+
+The versions of dgit-infrastructure and git running in the trusted
+part are also relevant because the trusted part assembles outgoing
+tagger lines etc. and interprets the incoming git tag; however, in our
+deployment we intend to maintain them in sync, and anyway our ad-hoc
+reproduction tooling will not be able to arrange for them to be
+different. So the outside-VM version information will not be
+included.
+
+Eventually there could be a mode for sbuild (related to
+binary build reproduction), or a suitable script, which can verify a
+reproduction attempt. For now the src:dgit test suite will check that
+the upload is reproducible if run again in the same environment.
+
+Emails
+------
+
+Emails are sent to:
+
+ 1. The username associated with the signing key
+ 2. The tagger (email address from the git tag object)
+ 3. A public mailing list selected (or created) for the purpose
+
+1 and 2 will often be the same.
+This provides feedback to the person making the signature.
+The person preparing (rather than, maybe, sponsoring) the upload
+(Changed-By in .changes) will be notified by the archive software.
+
+The email report will contain at least:
+
+ * The target distro, package, suite and version
+ * The URL from which the git objects were downloaded
+ * Whether the operation succeeded, and error messages if it didn't.
+
+Email is sent by the Oracle feeding a file to
+`ssh smarthost sendmail -t` not by implementing SMTP,
+to reduce the attack surface.
+
+DoS
+---
+
+This service is not very resistant to DoS attacks. In particular,
+sending it bad URLs might stall it (since it has to retry failing
+URLs).
+
+So we (i) do not expose it to anyone but salsa and (ii) limit it to
+trying to fetch salsa urls.
+
+Making very many tags on salsa would stress this tag2upload service a
+bit but not fatally, and it would be a DoS against salsa too.
+
+After signature verification, we are much more vulnerable to DoS. An
+approved signer can get the service to do a lot of work. That is the
+purpose of the service, indeed.
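The delivery mechanism from the Emails section (feeding a message to `ssh smarthost sendmail -t` rather than speaking SMTP) could look roughly like the Python sketch below. This is an illustration under assumptions, not the service's code: the design does not say which language the Oracle uses, and the function names `build_report` and `send_report`, the subject line, and all addresses are hypothetical placeholders.

```python
import subprocess
from email.message import EmailMessage

# Command taken from the design text; "smarthost" is whatever host name
# the deployment's ssh configuration resolves.
SENDMAIL_CMD = ["ssh", "smarthost", "sendmail", "-t"]


def build_report(sender, recipients, subject, body):
    """Assemble a plain-text report message (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_report(msg, cmd=SENDMAIL_CMD):
    """Pipe the message to the sendmail command on stdin.

    With -t, sendmail reads the recipients from the message headers,
    so no addresses appear on the command line.
    """
    subprocess.run(cmd, input=msg.as_bytes(), check=True)
```

A caller would build one message addressed to the key's user, the tagger and the public list, then call `send_report(msg)`. Passing the command as an argument list avoids a local shell, which fits the stated aim of keeping the Oracle's attack surface small.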