
[atomic-devel] 3 kinds of "docker build"



In a Dockerfile, one can do pretty much anything.  This makes it very
easy to get started and use.  However, I see 3 quite distinct things:

1) Assembly of self-built artifacts: In this case, you have an external
build system that generates artifacts (RPMs/debs/whatever).  This is
what the Fedora base image does; we're just taking existing RPMs and
aggregating.  A crucial aspect of this model is that the external build
system is taking care of things like tracking the source code, holding
logs of builds, etc.

This is where the distribution side is today.

2) Aggregation + Caching: This is where you're inheriting from a base
(probably from 1), then doing e.g. a "gem/pip/cargo install" in your
Dockerfile.  Because the result is cached in the image, you can still
do deployments if e.g. rubygems.org goes down.  You're relying on the
upstream rubygems.org (or whatever) to hold the source code.

3) Compilation: This is where you're actually *building* - you're
running gcc or whatever compiler inside the container, operating on
source code.  This case further subdivides into:
3a) Compilation-to-docker: Your binaries live in the generated Docker
image
3b) Compilation-via-docker: You're using Docker instead of mock/pbuilder
to generate other artifacts, like RPMs/debs.   See
https://github.com/alanfranz/docker-rpm-builder

And finally, in all of these, there's commonly a "dockerization" phase
where we include a custom init script, tweak the daemon's logging
defaults, set it to run in foreground mode, etc.

But I think ideally the "assembly" is a cleanly separate phase from the
"dockerization".  In practice this is normally how people write their
Dockerfiles, but it'd be good to make it rigorous.  If it were, then it'd
be a lot easier to move towards a potential "dynamic linking" model for
images in the future (yes, this is a whole can of worms).

What I'm actually arguing here is that we want dedicated tools for 1)
and 3).  In particular, it seems to me that in order to do efficient
rebuilds, a build system really needs to know statically which Docker
images need to be rebuilt.  Yes, you can do this after the fact by
running them and extracting rpm -qa, but that's ugly, and more
importantly you can only get the dependencies *after* building.
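
To make the "static" point concrete, here is a rough sketch in Python of
what a build system could do if each image declared its base and its
packages up front.  The manifest format, the image names, and the
images_to_rebuild() helper are all made up for illustration; no existing
tool is implied.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set


class ImageSpec(NamedTuple):
    """Hypothetical static description of one image."""
    base: Optional[str]   # name of the parent image, or None for a base image
    packages: Set[str]    # packages assembled directly into this image


# Hypothetical manifest, known *before* any image is built.
MANIFEST: Dict[str, ImageSpec] = {
    "fedora-base": ImageSpec(None, {"glibc", "bash"}),
    "nginx": ImageSpec("fedora-base", {"nginx", "openssl"}),
    "redis": ImageSpec("fedora-base", {"redis"}),
}


def images_to_rebuild(manifest: Dict[str, ImageSpec],
                      updated_packages: Iterable[str]) -> List[str]:
    """Return the images affected by a set of updated packages.

    An image is affected if it contains an updated package directly,
    or if it inherits (transitively) from an affected image.
    """
    updated = set(updated_packages)
    dirty = {name for name, spec in manifest.items()
             if spec.packages & updated}
    # Propagate to children until no new image is marked.
    changed = True
    while changed:
        changed = False
        for name, spec in manifest.items():
            if name not in dirty and spec.base in dirty:
                dirty.add(name)
                changed = True
    return sorted(dirty)


if __name__ == "__main__":
    print(images_to_rebuild(MANIFEST, ["openssl"]))
    print(images_to_rebuild(MANIFEST, ["glibc"]))
```

The point is only that the answer comes from metadata, without running
any container or querying rpm -qa afterwards.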

Finally, if we had a dedicated tool for 1), we could make it a lot
easier to take the same template, and say "I want to assemble this
Docker nginx image from a CentOS base, or a Fedora rawhide base".

It seems a number of people out there are templating Dockerfiles for
similar reasons, and I think that's probably the route we should also
pursue.
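
As a minimal sketch of the templating idea, using only Python's standard
string.Template: one nginx template rendered against two bases.  The
image tags, install commands, and the render_dockerfile() helper are
assumptions for illustration (for instance, nginx may need an extra
repository on some bases), not a proposal for a specific tool.

```python
from string import Template

# One template shared by every base; only the placeholders vary.
DOCKERFILE_TEMPLATE = Template("""\
FROM $base
RUN $install_cmd nginx
CMD ["nginx", "-g", "daemon off;"]
""")

# Hypothetical per-base settings; tags and commands are illustrative.
BASES = {
    "centos": {"base": "centos:7", "install_cmd": "yum -y install"},
    "fedora-rawhide": {"base": "fedora:rawhide",
                       "install_cmd": "dnf -y install"},
}


def render_dockerfile(target: str) -> str:
    """Render the shared template for one of the known bases."""
    return DOCKERFILE_TEMPLATE.substitute(BASES[target])


if __name__ == "__main__":
    for target in BASES:
        print("# --- %s ---" % target)
        print(render_dockerfile(target))
```

Note the last line of the template is the "dockerization" part
(foreground mode), while the RUN line is the "assembly" part; a real
tool would presumably keep those two separate.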

Opinions?  Anyone aware of tools that exist in this area?

