Docker builds vs package builds
- From: Colin Walters <walters verbum org>
- To: atomic-devel projectatomic io
- Subject: Docker builds vs package builds
- Date: Fri, 23 May 2014 12:33:38 +0000
I posted something related to this on atomic@, but it's more of a
development thing, so moving here.
After playing with Dockerfiles for some of my code, one thing that's
becoming very clear to me is that using Dockerfiles certainly makes it
easy to get started, but the simplicity of the model/format seriously
hampers efficient integration with RPM (or many other build systems),
and has some subtle traps.
For example, a simple thing you see in a lot of Dockerfiles is:
RUN yum -y update
Except...this gets cached by Docker, which is **not** what you want in
general. This is tracked as https://github.com/dotcloud/docker/issues/1996
I ended up using the workaround recommended in that bug, which is to
add a comment, so the line becomes:
RUN yum -y update #nocache20140523.0
And whenever I want to avoid the cache I change the comment.
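The comment trick works because Docker's build cache matches a RUN step
on its instruction text, so a changed comment means a cache miss. As a
minimal sketch of automating the edit (the function name `bump_nocache`
and the idea of scripting this are my own illustration, not part of
Docker or of the workaround in the bug):

```python
import re

# Matches the hand-edited marker convention used above,
# e.g. "#nocache20140523.0". The convention itself is just a shell comment.
NOCACHE_RE = re.compile(r"#nocache\S*")


def bump_nocache(dockerfile_text: str, token: str) -> str:
    """Return the Dockerfile text with every '#nocache...' marker
    replaced by '#nocache<token>'. Text without a marker is unchanged."""
    # A function replacement avoids re.sub interpreting backslashes in token.
    return NOCACHE_RE.sub(lambda _match: "#nocache" + token, dockerfile_text)
```

Calling `bump_nocache(text, "20140524.0")` on a Dockerfile containing the
line above would rewrite the marker and so force that step to rerun on
the next build.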
The right thing depends on circumstance, but it would be a lot more
efficient to check the repository timestamps, and reuse the
cached layer if they haven't changed. A further optimization would be
to reuse the cached layer if none of the packages have changed (this
matters a lot for Fedora, where the repo changes a lot but not always
for packages you actually use).
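One way to picture both checks is to derive the cache-busting marker
from the inputs rather than editing it by hand. The sketch below is
only an illustration under assumptions: the caller is assumed to have
already collected the repository timestamps (for example from each
repository's repomd.xml) and the available package versions, and all
function names here are hypothetical.

```python
import hashlib
from typing import Iterable, Mapping


def _digest(items: Mapping[str, object]) -> str:
    """Short hash over sorted key=value pairs, so ordering does not matter."""
    h = hashlib.sha256()
    for key in sorted(items):
        h.update(f"{key}={items[key]}\n".encode("utf-8"))
    return h.hexdigest()[:12]


def repo_cache_token(repo_timestamps: Mapping[str, int]) -> str:
    """Token that changes whenever any repository timestamp changes."""
    return _digest(repo_timestamps)


def package_cache_token(available: Mapping[str, str],
                        used: Iterable[str]) -> str:
    """Token that changes only when a package listed in `used` changes
    version; churn in unrelated packages leaves it untouched."""
    return _digest({name: available[name] for name in used if name in available})


def update_line(token: str) -> str:
    """Render the RUN line with the marker convention from this post."""
    return f"RUN yum -y update #nocache{token}"
```

With the first token, the layer is reused until a repository is
regenerated; with the second, it is reused until a package you actually
depend on gets a new version, which is the Fedora case described above.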
Implementing this sort of intelligence requires nontrivial code; it's
not clear to me whether it should live inside or outside of the
container. It's hard for it to live inside the container as the
container itself (AFAIK) can't drive the Docker layer caching. We'd
have to do something like include cache metadata (such as the
repository timestamps) within each layer, and then enhance yum to leave
the system untouched if the timestamps haven't changed.
This gets really quite ugly as we'd need to do pervasive O_NOATIME
inside the container to avoid any changes from simply running code,
clean up every temporary file, etc.
The other model is to try to entirely drive the process from
outside of Docker. But if you do that, it's again hard to reuse the
Docker image cache; you'd have to keep cached state on the host system.
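To make the "cached state on the host" cost concrete, here is a rough
sketch of what an outside driver would have to keep. Everything in it
is an assumption for illustration: the state file, its JSON format, and
the function names are hypothetical, and the snapshot is whatever the
driver chooses to compare (repository timestamps, package versions).

```python
import json
from pathlib import Path
from typing import Mapping


def needs_rebuild(state_file: Path, current: Mapping[str, int]) -> bool:
    """True if there is no usable saved snapshot or it differs from `current`."""
    try:
        previous = json.loads(state_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # No state yet, or unreadable state: rebuild to be safe.
        return True
    return previous != dict(current)


def record_build(state_file: Path, current: Mapping[str, int]) -> None:
    """Save the snapshot that the image just built corresponds to."""
    state_file.write_text(json.dumps(dict(current), sort_keys=True))
```

The driver would call `needs_rebuild` before invoking the build and
`record_build` after a successful one. Note that this state lives
beside the Docker image cache rather than in it, which is exactly the
duplication described above.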
Is anyone aware of any advanced work on package (or other advanced
buildsystem/deployment) integration with Dockerfiles? This topic could
probably use a wiki page or something; it seems like a fairly open area.