Hi, I'd like to propose a fairly fundamental rework of Atomic Host. TL;DR:
- Move towards "system containers" (or layered packages) for flannel/etcd
- Move towards containers (system, or Docker) for kubernetes-master
- Move towards layered packages for kubernetes-node and storage (ceph/gluster)
In-progress PR:
https://github.com/CentOS/sig-atomic-buildscripts/pull/144
There are advantages and disadvantages to this; I think we'll have some
short-term transition pain, but past that the advantages
clearly outweigh the disadvantages.
== Advantage: Version flexibility ==
etcd should really have its own identity in a clustered environment, and
not necessarily roll forwards/backwards with the underlying host version. I've
had users report hitting e.g. a Kubernetes or Docker issue and rolling back their
host, which rolled back etcd as well; since the store isn't forwards-compatible,
etcd then breaks. There's also a transition to etcd2 coming, which again
one should really manage separately from host upgrades.
Another major example is that while we chose to include Kubernetes in
the host, it's a fast-moving project, and many people want to use a newer
version, or a different distribution like OpenShift. The version flexibility
also applies to other components like Ceph/Gluster and flannel.
== Advantage: Size and fit to purpose ==
We included things like the Ceph and GlusterFS drivers in the base
host, but that was before we had layered packages, and there's
also continuing progress on containerized drivers. Many users running on
an existing IaaS environment like OpenStack or AWS
want to reuse Cinder/AWS storage, rather than maintaining their own.
Similarly, while flannel is a good general purpose tool, there are
lots of alternatives, and some users already have existing SDN solutions.
== Disadvantage: More assembly required ==
I think this is a superficial disadvantage - in practice, since we didn't
pick a single official installation/upgrade system (like OpenShift has
openshift-ansible), if you want to run a Kubernetes cluster, you need
to do a lot of assembly anyway. I suspect adding a bit more to that
isn't going to be too bad for most users.
Down the line I'd like to revisit the installation/upgrade story - there's
work happening upstream in
https://github.com/kubernetes/contrib/tree/master/ansible
and I think there's also interest and some work in
having parts of openshift-ansible be available for baseline Kubernetes
and accessible on Galaxy etc.
== Disadvantage: Dependency on new tooling ==
Both `rpm-ostree pkg-add` and `atomic install --system` are pretty new.
Both could use better documentation and more real-world testing, and
in particular neither has gained management tool awareness yet (for example, they need
better Ansible support).
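For concreteness, here is a rough sketch of how the split in the TL;DR at the top of this mail might map onto those two commands. It only builds and prints the command lines; it does not run them. The assumptions are mine: that the layered package name matches the component name, and the image reference is a made-up placeholder rather than a real published image. Storage packages are left out because I'm not naming specific packages here.

```python
# Sketch only: maps components named in this proposal to the delivery
# mechanism suggested for them, and builds (does not run) the command.
# Image references passed in are hypothetical placeholders.

LAYERED_PACKAGE = "layered-package"
SYSTEM_CONTAINER = "system-container"

# Component -> mechanism, following the TL;DR at the top of this mail.
PLAN = {
    "flannel": SYSTEM_CONTAINER,
    "etcd": SYSTEM_CONTAINER,
    "kubernetes-node": LAYERED_PACKAGE,
}


def install_command(component, image=None):
    """Return the argv that would install `component` on the host."""
    mechanism = PLAN[component]
    if mechanism == LAYERED_PACKAGE:
        # Assumption: the package name matches the component name.
        return ["rpm-ostree", "pkg-add", component]
    if image is None:
        raise ValueError("a system container needs an image reference")
    return ["atomic", "install", "--system", image]


if __name__ == "__main__":
    print(" ".join(install_command("kubernetes-node")))
    # "example.org/hypothetical/etcd" is a placeholder, not a real image.
    print(" ".join(install_command("etcd", image="example.org/hypothetical/etcd")))
```

Something along these lines is roughly what an Ansible role would need to encode once the tools gain proper management support.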
== Summary ==
If people agree, I'd like to merge the PR pretty soon and do a new CentOS AH Alpha,
and we can collaborate on updating docs/tools around this. For Fedora...it's
probably simplest to leave 24 alone and just do 25 for now.
What I'd like to focus on is having AH be a good "building block"
rather than positioning it as a complete solution. We'd ensure that the base
Docker/kernel/SELinux/systemd block works together and that system management tools
work, and look at working more in the upstream Kubernetes (and OpenShift)
communities, particularly around Ansible.