I have installed Ubuntu 22.04 on a union filesystem on my NAS and boot it on diskless clients using NFS. The lowest layer in the filesystem contains a minimal, bootable version of Ubuntu. I can boot that layer on its own and update it using `apt`.
I would like the upper layer in the union filesystem to contain additional packages not included in the lower layer. By booting the union filesystem consisting of lower and upper layers with the upper layer read-write, I can install those additional packages using `apt`.
My problems begin when I have to update packages on the lower filesystem. I use `apt` for this and it seems to work.
Having updated the lower filesystem, I return to the union filesystem and try to update the upper filesystem. However, `apt` appears to want to update the packages from the lower filesystem again too, in addition to the ones in the upper filesystem. This would result in duplication on the upper and lower filesystems, which I don't want.
I have tried cleaning out `/var/lib/apt` and `/var/cache/apt` before doing `apt update` on the upper filesystem, but this doesn't solve the problem. This is partly because the packages on the upper filesystem reference versions on the lower filesystem that have since been updated and therefore appear not to exist anymore. So `apt` kindly tries to re-install them, but on the upper filesystem.
I thought about just doing an `apt upgrade` on the upper filesystem and then going through the duplicated packages and forcibly purging them somehow, but I haven't found a strategy yet.
I suspect that I am way outside the scope of what `apt` was designed for. Are there any `apt` gurus out there who can provide me with some guidance? The hacker in me is pondering attacking `/var/lib/dpkg/*` with a Perl script...
Additional Information
The directory `/var/lib/dpkg/info/` occurs in both the lower and the upper layers. It appears that the version in each layer only contains information for the packages installed in that same layer. Since the information for each package is held in separate files specific to that package, merging the two directories in the union filesystem results in a single view of all installed packages, which is what I had hoped for.
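Because each layer's `info/` directory only describes that layer's own packages, the ownership of every package can be read directly from the branch directories on the NAS, without going through the merged view. The following is a minimal sketch of that idea, assuming each installed package leaves a `<package>.list` file in `info/`; the function names and the branch paths in the usage note are hypothetical.

```python
from pathlib import Path


def packages_in_layer(info_dir):
    """Return the names of packages that have a .list file in info_dir.

    Multi-arch entries such as "libc6:arm64.list" are kept as "libc6:arm64".
    """
    return {entry.stem for entry in Path(info_dir).glob("*.list")}


def compare_layers(lower_info, upper_info):
    """Classify packages by the layer(s) whose info directory lists them."""
    lower = packages_in_layer(lower_info)
    upper = packages_in_layer(upper_info)
    return {
        "lower_only": lower - upper,
        "upper_only": upper - lower,
        # Listed in both branches: candidates for the unwanted duplication.
        "duplicated": lower & upper,
    }
```

It would be called with the `info/` directory of each branch (not of the merged mount), for example `compare_layers("/branches/lower/var/lib/dpkg/info", "/branches/upper/var/lib/dpkg/info")`. Anything reported under `duplicated` is a package that has been installed again on the upper layer.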
However, `/var/lib/dpkg/status` is a single text file that contains information about all packages, and the union filesystem shows only the upper layer's copy of it. Any later changes to the lower layer's copy are not copied up and thus are not visible. This causes problems in the upper layer, which then contains references to package versions that have been updated and have effectively disappeared.
Do I need to rebuild `/var/lib/dpkg/status`?
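One way to picture such a rebuild is a merge of the two `status` files in which the lower layer's stanza wins for every package the lower layer knows about, and the upper layer contributes only the packages that exist nowhere else. The sketch below (in Python rather than Perl) illustrates that rule. It assumes stanzas are separated by blank lines and identified by their `Package:` and `Architecture:` fields; the function names are hypothetical, and it does not handle packages that were removed from the lower layer after the upper copy was made. It is untested against a real system and should only ever be run on copies of the files.

```python
import re


def parse_status(text):
    """Map (package, architecture) -> raw stanza text."""
    stanzas = {}
    for block in re.split(r"\n{2,}", text.strip()):
        name, arch = None, ""
        for line in block.splitlines():
            if line.startswith("Package:"):
                name = line.split(":", 1)[1].strip()
            elif line.startswith("Architecture:"):
                arch = line.split(":", 1)[1].strip()
        if name is not None:
            stanzas[(name, arch)] = block
    return stanzas


def merge_status(lower_text, upper_text):
    """Merge two status files; the lower layer's stanza wins on conflict."""
    merged = parse_status(upper_text)
    # Overwrite stale upper-layer entries with the lower layer's current ones.
    merged.update(parse_status(lower_text))
    return "\n\n".join(merged[key] for key in sorted(merged)) + "\n"
```

The result would be written to a scratch file and compared against the existing upper-layer `status` before anything is replaced.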
A Step Back, but also More Detail
[This is provided in anticipation of my attempt at answering my own question.]
I'm going to try to answer this myself. Please feel free to comment and point out any errors or false assumptions that I make.
To recap: I am trying to create and maintain Ubuntu images that can be booted via NFS on (many different) diskless devices. The Ubuntu images are stored on and exported from a central NAS as union filesystems. My NAS only supports `aufs`, so that is what I use here.
Each image is constructed with `aufs` as follows on the NAS:
    mount -t aufs \
        -o br:<instance-data>=rw:<application>=ro:<local-Ubuntu-config>=ro:<Canonical-Ubuntu-image>=ro \
        none <merged-image>    # e.g. /mnt/xxx
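Since every client image uses the same four-layer pattern, the branch option can be generated rather than typed by hand. The following is a small sketch of that, assuming the usual aufs `br:<path>=<mode>` syntax with branches listed topmost first; the function names are hypothetical, and any paths passed in stand for the placeholders above.

```python
def aufs_branch_option(layers):
    """Build the aufs branch option from (path, mode) pairs, topmost first."""
    if not layers:
        raise ValueError("at least one branch is required")
    parts = []
    for path, mode in layers:
        if mode not in ("rw", "ro"):
            raise ValueError(f"unsupported branch mode: {mode!r}")
        if ":" in path:
            # A colon separates branches, so it cannot appear inside a path.
            raise ValueError(f"branch path contains ':': {path!r}")
        parts.append(f"{path}={mode}")
    return "br:" + ":".join(parts)


def mount_command(layers, mount_point):
    """Return the mount command as an argument list (it is not executed)."""
    return ["mount", "-t", "aufs", "-o", aufs_branch_option(layers),
            "none", mount_point]
```

The returned list can be inspected or printed first, and passed to `subprocess.run` only once it looks right.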
Each `aufs` image is exported by NFS (yes, that is possible).
There is one of these images for each diskless client. Each image has 4 layers, defined as branches (`br:...`). Since ogres have layers, I call these ogres.
Each layer has access to the layers below it, so it can use any packages and configuration installed there. Each layer also has its own additional packages and configuration.
All ogres share the same `<Canonical-Ubuntu-image>` and the same `<local-Ubuntu-config>`. These two layers could be collapsed into a single layer, but I want to keep a clear distinction between Canonical's stuff and my local configuration.
There are a small number of `<application>`s (e.g. network management, network monitoring, smart home aka OpenHAB, development, ...). They all share the same `<Canonical-Ubuntu-image>` and `<local-Ubuntu-config>`, as mentioned above, but have their own application layer.
Some ogres are duplicated (e.g. network management, OpenHAB) so they share the same application but need individual configuration for their own instance.
Excuse Me, but Why?
At the moment I have about ten Raspberry Pis which perform various functions in my network, including making my house work. I spend all my time just making sure that each Pi is updated and working correctly. I am trying to reduce the required effort by sharing as much as possible between the Pis.