Docker and IPv6 with dynamic prefix

Getting IPv6 on your private connection should be quite easy these days. Moving services that you previously hosted via dynDNS over to IPv6, however, takes a bit more work, mainly due to the fascinating concept of dynamic prefixes.

Most private dial up connections with dual-stack use dynamic prefixes. This gives you three options:

  • Change provider or ask yours nicely for a fixed prefix (nice, but not possible for everyone)
  • Use NAT and IPv6 ULAs with tools like these: https://github.com/robbertkl/docker-ipv6nat (working, but pretty much the old IPv4 world)
  • Work with the dynamic prefix, which is what we want to do here: improvise, adapt, overcome.

This article is about working with a dynamic prefix assigned via SLAAC, without using something like DHCPv6. Each prefix change then causes a few problems:

  • You need IPv6 dynDNS to update your AAAA records
  • Docker will need updates for the network configurations
  • Neighbor solicitation will need to be updated
  • ip6tables becomes more complicated

So what is the solution

BEWARE: This is not about configuring your router. I assume that it is not blocking IPv6 traffic IN- or OUTBOUND.

First you need to get your docker IPv6 setup working with the current prefix unchanged. I used the official Docker documentation here https://docs.docker.com/v17.09/engine/userguide/networking/default_network/ipv6/#how-ipv6-works-on-docker

This is very straightforward. But some comments to this:

  • Docker itself advises against using the default network and advises to setup and use your own network. This is what I did, however I would advise to also setup the default network properly, as containers started without any network will come up in that
  • I got a /56 prefix from my provider, so best for me was a /80 subnet for docker. This would allow each docker container to code their MAC into the IPv6
  • If you are not sure how to cut your networks, use an IPv6 calculator from the internet, it helps to visualize. I also decided to number my networks manually to make them more readable for me (hence using the MAC only for random containers)
  • While setting this up, make sure your ip6tables is in ACCEPT mode on your host machine, especially in the FORWARD chain, as this is the relevant chain for DOCKER. The additional docker chains you might know from IPv4 are not existent in IPv6

I will use the following format to reference IPv6 addresses (letters for the prefix, numbers for my subnet numbering): aaaa:bbbb:cccc:dddd:1::/80. Yes, my network is just 1, so containers then get aaaa:bbbb:cccc:dddd:1::1, aaaa:bbbb:cccc:dddd:1::2 etc. Keep it simple here.
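The numbering scheme can be captured in a small helper so that the prefix appears in only one place. This is a minimal sketch: the function name docker_subnet is made up for illustration, and 2001:db8:1234:5678 is a documentation prefix standing in for aaaa:bbbb:cccc:dddd.

```shell
#!/bin/bash
# Build the /80 subnet for a docker network from the current prefix.
# $1 = first four hextets of the prefix (hypothetical example: 2001:db8:1234:5678)
# $2 = manually chosen network number (the fifth hextet, e.g. 1)
docker_subnet() {
    printf '%s:%s::/80\n' "$1" "$2"
}

# Example: subnet for the custom network numbered 1
docker_subnet 2001:db8:1234:5678 1
```

The result is the value you would pass to docker network create via --subnet, or put into fixed-cidr-v6 in the Docker daemon configuration for the default network.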

Next, add the ndppd daemon to automate the neighbor solicitation on the host machine. It is available as a package in Debian. You will need this, as otherwise more scripting is needed. I used the following config (/etc/ndppd.conf):

# route-ttl <integer> (NEW)
# This tells 'ndppd' how often to reload the route file /proc/net/ipv6_route.
# Default value is '30000' (30 seconds).

route-ttl 30000

# proxy <interface>
# This sets up a listener, that will listen for any Neighbor Solicitation
# messages, and respond to them according to a set of rules (see below).
# <interface> is required. You may have several 'proxy' sections.

proxy eth0 {

   # router <yes|no|true|false>
   # This option turns on or off the router flag for Neighbor Advertisement
   # messages. Default value is 'true'.

   router no

   # timeout <integer>
   # Controls how long to wait for a Neighbor Advertisment message before
   # invalidating the entry, in milliseconds. Default value is '500'.

   timeout 500

   # ttl <integer>
   # Controls how long a valid or invalid entry remains in the cache, in
   # milliseconds. Default value is '30000' (30 seconds).

   ttl 30000

   # rule <ip>[/<mask>]
   # This is a rule that the target address is to match against. If no netmask
   # is provided, /128 is assumed. You may have several rule sections, and the
   # addresses may or may not overlap.

   rule aaaa:bbbb:cccc:dddd:1::/80 {
      # Only one of 'static', 'auto' and 'interface' may be specified. Please
      # read 'ndppd.conf' manpage for details about the methods below.

      # 'auto' should work in most cases.

      # static (NEW)
      # 'ndppd' will immediately answer any Neighbor Solicitation Messages
      # (if they match the IP rule).

      # iface <interface>
      # 'ndppd' will forward the Neighbor Solicitation Message through the
      # specified interface - and only respond if a matching Neighbor
      # Advertisement Message is received.

      # auto (NEW)
      # Same as above, but instead of manually specifying the outgoing
      # interface, 'ndppd' will check for a matching route in /proc/net/ipv6_route.

      auto

      # Note that before version 0.2.2 of 'ndppd', if you didn't choose a
      # method, it defaulted to 'static'. For compatibility reasons we choose
      # to keep this behavior - for now (it may be removed in a future version).
   }
}

eth0 represents the host interface on which the requests will show up, hence this is the interface towards your router. If your docker container has the globally routable IPv6 aaaa:bbbb:cccc:dddd:1::1 and a request from the outside is delivered to your router (identified by the prefix), your router will ask every connected node who has this IPv6. However, the container is not connected to the router but to the docker host. Hence the docker host needs to answer the router and then forward the traffic to the docker container. This is why the NDP proxy needs to listen on eth0 on the host machine.

rule is the docker network that your docker host should answer for towards the router. You can have multiple rule sections for multiple docker networks. With the ndppd daemon you can skip the manual adding of neighbor proxy entries described in the Docker documentation.

Info: If you use ip -6 neigh to check if an address was registered, know that the daemon will only add it when traffic is routed. Hence trigger it with something like a ping. At this point you should be able to have your docker containers on IPv6 and ping6 the internet.

Next, to prepare for prefix changes, we need a script that updates several things. Sadly I could not find a hook that would trigger right when the prefix changes. The DHCP client apparently has a hook, but as we are not using DHCP… Hence the script is triggered by cron every 5 minutes and checks for a prefix change. Below you find an example, however you will need to adapt it to your needs, especially the networks after the prefix, the docker compose and container information and the dyndns setup.

update-prefix.sh

#!/bin/bash
# Update script to adapt docker networking to changed IPv6 prefix

export LC_ALL=C

DYN_USER="<your dyndns user>"
DYN_PASS="<your dyndns pass>"

# Grep current configured prefix from docker settings
PREFIX_OLD=$(grep -o -P '(?<=fixed-cidr-v6": ").*(?=:0::)' /etc/docker/daemon.json)

# Get latest prefix from ip, latest = highest ttl, hence on top
PREFIX_NEW=$(ip -6 addr show eth0 | grep inet6 | grep -v 'inet6 f[de]' | awk '{print $2}' | cut -f 1-4 -d : | head -n 1)

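# If prefix changed, update docker settings and restart docker service.
# Skip the run if either prefix could not be determined.
if [ -n "$PREFIX_OLD" ] && [ -n "$PREFIX_NEW" ] && [ "$PREFIX_OLD" != "$PREFIX_NEW" ]; then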
    echo "Prefix needs update"

    echo "   Stopping Docker container..."
    /usr/local/bin/docker-compose -f docker-compose.yml down
    /usr/bin/docker network rm docker-network

    echo "   Changing prefix from ${PREFIX_OLD} to ${PREFIX_NEW}..."    
    sed -i 's'"@$PREFIX_OLD"'@'"$PREFIX_NEW"'@g' /etc/docker/daemon.json
    sed -i 's'"@$PREFIX_OLD"'@'"$PREFIX_NEW"'@g' /etc/ndppd.conf

    echo "   Restarting Docker..."
    /etc/init.d/docker restart
    /usr/bin/docker network create \
        --driver=bridge \
        --subnet=192.168.1.0/24 \
        --gateway=192.168.1.1 \
        --ipv6 \
        --subnet="${PREFIX_NEW}:1::/80" \
        docker-network
    /usr/local/bin/docker-compose -f docker-compose.yml up -d

    echo "   Restarting ndppd..."
    /etc/init.d/ndppd restart

    echo "   Updating DNS..."
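    # IP_NEW is meant to be sent as part of the dyndns update URL below
    IP_NEW=$(/usr/bin/docker inspect -f '{{range .NetworkSettings.Networks}}{{.GlobalIPv6Address}}{{end}}' nginx)
    curl -4 -s -u "$DYN_USER:$DYN_PASS" "<dyndns update url>"

    # Clear the credentials again
    DYN_USER=""
    DYN_PASS=""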

fi

exit 0

The script basically does the following:

  • extracts the prefix stored in the daemon.json of docker for the default network (the text between fixed-cidr-v6": and :0:: in the config file) and the newest prefix of eth0. I used 0 for my default docker network (hence aaaa:bbbb:cccc:dddd:0::/80) and 1 for my custom network.
  • compares both and only acts if there is a change
  • updates the new prefix in /etc/docker/daemon.json and /etc/ndppd.conf
  • restarts docker and ndppd for changes to take effect
  • extracts new IPv6 from container named nginx and updates dyndns with this IPv6
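The prefix detection is the part most worth testing in isolation. The following is a minimal sketch of the same logic as a function that reads ip -6 addr output from stdin, so it can be tried against sample text. The function name extract_prefix is an assumption for illustration. Like the cut in the script, it assumes that the first four hextets are written out in full and not compressed with "::".

```shell
#!/bin/bash
# Print the first four hextets of the first inet6 address on stdin that is
# neither ULA (fd..) nor link-local (fe..). Input format: `ip -6 addr show`.
extract_prefix() {
    awk '$1 == "inet6" && $2 !~ /^f[de]/ {
        split($2, h, ":")
        print h[1] ":" h[2] ":" h[3] ":" h[4]
        exit
    }'
}

# Hypothetical sample input; on a real host use: ip -6 addr show eth0 | extract_prefix
printf '%s\n' \
    '    inet6 fd00:1:2:3::10/64 scope global' \
    '    inet6 2001:db8:1234:5678:a:b:c:d/64 scope global dynamic' \
    '    inet6 fe80::1/64 scope link' | extract_prefix
```

Running the function by hand before and after a reconnect shows quickly whether the value the cron job compares against is the one you expect.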

Lastly, adapting ip6tables to a changing IPv6 prefix is not easy, as the feature needed for it seems to be undocumented.

This is only needed if you actually DROP all forwarded traffic in your ip6tables, which closes off all docker containers from the internet apart from the exceptions you define. I would recommend this, as with IPv4 the standard Docker configuration only opens exposed ports to the internet. With IPv6 you have to take care of the filtering yourself. And remember: with this config, all containers have internet-routable IPv6 addresses.

To expose a Docker container (e.g. here aaaa:bbbb:cccc:dddd:1::1) with port 80 to the internet, you normally would add following exception to your host forwarding.

ip6tables -I FORWARD -p tcp -m tcp -d aaaa:bbbb:cccc:dddd:1::1 --dport 80 -j ACCEPT

However as the prefix aaaa:bbbb:cccc:dddd could change you need something different:

ip6tables -I FORWARD -p tcp -m tcp -d ::1:0:0:1/::ffff:ffff:ffff:ffff --dport 80 -j ACCEPT

As you can see, the prefix is gone and you are able to target a specific container. Now you only need to allow inter-container traffic, this should be handled by the following.

ip6tables -I FORWARD -s ::1:0:0:1/::ffff:ffff:ffff:ffff -j ACCEPT

You could add these rules in the end of the script. I fixed the container IPv6 ends in the docker-compose, hence I can leave them static in the firewall.
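The suffix-only match can be generated instead of typed, which helps to avoid mistakes in the mask. This is a minimal sketch under the numbering scheme used in this article (network number in the fifth hextet, container number in the last one). The function names suffix_match and expose_rule are made up for illustration, and the rule is only printed, not applied.

```shell
#!/bin/bash
# Build a prefix-independent address/mask pair that matches only the
# lower 64 bits of a container address.
# $1 = network number (fifth hextet), $2 = container number (last hextet)
suffix_match() {
    printf '::%s:0:0:%s/::ffff:ffff:ffff:ffff\n' "$1" "$2"
}

# Print (not execute) the FORWARD rule that exposes one TCP port.
# $1 = network number, $2 = container number, $3 = TCP port
expose_rule() {
    printf 'ip6tables -I FORWARD -p tcp -m tcp -d %s --dport %s -j ACCEPT\n' \
        "$(suffix_match "$1" "$2")" "$3"
}

# Container 1 in network 1, port 80
expose_rule 1 1 80
```

Printing the rule first makes it easy to review it before piping the output into a shell as root.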

Addendum

I would recommend testing each of these parts on its own, e.g. make sure the dyndns update works before putting the script together. It makes debugging and changes way easier. Thanks to the input from the following sources:


Protect docker from internet while allowing LAN with iptables

Some of you might have noticed that docker sets up its own rules in iptables. However, these rules reside mostly in the FORWARD chain and not in INPUT. Most users who set up a small server or laptop use INPUT rules to secure it. This is all right, as a server or laptop is normally a standalone machine and not a router. However, once docker is installed, the machine becomes a router for the containers, hence docker uses the FORWARD chain. In iptables, traffic is sorted after PREROUTING into either INPUT or FORWARD. Hence the INPUT rules do not affect forwarded traffic.

Docker manages the FORWARD chain (and even the nat table) itself to control access to the containers. Basically it sets up NAT and the forwarding to the containers as soon as you use the --publish (-p) option of docker run or the ports: setting in docker-compose.

Docker does not distinguish here between traffic from the internet and traffic from the local LAN. Everything that arrives at the FORWARD chain is treated the same.

ISSUE: In the scenario of a laptop or small home server, which is usually connected to the internet via the local LAN, this could open services to the internet that should only be reachable in the LAN. Only an additional firewall outside could prevent this.

So what is the solution

To get back to a state where we control which services are opened to the internet, you need four iptables rules. We will use the DOCKER-USER chain, which should be empty and is created by docker for user-written rules within the docker setup. Therefore they will not be touched by docker.

Before starting you should make a list of your local networks, including the docker network (even the default network).

First stop docker from communicating.

iptables -A DOCKER-USER -j DROP

This is adding a rule, that is just stopping any docker communication. After this all containers will be offline for the internet or locally.

Next allow local networks to communicate.

iptables -I DOCKER-USER -s <docker-network> -j RETURN
iptables -I DOCKER-USER -s <LAN> -j RETURN

The first rule allows the docker containers to communicate with each other, hence you need to add your docker network here (e.g., 172.18.0.0/24), or add multiple rules if you have multiple networks. The second rule allows the LAN to communicate again, hence it should be something along the lines of 192.168.1.0/24. After this your docker services will be available from the local network, however not from the internet (IP spoofing would work, but you can only handle that with different interfaces). If you do not want to publish services to the internet, you can stop here.

Finally allow outside communication to reach containers

iptables -I DOCKER-USER -p tcp --dport <target-port> -j RETURN

Put here the target port of the service on the docker container to allow traffic. IMPORTANT: This happens after nat, so if you have docker do some portmapping, e.g., 8080 -> 80 you need to put the container port 80 here. With this you can have your docker services accessible on the LAN and decide which to publish to the internet.

Notes: We use -I to add these rules before the DROP rule that was added first. RETURN is used as the chain DOCKER-USER is in front of all other docker rules in the filter table. Hence we need to return to those.
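The rules above can be collected in one place so that they are easy to review and to re-apply after a reboot. This is a minimal sketch that only prints the commands. The function name docker_user_rules and the example networks and port are assumptions for illustration.

```shell
#!/bin/bash
# Print the DOCKER-USER rules described above.
# $1 = space-separated list of trusted source networks (docker networks, LAN)
# $2 = space-separated list of container TCP ports to open to the internet
docker_user_rules() {
    local nets=$1 ports=$2 net port
    # Appended last: drop everything that no earlier rule returned
    echo "iptables -A DOCKER-USER -j DROP"
    # Inserted in front of the DROP: trusted source networks
    for net in $nets; do
        echo "iptables -I DOCKER-USER -s $net -j RETURN"
    done
    # Inserted in front of the DROP: ports published to the internet
    for port in $ports; do
        echo "iptables -I DOCKER-USER -p tcp --dport $port -j RETURN"
    done
}

# Hypothetical example: one docker network, one LAN, container port 80
docker_user_rules "172.18.0.0/24 192.168.1.0/24" "80"
```

One thing to verify in your own setup: replies to connections that containers open towards the internet arrive with an internet source address. If they get dropped, a connection tracking rule such as iptables -I DOCKER-USER -m conntrack --ctstate RELATED,ESTABLISHED -j RETURN may be needed in addition. This is not part of the four rules above.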


Dovecot learn ham/spam with rspamd via inet protocol for docker

I started rebuilding my mail server following Thomas Leister's howto. However, I decided to dockerize the whole setup. For that I needed to get rid of any socket communication and move to TCP-based communication between the different docker containers.

This was surprisingly easy, as most components already communicate via tcp. However the learn spam and ham mechanism still uses a socket. So here are some details for my setup:

  • I used a user defined network via docker compose to connect the different containers. By that I have full control over the containers IPs
  • Each process is running in one container, so I have unbound, redis, rspamd, dovecot, postfix
  • Host system is a debian stretch
  • Docker containers are based on Debian:stable-slim

EDIT 14.04.20: I switched my setup to Debian based Docker. Especially the postfix container needs to be NOT Alpine right now. The resolver implementation in musl-libc cripples the DNSSEC calls of postfix, making outgoing DANE unusable.

So what is the solution

BEWARE: I am basing my guide on Thomas' config from the howto mentioned above.

First you need to change a few details in the ham/spam piping. Within the dovecot.conf down at the plugin settings you need to set the sieve_pipe_bin_dir option to the location, where the pipe scripts (following steps) will be stored. Beware to set the path as it will be in your docker image. My setting: sieve_pipe_bin_dir = /usr/local/sbin

Next adapt the sieve scripts. These scripts trigger the learning as you can see in dovecot.conf. Ham on copying out of SPAM folder, Spam on copying into SPAM folder. Do not forget to call sievec after placing them in the sieve folder.

learn-spam.sieve

require ["vnd.dovecot.pipe", "copy", "imapsieve"];
pipe :copy "rspamd-pipe-spam";

learn-ham.sieve

require ["vnd.dovecot.pipe", "copy", "imapsieve"];
pipe :copy "rspamd-pipe-ham";

Now adapt the pipe scripts themselves. These scripts actually connect to rspamd to deliver the mail for learning. During docker image creation you will need to copy the rspamd-pipe-spam and rspamd-pipe-ham scripts into the sieve_pipe_bin_dir location (see the first step) and make them executable. The scripts connect via the container name rspamd. If yours has a different name, you need to change it or use the IP.

rspamd-pipe-spam

#!/bin/bash
cat $1 | /usr/bin/curl -s --data-binary @- http://rspamd:11334/learnspam
exit 0

rspamd-pipe-ham

#!/bin/bash
cat $1 | /usr/bin/curl -s --data-binary @- http://rspamd:11334/learnham
exit 0

To allow these scripts to call rspamd you need to allow the IP of the dovecot container in the controller worker configuration.

worker-controller.inc

bind_socket = "<rspamd container>:11334";
password = "<your pwd as described in the guide>";
secure_ip = "<dovecot container ip>";

This should enable ham/spam learning via sieve within a docker setup.

Addendum

To train existing mails, e.g. from an old server, you need to execute the following commands in the dovecot container. Please make sure you adapt the paths if you changed them.

Learn HAM:

find /var/vmail/mailboxes/*/*/mail/cur -type f -exec /usr/local/sbin/rspamd-pipe-ham {} \;

Learn SPAM:

find /var/vmail/mailboxes/*/*/mail/Spam/cur -type f -exec /usr/local/sbin/rspamd-pipe-spam {} \;


Regular update of Docker images/containers

After converting my servers into docker setups, I was in need to update the images/containers regularly for security reasons. Baffled I found, that there is no standard update method to make sure that everything is up-to-date. The ephemeral setup allows you to throw away your containers and images and recreate them with the latest version. As easy as this sounds, you figure that there are some loopholes in the setup.

First we need to understand, there are 3 types of images that we need to keep up-to-date

  • Images from the docker hub, that just get pulled and are used as they are with some configs
  • Images from the docker hub, that get pulled and then are only used as base for own dockerfiles
  • The images created out of own dockerfiles

Ridiculously all three of them need to be updated to make sure everything is up-to-date (most important, a new local build won’t get the latest base image update) and additionally we have to care for cleanup.

I found some solutions in the net to automatically update docker, the so far best version by binfalse.de. But this leaves out my own dockerfiles with a build and some minor steps, pruning, etc. So I am only using the dupdate script out of the Handy Docker Tools to incorporate in a little script.

So what is the solution

WARNING: This just updates images. If your setup needs additional update steps, you need to plan these in. Otherwise you risk breaking your setup.

Multiple steps are needed to completely update. First, I use /usr/local/sbin/dupdate -v to update all docker images coming from a hub, covering the ones I use directly and as base for builds.

This will give an error for the images you created out of your own dockerfiles, but update all pulled ones from docker hub. Second, I update the images of my own dockerfiles by rebuilding all via /usr/local/bin/docker-compose -f docker-compose.yml build --no-cache. If you use docker without docker-compose, you just have to do something similar for each dockerfile.

This will use the newly pulled base images in the build, hence create the latest version for your dockerfile. IMPORTANT: The --no-cache is needed to force the update of self-built images. Docker decides whether a build step needs to run again only by looking at the commands in the Dockerfile (hence whether they changed). It CANNOT see a version change of a package installed with e.g. apt-get install. But that version change is exactly what you want, so you have to force a rebuild.

Now the images are all updated and we only need to restart the containers.

Addition for cleaning up

However, you end up with a lot of images tagged or named <none>. These are your old images, which are now cluttering the hard drive. The ones only tagged with <none> are the ones you updated from docker hub, the ones with name and tag <none> are the ones you built. You will need to run /usr/bin/docker image prune -a --force to get rid of them and free up space. Warning: This will erase all older images, including every image that is not used by a container. If you need them as a safety precaution, skip this step.
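The steps above can be put together in a small script. This is a minimal sketch, not a finished tool: the DRY_RUN and PRUNE switches, the run helper and the function name update_all are assumptions for illustration, and the paths are the ones used in this article. Recreating the containers with docker-compose up -d is one way to do the restart mentioned above.

```shell
#!/bin/bash
# Sketch of the update sequence: pull, rebuild, recreate, optional cleanup.
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.yml}"

# Print the command if DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

update_all() {
    # 1. Update all images that come from a hub. Errors for locally
    #    built images are expected, so the exit status is ignored.
    run /usr/local/sbin/dupdate -v || true
    # 2. Rebuild own images on top of the freshly pulled base images.
    run /usr/local/bin/docker-compose -f "$COMPOSE_FILE" build --no-cache || return 1
    # 3. Recreate the containers from the new images.
    run /usr/local/bin/docker-compose -f "$COMPOSE_FILE" up -d || return 1
    # 4. Optional cleanup, only if PRUNE=1 is set.
    if [ "${PRUNE:-0}" = "1" ]; then
        run /usr/bin/docker image prune -a --force
    fi
}

# Show what would be executed without touching anything
DRY_RUN=1 PRUNE=1 update_all
```

Keeping the prune step behind a switch means a regular run never deletes the old images unless you ask for it.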


Reinstall GRUB after BIOS update on LUKS encrypted system

For reasons unknown I upgraded my Lenovo's BIOS via a USB stick. :-) Everything went well, however after the reboot all my boot options for Linux Mint were gone. It turns out that somehow my boot setup was erased as well. Using UEFI without CSM and without secure boot together with a LUKS-encrypted Linux Mint, this was already an issue when first installing. Getting everything right seems to be more a matter of good luck.

This answer is mainly for straightforward installations of Ubuntu/Mint with LVM on LUKS and an unencrypted boot partition. If you have a different setup, make sure to adapt the different mounts.

Advice on using boot-repair utility: Don’t. The tool messes with a lot of configs unnecessarily. If you are not absolutely sure, what it does, don’t use it. As an example for myself: When using it to restore Grub, it edited my fstab and uncommented my root mapper and set it to noauto. Result: You end up after Grub in an initramfs prompt, as the volume has not been unlocked. Possibly wasting time by checking on why LVM times out, cryptsetup not working, etc.

So what is the solution

To get back your boot menu, I tried several things, e.g. boot-repair. However since my system is LUKS encrypted, I guess the tools all had some problems. To get back my system, I accessed my old system via chroot from a Linux Live CD. In this case Linux Mint Live CD.

First boot from Linux Live CD, get keyboard locale and network set up. Unlock your LUKS device. Make sure, that the name in the end (sda3_crypt) is as specified in your original /etc/crypttab (yes, if you do not know, you need to open the crypt device somewhere, take a look, close and reopen it). Otherwise you might get a warning later on.

If you have a different setup, make sure to take the correct device.

cryptsetup luksOpen /dev/sda3 sda3_crypt

For general LUKS information and commands, take a look here: https://wiki.ubuntuusers.de/LUKS/

Next, mount all necessary partitions of the old system. To find the logical volumes inside the unlocked LUKS container, use sudo lvscan

For me, the root filesystem turns out to be in /dev/mint-vg/root

If you have separated partitions, e.g. for home, or other devices, make sure to adapt parts below for mounting.

sudo mount /dev/mint-vg/root /mnt
sudo mount /dev/sda2 /mnt/boot
for i in /dev /dev/pts /proc /sys /run ; do sudo mount -B $i /mnt$i ; done
sudo mount -o bind /etc/resolv.conf /mnt/etc/resolv.conf
sudo mount /dev/sda1 /mnt/boot/efi/

Make sure you mount /mnt/boot/efi, otherwise grub will complain with grub-install: "cannot find EFI directory". The EFI directory is not included in your boot partition, it lives on a separate partition.
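Before entering the chroot, a quick check can tell whether the EFI partition really ended up where grub expects it. This is a minimal sketch, assuming the usual layout in which the EFI system partition contains a top-level EFI directory. The function name efi_ready is made up for illustration.

```shell
#!/bin/bash
# Return 0 if <target>/boot/efi contains an EFI directory, which suggests
# that the EFI system partition is mounted below the chroot target.
# $1 = chroot target, defaults to /mnt
efi_ready() {
    local target=${1:-/mnt}
    [ -d "$target/boot/efi/EFI" ]
}

if efi_ready /mnt; then
    echo "EFI directory found, grub-install should be able to see it"
else
    echo "No EFI directory below /mnt/boot/efi, check the mounts"
fi
```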

After that, enter the chroot environment with sudo chroot /mnt /bin/bash

To install GRUB run sudo grub-install /dev/sda

Now just a reboot is needed and the system should work as before.

Addendum

If you are here because something on boot is not working and you messed things up even further as I did, additionally update-initramfs -c -k all could help.
