SaltStack, Ansible, Puppet, … how can we share data between servers?

WARNING: This is still somewhat of a draft.

Lately I've been depressed by the lack of automation in my life or really in the projects I take part in.

Most of my stuff is not hosted on AWS, Azure, DigitalOcean or other big "cloud" players. That is for multiple reasons:

  • We may have hardware, rack, power and network capacity available which is basically free.
  • We might need physical connectivity to the machine - e.g. connecting to a local WiFi mesh network, a projector, some audio equipment and so on…
  • We want to remain in control of our data. That means we are not going to trust some random internet business.
  • It is cheaper, even if you have to pay for the servers that are running 24/7. This might not be true for a website that sells boat trips on the Atlantic Ocean and gets high traffic spikes every 2nd Sunday from April to December, but for our scenarios it works quite well.

At the office I've been using Puppet for more than six years and I'm quite happy with it. In the local Freifunk community (https://darmstadt.freifunk.net) we are using SaltStack. For my personal infrastructure (e-mail, web hosting, random docker containers, pastebin, dn42, …) I'm using Ansible (https://ansible.com).

None of those deployments is particularly trivial; they are probably just average projects for these tools.

A few weeks back a few others and I started writing SaltStack states for an IRC network that we are running. If you think about automating the deployment of IRCds it is rather easy: create a VM, install your favorite flavor of Linux, install the build dependencies for the IRCd of your choice and generate the configuration. That's basically it.

We've been using TLS in our network for about 10 years now (both for user and server-to-server communication). Since it wasn't and still isn't desirable to buy wildcard certificates or share certificates between servers, we are running our own CA (including an OCSP service etc.). With the switch to automated server configuration we also want to deploy Let's Encrypt - actually Let's Encrypt encouraged us to start using some kind of automation, since humans aren't reliable and having to jupe a server every 3 months because the admin still didn't configure a Let's Encrypt cronjob is not what we are after.

So I started out writing some states and a Vagrantfile (https://gist.github.com/andir/2188da38904557a58cd7df29fd277275) for easier testing of the whole stack. After tackling a few issues with Vagrant and different boxes (as Vagrant calls VM images) everything looked fine.

We basically have a zoo of VMs on our laptops which simulate the production network. At least that's what we thought.

For the links between the IRCds we are using TLS. So far we exchanged fingerprints manually, added them to the ircd.conf, did a /quote rehash and off you go.

When you automate the deployment you also automate the configuration of links between your servers. While you can configure send_password and receive_password, as far as I'm concerned plain passwords aren't what we want to rely on when sending chat messages around the globe. So we still need to exchange certificate fingerprints. One might say that TLSA records are made for that, but I wonder why I can't use the already verified connection between my Salt master and the Salt minion. Querying DNS for TLSA records requires your DNS to be up, DNSSEC to work, and your local resolver to either have a copy of the root zone or use an upstream resolver that (hopefully) validates DNSSEC - unlike OpenDNS.

Since we are also running our own DNS servers, adding them as a dependency isn't really a great idea - we might be recovering from the death of some parts of the infrastructure. The only reasonable approach to me is to use the authenticated relationship between minion and master to exchange public keys, fingerprints etc.

While I'm currently focusing on exchanging certificate fingerprints and/or public keys, this applies to a bunch of scenarios you might come across when you decide to drop all shared secrets and generate them on your servers. A few of those could be:

  • Generating and exchanging TSIG keys for nsupdate between your DNS Servers and minions.
  • Communicating a shared secret for database access, borg backups, …
  • Distributing SSHFPs between your servers (local verification, publishing them to DNS as SSHFP records,…)
  • Trust bootstrapping (after minion key verification) for any kind of internal secret database

With SaltStack you have a few options to transfer data from a minion to the master / other minions:

  • use the Mine to export data
  • use grains
  • use pillars (with pre-generated certificates, keys, …) -> not the desired solution
  • trigger events with custom context
  • use a vault

In Puppet you could use exported resources; in Ansible you could write to some kind of custom API without much overhead, or query/configure the other side on-demand since a single run isn't restricted to a single server.
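For illustration, exported resources in Puppet look roughly like this (the SSH host key example is generic and not tied to our setup): a node exports a resource with @@, PuppetDB stores it, and every other node collects it.

# on every node: export this node's SSH host key
@@sshkey { $::fqdn:
  ensure => present,
  type   => 'ssh-rsa',
  key    => $::sshrsakey,
}

# on every node: collect the keys exported by all nodes
Sshkey <<| |>>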

Puppet has had this solved for many years - yet people tend to not like Puppet, for various reasons. Maybe because it feels like programming?

Using the Salt Mine

The Salt Mine is a handy construct when it comes to exporting data from a node, as long as that data only exists as some kind of list or with a very low cardinality. It also only makes sense if the data is mostly static and doesn't change. Changes to the exported data (e.g. exporting an additional piece of information, removing one, …) require changes to the minion configuration. This isn't really practical if you are trying to write modular states where not everything is hard-wired to another state.

For reference this is how you would export information about a certificate:

mine_functions:
  my_awesome_certificate:
    - mine_function: tls.cert_info
    - /etc/letsencrypt/live/.../cert.pem

This has to be deployed on each minion and you probably also have to restart the minion when changing the file.
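On the consuming side you can then pull that data out of the mine while rendering a state or template. A minimal Jinja sketch, reusing the my_awesome_certificate alias from above (the loop just dumps whatever tls.cert_info returned per minion):

{% set certs = salt['mine.get']('*', 'my_awesome_certificate') %}
{% for minion, info in certs.items() %}
# {{ minion }}: {{ info }}
{% endfor %}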

More information on the salt mine can be found at https://docs.saltstack.com/en/latest/topics/mine/.

Trying to export data via Grains

I'm not going to get into the details or provide an example configuration here. I've written a bunch of grains and they just work. The limitation is that you can't really parameterize them: you can't tell your custom grains what to export without changing the code, introducing custom configuration files, adding stuff to the minion config etc. While this approach would automatically propagate new certificate fingerprints, public keys, …, you still have to use the Salt Mine to actually export them. Which isn't that bad.

The lack of configuration options for custom grains kills this approach for me.

Pillars

LOLNOPE. Using pillars would require manual collection of fingerprints or (even worse) central management of all the certificates.

This simply doesn't work for me.

Using a custom event

This seems to be a promising but ephemeral approach to the issue.

The basic idea is:

  • Create the public/private key pair on the minion.

  • On changes to that file (creation, key rollover, new certificate, …) execute a script with the filename as argument

  • The (bash) script then extracts the information we want from the file (using openssl or other command line tools) and publishes that information using something like

    `salt-call event.send my/custom/event/certificate-changed '{"certfp": "ABCDEF01234567890…"}'`
    

In order to use the "published" fingerprint other minions must be up, running and listening to those events. Otherwise the information is lost and nobody cares.

This approach only works if we figure out how to store the fingerprints - after receiving an event.
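The natural place to catch such an event on the master side would be the reactor system. The wiring itself is simple (the reactor SLS path is made up, and what it should do with the received fingerprint - write it to a file, feed an external pillar, push it into a vault - is exactly the open question):

# /etc/salt/master.d/reactor.conf
reactor:
  - 'my/custom/event/certificate-changed':
    - /srv/reactor/store_certfp.sls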

It basically sucks as badly as the others, but we might be able to configure the links after randomly restarting salt-minions and running salt '*' state.apply.

Using a vault

At the time of this writing this seems to be a valid option. I've not tried it yet. There will be an update on this soon (tm).

Reading from the vault seems to be rather easy. You can also write to it, but only using information that is available during state rendering, so I'm not sure what the benefit is. One could probably combine this with the event approach to store the public keys of the minions in the vault by listening to key events on the master.
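Untested, but reading during state rendering would presumably look something like this (assuming the vault execution module is configured for the minion; the secret path is made up):

{% set certfp = salt['vault.read_secret']('secret/irc/server1/certfp') %}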

Conclusion

This world sucks. We need better tools :/

Juniper mgd decided to core dump on commit and rollback

On a Juniper EX3300 a colleague of mine entered an invalid statement:

interface-range some_ports {
    member ge0/0/2;
    member-range ge-0/0/0 to ge-0/0/1;
}

The member ge0/0/2 is missing a - between ge and 0/0/2. Juniper (for whatever reason) accepted the input, but mgd decided to segfault when asked to delete or roll back the configuration.

pid 75982 (mgd), uid 0: exited on signal 11 (core dumped)

Recovering from that case is actually not that hard. You just have to know the right command(s) ;-)

You can load an older configuration via the load override <config file> command. I also tried the load replace <config file> command, but that segfaulted as well.

andi@foo# load override ?
Possible completions:
  <filename>           Filename (URL, local, remote, or floppy)
  db/                  Last changed: Oct 27 10:44:35
  juniper.conf+.gz     Size: 6000, Last changed: Nov 08 15:12:55
  juniper.conf.1.gz    Size: 5913, Last changed: Jun 30 13:36:52
  juniper.conf.2.gz    Size: 5881, Last changed: Jun 30 12:59:25
  juniper.conf.3.gz    Size: 5280, Last changed: Jun 30 12:54:02
  [...]
{master:0}[edit]
andi@foo# load override juniper.conf+.gz
load complete

In my case juniper.conf+.gz was the desired config file. I recommend inspecting those files before loading them (they are stored in /config/). Keep in mind that load only changes the candidate configuration, so you still have to commit it afterwards.

Using VRFs with linux and systemd-networkd

While working on a systemd-networkd patch to implement (at least basic) VRF interfaces I wrote my other post. This post should give you a brief example of how you can create a VRF with systemd-networkd.

At this point it really only creates the interfaces and enslaves potential customer interfaces to a given VRF.

You still have to implement all the ip rule stuff yourself. For example, a systemd unit that is executed/started after the network is "up" might be the right approach.

First you have to create the systemd.netdev file vrf-customer1.netdev.
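A minimal version of that file, using table 42 to match the ip output below, looks roughly like this:

[NetDev]
Name=vrf-customer1
Kind=vrf

[VRF]
Table=42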

After restarting systemd-networkd with systemctl restart systemd-networkd you should see the corresponding interface:

$ ip -d link show dev vrf-customer1
9: vrf-customer1: <NOARP,MASTER> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 02:74:c7:e1:de:64 brd ff:ff:ff:ff:ff:ff promiscuity 0
    vrf table 42 addrgenmode eui64 numtxqueues 1 numrxqueues 1

Note the last line which states vrf table 42.

To add an interface to the VRF you'll have to modify/create the corresponding .network file.
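On my notebook that is /etc/systemd/network/enp0s31f6.network; reduced to the part relevant for the VRF (the VRF= option is what enslaves the interface), it looks roughly like this:

[Match]
Name=enp0s31f6

[Network]
VRF=vrf-customer1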

Restarting systemd-networkd again and checking the status using ip -d link gives us:

$ ip -d link show dev enp0s31f6
3: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel master vrf-customer1 state DOWN mode DEFAULT group default qlen 1000
    link/ether 50:7b:9d:cf:34:dc brd ff:ff:ff:ff:ff:ff promiscuity 0
    vrf_slave table 42 addrgenmode eui64 numtxqueues 1 numrxqueues 1

Again note the last line, which states vrf_slave table 42. Also in the first line you can see that it belongs to the VRF vrf-customer1.

And that is all for now. You still have to add the ip rule commands in some way or another (there is no support for that in systemd-networkd yet and I did not have a good idea that wouldn't amount to inventing ip rule management in systemd); a rough sketch of such a unit follows below.
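A sketch of such a oneshot unit (the unit name is made up, the path to the ip binary may differ on your distribution, and the rules use table 42 to match the VRF above):

vrf-customer1-rules.service:

[Unit]
Description=ip rules for vrf-customer1
After=systemd-networkd.service

[Service]
Type=oneshot
RemainAfterExit=yes
# send traffic entering/leaving the VRF through table 42
ExecStart=/usr/bin/ip -4 rule add oif vrf-customer1 lookup 42
ExecStart=/usr/bin/ip -4 rule add iif vrf-customer1 lookup 42
ExecStart=/usr/bin/ip -6 rule add oif vrf-customer1 lookup 42
ExecStart=/usr/bin/ip -6 rule add iif vrf-customer1 lookup 42

[Install]
WantedBy=multi-user.target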

Using VRFs with linux

Ever since I heard about VRF support (or VRF-lite, as it is called in Documentation/networking/vrf.txt) I've wanted to start tinkering with it. Since the topic is currently only covered in the previously mentioned Linux kernel documentation I thought it would be a good idea to post some notes.

It basically boils down to adding a VRF interface and creating two ip rule entries per address family.

I'm using a local VM with ArchLinux since the VRF feature seems to require a rather recent kernel. My experience with kernels below version 4.6 wasn't that great.

$ ip -br link # this is where we start off
lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
ens3             DOWN           52:54:00:12:34:56 <BROADCAST,MULTICAST>

Now we are adding a new interface named vrf-customer1 with the table customer1 assigned to it. The table parameter is used to place routes from your devices within the VRF into the right routing table.
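Note that using a table name (instead of a plain number) requires a matching entry in /etc/iproute2/rt_tables; judging from the output further down, customer1 maps to table 100 here:

$ echo "100 customer1" >> /etc/iproute2/rt_tables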

$ ip link add vrf-customer1 type vrf table customer1

$ ip -d link show vrf-customer1 # verify that the interface indeed exists and has the correct table assigned to it
4: vrf-customer1: <NOARP,MASTER> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether ca:22:59:ba:05:da brd ff:ff:ff:ff:ff:ff promiscuity 0
    vrf table 100 addrgenmode eui64

Next: redirect the traffic from and to the VRF to the customer1 table and verify the rules are indeed as expected:

$ ip -4 rule add oif vrf-customer1 lookup customer1
$ ip -4 rule add iif vrf-customer1 lookup customer1
$ ip -6 rule add oif vrf-customer1 lookup customer1
$ ip -6 rule add iif vrf-customer1 lookup customer1

$ ip -4 rule
0:  from all lookup local
32764:      from all iif vrf-customer1 lookup customer1
32765:      from all oif vrf-customer1 lookup customer1
32766:      from all lookup main
32767:      from all lookup default

$ ip -6 rule
0:  from all lookup local
32764:      from all oif vrf-customer1 lookup customer1
32765:      from all iif vrf-customer1 lookup customer1
32766:      from all lookup main

To make any use of our VRF we will have to add a device to it. In my case I'll add the only available "physical" device, ens3.

$ ip link set ens3 master vrf-customer1
$ # verify the interface is indeed a member of the VRF
$ ip -br link show master vrf-customer1
ens3             DOWN           52:54:00:12:34:56 <BROADCAST,MULTICAST>

Now that we have an interface to receive and send packets with, we should consider adding an IP address to it. Since IPv6 is enabled by default we don't need to configure a link-local address for that protocol.

$ # add an IP to the interface
$ ip addr add 10.0.0.1/24 dev ens3
$ ip route show table customer1
local 10.0.0.1 dev ens3  proto kernel  scope host  src 10.0.0.1

Seeing a route like that might confuse the average Linux user. Those routes usually live in the local table, which you can check via ip route show table local.

The route to the /24 we've just added is still missing from the table though. Why is that? You'll have to change the state of the interface to "UP":

$ ip link set ens3 up
$ ip route show table customer1
broadcast 10.0.0.0 dev ens3  proto kernel  scope link  src 10.0.0.1
10.0.0.0/24 dev ens3  proto kernel  scope link  src 10.0.0.1
local 10.0.0.1 dev ens3  proto kernel  scope host  src 10.0.0.1
broadcast 10.0.0.255 dev ens3  proto kernel  scope link  src 10.0.0.1


$ ip -6 route show table customer1
local fe80::5054:ff:fe12:3456 dev lo  proto none  metric 0  pref medium
fe80::/64 dev ens3  proto kernel  metric 256  pref medium
ff00::/8 dev vrf-customer1  metric 256  pref medium
ff00::/8 dev ens3  metric 256  pref medium

suddenly routes \o/

Using multiple client classes with ISC DHCPd

Since the internet is lacking examples of how to use multiple classes with a single pool, here is one:

class "mac-filtered-clients"
{
    match binary-to-ascii (16, 8, ":", substring(hardware, 1, 6));
}

subclass "mac-filtered-clients" "50:7b:00:00:00:00"; # some cool host!

class "J-client" {
    spawn with option agent.circuit-id;
    match if (substring(option agent.circuit-id, 30, 9) = "foo-bar");
    lease limit 1;
}

subnet 192.168.0.0 netmask 255.255.0.0 {
     pool {
          range 192.168.0.10 192.168.0.150;
          allow members of "J-client";
          allow members of "mac-filtered-clients";
     }
}

This isn't very special compared to a setup with just a single class, but it can be confusing since debugging classes is a PITA. One pitfall I ran into was using the byte representation of the MAC addresses (without the quotes) together with match hardware;. The example above works for me (tm).

Postgresql-tmpfs with systemd socket activation for local (ephemeral) data during development

During development of database-related stuff you commonly run into the "issue" (or non-issue, depending on your taste) of running a local database server - or multiple of those.

In my case I have to run a local postgresql server on my notebook. I asked myself: I'm not always developing on that piece of software, and I do not always require or want a local postgresql server. What can I do about that?!?

On top of that, using my precious SSD to store data I am going to delete anyway sounds like a waste (of money). In my development environment I can and want to safely wipe the data often. Also, most of the database load comes from running test cases anyway. That stuff doesn't need to end up on my (slow, compared to RAM) disk. Using a tmpfs for that kind of stuff sounds much saner to me.

The part about repeatedly getting a clean database setup sounded like a use case for a container-based thing. These days docker is pretty "hot" and it solves the issue of distributing reusable images. There is an official postgresql image on docker hub for various versions of postgresql. I've simply built a new image based on that. It is available on docker hub (https://hub.docker.com/r/andir/postgresql-tmpfs/) or, if you prefer to build it on your own, you can download the Dockerfile from GitHub (https://github.com/andir/postgresql-tmpfs).

Now that we are past the introductory blabla, here is the systemd socket activation setup I'm using to achieve this.
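The rough shape is a socket unit listening on the usual PostgreSQL port plus a service that starts the container and forwards connections to it. A minimal sketch, assuming a systemd-socket-proxyd based setup (the internal port 15432, the container name and the proxy binary path are arbitrary choices and may need adjusting):

postgresql-docker.socket:

[Unit]
Description=Socket for the postgresql-tmpfs container

[Socket]
ListenStream=127.0.0.1:5432

[Install]
WantedBy=sockets.target

postgresql-docker.service:

[Unit]
Description=postgresql-tmpfs container
Requires=docker.service postgresql-docker.socket
After=docker.service

[Service]
# remove any stale container, start a fresh one on a local port and
# forward the activated socket to it; the very first connection may
# arrive before postgres has finished initializing
ExecStartPre=-/usr/bin/docker rm -f postgresql-tmpfs
ExecStartPre=/usr/bin/docker run -d --name postgresql-tmpfs -p 127.0.0.1:15432:5432 andir/postgresql-tmpfs
ExecStart=/usr/lib/systemd/systemd-socket-proxyd 127.0.0.1:15432
ExecStopPost=/usr/bin/docker rm -f postgresql-tmpfs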

You can either put those unit files in /etc/systemd/system or install them as systemd user units in ~/.config/systemd/user (in that case add --user to the systemctl commands below).

systemctl daemon-reload
systemctl enable postgresql-docker.socket

If you try to connect to the postgresql server (nc 127.0.0.1 5432) you can observe the container while it is starting (journalctl -f).

The default username, password and database name is postgres. You can change that by modifying the startup arguments of the docker container. Those are documented at https://hub.docker.com/_/postgres/.
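For example, connecting with those defaults:

$ psql -h 127.0.0.1 -p 5432 -U postgres postgres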

Happy data trashing \o/

P.S.: If you have an idea on how to stop the service after x minutes of inactivity please let me know. Stopping the service manually isn't really what I'm after.