Fun With Kubernetes DNS


Abusing Kubernetes DNS for a local development environment.

At $DAYJOB, we had a project that used the ACME protocol to obtain server certificates. We wanted to be able to develop and test the application on our laptops.

We use Kubernetes for the production deployment of the application and Docker for Mac (with its built-in Kubernetes support) for local development.

Boulder provides a good test platform for local development - even if it is a bit heavy. We deploy MySQL in one pod and all the Boulder components in another. Our internal application has multiple services and each is in its own pod. All of this is deployed using Kubernetes deployments.
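As a rough sketch of that layout (the name and image here are illustrative assumptions, not our actual manifests), the Deployment for the Boulder pod might look like the following; the MySQL Deployment is analogous:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: boulder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: boulder
  template:
    metadata:
      labels:
        app: boulder
    spec:
      containers:
      - name: boulder
        # Illustrative image; we build our own Boulder image in-house.
        image: letsencrypt/boulder-tools:latest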

Since our application will be using Let’s Encrypt with HTTP challenges, we want to mimic this in our local development environments. Boulder needs to contact a site via HTTP to validate the certificate request, and we also want to be able to test that certificates have been issued. So, we need Boulder to be able to contact the site inside the locally running Kubernetes cluster, and the developer also needs to be able to contact the same site.

Local Development

To aid in local development, we have a wildcard entry *.local.example.com that resolves to 127.0.0.1. (example.com is just an example; we use our real domain so that it resolves on all developers’ machines.) This allows us to use hostnames like myapp.local.example.com and have them resolve to the local Docker instance for testing ingresses, services, etc. Our application is deployed in the local Kubernetes cluster and is exposed using a NodePort service: port 30080 for HTTP and 30443 for HTTPS. A user just needs to open https://somesite.ssl.local.example.com:30443 in a browser to reach the application at 127.0.0.1.

Inside the cluster, however, Boulder needs to access http://somesite.ssl.local.example.com:5002 (Boulder uses port 5002 for HTTP validation by default), and this needs to resolve to the service, not 127.0.0.1, which would be the Boulder container's own localhost. So, we need a way for the developer to reach the service at 127.0.0.1 while Boulder reaches it at the service's IP address inside the Kubernetes cluster.
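That NodePort service looks roughly like this (the service name and selector are placeholder assumptions for our real application; the node ports are the ones above):

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - name: http
    port: 80
    nodePort: 30080
  - name: https
    port: 443
    nodePort: 30443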

What We Did

Note: I’m going to first describe what we did. I realized afterwards there may be a much simpler way to accomplish it. I’ll describe that later.

The following description is probably a bit jumbled: I was working on this in between meetings one day and, to be honest, I don’t recall the exact order in which I figured things out, but I do know how the end result works.

DNS

Internally, Kubernetes uses kube-dns to resolve internal and external resources. You can configure kube-dns to forward domains to specific nameservers.

We created a CoreDNS deployment. CoreDNS can rewrite queries, and we used this to rewrite all queries for a subdomain to our service inside the cluster. Our Corefile looks something like:

.:53 {
    errors
    health
    rewrite name regex (.*)\.ssl\.local\.example\.com myservice.default.svc.cluster.local
    file /etc/coredns/cname.conf ssl.local.example.com
    prometheus :9153
    proxy . /etc/resolv.conf
}

This rewrites all DNS queries for any hostname in the ssl.local.example.com subdomain to myservice.default.svc.cluster.local which is handling the HTTP challenge.

Let’s Encrypt (and Boulder) checks CAA DNS records before issuing certificates. We created a zone file that looks like:

$TTL 1m
ssl.local.example.com. IN SOA ns.local.example.com. admin.example.com. ( 2007120710 1d 2h 4w 1h )
ssl.local.example.com. CAA 0 issue "letsencrypt.org"
ssl.local.example.com. CAA 0 issue "happy-hacker-ca.invalid"

(happy-hacker-ca.invalid is used by Boulder.)

The Corefile and the zone file are loaded using Kubernetes ConfigMaps.
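A sketch of that ConfigMap (the object name and file keys are assumptions; the contents are the two files shown above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
data:
  Corefile: |
    .:53 {
        errors
        health
        rewrite name regex (.*)\.ssl\.local\.example\.com myservice.default.svc.cluster.local
        file /etc/coredns/cname.conf ssl.local.example.com
        prometheus :9153
        proxy . /etc/resolv.conf
    }
  cname.conf: |
    $TTL 1m
    ssl.local.example.com. IN SOA ns.local.example.com. admin.example.com. ( 2007120710 1d 2h 4w 1h )
    ssl.local.example.com. CAA 0 issue "letsencrypt.org"
    ssl.local.example.com. CAA 0 issue "happy-hacker-ca.invalid"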

We created a service with a static IP in the Kubernetes service IP range. Luckily, Docker for Mac seems to use the same range in every install.

apiVersion: v1
kind: Service
metadata:
  name: coredns
  labels:
    app: coredns
spec:
  clusterIP: 10.96.0.11
  selector:
    app: coredns
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
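
For completeness, a minimal CoreDNS Deployment that mounts the ConfigMap might look like this (the image tag is an assumption; the /etc/coredns mount path matches the file directive in the Corefile):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
spec:
  replicas: 1
  selector:
    matchLabels:
      app: coredns
  template:
    metadata:
      labels:
        app: coredns
    spec:
      containers:
      - name: coredns
        # Assumed image tag; use whatever CoreDNS release you test with.
        image: coredns/coredns:1.2.6
        args: ["-conf", "/etc/coredns/Corefile"]
        volumeMounts:
        - name: config
          mountPath: /etc/coredns
      volumes:
      - name: config
        configMap:
          name: coredns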

We then used a [ConfigMap to tell kube-dns](http://blog.kubernetes.io/2017/04/configuring-private-dns-zones-upstream-nameservers-kubernetes.html) to use this to resolve *.ssl.local.example.com:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"ssl.local.example.com": ["10.96.0.11"]}

At this point, I discovered that Boulder resolves all hostnames to 127.0.0.1. We created a ConfigMap to inject into the Boulder pod to override its configuration. Our configuration is based on Boulder's test configuration, but we override the DNS resolver:

  "common": {
    "dnsResolver": "10.96.0.10:8053",
    "dnsTimeout": "1s",
    "dnsAllowLoopbackAddresses": true
  }

10.96.0.10 is the IP of the kube-dns service inside Kubernetes.
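A hedged sketch of how that override gets into the pod, as an excerpt of the Boulder pod spec (the ConfigMap name and mount path are assumptions; the exact path depends on how the Boulder image loads its JSON config):

      containers:
      - name: boulder
        volumeMounts:
        - name: boulder-config
          # Assumed location of Boulder's JSON configuration inside the image.
          mountPath: /etc/boulder
      volumes:
      - name: boulder-config
        configMap:
          name: boulder-config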

How it works

The above seems a bit complex, and it does have a few moving pieces. The production deployment is much simpler: we do not use Boulder and we do not need any of the DNS hackery - we just use normal CNAMEs.

The DNS flow is:

  • Boulder receives a request to issue a certificate for somesite.ssl.local.example.com. It makes an HTTP request to validate the request using the HTTP challenge.
  • Boulder makes a DNS request to the kube-dns service for the domain.
  • kube-dns forwards the request to the CoreDNS service.
  • CoreDNS rewrites the request to myservice.default.svc.cluster.local.
  • CoreDNS makes a request to the DNS server listed in /etc/resolv.conf, which is the kube-dns service.
  • kube-dns responds with the service IP for myservice.default.svc.cluster.local.
  • This reply is sent back to Boulder, which then makes an HTTP request to that IP.

At some point during this process, Boulder also queries for the CAA record; that query is likewise forwarded to CoreDNS, which answers it from the zone file.

So, this is a bit of a process to resolve the domain inside the Kubernetes cluster. Of course, “real” DNS on the internet often involves as many, if not more, steps.

How we probably could have done it

I think we could have just set the environment variable FAKE_DNS for the Boulder pod to the IP address of the service that is handling challenges. This causes Boulder to use that IP for every domain. It would require us to use a static IP address for our service, but we already have to do so for our CoreDNS service.
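In the Boulder Deployment, that would just be an environment variable on the container, something like the following (the IP is an illustrative static ClusterIP for the challenge-handling service):

      containers:
      - name: boulder
        env:
        - name: FAKE_DNS
          # Assumed static ClusterIP of the service handling the HTTP challenges.
          value: "10.96.0.12"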

I’m not sure how this interacts with CAA records. Also, I think we tested this and it did not work. However, recall that I did this all in a day in disjointed sessions of hackery.

I’ll retest this and update this post.