Everything You Needed to Know About Kubernetes TLS

(But Were Afraid to Ask)

Joe Thompson


In IT since my first job helping out with computers in my high school in 1994

Past employers: Mesosphere, Capital One, CoreOS, Red Hat, among others

Exposed to Kubernetes in early 2015 and working with it full-time since late 2015

Currently a Solutions Engineer for (we're hiring!)


Pronouns: he/him
Blood type: Caffeine-positive

Contact info:

Why am I here? Isn't TLS handled automatically these days?

...Not as much as you'd like

Sometimes even when it is, automation breaks

(Sometimes, humans break it)

TLS 101

or,

It's simpler than you think

In the beginning, there was X.500...

  • Organizational directory system standard specified in 1988
  • Included X.509 standard for certificate-based PKI
  • X.509, implemented independently of X.500 directories, became the standard PKI for the Internet
  • Over the years, the X.509 standard has evolved (and so has standard practice around certificate acceptance)

What is a certificate?

A certificate is a structured document including:

  • Information about the subject (location, a Common Name, etc.) and the subject's public key
  • A signature by and attributes of the issuing certificate authority (CA)
  • (Commonly) Extensions defining the valid uses of the certificate, additional names for the subject, etc.


        Issuer: C = US, ST = Oregon, L = Portland, O = Kubernetes, OU = CA, CN = Kubernetes
        Validity
            Not Before: Nov 16 18:11:00 2019 GMT
            Not After : Nov 15 18:11:00 2020 GMT
        Subject: C = US, ST = Oregon, L = Portland, O = Kubernetes, OU = Kubernetes The Hard Way, CN = kubernetes
        Subject Public Key Info:
[...]
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
[...]
            X509v3 Subject Alternative Name: 
                DNS:localhost, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.svc.cluster.local, IP Address:10.3.0.1, IP Address:127.0.0.1

To have a CA issue a certificate, the subject generates a Certificate Signing Request containing its subject information and public key, signed with its own private key (to prove it possesses that keypair)

What is a certificate authority?

In its most basic version, it is:

  • A keypair
  • A certificate signed with that keypair's private key, that is marked as usable for certificate issuance


        X509v3 extensions:
            X509v3 Key Usage: critical
                Certificate Sign, CRL Sign
            X509v3 Basic Constraints: critical
                CA:TRUE

Certificate authorities typically use their root certificates only to issue intermediate certificates that are in turn used to issue end (leaf) certs.

Image: padlocked chain
Credit: WolfBlur @ PixaBay

Do I need to be a certificate authority? When?

You might be already and just don't know it -- many products include a CA to bootstrap themselves when certificates aren't provided; some even provide user access to issue certificates with an API. (Find out, because compromised or overly-exposed CAs are a game-over scenario.)

If you aren't, you may want (or need) to be under the following circumstances:

  • You need to own your root of trust for policy/regulatory reasons
  • You need to rotate certificates regularly but are airgapped and cannot use a service like Let's Encrypt for certificate issuance
  • You have a lot of certificates that you need to issue with low latency
  • You like being frustrated and annoyed on a regular basis

Warning: Opinions ahead!

If you are a vendor and claim your product is "enterprise-ready", you must earn all 80 points on this scale for handling TLS:

Generate a self-signed wildcard leaf cert and use it for everything by default: -50 points
  • Points are deducted because you're promoting horrible practices (that increasingly don't even work)

Generate a CA and use it to create proper leaf certificates: +10 points

Accept a user CA and key and use those to create leaf certs: +10 points

Handle certificate rotation (including CA) with minimal fuss: +20 points

Accept a user-supplied CA and set of leaf certificates and use them directly: +30 points

Bonuses:

  • If you earned all 70 regular points: +10
  • Enable using Let's Encrypt: +20 (only if you earned all of the other 80 points)

If you are an enterprise customer and a vendor's product does not pass this test, kick them out, close the door and lock it!

Why is TLS such a royal pain?

A lot of the problems people have with "TLS" are actually issues with the way running code implements abstractions on top of it:

  • Trust is treated as a binary: "trusted completely" or "not trusted at all"
  • Trust is hierarchical and, by default, centralized in OS vendors, browser makers, etc.

There is another model called the "web of trust"

People say the web of trust is too complicated to implement, then end up either reimplementing those exact concepts (but badly, with lots of caveats) OR simply punting and saying "you can't do that"

We could implement "web of trust" right now:

  • Convert binary trust into trust levels of 100% or 0%
  • Convert all-purpose trust into a set of trust flags
  • Current state is completely replicated but users now have much more power
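
One way to picture the three bullets above is a small data structure. This is only a sketch: the names `Usage`, `TrustEntry`, and `is_trusted`, the particular flags, and the threshold parameter are all illustrative assumptions, not part of any existing trust store.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Usage(Flag):
    """Hypothetical trust flags replacing all-purpose trust."""
    NONE = 0
    SERVER_AUTH = auto()
    CLIENT_AUTH = auto()
    CODE_SIGNING = auto()
    EMAIL = auto()


ALL_USAGES = (Usage.SERVER_AUTH | Usage.CLIENT_AUTH
              | Usage.CODE_SIGNING | Usage.EMAIL)


@dataclass
class TrustEntry:
    level: int      # 0..100; today's binary trust is just 0 or 100
    usages: Usage   # what this CA is trusted for


def is_trusted(entry: TrustEntry, usage: Usage, threshold: int = 100) -> bool:
    """Trusted only if the level meets the threshold and the flag is set."""
    return entry.level >= threshold and usage in entry.usages


# Today's model, replicated exactly: fully trusted for everything, or not at all.
legacy_trusted = TrustEntry(level=100, usages=ALL_USAGES)
legacy_untrusted = TrustEntry(level=0, usages=ALL_USAGES)

# The extra power: a CA trusted for server certificates and nothing else.
scoped = TrustEntry(level=100, usages=Usage.SERVER_AUTH)
```

With entries like `scoped`, a user could accept a vendor's CA for serving TLS without also trusting it to sign code or email.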

How TLS breaks

Most common: Broken trust

TLS trust requires a chain of signatures back to a trusted root to consider a certificate secure. If the root CA is not in the trust store, or there is an intermediate certificate that is untrusted or not provided to build the chain, the end certificate will be considered insecure.
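
The chain walk described above can be sketched as a toy model. Signatures are left out entirely and certificates are reduced to subject/issuer pairs; `Cert`, `build_chain`, and the sample CA names are illustrative assumptions, not any real library's API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(leaf: Cert,
                intermediates: Dict[str, Cert],
                trusted_roots: Set[str]) -> Optional[List[str]]:
    """Follow issuer links from the leaf until a trusted root is reached.

    Returns the list of subjects in the chain, or None if the chain
    cannot be completed.
    """
    chain = [leaf.subject]
    current = leaf
    while current.issuer not in trusted_roots:
        nxt = intermediates.get(current.issuer)
        if nxt is None or nxt.subject in chain:
            # Missing intermediate, root not in the trust store, or a loop
            # (which also covers a self-signed leaf).
            return None
        chain.append(nxt.subject)
        current = nxt
    chain.append(current.issuer)
    return chain


leaf = Cert(subject="kubernetes", issuer="Intermediate CA")
inter = Cert(subject="Intermediate CA", issuer="Root CA")
```

Here `build_chain(leaf, {"Intermediate CA": inter}, {"Root CA"})` succeeds, while leaving out the intermediate or the root makes the same leaf untrusted.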

Certificate is not correctly formed

The hostname or IP requested is checked against the names and IPs configured in the certificate. Hostnames are compared against the Common Name and Subject Alternative Name entries (with some special handling for how those two interact: when a SAN is present, the Common Name is generally ignored).

IPs listed as names must match exactly - no IP wildcards.

The current time is checked against the validity period (Not Before / Not After, set when the cert is issued) of every certificate in the trust chain.
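
The name and time checks above, roughly sketched. This is a simplification under stated assumptions: real clients apply more rules (Common Name fallback, internationalized names, and so on), and `name_matches` and `within_validity` are hypothetical helper names. The dates are the ones from the sample certificate earlier in the deck.

```python
import ipaddress
from datetime import datetime, timezone


def _as_ip(value):
    """Return an IP address object, or None if value is not an IP."""
    try:
        return ipaddress.ip_address(value)
    except ValueError:
        return None


def name_matches(requested: str, san_entry: str) -> bool:
    """Compare the requested host or IP against one SAN entry."""
    req_ip = _as_ip(requested)
    if req_ip is not None:
        # IPs must match exactly; wildcards never apply to them.
        return req_ip == _as_ip(san_entry)
    req = requested.lower().rstrip(".")
    pattern = san_entry.lower().rstrip(".")
    if pattern.startswith("*."):
        # A wildcard stands in for exactly one leftmost label.
        first, _, rest = req.partition(".")
        return bool(first) and rest == pattern[2:]
    return req == pattern


def within_validity(now, windows) -> bool:
    """True only if now falls inside every (not_before, not_after) pair."""
    return all(not_before <= now <= not_after
               for not_before, not_after in windows)


sample_window = (datetime(2019, 11, 16, 18, 11, tzinfo=timezone.utc),
                 datetime(2020, 11, 15, 18, 11, tzinfo=timezone.utc))
```

Passing one window per certificate in the chain to `within_validity` shows why a single expired intermediate is enough to break every leaf beneath it.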


Image: padlocked chain
Credit: stevepb @ PixaBay

Extra security is being applied

Key pinning/certificate pinning: the client tracks and compares certificate info it saw the last time it made this request to the result this time -- changes are interpreted as an attack
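
A minimal trust-on-first-use sketch of that comparison, assuming the client pins a SHA-256 hash of the whole DER-encoded certificate (real implementations often pin the public key instead). `PinStore` and the endpoint string are hypothetical.

```python
import hashlib
import hmac


def fingerprint(der_bytes: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


class PinStore:
    """Remembers the first certificate seen per endpoint."""

    def __init__(self):
        self._pins = {}

    def check(self, endpoint: str, der_bytes: bytes) -> bool:
        seen = fingerprint(der_bytes)
        # First contact records the pin; later contacts must match it.
        pinned = self._pins.setdefault(endpoint, seen)
        return hmac.compare_digest(pinned, seen)


store = PinStore()
first = store.check("api.example.internal:6443", b"certificate v1")
same = store.check("api.example.internal:6443", b"certificate v1")
rotated = store.check("api.example.internal:6443", b"certificate v2")
```

The last call returns False even though nothing malicious happened, which is exactly how a routine certificate rotation ends up looking like an attack to a pinning client.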

Low-security configurations like self-signed leaf certs or use of weak algorithms like SHA-1 are often disallowed (and clients are getting stricter about this all the time)

Why would any of this break in the first place?

Most commonly:


  • mistakes in initial setup
  • rotation errors
  • ordinary neglect

Image: chains left to rust
Credit: dexmac @ PixaBay

Where can TLS break in Kubernetes?

Components talking to the API server

  • Other control plane components
  • kubelets
  • Cluster addons/utilities you've installed

API server talking to other cluster components

  • etcd
  • kubelets

Client verification often has its own certificate setup

Client verification takes place on most control plane transactions

Usually this is set up using the same CA root, etc. as the server configuration, but this is not always true and there are reasons you might not want it to be

Your own applications

...those are why you're even bothering with all this, right?

How do I debug any of this??

Examining output and logs

Easy if you're centrally logging -- but even if you're not:

kubectl logs for things running as Kubernetes pods

journalctl for control plane components running as systemd services

Examining the certificates directly

openssl is your Swiss Army knife when it comes to all things SSL/TLS

openssl x509 -noout -text -in [certificate file] decodes a cert file and dumps it to stdout

openssl s_client -connect host:port -showcerts connects live to an SSL endpoint and dumps cert blobs and info

  • If that endpoint isn't exposed outside your cluster, run a debug container on the host, attach it to the same network namespace as the target, then run openssl from inside it
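
If the debug container has Python but no openssl binary, the standard library can pull the leaf certificate for later inspection. A sketch only: unlike -showcerts it returns just the leaf, it does not validate anything, and the host and port in the comment are placeholders.

```python
import hashlib
import ssl


def fetch_leaf_pem(host: str, port: int) -> str:
    """Fetch the server's leaf certificate as PEM, without validating it."""
    return ssl.get_server_certificate((host, port))


def sha256_fingerprint(pem: str) -> str:
    """SHA-256 fingerprint of a PEM certificate's DER encoding."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


# Example (placeholder endpoint, requires network access):
#   pem = fetch_leaf_pem("10.3.0.1", 443)
#   print(pem)
#   print(sha256_fingerprint(pem))
```

The PEM text can be saved to a file and decoded elsewhere with openssl x509 -noout -text -in [certificate file].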

Demo!

Further info

The current X.509 standard: RFC 5280 (note updates in header)
OWASP article on pinning
EFF article about detecting vendors installing "trusted" certificates to intercept and alter secure traffic
How to run a CA with OpenSSL
Kubernetes the Hard Way

Relevant talks from right here!

Connor Gilbert's talk earlier today
Tomorrow's talk from Duffie Cooley and Nicholas Lane

Questions?

(Thank you!)

Slides: https://bit.ly/2NL6Ut9+
Demo code: GitHub, tarball