Running HAProxy on Docker Containers in Kubernetes

Why Use HAProxy

When we decided to use Kubernetes as our container orchestration solution, we had the opportunity to learn all of the Kubernetes terminology. I was familiar with pods, replication controllers, and services from previous work in Kubernetes, but since then Kubernetes had introduced Deployments, DaemonSets, load balancers, and Ingress resources (just to name a few).

We needed to expose our API endpoints for consumption both inside and outside of the Kubernetes cluster. Deciding which of the many Kubernetes resources we needed in time for our beta release meant learning about every option, testing it, and bracing for the inevitable issues that surface when actual traffic hits. An Ingress seemed like a reasonable way of exposing our services, but many questions loomed. Will Ingress even be around in the next version of Kubernetes? Things are changing quickly, and I am hesitant to spend lots of time configuring a resource that may not be supported in a year.

Can we enforce HSTS and ProxyProtocol? How easy is it to route based on headers and paths, handle redirects, and route both external and internal API calls? Our team has used HAProxy before to do all of these things, and we like how flexible it is while handling very high load and consuming very few resources.

To make things even easier, HAProxy is stateless, making it a perfect candidate for containerization. After a bit of investigation (time is limited when trying to get ready for a beta launch), I determined that the most flexible setup for us going forward would involve running HAProxy in a Deployment, and exposing it with a NodePort service that is reachable by an AWS Elastic Load Balancer.

The Docker Image

At Blue Matador, I am sort of pioneering our Docker experience in production, so I try to take an approach that balances doing things the Docker way and doing things in a way that is familiar to our engineering team so we can run things smoothly. While there are many existing public Docker images for HAProxy, I chose to roll my own so we could have the convenience of the tools we are used to, while also keeping things short and simple.

Our Docker image basically consists of Ubuntu 16.04 (bloated, yes, but familiar) with some essentials added on like telnet and dnsutils, and then copying our haproxy.cfg into the container. To make debugging internal calls easier, I also install rsyslog and tail the HAProxy logs in the CMD.

I know this is less than ideal, but HAProxy only exposes logs via syslog and this is by far the quickest way to get up and running. Copying the config into the image also gives us a nice and simple way of updating our HAProxy config without managing Docker volumes. As our team gets more familiar with our production system running in Docker, we will likely base our HAProxy image on a smaller OS to keep things light.

Earlier I mentioned enforcing HSTS and ProxyProtocol, redirecting, and routing based on path and header. Below is a cleaned-up version of our HAProxy config in case you are interested in doing any of these things.

global
  chroot /var/lib/haproxy
  pidfile /var/run/haproxy.pid
  daemon
  maxconn 4096
  stats socket /run/haproxy/admin.sock mode 660 level admin

defaults
  mode http
  balance leastconn
  retries 3
  option httpchk GET /health HTTP/1.0\r\nHost:\ app.example.com
  option http-server-close
  option dontlognull
  timeout connect    30ms
  timeout check    1000ms
  timeout client  30000ms
  timeout server  30000ms

frontend stats
  bind *:26999
  mode http
  stats enable
  stats uri /

frontend app.example.com-http
  bind *:8000
  log /dev/log local2 info
  option httplog

  # Only allow HTTPS unless internal
  acl HOST_internal hdr_reg(host) -i ^HAProxy
  redirect scheme https code 301 if !HOST_internal

  default_backend BACKEND_app

frontend app.example.com-https
  bind *:9000 accept-proxy
  log /dev/log local2 info
  option httplog

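  # HSTS (the max-age value here is illustrative; pick one that fits your policy)
  http-response set-header Strict-Transport-Security max-age=31536000

  # Pass the client address and original scheme on to the backend
  http-request set-header X-Forwarded-For %[src]
  http-request set-header X-Forwarded-Proto https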

  # Redirect to somewhere else if the domain is different
  acl HOST_app  hdr(host) -i app.example.com
  redirect location https://www.example.com/ code 301 if !HOST_app

  default_backend BACKEND_app

backend BACKEND_app
  server app app:9000
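To make the redirect rules above concrete, here is a rough Python sketch of the decision each frontend makes based on the Host header. It is only a model for reasoning about the ACLs, not how HAProxy evaluates them internally. The `route` function and its return strings are names made up for this illustration, and the example internal host `haproxy-service` assumes the Service name used later in this post.

```python
import re

# Mirrors: acl HOST_internal hdr_reg(host) -i ^HAProxy
INTERNAL_HOST = re.compile(r"^HAProxy", re.IGNORECASE)


def route(port: int, host: str) -> str:
    """Return the action a frontend would take for a given Host header.

    Hypothetical helper: the return values are labels for this sketch only.
    """
    if port == 8000:
        # Plain HTTP: only in-cluster callers may reach the backend.
        if INTERNAL_HOST.search(host):
            return "backend:BACKEND_app"
        return "redirect:https"
    if port == 9000:
        # Mirrors: acl HOST_app hdr(host) -i app.example.com (exact, case-insensitive)
        if host.lower() == "app.example.com":
            return "backend:BACKEND_app"
        return "redirect:https://www.example.com/"
    raise ValueError(f"no frontend is bound to port {port}")


if __name__ == "__main__":
    for port, host in [
        (8000, "haproxy-service:8000"),
        (8000, "app.example.com"),
        (9000, "app.example.com"),
        (9000, "other.example.com"),
    ]:
        print(port, host, "->", route(port, host))
```

Note that the Host header is supplied by the client, so this check distinguishes callers by convention rather than by any property of the connection itself.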

We rely on the ELB to terminate SSL and send traffic to port 9000. Only internal calls are allowed on port 8000, and other clients are redirected to the SSL endpoint. We tell the two apart by the Host header: only clients inside the cluster can reach HAProxy through the service we set up later, so a Host header that starts with that service's name marks a call as internal. Once your haproxy.cfg is ready, simply build the image, tag it, and push it to your Docker repo. An example Dockerfile is included below as a starting point.

FROM ubuntu:16.04

RUN apt-get update \
    && apt-get install -y haproxy rsyslog \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /run/haproxy

COPY haproxy.cfg /etc/haproxy/haproxy.cfg

CMD service rsyslog start && haproxy -f /etc/haproxy/haproxy.cfg && tail -F /var/log/haproxy.log

The Kubernetes Config

Now that we have our Docker image ready to go, we can work on the Kubernetes config for actually running HAProxy. As mentioned earlier, I went with a Deployment resource to manage the lifecycle of the container. I had previous experience with ReplicationControllers, so a Deployment was a clear improvement for me. Basically, we just template out what the running container needs, label it so we can refer to it from other resources, and define a health check against the HAProxy stats port.

haproxy_deployment.yaml

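# apps/v1 replaces apps/v1beta1, which was removed in Kubernetes 1.16
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy
spec:
  replicas: 3
  # apps/v1 requires an explicit selector that matches the pod template labels
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy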
    spec:
      containers:
      - name: haproxy
        image: <wherever you host your HAProxy docker image>
        ports:
        - containerPort: 8000
          name: http
        - containerPort: 9000
          name: https
        - containerPort: 26999
          name: stats
        readinessProbe:
          initialDelaySeconds: 15
          periodSeconds: 5
          timeoutSeconds: 1
          successThreshold: 2
          failureThreshold: 2
          tcpSocket:
            port: 26999

We run it with kubectl create -f haproxy_deployment.yaml and wait for the pods to run using kubectl get pods. If any of your pods fail to start, it could be because HAProxy tries to resolve DNS for every configured backend at startup. If any of the backends you reference do not exist yet, create them now and then recreate the HAProxy Deployment.
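One way to check for this before recreating the Deployment is to confirm that every backend hostname resolves from inside the cluster. The sketch below is a minimal illustration, not part of our setup: the `unresolvable` helper is a made-up name, the resolver is injectable so the logic can be exercised without real DNS, and the hostname `app` is taken from the backend in the config above.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable(
    hosts: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return the hostnames that fail to resolve (hypothetical helper)."""
    missing = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(host)
    return missing


if __name__ == "__main__":
    # Run this from a pod in the same namespace as HAProxy so that
    # cluster-internal service names are visible to the resolver.
    backends = ["app"]
    print("unresolvable backends:", unresolvable(backends))
```

An empty list means every backend name resolved, so a failing HAProxy pod is probably failing for some other reason.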

Now that our pods are running, it’s time to expose them to the cluster.

haproxy_service.yaml

kind: Service
apiVersion: v1
metadata:
  name: haproxy-service
spec:
  selector:
    app: haproxy
  ports:
    - name: http
      protocol: TCP
      port: 8000
      nodePort: 31000
    - name: https
      protocol: TCP
      port: 9000
      nodePort: 31001
    - name: stats
      protocol: TCP
      port: 26999
      nodePort: 31002
  type: NodePort

Again, use kubectl create -f haproxy_service.yaml.
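Once the service exists, a quick sanity check is to confirm that each NodePort accepts TCP connections on a cluster node, which is the same kind of check the readiness probe performs against the stats port. This is a minimal sketch under assumptions: `port_open` is a hypothetical helper, and the node address in the usage note is a placeholder you would replace with one of your own nodes.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

For example, with a placeholder node address, `[port_open("10.0.0.10", p) for p in (31000, 31001, 31002)]` should return three True values when the service is healthy. A successful connect only shows that something is listening; it does not exercise the ProxyProtocol handshake on port 31001.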

Now, to expose our services to the internet, we need to create an ELB. We use Terraform to manage our AWS resources, and I have included below an example Terraform config for an ELB that handles SSL termination and enables ProxyProtocol. If you are unfamiliar with Terraform, the AWS documentation for Classic Load Balancers describes how to enable ProxyProtocol.

resource "aws_elb" "app" {
  name = "app"

  subnets = [
    "subnet-xxxxxxxx",
  ]

  internal = false

  security_groups = [
    "${aws_security_group.app_elb.id}",
  ]

  listener {
    instance_port     = 31000
    instance_protocol = "tcp"
    lb_port           = 80
    lb_protocol       = "tcp"
  }

  listener {
    instance_port      = 31001
    instance_protocol  = "tcp"
    lb_port            = 443
    lb_protocol        = "ssl"
    ssl_certificate_id = "arn:your:certificate:arn"
  }

  listener {
    instance_port     = 31002
    instance_protocol = "tcp"
    lb_port           = 26999
    lb_protocol       = "tcp"
  }

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 6
    timeout             = 5
    target              = "TCP:31001"
    interval            = 10
  }

  cross_zone_load_balancing = true
  idle_timeout              = 60
  tags = {
    Name = "app.example.com"
  }
}

resource "aws_proxy_protocol_policy" "app_elb" {
  load_balancer  = "${aws_elb.app.name}"
  instance_ports = ["31001"]
}

resource "aws_security_group" "app_elb" {
  name        = "app-elb"
  description = "security group for app.example.com elb"
  vpc_id      = "vpc-xxxxxxxx"
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "app-elb"
  }
}
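With the proxy protocol policy enabled, the ELB prepends a short text header to each connection it opens to instance port 31001, and that header is what accept-proxy on the HAProxy bind line consumes. To show what that header carries, here is a rough Python sketch that parses a version 1 header line. It assumes the ELB sends the human-readable version 1 format; `ProxyInfo` and `parse_proxy_v1` are names invented for this example, and the addresses in the usage note are made up.

```python
from typing import NamedTuple, Optional


class ProxyInfo(NamedTuple):
    protocol: str   # "TCP4" or "TCP6"
    src_addr: str   # original client address
    dst_addr: str   # address the client connected to
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> Optional[ProxyInfo]:
    """Parse one PROXY protocol v1 header line.

    Returns None for "PROXY UNKNOWN", where no address information is given.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if parts[0] != "PROXY":
        raise ValueError("missing PROXY signature")
    if len(parts) >= 2 and parts[1] == "UNKNOWN":
        return None
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY v1 header")
    return ProxyInfo(parts[1], parts[2], parts[3], int(parts[4]), int(parts[5]))
```

For example, `parse_proxy_v1(b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 31001\r\n")` yields a `ProxyInfo` whose `src_addr` is the original client address, which is how HAProxy can still see the real client IP behind a TCP-mode load balancer.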

DNS Issues

So we set up the HAProxy service in Docker successfully, configured the ELB to correctly handle SSL connections, and everything was great! That is, until a few days later, when I did end-to-end testing to make sure all of our services in Kubernetes played nicely together. I began noticing that some of our internal API calls were timing out after 5 seconds.

Considering some of those calls should not have even left the EC2 instance they were on, that was very alarming. After digging around in the application logs and HAProxy logs I noticed that the calls were not even making it to HAProxy from a container running on the same node. When you have something consistently failing on an interval like 5 seconds, you know there’s a timeout happening.

The timeout turned out to be in DNS resolution. So why was it taking 5 seconds to resolve DNS for these internal calls? It turns out that DNS lookups were being made for both A and AAAA (IPv6) records in parallel, and the resolver waited for both responses. The internal DNS that was set up by default in Kubernetes did not respond to lookups for AAAA records and was timing out after, you guessed it, 5 seconds. The fix is simple really, and all that is required is adding one line to the /etc/resolv.conf of the Kubernetes nodes (or wherever your containers inherit their resolv.conf from).

options single-request

This simple configuration change makes DNS lookups run sequentially rather than in parallel, succeeding when the A record is returned.
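If you suspect the same problem, timing lookups from inside a container can help confirm it. The sketch below is a rough diagnostic under assumptions: `lookup_seconds` is a hypothetical helper, the resolver is injectable so the timing logic can be tried without real DNS, and it relies on the common resolver behavior where an unspecified address family asks for both A and AAAA records while AF_INET asks for A records only.

```python
import socket
import time


def lookup_seconds(host, family=socket.AF_UNSPEC, resolve=socket.getaddrinfo):
    """Return how long a name lookup takes, in seconds, whether or not it succeeds."""
    start = time.monotonic()
    try:
        resolve(host, None, family)
    except socket.gaierror:
        pass  # a failed lookup still tells us how long the resolver waited
    return time.monotonic() - start


if __name__ == "__main__":
    host = "app"  # backend name from the config above; replace as needed
    print("A and AAAA:", round(lookup_seconds(host, socket.AF_UNSPEC), 3), "s")
    print("A only:    ", round(lookup_seconds(host, socket.AF_INET), 3), "s")
```

A lookup for both record types that takes about 5 seconds while the A-only lookup returns quickly would be consistent with the behavior described above.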

Conclusion

When in a time crunch, it can be difficult to balance trying new technologies, preparing for future reliability, and making sure your engineering and ops teams do not face a steep learning curve when working with the system.

Using Docker for more of our system components is something I am adamant about because it makes managing development setup, testing, and delivering updates quickly much easier overall. By conceding on certain items (Ubuntu-based images, using HAProxy instead of Kubernetes-only creations, and running syslog in the same container as HAProxy), I was able to get our production cluster up and running smoothly in time for the beta launch of our Lumberjack and Watchdog products, and easily get the rest of our engineering team up to speed on how to respond to issues.


Keilan Jackson

Author Bio

Keilan’s specialty is bringing development, quality assurance, and operations together for efficient delivery and maintenance of software. He believes that functional programming, typed JavaScript, and Docker make the world a better place. His free time is spent sampling Utah’s amazing micro brews and co-parenting cats Artemis and Queue with his boyfriend.
