Caddy: Customer Domains
Why

Many services require white labeling and as a result need TLS certificates bound to customer domains. With few customers this is usually done manually, using Let's Encrypt with DNS or HTTP validation.

How

Caddy automatically obtains TLS certificates from Let's Encrypt when/if a domain is pointed at the endpoint Caddy serves.

The major drawback is that on AWS we cannot use AWS WAF / CloudFront, since those require certificates hosted in AWS ACM. As a result we lose some of the monitoring features of AWS.

How Caddy works
  1. The user creates an account on the product's side, so that the product can confirm to Caddy that the domain belongs to a customer

  2. The user then receives instructions from the product to set up a CNAME/A DNS record, and sets it up

  3. As soon as the custom domain is requested for the first time, Caddy asks the product's backend (the endpoint is set in Caddy's config) to confirm that the domain should be served by Caddy

  4. If confirmed, Caddy requests a certificate from Let's Encrypt

  5. Caddy saves/caches the TLS certificate and related files in the configured storage (AWS DynamoDB?)

  6. Subsequent requests use the cached certificate.
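The flow above maps onto Caddy's on-demand TLS feature. A minimal Caddyfile sketch — the `ask` URL, table name, and upstream are illustrative, and the `storage dynamodb` line assumes a DynamoDB storage plugin is compiled into the Caddy binary:

```
{
	# Caddy calls this endpoint with ?domain=<hostname>;
	# an HTTP 200 response allows certificate issuance
	on_demand_tls {
		ask http://product-backend.internal/api/domain-allowed
	}
	# assumes the DynamoDB storage plugin; table name is illustrative
	storage dynamodb caddy_certs
}

https:// {
	tls {
		on_demand
	}
	reverse_proxy product-backend.internal:8080
}
```

With `on_demand`, issuance happens during the first TLS handshake for a hostname, which is why the product backend must be able to confirm ownership before Let's Encrypt is contacted.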

Architecture Diagram
Option 1: More AWS
Option 2: More custom

Options

| | Option 1 (More AWS) | Option 2 (More custom) |
| --- | --- | --- |
| Description | Simpler, since we will use more AWS components. | A bit more complex, but probably with full coverage. |
| Development cost | Low (but still exists). | Slightly higher than Option 1. |
| Operational cost | Relatively low (ALB, EKS, Metrics, Logs, WAF). | Higher than Option 1: inter-service communication data must additionally be pushed into monitoring, and new moving parts are introduced (Linkerd + Prometheus). |
| Simplicity (maintenance effort) | Relatively simple; we will have standard components. | A bit more complex than Option 1; Linkerd added. |
| Reliability | Relatively more reliable, as we will use more managed AWS parts. | Less reliable, as we have to maintain extra Prometheus + Linkerd. |
| Coverage | ~90%: Caddy itself remains unprotected, or sits behind Caddy's own WAF, whose protection quality and monitoring integration are unknown. | Close to 100%. |
| Cloud agnostic | Bound to AWS. | Can run on any cloud, including on-premise. |

Considering the complexity and cost of Option 2, it makes sense to start with Option 1 and shift to Option 2 if Option 1 proves insufficient.

Option 1: PoC to Code
  1. Create Dockerfile in git/dasmeta/docker-images (standard Caddy image with the DynamoDB storage plugin installed)

  2. Create caddy public repository in docker-hub/dasmeta for custom docker images

  3. Create terraform-aws-caddy repository

  4. Create tf module

    1. helm_release resource uses base chart

    2. helm_release with base chart creates:

      1. deployment

        1. uses dasmeta:caddy-x.y.z docker image (repository is hardcoded, image tag may be dynamic)

        2. uses secret data to authenticate to AWS via an IAM user (needs to be changed to work with a Service Account and IAM role)

        3. uses configmap data to get Caddyfile config and mounts it as a volume at /etc/caddy/Caddyfile (hardcoded)

      2. service

        1. type is LoadBalancer (hardcoded)

        2. listens to 443 port

        3. has annotation: service.beta.kubernetes.io/aws-load-balancer-type: nlb (hardcoded / this somehow has to be attached to an EIP via code)

      3. configmap

        1. stores Caddyfile config (create var.caddy_config parameter which has default caddy config)

      4. secret

        1. stores AWS IAM user’s credentials (refactor)
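Step 1 above (a standard Caddy image with the DynamoDB plugin) can be sketched with Caddy's official builder image and `xcaddy`; the plugin module path is an assumption and should be verified against the plugin's own documentation:

```dockerfile
# Build a custom Caddy binary with the DynamoDB certificate-storage plugin
FROM caddy:2-builder AS builder
# module path is an assumption — check the plugin's docs
RUN xcaddy build --with github.com/silinternational/certmagic-storage-dynamodb/v3

# Swap the stock binary in the official runtime image for the custom build
FROM caddy:2
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
```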

Variables list:

  • release_name

  • release_namespace

  • docker_image_tag ?

  • caddy_config
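A sketch of the Terraform module described in step 4, wiring the variables above into a `helm_release`; the chart name and values keys are assumptions, not the real base chart's interface:

```hcl
variable "release_name" {
  type = string
}

variable "release_namespace" {
  type = string
}

variable "docker_image_tag" {
  type    = string
  default = "latest" # illustrative default
}

variable "caddy_config" {
  type    = string
  default = "" # default Caddyfile content would go here
}

# helm_release backed by the shared base chart
# (chart source and values schema are assumptions)
resource "helm_release" "caddy" {
  name      = var.release_name
  namespace = var.release_namespace
  chart     = "base"

  values = [yamlencode({
    image = "dasmeta/caddy:${var.docker_image_tag}" # repository hardcoded, tag dynamic
    configmap = {
      Caddyfile = var.caddy_config # mounted at /etc/caddy/Caddyfile
    }
    service = {
      type = "LoadBalancer"
      annotations = {
        "service.beta.kubernetes.io/aws-load-balancer-type" = "nlb"
      }
    }
  })]
}
```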

Resources
Dashboard Creation