How to secure thousands of websites with Let's Encrypt certificates
The world is shifting towards HTTPS everywhere, as evidenced by Google's announcement that Chrome would start labeling all HTTP websites as insecure beginning with Chrome 68 in July 2018. This is not a new development: Google has been encouraging the move for years by boosting search engine rankings for HTTPS sites and marking input fields on HTTP websites as insecure.
HTTPS everywhere is a noble idea and a move towards increased security and privacy on the Internet, but it presents an interesting problem for hosting providers. They often have not just one or two domains and subdomains to secure, but thousands or even hundreds of thousands of unique domains, registered and configured in an automated workflow.
The solution comes in the form of a little piece of software called lua-resty-auto-ssl which offers "On the fly (and free) SSL registration and renewal inside OpenResty/nginx with Let's Encrypt."
The concept is straightforward: OpenResty (nginx plus LuaJIT and a handful of useful modules and libraries) functions as a proxy in front of your websites. Whenever an HTTPS request arrives, lua-resty-auto-ssl checks whether you already have a certificate for the domain; if not, it requests a new one from Let's Encrypt and, upon success, serves the client, all in one go. The initial request takes a second or two longer while Let's Encrypt validates domain ownership and the certificate is installed on the proxy.
lua-resty-auto-ssl is easy to install if you follow the instructions in the README, and you can be up and running with a proof of concept in minutes.
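For orientation, a proof-of-concept configuration along the lines of the project's README might look like the sketch below. The resolver address, shared dict sizes and fallback certificate paths are assumptions to adapt to your environment, and the allow_domain callback is deliberately left out here because it needs a policy of your own.

```nginx
events {
  worker_connections 1024;
}

http {
  # Shared memory used by lua-resty-auto-ssl (sizes are illustrative).
  lua_shared_dict auto_ssl 1m;
  lua_shared_dict auto_ssl_settings 64k;

  # A resolver is needed for outbound requests to Let's Encrypt.
  resolver 8.8.8.8 ipv6=off;

  init_by_lua_block {
    auto_ssl = (require "resty.auto-ssl").new()
    -- An allow_domain callback must be set here before init,
    -- otherwise no certificates are requested.
    auto_ssl:init()
  }

  init_worker_by_lua_block {
    auto_ssl:init_worker()
  }

  # HTTPS server: certificates are looked up or issued during the handshake.
  server {
    listen 443 ssl;
    ssl_certificate_by_lua_block {
      auto_ssl:ssl_certificate()
    }
    # Static fallback certificate, required for nginx to start
    # (paths are assumptions).
    ssl_certificate /etc/ssl/resty-auto-ssl-fallback.crt;
    ssl_certificate_key /etc/ssl/resty-auto-ssl-fallback.key;
  }

  # HTTP server: answers the ACME challenge from Let's Encrypt.
  server {
    listen 80;
    location /.well-known/acme-challenge/ {
      content_by_lua_block {
        auto_ssl:challenge_server()
      }
    }
  }

  # Internal server used by the certificate tasks.
  server {
    listen 127.0.0.1:8999;
    client_body_buffer_size 128k;
    client_max_body_size 128k;
    location / {
      content_by_lua_block {
        auto_ssl:hook_server()
      }
    }
  }
}
```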
In practice, running this for tens of thousands of domains, we discovered a few things you might need to take care of:
- Let's Encrypt rate limits
- High availability
- Dynamic resolution of backend servers
Let's Encrypt rate limits
Let's Encrypt has a variety of rate limits to ensure fair usage by everyone, but most importantly it limits the number of failed requests you can have per hour. This means we need to ensure that we don't request certificates for domains that we don't handle and thus can't validate.
To prevent this, lua-resty-auto-ssl uses an allow_domain function (configured in your nginx config) that is called before a certificate request is made to Let's Encrypt. By default this function returns false for all domains and you need to change this to something more useful for your specific setup before it will work.
There are quite a few options here, such as a whitelist of domains, or a call to an API or database that approves the domain for you.
The most basic option would just return true for everything, but that also opens you up to abuse and to spamming Let's Encrypt with invalid requests:
    auto_ssl:set("allow_domain", function (domain)
        return true
    end)
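A whitelist, one of the options mentioned above, could be sketched as follows. The domain names and the allowed_domains table are hypothetical; in a real setup the list would come from your own provisioning data.

```nginx
init_by_lua_block {
  auto_ssl = (require "resty.auto-ssl").new()

  -- Hypothetical whitelist: only these exact hostnames may get certificates.
  local allowed_domains = {
    ["example.com"] = true,
    ["www.example.com"] = true,
  }

  auto_ssl:set("allow_domain", function(domain)
    -- Unknown hostnames yield nil, so compare explicitly to get a boolean.
    return allowed_domains[domain] == true
  end)

  auto_ssl:init()
}
```

A table lookup like this is cheap enough to run on every handshake, but it only suits setups where the domain list changes rarely, since updating it means reloading the configuration.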
The setup we've gone with is a DNS check to verify that the domain in question points to our OpenResty servers (replace <Our backend FQDN> with your own):
    auto_ssl:set("allow_domain", function (domain)
        -- Requires a `lua_shared_dict dns_cache` entry in the http block.
        local DNS_Cache = require("resty.dns.cache")
        local dns, err = DNS_Cache.new({
            dict = "dns_cache",
            negative_ttl = 5,
            max_stale = 300,
            resolver = {
                nameservers = {"8.8.8.8", "8.8.4.4"}
            }
        })
        if not dns then
            ngx.log(ngx.ERR, "failed to create DNS cache: ", err)
            return false
        end

        local answers, err, stale = dns:query(domain)
        if err then
            ngx.log(ngx.ERR, err)
            if stale then
                -- Fall back to the stale cached answer.
                answers = stale
            else
                -- There is no HTTP response to modify during the TLS
                -- handshake, so reject the domain instead.
                return false
            end
        end
        if not answers then
            ngx.log(ngx.ERR, "failed to query the DNS server for ", domain, ": ", err)
            return false
        end
        if answers.errcode then
            ngx.log(ngx.ERR, "checking ", domain, " server returned error code: ",
                    answers.errcode, ": ", answers.errstr)
            return false
        end
        for _, ans in ipairs(answers) do
            -- If the result is a CNAME to our backend, request an SSL certificate.
            if ans.cname == "<Our backend FQDN>" then
                ngx.log(ngx.STDERR, "domain ", domain, " verified by dns found ", ans.cname)
                return true
            end
        end
        ngx.log(ngx.STDERR, "domain ", domain, " rejected by dns")
        return false
    end)
If the domain is found to be served from your servers, a certificate request is made; otherwise the request is served with a fallback certificate.
High availability
Running a single auto-ssl server is an obvious single point of failure. We wanted to be able to run multiple servers in an AWS Auto Scaling group based on load without provisioning certificates again for each new server, so we needed a shared certificate storage backend. lua-resty-auto-ssl supports file and Redis storage adapters out of the box, with Redis being the choice for shared storage. We're using a Multi-AZ Amazon ElastiCache deployment for our implementation.
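Switching to the Redis adapter is a matter of two settings in the init block. The sketch below uses the storage_adapter and redis options documented in the README; the hostname is a hypothetical ElastiCache endpoint, not our actual one.

```nginx
init_by_lua_block {
  auto_ssl = (require "resty.auto-ssl").new()

  -- Store certificates in Redis so every proxy in the group shares them.
  auto_ssl:set("storage_adapter", "resty.auto-ssl.storage_adapters.redis")
  auto_ssl:set("redis", {
    host = "certs.example.cache.amazonaws.com", -- hypothetical endpoint
    port = 6379,
  })

  auto_ssl:init()
}
```

With this in place, a freshly launched proxy finds existing certificates in Redis on its first handshake for a domain instead of requesting new ones.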
Dynamic resolution of backend servers
The destination for requests after SSL termination is configured like this in nginx:
    location / {
        proxy_pass http://destination-hostname.example.com;
    }
nginx resolves this hostname when it loads the configuration, which poses a challenge in an environment where the destination isn't static.
Since we are AWS-based, we proxy the terminated SSL requests to an ELB backed by an Auto Scaling group. This is great for scalability, but Elastic Load Balancers should always be accessed by their DNS name: the IPs behind them change quite often, and the DNS answer is multivalued so that traffic is spread across all the availability zones you have backends in.
We resolved this by defining an upstream with the jdomain module to force re-resolution at an interval, then using that upstream as our proxy destination:
    upstream backend {
        jdomain ourbackend-elb-45472253.eu-west-1.elb.amazonaws.com interval=15 max_ips=3;
    }
    location / {
        proxy_pass http://backend;
    }
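If the jdomain module is not available in your build, a different technique that stock nginx supports is to put the hostname in a variable: nginx then resolves it at request time through the configured resolver and re-resolves once the valid period expires. This is not the setup described above, only a sketch of an alternative; the resolver address and hostname are assumptions.

```nginx
# Assumption: 10.0.0.2 stands in for the DNS resolver of your network.
# valid=15s overrides the record TTL, so names are re-resolved every 15 seconds.
resolver 10.0.0.2 valid=15s;

location / {
    # Using a variable in proxy_pass defers name resolution to request time.
    set $backend_host "destination-hostname.example.com";
    proxy_pass http://$backend_host;
}
```

Unlike an upstream block, this form does not give you upstream features such as keepalive connection pools, which is one reason to prefer the upstream-based approach where it is available.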
Other things to be aware of:
Transparency records on initial access
Since certificates are provisioned on first access, certificate transparency records might not be available yet for the first visitor to a domain. This will cause Chrome to complain that the website isn't secure. It'll work on the second attempt, but we've taken to adding a GET of the website to our automation workflow when we first set up a customer's domain.
Certificate renewals
lua-resty-auto-ssl defaulted to trying to renew all certificates every 86400 seconds. This caused some interruption to our service when one of our proxies would try to renew all our certificates at once. We developed a fix that stores expiry dates with the certificates and only attempts renewal on those that are about to expire. This has been included in the main repository, but if you have as many domains as we do it can still cause problems. We've mitigated this by having a separate instance that is dedicated to performing renewals and doesn't receive user traffic.
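The check interval mentioned above is exposed as the renew_check_interval setting in the README. A minimal sketch of changing it is shown below; the value is purely illustrative and not a recommendation from our setup.

```nginx
init_by_lua_block {
  auto_ssl = (require "resty.auto-ssl").new()

  -- Seconds between renewal checks across all domains (default 86400).
  -- 172800 (two days) is an arbitrary example value.
  auto_ssl:set("renew_check_interval", 172800)

  auto_ssl:init()
}
```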