Introduction
I first created this website last year, piecing it all together by hand in the AWS console. Shortly after getting it running, I neglected to contribute to it. About a year later I found myself wanting to contribute on a more regular cadence, but had no idea how I had gotten it up and running in the first place. Rather than playing detective and uncovering my old steps, I decided to gut the whole thing and start over (of course).
This past year I have gotten a bit more exposure to HashiCorp's Terraform and decided that this time around I would employ the infrastructure-as-code paradigm to provision the infrastructure required to run my website. This way, in the future I can just peek at my Terraform scripts to understand how I pieced it all together.
For this post I am using & building off of the work from this article posted by runatlantis.io, so shout out to those folks for their work. Additionally, at the end I address an issue the runatlantis team cited as a reason for migrating off of AWS, by using the AWS CLI in a shell script to create a CloudFront invalidation upon deployment.
Project Wish List
I have a small wish list for my project:
- Website Served over SSL (This will be accomplished with CloudFront and AWS Certificate Manager)
- A subdomain for hosting downloads. https://files.johnsosoka.com is the target subdomain; it will point to its own S3 bucket, separate from the rest of the site.
- Both www.johnsosoka.com & johnsosoka.com to resolve (this requires 2 S3 buckets, one to redirect to the other)
- Easy to read & pick up later. If I abandon this again, I should be able to quickly understand how it runs.
Requirements
To accomplish this, we will need properly configured tools. Terraform's AWS provider will use the credentials we configure with the AWS CLI to inspect the state of our resources and provision new ones.
Install AWS CLI
Installing the AWS CLI on Linux is pretty straightforward. You can find a more up-to-date guide from Amazon here.
Download the zip, unzip it, and install:
# download
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
# unzip
unzip awscliv2.zip
# install
sudo ./aws/install
Verify the installation:
aws --version
You should see the version returned & maybe a few other details. My output was:
aws-cli/2.2.29 Python/3.8.8 Linux/5.11.0-34-generic exe/x86_64.ubuntu.20 prompt/off
Congrats, you have installed the AWS CLI. Next, create an IAM user that can be accessed with an access key; this user is what Terraform will use to provision resources on AWS, so be sure that it has the relevant permissions.
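If you prefer the command line over the console, and already have credentials with permission to manage IAM, something like the following sketch should work (the user name is mine, and AdministratorAccess is broader than strictly necessary, so scope it down where you can):
# create the user Terraform will act as
aws iam create-user --user-name terraform-deployer
# attach permissions (again, narrow this for anything serious)
aws iam attach-user-policy --user-name terraform-deployer --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
# generate the access key pair used by `aws configure` below
aws iam create-access-key --user-name terraform-deployer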
Once your user is set up, issue the following command to configure:
aws configure
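This prompts for the access key pair belonging to the IAM user above, plus a default region and output format, and writes them to ~/.aws/credentials and ~/.aws/config, which Terraform's AWS provider will pick up. A typical session looks roughly like this (the values are placeholders):
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: <your secret access key>
Default region name [None]: us-east-1
Default output format [None]: json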
Install Terraform
For up-to-date Terraform installation instructions, check the HashiCorp website.
# install dependencies
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
# add HashiCorp's GPG key
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
# add the official HashiCorp repository
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
# update & install terraform
sudo apt-get update && sudo apt-get install terraform
Verify the installation by issuing the help command:
terraform -help
If the command resolves and you get some help text, then you should be all set to proceed.
Writing the Terraform Configuration
I split my Terraform configuration into multiple files to make it more approachable later, when I inevitably start poking around to figure out how I set this up. I have grouped the configuration by resource, or by resources that are closely related.
It is my understanding that, under the hood, Terraform will search for any *.tf files in the working directory and evaluate them as though they were one giant file. You will see throughout that configurations contain references to values defined in other files, despite no explicit import or reference to those files existing.
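For reference, the files discussed below end up looking something like this (the deploy script's name is arbitrary and shown at the end of the post):
.
├── variables.tf   # domain name building blocks
├── main.tf        # provider configuration & locals
├── s3.tf          # the three S3 buckets
├── acm-cdn.tf     # ACM certificate & CloudFront distributions
├── route53.tf     # hosted zones & DNS records
└── deploy.sh      # deployment script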
First up is variables.tf. In this file, I just define the basic building blocks of the website: the domain name and the top-level domain. These variables are later referenced in a locals block within main.tf, where we assemble the more complete values that are referenced throughout the configuration.
variables.tf
variable "domain_name" {
type = string
default = "johnsosoka"
description = "This is your sites domain name. DO NOT include the domain com|org|net|etc>"
}
variable "domain" {
type = string
default = "com"
description = "This is the domain that your site belongs to <com|org|net|etc>"
}
variable "file_share_subdomain" {
default = "files"
description = "This is the file share subdomain name for your website. ex.) files.example.com"
}
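If you want to reuse this for a different site, the defaults above can be overridden without editing the file, for example with a terraform.tfvars file that Terraform loads automatically (a sketch; the values are placeholders):
domain_name          = "example"
domain               = "org"
file_share_subdomain = "files"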
Next up is main.tf; there isn't much to see here, just a bit of housekeeping. We specify the Terraform provider, which tells Terraform that we're provisioning resources in AWS and not some alternative like GCP (Google Cloud Platform). Additionally, here we assemble some local values from the variables in the previous file; we are basically generating johnsosoka.com, www.johnsosoka.com & files.johnsosoka.com.
main.tf
// This block tells Terraform that we're going to provision AWS resources.
provider "aws" {
region = "us-east-1"
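// Note: CloudFront only accepts ACM certificates issued in us-east-1, so keeping
// the provider in this region also keeps the certificate below CloudFront-compatible.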
}
// Defining some values that will be utilized throughout.
locals {
// ex.) example.com
root_domain_name = "${var.domain_name}.${var.domain}"
// ex.) www.example.com
www_domain_name = "www.${local.root_domain_name}"
// ex.) files.example.com
file_share_domain_name = "${var.file_share_subdomain}.${local.root_domain_name}"
}
Working our way up from where the content is served, we will move on to the s3.tf file, where we define the S3 buckets required to run the website. S3 is a file store provided by Amazon that also offers some bare-bones web-server functionality. In the following file, we provision 3 buckets.
| resource name | purpose |
|---|---|
| www | This bucket will host the static website assets generated by Jekyll. (www.johnsosoka.com) |
| root | Bucket will redirect root (johnsosoka.com) to www (www.johnsosoka.com) |
| file_share | Bucket will be separate from the website assets; this bucket will accomplish our wish list item of "having a subdomain for hosting downloads" |
s3.tf
// This is our first S3 Bucket and hosts our www domain, (www.example.com). It will host the site which will be static
// assets, generated by Jekyll.
resource "aws_s3_bucket" "www" {
// Our bucket's name is going to be the same as our site's domain name.
bucket = local.www_domain_name
// Because we want our site to be available on the internet, we set this so
// anyone can read this bucket.
acl = "public-read"
// We also need to create a policy that allows anyone to view the content.
// This is basically duplicating what we did in the ACL but it's required by
// AWS. This post: http://amzn.to/2Fa04ul explains why.
policy = <<POLICY
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"AddPerm",
"Effect":"Allow",
"Principal": "*",
"Action":["s3:GetObject"],
"Resource":["arn:aws:s3:::${local.www_domain_name}/*"]
}
]
}
POLICY
// S3 understands what it means to host a website.
website {
// Here we tell S3 what to use when a request comes in to the root
// ex. https://www.example.com
index_document = "index.html"
// The page to serve up if a request results in an error or a non-existing
// page.
error_document = "404.html"
}
}
// Root S3 bucket here (example.com). This S3 bucket will redirect to www
resource "aws_s3_bucket" "root" {
bucket = "${local.root_domain_name}"
acl = "public-read"
policy = <<POLICY
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"AddPerm",
"Effect":"Allow",
"Principal": "*",
"Action":["s3:GetObject"],
"Resource":["arn:aws:s3:::${local.root_domain_name}/*"]
}
]
}
POLICY
website {
// Note this redirect. Here's where the magic happens.
redirect_all_requests_to = "https://${local.www_domain_name}"
}
}
resource "aws_s3_bucket" "file_share" {
bucket = local.file_share_domain_name
acl = "public-read"
policy = <<POLICY
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"AddPerm",
"Effect":"Allow",
"Principal": "*",
"Action":["s3:GetObject"],
"Resource":["arn:aws:s3:::${local.file_share_domain_name}/*"]
}
]
}
POLICY
website {
// For now these will break, but index.html could be a file tree or something in the future.
index_document = "index.html"
error_document = "404.html"
}
}
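Once these buckets exist, a quick way to sanity-check one before CloudFront is in front of it is to push a file and curl the S3 website endpoint directly. A rough sketch, assuming the standard us-east-1 website endpoint format for the www bucket:
# upload a placeholder page and hit the bucket's website endpoint
echo "hello" > index.html
aws s3 cp index.html s3://www.johnsosoka.com/index.html
curl http://www.johnsosoka.com.s3-website-us-east-1.amazonaws.com/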
Now that our S3 buckets are created, we need to configure CloudFront, which will act as our CDN. One of our wish list items was to have the website served over SSL, so we are going to have a certificate created through AWS Certificate Manager (ACM) & then use that certificate in our CloudFront distributions. Since these two actions are so closely linked, I decided to put them in a single configuration file.
acm-cdn.tf
// Combining two related sets of resources.
// Use the AWS Certificate Manager to create an SSL cert for our domain.
// This certificate won't be issued until you receive the email verifying you
// own the domain and click on the confirmation link.
resource "aws_acm_certificate" "certificate" {
// We want a wildcard cert so we can host subdomains later.
domain_name = "*.${local.root_domain_name}"
validation_method = "EMAIL"
// We also want the cert to be valid for the root domain even though we'll be
// redirecting to the www. domain immediately.
subject_alternative_names = ["${local.root_domain_name}"]
}
// create cloudfront distributions which use the cert from above.
resource "aws_cloudfront_distribution" "www_distribution" {
// origin is where CloudFront gets its content from.
origin {
// We need to set up a "custom" origin because otherwise CloudFront won't
// redirect traffic from the root domain to the www domain, that is from
// johnsosoka.com to www.johnsosoka.com
custom_origin_config {
// These are all the defaults.
http_port = "80"
https_port = "443"
origin_protocol_policy = "http-only"
origin_ssl_protocols = ["TLSv1", "TLSv1.1", "TLSv1.2"]
}
// Here we're using our S3 bucket's URL!
domain_name = "${aws_s3_bucket.www.website_endpoint}"
// This can be any name to identify this origin.
origin_id = "${local.www_domain_name}"
}
enabled = true
// Removing default_root_object so that /index.html doesn't appear in browser bar
//default_root_object = "index.html"
// All values are defaults from the AWS console.
default_cache_behavior {
viewer_protocol_policy = "redirect-to-https"
compress = true
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
// This needs to match the `origin_id` above.
target_origin_id = "${local.www_domain_name}"
min_ttl = 0
default_ttl = 86400
max_ttl = 31536000
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
}
// Here we're ensuring we can hit this distribution with our nice DNS name rather than the domain name CloudFront
// gives us (which would look something like http://d111111abcdef8.cloudfront.net).
// Details here: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-to-cloudfront-distribution.html
aliases = ["${local.www_domain_name}"]
restrictions {
geo_restriction {
restriction_type = "none"
}
}
// Here's where our certificate is loaded in!
viewer_certificate {
acm_certificate_arn = "${aws_acm_certificate.certificate.arn}"
ssl_support_method = "sni-only"
}
}
resource "aws_cloudfront_distribution" "root_distribution" {
origin {
custom_origin_config {
http_port = "80"
https_port = "443"
origin_protocol_policy = "http-only"
origin_ssl_protocols = ["TLSv1", "TLSv1.1", "TLSv1.2"]
}
domain_name = "${aws_s3_bucket.root.website_endpoint}"
origin_id = "${local.root_domain_name}"
}
enabled = true
// Experimenting with removing default root object in root distribution.
//default_root_object = "index.html"
default_cache_behavior {
viewer_protocol_policy = "redirect-to-https"
compress = true
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "${local.root_domain_name}"
min_ttl = 0
default_ttl = 86400
max_ttl = 31536000
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
}
aliases = ["${local.root_domain_name}"]
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
acm_certificate_arn = "${aws_acm_certificate.certificate.arn}"
ssl_support_method = "sni-only"
}
}
resource "aws_cloudfront_distribution" "files_distribution" {
origin {
custom_origin_config {
http_port = "80"
https_port = "443"
origin_protocol_policy = "http-only"
origin_ssl_protocols = ["TLSv1", "TLSv1.1", "TLSv1.2"]
}
domain_name = "${aws_s3_bucket.file_share.website_endpoint}"
origin_id = "${local.file_share_domain_name}"
}
enabled = true
default_root_object = "index.html"
default_cache_behavior {
viewer_protocol_policy = "redirect-to-https"
compress = true
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "${local.file_share_domain_name}"
min_ttl = 0
default_ttl = 86400
max_ttl = 31536000
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
}
aliases = ["${local.file_share_domain_name}"]
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
acm_certificate_arn = "${aws_acm_certificate.certificate.arn}"
ssl_support_method = "sni-only"
}
}
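With email validation, the certificate won't be issued (and the distributions can't deploy against it) until someone clicks the link in the validation email. You can keep an eye on its status from the CLI, roughly like this:
# list certificates and their ARNs in the region CloudFront requires
aws acm list-certificates --region us-east-1
# then inspect the status of a specific certificate
aws acm describe-certificate --region us-east-1 --certificate-arn <arn-from-above>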
Now that the rest of our infrastructure is in place, we need to configure DNS in Amazon Route 53. We will have our DNS entries point to our CloudFront distributions. Those CloudFront distributions each point to their own S3 buckets, and one of the buckets (root) redirects traffic to the other (www).
route53.tf
// We want AWS to host our zone so its nameservers can point to our CloudFront
// distribution.
resource "aws_route53_zone" "zone" {
name = "${local.root_domain_name}"
}
// This Route53 record will point at our CloudFront distribution.
resource "aws_route53_record" "www" {
zone_id = "${aws_route53_zone.zone.zone_id}"
name = "${local.www_domain_name}"
type = "A"
alias {
name = "${aws_cloudfront_distribution.www_distribution.domain_name}"
zone_id = "${aws_cloudfront_distribution.www_distribution.hosted_zone_id}"
evaluate_target_health = false
}
}
resource "aws_route53_record" "root" {
zone_id = "${aws_route53_zone.zone.zone_id}"
// NOTE: name is blank here.
name = ""
type = "A"
alias {
name = "${aws_cloudfront_distribution.root_distribution.domain_name}"
zone_id = "${aws_cloudfront_distribution.root_distribution.hosted_zone_id}"
evaluate_target_health = false
}
}
// FILES Subdomain
// Main zone needs to have an entry pointing to subdomain name servers
// https://aws.amazon.com/premiumsupport/knowledge-center/create-subdomain-route-53/
// https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/route53_zone
resource "aws_route53_record" "root_files_subdomain_entry" {
zone_id = "${aws_route53_zone.zone.zone_id}"
// NOTE: this NS record delegates the files subdomain to its own hosted zone below.
name = "${local.file_share_domain_name}"
type = "NS"
ttl = "30"
records = aws_route53_zone.files_subdomain_zone.name_servers
}
resource "aws_route53_zone" "files_subdomain_zone" {
name = "${local.file_share_domain_name}"
}
Now, if I run terraform init and then terraform plan, I will see every intended resource in the plan output. I can then provision all of my infrastructure with terraform apply.
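From the directory containing the *.tf files, that boils down to:
# download the AWS provider & set up local state
terraform init
# preview every resource that will be created
terraform plan
# provision everything
terraform apply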
Terraform does not have support for managing registered domains, so once the hosted zone is generated you need to fetch the name servers from it & update the registered domain in Route 53 to match.
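One way to make those name servers easy to grab is a Terraform output; a minimal sketch, where the file and output names are mine:
// outputs.tf (hypothetical file name)
output "zone_name_servers" {
  description = "Copy these into the registered domain's name server settings in Route 53."
  value       = aws_route53_zone.zone.name_servers
}
After an apply, terraform output zone_name_servers prints the values to copy over.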
Deployments & CloudFront Cache Invalidations
Deployments for this site are still crude. For now, I'm using a homemade deployer shell script. I leverage the AWS CLI in it to:
- Sync assets to the www S3 bucket
- Create a CloudFront Invalidation event
#!/usr/bin/env bash
# Declarations
WWW_CLOUDFRONT_ID="YOUR-DISTRIBUTION-ID"
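# ^ the ID of the www CloudFront distribution; find it in the CloudFront console or via `aws cloudfront list-distributions`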
WWW_S3_BUCKET_NAME="www.johnsosoka.com"
cat soso-banner
echo ""
echo ""
build_artifacts()
{
echo "Building ${WWW_S3_BUCKET_NAME}.."
bundle exec jekyll build
if [ $? -eq 0 ]
then
echo "Successfully built artifacts..."
else
echo "Something went wrong executing jekyll build. Please check jekyll log for more details."
echo "Exiting..."
exit 1
fi
}
sync_artifacts_to_s3()
{
aws s3 sync ./_site/ s3://${WWW_S3_BUCKET_NAME}
if [ $? -eq 0 ]
then
echo "Assets uploaded to production bucket: ${WWW_S3_BUCKET_NAME}"
else
echo "Something went wrong syncing artifacts to production. Check aws-cli output for more details."
echo "Exiting..."
exit 1
fi
}
invalidate_cloudfront_distribution()
{
echo "Invalidating cloudfront distribution: ${WWW_CLOUDFRONT_ID}"
aws cloudfront create-invalidation --distribution-id ${WWW_CLOUDFRONT_ID} --paths "/*"
}
echo "Building Jekyll Artifacts...."
build_artifacts
echo "Syncing assets to production..."
sync_artifacts_to_s3
echo "Invalidating distribution Id: ${WWW_CLOUDFRONT_ID}"
invalidate_cloudfront_distribution
Please note that in the script I cat a banner file which is not provided here; locally, I have some ASCII art for my website that prints out upon deployment. You can see that I solve the runatlantis problem of waiting for the CloudFront cache to clear by creating a CloudFront invalidation event via the AWS CLI.
In future iterations, I will set up either a CodeCommit or Jenkins pipeline so that merges to master will build & deploy artifacts for me. Until then, I’ll be living in the stone age and manually triggering my deployment script.
Update 2023/07/23:
- I’ve written an article on how to use GitHub Actions to automatically deploy this website. You can find it here.