Own Your Data
I previously wrote about owning my own data. An important part of data ownership is backing up that data. I use S3 as my long-term data store, and it is easy to set this up with Terraform.
S3
Provisioning an S3 bucket takes just a single Terraform resource:
resource "aws_s3_bucket" "repo_archive_log" {
  acl    = "log-delivery-write"
  bucket = "example-bucket"

  tags = {
    Name      = "example"
    TTL       = "persistent"
    ManagedBy = "Terraform"
  }
}
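Because this bucket holds backups, it may also be worth enabling versioning so that an accidental overwrite or deletion is recoverable. With the provider version used here, that is a block added inside the bucket resource (a sketch, not part of the original config):

```hcl
resource "aws_s3_bucket" "repo_archive_log" {
  acl    = "log-delivery-write"
  bucket = "example-bucket"

  # Keep prior object versions so a bad backup run cannot
  # silently destroy earlier archives.
  versioning {
    enabled = true
  }

  tags = {
    Name      = "example"
    TTL       = "persistent"
    ManagedBy = "Terraform"
  }
}
```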
EC2
Now you can provision an EC2 instance to run nightly backups in a cron job:
data "aws_ami" "latest-ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-disco-19.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
resource "aws_instance" "example" {
  ami                  = "${data.aws_ami.latest-ubuntu.id}"
  iam_instance_profile = "${aws_iam_instance_profile.example_instance_profile.name}"
  instance_type        = "t2.micro"
  key_name             = "${aws_key_pair.example_auth.id}"
  user_data            = "${data.template_cloudinit_config.example_instance.rendered}"

  tags = {
    Name      = "Example"
    ManagedBy = "Terraform"
  }
}
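The key_name above points at an aws_key_pair resource that the post does not show. A minimal sketch of what it might look like (the key name and public-key path are assumptions):

```hcl
resource "aws_key_pair" "example_auth" {
  key_name   = "example-key"
  public_key = "${file("~/.ssh/id_rsa.pub")}"
}
```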
EC2 Permissions
We referred to an aws_iam_instance_profile in the aws_instance resource. This profile grants the instance the ability to make calls to S3 APIs.
resource "aws_iam_role" "example_instance" {
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

  description = "Specifies a role that is allowed full S3 access. This role is automatically assumed by the 'example' EC2 instance."
  name        = "example_instance_role"

  tags = {
    Name      = "Example"
    ManagedBy = "Terraform"
  }
}
resource "aws_iam_role_policy" "example_instance" {
  name   = "example_instance_role_policy"
  role   = "${aws_iam_role.example_instance.name}"
  policy = "${file("data/example-instance-policy.json")}"
}

resource "aws_iam_instance_profile" "example_instance_profile" {
  name = "example-instance-profile"
  role = "${aws_iam_role.example_instance.name}"
}
The corresponding data/example-instance-policy.json looks like:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::examplebucket/*"
    }
  ]
}
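Note that s3:* grants every S3 action on these objects, including deletes. If the instance only ever writes new archives, a narrower policy is worth considering. A sketch using standard S3 IAM actions (s3:ListBucket applies to the bucket ARN, s3:PutObject to object ARNs):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*"
      ]
    }
  ]
}
```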
EC2 Initialization
The example EC2 instance also refers to a user_data attribute. EC2 hands this data to the instance on first boot, where cloud-init uses it to configure the machine. The data.template_cloudinit_config.example_instance object looks like:
data "template_file" "example_cloudinit_config" {
  template = "${file("data/example-cloudinit-config.tpl")}"

  vars = {
    app_name   = "repo-archive"
    app_config = "data/app_config.toml"
    crontab    = "data/crontab.txt"
  }
}
data "template_cloudinit_config" "example_instance" {
  part {
    content_type = "text/cloud-config"
    content      = "${data.template_file.example_cloudinit_config.rendered}"
  }

  part {
    content_type = "text/x-shellscript"
    content      = "${file("data/install.sh")}"
  }
}
The example-cloudinit-config.tpl uses cloud-config directives to upgrade the instance's packages and to install a crontab and a configuration file for a backup program named 'repo-archive':
package_update: false
package_upgrade: true
packages:
  - awscli
write_files:
  - content: ${filebase64(app_config)}
    encoding: b64
    path: /apps/${app_name}/app_config.toml
  - content: ${filebase64(crontab)}
    encoding: b64
    path: /etc/cron.d/${app_name}
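The crontab file written to /etc/cron.d/${app_name} is not shown in the post. Files under /etc/cron.d need a user field between the schedule and the command; a hypothetical nightly entry might look like:

```
# Run the backup program every night at 02:00 as root.
# Binary path and schedule are illustrative, not from the original post.
0 2 * * * root /apps/repo-archive/repo-archive --config /apps/repo-archive/app_config.toml
```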
The install.sh, which is not shown here, installs the released binary from a code repository.
The code for making backups lives in my monorepo. It takes a list of Git repos and uploads them to S3 as compressed tarballs. When I update the code and push a new release, all I need to do is terminate and relaunch the EC2 instance to pick up the latest code. Altogether it's about 600 lines of code to get daily backups for the cost of a t2.micro EC2 instance!
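The backup program itself is not shown, but its core step can be sketched as a small shell function: compress a repo directory into a dated tarball, then copy it to S3 with the AWS CLI. Everything below (names, paths, the DRY_RUN guard) is illustrative, not the author's actual program:

```shell
# backup_repo SRC_DIR OUT_DIR BUCKET
# Compresses SRC_DIR into a dated tarball under OUT_DIR, then uploads
# it with the AWS CLI. Set DRY_RUN=1 to skip the real upload.
backup_repo() {
  src="$1"; out="$2"; bucket="$3"
  name="$(basename "$src")-$(date +%Y-%m-%d).tar.gz"
  tar -czf "$out/$name" -C "$(dirname "$src")" "$(basename "$src")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would upload $out/$name to s3://$bucket/"
  else
    aws s3 cp "$out/$name" "s3://$bucket/"
  fi
}
```

In the real setup, the cron job would loop over the configured list of repositories and invoke something like this for each one.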