AWS S3 Static Website Hosting
It is cheap, scalable, and "performant", especially when it tag-teams with CloudFront. This is a documentation of how to host a Single Page Application (React in this case) on AWS S3 with SSL over CloudFront, using this pet project of mine as an example.
1) The project
This is a simple static site, so no Redux is used, although this setup would also work with Redux. It is mainly React and React Router. Here are the specifics:
- react: ^15.6.1
- react-router: ^4.1.2
The bundler I am using is webpack: ^3.5.5.
2) AWS S3
S3 can host static websites on top of plain storage. Note that each bucket is meant for only one website; you cannot have a bucket called my-static-websites and have each directory host a different website. It is one bucket per website. Set up the static website hosting configuration for the bucket as such, and take note of the Endpoint.
This setup is saying (a CLI equivalent is sketched after this list):
- When users visit the root path of my website, show them the file index.html.
- When users visit a page that does not exist, show them the default S3 error message in their browser.
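For reference, the same static website hosting configuration can be applied from the command line too. This is only a minimal sketch with the AWS CLI; the bucket name my-static-site-bucket is a placeholder, and the bucket contents still need to be publicly readable for the endpoint to serve them.
# enable static website hosting on the bucket (placeholder bucket name)
aws s3 website s3://my-static-site-bucket/ --index-document index.html --error-document error.html
# the Endpoint will look something like http://my-static-site-bucket.s3-website-us-west-2.amazonaws.com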
So when we upload the React project into the bucket:
- At the root path, users can see the site up and alive!
- Users can also navigate to different paths!
- But hitting refresh when the path is /something instead of / will show a blank screen, or the error.html page if one was set up :(
What is happening? The /something path is looking for a file called something in the S3 bucket, but it is not to be found. Since this is a Single Page Application, there is only 1 HTML file, 1 GOD HTML file. So here is the challenge: we need to map all paths to the index.html file.
Since this is a React project, we do not need to map each path to a specific HTML page like a typical website; index.html will load the JavaScript bundle, and React Router will get to work to show users the correct page based on the path.
Hygiene pages
Not sure if this is the correct term for the sitemap.xml and robots.txt files, but you will need them for SEO. These files go into the root directory of your bucket as siblings to the index.html file, and their URLs will eventually be https://www.yourdomain.com/robots.txt and https://www.yourdomain.com/sitemap.xml respectively.
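As a small sketch, a permissive robots.txt can be generated locally and copied to the root of the bucket; the bucket name and domain below are placeholders.
# a minimal robots.txt that allows all crawlers and points to the sitemap (placeholder domain)
cat > dist/robots.txt <<'EOF'
User-agent: *
Allow: /
Sitemap: https://www.yourdomain.com/sitemap.xml
EOF
# upload it as a sibling of index.html (placeholder bucket name)
aws s3 cp dist/robots.txt s3://my-static-site-bucket/robots.txt --acl public-read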
3a) AWS CloudFront — Distribution
CloudFront is the CDN of AWS. It can handle the mapping of the routes, on top of caching the site. Start off by creating a web distribution. The key configurations I would like to mention are:
- Origin Domain Name — Upon focusing on this field, a dropdown will list all the buckets you have in AWS S3. Do NOT use any option from this list. Instead, enter the domain of the Endpoint of the static-website-hosting-enabled S3 bucket mentioned in the previous section.
- Viewer Protocol Policy — Select Redirect HTTP to HTTPS to ensure your website is always viewed over HTTPS and there is no duplicate instance accessible to the public over plain HTTP.
- Cache Based on Selected Request Headers — Select Whitelist and add in the Origin header. This is to avoid any CORS-related errors.
- Alternate Domain Names (CNAMEs) — Enter the non-www and the www domain names here, or any other subdomains you may have intended, separated by a line break or a comma.
- SSL Certificate — Select Custom SSL Certificate and upload your own SSL certificate, along with the private key and CA bundle, via AWS Certificate Manager.
- Compress Objects Automatically — Select Yes. CloudFront will automatically compress your uncompressed assets from S3 and improve your page speed by Google's standards. Exchanging all the deep Apache/Nginx/IIS setups for a single radio button is like trading a wight for a dragon.
Create the CloudFront distribution and wait for it to get deployed. Take note of the distribution's Domain Name.
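For reference, a bare-bones distribution pointing at the website endpoint can also be created from the AWS CLI. This is only a sketch with a placeholder endpoint; the CNAMEs, the whitelisted Origin header, the ACM certificate and compression still have to be configured in the console or via a full --distribution-config JSON.
# create a distribution with the S3 website endpoint as its origin (placeholder endpoint)
aws cloudfront create-distribution \
  --origin-domain-name my-static-site-bucket.s3-website-us-west-2.amazonaws.com \
  --default-root-object index.html
# note the DomainName (xxxx.cloudfront.net) in the output for the DNS step later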
3b) AWS CloudFront — Error Pages
After creating the CloudFront distribution, while its status is In Progress, proceed to the Error Pages tab. Handle response codes 404 and 403 with Customize Error Response.
Google recommends 1 week or 604800 seconds of caching.
What we are doing here is setting up CloudFront to handle missing HTML pages, which typically happens when a user enters an invalid path or, in particular, refreshes a path other than the root path. When that happens:
- CloudFront looks for a file that does not exist in the S3 bucket; for a Single Page Application like this project, there is only one HTML file in the bucket, and that is index.html.
- A 404 response is returned, and our custom error response setup hijacks it. We return a 200 response code and the index.html page instead.
- React Router, which is loaded along with index.html, looks at the URL and renders the correct page instead of the root path. This page will be cached for the duration of the TTL for all requests to the queried path.
Why do we need to handle 403 as well? Because this response code, instead of 404, is what Amazon S3 returns for assets that are not present. For instance, a URL of https://yourdomain.com/somewhere will look for a file called somewhere (without extension) that does not exist. (PS: it used to return 404, but it seems to return 403 now; either way, it is best to handle both response codes.)
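Once the distribution is deployed, this fallback behaviour can be checked from the command line; the domain below is a placeholder.
# a deep path should now come back as 200 with text/html, not a 403/404 from S3
curl -I https://www.yourdomain.com/some/deep/path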
4) DNS
I intend to use the www version of the domain. Go to the DNS zone file and set it up as such.
This setup indicates:
- domain.com will be redirected to www.domain.com
- requests will be rewritten, if valid, from http to https
I am using namecheap.com as my DNS service provider, and it comes with an option to redirect the non-www domain, over both http and https, to the https www domain at the DNS level. However, if your DNS service provider does not provide this function, you can use AWS S3 to do the redirect instead. Create another bucket with these settings (a CLI sketch of this redirect bucket is included at the end of this section).
Set the DNS A record of the root domain to the endpoint of this bucket. What will be achieved is that all non-www requests will be directed to this bucket, and this bucket will in turn redirect the request to the www domain, which points to the bucket where the files are. And yes, it will be a 301 redirect; in case you are wondering, this is the significance of a 301 redirect. Conversion of http to https will be handled by the CloudFront configuration (Viewer Protocol Policy) that was set up previously.
At this point, you should be able to access your site like a normal website. Refreshing at a path other than the root path should also work. All non-https requests will be redirected to the https protocol, and all non-www requests will be redirected to the www domain under the https protocol as well. Bots and crawlers should be able to access your robots.txt and sitemap.xml files as usual.
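Here is a minimal sketch of that redirect bucket with the AWS CLI; the bucket is named after the bare domain, which is a placeholder here.
# create a bucket named after the bare domain and redirect everything to the https www domain
aws s3 mb s3://yourdomain.com
aws s3api put-bucket-website \
  --bucket yourdomain.com \
  --website-configuration '{"RedirectAllRequestsTo":{"HostName":"www.yourdomain.com","Protocol":"https"}}'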
5) Conclusion
Pros
- Financially friendly. You only pay for what you use, so you will not be wasting money on under-utilised resources the way you would with a monthly payment model. On top of that, this CloudFront and S3 combination also saves you some money, because it is a lot cheaper to transfer data out to the Internet via CloudFront than via S3, not to mention the better performance of a CDN.
- Scalability friendly. If your site somehow gets really popular, there will not be a scalability issue from the surge in traffic, because AWS CloudFront will take care of that for you. There is no need to upgrade plans as with other hosting companies.
- Performance friendly. Since the whole site is sitting on top of a CDN, delivery of the site and its assets is going to be super fast.
- DDoS unfriendly (to attackers, that is). Since the site is behind AWS CloudFront, protection against DDoS is, once again, handled by CloudFront. DDoS attacks are guarded against by Amazon's own technology, and I would place my bets on their security and reliability over other hosting companies.
- Security friendly. Since CloudFront is now handling the SSL configuration, you will see that the SSL tests for your domains come back A grade on SSL Labs.
Cons
- This works only for static sites. It would take a humongous amount of traffic to even slow down a static website substantially; most of the bottlenecks in a typical application occur when it interacts with a backend that involves logic computation and database queries.
- Since the site is cached on the CDN, any changes will not be seen immediately; you have to wait until the cache expires. This comes along with any caching mechanism. We can mitigate it by invalidating the cache (which will incur charges; see the invalidation sketch after this list). If your JavaScript file names are hashed, you can ignore the JavaScript files and only need to invalidate the index.html file. Alternatively, you can give a lower caching period to just the index.html file.
- Every Single Page Application's bugbear is the requirement for server-side rendering. Bots and crawlers, apart from Googlebot, are not able to get the metadata of the site because they do not execute JavaScript. So if your site is only concerned about SEO on Google, this setup is good to go. But if you rely on other search engines, or if you are marketing the site via social media like Facebook, this is not ideal. (TODO serve html pages using API Gateway and Lambda)
- As all 404 and 403 responses are hijacked to return 200, you will probably not receive any 404 errors on Google Search Console (GSC) if you have indexed your website there. These 404 reports from GSC are useful for telling you which pages are erroring, and GSC will notify you about them. Without them, you will not know which pages are down or whether there are broken links to other parts of your website.
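As a sketch, a single-path invalidation of index.html would look like this; the distribution ID is a placeholder.
# invalidate only index.html so the hashed bundles keep being served from the cache
aws cloudfront create-invalidation --distribution-id E1234EXAMPLEID --paths "/index.html"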
Side quest
In this section of the article, I will document how to automate the deployment process for a site with this setup from the command line.
1) AWS IAM
To start off, you will need to create an IAM user and give it the necessary S3 permissions.
Note the access key id and the secret access key, as well as the User ARN. IAM users are the access control configuration in your AWS account, principally answering the question of who can do what to which of the services under your account. Let's call this user iam_user.
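If you prefer the command line to the console, a sketch of the same setup with the AWS CLI looks like this; iam_user is just the placeholder name used throughout this article, and the actual bucket permissions come from the bucket policy in the next step.
# create the user and generate its access key pair
aws iam create-user --user-name iam_user
aws iam create-access-key --user-name iam_user   # note the AccessKeyId and SecretAccessKey
# print the User ARN needed for the bucket policy below
aws iam get-user --user-name iam_user --query 'User.Arn' --output text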
2) AWS S3
Change the bucket policy to allow this iam_user to make changes to the bucket.
{
  "Version": "2012-10-17",
  "Id": "someID",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789:user/iam_user"
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::bucket-name",
        "arn:aws:s3:::bucket-name/*"
      ]
    }
  ]
}
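Assuming the policy above is saved locally as bucket-policy.json (a placeholder file name), it can be applied with the AWS CLI like so:
# attach the policy to the bucket (bucket-name is the placeholder used above)
aws s3api put-bucket-policy --bucket bucket-name --policy file://bucket-policy.json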
3) Deployment
As this is a simple, mostly static, website, there are no testing scripts or CI server set up for the deployment procedure. It will just be a simple task of uploading new files to the correct bucket in S3 using the AWS CLI.
Cleanup
Before uploading, make sure you clean up the distribution folder where you build your files for the production environment. Since I use webpack as my bundler, I utilise the clean-webpack-plugin to help me dispose of old files before building new ones. This prevents uploading the same old assets to the bucket again.
# webpack.config
const path = require('path') // needed for path.resolve in the output block below
const CleanWebpackPlugin = require('clean-webpack-plugin')
const HtmlWebpackPlugin = require('html-webpack-plugin')
const pathsToClean = ["dist"]
const cleanOptions = {}
...
output: {
path: path.resolve(__dirname, "dist", "assets"), // all files are bundled into the dist/assets sub-directory
publicPath: '/assets/',
filename: 'bundle.js'
},
...
plugins: [
...,
new CleanWebpackPlugin(pathsToClean, cleanOptions), // cleanup the whole "dist" folder
new HtmlWebpackPlugin({
template: "./src/index.production.html",
filename: "../index.html" // all files are bundled into the dist/assets sub-directory, but index.html will be placed 1 directory up in the dist directory itself
}),
...
]
Uploading
Now to upload the files to S3. To prevent any Tom, Dick and Harry from being able to do so, authentication is required; this is where all the IAM work comes into play. We will use a script to do the uploading, with custom configuration to authenticate the request. You can use the --dryrun flag to test your script before actually doing the upload. This is the final version of my script.
aws s3 cp ./dist s3://better-cover-letter --recursive --exclude "*.DS_Store" --acl public-read --cache-control public,max-age=604800 --dryrun --profile iam_user
The --exclude flag prevents the upload of the irritating, ever-present .DS_Store file on macOS. The --acl flag sets the access control level of the files; make them publicly readable so people can access your site, otherwise they will be slapped with a 403 Forbidden message. The --cache-control flag adds the cache-control header to the S3 objects when CloudFront serves them. These cache-control headers will be passed to the browser to leverage browser caching and thereby increase page speed. 604800 is one week in seconds, so this max-age value will cache these assets for a week.
[Google] recommend[s] a minimum cache time of one week and preferably up to one year for static assets, or assets that change infrequently
The --profile flag sets the specific IAM user credential to authenticate this operation. As I am using the same MacBook Pro for my work and my personal projects, I have multiple AWS accounts to handle, hence the need for this flag to differentiate the different IAM users. Check out AWS CLI named profiles for more information. These are my config and credentials files for your reference.
# ~/.aws/config
[default]
region=us-west-2
output=text
# ~/.aws/credentials
[iam_user]
aws_access_key_id=something
aws_secret_access_key=something
[company_user]
aws_access_key_id=something_else
aws_secret_access_key=something_else
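These files do not have to be edited by hand; the iam_user profile can also be written into them interactively with the AWS CLI.
# prompts for the access key id, secret access key, default region and output format,
# and writes them under the [iam_user] profile in ~/.aws/credentials and ~/.aws/config
aws configure --profile iam_user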
The aws_access_key_id and aws_secret_access_key are specific to the iam_user that was created. Once you are ready, you can remove the --dryrun flag and do a test run to ensure that your files are indeed uploaded to the correct bucket. Yes, a test run; it is not the end of the deployment step. We can go further and completely automate the whole process.
NOTE: AWS S3 does not charge for data transfer into the bucket, only out, so feel free to spam deployments. (In fact, S3 does not charge for data transfer out to CloudFront either.)
Combine the steps
As it stands now, we have to build the site first using webpack -p --config webpack.config.js to generate the files, then upload them using the aws s3 cp command. To make our lives easier, we can create a new script command to run these commands one after another, without having to wait for the first command to finish before manually executing the other.
# package.json
...
"scripts": {
...
"deploy": "webpack -p --config webpack.config.prod.js && aws s3 cp ./dist s3://better-cover-letter --recursive --exclude "*.DS_Store" --cache-control public,max-age=604800 --dryrun --profile iam_user"
...
}
So just run npm run deploy and these will happen in chronological order:
- Old production files are cleaned up by clean-webpack-plugin
- New production files are compiled into the dist folder (based on my webpack config file)
- The production files are then uploaded to S3 and are ready for access once the cache in the AWS CloudFront CDN expires
There it is, the fully automated process for uploading the static website.
More Housekeeping (Optional)
If you are bundling your JavaScript files with a hash like me, you will find your S3 bucket accumulating old JS files instead of having them replaced by new ones, since they are different files by virtue of the hash in their file names, e.g. bundle-0af19d01880334b789.js. This is not the case if you are uploading just bundle.js, which will replace any bundle.js already present in the bucket. Since storing files in S3 is not free, albeit not that expensive either, it is still wise to remove files that you will never use again. So we can use the AWS CLI again to remove these old JS files before uploading (note: I am leaving the files in the root directory of the bucket untouched and just cleaning up the assets folder).
aws s3 rm s3://better-cover-letter/assets --recursive --profile iam_user --dryrun
Once again, combine them in the deploy script.
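A sketch of the final chain, with the cleanup step slotted in before the upload, would look something like this; keep the --dryrun flag around while previewing it.
# build, remove the stale hashed bundles from the assets folder in the bucket, then upload the new files
webpack -p --config webpack.config.prod.js \
  && aws s3 rm s3://better-cover-letter/assets --recursive --profile iam_user \
  && aws s3 cp ./dist s3://better-cover-letter --recursive --exclude "*.DS_Store" --acl public-read --cache-control public,max-age=604800 --profile iam_user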