
S3 - Simple Storage Service

https://aws.amazon.com/s3

Docs - https://docs.aws.amazon.com/s3/

Console - https://console.aws.amazon.com/s3/home or https://s3.console.aws.amazon.com/s3/home

Quotas - https://docs.aws.amazon.com/general/latest/gr/s3.html

It's a regional service, but bucket names must be globally unique because they are part of the URL. See bucket naming rules.
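As a rough illustration, the main naming rules (3-63 characters; lowercase letters, numbers, dots and hyphens; starts and ends with a letter or number; not formatted like an IP address; no adjacent periods) can be checked like this. This is a sketch covering only those rules, not the full list:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Rough check of the main S3 bucket naming rules (not exhaustive)."""
    # 3-63 characters: lowercase letters, numbers, dots and hyphens,
    # beginning and ending with a letter or number.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name):
        return False
    # Must not be formatted as an IP address (eg 192.168.1.1).
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    # Must not contain two adjacent periods.
    if ".." in name:
        return False
    return True

print(is_valid_bucket_name("my-bucket"))    # True
print(is_valid_bucket_name("My_Bucket"))    # False: uppercase/underscore
print(is_valid_bucket_name("192.168.1.1"))  # False: formatted like an IP
```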

To reduce latency, create the bucket at the region that is closest to your users.

You cannot delete an S3 bucket containing objects. You need to empty it first.

warning

From How an empty S3 bucket can make your AWS bill explode:

When executing a lot of requests to S3, make sure to explicitly specify the AWS region. This way you will avoid additional costs of S3 API redirects.

Data Transfer from Amazon S3 Glacier Vaults to Amazon S3 - https://aws.amazon.com/solutions/implementations/data-transfer-from-amazon-s3-glacier-vaults-to-amazon-s3/

S3 use cases

From https://www.youtube.com/watch?v=-zc16KhOILM

  1. Backups (eg database backups)
    • We can set rules to delete objects that are no longer needed, eg delete database backups older than 7 days
    • We can have versions of the same file
  2. Static website hosting
    • Note that "Amazon S3 website endpoints do not support HTTPS. If you want to use HTTPS, you can use Amazon CloudFront to serve a static website hosted on Amazon S3"
  3. Share files with authorized users via presigned URLs, which expire after a set time
  4. Long term archiving
    • For files that are accessed infrequently, but need to be saved for years
    • There are various storage classes/tiers: Standard-Infrequent Access, Glacier...
  5. Compliance with regulations
    • Eg comply with the GDPR by storing data in the EU, setting encryption, etc.
  6. FTP server
  7. Application cache
  8. Upload files
    • Since we are not uploading to our server, we don't need to worry about CPU, bandwidth or availability
  9. Query data and perform analytics with Athena, which lets you query data in S3 using SQL
  10. S3 triggers/events, eg to trigger a Lambda function
  11. Partial data access with S3 Select, which lets you retrieve a subset of a file instead of the whole file
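The delete-old-backups rule from use case 1 can be expressed as a lifecycle configuration. A sketch (the bucket name and the backups/ prefix are made-up examples):

```shell
# lifecycle.json - expire objects under backups/ that are older than 7 days
cat > lifecycle.json << 'EOF'
{
  "Rules": [
    {
      "ID": "expire-old-db-backups",
      "Filter": { "Prefix": "backups/" },
      "Status": "Enabled",
      "Expiration": { "Days": 7 }
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration file://lifecycle.json
```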

Compatible services

CLI - s3api

Commands:

Create bucket

aws s3api create-bucket --bucket <bucket-name> --acl <acl> --region <region>
aws s3api create-bucket --bucket my-bucket --region us-east-1

Response

{
  "Location": "/my-bucket"
}

Delete bucket (must be empty)

aws s3api delete-bucket --bucket <bucket-name>

Upload file

aws s3api put-object --bucket <bucket-name> --key file.txt --body file.txt

Delete object

aws s3api delete-object --bucket <bucket-name> --key <key>

CLI - s3

Reference:

Create bucket: aws s3 mb s3://my-bucket-name

List buckets: aws s3 ls

List files in bucket: aws s3 ls s3://my-bucket-name

Upload single file to bucket (cp docs): aws s3 cp file.txt s3://my-bucket-name

Conditional writes: https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes
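Conditional writes let you avoid overwriting an existing object. A sketch, assuming a CLI version recent enough to support the --if-none-match parameter on put-object:

```shell
# Create the object only if the key does not already exist;
# otherwise the request fails with 412 Precondition Failed.
aws s3api put-object \
  --bucket my-bucket-name \
  --key file.txt \
  --body file.txt \
  --if-none-match "*"
```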

Upload multiple files (directory) to bucket: aws s3 cp <local-dir> s3://my-bucket-name --recursive, or aws s3 sync <local-dir> s3://my-bucket-name (sync only copies new or changed files)

Encryption

https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingEncryption.html

Server-side encryption:

  • SSE-C: you (the Customer) manage the key. You need to provide the key to decrypt an object.
  • SSE-S3: the default. See Amazon S3 now automatically encrypts all new objects.
  • SSE-KMS: use AWS KMS keys.
  • DSSE-KMS: dual-layer server-side encryption with AWS KMS keys.

Client-side encryption:

  • CSE - Customer: you encrypt client-side.

Policies to enforce encryption: https://aws.amazon.com/blogs/security/how-to-prevent-uploads-of-unencrypted-objects-to-amazon-s3
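The linked post is based on a bucket policy that denies PutObject requests that don't request the desired encryption. Since SSE-S3 is now applied by default to new objects, such a policy is mainly useful to enforce SSE-KMS. A sketch (replace <bucket-name>):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyNonKMSUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<bucket-name>/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}
```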

S3 Event Notifications

Docs: https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html

Announcement: https://aws.amazon.com/blogs/aws/s3-event-notification/

  • SNS topics
  • SQS queues
  • Lambda
  • EventBridge

SNS - Send an email when an object is uploaded to an S3 bucket

Create the SNS topic

Go to the SNS console → Topics and click 'Create topic'. Then set:

  • Type: Standard
  • Name: Send-Email-On-S3-Upload
  • Display name: this will be the email name (e.g. Something <no-reply@sns.amazonaws.com>)
  • Click 'Create topic'

Create the SNS subscription:

Go to the SNS console → Subscriptions and click 'Create subscription'. Then set:

  • Topic ARN: arn:aws:sns:us-east-1:<account-id>:Send-Email-On-S3-Upload
  • Protocol: Email
  • Endpoint: set your email address
  • Click 'Create subscription'

Go to your email client (eg Gmail) and confirm the subscription by clicking the 'Confirm subscription' link. The page that opens will say 'Subscription confirmed!'.

Configure access policy for the SNS topic

Replace the following values in the JSON policy:

  • Resource: the SNS topic ARN
  • aws:SourceArn: the S3 bucket ARN
  • aws:SourceAccount: the account ID of the S3 bucket owner
{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "SNS topic policy",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": ["SNS:Publish"],
      "Resource": "arn:aws:sns:us-east-1:<account-id>:Send-Email-On-S3-Upload",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::<bucket-name>"
        },
        "StringEquals": {
          "aws:SourceAccount": "<bucket-owner-account-id>"
        }
      }
    }
  ]
}

This policy grants the S3 service the "SNS:Publish" API action over the SNS topic Send-Email-On-S3-Upload, specifying the S3 bucket and the AWS account in the conditions.

Go to the SNS topic, click 'Edit', copy the policy at the 'Access policy' field and click 'Save changes'.

Create the event notification in S3

Go to the S3 console, navigate to the bucket and then go to the 'Properties' tab. At 'Event notifications', click 'Create event notification' and set:

  • Event name: SendEmail
  • Event types: All object create events (s3:ObjectCreated:*)
  • Destination: SNS topic
  • Specify SNS topic: Choose from your SNS topics
  • SNS topic: Send-Email-On-S3-Upload
  • Click 'Save changes'

When you upload a file to the S3 bucket you should receive an email.
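The same event notification can also be configured from the CLI. A sketch using the topic created above (replace <account-id> and <bucket-name>):

```shell
# notification.json - publish all object-create events to the SNS topic
cat > notification.json << 'EOF'
{
  "TopicConfigurations": [
    {
      "Id": "SendEmail",
      "TopicArn": "arn:aws:sns:us-east-1:<account-id>:Send-Email-On-S3-Upload",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
EOF

aws s3api put-bucket-notification-configuration \
  --bucket <bucket-name> \
  --notification-configuration file://notification.json
```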

Hosting a static website with S3 and CloudFront - Single, private S3 bucket - Manually

This solution makes the S3 bucket private, and the S3 content is only available through CloudFront, as explained at Restricting access to an Amazon S3 origin.

Also, this solution does not use 'Static website hosting' because the S3 bucket is private and its content is served by CloudFront, which uses the S3 REST API to retrieve the files. See Key differences between a website endpoint and a REST API endpoint.

Note that you must use CloudFront to have HTTPS. See Hosting a static website using Amazon S3:

Amazon S3 website endpoints do not support HTTPS. If you want to use HTTPS, you can use Amazon CloudFront to serve a static website hosted on Amazon S3. For more information, see How do I use CloudFront to serve HTTPS requests for my Amazon S3 bucket? To use HTTPS with a custom domain, see Configuring a static website using a custom domain registered with Route 53.

Resources

Steps

TODO: explain how to create a new user with the CloudFrontFullAccess and AmazonS3FullAccess policies, then run aws configure to set the user's access key and secret for the CLI. See https://medium.com/dailyjs/a-guide-to-deploying-your-react-app-with-aws-s3-including-https-a-custom-domain-a-cdn-and-58245251f081

Create an S3 bucket. Go to the S3 console and click Buckets → Create bucket. Set the bucket name and region. Everything else should be at the default value:

  • 'Object Ownership': check 'ACLs disabled (recommended)'.
  • 'Block Public Access settings for this bucket': check 'Block all public access' since the user will access the content through CloudFront. Most tutorials tell you to uncheck all checkboxes (ie make all S3 content public), but it's not required.
  • 'Bucket Versioning' → Disable.
    • This tutorial suggests enabling it "to see any changes we made or roll back to a version before the change if we do not like it".
  • 'Default encryption' → Disable.
  • Advanced settings
    • 'Object Lock' → Disable.

Create a CloudFront distribution. Go to the CloudFront console and click 'Create a CloudFront distribution'. (The article How do I use CloudFront to serve HTTPS requests for my Amazon S3 bucket? talks about this.)

  • At 'Origin domain' select the one you've just created at S3. Should be something like <s3-bucket-name>.s3.<region>.amazonaws.com.
  • Make the S3 bucket private by setting 'Origin access' to 'Origin access control settings (recommended)'. This "limits the S3 bucket access to only authenticated requests from CloudFront". See Restricting access to an Amazon S3 origin. Origin access control (OAC) is recommended over origin access identity (OAI).
    • Click 'Create control setting'. At the dialog that opens, leave 'Sign requests (recommended)' checked and click 'Create'.
    • Note that "You must update the S3 bucket policy. CloudFront will provide you with the policy statement after creating the distribution."
  • Set 'Viewer protocol policy' to 'Redirect HTTP to HTTPS'.
  • Set 'Default root object' to index.html.
  • Click 'Create distribution'.

We need to update the S3 bucket policy. A blue banner says "Complete distribution configuration by allowing read access to CloudFront origin access control in your policy statement.". Copy the policy from the banner. It's something like:

{
  "Version": "2008-10-17",
  "Id": "PolicyForCloudFrontPrivateContent",
  "Statement": [
    {
      "Sid": "AllowCloudFrontServicePrincipal",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::the-s3-bucket-name/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::564627046728:distribution/Q5H1J6OCBE3KIO"
        }
      }
    }
  ]
}

To update the bucket policy, go to the S3 bucket → Permissions and paste the policy JSON to the 'Bucket policy' textarea.

Note that there's no need to configure the bucket as static website because S3 will store the files but not serve the website (it will be served by CloudFront).

To upload files to the S3 bucket do: aws s3 sync <build-folder> s3://<s3-bucket-name> [--profile <profile>]. Files should now appear on the S3 console, at the Objects tab.

You should now be able to visit the CloudFront URL (aka the 'Distribution domain name', something like https://d1p1ex8s7fge20.cloudfront.net) with your browser and see the site live :)

Every time we update the S3 bucket content we need to invalidate the CloudFront edge caches to replace the files. Either use the command aws cloudfront create-invalidation --distribution-id <distribution-id> --paths '/*' [--profile <profile>], or do this at the console.

Fix 403 Forbidden error for SPA

If you have an SPA (eg a React app created with CRA) and you visit xyz.cloudfront.net/about and refresh, it will not work: CloudFront responds with 'AccessDenied' and a 403 status code.

To fix this, at the CloudFront distribution, set a custom error response at the 'Error pages' tab with these values:

  • HTTP error code: 403 Forbidden.
  • Error caching minimum TTL: something like 60 seconds? Default is 10.
  • Customize error response: Yes.
    • Response page path: /index.html (must begin with /).
    • HTTP Response code: 200 OK.

Then perform an invalidation to the path /*.

See https://blog.nicera.in/2020/08/hosting-react-app-on-s3-cloudfront-with-react-router-404-fix/ for more.

Custom domain

Resources:

At the Route 53 console, purchase a domain.

At the AWS Certificate Manager console, create a certificate with the following steps:

  • Important: the certificate must be in the US East (N. Virginia) Region (us-east-1). Make sure that you have the 'N. Virginia' region selected at the top navbar. See why at Supported Regions:
    • To use an ACM certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) region. ACM certificates in this region that are associated with a CloudFront distribution are distributed to all the geographic locations configured for that distribution.

  • Start by clicking 'Request a certificate'. At the page that opens (Certificate type) select 'Request a public certificate' and click 'Next'.
  • At the page that opens (Request public certificate), set:
    • Domain names. There are 2 options:
      • Wildcard: you can "use an asterisk (*) to request a wildcard certificate to protect several sites in the same domain. For example, *.example.com protects www.example.com, site.example.com, and images.example.com".
      • If you only want some domains like example.com and www.example.com, set these domains one by one.
    • At 'Select validation method' choose 'DNS validation - recommended'.
    • Click 'Request'.

A blue banner appears with the message "Successfully requested certificate with ID some-id. A certificate request with a status of pending validation has been created. Further action is needed to complete the validation and approval of the certificate.".

At the ACM console → Certificates, the certificate Status is 'Pending validation'. We need to validate ownership of the domain to complete the certificate request. To do so, at the ACM console go to the new certificate and click 'Create records in Route 53'. At the page that opens (Create DNS records in Amazon Route 53), click 'Create records'. This creates two DNS records of type CNAME for the domain at Route 53; at the Route 53 console, you will see that the 'Record count' for the domain goes from 2 to 4 after this step. Right after doing this, the ACM console still shows the Status as 'Pending validation', but it will change to '✓ Issued' (in green) some minutes later.

Go to the CloudFront console and click the distribution. At the 'General' tab, 'Settings' pane, click 'Edit'. At 'Alternate domain name (CNAME)', set the domains used at the ACM certificate (eg example.com and www.example.com). At the dropdown 'Custom SSL certificate', choose the ACM certificate just created. Click 'Save changes'. At the distribution 'General' tab, 'Details' pane, 'Last modified' will show 'Deploying', and some minutes later it will change to a date (eg 'October 7, 2022 at 4:53:47 PM UTC').

Go to the Route 53 console → Hosted zones and choose the zone. Click 'Create record'. Leave 'Record name' empty. Set 'Record type' to 'A'. Enable the switch 'Alias' and set 'Route traffic to' to 'Alias to CloudFront distribution'. Choose the distribution from the dropdown. At 'Routing policy' leave it to 'Simple routing'. You need to repeat this for www. Click 'Add another record' and set the same values except for 'Record name', which should be 'www'. Click 'Create records'. Wait a few minutes for changes to take effect.
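The same alias records can be created from the CLI. A sketch, using the example domain and distribution domain name from above; Z2FDTNDATAQYW2 is the fixed hosted zone ID AWS documents for CloudFront alias targets:

```shell
# change-batch.json - alias record pointing the apex domain to CloudFront.
# Repeat with "Name": "www.example.com" for the www record.
cat > change-batch.json << 'EOF'
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z2FDTNDATAQYW2",
          "DNSName": "d1p1ex8s7fge20.cloudfront.net",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id <your-hosted-zone-id> \
  --change-batch file://change-batch.json
```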

Redirect www to non-www

To redirect the www.example.com/* request to example.com/* follow the instructions at the CloudFront doc.

Logging

See Step 5: Configure logging for website traffic - "If you want to track the number of visitors accessing your website, you can optionally enable logging"

Hosting a static website with S3 and CloudFront - Two S3 buckets with 'Static website hosting'

Hosting a static website with S3 and CloudFront - With Terraform

Resources

Hosting a static website with S3 and CloudFront - With CloudFormation

Resources

File uploads with pre-signed URLs

https://sst.dev/docs/start/aws/nextjs#3-create-an-upload-form

https://sst.dev/docs/start/aws/astro#3-create-an-upload-form
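As an illustration of what a presigned URL contains, here is a stdlib-only sketch that builds a SigV4 presigned PUT URL. In practice you would use an SDK (eg boto3's generate_presigned_url); the bucket, key and credentials below are made-up examples:

```python
import hashlib
import hmac
import urllib.parse
from datetime import datetime, timezone

def presign_put(bucket, key, region, access_key, secret_key, expires=3600):
    """Build a SigV4 query-string presigned PUT URL (sketch, stdlib only)."""
    host = f"{bucket}.s3.{region}.amazonaws.com"
    now = datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"

    # Query parameters carry the auth info; they must be sorted and encoded.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    query = urllib.parse.urlencode(sorted(params.items()),
                                   quote_via=urllib.parse.quote)

    # Canonical request: method, path, query, signed headers, payload hash.
    canonical_request = "\n".join([
        "PUT", f"/{key}", query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])

    def _hmac(key_bytes, msg):
        return hmac.new(key_bytes, msg.encode(), hashlib.sha256).digest()

    # Derive the signing key: AWS4+secret -> date -> region -> s3 -> request.
    signing_key = _hmac(_hmac(_hmac(_hmac(b"AWS4" + secret_key.encode(),
                        datestamp), region), "s3"), "aws4_request")
    signature = hmac.new(signing_key, string_to_sign.encode(),
                         hashlib.sha256).hexdigest()

    return f"https://{host}/{key}?{query}&X-Amz-Signature={signature}"

url = presign_put("my-bucket", "uploads/file.txt", "us-east-1",
                  "AKIAEXAMPLE", "secret-example")
```

The client can then upload with a plain HTTP PUT to that URL until X-Amz-Expires seconds have passed, without holding any AWS credentials itself.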