AWS Solutions Architect
AWS Certification Roadmap
720 / 1000 to pass

aws sts get-caller-identity
S3
AWS S3 CLI
AWS-S3-Bash-Scripting
What is an S3 Object?
Key: this is the name of the object
Value: the data itself made up of a sequence of bytes
Version ID: when versioning enabled, the version of object
Metadata: additional information attached to the object
What is an S3 Bucket?
Buckets hold objects. Buckets can also have folders which in turn hold objects.
You can store an individual object from 0 bytes to 5 terabytes in size.
Default S3 encryption type: Amazon S3 managed keys (SSE-S3)
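Since SSE-S3 is applied by default, you can confirm it on any bucket. A minimal sketch reusing the example bucket name from the commands that follow:

```sh
# Confirm the default SSE-S3 encryption applied to every new bucket
aws s3api get-bucket-encryption --bucket my-example-bucket-my
# Expected output includes: "SSEAlgorithm": "AES256"  (i.e. SSE-S3)
```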
aws s3api create-bucket --bucket my-example-bucket-my --region us-east-1
aws s3api list-buckets
aws s3api list-buckets --query Buckets[].Name --output table
aws s3api list-buckets --query "Buckets[?Name == 'kuyil-w2'].Name" --output text
aws s3 sync images/ s3://my-example-bucket-my
aws s3api get-object --bucket my-example-bucket-my --key hello.txt world.txt
aws s3api put-object --bucket my-example-bucket-my --key world.txt --content-type text/plain --body world.txt
aws s3api list-objects --bucket my-example-bucket-my --query Contents[].Key
Create a Bucket
aws s3api create-bucket --bucket my-example-bucket-ab --region us-east-1
List All Buckets
aws s3api list-buckets --query "Buckets[].Name"
Upload a Single Object
aws s3 cp path/to/local/file.txt s3://my-example-bucket-ab/
Upload Multiple Objects Using Sync
aws s3 sync path/to/local/directory/ s3://my-example-bucket-ab/
Download a Single Object
aws s3 cp s3://my-example-bucket-ab/file.txt path/to/local/directory/
Download Multiple Objects Using Sync
aws s3 sync s3://my-example-bucket-ab/ path/to/local/directory/
List Objects in a Bucket
aws s3api list-objects --bucket my-example-bucket-ab --query "Contents[].Key"
When listing objects, folder names end with a /. You can exclude these from the results by applying a JMESPath query. The command to achieve this is as follows:
aws s3api list-objects --bucket my-example-bucket-ab --query 'Contents[?(!ends_with(Key, `/`))].Key'
Delete a Single Object
aws s3 rm s3://my-example-bucket-ab/file.txt
Delete All Objects in a Bucket Recursively
aws s3 rm s3://my-example-bucket-ab/ --recursive
Delete an Empty Bucket
aws s3api delete-bucket --bucket my-example-bucket-ab
Get Object Metadata
aws s3api head-object --bucket my-example-bucket-ab --key file.txt
Enable Versioning on a Bucket
aws s3api put-bucket-versioning --bucket my-example-bucket-ab --versioning-configuration Status=Enabled
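To confirm versioning took effect and to inspect individual object versions, a small sketch reusing the example bucket (the key name is a placeholder):

```sh
# Check the bucket's versioning status
aws s3api get-bucket-versioning --bucket my-example-bucket-ab
# List every stored version of a key
aws s3api list-object-versions --bucket my-example-bucket-ab --prefix file.txt \
  --query 'Versions[].{Key:Key,VersionId:VersionId,IsLatest:IsLatest}'
```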
AWS-S3-Bash-Scripting
./334_aws_create_s3.sh cadent-ops-test2
#!/bin/bash
# myom@cadent.tv 01-2025
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name eg. ./334_aws_create_s3.sh my-bucket-name"
exit 1
fi
aws s3api create-bucket --bucket $1
./335_aws_list_bucket_s3.sh cadent-ops-test2
#!/bin/bash
# myom@cadent.tv 01-2025
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name eg. ./335_aws_list_bucket_s3.sh my-bucket-name"
exit 1
fi
aws s3api list-buckets --query "Buckets[?Name == '$1'].Name" --output text
./336_aws_list_all_bucket.sh
#!/bin/bash
# myom@cadent.tv 01-2025
# No arguments needed: this script lists all buckets
aws s3api list-buckets
./337_aws_delete_one_bucket.sh cadent-ops-test2
#!/bin/bash
# myom@cadent.tv 01-2025
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name eg. ./337_aws_delete_one_bucket.sh my-bucket-name"
exit 1
fi
aws s3api delete-bucket --bucket $1
List latest buckets
aws s3api list-buckets | jq -r '.Buckets | sort_by(.CreationDate) | reverse | .[] | .Name'
List only one latest bucket
aws s3api list-buckets | jq -r '.Buckets | sort_by(.CreationDate) | reverse | .[0] | .Name'
#!/bin/bash
echo "== sync"
# Exit immediately if any command returns a non-zero status
set -e
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name"
exit 1
fi
# Check for filename prefix
if [ -z "$2" ]; then
echo "There needs to be a file name"
exit 1
fi
BUCKET_NAME=$1
FILENAME_PREFIX=$2
# Where we'll store these files
OUTPUT_DIR="/tmp/s3-bash-scripts"
rm -rf $OUTPUT_DIR
mkdir -p $OUTPUT_DIR
# Generate a random number
# to determine how many files to create
NUM_FILES=$((RANDOM % 6 + 5))
for ((i=1; i<=NUM_FILES; i++)); do
# Generate a random filename
FILENAME="$OUTPUT_DIR/${FILENAME_PREFIX}_$i.txt"
# Generate random data and write it to the file
dd if=/dev/urandom of="$FILENAME" bs=1024 count=$((RANDOM % 1024 + 1)) 2>/dev/null
done
tree $OUTPUT_DIR
#!/bin/bash
echo "== put-object"
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name"
exit 1
fi
# Check for filename prefix
if [ -z "$2" ]; then
echo "There needs to be a file name"
exit 1
fi
BUCKET_NAME=$1
FILENAME=$2
OBJECT_KEY=$(basename "$FILENAME")
aws s3api put-object \
--bucket $BUCKET_NAME \
--body $FILENAME \
--key $OBJECT_KEY
./put_object_s3 my-s3-bucket /tmp/newfile.txt
#!/bin/bash
echo "== delete-object"
# Exit immediately if any command returns a non-zero status
set -e
# Check for bucket name
if [ -z "$1" ]; then
echo "There needs to be a bucket name"
exit 1
fi
BUCKET_NAME=$1
aws s3api list-objects-v2 \
--bucket $BUCKET_NAME \
--query Contents[].Key \
| jq -n '{Objects: [inputs | .[] | {Key: .}]}' > /tmp/delete_objects.json
aws s3api delete-objects --bucket $BUCKET_NAME --delete file:///tmp/delete_objects.json
# This will delete all objects in my S3
./s3_delete_objects.sh my-s3-bucket
Create a bucket
#!/bin/bash
set -e
if [ -z "$1" ]; then
echo "No bucket name provided. Usage: $0 bucket-name"
exit 1
fi
BUCKET_NAME=$1
REGION=${2:-us-east-1}
if [ "$REGION" = "us-east-1" ]; then
# us-east-1 does not accept a LocationConstraint
aws s3api create-bucket --bucket "$BUCKET_NAME"
else
aws s3api create-bucket --bucket "$BUCKET_NAME" --region "$REGION" --create-bucket-configuration LocationConstraint="$REGION"
fi
Delete a bucket
#!/bin/bash
set -e
if [ -z "$1" ]; then
echo "No bucket name provided. Usage: $0 bucket-name"
exit 1
fi
BUCKET_NAME=$1
aws s3api delete-bucket --bucket "$BUCKET_NAME"
Generate Random Files
#!/bin/bash
set -e
OUTPUT_DIR="./temp"
mkdir -p $OUTPUT_DIR
rm -rf $OUTPUT_DIR/*
NUM_FILES=$((RANDOM % 6 + 5))
for i in $(seq 1 $NUM_FILES); do
FILE_NAME="$OUTPUT_DIR/file_$i.txt"
head -c 100 </dev/urandom > $FILE_NAME
echo "Created $FILE_NAME"
done
Sync Files
#!/bin/bash
set -e
if [ -z "$1" ]; then
echo "No bucket name provided. Usage: $0 bucket-name"
exit 1
fi
BUCKET_NAME=$1
FILE_PREFIX=${2:-files}
aws s3 sync ./temp s3://$BUCKET_NAME/$FILE_PREFIX
List Objects
#!/bin/bash
set -e
if [ -z "$1" ]; then
echo "No bucket name provided. Usage: $0 bucket-name"
exit 1
fi
BUCKET_NAME=$1
aws s3api list-objects --bucket "$BUCKET_NAME" --query 'Contents[].{Key: Key, Size: Size}'
Delete All Objects
#!/bin/bash
set -e
if [ -z "$1" ]; then
echo "No bucket name provided. Usage: $0 bucket-name"
exit 1
fi
BUCKET_NAME=$1
OBJECTS=$(aws s3api list-objects --bucket "$BUCKET_NAME" --query 'Contents[].{Key: Key}' --output json)
if [ "$OBJECTS" == "null" ]; then
echo "Bucket is empty"
exit 0
fi
echo '{"Objects":' > delete.json
echo $OBJECTS | jq '.' >> delete.json
echo '}' >> delete.json
aws s3api delete-objects --bucket "$BUCKET_NAME" --delete file://delete.json
rm delete.json
Get Newest Bucket
#!/bin/bash
aws s3api list-buckets --query 'Buckets | sort_by(@, &CreationDate) | [-1:].Name' --output text
CloudFormation (CFN): Infrastructure as Code
#!/bin/bash
echo "== deploy s3 bucket via CFN"
STACK_NAME="cfn-s3-simple"
# After deploying with --no-execute-changeset, log into the AWS Console,
# go to CloudFormation -> Stacks, and execute (accept) the change set
aws cloudformation deploy \
--template-file template.yaml \
--no-execute-changeset \
--region us-east-1 \
--stack-name $STACK_NAME
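The deploy script references template.yaml, whose contents are not shown in these notes. A minimal assumed template that creates a single S3 bucket could look like this (written out via a heredoc so it stays in shell):

```sh
# Assumed contents of template.yaml: one S3 bucket with a CFN-generated name
cat > template.yaml <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Description: Simple S3 bucket for the cfn-s3-simple stack
Resources:
  SimpleBucket:
    Type: AWS::S3::Bucket
EOF
```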
#!/bin/bash
echo "== delete stack for s3 bucket via CFN"
STACK_NAME="cfn-s3-simple"
aws cloudformation delete-stack \
--region us-east-1 \
--stack-name $STACK_NAME
Terraform
wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
Search the Terraform Registry -> Use Provider
myom@b-ea1c-ansible-2023:~/aws/Terraform$ cat main.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "5.84.0"
}
}
}
provider "aws" {
region = "us-east-1"
}
Terraform generates a random bucket name when the bucket argument is omitted
myom@b-ea1c-ansible-2023:~/aws/Terraform$ cat s3.tf
resource "aws_s3_bucket" "my-test" {
#bucket = "cadent-cops-test12"
tags = {
Name = "My bucket"
Environment = "Dev"
}
}
terraform init
terraform plan
terraform apply
terraform destroy
CDK (AWS Cloud Development Kit)
sudo npm i -g aws-cdk
cdk init app --language=typescript
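After `cdk init`, a typical workflow looks like the sketch below; the stack itself lives in the generated TypeScript sources, which are not shown here:

```sh
# Assumes AWS credentials are configured and a stack is defined in the generated app
cdk bootstrap   # one-time per account/region: provisions CDK staging resources
cdk synth       # emit the CloudFormation template for review
cdk deploy      # create or update the stack
cdk destroy     # tear the stack down
```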
Pulumi
S3 Bucket Overview
S3 Buckets are infrastructure, and they hold S3 Objects
S3 Bucket Naming Rules – how we have to name our bucket
S3 Bucket Restrictions and Limitations – What we can and can’t do with buckets
S3 Bucket Types – the two different kinds of buckets, flat (general purpose) and directory
S3 Bucket Folders – How S3 buckets have virtual folders for general purpose buckets
Bucket Versioning – how we can version all objects
Bucket Encryption – how we can encrypt the contents of bucket
Static Website Hosting – How we can let our buckets host websites
What is the purpose of bucket encryption in Amazon S3? (To encrypt the contents of the bucket)
What are the two different kinds of S3 bucket types? (Flat (general purpose) and directory)
How does S3 bucket versioning benefit data management? (S3 bucket versioning allows multiple versions of an object to be stored in a bucket, enabling data recovery and protection against accidental deletion or overwriting.)
S3 Bucket Naming Rules
S3 bucket names follow rules similar to valid URL rules (with additional restrictions), because bucket names are used to form URLs for various HTTPS operations.
https://myexamplebucket.s3.amazonaws.com/photo.jpg
Length: Bucket names must be 3-63 characters long.
Characters: Only lowercase letters, numbers, dots (.), and hyphens (-) are allowed.
Start and End: They must begin and end with a letter or number.
Adjacent Periods: No two adjacent periods are allowed.
IP Address Format: Names can't be formatted as IP addresses (e.g., 192.168.5.4).
Restricted Prefixes: Can't start with "xn--", "sthree-", "sthree-configurator", or "amzn-s3-demo-".
Restricted Suffixes: Can't end with "-s3alias", "--ol-s3", "--x-s3", or ".mrap" (reserved for access point alias names and related features).
Uniqueness: Must be unique across all AWS accounts in all AWS Regions within a partition.
Exclusivity: A name can't be reused in the same partition until the original bucket is deleted.
Partitions: “aws” (Standard Regions), “aws-cn” (China Regions), “aws-us-gov” (AWS GovCloud US)
Transfer Acceleration: Buckets used with S3 Transfer Acceleration can't have dots in their names.
No Uppercase, No Underscores, No Spaces in Bucket Names
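A rough shell sketch that checks a few of these naming rules (length, allowed characters, start/end, adjacent periods, IP format). It is illustrative only, and check_bucket_name is a hypothetical helper, not an AWS tool:

```sh
#!/bin/bash
# Hypothetical helper: returns 0 if the name passes the basic rules above
check_bucket_name() {
  local name="$1"
  [[ ${#name} -ge 3 && ${#name} -le 63 ]] || { echo "bad length"; return 1; }
  [[ "$name" =~ ^[a-z0-9][a-z0-9.-]*[a-z0-9]$ ]] || { echo "bad characters or start/end"; return 1; }
  [[ "$name" != *..* ]] || { echo "adjacent periods"; return 1; }
  [[ ! "$name" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || { echo "looks like an IP address"; return 1; }
  echo "ok"
}
check_bucket_name "my-example-bucket-ab"   # ok
check_bucket_name "192.168.5.4"            # looks like an IP address
```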
Which of the following characters are NOT allowed in an Amazon S3 bucket name? (Uppercase)
What is the range of characters allowed in an S3 bucket name's length? (An Amazon S3 bucket name must be between 3 and 63 characters long.)
Why is the S3 bucket name 'data..bucket..archive' invalid? (It contains adjacent periods.)
Is the S3 bucket name '123.456.789.012' valid or invalid, and why? (Invalid, because it is formatted as an IP address which is not allowed in S3 bucket naming conventions.)
S3 — Bucket Restrictions and Limitations
You can, by default, create 100 buckets
You can create a service request to increase to 1000 buckets
You need to empty a bucket first before you can delete it
No max bucket size and no limit to the number of objects in a bucket
Files can be between 0 and 5 TBs
Files larger than 100MB should use multi-part upload
S3 for AWS Outposts has limits
Get, Put, List, and Delete operations are designed for high availability
Create, Delete or configuration operations should be run less often.
How many S3 buckets can you create by default in AWS? (100)
What is the recommended upload method for files larger than 100MB in S3? (Multi-part upload)
What should you do before deleting an S3 bucket in AWS? (Before deleting an S3 bucket in AWS, you need to empty the bucket first.)
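Since files larger than 100MB should use multipart upload, and `aws s3 cp`/`aws s3 sync` switch to multipart automatically past a size threshold, here is a hedged sketch of tuning that threshold (the file name is a placeholder):

```sh
# Raise the point at which the CLI switches to multipart upload (default is ~8 MB)
aws configure set default.s3.multipart_threshold 100MB
# Size of each part once multipart kicks in
aws configure set default.s3.multipart_chunksize 16MB
# A large upload then uses multipart automatically
aws s3 cp big-backup.tar s3://my-example-bucket-ab/
```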
S3 — Bucket Types
Amazon S3 has two types of buckets:
General purpose buckets
Organizes data in a flat hierarchy
The original S3 bucket type
Recommended for most use cases
Used with all storage classes except the S3 Express One Zone storage class
There aren’t prefix limits
There is a default limit of 100 general buckets per account
Directory buckets
Organizes data in a folder hierarchy
Only to be used with S3 Express One Zone storage class
Recommended when you need single-digit millisecond performance on PUT and GET
*There aren't prefix limits for directory buckets
Individual directories can scale horizontally
There is a default limit of 10 directory buckets per account
What is the original S3 bucket type recommended for most use cases? (General purpose buckets)
When should you use directory buckets in Amazon S3? (Directory buckets should be used when you need single-digit millisecond performance on PUT and GET operations, specifically with the S3 Express One Zone storage class.)
S3 — Bucket Folder
The S3 Console allows you to "create folders". S3 general purpose buckets do not have true folders as found in a hierarchical file system.
When you create a folder in S3 Console, Amazon S3 creates a zero-byte S3 object with a name that ends in a forward slash eg. myfolder/
S3 folders are not their own independent identities but just S3 Objects
S3 folders don’t include metadata, permissions
S3 folders don’t contain anything, they can’t be full or empty
S3 folders aren’t “moved”, S3 objects containing the same prefix are renamed
Which of the following statements is true about folders in Amazon S3? (Folders in S3 are just S3 objects that serve as a means to organize other objects.)
What does "moving a folder" in Amazon S3 actually entail? (Moving a folder in Amazon S3 involves renaming the S3 objects with a different prefix that corresponds to the new folder path, not physically moving a folder as in traditional file systems.)
S3 Object Overview
S3 Objects are resources that represent data and are not infrastructure.
Etags — a way to detect when the contents of an object have changed without downloading the contents
Checksums — ensure the integrity of files being uploaded or downloaded
Object Prefixes — simulates file-system folders in a flat hierarchy
Object Metadata — attach data alongside the content, to describe the contents of the data
Object tags — benefits resource tagging but at the object level
Object Locking — makes data files immutable
Object Versioning — have multiple versions of a data file
How do Object Tags benefit Amazon S3 users? (They provide a way to categorize and manage resources at the object level.)
What are Etags in Amazon S3 used for? (Etags are used in Amazon S3 to detect changes in an object's content without needing to download the object, aiding in efficient data management and synchronization.)
S3 Objects — ETags
What is an ETag?
An entity tag (ETag) is a response header that represents a specific version of a resource, allowing changes to be detected without downloading the resource.
The value of an etag is generally represented by a hashing function eg. MD5 or SHA-1
Etags are part of the HTTP protocol.
Etags are used for revalidation for caching systems.
S3 Objects have an etag
etag represents a hash of the object
reflects changes only to the contents of an object, not its metadata
may or may not be an MD5 digest of the object data (depends if it’s encrypted)
Etag represents a specific version of an object
ETags are useful if you want to programmatically detect content changes to S3 objects
What is an ETag in the context of S3 objects? (A response header representing a hash of the object that reflects changes in its content)
What aspect of an S3 object does its ETag reflect changes to? (The ETag of an S3 object reflects changes to the object's content, not its metadata, and represents a specific version of that object.)
resource "aws_s3_bucket" "default" {
}
resource "aws_s3_object" "object" {
bucket = aws_s3_bucket.default.id
key = "myfile.txt"
source = "myfile.txt"
etag = filemd5("myfile.txt")
}
terraform plan
terraform apply --auto-approve
terraform destroy --auto-approve
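A small shell sketch of using the ETag to detect content changes programmatically, assuming an unencrypted, non-multipart object (only then is the ETag a plain MD5 digest); the bucket and key names are placeholders:

```sh
# Compare the local MD5 against the object's ETag
LOCAL_MD5=$(md5sum myfile.txt | awk '{print $1}')
REMOTE_ETAG=$(aws s3api head-object --bucket my-example-bucket-ab --key myfile.txt \
  --query ETag --output text | tr -d '"')
if [ "$LOCAL_MD5" = "$REMOTE_ETAG" ]; then
  echo "object unchanged"
else
  echo "object content differs"
fi
```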
S3 Object — Checksums
What is a Checksum?
A checksum is used to verify the data integrity of a file. If data is lost or mangled in transit during an upload or download, the checksum will reveal that something is wrong with the file.
Amazon S3 uses checksums to verify data integrity of files on the upload or download.
AWS allows you to choose the checksum algorithm when uploading an object
Amazon S3 offers the following checksum algorithms
CRC32 (Cyclic Redundancy Check)
CRC32C
SHA1 (Secure Hash Algorithms)
SHA256
What is the purpose of a checksum in AWS S3? (To ensure the data integrity of a file by verifying its contents.)
Can the checksum algorithm be altered when uploading files to Amazon S3? (Yes, the checksum algorithm can be altered when uploading files to Amazon S3, allowing users to choose from several algorithms such as CRC32, CRC32C, SHA1, and SHA256.)
aws s3 mb s3://checksum-examples-my-1234
echo "Hello Mars" > myfile.txt
md5sum myfile.txt
aws s3 cp myfile.txt s3://checksum-examples-my-1234
aws s3api head-object --bucket checksum-examples-my-1234 --key myfile.txt
sudo apt install rhash -y
rhash --crc32 --simple myfile.txt
sha1sum myfile.txt
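To choose the checksum algorithm yourself, `put-object` accepts `--checksum-algorithm`; a short sketch reusing the bucket above:

```sh
# Upload with a SHA-256 checksum computed and stored by S3
aws s3api put-object \
  --bucket checksum-examples-my-1234 \
  --key myfile.txt \
  --body myfile.txt \
  --checksum-algorithm SHA256
# Ask head-object to return the stored checksum
aws s3api head-object \
  --bucket checksum-examples-my-1234 \
  --key myfile.txt \
  --checksum-mode ENABLED
```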
S3 Object — Prefixes
S3 Object prefixes are strings that precede the object filename and are part of the object key name.
Since all objects in a bucket are stored in a flat hierarchy, object prefixes allow for a way to organize, group and filter objects.
A prefix uses the forward slash "/" as a delimiter to group similar data, similar to directories (folders) or subdirectories. Prefixes are not true folders.
There is no limit on the number of delimiters; the only limit is that the object key name (prefix and filename combined) cannot exceed 1024 bytes.
What is the role of prefixes in Amazon S3? (To organize, group, and filter objects in a bucket)
Which delimiter is used in S3 object prefixes to create a pseudo-folder hierarchy? (The forward slash (/) is used as a delimiter in S3 object prefixes to simulate directories or subdirectories, helping in grouping similar data.)
aws s3 mb s3://prefixes-fun-ab-5235
aws s3api put-object --bucket="prefixes-fun-ab-5235" --key="hello/"
A key can be up to 1024 characters deep, e.g.:
aws s3api put-object --bucket="prefixes-fun-ab-5235" --key="Lorem/ipsum/dolor/sit/amet/consectetur/adipiscing/elit/Nunc/id/facilisis/dolor/Donec/laoreet/odio/ac/bibendum/eleifend/Ut/nunc/massa/finibus/vitae/hendrerit/ac/aliquam/in/ligula/Vestibulum/eu/nibh/eget/nisl/aliquet/elementum/id/non/massa/Praesent/sed/dolor/facilisis/imperdiet/justo/ut/varius/urna/Cras/lacinia/lacinia/diam/sed/convallis/nisi/vehicula/sit/amet/Mauris/lacinia/rutrum/justo/a/consectetur/dolor/maximus/et/Duis/condimentum/dignissim/ligula/et/sollicitudin/Mauris/non/convallis/nisi/eget/vestibulum/est/Aliquam/faucibus/vestibulum/lacus/vitae/sagittis/nulla/blandit/quis/Vivamus/vel/justo/a/nisi/bibendum/varius/ac/vitae/urna/Nullam/et/lorem/metus/Praesent/lorem/mi/laoreet/eget/tincidunt/et/vestibulum/eget/erat/Aenean/nisl/ante/lobortis/vel/orci/sit/amet/commodo/viverra/mauris/Fusce/at/ipsum/at/ex/facilisis/ultrices/et/vel/augue/Etiam/vitae/nulla/sit/amet/risus/sagittis/pharetra/ullamcorper/vitae/mi/Nullam/eget/mollis/urna/non/malesuada/dui/Morbi/porta/nunc/et/ipsum/libero/wra/dsf/dfs/gf/dhg/gfh/jgidngi/"
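A short sketch of listing by prefix with the / delimiter, reusing the bucket above, so "sub-folders" come back as CommonPrefixes:

```sh
# Keys directly under the hello/ prefix, plus any deeper "sub-folders"
aws s3api list-objects-v2 \
  --bucket prefixes-fun-ab-5235 \
  --prefix "hello/" \
  --delimiter "/" \
  --query '{Objects: Contents[].Key, SubFolders: CommonPrefixes[].Prefix}'
```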
S3 Object — Metadata
What is Metadata?
Metadata provides information about other data but not the contents itself.
Metadata is useful for:
categorizing and organizing data
providing context about data
Amazon S3 allows you to attach metadata to S3 objects at any time
Metadata can be either:
System defined
User-defined
Resource Tags and Object tags are similar to Metadata, but tags are intended to provide information about cloud resources (eg. S3 Objects) and not the contents of the object.
What is the primary purpose of metadata in the context of AWS S3 objects? (To provide information about the data without describing the content itself.)
What are the two types of metadata that can be attached to an AWS S3 Object? (System-defined and user-defined metadata.)
S3 Object — System Defined Metadata
System Defined Metadata is data that only Amazon can control.
Users usually* cannot set their own values for these metadata entries.
Content Type: image/jpeg
Cache Control: max-age=3600, must-revalidate
Content Disposition: attachment; filename='example.pdf’
Content-Encoding: gzip
Content-Language: en-US
Expires: Thu, 01 Dec 2030 16:00:00 GMT
X-amz-website-redirection-location: /new-page.html
*Some system defined metadata can be modified by the user eg. Content Type
AWS will attach some System defined metadata even if you do not specify any.
Which of the following is an example of AWS S3 System Defined Metadata? (Content Type: image/jpeg)
What is System Defined Metadata in AWS S3? (System Defined Metadata in AWS S3 includes metadata like Content Type, Cache Control, Content Disposition, etc., that is automatically attached by AWS and usually cannot be modified by the user, although there are exceptions like Content Type.)
S3 — User Defined Metadata
User Defined Metadata is set by the user and must start with x-amz-meta- eg.
Access and Security:
x-amz-meta-encryption: "AES-256"
x-amz-meta-access-level: "confidential"
x-amz-meta-expiration-date: "2024-01-01"
Media File:
x-amz-meta-camera-model: "Canon EOS 5D"
x-amz-meta-photo-taken-on: "2023-06-10"
x-amz-meta-location: "New York City"
Custom Application:
x-amz-meta-app-version: "2.4.1"
x-amz-meta-data-imported: "2023-03-15"
x-amz-meta-source: "CRM System"
Project-specific:
x-amz-meta-project-id: "PRJ12345"
x-amz-meta-department: "Marketing"
x-amz-meta-reviewed-by: "Jane Smith"
Document Versioning:
x-amz-meta-version: "v3.2"
x-amz-meta-last-modified-by: "Alice Johnson"
x-amz-meta-original-upload: "2023-02-01"
Content-related:
x-amz-meta-title: "Annual Sales Report 2023"
x-amz-meta-author: "John Doe"
x-amz-meta-description: "Detailed sales …"
Compliance and Legal:
x-amz-meta-legal-hold: "true"
x-amz-meta-compliance-category: "GDPR"
x-amz-meta-retention-period: "5 years"
Backup and Archival:
x-amz-meta-backup-status: "Completed"
x-amz-meta-archive-date: "2023-04-20"
x-amz-meta-recovery-point: "2023-04-15"
Which metadata key is used to specify the encryption type for a file stored in Amazon S3? (x-amz-meta-encryption)
In the context of Amazon S3 user-defined metadata, how is the location where a photo was taken specified? (The location where a photo was taken is specified using the key "x-amz-meta-location" in the user-defined metadata.)
aws s3 mb s3://metadata-fun-ab-12421
echo "Hello Mars" > hello.txt
aws s3api put-object --bucket metadata-fun-ab-12421 --key hello.txt --body hello.txt --metadata Planet=Mars
aws s3api head-object --bucket metadata-fun-ab-12421 --key hello.txt
aws s3 rm s3://metadata-fun-ab-12421/hello.txt
aws s3 rb s3://metadata-fun-ab-12421
What is WORM?
Write once read many (WORM) is a storage compliance feature that makes data immutable: you write once, and the file can never be modified or deleted, but you may read it an unlimited number of times.
WORM is useful in healthcare or financial industries where files need to be audited and untampered
An example of WORM is a video game cartridge, where data is written permanently to a ROM (read-only memory).
You can play the game (read the data) as many times as you want, but you can't change the data.
What does WORM stand for in the context of data storage? (Write Once, Read Many)
Why is WORM important in industries like healthcare and finance? (WORM is crucial in such industries because it ensures data integrity and immutability, which is vital for audit trails and compliance, preventing tampering and ensuring data remains unchanged.)
S3 Object Lock
S3 Object Lock allows you to prevent the deletion of objects in a bucket
This feature can only be turned on at the creation of a bucket
Object Lock is for companies that need to prevent objects being deleted to have:
Data integrity
Regulatory compliance
S3 Object Lock is SEC 17a-4, CFTC, and FINRA regulation compliant
You can store objects using a write-once-read-many (WORM) model just like S3 Glacier.
You can use it to prevent an object from being deleted or overwritten for:
a fixed amount of time
or indefinitely.
Object retention is handled two different ways:
Retention periods — fixed period of time during which an object remains locked.
Legal holds — remains locked until you remove the hold
S3 buckets with Object Lock can't be used as destination buckets for server access logs
S3 Object Lock — Setting Object Lock on Objects
Object Lock settings on individual objects can only be set via the AWS API (e.g. CLI, SDK) and not the AWS Console
This is to avoid misconfiguration by non-technical users locking objects, helping ensure compliance and data integrity
What is the primary purpose of using AWS S3 Object Lock? (To prevent the deletion of objects in a bucket)
What are the two ways to handle object retention in AWS S3 Object Lock? (1. Retention periods (a fixed period of time during which an object remains locked) 2. Legal holds (where an object remains locked until the hold is removed))
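A hedged CLI sketch of the workflow described above (bucket name, key, and date are placeholders): Object Lock must be enabled when the bucket is created, after which retention periods or legal holds can be applied per object.

```sh
# Object Lock can only be enabled at bucket creation
aws s3api create-bucket \
  --bucket object-lock-example-12345 \
  --object-lock-enabled-for-bucket

# Retention period: the object is locked until a fixed date
aws s3api put-object-retention \
  --bucket object-lock-example-12345 \
  --key report.pdf \
  --retention 'Mode=GOVERNANCE,RetainUntilDate=2030-01-01T00:00:00Z'

# Legal hold: the object stays locked until the hold is removed
aws s3api put-object-legal-hold \
  --bucket object-lock-example-12345 \
  --key report.pdf \
  --legal-hold Status=ON
```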
Amazon S3 Bucket URI
The S3 Bucket URI (Uniform Resource Identifier) is a way to reference the address of an S3 bucket and S3 objects.
s3://myexamplebucket/photo.jpg
You’ll see this S3 Bucket URI required to be used for specific AWS CLI commands
aws s3 cp test.txt s3://mybucket/test2.txt
In the S3 Bucket URI 's3://myexamplebucket/photo.jpg', what does 'myexamplebucket' represent? (The name of the S3 bucket)
What does an Amazon S3 Bucket URI primarily reference? (An Amazon S3 Bucket URI primarily references the address of an S3 bucket or objects within that bucket.)
AWS S3 CLI
aws s3
A high-level way to interact with S3 buckets and objects
aws s3api
A low-level way to interact with S3 buckets and objects
aws s3control
Manages S3 Access Points, S3 on Outposts buckets, S3 Batch Operations, and Storage Lens.
aws s3outposts
Manage endpoints for S3 outposts
Which AWS CLI command is a high-level way to interact with S3 buckets and objects? (aws s3)
What functionalities does aws s3control offer in the context of AWS S3? (aws s3control is used for managing S3 access points, S3 Outposts buckets, S3 batch operations, and storage lens, offering advanced control over specific S3 features and configurations.)
S3 – Request Styles
When making requests by using the REST API there are two styles of request:
Virtual hosted-style requests — the bucket name is a subdomain on the host
Path-style requests — the bucket name is in the request path
Virtual hosted–Style request
DELETE /puppy.jpg HTTP/1.1
Host: examplebucket.s3.us-west-2.amazonaws.com
Date: Mon, 11 Apr 2016 12:00:00 GMT
x-amz-date: Mon, 11 Apr 2016 12:00:00 GMT
Authorization: authorization string
Path-style request
DELETE /examplebucket/puppy.jpg HTTP/1.1
Host: s3.us-west-2.amazonaws.com
Date: Mon, 11 Apr 2016 12:00:00 GMT
x-amz-date: Mon, 11 Apr 2016 12:00:00 GMT
Authorization: authorization string
S3 supports both virtual hosted-style and path-style URLs. Path-style URLs will be discontinued in the future.
To force AWS CLI to use Virtual hosted-style requests, you need to globally configure the CLI
aws configure set s3.addressing_style virtual
Which of the following describes a virtual hosted-style request in Amazon S3? (The bucket name is a subdomain of the host.)
How do you configure the AWS CLI to use virtual hosted-style requests by default for Amazon S3? (aws configure set s3.addressing_style virtual)
S3 Storage Classes Overview
AWS offers a range of S3 storage classes that trade Retrieval Time, Accessibility and Durability for Cheaper Storage
S3 Standard (default)
Fast, Available and Durable.
S3 Reduced Redundancy Storage (RRS) legacy storage class
S3 Intelligent Tiering
Uses ML to analyze object usage and determine storage class. Extra fee to analyze
S3 Express One-Zone
single-digit ms performance, special bucket type, one AZ, 50% less than Standard cost
S3 Standard-IA (Infrequent Access)
Fast, Cheaper if you access less than once a month.
Extra fee to retrieve. 50% less than Standard (reduced availability)
S3 One-Zone-IA
Fast. Objects only exist in one AZ. 20% cheaper than Standard-IA
(Reduced resilience) Data could be destroyed if the AZ fails. Extra fee to retrieve.
S3 Glacier Instant Retrieval
For long-term cold storage. Get data instantly
S3 Glacier Flexible Retrieval
takes minutes to hours to get data (Expedited, Standard, Bulk retrieval)
S3 Glacier Deep Archive
The lowest cost storage class. Data retrieval time is 12 hours.
S3 Outposts has its own storage class
What does AWS S3 Intelligent Tiering do? (Analyzes object usage using machine learning to determine the most cost-effective storage class.)
What is the primary benefit of using AWS S3 Standard storage class? (S3 Standard offers fast access, high availability, and high durability, making it suitable for frequently accessed data.)
S3 Storage Classes — Standard
S3 Storage Standard is the default storage class when you upload to S3.
It’s designed for general purpose storage for frequently accessed data
High Durability: 11 9’s of durability (99.999999999%)
High Availability: 4 9’s of availability (99.99%)
Data Redundancy: Data stored in 3 or more Availability Zones (AZs)
Retrieval Time: within milliseconds (low latency)
High Throughput: optimized for data that is frequently accessed and/or requires real-time access
Scalability: easily scales to storage size and number of requests
Use Cases: Ideal for a wide range of use cases like content distribution, big data analytics, and mobile and gaming applications, where frequent access is required.
Pricing:
Storage per GB
Per Requests
No Retrieval fee
No minimum storage duration charge
What is the durability guarantee of Amazon S3 Storage Standard? (99.999999999%)
What are the primary use cases for S3 Storage Standard? (S3 Storage Standard is ideal for content distribution, big data analytics, and mobile and gaming applications, where frequent access is required.)
S3 Storage Classes — RRS
S3 Reduced Redundancy Storage (RRS) is a legacy storage class to store noncritical, reproducible data at lower levels of redundancy than Amazon S3’s standard storage
RRS was introduced in 2010, and at the time, it was cheaper than Standard storage.
In 2018, S3 Standard infrastructure changed, and the cost of S3 Standard storage fell well below the cost of RRS.
RRS currently provides no cost-benefit to customers for the reduced redundancy and has no place in modern storage use cases
RRS is no longer cost-effective and is not recommended for use. It may appear in the AWS Console as an option due to legacy customers
Why is RRS no longer recommended for use in modern storage use-cases? (It is more expensive than S3 Standard without offering any cost-benefit)
What was the primary goal of introducing Amazon S3 Reduced Redundancy Storage (RRS) in 2010? (To store noncritical, reproducible data at lower levels of redundancy than Amazon S3’s standard storage, offering a cost-effective solution for such data.)
S3 Storage Classes — Standard-IA
S3 Standard-IA (Infrequent Access) storage class is designed for data that is less frequently accessed but requires rapid access when needed.
High Durability: 11 9's of durability like S3 Standard
High Availability: 3 9's of availability (99.9%), lower than S3 Standard
Data Redundancy: Data stored in 3 or more Availability Zones (AZs)
Cost-Effective Storage: costs 50% less than Standard, as long as you don't access a file more than once a month!
Retrieval Time: within milliseconds (low latency)
High Throughput: optimized for rapid access, although the data is accessed less frequently compared to S3 Standard.
Scalability: easily scales to storage size and number of requests like S3 Standard
Use Cases: Ideal for data that is accessed less frequently but requires quick access when needed, such as disaster recovery, backups, or long-term data stores where data is not frequently accessed.
Pricing:
Storage per GB
Per Requests
Has a Retrieval fee
Has a minimum storage duration charge of 30 days
What is the minimum storage duration charge for S3 Standard-IA? (30 days)
What is the data redundancy level of Amazon S3 Standard-IA? (S3 Standard-IA stores data in 3 or more Availability Zones (AZs), ensuring high data redundancy.)
Change S3 storage class.
aws s3 cp test.txt s3://cadent-bucket --storage-class STANDARD_IA
S3 Storage Classes — Express One Zone
Amazon S3 Express One Zone delivers consistent single-digit millisecond data access for your most frequently accessed data and latency-sensitive applications
the lowest latency cloud object storage class available
data access speeds up to 10x faster than S3 Standard
request costs 50% lower than S3 Standard
data is stored in a user selected single Availability Zone (AZ)
data is stored in a new bucket type: an Amazon S3 Directory bucket
S3 Directory buckets support a simple real-folder structure; by default you are only allowed 10 S3 Directory buckets per account
Express One Zone applies a flat per request charge for request sizes up to 512 KB.
Which of the following best describes Amazon S3 Express One Zone's data access speed compared to S3 Standard? (Up to 10x faster than S3 Standard)
What is the limitation on the number of S3 Directory buckets you can have per account in Amazon S3 Express One Zone?(You are allowed only 10 S3 Directory buckets per account in Amazon S3 Express One Zone.)
S3 Storage Classes — One-Zone-IA
S3 One-Zone IA (Infrequent Access) storage class is designed for data that is less frequently accessed and has additional saving at reduced availability.
High Durability: 11 9's of durability like S3 Standard and S3 Standard IA
Lower Availability: 99.5%. Since it's in a single AZ, it has even lower availability than Standard IA
Cost-Effective Storage: costs 20% less than Standard IA
Data Redundancy: Data is stored in a single Availability Zone (AZ); there is a risk of data loss if that AZ is destroyed.
Retrieval Time: within milliseconds (low latency)
Use Cases: Ideal for secondary backup copies of on-premises data, or for storing data that can be recreated in the event of an AZ failure. It's also suitable for storing infrequently accessed data that is non mission-critical.
Pricing:
Storage per GB
Per Requests
Has a Retrieval fee
Has a minimum storage duration charge of 30 days
What is the availability percentage of the S3 One-Zone-IA storage class? (99.5%)
What are ideal use cases for S3 One-Zone-IA storage class? (Ideal for secondary backup copies of on-premises data, storing data that can be recreated in the event of an AZ failure, and for storing infrequently accessed, non mission-critical data.)
S3 Glacier Storage Classes vs S3 Glacier “Vault”
S3 Glacier “Vault”
"S3" Glacier is a stand-alone service from S3 that uses vaults instead of buckets to store data long term.
S3 Glacier is the original vault service.
It has vault control policies
Most interactions occur via the AWS CLI
Enterprises are still using S3 Glacier Vault
S3 Glacier Deep Archive is part of S3 Glacier “Vault”
S3 Glacier Storage Classes
S3 Glacier storage classes offer similar functionality to S3 Glacier Vault but with greater convenience and flexibility, all within S3 buckets
S3 Glacier Instant Retrieval: a newer class with no attachment to S3 Glacier Vault
S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive: these two use S3 Glacier Vault underneath
Which S3 Glacier Storage Class is directly using S3 Glacier Vault underneath? (S3 Glacier Deep Archive, S3 Glacier Flexible)
How do S3 Glacier Storage Classes differ from the original S3 Glacier service? (S3 Glacier Storage Classes offer more flexibility and convenience, integrating directly with S3 Buckets, unlike the original S3 Glacier service that uses vaults.)
S3 Storage Classes — Glacier Instant Retrieval
S3 Glacier Instant Retrieval is a storage class designed for rarely accessed data that still needs immediate access in performance-sensitive use cases.
High Durability: 11 9's of durability like S3 Standard
High Availability: 3 9’s of availability (99.9%) like S3 Standard IA
Cost-Effective Storage: 68% lower cost than Standard-IA, for long-lived data that is accessed about once per quarter
Retrieval Time: within milliseconds (low latency)
Use Cases: Rarely access data that needs immediate access eg. image hosting, online file-sharing applications, medical imaging and health records, news media assets, and satellite and aerial imaging.
Pricing:
Storage per GB
Per Requests
Has a Retrieval fee
Has a minimum storage duration charge of 90 days
Glacier Instant Retrieval is not a separate service and does not require a Vault
Which of the following is a use case for S3 Glacier Instant Retrieval? (Medical imaging and health records, Satellite and aerial imaging, Image hosting)
What is the minimum storage duration charge for Amazon S3 Glacier Instant Retrieval? (The minimum storage duration charge for Amazon S3 Glacier Instant Retrieval is 90 days.)
S3 Storage Classes — Glacier Flexible Retrieval
S3 Glacier Flexible Retrieval (formerly S3 Glacier) combines S3 and Glacier into a single set of APIs. It's considerably faster than Glacier Vault-based storage.
There are 3 retrieval tiers (the faster, the more expensive):
Expedited Tier: 1-5 mins. For urgent requests. Limited to 250 MB archive size
Standard Tier: 3-5 hours. No archive size limit. This is the default option
Bulk Tier: 5-12 hours. No archive size limit, even petabytes worth of data. Bulk retrieval is free of charge (no cost per GB).
You pay per GB retrieved and per request. This is a separate cost from the cost of storage.
Archived objects will have an additional 40KBs of data:
32KB for archive index and archive metadata information
8KB for the name of the object
You should store fewer, larger files instead of many small files; 40 KB of overhead on thousands of files adds up.
Glacier Flexible Retrieval is not a separate service and does not require a Vault
Which retrieval tier in S3 Glacier Flexible Retrieval does not have an archive size limit and is considered the default option? (Standard Tier)
What are the three retrieval tiers available in S3 Glacier Flexible Retrieval, and what is the retrieval speed for each? (Expedited (1-5 minutes), Standard (3-5 hours), Bulk (5-12 hours))
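Objects in the Glacier Flexible Retrieval storage class are restored with `restore-object`, choosing one of the tiers above; a sketch with placeholder bucket and key names:

```sh
# Ask for a temporary restored copy, kept for 7 days, via the Standard tier
aws s3api restore-object \
  --bucket my-archive-bucket \
  --key logs/2023-archive.tar.gz \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

# Poll restore status; the Restore field reports ongoing-request true/false
aws s3api head-object --bucket my-archive-bucket --key logs/2023-archive.tar.gz --query Restore
```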
S3 Storage Classes — Glacier Deep Archive
S3 Glacier Deep Archive combines S3 and Glacier into a single set of APIs.
It's more cost-effective than S3 Glacier Flexible Retrieval, but with a greater cost of retrieval.
There are two retrieval tiers:
Standard Tier restores within 12 hours
No archive size limit. This is the default option
There is no expedited tier for Glacier Deep Archive
Bulk Tier restores within 48 hours
No archive size limit, even petabytes worth of data.
Archived objects will have an additional 40KBs of data:
32KB for index and metadata information
8KB for the name of the object
Glacier Deep Archive is not a separate service and does not require a Vault
What is the retrieval time for the Standard Tier in S3 Glacier Deep Archive? (Within 12 hours)
Is there an expedited retrieval tier available in S3 Glacier Deep Archive? (No)
S3 Storage Classes – Intelligent-Tiering
S3 Intelligent-Tiering automatically moves objects into different storage tiers to optimize cost and performance. A small monthly fee applies for object monitoring and automation.
Key Features
Frequent Access Tier:
Default tier; objects remain as long as they are accessed.
Infrequent Access Tier:
Objects move here after 30 days of inactivity.
Archive Instant Access Tier:
Objects move here after 90 days of inactivity.
Optional Archive Tiers:
Archive Access Tier: After 90 days, access on demand.
Deep Archive Access Tier: After 180 days, access on demand.
Important Notes
Additional cost applies for analyzing objects over a 30-day period.
Ideal for data with changing or unknown access patterns.
AWS CLI Command Example
aws s3api put-object \
--bucket my-bucket \
--key 'myfile' \
--body 'path/to/local/file' \
--storage-class INTELLIGENT_TIERING
What does the S3 Intelligent-Tiering storage class do? (It automatically moves objects to the most cost-effective access tier based on how frequently they are accessed.)
After how many days of no access is an object moved from the Frequent Access tier to the Infrequent Access tier in S3 Intelligent-Tiering? (An object is moved from the Frequent Access tier to the Infrequent Access tier after 30 days of no access in S3 Intelligent-Tiering.)
Which AWS S3 storage class has the highest availability guarantee? (S3 Standard)
What is the first byte latency for the 'Glacier Deep Archive' storage class? (hours)
What is the minimum storage duration charge for the S3 Glacier Deep Archive? (180 days)
What does '11 9's' refer to in the context of Amazon S3 storage classes? (It refers to the durability of the storage, meaning 99.999999999% durability, indicating a very high level of reliability where data is very unlikely to be lost.)
Which S3 storage class offers the fastest first-byte latency? (S3 Express One-Zone)
S3 Security Overview
Bucket Policies: Define permissions for an entire S3 bucket using JSON-based access policy language.
Access Control Lists (ACLs): Provide a legacy method to manage access permissions on individual objects and buckets.
AWS PrivateLink for Amazon S3: Enables private network access to S3, bypassing the public internet for enhanced security.
Cross-Origin Resource Sharing (CORS): Allows restricted resources on a web page from another domain to be requested.
Amazon S3 Block Public Access: Offers settings to easily block public access to all your S3 resources.
IAM Access Analyzer for S3: Analyzes resource policies to help identify and mitigate potential access risks.
Internetwork Traffic Privacy: Ensures data privacy by encrypting data moving between AWS services and the Internet.
Object Ownership: Manages data ownership between AWS accounts when objects are uploaded to S3 buckets.
Access Points: Simplifies managing data access at scale for shared datasets in S3.
Access Grants: Provides access to S3 data via a directory service, e.g. Active Directory.
Versioning: Preserves, retrieves, and restores every version of every object stored in an S3 bucket.
MFA Delete: Adds an additional layer of security by requiring MFA for the deletion of S3 objects.
Object Tags: Provides a way to categorize storage by assigning key-value pairs to S3 objects.
In-Transit Encryption: Protects data by encrypting it as it travels to and from S3 over the internet.
Server-Side Encryption: Automatically encrypts data when writing it to S3 and decrypts it when downloading.
Client-Side Encryption: Encrypts data client-side before uploading to S3 and decrypts it after downloading.
Compliance Validation for Amazon S3: Ensures S3 services meet compliance requirements like HIPAA, GDPR, etc.
Infrastructure Security: Protects the underlying infrastructure of the S3 service, ensuring data integrity and availability.
Which of the following AWS services enables private network access to S3, bypassing the public internet for enhanced security? (AWS PrivateLink for Amazon S3)
What is the role of Bucket Policies in Amazon S3? (Bucket Policies define permissions for an entire S3 bucket using a JSON-based access policy language, enabling administrators to manage access at the bucket level.)
Amazon S3 Block Public Access
Overview
Block Public Access is a safety feature enabled by default to block all public access to an S3 bucket.
Unrestricted S3 bucket access is the #1 security misconfiguration.
Key Features
The Block all public access setting prevents public access granted through:
New Access Control Lists (ACLs)
Any Access Control Lists (ACLs)
New Bucket Policies or Access Points
Any Bucket Policies or Access Points
Important Notes
Access Points can have independent Block Public Access settings.
Which of the following is considered the #1 security misconfiguration for Amazon S3 buckets? (Unrestricted access to S3 buckets)
Can access points have their own independent Block Public Access setting in Amazon S3? (Yes, access points can have their own independent Block Public Access setting in Amazon S3, allowing for more granular control over data access and security.)
What is the primary function of the Amazon S3 Block Public Access feature? (Block all public access to an S3 bucket by default, enhancing the security of the data stored within.)
Amazon S3 — Access Control Lists (ACLs)
ACLs grant basic read/write permissions to other AWS accounts
you can grant permissions only to other AWS accounts
you cannot grant permissions to users in your account
You cannot grant conditional permissions
You cannot explicitly deny permissions
S3 ACLs have traditionally been used to allow other AWS accounts to upload objects to a bucket
Access Control Lists (ACLs) are a legacy feature of S3; there are more robust ways to provide cross-account access via bucket policies and access points.
What is the primary function of Amazon S3 Access Control Lists (ACLs)? (To grant basic read/write permissions to other AWS accounts)
Can you grant permissions to users in your own AWS account using S3 ACLs? (No)
## Create a new bucket
```sh
aws s3api create-bucket --bucket acl-example-ab-5235 --region us-east-1
```
## Turn off Block Public Access for ACLs
```sh
aws s3api put-public-access-block \
--bucket acl-example-ab-5235 \
--public-access-block-configuration "BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=true,RestrictPublicBuckets=true"
```
```sh
aws s3api get-public-access-block --bucket acl-example-ab-5235
```
## Change Bucket Ownership
```sh
aws s3api put-bucket-ownership-controls \
--bucket acl-example-ab-5235 \
--ownership-controls="Rules=[{ObjectOwnership=BucketOwnerPreferred}]"
```
## Change ACLs to allow for a user in another AWS Account
```sh
aws s3api put-bucket-acl \
--bucket acl-example-ab-5235 \
--access-control-policy file:///workspace/AWS-Examples/s3/acls/policy.json
```
## Access Bucket from other account
```sh
touch bootcamp.txt
aws s3 cp bootcamp.txt s3://acl-example-ab-5235
aws s3 ls s3://acl-example-ab-5235
```
## Cleanup
```sh
aws s3 rm s3://acl-example-ab-5235/bootcamp.txt
aws s3 rb s3://acl-example-ab-5235
```

s3/acls/policy.json
{
"Grants": [
{
"Grantee": {
"DisplayName": "andrewbrown",
"ID": "e602ac4aeb23a4d25642f95dc2fbc085279cf5b30824067afbb329d7eeb49fe5",
"Type": "CanonicalUser"
},
"Permission": "FULL_CONTROL"
}
],
"Owner": {
"DisplayName": "andrewbrown",
"ID": "74ababf54b6810c1d34431ceee560f5de666617490e539850338964d29c30eef"
}
}
Amazon S3 – Bucket Policies
S3 Bucket Policy
A resource-based policy to grant access to an S3 bucket and its objects to other principals, e.g. AWS accounts, users, or AWS services.
Example Policies
Only allow a specific role to read objects with prod object tag
{
"Version":"2012-10-17",
"Statement":[{
"Principal":{
"AWS":"arn:aws:iam::123456789012:role/ProdTeam"
},
"Effect":"Allow",
"Action":[
"s3:GetObject",
"s3:GetObjectVersion"
],
"Resource":"arn:aws:s3:::my-bucket/*",
"Condition":{
"StringEquals":{
"s3:ExistingObjectTag/environment":"prod"
}
}
}]
}
Principal: The entity (AWS account, IAM user, or role) that is allowed or denied access to a resource.
Effect: Specifies whether the statement results in an allow or an explicit deny.
Action: List of actions that are allowed or denied.
Resource: Specifies the object(s) the policy applies to.
Condition: Specifies conditions for when the policy is in effect.
Restrict access to a specific IP
{
"Version":"2012-10-17",
"Id":"S3PolicyId1",
"Statement":[{
"Sid":"IPAllow",
"Effect":"Deny",
"Principal":"*",
"Action":"s3:*",
"Resource":[
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
],
"Condition":{
"NotIpAddress":{
"aws:SourceIp":"192.0.2.0/24"
}
}
}]
}
Principal: * means all principals (everyone).
Effect: Deny specifies that all actions are denied unless they match the NotIpAddress condition.
Condition: NotIpAddress restricts access to specific IP ranges.
Action: s3:* indicates all S3 actions are covered.
Important Terms
resource-based policy: A policy attached directly to an AWS resource (S3 bucket).
Effect: Can be Allow or Deny.
Principal: The entity that is allowed or denied access.
Action: The specific action(s) that are allowed or denied.
Resource: The resource(s) to which the policy applies.
Condition: Additional conditions that must be met for the policy to apply.
What is an Amazon S3 Bucket Policy primarily used for? (To grant or deny permissions to S3 bucket and objects.)
What is a resource-based policy in Amazon S3? (A policy attached directly to an AWS resource like an S3 bucket.)
How can you restrict access to an S3 bucket to a specific IP address? (You can restrict access to a specific IP range by creating a policy statement with "Effect": "Deny" and a NotIpAddress condition on aws:SourceIp.)
## Create a bucket
aws s3 mb s3://bucket-policy-example-ab-5235
## Create bucket policy
aws s3api put-bucket-policy --bucket bucket-policy-example-ab-5235 --policy file://policy.json
## In the other account, access the bucket
touch bootcamp.txt
aws s3 cp bootcamp.txt s3://bucket-policy-example-ab-5235
aws s3 ls s3://bucket-policy-example-ab-5235
## Cleanup
aws s3 rm s3://bucket-policy-example-ab-5235/bootcamp.txt
aws s3 rb s3://bucket-policy-example-ab-5235
{
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::387543059434:user/andrewbrown"
},
"Action": [
"s3:ListBucket",
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::bucket-policy-example-ab-5235",
"arn:aws:s3:::bucket-policy-example-ab-5235/*"
]
}
]
}
S3 Bucket Policies vs IAM Policies
S3 Bucket policies have overlapping functionality as an IAM policy that grants access to S3.
S3 Bucket Policies provide convenience over IAM policies granting S3 access.
S3 Bucket Policy
Provides access to a specific bucket and its objects
You can specify multiple principals to grant access
Bucket Policies can be 20 KB in size
Block Public Access is turned on by default and will DENY all anonymous access even if the bucket policy grants it (unless the feature is turned off.)
IAM Policy
Provides access to many AWS services. Can provide permissions for multiple buckets in one policy
The principal, by default, is the entity the IAM policy is attached to
IAM policy sizes are limited based on the principal:
Users 2 KB
Groups 5 KB
Roles 10 KB
What is the primary advantage of using an S3 Bucket Policy over an IAM Policy for granting access to S3? (S3 Bucket Policies allow specifying multiple principals.)
What is the default setting of Block Public Access in S3 Bucket Policies, and what does it do? (Block Public Access is turned on by default in S3 Bucket Policies, denying all anonymous access even if the bucket policy otherwise grants it.)
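For comparison, a sketch of the IAM-policy side of the same access: an identity-based inline policy attached to a user (the user name, policy name, and bucket reuse earlier example values and are placeholders):

```sh
# Write the identity-based policy; note there is no Principal element --
# the principal is whoever the policy is attached to
cat > s3-access-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
    "Resource": [
      "arn:aws:s3:::bucket-policy-example-ab-5235",
      "arn:aws:s3:::bucket-policy-example-ab-5235/*"
    ]
  }]
}
EOF
# Attach it as an inline policy on the user
aws iam put-user-policy \
  --user-name andrewbrown \
  --policy-name S3BucketAccess \
  --policy-document file://s3-access-policy.json
```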
Amazon S3 — Access Grants
Amazon S3 Access Grants lets you map identities in a directory service
(IAM Identity Center, Active Directory, Okta) to access datasets in S3.
Access Grants instance: a logical container for individual grants
Location: determines which bucket data you access
Grant: the granular scope of access within a location
Level of access:
READ
WRITE
READWRITE
What will have access:
IAM role
IAM user
Bucket objects eg. mybucket/scifi/*
Grantees request just-in-time access credentials
Which of the following services allows you to map identities in a directory service to access datasets in Amazon S3? (Amazon S3 Access Grants)
What types of access levels can be defined within Amazon S3 Access Grants? (READ, WRITE, and READWRITE)
IAM Access Analyzer for S3
Access Analyzer for S3 will alert you when your S3 buckets are exposed to the Internet or other AWS Accounts
In order to use Access Analyzer for S3, you need to first create an analyzer in IAM Access Analyzer at the account level.
View buckets per region
Download a report
What is the primary function of Access Analyzer for S3? (To alert users when their S3 buckets are exposed to the Internet or other AWS accounts)
What is one of the features provided by Access Analyzer for S3? (Ability to view buckets per region.)
Amazon S3 — Internetwork Traffic Privacy
What is Internetwork traffic privacy?
Internetwork traffic privacy is about keeping data private as it travels across different networks.
AWS PrivateLink (VPC Interface Endpoints)
Allows you to connect an Elastic Network Interface (ENI) directly to other AWS services eg. S3, EC2, Lambda
It can connect to select Third-Party services via the AWS Marketplace.
AWS PrivateLink can go cross-account
Has fine-grain permissions via VPC endpoint policies
There is a charge for using AWS PrivateLink
VPC Gateway Endpoint
Allows you to connect a VPC directly to S3 (or DynamoDB), staying private within the internal AWS network.
VPC Gateway Endpoint can not go cross-account
Does not have fine-grain permissions
There is no charge to use VPC Gateway Endpoints
What distinguishes AWS PrivateLink from VPC Gateway Endpoint in terms of cross-account functionality? (AWS PrivateLink can go cross-account, while VPC Gateway Endpoint cannot)
What is the main benefit of using AWS PrivateLink for connecting to AWS services like S3 or EC2? (AWS PrivateLink provides a private connection between AWS services and your VPC, bypassing the public internet, which enhances security and reduces exposure to threats.)
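A sketch of creating both endpoint types with the EC2 API; the VPC, subnet, route table IDs, and region are placeholders:

```sh
# VPC Gateway Endpoint for S3 (no charge, route-table based)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0123456789abcdef0

# PrivateLink (interface endpoint) alternative -- note this one is billed
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.s3 \
  --subnet-ids subnet-0123456789abcdef0
```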
Cross-Origin Resource Sharing (CORS)
Cross-Origin Resource Sharing (CORS) is an HTTP-header based mechanism that allows a server to indicate any origins (domain, scheme, or port) other than its own from which a browser should permit loading of resources
Access is controlled via HTTP headers
Request Headers
Origin
Access-Control-Request-Method
Access-Control-Request-Headers
Response Headers
Access-Control-Allow-Origin
Access-Control-Allow-Credentials
Access-Control-Expose-Headers
Access-Control-Max-Age
Access-Control-Allow-Methods
Access-Control-Allow-Headers
CORS restricts which websites may access data to be loaded onto a page
What is the purpose of Cross-Origin Resource Sharing (CORS)? (To enable a server to specify which origins can access its resources.)
What is the role of the "Origin" request header in CORS? (The "Origin" request header is used by the browser to inform the server about the origin (domain, scheme, or port) of the request, allowing the server to determine whether to allow or deny the request based on its CORS policy.)
Amazon S3 CORS
Amazon S3 allows you to set CORS configuration to a S3 bucket with static website hosting so different origins can perform HTTP requests from your S3 Static website.
The CORS configuration can be in either JSON or XML
The AWS Console only accepts JSON.
In real-world use cases, you should not use a wildcard (allow all) origin, since this would negate the protection CORS is supposed to provide.
What does CORS stand for in the context of Amazon S3? (Cross-Origin Resource Sharing)
What is the purpose of CORS in the context of Amazon S3? (CORS (Cross-Origin Resource Sharing) allows different origins to perform HTTP requests from an Amazon S3 Static website, managing how resources can be requested from another domain.)
# Create Website 1
## Create a bucket
```sh
aws s3 mb s3://cors-fun-ab-36252
```
## Change block public access
```sh
aws s3api put-public-access-block \
--bucket cors-fun-ab-36252 \
--public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=false,RestrictPublicBuckets=false"
```
## Create a bucket policy
```sh
aws s3api put-bucket-policy --bucket cors-fun-ab-36252 --policy file://bucket-policy.json
```
## Turn on static website hosting
```sh
aws s3api put-bucket-website --bucket cors-fun-ab-36252 --website-configuration file://website.json
```
## Upload our index.html file and include a resource that would be cross-origin
```sh
aws s3 cp index.html s3://cors-fun-ab-36252
```
## View the website and see if the index.html is there.
This is for ca-central-1:
http://cors-fun-ab-36252.s3-website.ca-central-1.amazonaws.com
Other regions might use a hyphen instead:
http://cors-fun-ab-36252.s3-website-ca-central-1.amazonaws.com
# Create Website 2
```sh
aws s3 mb s3://cors-fun2-ab-36252
```
## Change block public access
```sh
aws s3api put-public-access-block \
--bucket cors-fun2-ab-36252 \
--public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=false,RestrictPublicBuckets=false"
```
## Create a bucket policy
```sh
aws s3api put-bucket-policy --bucket cors-fun2-ab-36252 --policy file://bucket-policy2.json
```
## Turn on static website hosting
```sh
aws s3api put-bucket-website --bucket cors-fun2-ab-36252 --website-configuration file://website.json
```
## Upload our JavaScript file
```sh
aws s3 cp hello.js s3://cors-fun2-ab-36252
```
## Create API Gateway with mock response and then test the endpoint
```sh
curl -X POST -H "Content-Type: application/json" https://1kccnjkm43.execute-api.ca-central-1.amazonaws.com/prod/hello
```
## Set CORS on our bucket
```sh
aws s3api put-bucket-cors --bucket cors-fun-ab-36252 --cors-configuration file://cors.json
```
s3/cors/bucket-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicReadGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::cors-fun-ab-36252/*"
]
}
]
}
s3/cors/bucket-policy2.json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicReadGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::cors-fun2-ab-36252/*"
]
}
]
}
s3/cors/cors.json
{
"CORSRules": [
{
"AllowedOrigins": ["https://1kccnjkm43.execute-api.ca-central-1.amazonaws.com"],
"AllowedHeaders": ["*"],
"AllowedMethods": ["PUT", "POST", "DELETE"],
"MaxAgeSeconds": 3000,
"ExposeHeaders": ["x-amz-server-side-encryption"]
}
]
}
s3/cors/hello.js
console.log('hello world')
s3/cors/index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My Website!!!!</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Rubik+Bubbles&display=swap" rel="stylesheet">
<style>
* {
font-family: 'Rubik Bubbles'
}
</style>
</head>
<body>
<main>
<h1>Welcome to My Website</h1>
<p>What do you think of my amazing Website!</p>
</main>
<script type="text/javascript">
const xhr = new XMLHttpRequest();
xhr.open("POST", "https://1kccnjkm43.execute-api.ca-central-1.amazonaws.com/prod/hello", true);
// Send the proper header information along with the request
xhr.setRequestHeader("Content-Type", "application/json");
xhr.onreadystatechange = () => {
if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200) {
const results = JSON.parse(xhr.responseText);
console.log('results',results)
}
};
xhr.send();
</script>
</body>
</html>
s3/cors/website.json
{
"IndexDocument": {
"Suffix": "index.html"
},
"ErrorDocument": {
"Key": "error.html"
}
}
S3 Encryption Overview
Encryption In Transit
When data is encrypted by the sender and then decrypted by the receiver.
Encryption-At-Rest
Client-Side Encryption (CSE)
when data is encrypted by the client and then sent to the server
The client has the key, the server will serve the encrypted file since it does not have the key to decrypt when data is requested
Server-Side Encryption (SSE)
when data is encrypted by the server
The server has the key to decrypt when data is requested
What does Encryption in Transit imply in the context of data transfer? (Data is encrypted by the sender and decrypted by the receiver.)
Who holds the encryption key in Client-Side Encryption (CSE)? (The client holds the encryption key. This means the data is encrypted on the client's side before being sent to the server, and only the client can decrypt it.)
S3 — Encryption In Transit
Encryption In-Transit
Data is secured while moving between locations. Algorithms: TLS, SSL
This encryption ensures that data remains confidential and cannot be intercepted or viewed by unauthorized parties while in transit.
Data will be encrypted sender-side
Data will be decrypted server-side
Transport Layer Security (TLS)
An encryption protocol for data integrity between two or more communicating computer applications. TLS 1.0 and 1.1 are deprecated; TLS 1.2 and TLS 1.3 are the current best practice
Secure Sockets Layers (SSL)
An encryption protocol for data integrity between two or more communicating computer applications. SSL 1.0, 2.0, and 3.0 are all deprecated
What is the primary purpose of encryption in transit? (To ensure data remains confidential and cannot be intercepted or viewed by unauthorized parties)
What ensures data remains confidential and cannot be intercepted or viewed by unauthorized parties while in transit? (Encryption in transit, specifically using algorithms like TLS and SSL, ensures data remains confidential and cannot be intercepted or viewed by unauthorized parties while in transit.)
Are older versions of SSL considered secure and recommended for use? (No, older versions of SSL (1.0, 2.0, and 3.0) are deprecated and not recommended for use due to security vulnerabilities.)
S3 — Server-Side Encryption
Server-Side Encryption (SSE) is enabled by default for all new S3 objects.
SSE-S3
Amazon S3 manages the keys, encrypting objects with the 256-bit Advanced Encryption Standard (AES-256)
SSE-KMS
AWS Key Management Service (KMS) manages the encryption keys. You can choose AWS-managed keys or manage customer master keys (CMKs) yourself through KMS.
SSE-C
You provide your own encryption keys, and Amazon S3 handles encryption and decryption using these customer-provided keys. You are responsible for managing the keys.
DSSE-KMS (Dual-layer Server-Side Encryption with KMS)
With DSSE-KMS, Amazon S3 applies two independent layers of server-side encryption to each object, using keys managed in AWS KMS. It is a specific S3 encryption option aimed at workloads that require multiple layers of encryption.
Server-side encryption only encrypts the contents of an object, not its metadata
Which server-side encryption option in Amazon S3 uses AWS Key Management Service (KMS) for managing the encryption keys? (SSE-KMS)
Which encryption algorithm is utilized by Amazon S3 when implementing server-side encryption with SSE-S3? (Amazon S3 uses the AES-GCM (256-bit) encryption algorithm when implementing server-side encryption with SSE-S3, where Amazon manages the encryption keys.)
Amazon S3 – SSE-S3
What is SSE-S3?
Amazon manages all encryption for objects stored in S3.
Each object is encrypted with a unique key.
Envelope encryption is used.
Keys are rotated automatically by Amazon.
By default, SSE-S3 is applied to all objects unless specified otherwise.
There are no additional charges for using SSE-S3.
Uses AES-256-bit Advanced Encryption Standard (AES-GCM) for securing data.
Explicit Configuration:
You can explicitly set SSE-S3 when uploading objects:
aws s3api put-object \
--bucket mybucket \
--key myfile \
--server-side-encryption AES256 \
--body myfile.txt
Default Behavior:
If no encryption configuration is specified, SSE-S3 is automatically applied:
aws s3api put-object \
--bucket mybucket \
--key myfile \
--body myfile.txt
Clarification About Bucket Keys:
Bucket Keys improve performance and reduce costs by caching encryption keys.
Note: This applies to SSE-KMS, not SSE-S3.
What does SSE-S3 stand for, and who manages the encryption in this context? (Server-Side Encryption-S3, where Amazon manages all the encryption)
How does Amazon S3 handle key rotation for SSE-S3 encryption? (Amazon S3 automatically rotates the key regularly when using SSE-S3 encryption.)
Amazon S3 - SSE-KMS
SSE-KMS is when you use a KMS key managed by AWS.
You first create a KMS managed key
You choose the KMS key to encrypt your object
KMS can automatically rotate keys
KMS key policy controls who can decrypt using the key
KMS can help meet regulatory compliance
KMS keys have their own additional costs
AWS KMS keys must be in the same Region as the bucket
AWS KMS keys can be multi-region, allowing the same key to be used in different regions, not limited to the bucket’s region.
To upload with KMS you need kms:GenerateDataKey
To download with KMS you need kms:Decrypt
Bucket Key can be set for SSE-KMS for improved performance
Using a KMS with aws s3api
aws s3api put-object \
--bucket mybucket \
--key example.txt \
--body example.txt \
--server-side-encryption "aws:kms" \
--ssekms-key-id 1234abcd-12ab-34cd-56ef-1234567890ab
Using a KMS key with aws s3
aws s3 cp example.txt s3://mybucket/example.txt \
--sse aws:kms \
--sse-kms-key-id 1234abcd-12ab-34cd-56ef-1234567890ab
What is required to upload an object with SSE-KMS encryption in Amazon S3? (kms:GenerateDataKey permission)
What is a key benefit of using AWS KMS for encryption key management in S3? (The ability to automatically rotate encryption keys, aiding in meeting regulatory compliance.)
Can AWS KMS keys be used across multiple regions for Amazon S3, and if so, how? (Yes, AWS KMS keys can be multi-region, allowing the same key to be used in different regions, which is not limited to the bucket's region.)
Amazon S3 – SSE-C
SSE-C is when you provide your own encryption key that Amazon S3 then uses to apply AES-256 encryption to your data.
You need to provide the encryption key every time you retrieve objects
The encryption key you upload is removed from Amazon S3 memory after each request
There is no additional charge to use SSE-C
Amazon S3 will store a randomly salted Hash-based Message Authentication Code (HMAC) of your encryption key to validate future requests
Presigned URLs support SSE-C
With bucket versioning, different object versions can be encrypted with different keys
Because you manage the encryption keys on the client side, you also manage any additional safeguards, such as key rotation, on the client side.
BASE64_ENCODED_KEY=$(openssl rand -base64 32)
aws s3api put-object \
--bucket mybucket \
--key myfile \
--body file://myfile.txt \
--sse-customer-algorithm AES256 \
--sse-customer-key $BASE64_ENCODED_KEY \
--sse-customer-key-md5 $(echo -n "$BASE64_ENCODED_KEY" | base64 --decode | openssl md5 -binary | base64)
What is required every time you retrieve an object using SSE-C in Amazon S3? (The same encryption key used for uploading)
What does Amazon S3 do with your encryption key after processing your SSE-C request? (Amazon S3 removes the encryption key from memory after each request and stores a salted HMAC of the key to validate future requests.)
Amazon S3 – DSSE-KMS
DSSE-KMS (dual-layer server-side encryption with AWS KMS keys) builds on SSE-KMS by applying a second, independent layer of encryption.
With DSSE-KMS, data is encrypted twice, server side
Both layers of encryption use data keys generated from an AWS KMS key
There are additional charges for DSSE and KMS keys
aws s3api put-object \
--bucket mybucket \
--key myfile \
--server-side-encryption aws:kms:dsse \
--ssekms-key-id "1234abcd-12ab-34cd-56ef-1234567890ab" \
--body filepath
Encrypt
A data encryption key (DEK) is requested from AWS KMS and generated under the chosen KMS key (CMK).
KMS returns two versions of the DEK: a plaintext version and an encrypted version.
The plaintext DEK is used to encrypt the data and is then discarded from memory.
The encrypted version of the DEK is stored alongside the encrypted data in Amazon S3.
Decrypt
The encrypted data and the associated encrypted DEK are retrieved.
The encrypted DEK is sent to AWS KMS, which decrypts it using the corresponding KMS key and returns the plaintext DEK.
The plaintext DEK is used to decrypt the data and is then discarded from memory.
What is the primary purpose of DSSE-KMS in Amazon S3? (To apply two independent layers of server-side encryption for data that requires extra protection)
How is a data encryption key (DEK) used in the DSSE-KMS encryption process? (AWS KMS generates a DEK and returns it in two forms: plaintext and encrypted. The plaintext DEK encrypts the data and is then discarded, while the encrypted DEK is stored alongside the data in Amazon S3.)
What is DSSE-KMS in the context of Amazon S3? (DSSE-KMS stands for dual-layer server-side encryption with AWS KMS keys, where data is encrypted twice on the server side using keys from KMS, enhancing data security in Amazon S3.)
## Create a bucket
aws s3 mb s3://encryption-fun-ab-135
### Create a file and Put Object with SSE-S3 encryption
echo "Hello World" > hello.txt
aws s3 cp hello.txt s3://encryption-fun-ab-135
### Put Object with SSE-KMS encryption
aws s3api put-object \
--bucket encryption-fun-ab-135 \
--key hello.txt \
--body hello.txt \
--server-side-encryption "aws:kms" \
--ssekms-key-id "a1bb2b48-ce90-49ff-bd06-f23705bcc0d8"
### Put Object with SSE-C [Failed Attempt]
export BASE64_ENCODED_KEY=$(openssl rand -base64 32)
echo $BASE64_ENCODED_KEY
export MD5_VALUE=$(echo $BASE64_ENCODED_KEY | md5sum | awk '{print $1}' | base64 -w0)
echo $MD5_VALUE
aws s3api put-object \
--bucket encryption-fun-ab-135 \
--key hello.txt \
--body hello.txt \
--sse-customer-algorithm AES256 \
--sse-customer-key $BASE64_ENCODED_KEY \
#--sse-customer-key-md5 $MD5_VALUE
An error occurred (InvalidArgument) when calling the PutObject operation: The calculated MD5 hash of the key did not match the hash that was provided.
### Put Object with SSE-C via aws s3
https://catalog.us-east-1.prod.workshops.aws/workshops/aad9ff1e-b607-45bc-893f-121ea5224f24/en-US/s3/serverside/ssec
openssl rand -out ssec.key 32
aws s3 cp hello.txt s3://encryption-fun-ab-135/hello.txt \
--sse-c AES256 \
--sse-c-key fileb://ssec.key
aws s3 cp s3://encryption-fun-ab-135/hello.txt hello.txt --sse-c AES256 --sse-c-key fileb://ssec.key
s3/encryption/hello.txt
Hello World
s3/encryption/ssec.key
S3 Bucket Key
When you use SSE-KMS, an individual data key is used for every object request. In this case, S3 has to call AWS KMS every time a request is made; KMS charges per request, so this cost can add up.
An S3 Bucket Key lets S3 generate a short-lived bucket-level key from AWS KMS that is temporarily stored in S3
This will reduce request costs by up to 99%
This will decrease request traffic and improve overall performance
A unique bucket-level key is generated for each requester
You can enable Bucket key at the bucket level to be applied to all new objects
You can enable Bucket key at the object level for only specific objects
S3 Bucket Keys apply to SSE-KMS (they are not needed for SSE-S3 and are not supported for DSSE-KMS)
What is the purpose of using an S3 Bucket Key in AWS? (To reduce request costs by up to 99%)
How does the S3 Bucket Key reduce the number of calls to AWS KMS? (The S3 Bucket Key allows S3 to use a short-lived bucket-level key from AWS KMS, which is temporarily stored in S3, thus reducing the number of calls to AWS KMS for each object request.)
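Bucket Keys are typically turned on as part of the bucket's default encryption configuration. A minimal sketch with the AWS CLI (the bucket name and KMS key ID below are placeholders):
```sh
aws s3api put-bucket-encryption \
  --bucket my-example-bucket \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "1234abcd-12ab-34cd-56ef-1234567890ab"
      },
      "BucketKeyEnabled": true
    }]
  }'
```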
S3 — Client-Side Encryption
Client-Side Encryption is when you encrypt your own files before uploading them to S3
This provides a guarantee that neither AWS nor any third party can decrypt your data.
Various AWS SDK’s have built-in code to make it easy to encrypt your data
With the Ruby SDK, for example, you can supply an encryption key to the S3 encryption client, and it will automatically encrypt and decrypt on the client side.
Supplying no key on download will return only the ciphertext
What does Client-Side Encryption in AWS S3 refer to? (Encrypting your own files before uploading them to S3)
Why is Client-Side Encryption important for data stored in AWS S3? (It provides a guarantee that only the data owner can decrypt and access the content, ensuring maximum privacy and security against unauthorized access, including from AWS or any third parties.)
What does data consistency ensure when data is stored in multiple locations? (The data matches exactly across all locations.)
What is the characteristic of strongly consistent data in data storage systems?(Strongly consistent data is data where every request returns the most recent version, ensuring that the user never receives outdated information.)
S3 – Object Replication Overview
Amazon S3 Object Replication is when you make copies of objects into other buckets
Data Redundancy and Durability: Enhances data protection by storing copies in different locations.
Compliance and Data Residency: Meets legal and regulatory data storage requirements in specific regions.
Improved Accessibility and Latency: Reduces latency by replicating data closer to end-users.
Operational Flexibility: Separates production, backup, and development data sets for various operational needs.
Efficient Data Processing: Facilitates parallel data processing in regions closer to AWS resources.
Disaster Recovery and Business Continuity: Ensures business continuity with backup copies in separate locations.
Optimized Costs: Costs can be reduced by replicating to regions with lower storage pricing.
Cross-Account Replication: Allows secure data sharing between different AWS accounts.
Versioning and Change Tracking: Maintains object versions in different buckets for easy rollback and tracking.
Automated Backups and Archiving: Simplifies and updates backup and archiving processes automatically.
What is a primary benefit of Amazon S3 Object Replication in terms of data accessibility? (Reduces latency by replicating data closer to end-users.)
What is the significance of cross-account replication in Amazon S3 Object Replication? (For secure data sharing between different AWS accounts, enhancing collaboration and data management.)
How does Amazon S3 Object Replication assist in adhering to compliance and data residency requirements? (It aids by replicating and storing data in specific regions to meet legal, regulatory, and compliance requirements regarding data residency.)
S3 Versioning
S3 Versioning allows you to store multiple versions of S3 objects.
With versioning, you can recover more easily from unintended user actions and application failures
Versioning-enabled buckets can help you recover objects from accidental deletion or overwrite.
Store all versions of an object in S3 at the same object key address
By default, S3 Versioning is disabled on buckets, and you must explicitly enable it
Once enabled, it cannot be disabled, only suspended on the bucket
Fully integrates with S3 Lifecycle rules
MFA Delete feature provides extra protection against deletion of your data
Buckets can be in three states:
Unversioned (default)
Versioned
Versioned Suspended
What does S3 Versioning allow you to do? (Store multiple versions of S3 objects)
What must be done to activate versioning on an S3 bucket? (Versioning is disabled by default on S3 buckets and must be explicitly enabled by the user. Once enabled, it cannot be disabled but only suspended)
What are the three possible states of an S3 bucket in terms of versioning? (Unversioned (the default), Versioned, and Versioned Suspended)
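Once versioning is enabled, every overwrite or delete of a key creates a new version or delete marker. A minimal sketch of inspecting and retrieving versions with the AWS CLI (the bucket, key, and version ID are placeholders):
```sh
# List every version and delete marker stored for a key
aws s3api list-object-versions \
  --bucket my-example-bucket \
  --prefix hello.txt

# Download a specific older version by its VersionId
aws s3api get-object \
  --bucket my-example-bucket \
  --key hello.txt \
  --version-id "3sL4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY" \
  hello-old.txt
```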
Introduction to S3 Transfer Acceleration
S3 Transfer Acceleration is a bucket-level feature that provides fast and secure transfer of files over long distances between your end users and an S3 bucket.
Utilizes CloudFront’s distributed Edge Locations to quickly enter the Amazon Global Network
Instead of uploading to your bucket, users use a distinct endpoint to route to an edge location
https://s3-accelerate.amazonaws.com
https://s3-accelerate.dualstack.amazonaws.com
Only supported on virtual-hosted-style requests
Bucket names cannot contain periods and must be DNS-compliant
It can take up to 20 minutes after Transfer Acceleration is enabled before you see a performance benefit
What is the primary function of S3 Transfer Acceleration? (To provide fast and secure file transfers over long distances)
What AWS service does S3 Transfer Acceleration utilize to accelerate file uploads? (Amazon CloudFront)
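A minimal sketch of enabling Transfer Acceleration and uploading through the accelerate endpoint (the bucket and file names are placeholders):
```sh
# Enable Transfer Acceleration on the bucket
aws s3api put-bucket-accelerate-configuration \
  --bucket my-example-bucket \
  --accelerate-configuration Status=Enabled

# Confirm the status
aws s3api get-bucket-accelerate-configuration --bucket my-example-bucket

# Upload via the distinct accelerate endpoint
aws s3 cp bigfile.zip s3://my-example-bucket/ \
  --endpoint-url https://s3-accelerate.amazonaws.com
```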
S3 – Presigned URLs
S3 presigned URLs provide temporary access to upload or download object data via a URL.
Presigned URLs are commonly used to provide access to private objects.
You can use AWS CLI or AWS SDK to generate a Presigned URL.
The --expires-in value is in seconds
aws s3 presign s3://mybucket/myobject \
--expires-in 300
What is the primary purpose of S3 Presigned URLs? (To provide temporary access to upload or download object data.)
How can one generate a Presigned URL for an S3 object? (It can be generated using the AWS Command Line Interface (CLI) or AWS Software Development Kits (SDKs).)
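A minimal sketch of generating a presigned URL and then downloading with it (the bucket and key are placeholders); because the URL carries its own signature, no AWS credentials are needed to use it:
```sh
# Generate a URL that is valid for 5 minutes
URL=$(aws s3 presign s3://my-example-bucket/hello.txt --expires-in 300)

# Anyone with the URL can download the object until it expires
curl -o hello.txt "$URL"
```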
S3 – Anatomy of a Presigned URLs
X-Amz-Algorithm specifies the signing algorithm, typically AWS4-HMAC-SHA256.
X-Amz-Credential includes your AWS access key and the scope of the signature.
X-Amz-Date is the timestamp of when the signature was created.
X-Amz-Expires defines the duration for which the URL is valid, in this case, 300 seconds.
X-Amz-SignedHeaders indicates which headers are part of the signing process.
X-Amz-Signature the actual signature calculated based on your AWS secret access key, the string to sign, and the signing key.
What does "X-Amz-Algorithm" in a presigned URL indicate? (The algorithm used for the URL signature)
What does the "X-Amz-Date" parameter represent in a presigned URL? (The timestamp of when the signature was created, indicating the exact time the signature was generated for the presigned URL.)
S3 – Access Points
S3 Access Points simplify managing data access at scale for shared datasets in S3.
S3 Access Points are named network endpoints attached to buckets that you can use to perform S3 object operations, e.g. GetObject and PutObject
Each access point has:
Distinct permissions via an Access Point Policy
Distinct network controls
Distinct block public access
Network Origin
Internet – requests can come from the Internet.
VPC – all requests must come from specified VPC
S3 Access Point Policy allows you to write permissions for a bucket alongside your bucket policy.
An Access Point Policy helps to move specific and complex access configuration out of your bucket policy, keeping your bucket policy simple and easy to read.
What is the primary function of an S3 Access Point? (To provide a named network endpoint for performing S3 object operations.)
What are the possible network origins for an S3 Access Point? (Internet or VPC (Virtual Private Cloud))
S3 – Multi-Region Access Points
A Multi-Region Access Point is a global endpoint that routes requests to multiple buckets residing in different regions.
For example, imagine end users located near Montreal, roughly equidistant from two regional buckets.
Multi-Region Access Point will return data from the regional bucket with the lowest latency.
AWS Global Accelerator is used to route to the closest bucket
Requests are accelerated over the internet, VPC or PrivateLink
S3 Replication Rules can be used to synchronize objects to the regional buckets
What is the primary function of Multi-Region Access Points in AWS? (To route requests to multiple buckets in different regions based on lowest latency)
How are data synchronized across different regional buckets when using AWS S3 Multi-Region Access Points? (S3 Replication Rules are used to synchronize objects across regional buckets, ensuring data consistency and availability.)
S3 — Object Lambda Access Points
S3 Object Lambda Access Points allow you to transform the output requests of S3 objects when you want to present data differently.
S3 Object Lambda Access Points only operate on the outputted objects. The original objects in the S3 bucket remain unmodified
S3 Object Lambda can be performed on the following S3 operations:
HEAD — information about the S3 object, but not the object contents itself
GET — an S3 object including its contents
LIST — a list of S3 objects
An AWS Lambda function is attached to an S3 bucket via the Object Lambda Access Point
Multiple transformations can be configured per Object Lambda Access Point
What is the purpose of S3 Object Lambda Access Points? (To transform the output of requests for S3 objects without altering the original objects.)
Which S3 operations can be affected by S3 Object Lambda Access Points? (S3 Object Lambda Access Points can transform the output of the HEAD, GET, and LIST operations.)
Mountpoint for Amazon S3
Mountpoint for Amazon S3 allows you to mount an S3 bucket to your Linux local file system.
Mountpoint is an open-source client that you install on your Linux OS and provides high-throughput access to objects with basic file-system operations
Mountpoint can:
read files up to 5 TB in size
list and read existing files
create new files
Mountpoint cannot/does not:
modify existing files
delete directories
support symbolic links
support file locking
It can be used in the following Storage Classes
S3 Standard
S3 Intelligent-Tiering
S3 Standard IA
S3 One-Zone IA
Reduced Redundancy Storage (RRS)
S3 Glacier Instant Retrieval
S3 Glacier Flexible Retrieval
S3 Glacier Deep Archive
S3 Intelligent-Tiering Archive Access
S3 Intelligent-Tiering Deep Archive Access
Mountpoint is ideal for apps that don’t need all the features of a shared file system and POSIX-style permissions but require Amazon S3's elastic throughput to read and write large S3 datasets.
Installing Mountpoint (RPM example)
wget https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.rpm
sudo yum install ./mount-s3.rpm
mount-s3 --version
Using Mountpoint
mkdir ~/mnt
mount-s3 mybucket ~/mnt
cd ~/mnt
# perform basic operations
# eg. cat, ls, pwd
umount ~/mnt
Create a folder
Mount the bucket to the folder
Go into the folder
Perform basic filesystem operations
Unmount when you’re done
Which of the following is supported by Mountpoint for Amazon S3? (Creating new files, Listing files, Reading existing files)
What is the primary function of Mountpoint for Amazon S3? (Allows you to mount an S3 bucket to your Linux local file system for high-throughput access with basic file-system operations like reading and listing files and creating new files.)
Archived Objects
Archived objects are rarely-accessed objects in Amazon S3 that cannot be accessed in real time, in exchange for a reduced storage cost.
There are two ways to archive objects:
1. Archive Storage Classes
When you know your access patterns
Requires manual intervention to move data
Lower archive storage costs
S3 Glacier Flexible Retrieval
Minutes to hours
S3 Glacier Deep Archive
12+ hours
2. Archive Access Tiers
When you don’t know your access pattern
Automatically moves data
Slightly higher cost than Archive Storage Classes
S3 Intelligent-Tiering Archive Access tier
Within minutes
S3 Intelligent-Tiering Deep Archive Access
12+ hours
What is the main advantage of using Archived Objects in Amazon S3? (Reduced storage costs)
What are the two ways to archive objects in Amazon S3? (Archive Storage Classes and Archive Access Tiers)
S3 — Requesters Pay
The Requester Pays bucket option allows the bucket owner to offset specific S3 costs to the requester (the user requesting the data).
The bucket owner still pays for storage
The requester now pays for requests and data transfer
When you want to share data but not incur the charges associated with others accessing the data. Eg.
Collaborative Projects: External partners pay for their own S3 data uploads/downloads.
Client Data Storage: Clients pay for their S3 storage and transfer costs.
Shared Educational Resources: Researchers cover their S3 usage fees, not the institution.
Content Distribution: Distributors/customers pay for S3 data transfer and downloads.
You can enable or disable Requester Pays on a bucket at any time.
All requests involving Requester Pays buckets must be authenticated
The requester assumes an IAM role before making their request
The IAM policy will have an s3:RequestPayer condition.
Anonymous access is not allowed on buckets with Requester Pays enabled
The AWS account of the requester will be charged.
What does the "Requester Pays" option in an S3 bucket imply? (The requester pays for the data transfer and requests they make.)
Why might an institution enable "Requester Pays" for shared educational resources on S3? (Enabling "Requester Pays" allows the institution to share valuable data without incurring the costs associated with data access, as researchers and users cover their own S3 usage fees.)
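A minimal sketch of toggling Requester Pays with the AWS CLI (the bucket name is a placeholder):
```sh
# Turn Requester Pays on
aws s3api put-bucket-request-payment \
  --bucket my-example-bucket \
  --request-payment-configuration Payer=Requester

# Confirm the current setting (Payer=Requester or Payer=BucketOwner)
aws s3api get-bucket-request-payment --bucket my-example-bucket
```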
S3 — Requesters Pay Header
Requesters must include x-amz-request-payer in their API request header for:
DELETE, GET, HEAD, POST, and PUT requests
or as a parameter in a REST request
curl -X GET "https://bucket-name.s3.amazonaws.com/object-key" \
-H "x-amz-request-payer: requester" \
In practice, you wouldn’t use CURL for API requests since authentication is too challenging
In practice, you will be setting the Requester Pay API requester header using the AWS CLI and AWS SDK
Using the AWS CLI, you can use the --request-payer flag to include the header in your object request
aws s3 cp \
s3://bucket-name/object local/path/object \
--request-payer requester
Setting the header using the AWS Ruby SDK
resp = s3_client.get_object(
bucket: bucket,
key: object_key,
request_payer: 'requester'
)
What is the purpose of the x-amz-request-payer header in S3 API requests? (To designate who pays for the request)
How do you enable the Requester Pays option for an S3 object using the AWS CLI? (Use the flag --request-payer requester in the AWS CLI command.)
S3 — Requesters Pay Troubleshooting
A 403 (Forbidden Request) HTTP Error code will occur in the following scenarios:
The requester doesn't include the parameter x-amz-request-payer
Request authentication fails (something is wrong with the IAM role or IAM policy)
The request is anonymous
The request is a SOAP request (SOAP requests are not allowed when Requester Pays is turned on)
If the requester forgets to include the header, a 403 occurs and no request charge is made to either the requester or the bucket owner.
What will cause a 403 (Forbidden Request) HTTP error when using Amazon S3 with the Requester Pays feature turned on? (The requester fails to include the parameter x-amz-request-payer)
What type of requests are not allowed with the Amazon S3 Requester Pays feature turned on? (SOAP (Simple Object Access Protocol) requests)
AWS Marketplace for S3
The AWS Marketplace for S3 provides third-party alternatives to the AWS services that work with Amazon S3. Each category below lists the AWS services first and the third-party alternatives second.
Storage Backup and Recovery
- AWS: FSx for Lustre, Tape Gateway, File Gateway
- Third-Party: Veeam Backup for AWS, Druva AWS Backup and Disaster Recovery
Data Integration and Analytics
- AWS: Transfer Family, DataSync, Athena
- Third-Party: ChaosSearch, Logz.io, BryteFlow Enterprise Edition
Observability and Monitoring
- AWS: CloudTrail, CloudWatch
- Third-Party: Datadog, Splunk, Dynatrace
Security and Threat Detection
- AWS: GuardDuty, Macie
- Third-Party: Trend Cloud One, InsightIDR (Rapid7), VM-Series Virtual Next-Generation Firewalls (NGFW) (Palo Alto Networks)
Permissions
- AWS: IAM
- Third-Party: OneLogin Workforce Identity, FileCloud EFSS, Yarkon S3 Server
Which service is used for storage backup and recovery in AWS Services? (FSx for Lustre)
What AWS service is used for observability and monitoring? (AWS services used for observability and monitoring include CloudTrail and CloudWatch.)
S3 Batch Operations
S3 Batch Operations performs large-scale batch operations on Amazon S3 objects
billions of objects containing exabytes of data
The following Batch Operation Types can be performed:
Copy: Copies each object listed in the manifest to the specified destination bucket.
Invoke AWS Lambda function: Runs a Lambda function against each object.
Replace all object tags: Replaces the Amazon S3 object tags of each object.
Replace access control list (ACL): Replaces the ACL of each object.
Restore: Sends a restore request to S3 Glacier.
Object Lock retention: Prevents overwriting or deleting for a fixed amount of time.
Object Lock legal hold: Prevents overwriting or deleting until the legal hold is removed.
In order to perform a batch operation, you need to provide a manifest: either a CSV list of objects stored in S3 or an S3 Inventory report manifest.json.
You can have Batch Operation generate a completion report to audit the outcome of bulk operations.
What is the primary function of S3 Batch Operations? (To perform large-scale batch operations on Amazon S3 objects)
What is required to initiate an S3 Batch Operation? (You need to provide a list of objects in an S3 bucket or supply an S3 Inventory report manifest.json.)
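A hedged sketch of creating a Batch Operations job that replaces the tags on every object listed in a CSV manifest; the account ID, ARNs, ETag, and role below are all placeholders:
```sh
aws s3control create-job \
  --account-id 111122223333 \
  --operation '{"S3PutObjectTagging": {"TagSet": [{"Key": "project", "Value": "demo"}]}}' \
  --manifest '{"Spec": {"Format": "S3BatchOperations_CSV_20180820", "Fields": ["Bucket", "Key"]}, "Location": {"ObjectArn": "arn:aws:s3:::my-manifest-bucket/manifest.csv", "ETag": "60e460c9d1046e73f7dde5043ac3ae85"}}' \
  --report '{"Bucket": "arn:aws:s3:::my-report-bucket", "Prefix": "batch-reports", "Format": "Report_CSV_20180820", "Enabled": true, "ReportScope": "AllTasks"}' \
  --priority 10 \
  --role-arn arn:aws:iam::111122223333:role/batch-operations-role \
  --no-confirmation-required
```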
Amazon S3 Inventory
Amazon S3 Inventory takes inventory of objects in an S3 bucket on a repeating schedule, so you have an audit history of object changes.
Amazon S3 will output the inventory into the destination of another S3 Bucket.
Frequency:
Daily
delivered within 48 hours
Weekly
First report delivered within 48 hours
Future reports every Sunday
You can specify additional metadata to be included in the report
Output format:
CSV (Comma-separated values)
ORC (Apache Optimized Columnar)
Parquet (Apache Parquet)
Inventory Scope:
Specific prefixes to filter objects
Specify all object versions or only current versions
What is the purpose of Amazon S3 Inventory? (To take inventory of objects in an S3 bucket on a schedule for an audit history)
What are the available output formats for Amazon S3 Inventory reports? (CSV (Comma-separated values), ORC (Apache Optimized Columnar), Parquet (Apache Parquet))
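A minimal sketch of configuring a daily CSV inventory report with the AWS CLI (both bucket names are placeholders):
```sh
aws s3api put-bucket-inventory-configuration \
  --bucket my-source-bucket \
  --id daily-inventory \
  --inventory-configuration '{
    "Id": "daily-inventory",
    "IsEnabled": true,
    "IncludedObjectVersions": "Current",
    "Schedule": {"Frequency": "Daily"},
    "OptionalFields": ["Size", "LastModifiedDate", "StorageClass"],
    "Destination": {
      "S3BucketDestination": {
        "Bucket": "arn:aws:s3:::my-inventory-destination-bucket",
        "Format": "CSV",
        "Prefix": "inventory"
      }
    }
  }'
```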
S3 Event Notifications
S3 Event Notifications allow your bucket to notify other AWS services about S3 event data.
S3 Event Notifications makes application integration very easy for S3
Notification events
New object created events
Object removal events
Restore object events
Reduced Redundancy Storage (RRS) object lost events
Replication events
S3 Lifecycle expiration events
S3 Lifecycle transition events
S3 Intelligent-Tiering automatic archival events
Object tagging events
Object ACL PUT events
Possible Destination to other AWS Services
Amazon Simple Notification Service (SNS) topics
Amazon Simple Queue Service (SQS) queues
FIFO queues are not supported (standard queues only)
AWS Lambda function
Amazon EventBridge
Amazon S3 event notifications are designed to be delivered at least once. Notifications are typically delivered in seconds but can sometimes take a minute or longer.
Which AWS service is a possible destination for S3 Event Notifications? (Amazon Simple Notification Service (SNS) topics, AWS Lambda function, Amazon Simple Queue Service (SQS) queues)
What is the typical delivery time and guarantee for S3 Event Notifications? (Notifications are designed to be delivered at least once and typically are delivered in seconds, but sometimes can take a minute or longer.)
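A minimal sketch of wiring object-created events to a Lambda function (the bucket name and function ARN are placeholders, and the function's resource policy must already allow S3 to invoke it):
```sh
aws s3api put-bucket-notification-configuration \
  --bucket my-example-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:process-upload",
      "Events": ["s3:ObjectCreated:*"]
    }]
  }'
```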
S3 Storage Class Analysis
Storage Class Analysis allows you to analyze storage access patterns of objects within a bucket to recommend when to move objects from STANDARD to STANDARD_IA
aws s3api put-bucket-analytics-configuration \
--bucket my-bucket --id 1 \
--analytics-configuration '{"Id": "1", "StorageClassAnalysis": {}}'
observes the infrequent access patterns of a filtered set of data over a period of time
You can have multiple analysis filters per bucket (up to 1000 filters)
The results can be exported to CSV
export this daily usage data to an S3 bucket
Use data in Amazon QuickSight for data visualization
provides storage usage visualizations in the Amazon S3 console that are updated daily
After a filter is applied, the analysis will be available in 24 to 48 hours
Storage Class analysis will analyze an object for 30 days or longer to gather enough information
How long does it typically take for analysis results to be available after applying a filter in Storage Class Analysis? (24 to 48 hours)
How many analysis filters can you have per S3 bucket for analyzing storage access patterns? (You can have up to 1000 analysis filters per S3 bucket.)
S3 Storage Lens
Amazon S3 Storage Lens is a storage analysis tool for S3 buckets across your entire AWS organization.
how much storage you have across your organization
which are the fastest-growing buckets and prefixes
identify cost-optimization opportunities
implement data-protection and access-management best practices
improve the performance of application workloads
Metrics can be exported as CSV or Parquet to another S3 bucket
Usage and metrics can be exported to Amazon CloudWatch
S3 Storage Lens aggregates metrics and displays the information in the Account snapshot as an interactive dashboard updated daily.
What is the main purpose of Amazon S3 Storage Lens? (To analyze storage usage and activity across an entire AWS organization.)
How can the data from Amazon S3 Storage Lens be exported? (It is exported in CSV or Parquet formats to another S3 bucket or sent to Amazon CloudWatch.)
S3 Static Website Hosting
S3 Static Website Hosting allows you to host and serve a static website from an S3 bucket.
S3 website endpoints do not support HTTPS
Amazon CloudFront must be used to serve HTTPS traffic
S3 Static Website hosting will provide a website endpoint
http://bucket-name.s3-website-region.amazonaws.com
http://bucket-name.s3-website.region.amazonaws.com
The format of the website endpoint varies by region: the separator before the region is either a hyphen or a period
There are two hosting types (via the console):
Host a static website
Redirect requests to objects
Requester Pays buckets do not allow access through a website endpoint
What is required to serve HTTPS traffic from an S3 static website? (Amazon CloudFront)
What type of S3 bucket configurations are not accessible via a website endpoint? (Requester Pays buckets do not allow access through a website endpoint in S3 static website hosting configurations.)
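A minimal sketch of turning on static website hosting with the higher-level aws s3 command (the bucket name is a placeholder; the objects must also be publicly readable):
```sh
aws s3 website s3://my-website-bucket/ \
  --index-document index.html \
  --error-document error.html
```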
Amazon S3 Multipart Upload
Overview
Amazon S3 supports multipart upload, enabling you to upload a single object in a set of parts.
Multipart upload advantages:
Improved throughput.
Resilient to network failure—only missing parts need re-uploading.
Parts can be uploaded at any time—no expiry.
Allows uploading files while creating them.
Recommended for files ≥100MB.
Steps for Multipart Upload
Initiate Upload:
Use create-multipart-upload to get an Upload ID.
Example:
aws s3api create-multipart-upload \
--bucket my-bucket \
--key 'myfile'
Upload Parts:
Divide large files into parts (e.g., 5MB each).
Use upload-part command with the Upload ID for each part.
Parts can be numbered 1 to 10000.
Example:
aws s3api upload-part \
--bucket my-bucket \
--key 'myfile' \
--part-number 1 \
--body part01 \
--upload-id "UploadIDExample"
Complete Upload:
Provide a JSON file with the ETags of all uploaded parts.
Use complete-multipart-upload command.
Example:
aws s3api complete-multipart-upload \
--bucket my-bucket \
--key 'myfile' \
--multipart-upload file://parts.json \
--upload-id "UploadIDExample"
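The parts.json file referenced above lists each part number together with the ETag returned by upload-part. A minimal sketch (the ETag values are placeholders):
```sh
cat > parts.json <<'EOF'
{
  "Parts": [
    { "PartNumber": 1, "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"" },
    { "PartNumber": 2, "ETag": "\"900150983cd24fb0d6963f7d28e17f72\"" }
  ]
}
EOF
```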
Important Notes
ETags: Used to verify parts. Collect them for each uploaded part.
Multipart upload is especially useful for uploading large files efficiently.
Ensure all parts are uploaded before calling complete-multipart-upload.
What is the primary advantage of using multipart upload on Amazon S3 for large files? (Improves throughput and reliability)
What is the recommended file size for initiating multipart uploads on Amazon S3? (For files that are 100MB or larger, it is recommended to use multipart uploads to ensure better manageability and reliability of file transfers.)
Amazon S3 Byte Range Fetching
Amazon S3 supports the retrieval of a specific range of bytes from an object using the Range header in GetObject API requests.
S3 supports concurrent connections, allowing multiple byte ranges to be requested simultaneously, which can be beneficial for parallel processing.
By fetching smaller ranges of a large object, your application can reduce the amount of data transferred, leading to faster retry times if a request is interrupted.
Common byte-range request sizes are typically 8 MB or 16 MB, aligning with S3's multipart upload part sizes.
Request each required byte range using the Range header in individual GetObject calls.
Store each part temporarily. Depending on the application's memory availability and the size of the parts, these can be held in memory or written to disk.
Concatenate all downloaded parts in the correct order to reconstruct the single file.
This method of fetching data is distinct from S3 Select, which allows querying the content of objects with SQL statements. Byte range fetching retrieves specific portions of an object's data, irrespective of the data format or content.
import boto3

# Initialize a boto3 S3 client
s3 = boto3.client('s3')

# Replace these variables with your bucket and object key
bucket_name = 'your-bucket-name'
object_key = 'your-object-key'

# Byte ranges to fetch
byte_ranges = ['bytes=0-999', 'bytes=1000-1999', 'bytes=2000-2999']

# List to hold each byte range
parts = []

# Fetch each byte range
for byte_range in byte_ranges:
    # Make a GetObject request with the specified byte range
    response = s3.get_object(Bucket=bucket_name, Key=object_key, Range=byte_range)
    # Read the part of the content and append to the list
    parts.append(response['Body'].read())

# Concatenate all parts into a single byte sequence
complete_file_content = b''.join(parts)

# Write the complete file to disk
with open('output_file', 'wb') as f:
    f.write(complete_file_content)

print("File downloaded and reassembled from byte ranges.")
Old Slides
Amazon S3 allows you to fetch a range of bytes of data from S3 Objects using the Range header during S3 GetObject API Requests.
Example using the AWS SDK Python (boto3) library
Amazon S3 allows for concurrent connections so you can request multiple parts at the same time.
Fetching smaller ranges of a large object also allows your application to improve retry times when requests are interrupted
Typical sizes for byte-range requests are 8 MB or 16 MB
You’ll need to store each part and then concatenate all the downloaded parts in the correct order back into a single file.
Boto3 example opening multiple concurrent S3 connections, holding each part in memory and then reassembling the parts back into a single file.
Depending on how large the file is, you might need to write each part to disk if your program does not have enough memory to hold all parts.
What is the primary benefit of using byte-range fetching in Amazon S3? (To improve retry times when requests are interrupted)
What concurrent capability does Amazon S3 offer for fetching data? (Amazon S3 allows for concurrent connections, enabling the simultaneous request of multiple parts of an object, which can significantly speed up data retrieval processes.)
S3 Interoperability
What is Interoperability?
Interoperability in the context of cloud services is the capability of cloud services to exchange and utilize information seamlessly with each other.
Here are some common AWS services that often dump data into S3:
Amazon EC2: Stores snapshots and backups in S3.
Amazon RDS: Backups and data exports to S3.
AWS CloudTrail: Stores API call logs in S3.
Amazon CloudWatch Logs: Exports logs/metrics to S3.
AWS Lambda: Outputs data/logs to S3.
AWS Glue: ETL results stored in S3.
Amazon Kinesis: Data streaming to S3 via Firehose.
Amazon EMR: Uses S3 for input/output data storage.
Amazon Redshift: Unloads data to S3.
AWS Data Pipeline: Moves/transforms data to/from S3.
Amazon Athena: Outputs query results to S3.
AWS IoT Core: Stores IoT data in S3.
Which AWS service is primarily used for storing snapshots and backups from EC2 instances? (S3)
What AWS service is used for storing API call logs? (AWS CloudTrail stores API call logs in S3.)
AWS Application Programming Interface (API)
What is an Application Programming Interface (API)?
API: Software that allows two applications/services to communicate with each other.
The most common type of API is via HTTP/S requests.
AWS API
AWS API: An HTTP API where you interact by sending HTTPS requests.
Tools: Applications like Postman can be used to interact with APIs.
Service Endpoint
Each AWS service has its own Service Endpoint to which you send requests.
Example Request Format
GET / HTTP/1.1
host: monitoring.us-east-1.amazonaws.com
x-amz-target: GraniteServiceVersion20100801.GetMetricData
x-amz-date: 20180112T092034Z
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/REDACTED/20180411/...
Content-Type: application/json
Accept: application/json
Content-Encoding: amz-1.0
Content-Length: 45
Connection: keep-alive
Important Points
Signed Request: Authorization requires generating a signed request.
Action and Parameters: Provide an action and accompanying parameters as the payload.
Key Terminology
Service Endpoint: The URL to which you send requests for a specific AWS service.
Signed Request: A request that includes a signature for authorization, created with your AWS credentials.
Action: The operation you want to perform (e.g., GetMetricData).
Parameters: Data sent along with the request to perform the action.
Additional Notes
Make sure your request is properly formatted and includes all necessary headers and authentication information.
Always ensure that your date and authorization headers are correctly generated to avoid request rejection.
These condensed notes cover the essential information needed for understanding and working with AWS APIs.
Overview
Direct Interaction: Rarely do users directly send HTTP requests to the AWS API.
Ease of Use: It's much easier to interact with the API via a variety of developer tools.
Developer Tools for Interacting with AWS API
AWS Management Console:
A WYSIWYG Web Interface.
AWS SDK:
Interact with the API using your favorite programming language.
AWS CLI:
Interact with the API via a terminal/shell program.
HTTP Request:
Directly interact with the AWS API.
Key Terminology
AWS Management Console: A graphical web interface for managing AWS resources.
AWS SDK: Software Development Kit for interacting with AWS services in various programming languages.
AWS CLI: Command Line Interface for managing AWS services from the terminal.
HTTP Request: Direct method to interact with AWS API, usually less common due to complexity.
Important Points
Using developer tools simplifies the process of interacting with AWS services.
Each tool provides a different method and interface for managing AWS resources efficiently.
These notes provide a concise overview of the primary ways to interact with AWS APIs, highlighting the ease of use provided by various developer tools.
What is the most common type of API request? (HTTP/S)
Name four tools used to interact with AWS APIs. (1. AWS Management Console (WYSIWYG Web Interface) 2. AWS SDK (Interact with API using programming languages) 3. AWS CLI (Interact with API via terminal/shell program) 4. HTTP Request (Direct interaction)
AWS Command Line Interface (CLI)
What is a CLI?
Command Line Interface (CLI): Processes commands to a computer program in the form of lines of text.
Terminal: A text-only interface (input/output environment).
Console: A physical computer to input information into a terminal.
Shell: The command line program that users interact with to input commands. Common shell programs:
Bash
Zsh
PowerShell
Note: People commonly (erroneously) use Terminal, Shell, or Console to generally describe interacting with a Shell.
AWS CLI: Allows users to programmatically interact with the AWS API by entering single or multi-line commands into a shell or terminal.
Example Command:
aws ec2 describe-instances \
--filters Name=tag-key,Values=Name \
--query 'Reservations[*].Instances[*].{Instance:InstanceId,AZ:Placement.AvailabilityZone,Name:Tags[?Key==`Name`]|[0].Value}' \
--output table
Installation and Requirements
The AWS CLI is a Python-based executable program.
Python is required to install AWS CLI v1; AWS CLI v2 bundles its own Python runtime.
AWS CLI can be installed on Windows, Mac, or Linux/Unix.
The name of the CLI program is aws.
Which CLI option lets you change the format of the response from AWS? (--output)
Which programming language is used for the AWS CLI installation? (Python)
Name three common shell programs. (Bash, Zsh, PowerShell)
Access Keys
Overview
Access Keys are a key and secret required for programmatic access to AWS resources outside the AWS Management Console.
Commonly referred to as AWS Credentials.
A user must be granted access to use Access Keys.
Generating and Managing Access Keys
Generate an Access Key and Secret by selecting Access key - Programmatic access.
Access Key IDs and Secrets can be managed in the AWS Management Console.
Important Points
Never share your access keys.
Never commit access keys to a codebase.
You can have two active Access Keys.
You can deactivate Access Keys when needed.
Access Keys provide the same access as the user has to AWS resources.
Storing Access Keys
Default will be the access key used when no profile is specified.
Multiple access keys can be stored by giving profile names.
Stored in ~/.aws/credentials, which follows an INI-style (TOML-like) file format.
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
[exampro]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
region=ca-central-1
Configuration with CLI
Use the aws configure CLI command to populate the credential file.
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json
Environment Variables
The AWS SDK and CLI will automatically read from these environment variables.
$ export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
$ export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
$ export AWS_DEFAULT_REGION=us-west-2
What are the 2 components of a programmatic access key? (Secret Access Key and Access Key ID)
Which AWS service would you use to enable Programmatic Access? (IAM)
Which IAM permission must be enabled for a user to access the CLI or SDK? (Programmatic Access)
API Retries and Exponential Backoff
Key Concepts
When interacting with APIs over a network, it's common for a networking issue to occur due to the number of devices a request has to pass through and points of failure:
DNS Servers
Switches
Load Balancers
When working with APIs, you need to plan for possible network failure by retrying.
Exponential Backoff
It is industry-wide recommended for APIs to use an exponential backoff before retrying.
Try again in 1 second
Try again in 2 seconds
Try again in 4 seconds
Try again in 8 seconds
Try again in 16 seconds
Try again in 32 seconds
Exponential Calculation Examples
2^2 = 2 × 2 = 4
2^3 = 2 × 2 × 2 = 8
2^4 = 2 × 2 × 2 × 2 = 16
2^5 = 2 × 2 × 2 × 2 × 2 = 32
Important Note
A good CLI or SDK will have exponential backoff built-in with options like how many times it should try.
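A minimal shell sketch of the pattern (the aws s3 ls command and the retry limit are placeholders; AWS CLI v2 also exposes built-in retry behaviour via the AWS_MAX_ATTEMPTS and AWS_RETRY_MODE settings):
```sh
max_attempts=5
attempt=0
until aws s3 ls s3://my-example-bucket; do
  attempt=$((attempt + 1))
  if [ "$attempt" -ge "$max_attempts" ]; then
    echo "Giving up after $max_attempts attempts" >&2
    exit 1
  fi
  wait_seconds=$((2 ** attempt))   # 2, 4, 8, 16, ... seconds
  echo "Request failed, retrying in ${wait_seconds}s..." >&2
  sleep "$wait_seconds"
done
```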
What is a common cause of networking issues when interacting with APIs? (DNS Servers)
Why is exponential backoff recommended for retrying API requests? (To handle network failures more efficiently)
Smithy 2.0
Overview
Smithy 2.0 is AWS's open-source Interface Definition Language (IDL) for web services.
Key Points
Smithy is a language for defining services and SDKs.
Smithy and its server generator unlock model-first development.
It forces you to define your interface first rather than let your API become implicitly defined by your implementation choices.
Example Code
// Define the namespace
namespace com.amazonaws.s3
// Define a simplified S3 service
service SimpleS3 {
version: "2023-12-21",
operations: [ListBuckets, PutObject]
}
// Define an operation to list buckets
operation ListBuckets {
output: ListBucketsOutput
}
// Define the output structure for ListBuckets operation
structure ListBucketsOutput {
buckets: BucketList
}
// Define a list of buckets
list BucketList {
member: Bucket
}
Important Terminology
IDL: Interface Definition Language
API: Application Programming Interface
Model-First Development: An approach where the interface is defined before implementation, ensuring consistency and clarity.
What is Smithy 2.0 primarily used for? (Defining services and SDKs)
Define Model-First Development. (Model-First Development is an approach where the interface is defined before implementation, ensuring consistency and clarity.)
Security Token Service (STS)
Overview
STS is a web service that enables you to request temporary, limited-privilege credentials for IAM users or federated users.
Key Points
AWS Security Token Service (STS) is a global service. By default, AWS STS requests go to a single global endpoint at https://sts.amazonaws.com, although Regional STS endpoints are also available.
An STS call will return:
AccessKeyID
SecretAccessKey
SessionToken
Expiration
API Actions
You can use the following API actions to obtain STS:
AssumeRole
AssumeRoleWithSAML
AssumeRoleWithWebIdentity
DecodeAuthorizationMessage
GetAccessKeyInfo
GetCallerIdentity
GetFederationToken
GetSessionToken
Example Usage
Assume a role via STS by providing the role ARN and a session name; it returns temporary credentials.
Code Example (Python)
import boto3
# Assume a role and get temporary credentials
sts_client = boto3.client('sts')
response = sts_client.assume_role(
RoleArn='arn:aws:iam::123456789012:role/demo',
RoleSessionName='session1'
)
creds = response['Credentials']
# Load in temporary credentials
s3 = boto3.client('s3',
aws_access_key_id=creds['AccessKeyId'],
aws_secret_access_key=creds['SecretAccessKey'],
aws_session_token=creds['SessionToken']
)
Command Line Interface (CLI) Example
Generate temporary credentials via STS with the CLI:
aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/my-role \
--role-session-name mySession
Export the temporary credentials:
export AWS_ACCESS_KEY_ID="ASIA...IAQ"
export AWS_SECRET_ACCESS_KEY="wJalr...zUE3"
export AWS_SESSION_TOKEN="FQoGz...<snipped>...zpZC"
Then load them into your CLI configuration files or environment variables.
What does AWS STS provide? (Temporary, limited-privilege credentials for IAM users or federated users)
What are the key elements returned by an STS call? (An STS call returns AccessKeyID, SecretAccessKey, SessionToken, and Expiration.)
API - STS
api/sts/Readme.md
## Create a user with no permissions
We need to create a new user with no permissions and generate our access keys
```sh
aws iam create-user --user-name sts-machine-user
aws iam create-access-key --user-name sts-machine-user --output table
```
Copy the access key and secret here
```sh
aws configure
```
Then edit the credentials file to move the new keys out of the default profile (e.g. into a profile named sts)
```sh
open ~/.aws/credentials
```
Test who you are:
```sh
aws sts get-caller-identity
aws sts get-caller-identity --profile sts
```
Make sure you don't have access to S3
```sh
aws s3 ls --profile sts
```
> An error occurred (AccessDenied) when calling the ListBuckets operation: Access Denied
## Create a Role
We need to create a role that will access a new resource
```sh
chmod u+x bin/deploy
./bin/deploy
```
## Use new user credentials and assume role
```sh
aws iam put-user-policy \
--user-name sts-machine-user \
--policy-name StsAssumePolicy \
--policy-document file://policy.json
```
```sh
aws sts assume-role \
--role-arn arn:aws:iam::982383527471:role/my-sts-fun-stack-StsRole-UBQlCIzagA7n \
--role-session-name s3-sts-fun \
--profile sts
```
```sh
aws sts get-caller-identity --profile assumed
```
```sh
aws s3 ls --profile assumed
```
## Cleanup
Tear down your CloudFormation stack via the AWS Management Console
```sh
aws iam delete-user-policy --user-name sts-machine-user --policy-name StsAssumePolicy
aws iam delete-access-key --access-key-id AKIA6JOU7AYXR3PVODP3 --user-name sts-machine-user
aws iam delete-user --user-name sts-machine-user
```
api/sts/policy.json
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::982383527471:role/my-sts-fun-stack-StsRole-UBQlCIzagA7n"
}]
}
api/sts/template.yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Create a role for us to assume and create a resource we'll have access to
Parameters:
BucketName:
Type: String
Default: "sts-fun-ab-63463"
Resources:
S3Bucket:
Type: 'AWS::S3::Bucket'
Properties:
BucketName: !Ref BucketName
StsRole:
Type: 'AWS::IAM::Role'
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
AWS: "arn:aws:iam::982383527471:user/sts-machine-user"
#Service:
# - s3.amazonaws.com
Action:
- 'sts:AssumeRole'
Path: /
Policies:
- PolicyName: s3access
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Action: 's3:*'
Resource: [
!Sub "arn:aws:s3:::*",
!Sub "arn:aws:s3:::${BucketName}",
!Sub "arn:aws:s3:::${BucketName}/*"
]
api/sts/bin/deploy
#!/usr/bin/env bash
aws cloudformation deploy \
--template-file template.yaml \
--stack-name my-sts-fun-stack \
--capabilities CAPABILITY_IAM
Signing AWS API Requests
Why Sign API Requests?
Sign API requests so that AWS can identify who sent them.
Signatures help:
Prevent data tampering
Verify the identity of the requester
Automatic Signing
Using the AWS CLI or AWS SDK automatically signs requests.
No need to sign some requests:
Anonymous requests to Amazon S3
Some API operations to STS (e.g., AssumeRoleWithWebIdentity)
Example of Signed Request
https://s3.amazonaws.com/examplebucket/test.txt
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=<your-access-key-id>/20130721/us-...
&X-Amz-Date=20130721T201207Z
&X-Amz-Expires=86400
&X-Amz-SignedHeaders=host
&X-Amz-Signature=<signature-value>
Example of a signature supplied in a query parameter.
Signing Protocols
AWS Signature Version 4 (current standard)
AWS Signature Version 2 (only used for legacy use cases)
Important Terms
AWS Signature Version 4:
The standard protocol for signing requests.
AssumeRoleWithWebIdentity:
An API operation to STS that may not require signing.
Notes
Ensure your API requests are signed to maintain secure communication and authorization with AWS services.
Understand the difference between Signature Version 2 and 4 for historical context and legacy systems.
Why is it important to sign AWS API requests? (To identify who sent the requests and prevent data tampering)
Name one type of request to Amazon S3 that does not need to be signed. (Anonymous requests to Amazon S3.)
AWS Signature Version 4
Step 1: Generate String to Sign
Concatenate select request elements to form a string, referred to as the string to sign.
Step 2: Create a Signing Key
Use a signing key to calculate the hash-based message authentication code (HMAC) of the string to sign.
Signing Key Generation
DateKey = HMAC-SHA256("AWS4" + <SecretAccessKey>, <yyyymmdd>)
DateRegionKey = HMAC-SHA256(DateKey, <aws-region>)
DateRegionServiceKey = HMAC-SHA256(DateRegionKey, <aws-service>)
SigningKey = HMAC-SHA256(DateRegionServiceKey, "aws4_request")
Signature
Signature = Hex(HMAC-SHA256(SigningKey, StringToSign))
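A minimal sketch of the derivation above using openssl; the secret is the AWS documentation example key (not a real credential), and the date/region/service values are placeholders:
```sh
#!/usr/bin/env bash
# Illustrative only: derive a SigV4 signing key step by step with HMAC-SHA256.
secret="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"   # example key from AWS docs
date="20130721"; region="us-east-1"; service="s3"

# First HMAC uses the string key "AWS4" + secret; later steps pass the previous hash as a hex key.
date_key=$(printf '%s' "$date" | openssl dgst -sha256 -hmac "AWS4${secret}" | sed 's/^.* //')
region_key=$(printf '%s' "$region" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${date_key}" | sed 's/^.* //')
service_key=$(printf '%s' "$service" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${region_key}" | sed 's/^.* //')
signing_key=$(printf '%s' "aws4_request" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${service_key}" | sed 's/^.* //')
echo "SigningKey (hex): ${signing_key}"
```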
Important Points
The Secret Access Key is used to derive the signing key; it is not used to sign the request directly.
The process to sign changes based on the request type:
Authorization Header (the most common way)
A Post Request
Query parameters
What is the first step in the AWS Signature Version 4 process? (Generate String to Sign)
What is the purpose of the Secret Access Key in AWS Signature Version 4? (It is used to derive the signing key, not to sign the request directly.)
Service Endpoints
Definition and Usage
To connect programmatically to an AWS service, you use an endpoint.
An endpoint is the URL of the entry point for an AWS web service.
General Endpoint Format
The format of a service endpoint varies per service:
protocol://service-code.region-code.amazonaws.com
Example:
https://cloudformation.us-east-2.amazonaws.com
Security and Protocols
Generally TLS 1.2 or later is expected; some older APIs might still accept TLS 1.0, TLS 1.1, or plain HTTP.
Types of Service Endpoints
Global Endpoints
AWS Services that use the same endpoint globally.
Regional Endpoints
AWS Services that require specifying a region.
FIPS Endpoints
Some endpoints support FIPS for enterprise use.
Dualstack Endpoints
Allows IPV6 or IPV4.
Key Points
Multiple Endpoints: An AWS Service may have multiple different service endpoints.
AWS CLI and AWS SDK will automatically use the default endpoint for each service in an AWS Region.
Combination of Endpoints:
Types of Service Endpoints can be combined, e.g., Regional + FIPS + Dualstack.
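The CLI normally picks the default regional endpoint for you, but you can point a single command at a specific endpoint with the --endpoint-url flag; a sketch using the S3 dualstack endpoint format (the URL is illustrative):
```sh
# Explicitly use the S3 dualstack (IPv4/IPv6) endpoint for us-east-1
aws s3api list-buckets --endpoint-url https://s3.dualstack.us-east-1.amazonaws.com
```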
What is an endpoint in the context of AWS services? (The URL of the entry point for an AWS web service)
Name the types of AWS service endpoints. (Global Endpoints, Regional Endpoints, FIPS Endpoints, Dualstack Endpoints)
AWS CLI – CLI Input Flag
Overview
We can populate parameters (command flags) if the --cli-input flag is available for a subcommand.
Example Usage
Command:
aws iotevents create-input --cli-input-json file://pressureInput.json
aws iotevents create-input --cli-input-yaml file://pressureInput.yaml
Subcommands Support
Formats Supported:
--cli-input-json
--cli-input-yaml
Important Note
Check the documentation to ensure the subcommand supports the desired input format.
Additional Information
Subcommand Options:
create-input
--input-name
[--input-description]
--input-definition
[--tags]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton]
[--debug]
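A typical workflow (sketch) is to generate a parameter skeleton with --generate-cli-skeleton, which is listed in the options above, edit the file, and then feed it back in:
```sh
# Write an empty parameter skeleton to a file, edit it, then use it as input
aws iotevents create-input --generate-cli-skeleton > pressureInput.json
aws iotevents create-input --cli-input-json file://pressureInput.json
```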
What is the purpose of the --cli-input flag in AWS CLI commands? (To populate parameters (command flags) for a subcommand)
Which formats are supported by the --cli-input flag? (JSON and YAML.)
Configuration Files
AWS Configuration Files
AWS has two different configuration files in TOML format:
~/.aws/credentials: Used to store sensitive credentials (e.g., AWS Access Key and Secret)
~/.aws/config: Used to store generic configurations (e.g., preferred region)
Credentials File: ~/.aws/credentials
Contains sensitive information
Example:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtfnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Config File: ~/.aws/config
Contains general configuration settings
Example:
```
[default]
region=us-west-2
output=json
```
Important Notes:
You can store all the same configuration in either file, though the credentials file takes precedence over the config file.
Global Configuration Options
Credentials-related options (stored in ~/.aws/credentials):
aws_access_key_id
aws_secret_access_key
aws_session_token
Configuration-related options (stored in ~/.aws/config):
region
output
ca_bundle
cli_auto_prompt
cli_binary_format
cli_history
cli_pager
cli_timestamp_format
credential_process
credential_source
duration_seconds
external_id
max_attempts
mfa_serial
parameter_validation
retry_mode
role_arn
role_session_name
source_profile
sso_account_id
sso_region
sso_registration_scopes
sso_role_name
sso_start_url
web_identity_token_file
tcp_keepalive
S3 Specific Settings
AWS S3 supports several settings that configure how the AWS CLI performs Amazon S3 operations:
Common Settings:
Apply to all S3 commands in both the s3api and s3 namespaces.
Transfer Command Settings:
Additional settings for cp, sync, mv, and rm commands.
Example Profile for S3 Specific Settings:
[profile development]
s3 =
max_concurrent_requests = 20
max_queue_size = 10000
multipart_threshold = 64MB
multipart_chunksize = 16MB
max_bandwidth = 50MB/s
use_accelerate_endpoint = true
addressing_style = path
Key Terms:
max_concurrent_requests: The maximum number of concurrent requests.
multipart_threshold: The size threshold for multipart uploads.
use_accelerate_endpoint: Enables S3 Transfer Acceleration.
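These nested S3 values can also be written from the CLI rather than editing the file by hand; a sketch using aws configure set against the development profile shown above:
```sh
aws configure set s3.max_concurrent_requests 20 --profile development
aws configure set s3.multipart_threshold 64MB --profile development
aws configure set s3.use_accelerate_endpoint true --profile development
```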
Where are AWS-sensitive credentials stored? (~/.aws/credentials)
What setting enables S3 Transfer Acceleration? (use_accelerate_endpoint)
Named Profiles
Overview
AWS Config files support the ability to have multiple profiles.
Profiles allow you to switch between different configurations quickly for different environments.
Default Profile
The default profile is used when no profile flag is specified.
You can choose to not have a [default] profile.
Good Practice: Leaving out the default profile helps prevent misconfiguration.
Example Profiles
Example configuration in ~/.aws/credentials and ~/.aws/config:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtfnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
[development]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtfnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
[production]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtfnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
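The snippet above uses the credentials-file form. In ~/.aws/config, named profiles take a profile prefix in the section header (the region and output values below are just examples):
```
[default]
region=us-west-2
output=json

[profile development]
region=us-east-1
output=json
```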
Switching Profiles
Use the --profile flag to change profiles per CLI run.
aws ec2 describe-instances --profile development
Environment Variable
You can set an ENV VAR to set the profile:
export AWS_PROFILE="production"
Key Terms:
default profile: The profile used when no other is specified.
--profile: CLI flag to specify which profile to use.
ENV VAR: Environment variable used to set the profile.
What is the purpose of AWS Named profiles? (To switch between different configurations quickly for different environments)
Which profile is used by default if no profile flag is specified? (Default)
AWS CLI Configure Commands
Overview
AWS CLI has multiple configure commands to make configuration easy:
Configuration Wizard
Command: aws configure
Prompts for:
AWS Access Key ID
AWS Secret Access Key
Default region name
Default output format
Set a Value for a Specific Setting
Command: aws configure set region us-west-2 --profile dev
Unset a Value
Command: aws configure set region "" --profile dev
Print the Value of a Setting
Command: aws configure get region
Import AWS Credentials
Command: aws configure import --csv file://credentials.csv
Note: Import AWS credentials generated and downloaded from the AWS Console
List Configuration and Profiles
Command: aws configure list
Command: aws configure list-profiles
This format highlights key commands and settings, making it easier to study and reference for exams.
Which command is used to set a specific value in the AWS CLI configuration? (aws configure set)
How do you set a specific region for a profile in AWS CLI? (aws configure set region us-west-2 --profile dev; this sets the region to us-west-2 for the profile named dev.)
AWS CLI - Environment Variables
Overview
Environment Variables (env vars) provide a way to specify AWS CLI configuration and credentials.
Useful for scripting or temporarily setting a profile as default.
Precedence Order (Highest to Lowest):
AWS CLI Parameters (highest priority)
Environment Variables
Configuration Files (lowest priority)
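A small sketch illustrating that order (the profile name and regions are placeholders):
```sh
# Env vars override the config files for this shell session
export AWS_PROFILE="development"
export AWS_REGION="us-west-2"
aws s3api list-buckets
# A CLI parameter still wins over the env vars above
aws s3api list-buckets --region ca-central-1
```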
Common Environment Variables
AWS_ACCESS_KEY_ID – Access key for programmatic access.
AWS_SECRET_ACCESS_KEY – Secret key for programmatic access.
AWS_DEFAULT_REGION – Default region when no region is specified.
AWS_REGION – Overrides AWS_DEFAULT_REGION and profile settings.
AWS_PROFILE – AWS CLI profile name with credentials.
Additional Environment Variables
AWS_CA_BUNDLE – Path to custom certificate bundle for HTTPS validation.
AWS_CLI_AUTO_PROMPT – Enables auto-prompt for AWS CLI v2.
AWS_CLI_FILE_ENCODING – Encoding for text files.
AWS_CONFIG_FILE – Path to AWS CLI configuration file (Default: ~/.aws/config).
AWS_DEFAULT_OUTPUT – Overrides profile output format setting.
AWS_SHARED_CREDENTIALS_FILE – Path to AWS credentials file.
Role and Session Environment Variables
AWS_ROLE_ARN – ARN of IAM role for a web identity provider.
AWS_ROLE_SESSION_NAME – Custom session name (auto-generated if not specified).
AWS_SESSION_TOKEN – Token for temporary security credentials from AWS STS.
AWS_WEB_IDENTITY_TOKEN_FILE – Path to OAuth 2.0 or OpenID Connect ID token.
Metadata & Retry Settings
AWS_EC2_METADATA_DISABLED – Disables EC2 instance metadata service.
AWS_METADATA_SERVICE_TIMEOUT – Timeout (seconds) for instance metadata service connection.
AWS_METADATA_SERVICE_NUM_ATTEMPTS – Number of attempts to retrieve instance metadata credentials.
AWS_MAX_ATTEMPTS – Max retry attempts for AWS CLI requests.
AWS_PAGER – Defines output pagination program.
AWS_RETRY_MODE – Retry mode for AWS CLI requests.
Which of the following has the highest priority in terms of AWS CLI configuration options? (AWS CLI parameters)
Which environment variable is used to specify the access key for AWS CLI? (AWS_ACCESS_KEY_ID)
AWS CLI – Autocompletion Options
AWS provides multiple options for autocompletion when using AWS CLI commands.
Legacy
AWS Completer
The original autocompleter for AWS.
Primarily intended for AWS CLI v1.
Defunct Project
AWS Shell
An interactive shell for the AWS CLI.
This project has been inactive since 2020.
Functionality seems to have been integrated into the AWS CLI.
Contains bugs that display on every call for the past two years.
Not recommended for use (despite being prominently listed on the AWS CLI marketing page).
Recommended
AWS CLI Auto Prompt
AWS CLI v2 can turn on auto prompt to provide similar functionality to AWS Shell.
The recommended way to use autocomplete with AWS CLI.
Summary
Use AWS CLI Auto Prompt for autocompletion with AWS CLI v2.
Avoid using AWS Shell due to bugs and inactivity.
AWS Completer is legacy and mainly for AWS CLI v1.
This guide highlights the best current option for autocompletion and advises against using outdated or buggy tools.
Which autocompletion option is recommended for AWS CLI v2? (AWS CLI Auto Prompt)
Why is AWS Shell not recommended for use? (AWS Shell has been inactive since 2020 and contains bugs that display on every call.)
AWS CLI Autoprompt Study Notes
Overview
AWS CLI Autoprompt is a powerful interactive shell built into AWS CLI to assist in writing CLI commands.
Features
Fuzzy Search
Command Completion
Parameter Completion
Resource Completion
Shorthand Completion
File Completion
Region Completion
Profile Completion
Documentation Panel
Projected Output Panel
Command History
Two different modes of activation
Works anywhere AWS CLI is installed
Documentation Panel
Shows relevant documentation for the current command or subcommand.
Pressing [F3] will bring up a documentation pane.
Navigation:
[F2]: Toggle panes.
VI directions: j = down, k = up.
Configuration
Enable Autoprompt:
Single command: aws --cli-auto-prompt
Every future run (environment variable): export AWS_CLI_AUTO_PROMPT=on-partial
Every future run (config file): cli_auto_prompt = on
Modes
Full Mode: Activates Autoprompt for every command.
Partial Mode: Activates Autoprompt only if the command is incomplete or has client-side validation errors.
Potential Disruptions: Full mode can disrupt workflow for bash scripts as it activates for each command.
Output Panel
Press [F5] to toggle the output panel.
Preview the schema of the output for a subcommand.
Works with --query: Autoprompt will suggest syntax.
Supports different output types.
Command History
Press Ctrl+R for previous AWS CLI history.
Shows successful and unsuccessful commands.
Commands outside Autoprompt: Use native bash terminal features like Ctrl+R.
Autocompletion
Command Completion:
Suggests commands and subcommands.
Parameter Completion:
Suggests parameters (command flags).
Indicates required parameters first.
Resource Completion:
Suggests resources in your AWS account (credentials required).
Shorthand Completion:
Helps autocomplete shorthand JSON syntax.
File Completion:
Autocompletes file names (prefix with file://).
Region Completion:
Autocompletes region (may require other parameters first).
Profile Completion:
Autocompletes profile information configured in the ~/.aws/credentials.
Fuzzy Searching:
Type part of a word for fuzzy search results.
Example: m*ca* for matching me-central-1.
What is AWS CLI Autoprompt primarily designed to assist with? (Writing CLI commands)
How do you enable AWS CLI Autoprompt for every future run by setting the option within the AWS CLI config file? (Add cli_auto_prompt = on to your AWS CLI config file.)
Introduction to VPC
Introduction to Virtual Private Cloud
AWS Virtual Private Cloud (VPC) is a logically isolated virtual network. AWS VPC resembles a traditional network you’d operate in your own data center.
A VPC provides many different networking components
Virtual machines (e.g., EC2) are the most common reason for using a VPC in AWS.
Virtual network cards, i.e., Elastic Network Interfaces (ENIs), are used within a VPC to attach networking to different compute types, e.g., EC2, Lambda, and ECS.
AWS VPC is tightly coupled with AWS EC2, and all VPC CLI commands live under aws ec2.
What is an AWS Virtual Private Cloud (VPC)? (A logically isolated virtual network within the AWS ecosystem.)
True or False, a VPC is your own logically isolated section of the AWS Cloud where you can launch resources in a virtual network you define yourself. (True)
Core Components of VPC
A VPC is composed of many different networking components:
Internet Gateway (IGW)
A gateway that connects your VPC to the internet
Virtual Private Gateway (VPN Gateway)
A gateway that connects your VPC to a private external network
Route Tables
determines where to route traffic within a VPC
NAT Gateway
Allows private instances (eg. Virtual Machines) to connect to services outside the VPC
Network Access Control Lists (NACLs)
Acts as a stateless virtual firewall for compute within a VPC; operates at the subnet level with both allow and deny rules
Security Groups (SG)
Acts as a stateful virtual firewall for compute within a VPC; operates at the instance level with allow rules only
Public Subnets
Subnets allow instances to have public IP addresses
Private Subnets
Subnets whose instances are not assigned public IP addresses
VPC Endpoints
Privately connect to supported AWS services
VPC Peering
Connecting VPCs to other VPCs
Technically, NACLs and SGs are EC2 networking components.
True or False, Amazon Virtual Private Cloud (VPC) is a regional service (True)
Which of the following are VPC components? (Subnets, Routing Tables, Security Groups)
This AWS service is your own personal data centre in the cloud (Virtual Private Cloud (VPC))
Key Features
VPCs are Region Specific. They do not span regions
You can use VPC Peering to connect VPCs across regions
You can create up to 5 VPCs per region (adjustable).
Every region comes with a default VPC
You can have 200 subnets per VPC
Up to 5 IPv4 CIDR Blocks per VPC (adjustable to 50)
Up to 5 IPv6 CIDR Blocks per VPC (adjustable to 50)
Most components cost nothing:
VPCs, Route Tables, NACLs, Internet Gateways, Security Groups, Subnets, VPC Peering
Some things cost money, e.g.:
NAT Gateways, VPC Endpoints, VPN Gateways, Customer Gateways
Public IPv4 addresses, Elastic IPs
DNS hostnames (if your instances have domain name addresses)
How many VPCs can you create in each region? (Up to 5)
True or False, VPCs are region specific (True)
What is the maximum number of subnets per VPC? (200)
True or False, VPCs are specific to a single region (True)
The maximum number of VPCs you can create within a region (5 VPCs per region)
VPC Basic
create_vpc
#!/usr/bin/env bash
set -e
# Create our vpc
VPC_ID=$(aws ec2 create-vpc \
--cidr-block "172.1.0.0/16" \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=my-vpc-3}]' \
--region ca-central-1 \
--query Vpc.VpcId \
--output text)
echo "VPC_ID: $VPC_ID"
# Turn on DNS Hostnames
aws ec2 modify-vpc-attribute --vpc-id $VPC_ID --enable-dns-hostnames "{\"Value\":true}"
# create an IGW
IGW_ID=$(aws ec2 create-internet-gateway \
--query InternetGateway.InternetGatewayId \
--output text
)
echo "IGW_ID: $IGW_ID"
# attach an IGW
aws ec2 attach-internet-gateway --internet-gateway-id $IGW_ID --vpc-id $VPC_ID
# create a new subnet
SUBNET_ID=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block 172.1.0.0/20 \
--query Subnet.SubnetId \
--output text)
echo "SUBNET_ID: $SUBNET_ID"
## auto assign IPv4 addresses
aws ec2 modify-subnet-attribute --subnet-id $SUBNET_ID --map-public-ip-on-launch
# explicitly associate the subnet with the main route table
RT_ID=$(aws ec2 describe-route-tables \
--filters "Name=vpc-id,Values=$VPC_ID" "Name=association.main,Values=true" \
--query "RouteTables[].RouteTableId[]" \
--output text)
echo "RT_ID: $RT_ID"
ASSOC_ID=$(aws ec2 associate-route-table \
--route-table-id $RT_ID \
--subnet-id $SUBNET_ID \
--query AssociationId \
--output text)
echo "ASSOC_ID: $ASSOC_ID"
# add a route for our RT to our IGW
aws ec2 create-route \
--route-table-id $RT_ID \
--destination-cidr-block 0.0.0.0/0 \
--gateway-id $IGW_ID
# Print out delete command
echo "./delete_vpc $VPC_ID $IGW_ID $SUBNET_ID $ASSOC_ID $RT_ID"
delete_vpc
#!/usr/bin/env bash
# VPC IGW SUBNET RT
# Check if the argument is not provided
if [ -z "$1" ]; then
echo "Argument not provided."
else
export VPC_ID="$1"
fi
if [ -z "$2" ]; then
echo "Argument not provided."
else
export IGW_ID="$2"
fi
if [ -z "$3" ]; then
echo "Argument not provided."
else
export SUBNET_ID="$3"
fi
if [ -z "$4" ]; then
echo "Argument not provided."
else
export ASSOC_ID="$4"
fi
if [ -z "$5" ]; then
echo "Argument not provided."
else
export RT_ID="$5"
fi
# detach the IGW
aws ec2 detach-internet-gateway --internet-gateway-id $IGW_ID --vpc-id $VPC_ID
# delete the IGW
aws ec2 delete-internet-gateway --internet-gateway-id $IGW_ID
# disassociate the subnet from the route table
aws ec2 disassociate-route-table --association-id $ASSOC_ID
# delete subnet
aws ec2 delete-subnet --subnet-id $SUBNET_ID
# delete route table (skipped: the main route table cannot be deleted and is removed with the VPC)
# aws ec2 delete-route-table --route-table-id $RT_ID
# delete vpc
aws ec2 delete-vpc --vpc-id $VPC_ID
AWS Default VPC
Overview
AWS provides a default VPC in every region, allowing for immediate deployment of instances.
Default VPC Configuration
IPv4 CIDR block: 172.31.0.0/16 (65,536 IPv4 addresses)
Subnets: One per Availability Zone (size of /20, 4,096 IPv4 addresses per subnet)
Internet Gateway (IGW)
Default Security Group (SG)
Network Access Control List (NACL)
DHCP options set
Route Table with a route to the internet via IGW
Important Notes
You can delete a default VPC.
AWS recommends using the default VPC. Other cloud service providers may recommend against using the default VPC.
Creating a Default VPC
If you delete the default VPC (accidentally or intentionally) and need to recreate it, use the following AWS CLI command:
aws ec2 create-default-vpc --region <region-name>
Example for the CA-Central-1 region:
aws ec2 create-default-vpc --region ca-central-1
Command Output Example
{
"Vpc": {
"VpcId": "vpc-3f139646",
"InstanceTenancy": "default",
"Tags": [],
"Ipv6CidrBlockAssociationSet": [],
"State": "pending",
"DhcpOptionsId": "dopt-61079b07",
"CidrBlock": "172.31.0.0/16",
"IsDefault": true
}
}
Key Points
Cannot restore a previously deleted default VPC.
Cannot mark an existing non-default VPC as a default VPC.
If you already have a default VPC in the region, you cannot create another one.
Which of the following is included in the default VPC configuration? (Internet Gateway (IGW))
What is the purpose of the default VPC in each region? (So you can immediately start deploying your instances)
True or False, when you create a VPC it automatically has a main route table (True)
What is the IPv4 CIDR block for a default VPC in AWS? (172.31.0.0/16)
Deleting a VPC
Steps to Delete a VPC
To delete a VPC, you need to delete multiple VPC resources before you can delete the VPC itself. The resources include:
Security groups (SGs) and Network ACLs (NACLs)
Subnets
Route tables (RTs)
Gateway endpoints
Internet gateways (IGWs)
Egress-only internet gateways (EO-IGWs)
Command Line Instructions
Delete security groups:
aws ec2 delete-security-group --group-id sg-id
Delete network ACLs:
aws ec2 delete-network-acl --network-acl-id acl-id
Delete subnets:
aws ec2 delete-subnet --subnet-id subnet-id
Delete route tables:
aws ec2 delete-route-table --route-table-id rtb-id
Detach internet gateways:
aws ec2 detach-internet-gateway --internet-gateway-id igw-id --vpc-id vpc-id
Delete internet gateways:
aws ec2 delete-internet-gateway --internet-gateway-id igw-id
Delete egress-only internet gateways:
aws ec2 delete-egress-only-internet-gateway --egress-only-internet-gateway-id eigw-id
Finally, delete the VPC:
aws ec2 delete-vpc --vpc-id vpc-id
When you delete a VPC in the AWS Management Console, it will automatically attempt to delete the resources for you.
Which of the following resources must be deleted before you can delete a VPC? (Choose 3) (Internet gateway, Subnets, Route tables (RTs))
What does the AWS Management Console do when you delete a VPC? (It automatically attempts to delete the associated resources for you.)
Default Route / Catch-All-Route
Definition
The default route or catch-all-route represents all possible IP addresses.
Think of this route as giving access from anywhere or to the internet without restriction.
IPv4 Default Route
0.0.0.0/0
IPv6 Default Route
::/0
:: is shorthand for 0000:0000:0000:0000:0000:0000:0000:0000.
Route Table for IGW
When we specify 0.0.0.0/0 in our Route Table for Internet Gateway (IGW), we are allowing internet access.
Security Group's Inbound Rules
When we specify 0.0.0.0/0 in our Security Group's Inbound Rules, we are allowing all traffic from the internet to access our public resources.
Key Points
0.0.0.0/0 and ::/0 are used to allow unrestricted access.
Be cautious with Security Group rules to prevent unauthorized access.
Important Terminology
0.0.0.0/0: Represents all IPv4 addresses.
::/0: Represents all IPv6 addresses.
Internet Gateway (IGW): A gateway that allows instances in a VPC to communicate with the internet.
Summary
The default route or catch-all-route is critical for defining access rules in network configuration.
IPv4 and IPv6 default routes allow traffic to/from any IP address.
Properly configuring these routes and security group rules ensures secure and intended internet access.
What does the default route or catch-all-route represent? (All possible IP addresses)
What is the IPv4 default route notation? (0.0.0.0/0)
Shared VPCs
AWS Resource Access Manager (RAM)
Allows you to share resources across your AWS Accounts.
Key Points
VPCs can be shared with other AWS Accounts within the same AWS Organization to centrally manage resources in a single VPC.
Using a shared VPC allows you to:
Reduce the number of VPCs that you create and manage
Separate accounts for billing and access control
Sharing VPCs
You share VPCs by sharing subnets
You can share only non-default subnets
You need to create a resource share in RAM (what you’re sharing)
You need to specify principals in RAM (who you’re sharing with)
Enable Sharing with AWS Organization
Command to enable sharing:
aws ram enable-sharing-with-aws-organization
Shared VPCs Identification
Shared VPCs will appear in the shared account with the specific shared subnets.
You can tell if it’s shared by checking the OwnerID.
Creating a Resource Share
Command to create a resource share:
aws ram create-resource-share \
--name "ShareSubnet" \
--resource-arns "arn:aws:ec2:region:account-id:subnet/subnet-id" \
--principals "0123456789012"
What is AWS Resource Access Manager (RAM) used for? (Sharing resources across AWS Accounts)
How can you tell if a VPC is shared in the shared account? (By checking the OwnerID.)
Introduction to NACLs
Network Access Control Lists (NACLs)
Acts as a stateless virtual firewall at the subnet level.
Have both ALLOW and DENY rules.
A default NACL is created with every VPC.
Key Features
NACLs contain two sets of rules:
Inbound rules: Control ingress traffic (entering).
Outbound rules: Control egress traffic (leaving).
Subnets are associated with NACLs, and a subnet can only belong to a single NACL.
Differences between NACLs and Security Groups (SGs)
NACLs have both allow and deny rules.
Security Groups (SGs) only have allow rules.
With NACLs, you can block a single IP address, which is not possible with SGs.
Rule Evaluation
Rule number determines the order of evaluation from lowest to highest.
The highest rule number can be 32766; recommended to work in increments of 10 or 100.
Creating NACLs
Create the NACL
aws ec2 create-network-acl \
--vpc-id vpc-1234abcd5678efgh
Add a NACL entry (rule)
aws ec2 create-network-acl-entry \
--network-acl-id acl-1a2b3c4d5e6f7890a \
--ingress \
--rule-number 100 \
--protocol tcp \
--port-range From=80,To=80 \
--cidr-block 0.0.0.0/0 \
--rule-action allow
Associate the NACL to a subnet
aws ec2 replace-network-acl-association \
--association-id aclassoc-1b2c3d4e5f67890ab \
--network-acl-id acl-1a2b3c4d5e6f7890a
Use Case
Blocking a malicious IP:
Determine a malicious actor's IP address.
Add a DENY rule for that specific IP.
Add a DENY rule for SSH (Port 22) if not needed.
Example
ALLOW internet traffic.
DENY traffic from a specific IP: 23.248.68.48/32.
DENY SSH traffic on Port 22.
True or False, using NACLs you can block a single IP address or range (True)
How many NACLs can a Subnet be associated with? (1)
You can block this with a NACL, but not with a Security Group (A single IP address)
What is the function of Network Access Control Lists (NACLs)? (NACLs act as a stateless virtual firewall at the subnet level, containing both allow and deny rules to control inbound and outbound traffic.)
NACL (Network Access Control)
Readme.md
## Create NACL
```sh
aws ec2 create-network-acl --vpc-id vpc-03181823a2da0addd
```
## Add entry
```sh
aws ec2 create-network-acl-entry \
--network-acl-id acl-02def3052778d5ce2 \
--ingress \
--rule-number 90 \
--protocol -1 \
--port-range From=0,To=65535 \
--cidr-block 174.5.108.3/32 \
--rule-action deny
```
## Get AMI for Amazon Linux 2
Grab the latest Amazon Linux 2 (AL2) AMI
```sh
aws ec2 describe-images \
--owners amazon \
--filters "Name=name,Values=amzn2-ami-hvm-*-x86_64-gp2" "Name=state,Values=available" \
--query "Images[?starts_with(Name, 'amzn2')]|sort_by(@, &CreationDate)[-1].ImageId" \
--region ca-central-1 \
--output text
```
template.yml
AWSTemplateFormatVersion: "2010-09-09"
Description: Launch a simple EC2 for use with testing VPCs
Parameters:
InstanceType:
Type: String
Default: t2.micro
VpcId:
Type: String
Default: vpc-03181823a2da0addd
ImageId:
Type: String
Default: ami-003c24b59bb839e19
SubnetId:
Type: String
Default: subnet-0584791a9829bc51c
Resources:
SSMRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service: ec2.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
EC2InstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Roles:
- !Ref SSMRole
MyEC2Instance:
Type: AWS::EC2::Instance
Properties:
IamInstanceProfile: !Ref EC2InstanceProfile
InstanceType: !Ref InstanceType
ImageId: !Ref ImageId
#SubnetId: !Ref SubnetId
#SecurityGroupIds:
# - !GetAtt SG.GroupId
NetworkInterfaces:
- DeviceIndex: 0
SubnetId: !Ref SubnetId
AssociatePublicIpAddress: true
GroupSet:
- !GetAtt SG.GroupId
DeleteOnTermination: true
UserData:
Fn::Base64: |
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<html><body><h1>Hello from Apache on Amazon Linux 2!</h1></body></html>" > /var/www/html/index.html
SG:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Allow http to client host
VpcId: !Ref VpcId
SecurityGroupIngress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
SecurityGroupEgress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
Introduction to Security Groups (SGs)
Security Groups (SG) act as a stateful virtual firewall at the instance level.
Key Concepts
Security Groups are associated with EC2 instances.
Each SG contains two different sets of rules:
Inbound rules (ingress traffic, entering)
Outbound rules (egress traffic, leaving)
Characteristics
SGs are not bound by subnets, but are bound by VPC.
A Security Group can contain multiple instances in different subnets.
Security Group Rules
You can choose a preset Traffic Type (e.g., HTTP/S, Postgres).
You can choose a custom Protocol (UDP/TCP) and Port Range.
Destination type can be:
IPv4 CIDR Block
IPv6 CIDR Block
Another Security Group
Managed Prefix List
There are only Allow Rules. There are no Deny rules. All traffic is blocked by default.
Use Cases
Allow IP Addresses: Specify the source to be an IPv4 or IPv6 range or a specific IP.
Allow to Another Security Group: Specify the source to be another security group.
Nested Security Groups: An instance can belong to multiple Security Groups, and rules are permissive (instead of restrictive).
CLI Example
Create Security Group
aws ec2 create-security-group \
--group-name MySecurityGroup \
--description "My security group" \
--vpc-id vpc-xxxxxxxx
Add Rules to Security Group
aws ec2 authorize-security-group-ingress \
--group-id sg-xxxxxxxx \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
Associate EC2 Instance to Security Group
aws ec2 modify-instance-attribute \
--instance-id i-xxxxxxxxxxxxxxxx \
--groups sg-xxxxxxxxxxxxxxxxxxxx
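For the "allow to another security group" use case above, the source can be a security group ID instead of a CIDR block; a sketch (the IDs and port are placeholders):
```sh
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxxxxx \
  --protocol tcp \
  --port 5432 \
  --source-group sg-yyyyyyyy
```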
Limits
You can have up to 10,000 Security Groups in a Region (default is 2,500).
You can have 60 inbound rules and 60 outbound rules per security group.
16 Security Groups per Elastic Network Interface (ENI) (default is 5).
Exclusions
Security groups do not filter traffic destined to and from the following:
Amazon Domain Name Services (DNS)
Amazon Dynamic Host Configuration Protocol (DHCP)
Amazon EC2 instance metadata
Amazon ECS task metadata endpoints
License activation for Windows instances
Amazon Time Sync Service
Reserved IP addresses used by the default VPC router
What is the maximum number of security groups you can have in a region? (10,000)
What is the purpose of a security group? (To act as a virtual firewall at the instance level)
True or False, with security groups all traffic is allowed by default (False)
What is the maximum number of rules you can have per security group? (60 Inbound Rules and 60 Outbound Rules)
True or False, with security groups all traffic is blocked by default unless specifically allowed (True)
Security Group
AWSTemplateFormatVersion: "2010-09-09"
Description: Launch a simple EC2 for use with testing VPCs
Parameters:
InstanceType:
Type: String
Default: t2.micro
VpcId:
Type: String
Default: vpc-0e42de8384640b727
ImageId:
Type: String
Default: ami-003c24b59bb839e19
SubnetId:
Type: String
Default: subnet-04860c6667f12b9ce
Resources:
SSMRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service: ec2.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
EC2InstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Roles:
- !Ref SSMRole
MyEC2Instance:
Type: AWS::EC2::Instance
Properties:
IamInstanceProfile: !Ref EC2InstanceProfile
InstanceType: !Ref InstanceType
ImageId: !Ref ImageId
#SubnetId: !Ref SubnetId
#SecurityGroupIds:
# - !GetAtt SG.GroupId
NetworkInterfaces:
- DeviceIndex: 0
SubnetId: !Ref SubnetId
AssociatePublicIpAddress: true
GroupSet:
- !GetAtt SG.GroupId
DeleteOnTermination: true
UserData:
Fn::Base64: |
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<html><body><h1>Hello from Apache on Amazon Linux 2!</h1></body></html>" > /var/www/html/index.html
SG:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Allow http to client host
VpcId: !Ref VpcId
SecurityGroupIngress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
SecurityGroupEgress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
Stateful vs Stateless
Stateless Firewalls (e.g., AWS NACLs)
Definition: Stateless firewalls are not aware of the state of the request.
Behavior:
They treat incoming and outgoing requests independently.
Perform rule checks on both directions (incoming and outgoing).
They block or allow traffic based on defined rules, regardless of previous requests.
Example:
Request A: Originates from an internal server.
Response A: Handled as a new, separate request.
Request B: Originates from the internet.
Response B: Handled as a new, separate request.
Stateful Firewalls (e.g., AWS Security Groups)
Definition: Stateful firewalls are aware of the state of the requests.
Behavior:
They allow all outbound requests.
Responses for requests that originated outbound are automatically allowed back through.
They maintain the state of the connections, simplifying rule management.
Example:
Request A: Originates from an internal server.
Response A: Automatically allowed if it matches an outbound request.
Request B: Originates from the internet.
Response B: Automatically allowed if it matches an outbound request.
Key Points
Stateless Firewalls:
Not aware of request state.
Perform rule checks both ways.
Treat each request independently.
Stateful Firewalls:
Aware of request state.
Allow responses for outbound requests automatically.
Maintain state of connections.
Visual Summary
Stateless:
Blocks or allows requests and responses separately.
Requires rules for both directions.
Stateful:
Tracks connections and allows related traffic.
Simplifies management with state awareness.
Which of the following is true about stateless firewalls like AWS NACLs? (They treat each request independently.)
What is the main behavior of stateful firewalls like AWS Security Groups? (Stateful firewalls are aware of the state of the requests. They allow all outbound requests and automatically allow responses for requests that originated outbound.)
Route Tables
Overview
Route tables (RT) are used to determine where network traffic is directed.
Components
VPC (Virtual Private Cloud)
Public Subnet
Route Table
Router
IGW (Internet Gateway)
The Internet
Routes
A route table contains a set of routes.
Each subnet in your VPC must be implicitly or explicitly associated with a route table.
A subnet can only be associated with one route table at a time, but you can associate multiple subnets with the same route table.
Types of Route Tables
Main Route Table
A default route table created alongside every VPC which cannot be deleted.
A subnet not explicitly associated with a route table will use the Main Route Table.
Custom Route Table
A route table that you can create for your VPC.
Example: A custom route table could be used if you needed specific subnets to only route out to a VPN and not the IGW.
Destinations
Destination specifies where the route will go, using an IPv4 or IPv6 CIDR Block or Managed Prefix List.
Examples:
0.0.0.0/0
::/0
pl-02cd2c6b (com.amazonaws.us-east-1.dynamodb)
Important Notes
Route Table (RT): Used to determine where network traffic is directed.
Routes: Contains a set of routes.
Subnet: Must be associated with a route table.
Main Route Table: Default, cannot be deleted, used by subnets not explicitly associated with any route table.
Custom Route Table: Created by the user for specific needs, such as routing to a VPN.
Visual Representation
Public subnet connected to the Main Route Table.
Private subnet connected to a Custom Route Table.
Route table entries with Destination and Target columns indicating where traffic is routed.
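A sketch of creating a custom route table, associating it with a subnet, and adding a route, mirroring the commands in the create_vpc script earlier (the IDs are placeholders):
```sh
# Create a custom route table in the VPC and capture its ID
RT_ID=$(aws ec2 create-route-table \
  --vpc-id vpc-xxxxxxxx \
  --query RouteTable.RouteTableId \
  --output text)
# Associate it with a subnet and add a default route to an IGW
aws ec2 associate-route-table --route-table-id $RT_ID --subnet-id subnet-xxxxxxxx
aws ec2 create-route --route-table-id $RT_ID --destination-cidr-block 0.0.0.0/0 --gateway-id igw-xxxxxxxx
```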
These condensed notes highlight the most crucial points about Route Tables in a VPC, providing a clear and quick reference for study purposes.
What is the primary purpose of a route table in a VPC? (To determine where network traffic is directed)
True or False, each subnet within a VPC must be associated with a route table (True)
Within a route table, each record is referred to as a what? (Route)
This VPC component is used to determine where network traffic is directed (Route Tables)
What type of route table is created by default and cannot be deleted? (Main Route Table)
What is a Gateway?
Definition
A gateway (in the context of cloud services) is a networking service that sits between two different networks. Gateways often act as reverse proxies, firewalls, and load balancers.
Types of Gateways
Internet Gateway
Inbound and outbound public traffic for IPv4 and IPv6.
Egress-Only Internet Gateway
Outbound private traffic for IPv6.
Carrier Gateway
Connecting to AWS-partnered telecom network.
NAT Gateway
Outbound private traffic for IPv4.
Virtual Private Gateway
The endpoint into your AWS account for a VPN connection.
Customer Gateway
The endpoint into your on-premises account for a VPN connection.
Gateway Load Balancer (GWLB)
Layer 3 (Network layer) load balancer to run and scale third-party virtual applications (e.g., Firewalls, IDS/IPS).
Direct Connect Gateway
The endpoint connection to a fiber optic connection at a co-location data center.
AWS Backup Gateway
The endpoint connection for AWS managed backups.
IoT Device Gateway
The endpoint connection to send IoT data in both directions.
AWS Transit Gateway
Hub and spoke model to simplify VPC peering.
Amazon API Gateway
Abstracts API endpoints to services.
AWS Storage Gateway
Syncing, caching, or extending local storage to cloud storage.
Additional Information
Some cloud services call their gateways "load balancers" or vice versa.
What is the primary function of a gateway in cloud services? (To sit between two different networks)
What is the function of a Virtual Private Gateway? (A Virtual Private Gateway is the endpoint into your AWS account for a VPN connection.)
Internet Gateway (IGW)
Internet Gateway (IGW) allows both inbound and outbound internet traffic to your VPC.
Route Table
Destination: 10.0.0.0/16
Target: local
Destination: 2001:db8::/64
Target: local
Destination: 0.0.0.0/0
Target: igw-xxxxxxxx
Destination: ::/0
Target: igw-xxxxxxxx
Important Points
These are catch-all routes.
Always used with IGW to access the internet.
Features of IGW
Works for both IPv6 and IPv4 addresses.
Provides a target in your VPC route tables for internet-routable traffic.
Performs one-to-one NAT for instances that have public IPv4 addresses; instances with only private IPv4 addresses need a NAT device instead.
NAT is not needed for IPv6 addresses because they are all public.
Notes
Default VPCs come with an IGW.
For Non-Default VPCs, you must manually create and associate your IGW.
What type of traffic does an Internet Gateway (IGW) allow to your VPC? (Both inbound and outbound traffic)
This VPC component allows your VPC to access the internet (Internet Gateway (IGW))
What is one key feature of an IGW related to NAT? (IGW performs Network Address Translation (NAT) for instances with public IPv4 addresses.)
Egress-Only Internet Gateway (EO-IGW)
Definition: Egress-Only Internet Gateways (EO-IGW) are specifically for IPv6 when you want to allow outbound traffic to the internet but prevent inbound traffic from the internet.
Key Points:
IPv6 Addressing:
IPv6 addresses are all public, so they don’t require Network Address Translation (NAT).
Functionality:
An Internet Gateway (IGW) does not restrict inbound traffic.
An EO-IGW ensures your instances are private by denying inbound traffic.
Route Table Example:
Destination: 10.0.0.0/16
Target: local
Destination: 2001:db8::/64
Target: local
Destination: ::/0
Target: eigw-xxxxxxxxx
Diagram Explanation:
Subnets route their traffic through the EO-IGW to reach the internet.
EO-IGW only allows outbound traffic, blocking all inbound traffic, thus securing the instances in the subnet.
Summary:
Egress-Only Internet Gateway (EO-IGW) is crucial for managing IPv6 traffic, allowing only outbound traffic and blocking inbound traffic to ensure privacy and security for instances.
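A sketch of creating an egress-only internet gateway and pointing the IPv6 default route at it (the IDs are placeholders):
```sh
# Create the egress-only IGW and capture its ID
EIGW_ID=$(aws ec2 create-egress-only-internet-gateway \
  --vpc-id vpc-xxxxxxxx \
  --query EgressOnlyInternetGateway.EgressOnlyInternetGatewayId \
  --output text)
# Route all outbound IPv6 traffic through it
aws ec2 create-route \
  --route-table-id rtb-xxxxxxxx \
  --destination-ipv6-cidr-block ::/0 \
  --egress-only-internet-gateway-id $EIGW_ID
```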
These notes are structured to highlight the most critical aspects of EO-IGW for quick review and understanding, focusing on functionality, purpose, and key differences from standard Internet Gateways.
What is the primary purpose of an Egress-Only Internet Gateway (EO-IGW)? (Allow only outbound traffic and prevent inbound traffic)
Why don't IPv6 addresses require Network Address Translation (NAT)? (Because IPv6 addresses are all public.)
Egress-Only Internet Gateway
vpc/eigw/template-ipv4.yml
AWSTemplateFormatVersion: "2010-09-09"
Description: Launch a simple EC2 for use with testing VPCs
Parameters:
BucketName:
Type: String
Description: The name of the S3 bucket
Default: eigw-andrewbrown-5325
InstanceType:
Type: String
Default: t3.micro
VpcId:
Type: String
Default: vpc-08f0ec02f7471b018
ImageId:
Type: String
Default: ami-003c24b59bb839e19
SubnetId:
Type: String
Default: subnet-0e0fd31733061237d
Resources:
S3AccessRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
Service: "ec2.amazonaws.com"
Action: "sts:AssumeRole"
Path: "/"
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
Policies:
- PolicyName: AccessSpecificS3Bucket
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Action:
- "s3:GetObject"
- "s3:PutObject"
- "s3:DeleteObject"
- "s3:ListBucket"
Resource:
- !Sub "arn:aws:s3:::${BucketName}"
- !Sub "arn:aws:s3:::${BucketName}/*"
EC2InstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Roles:
- !Ref S3AccessRole
MyEC2Instance:
Type: AWS::EC2::Instance
Properties:
IamInstanceProfile: !Ref EC2InstanceProfile
InstanceType: !Ref InstanceType
ImageId: !Ref ImageId
#SubnetId: !Ref SubnetId
#SecurityGroupIds:
# - !GetAtt SG.GroupId
NetworkInterfaces:
- DeviceIndex: 0
SubnetId: !Ref SubnetId
AssociatePublicIpAddress: true
GroupSet:
- !GetAtt SG.GroupId
DeleteOnTermination: true
UserData:
Fn::Base64: |
#!/bin/bash
MY_ID=$(date +%s)
curl ifconfig.me > my-ip-$MY_ID.txt
aws s3 cp my-ip-$MY_ID.txt s3://eigw-andrewbrown-5325
SG:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Allow http to client host
VpcId: !Ref VpcId
SecurityGroupIngress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
SecurityGroupEgress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIp: 0.0.0.0/0
vpc/eigw/template-ipv6.yml
AWSTemplateFormatVersion: "2010-09-09"
Description: Launch a simple EC2 for use with testing VPCs
Parameters:
BucketName:
Type: String
Description: The name of the S3 bucket
Default: eigw-andrewbrown-5325
InstanceType:
Type: String
Default: t3.micro
VpcId:
Type: String
Default: vpc-01a4d5b3fe5bc0916
ImageId:
Type: String
Default: ami-003c24b59bb839e19
SubnetId:
Type: String
Default: subnet-0cb8892104fc64807
Resources:
S3AccessRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
Service: "ec2.amazonaws.com"
Action: "sts:AssumeRole"
Path: "/"
Policies:
- PolicyName: AccessSpecificS3Bucket
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Action:
- "s3:GetObject"
- "s3:PutObject"
- "s3:DeleteObject"
- "s3:ListBucket"
Resource:
- !Sub "arn:aws:s3:::${BucketName}"
- !Sub "arn:aws:s3:::${BucketName}/*"
EC2InstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Roles:
- !Ref S3AccessRole
MyEC2Instance:
Type: AWS::EC2::Instance
Properties:
IamInstanceProfile: !Ref EC2InstanceProfile
InstanceType: !Ref InstanceType
ImageId: !Ref ImageId
#SubnetId: !Ref SubnetId
#SecurityGroupIds:
# - !GetAtt SG.GroupId
NetworkInterfaces:
- DeviceIndex: 0
SubnetId: !Ref SubnetId
GroupSet:
- !GetAtt SG.GroupId
DeleteOnTermination: true
UserData:
Fn::Base64: |
#!/bin/bash
MY_ID=$(date +%s)
curl ifconfig.me > my-ip-$MY_ID.txt
aws s3 cp my-ip-$MY_ID.txt s3://eigw-andrewbrown-5325
SG:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Allow http to client host
VpcId: !Ref VpcId
SecurityGroupIngress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIpv6: ::/0
SecurityGroupEgress:
- IpProtocol: -1
FromPort: -1
ToPort: -1
CidrIpv6: ::/0
Elastic IPs
Overview
Elastic IP (EIP): A static, public IPv4 address in AWS that you can allocate to and remap across instances in a VPC.
Static IPs remain consistent, unlike dynamic IPs that change on instance restarts.
Use Cases:
Prevent disruptions in external connectivity when an EC2 instance restarts.
Maintain the same IP address for failover instances in a high-availability setup.
Key Features
Region-specific.
Cost: billed per hour (as part of the public IPv4 address charge), including while the EIP is allocated but not associated with anything.
Can be associated with:
Instances (e.g., EC2).
Network interfaces (ENI).
Not required for IPv6, as VPC addressing is globally unique.
Elastic IP Commands
Allocate an EIP
aws ec2 allocate-address --domain vpc
Use the --network-border-group option to select specific Availability Zones, Local Zones, or Wavelength Zones.
Associate an EIP
aws ec2 associate-address \
--instance-id i-1234567890abcdef0 \
--allocation-id eipalloc-0a33f63bce1d1ff4
To associate with an ENI, use:
--network-interface-id
Disassociate an EIP
aws ec2 disassociate-address \
--association-id eipassoc-0a33f63bce1d1ff4
Deallocate (Release) an EIP
aws ec2 release-address \
--allocation-id eipalloc-0a33f63bce1d1ff4
Advanced Options
Reassociation
Use --allow-reassociation to:
Automatically reassign the EIP in case of failure or restart.
aws ec2 associate-address \
--instance-id i-1234567890abcdef0 \
--allocation-id eipalloc-0a33f63bce1d1ff4 \
--allow-reassociation
Recover Specific Address
Attempt recovery of a previously used IP:
aws ec2 allocate-address \
--domain vpc \
--address "54.228.5.3"
Custom EIP
Use Bring-Your-Own (BYO) IPv4 pool:
aws ec2 allocate-address \
--domain vpc \
--public-ipv4-pool ipv4pool-ec2-1234567890abcdef0
You must import a public IPv4 pool first.
Exam Tips
Elastic IPs are public IPv4 addresses.
Cost is incurred when EIPs are unassociated.
Use --allow-reassociation for failover scenarios.
VPCs using IPv6 do not require Elastic IPs due to globally unique addressing.
What is an Elastic IP (EIP) address in AWS? (A static IPv4 address)
Which command is used to associate an Elastic IP to an EC2 instance? (aws ec2 associate-address)
AWS IPv6 Support
Overview
IPv6 was developed to address the exhaustion of all IPv4 addresses.
AWS services support IPv6, but configuration and access vary per service.
Configuration Options
A service will be configured for either:
IPv6 Only
Dual Stack (IPv4 and IPv6)
Accessing IPv6
A service endpoint is the way to access IPv6:
Public Endpoint Support
Private Endpoint Support
Key Points
Note: All AWS services support IPv4.
Not all services or their resources have IPv6 enabled by default.
Important Example
IPv6 Address Example: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
Takeaways
Understand the configuration and access options for AWS IPv6 support.
Remember that not all services have IPv6 enabled by default.
What was IPv6 developed to address? (The exhaustion of all IPv4 addresses)
What must be noted about the default status of IPv6 on AWS services? (Not all services or their resources have IPv6 turned on by default.)
Migrating from IPv4 to IPv6
Key Points
IPv4 only VPCs can be migrated to operate in Dual Stack mode (IPv4 and IPv6)
Steps to Migrate an IPv4 Only VPC to Dual Stack
Add new IPv6 CIDR block to VPC
Create or Associate IPv6 Subnets
IPv4 subnets cannot be migrated
Update Route Table for IPv6 to IGW
Upgrade SG rules to include IPv6 address ranges
Migrate EC2 instance type if it does not support IPv6
Important Note
You cannot disable IPv4 support for your VPC and subnets; this is the default IP addressing system for Amazon VPC and Amazon EC2.
Definitions to Know
Dual Stack: Operating with both IPv4 and IPv6 addresses.
CIDR block: Classless Inter-Domain Routing, a method for allocating IP addresses and routing.
Exam Tip
Make sure to remember that IPv4 subnets cannot be migrated directly to IPv6; they must be newly created or associated.
What is the first step in migrating an IPv4 only VPC to Dual Stack mode? (Add new IPv6 CIDR block to VPC)
True or False: IPv4 subnets can be directly migrated to IPv6. (False. IPv4 subnets cannot be migrated and must be newly created or associated.)
AWS Direct Connect
Overview
AWS Direct Connect is the AWS solution for establishing dedicated network connections from on-premises locations to AWS.
Connection Options
Lower bandwidth: 50 Mbps to 500 Mbps
Higher bandwidth: 1 Gbps, 10 Gbps, 100 Gbps
Benefits
Reduces network costs and increases bandwidth throughput (great for high-traffic networks).
Provides a more consistent network experience than a typical internet-based connection (reliable and secure).
Connection Requirements
Co-located with an existing AWS Direct Connect location.
Partner with an AWS Direct Connect partner who is a member of the AWS Partner Network (APN).
Work with an independent service provider to connect to AWS Direct Connect.
Network Requirements
Single-mode fiber with:
1000BASE-LX (1310 nm) transceiver for 1 gigabit Ethernet.
10GBASE-LR (1310 nm) transceiver for 10 gigabit.
100GBASE-LR4 for 100 gigabit Ethernet.
Auto-negotiation for a port must be disabled for a connection with a port speed of more than 1 Gbps.
For 1Gbps speed, auto-negotiation may or may not need to be disabled.
802.1Q VLAN encapsulation must be supported across the entire connection.
Border Gateway Protocol (BGP) and BGP MD5 authentication must be supported.
(Optional) Configure Bidirectional Forwarding Detection (BFD) on your network.
Support and Maintenance
Supports both IPv4 and IPv6.
Ethernet frame size of 1522 or 9023 bytes.
Two types of maintenance:
Planned Maintenance:
Scheduled in advance to improve availability and deliver new features.
Three notifications are provided: 1, 5, and 10 calendar days in advance.
Emergency Maintenance:
Not planned in advance due to unexpected failure.
Impacted customers are notified 60 mins prior to maintenance.
Pricing Factors
Capacity (port size):
Maximum rate that data can be transferred through a network connection.
The larger the port size, the greater the cost (e.g., 50, 100, 200, 300, 400, 500 Mbps, 1, 2, 5, 10 Gbps).
Port hours:
Time a Direct Connect port is provisioned for use, regardless of data transfer.
Connection types:
Dedicated: Physical connections billed per hour via AWS.
Hosted: Logical connections billed by an AWS Direct Connect Delivery Partner.
Data Transfer Out (DTO):
Charged based on outbound traffic sent through Direct Connect to destinations outside of AWS.
Data transfer in is free.
Data transfer within the same region does not incur a cost.
What are the bandwidth options available for AWS Direct Connect? (50 Mbps to 500 Mbps, and 1 Gbps, 10 Gbps, or 100 Gbps)
What is AWS Direct Connect? (AWS Direct Connect is the AWS solution for establishing dedicated network connections from on-premises locations to AWS.)
VPC Endpoints
VPC Endpoints allow you to privately connect your VPC to supported AWS services and other endpoint services
Key Concept
Think of a secret tunnel where you don’t have to leave the AWS network.
Benefits of VPC Endpoints
Eliminates the need for:
Internet Gateway (IGW)
NAT device
VPN connection
AWS Direct Connect
Instances in the VPC do not require a public IPv4 address
Traffic between your VPC and other services does not leave the AWS network.
Horizontally scaled, redundant, and highly available VPC component
Allows secure communication between instances and services without adding availability risks or bandwidth constraints on your traffic.
Types of VPC Endpoints
Interface Endpoints
Gateway Endpoints
Gateway Load Balancer Endpoints
Diagram Explanation
Without VPC Endpoint: Traffic goes through the Internet.
With VPC Endpoint: Traffic remains within AWS.
What do VPC endpoints allow you to do? (Keep all traffic between your VPC and other AWS services inside of the AWS network)
What are the 2 types of VPC Endpoints? (Interface Endpoints and Gateway Endpoints)
True or False, a VPC endpoint eliminates the need for an internet gateway, NAT device, VPN, or DirectConnect Connection to access AWS services within the AWS network (True)
Introduction to PrivateLink
Overview
AWS PrivateLink is a service that allows you to securely connect your VPC to:
Supported AWS Services
AWS services hosted in other AWS Accounts
Supported AWS Marketplace partner services
Key Features
Enables secure connections without the need for an IGW, NAT, VPN, or AWS Direct Connect connection.
Connectivity
You create an Interface Endpoint to connect to services.
You can create your own services by creating a Service Endpoint.
Benefits
PrivateLink Ready partner services allow you to access SaaS products privately, as if they were running in your own VPC.
Terminology
Interface Endpoint: Connects your VPC to AWS services.
Service Endpoint: Allows you to create and manage your own services within the VPC.
What does AWS PrivateLink allow you to securely connect your VPC to? (Supported AWS Services, AWS services hosted in other AWS Accounts, and supported AWS Marketplace partner services)
What is an Interface Endpoint in AWS PrivateLink? (An Interface Endpoint is used to connect your VPC to AWS services using AWS PrivateLink.)
Interface Endpoints
Definition
Interface Endpoints are Elastic Network Interfaces (ENI) with a private IP address.
They serve as an entry point for traffic going to a supported service.
Advantages
Access services hosted on AWS easily and securely by keeping your network traffic within the AWS network.
Supported AWS Services
API Gateway
CloudFormation
CloudWatch
Kinesis
SageMaker
CodeBuild
AWS Config
EC2 API
ELB API
AWS KMS
Secrets Manager
Security Token Service
Service Catalog
SNS
SQS
Systems Manager
Marketplace Partner Services
Endpoint Services in other AWS accounts
Key Features
Elastic Network Interface (ENI): Entry point for traffic destined to the service.
PrivateLink: Facilitates private connectivity between VPCs and supported services.
Cost:
Pricing per VPC endpoint per AZ: $0.01/hour
Pricing per GB data processed: $0.01/GB
Approximate monthly cost: ~$7.5/month
Diagram Explanation
Shows the traffic flow from My VPC to AWS Services through the Interface Endpoint using PrivateLink.
Elastic Network Interface (ENI): Visualized as the entry point in the network.
Important Terms
Elastic Network Interface (ENI): A logical networking component in a VPC that represents a virtual network card.
PrivateLink: A technology enabling private connectivity between VPCs and AWS services without using public IPs.
Key Points to Remember
ENI provides a private IP address for secure traffic routing.
Supports multiple AWS services, ensuring network traffic remains within AWS.
Cost-effective with low pricing per endpoint and data processed.
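A sketch of creating an Interface Endpoint (here for SQS; the service name, IDs, and region are placeholders):
```sh
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.sqs \
  --subnet-ids subnet-xxxxxxxx \
  --security-group-ids sg-xxxxxxxx \
  --private-dns-enabled
```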
Use this guide to familiarize yourself with the concept of Interface Endpoints and their advantages in securing and streamlining AWS service access within your network.
Which AWS service powers interface endpoints? (AWS PrivateLink)
True or False, Interface Endpoints have public IP addresses assigned (False)
VPC Interface Endpoints are powered by this AWS Service (AWS PrivateLink)
Gateway Load Balancer Endpoints
Overview
Gateway Load Balancer (GWLB) Endpoints powered via PrivateLink allow you to distribute traffic to a fleet of network virtual appliances.
Key Components
My VPC
Contains GWLB Endpoints.
Utilizes PrivateLink for communication.
Service Provider VPC
Contains security appliances.
Receives traffic via PrivateLink and GWLB Endpoints.
Functions and Capabilities
Deploy, scale, and manage:
Firewalls
Intrusion Detection and Prevention Systems (IDS/IPS)
Deep Packet Inspection Systems
Obtaining Virtual Appliances
You can obtain virtual appliances as-a-service from:
AWS Partner Network (APN)
AWS Marketplace
Traffic Management
You can send traffic to the GWLB by making simple configuration updates to your VPC's route tables (see the CLI sketch after this section).
Important Notes
GWLB Endpoints facilitate secure and efficient distribution of network traffic.
PrivateLink ensures secure communication between My VPC and Service Provider VPC.
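A hedged consumer-side sketch (the service name and all resource IDs are placeholders): create the GWLB Endpoint, then point a route at it so traffic flows through the appliances behind the GWLB.
# Hypothetical example: create a GWLB Endpoint to a provider's endpoint service.
aws ec2 create-vpc-endpoint \
--vpc-endpoint-type GatewayLoadBalancer \
--vpc-id vpc-0abc1234567890def \
--service-name com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0 \
--subnet-ids subnet-0aaa1111bbbb2222c
# Steer traffic through the endpoint by adding a route that targets it.
aws ec2 create-route \
--route-table-id rtb-0111aaaa2222bbbb3 \
--destination-cidr-block 0.0.0.0/0 \
--vpc-endpoint-id vpce-0555eeee6666ffff7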
What does a Gateway Load Balancer (GWLB) Endpoint powered by PrivateLink allow you to do? (Distribute traffic to a fleet of network virtual appliances)
Name three types of systems that can be deployed, scaled, and managed using a GWLB. (Firewalls, Intrusion Detection and Prevention Systems (IDS/IPS), and Deep Packet Inspection Systems)
Where can you obtain virtual appliances as-a-service for use with a GWLB? (From the AWS Partner Network (APN) and the AWS Marketplace.)
VPC Gateway Endpoints
Overview
A Gateway Endpoint provides reliable connectivity to Amazon S3 and DynamoDB without requiring an internet gateway or a NAT device for your VPC.
Key Points
Gateway endpoints:
Do not use AWS PrivateLink.
Have no additional charge.
Support the following services:
Amazon DynamoDB
Amazon S3
Setup
To create a Gateway Endpoint, you must specify the following (a CLI sketch follows the list):
The VPC in which you want to create the endpoint.
The service to which you want to establish the connection.
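A minimal sketch of creating an S3 Gateway Endpoint (the VPC and route table IDs are placeholders); AWS adds prefix-list routes for S3 to the route table you specify:
aws ec2 create-vpc-endpoint \
--vpc-id vpc-0abc1234567890def \
--vpc-endpoint-type Gateway \
--service-name com.amazonaws.us-east-1.s3 \
--route-table-ids rtb-0111aaaa2222bbbb3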
Which of the following AWS services can be used with VPC Gateway Endpoints? (S3, DynamoDB)
How much do VPC Gateway Endpoints cost? (Free)
VPC Gateway Endpoints support only these 2 AWS Services (S3 and DynamoDB)
VPC Endpoints Comparison
Interface Endpoint
Type: Elastic Network Interface (ENI)
Use Case: Private connection to AWS services, partner services, and other VPCs without public IPs.
Service Integration: AWS PrivateLink
Supported Services: Many AWS Services
Pricing:
Per hour when provisioned
Data processed
Routing Mechanism: DNS interception and routing
Traffic Direction: Bidirectional
Gateway Endpoint
Type: Gateway (a route table target within your VPC)
Use Case: Private connections to S3 and DynamoDB from VPC
Service Integration: S3 and DynamoDB
Supported Services: S3 and DynamoDB
Pricing:
Free
Routing Mechanism: Route table entries for specific destinations
Traffic Direction: Unidirectional
GWLB Endpoint
Type: A variant of the Interface Endpoint
Use Case: Route traffic to third-party virtual appliances like firewalls in another VPC.
Service Integration: AWS PrivateLink and GWLB
Supported Services: Third-party virtual appliances
Pricing:
Endpoint hours
Data processed
Routing Mechanism: Integrates with GWLB
Traffic Direction: Usually Unidirectional
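To check which endpoint types already exist in a region, a JMESPath query in the style used elsewhere in these notes works; the fields shown are standard attributes of the describe-vpc-endpoints output:
aws ec2 describe-vpc-endpoints \
--query "VpcEndpoints[].{Id:VpcEndpointId,Type:VpcEndpointType,Service:ServiceName}" \
--output table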
Which type of AWS VPC Endpoint uses an Elastic Network Interface (ENI) for its implementation? (Interface Endpoint)
What is the primary use case for an Interface Endpoint in AWS? (Private connection to AWS services, partner services, and other VPCs without public IPs.)
VPC Flow Logs
Overview
VPC Flow Logs allow you to capture information about the IP traffic going to and from network interfaces in your VPC.
Command to Create Flow Logs
aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids vpc-xxxxxxx \
--traffic-type ALL \
--log-destination-type cloud-watch-logs \
--log-destination arn:aws:logs:region:account-id:log-group:log-group-name \
--deliver-logs-permission-arn arn:aws:iam::account-id:role/role-name
Scope of Flow Logs
Flow Logs can be scoped to any of the following:
VPC
Subnets
Elastic Network Interfaces (ENIs)
Transit Gateway
Transit Gateway Attachment
Traffic Monitoring
You can monitor traffic for:
ACCEPT: traffic that was accepted
REJECT: traffic that was rejected
ALL: all accepted and rejected traffic
Log Delivery
Logs can be delivered to any of the following destinations (an S3 delivery sketch follows):
Amazon S3 bucket
CloudWatch Logs
Kinesis Data Firehose
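For comparison with the CloudWatch Logs command above, a hedged sketch of delivering only rejected traffic to an S3 bucket (the VPC ID and bucket name are placeholders); no IAM delivery role is required for the S3 destination:
aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids vpc-0abc1234567890def \
--traffic-type REJECT \
--log-destination-type s3 \
--log-destination arn:aws:s3:::my-flow-log-bucket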
Log Entry Format
Format of log entry for VPC Flow Log:
<version> <account-id> <interface-id> <srcaddr> <dstaddr> <srcport> <dstport> <protocol> <packets> <bytes> <start> <end> <action> <log-status>
2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
Log Field Descriptions
version: The VPC Flow Logs version.
account-id: The AWS account ID for the flow log.
interface-id: The ID of the network interface for which the traffic is recorded.
srcaddr: The source IPv4 or IPv6 address.
dstaddr: The destination IPv4 or IPv6 address.
srcport: The source port of the traffic.
dstport: The destination port of the traffic.
protocol: The IANA protocol number of the traffic.
packets: The number of packets transferred during the capture window.
bytes: The number of bytes transferred during the capture window.
start: The time, in Unix seconds, of the start of the capture window.
end: The time, in Unix seconds, of the end of the capture window.
action: The action associated with the traffic:
ACCEPT: The recorded traffic was permitted by the security groups or network ACLs.
REJECT: The recorded traffic was not permitted by the security groups or network ACLs.
log-status: The logging status of the flow log:
OK: Data is logging normally to the chosen destinations.
NODATA: There was no network traffic to or from the network interface during the capture window.
SKIPDATA: Some flow log records were skipped during the capture window. This may be because of an internal capacity constraint or an internal error.
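If the flow log is delivered to CloudWatch Logs, rejected flows can be pulled with a simple filter (my-flow-log-group is a placeholder log group name):
aws logs filter-log-events \
--log-group-name my-flow-log-group \
--filter-pattern REJECT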
What are Flow Logs used for? (Capturing IP traffic information in-and-out of network interfaces within a VPC).
Which of the following attributes can you find within a VPC Flow Log (dstaddr, srcaddr)
Which of the following can you create Flow Logs for? (Network Interface, VPC, Subnets)
Flow Logs can be created for these 3 VPC components (VPC, Subnets, Network Interface)
True or False, VPC Flow Logs allow you to capture IP traffic in-and-out of network interfaces within a VPC (True)
AWS Virtual Private Network
AWS VPN Overview
AWS VPN establishes a secure and private tunnel from your network or device to the AWS global network.
Types of AWS VPNs
AWS Site-to-Site VPN
Securely connect on-premises network or branch office site to VPC.
AWS Client VPN
Securely connect users to AWS or on-premises networks.
What is IPSec?
Internet Protocol Security (IPsec) is a secure network protocol suite.
Authenticates and encrypts packets of data.
Provides secure encrypted communication between two computers over an Internet Protocol network.
Commonly used in virtual private networks (VPNs).
What does AWS VPN allow you to establish? (A secure and private tunnel)
What are the two types of AWS VPNs? (AWS Site-to-Site VPN and AWS Client VPN.)
AWS Site-to-Site VPN
Overview
AWS Site-to-Site VPN allows you to connect your VPC to your on-premises network (a CLI sketch follows the component list).
Components
VPN Connection — secure connection between VPC and on-premises equipment
VPN Tunnel — encrypted connection for your data
Customer Gateway (CGW) — provides information to AWS about your customer gateway device
Customer Gateway Device — A physical device or software application on your side of the Site-to-Site VPN connection
Target Gateway — A generic term for the VPN endpoint on the Amazon side of the Site-to-Site VPN connection
Virtual Private Gateway (VGW) — VPN endpoint on the Amazon side of your Site-to-Site VPN connection that can be attached to a single VPC
Transit Gateway — A transit hub that can be used to interconnect multiple VPCs and on-premises networks, and as a VPN endpoint for the Amazon side of the Site-to-Site VPN connection
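A rough CLI sketch of wiring these components together (all IDs, the public IP, and the ASN are placeholders; tunnel configuration on the customer gateway device and route propagation are not shown):
# 1. Register the on-premises device with AWS as a Customer Gateway.
aws ec2 create-customer-gateway --type ipsec.1 --public-ip 203.0.113.10 --bgp-asn 65000
# 2. Create a Virtual Private Gateway and attach it to the VPC.
aws ec2 create-vpn-gateway --type ipsec.1
aws ec2 attach-vpn-gateway --vpn-gateway-id vgw-0aaaa1111bbbb2222c --vpc-id vpc-0abc1234567890def
# 3. Create the Site-to-Site VPN connection between the CGW and the VGW.
aws ec2 create-vpn-connection --type ipsec.1 \
--customer-gateway-id cgw-0ccc3333dddd4444e \
--vpn-gateway-id vgw-0aaaa1111bbbb2222c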
Features
Internet Key Exchange version 2 (IKEv2)
NAT Traversal
4-byte ASN in the range of 1 – 2147483647 for VGW configuration
2-byte ASN for CGW in the range of 1 – 65535
CloudWatch Metrics
Reusable IP addresses for your customer gateways
Additional Encryption Options
AES 256-bit encryption
SHA-2 hashing
Additional Diffie-Hellman groups
Configurable tunnel options
Custom private ASN for the Amazon side of a BGP session
Support for IPv6 traffic for VPN connections on a transit gateway
Optionally enable acceleration for your Site-to-Site VPN connection via AWS Global Accelerator
Attach your Site-to-Site VPN to AWS Cloud WAN
Attach your Site-to-Site VPN to AWS Transit Gateway
Pricing
Each VPN connection hour
Data transfer out from Amazon EC2 to the internet
Limitations
IPv6 traffic is not supported for VPN connections on a virtual private gateway
An AWS VPN connection does not support Path MTU Discovery
Recommend using non-overlapping CIDR blocks for your networks
Which feature of AWS Site-to-Site VPN provides a secure connection for your data? (VPN Tunnel)
Which of the following is a component of AWS Site-to-Site VPN? (Virtual Private Gateway (VGW), Customer Gateway (CGW), VPN Connection)
What does the Customer Gateway (CGW) do in AWS Site-to-Site VPN? (Provides information to AWS about your customer gateway device, which is a physical device or software application on your side of the Site-to-Site VPN connection.)