AWS S3 Guide

  • Files can be 0 to 5TB
  • Unlimited Storage
  • S3 properties that I don't remember:
    • version id
    • metadata
    • subresources - exist under an object, don't exist on their own
      • access control list - fine-grained permissions, at the individual file or bucket level
      • torrent - S3 supports the BitTorrent protocol
  • Basics (that I don't remember):
    • Tiered storage
    • Lifecycle management

Exam Tips

  • object based
  • files can be 0 to 5TB
  • unlimited storage
  • files are stored in buckets
  • s3 is a universal namespace, names must be unique globally
  • s3-us-east-1.amazonaws.com/bucketname
  • read-after-write consistency for PUTs of new objects
  • eventual consistency for overwrite PUTs and DELETEs
  • S3 durable, immediately available, frequently accessed
  • S3 IA durable, immediately available, infrequently accessed
  • S3 Reduced Redundancy Storage - data that is easily reproducible, such as thumbnails, etc.
  • Glacier - archived data. 3-5 hours to access.
  • core fundamentals:
    • key (name)
    • value (data)
    • version id
    • metadata
    • subresources
      • ACL
      • torrent
  • object based storage only (for files)
  • not suitable for installing an operating system
  • successful uploads will generate an HTTP 200 status code.

Create an S3 Website

  • skipped through

CORS Configuration

  • just a lab, not much to study

Exam Tips:

  • Versioning stores all versions of an object
  • Great backup tool
  • Once enabled, versioning cannot be disabled, only suspended.
  • Integrates with lifecycle rules.
  • MFA Delete capability

S3 Replication

  • requires versioning to be enabled on both buckets.
  • regions must be different - cannot use the same region
  • files already in the bucket are not automatically replicated. all new updates will be though.
  • you cannot replicate to multiple buckets or use daisy chaining (at this time)
  • delete markers are replicated
  • deleting individual versions or delete markers will not be replicated

S3 Lifecycle Management & Glacier

S3 101

What is S3? Simple Storage Service: a place to store files in the cloud. Secure and highly scalable object storage.

Object-based vs. block-based storage.

  • Safe place to store files. Not a place to run an app or a database.
  • Object-based storage: flat files, photos, etc.
  • Data is spread across multiple devices and multiple facilities. Designed to withstand failure.

S3 - The basics

  • Object based
  • Files can be 0 bytes to 5TB in size
  • Unlimited storage
  • Files are stored in buckets. A bucket is basically a folder.
  • S3 is a universal namespace. Bucket names must be unique globally.
  • Creates a DNS address: s3-eu-west-1.amazonaws.com/bucketname (exam question)
  • When you upload, you get an HTTP 200 code if the upload was successful.
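As a rough illustration of the naming points above, the sketch below builds the path-style address from the notes and rejects names that cannot be bucket names. The helper names and the simplified name check (3-63 characters, lowercase letters, digits, dots and hyphens) are assumptions for illustration, not an official validator.

```python
import re

# Simplified bucket-name check (assumption: 3-63 characters, lowercase
# letters, digits, dots and hyphens, starting and ending with a letter or digit).
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if the name passes the simplified check above."""
    return bool(_BUCKET_RE.match(name))


def path_style_url(region: str, bucket: str) -> str:
    """Build the region-specific path-style address quoted in the notes."""
    if not is_plausible_bucket_name(bucket):
        raise ValueError(f"not a plausible bucket name: {bucket!r}")
    return f"https://s3-{region}.amazonaws.com/{bucket}"


print(path_style_url("eu-west-1", "bucketname"))
```

Global uniqueness itself cannot be checked locally; only the service knows whether a name is already taken.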

Data Consistency Model for S3

  • Read-after-write consistency for PUTs of new objects
  • Eventual consistency for overwrite PUTs and DELETEs (changes can take some time to propagate)

S3 - simple key value store

  • S3 is object based
  • key
  • value
  • version id
  • metadata
  • subresources - exist under an object, don't exist on their own
    • access control list - fine-grained permissions, at the individual file or bucket level
    • torrent - S3 supports the BitTorrent protocol
  • S3 is designed to sort objects in alphabetical order by key.
  • Log file example: if many log files have names with the same beginning, they are physically stored in the same area of S3, which can hurt performance. Adding a random ID to the front of the file name can spread them out.
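One way to picture the "random ID at the front" trick from the last bullet is a short hash prefix, which is stable for a given name but scatters similar names across the sort order. This is only a sketch: the function name, the 4-character prefix length and the choice of SHA-256 are my assumptions, not something the course prescribes.

```python
import hashlib


def add_hash_prefix(key: str, length: int = 4) -> str:
    """Prepend a short hex digest so similarly named keys sort apart."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{key}"


# Hypothetical log file names that all share the same beginning.
keys = [f"logs/2017-01-0{day}.log" for day in range(1, 4)]
for original in keys:
    print(add_hash_prefix(original))
```

Because the prefix is derived from the name, the original key can always be mapped to its stored key again without keeping a lookup table.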

The basics

  • Built for 99.99% availability; the S3 SLA guarantees 99.9%
  • Designed for 99.999999999% durability (11 9's)
  • Tiered storage
  • Lifecycle management. Example: after 30 days, move to another storage tier
  • Versioning
  • Encryption - important to remember all the different ways to encrypt
  • Secure Access with either 1. Access Control Lists or 2. Bucket Policies

Storage Tiers/Classes

  • S3 - 99.99% availability, 99.999999999% durability. Designed to sustain the loss of 2 facilities.
  • S3 - IA (Infrequently Accessed). Lower fee but charged a retrieval fee.
  • RRS - Reduced Redundancy Storage - 99.99% availability and 99.99% durability.
  • Glacier - 3-5 hours to restore from Glacier.
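To get a feel for what the durability figures mean, here is a small worked calculation. It assumes the percentage is an annual per-object figure and uses a made-up count of 10 million objects; exact fractions are used so the result is not blurred by floating point rounding.

```python
from fractions import Fraction


def expected_annual_losses(durability_percent: str, object_count: int) -> Fraction:
    """Expected number of objects lost per year at the given durability."""
    loss_probability = 1 - Fraction(durability_percent) / 100
    return loss_probability * object_count


OBJECTS = 10_000_000  # hypothetical number of stored objects

standard = expected_annual_losses("99.999999999", OBJECTS)
rrs = expected_annual_losses("99.99", OBJECTS)

print(f"Standard: one object lost every {1 / standard} years on average")
print(f"RRS: about {rrs} objects lost per year on average")
```

Under these assumptions the gap between 11 nines and 4 nines is the difference between one lost object in ten thousand years and a thousand lost objects every year.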

Great summary table:

  • standard
    • durability: 99.999999999%
    • availability: 99.99%
    • availability SLA: 99.9%
    • minimum object size: N/A
    • retrieval fee: N/A
    • facility fault tolerance: 2
    • ssl support: yes
    • first byte latency: milliseconds
    • lifecycle management policies: yes
  • standard - IA
    • durability: 99.999999999%
    • availability: 99.9%
    • availability SLA: 99%
    • minimum object size: 128KB
    • minimum storage duration: 30 days
    • retrieval fee: per GB retrieved
    • facility fault tolerance: 2
    • ssl support: yes
    • first byte latency: milliseconds
    • lifecycle management policies: yes
  • RRS
    • durability: 99.99%
    • availability: 99.99%
    • facility fault tolerance: 1
    • ssl support: yes
    • first byte latency: milliseconds
    • lifecycle management policies: yes
  • Glacier
    • durability: 99.999999999%
    • availability: N/A
    • availability SLA: N/A
    • minimum object size: N/A
    • minimum storage duration: 90 days
    • retrieval fee: per GB retrieved
    • ssl support: N/A
    • first byte latency: minutes or hours
    • lifecycle management policies: yes

Glacier

  • $0.01/GB/month
  • 3-5 hour access time

S3 Charges

  • storage
  • requests
  • storage management pricing - can add tags to data.
  • data transfer pricing: transfer into S3 is free, but transfer out costs money.
  • transfer acceleration - the user uploads to an edge location, which has a better network path to the S3 bucket.

Exam Tips

  • object based
  • files can be 0 to 5TB
  • unlimited storage
  • files are stored in buckets
  • s3 is a universal namespace, names must be unique globally
  • s3-us-east-1.amazonaws.com/bucketname
  • read-after-write consistency for PUTs of new objects
  • eventual consistency for overwrite PUTs and DELETEs
  • S3 durable, immediately available, frequently accessed
  • S3 IA durable, immediately available, infrequently accessed
  • S3 Reduced Redundancy Storage - data that is easily reproducible, such as thumbnails, etc.
  • Glacier - archived data. 3-5 hours to access.
  • core fundamentals:
    • key (name)
    • value (data)
    • version id
    • metadata
    • subresources
      • ACL
      • torrent
  • object based storage only (for files)
  • not suitable for installing an operating system
  • successful uploads will generate an HTTP 200 status code.
  • READ S3 FAQ

Lab: Create an S3 Bucket

  • bucket name must be only lower case characters
  • by default: new buckets have private permissions
  • pretty basic lab
  • Bucket features to know: Versioning, Static Website Hosting, Logging, Tags, Cross-region replication, Events
  • Console tabs: Objects, Properties, Lifecycle, Permissions, Management (Analytics, Metrics, Inventory)

Lab: Versioning

  • Once versioning is enabled, it cannot be removed. It can only be suspended.
  • Storage size is the sum of all the versions of the objects.
  • If you have a lot of large file versions, you want to add some kind of lifecycle policy.
  • Deleting a specific version of an object deletes that version permanently. It cannot be recovered.
  • Deleting the whole object adds a "delete marker" that hides the object. In the old console, deleting the "delete marker" restores the object.
  • With versioning you can add the "Versioning MFA Delete" capability. Can come up in the exam.
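The delete-marker behaviour in these notes is easier to remember with a toy model. The class below is an in-memory sketch I made up for illustration. It is not the S3 API, and the version IDs are simple counters rather than real S3 version IDs.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioned bucket (not the real S3 API)."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}              # key -> list of (version_id, data)
        self._ids = itertools.count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # Plain delete: hide the object behind a delete marker.
            return self.put(key, self.DELETE_MARKER)
        # Versioned delete: permanently remove that one version (or marker).
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is self.DELETE_MARKER:
            raise KeyError(key)
        return versions[-1][1]

    def stored_bytes(self, key):
        """Storage used is the sum of every version still kept."""
        return sum(len(data) for _, data in self._versions.get(key, [])
                   if data is not self.DELETE_MARKER)
```

In this model, calling `delete(key)` hides the object, and deleting the returned marker with `delete(key, marker_id)` brings the latest version back, which mirrors the old-console behaviour described above.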

Lab: Cross Region Replication

  • Only copies objects that are new since replication was enabled.
  • Replicates all versions of an object, including past ones. Permissions are also copied over.
  • Replication only goes one hop. With A->B->C, an object uploaded to A is replicated to B but not on to C.
  • Replicating to multiple buckets is also not supported (A->B and A->C).
  • Deleting a "delete marker" does not replicate.
  • Deleting versions does not replicate either.
  • Only deleting an object (which creates a delete marker) is replicated.

  • versioning must be enabled on both the source and destination buckets
  • regions must be different
  • existing files will not automatically be replicated
  • cannot replicate to multiple buckets or daisy chain
  • creating a "delete marker" is replicated, but deleting a "delete marker" is not.
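The setup rules above (versioning on both buckets, different regions) can be written as a small pre-flight check. Everything here is a hypothetical helper for study purposes: `BucketInfo` and `replication_problems` are names I invented, and no AWS call is made.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BucketInfo:
    """Hypothetical summary of a bucket, for illustration only."""
    name: str
    region: str
    versioning_enabled: bool


def replication_problems(source: BucketInfo, destination: BucketInfo) -> List[str]:
    """Return the reasons (if any) why cross-region replication cannot be set up."""
    problems = []
    for bucket in (source, destination):
        if not bucket.versioning_enabled:
            problems.append(f"versioning is not enabled on {bucket.name}")
    if source.region == destination.region:
        problems.append("source and destination must be in different regions")
    return problems


src = BucketInfo("notes-src", "us-east-1", versioning_enabled=True)
dst = BucketInfo("notes-dst", "us-east-1", versioning_enabled=False)
for problem in replication_problems(src, dst):
    print(problem)
```

An empty list means both preconditions from the notes are satisfied.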

Lab: S3 Lifecycle Management & Glacier

  • glacier is designed for at least 90 day storage
  • lifecycle is usually set up:

Exam Tips

  • Lifecycle can be used in conjunction with versioning
  • Can be applied to current and previous versions
  • Transition to S3 IA: the object must be at least 128KB and at least 30 days past its creation date
  • Archive to Glacier: 30 days after the transition to IA
  • Can also permanently delete objects
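The timings above could be expressed as a lifecycle rule roughly like the following. The dictionary follows the shape of the `LifecycleConfiguration` argument accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, the empty prefix and the 365-day expiry are made-up values for illustration.

```python
import json

# Sketch of a lifecycle rule: IA after 30 days, Glacier 30 days later,
# then permanent deletion. ID and expiry are hypothetical values.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Filter": {"Prefix": ""},        # apply to every object
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 60, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
            # Previous versions can have their own schedule.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"},
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3 installed and credentials configured, this dictionary would be passed as `LifecycleConfiguration=` alongside a `Bucket=` name. I have not run it against a real bucket here.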

CloudFront CDN Overview

  • What is a CDN? A network of distributed servers that delivers content based on the location of the user.

Key terminology

  • Edge Location - the location where content is cached
  • Origin - the origin of all files, where the original files live. In terms of AWS: S3, EC2 instance, ELB, Route53.
  • Distribution - the name given to the CDN, which consists of a collection of edge locations.
    • Web Distribution - websites
    • RTMP - media streaming, Adobe Flash
  • TTL - time to live
  • CloudFront can deliver: dynamic, static, streaming, and interactive content.

Exam Tips

  • What is an Edge location? A: It's the location where content is cached.
  • What can be a CloudFront origin? (4 things) A: S3, EC2 instance, ELB, Route53 (website)
  • What is the name given to a CloudFront CDN when you create it? A: Distribution
  • Edge locations are not just read only. You can write to them too.
  • Objects are cached for the life of the TTL
  • You can expire cached objects manually, but you get charged for it.
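The TTL and manual-expiry points can be pictured with a tiny cache model. This is a toy sketch of a single edge location, with class and function names of my own invention. Real CloudFront behaviour (cache keys, headers, invalidation pricing) is far richer.

```python
class EdgeCache:
    """Toy model of one edge location caching objects for a fixed TTL."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._cache = {}                     # path -> (expires_at, content)

    def get(self, path, now):
        entry = self._cache.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]                  # cache hit, origin not contacted
        content = self._fetch(path)          # miss or expired: go to the origin
        self._cache[path] = (now + self._ttl, content)
        return content

    def invalidate(self, path):
        """Manually expire an object before its TTL runs out."""
        self._cache.pop(path, None)


origin_hits = []


def origin(path):
    origin_hits.append(path)
    return f"content of {path}"


edge = EdgeCache(origin, ttl_seconds=60)
edge.get("/index.html", now=0)     # miss, fetched from the origin
edge.get("/index.html", now=30)    # hit, served from the edge
edge.get("/index.html", now=61)    # TTL expired, fetched again
print(len(origin_hits), "origin fetches")
```

Passing `now` explicitly keeps the example deterministic. A real cache would read the clock itself.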

Lab: Create CloudFront CDN

  • can have multiple origins per distribution
  • Restrict CloudFront access with Signed URLs or Signed Cookies. Comes up in the exam all the time.
  • How are you going to restrict CloudFront? Use signed URLs.
  • CloudFront restrictions: geo restrictions. You can have either a whitelist or a blacklist, not both.

S3 Security and Encryption

  • Can secure via
    • Bucket Policies
    • Access Control Lists
  • S3 can be configured to create access logs.
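As a concrete picture of what a bucket policy looks like, here is a sketch that generates one commonly used statement: deny any request that does not arrive over SSL/TLS. The function name and the bucket name are placeholders. The policy keys (`Version`, `Statement`, `aws:SecureTransport`) follow the standard IAM policy format.

```python
import json


def deny_insecure_transport_policy(bucket: str) -> str:
    """Return a bucket policy (as JSON) that denies requests not made over TLS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",      # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",    # every object in it
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(deny_insecure_transport_policy("my-example-bucket"))
```

A bucket policy applies to the whole bucket, whereas ACLs can go down to individual objects, which is the distinction the notes draw.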

Encryption: 4 different types of encryption. Must know!

  • In transit
    • SSL/TLS
  • At Rest (four methods)
    • Server Side Encryption
      • S3 Managed Keys - SSE-S3. AES-256. Amazon handles it all. Each object's key is itself encrypted with a master key, and that master key gets rotated.
      • AWS Key Management Service, Managed Keys - SSE-KMS. Additional benefits: 1. Separate permissions for the use of an envelope key. 2. Provides an audit trail of the use of the keys (transparency).
      • Server Side Encryption with Customer Provided Keys - SSE-C. You manage the key and AWS manages encryption and decryption when data is written to disk.
    • Client side encryption - you encrypt before uploading to S3.
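One way to keep the three server-side options apart is by the request headers each one uses on a PUT. The helper below is a study sketch: the function name and its arguments are my own, while the header names are the `x-amz-server-side-encryption*` headers from the S3 REST API. No request is actually sent.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str, kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        # The key travels with the request, plus an MD5 digest as an integrity check.
        key_md5 = hashlib.md5(customer_key).digest()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(key_md5).decode("ascii"),
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-S3"))
```

Client-side encryption has no entry here because S3 never sees the key: the data is already ciphertext when it is uploaded.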

Storage Gateway: popular exam topic

  • Connects an on-premises software appliance with cloud-based storage, so it looks like the on-prem storage is in the cloud.
  • storage gateway = virtual appliance installed into the hypervisor
  • It's downloadable as a virtual machine image that you install on a host running VMware ESXi or Microsoft Hyper-V.
  • 4 different types of Storage Gateways:
    • File Gateway (NFS) - stores flat files on S3. Accessed through a mounted volume.
    • Volume Gateway (iSCSI) - virtual hard disk, this is block based storage. Can install an OS on this.
      • Stored volume: store the entire copy on site
      • Cached volume: only store recently accessed data on site
    • Tape Gateway (VTL) - backup/archive solution. Sends the virtual tapes to S3, and they can then be archived to Glacier.
  • Old names:
    • Volume Gateway, stored volumes -> Gateway Stored Volumes
    • Volume Gateway, cached volumes -> Gateway Cached Volumes
    • Tape Gateway -> Gateway Virtual Tape Library

Volume Gateway

  • iSCSI - block based storage
  • virtual hard disk, supports point-in-time snapshots that are stored as EBS snapshots. Snapshots are incremental.
  • 2 types:
    • stored volumes - entire copy of the dataset on site. Remember that stored volumes can be 1GB to 16TB. Data is written on-prem; the virtual appliance/storage gateway then buffers it and sends it to S3 as EBS snapshots. You keep a complete copy on site.
    • cached volumes - only keep the most recently accessed data on-prem. Volumes can be up to 32TB. Attach them as iSCSI devices again.
  • Tape Gateway is supported by NetBackup, Backup Exec, Veeam.

Exam Tips

  • File Gateway - nothing stored on-prem. Only stored on S3. Flat file system
  • Volume Gateway:
    • Stored volume: entire dataset is stored on site and backed up to S3.
    • Cached volume: entire dataset is stored on S3 and the most recently accessed data is cached on site.
  • Gateway Virtual Tape Library
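The cached-volume idea (everything in S3, only recently used data on site) behaves much like a least-recently-used cache. The class below is a deliberately simplified model with invented names. A real gateway works on iSCSI blocks and uploads asynchronously, which this sketch ignores.

```python
from collections import OrderedDict


class CachedVolume:
    """Toy model: full dataset in 'S3', recently used blocks kept on site."""

    def __init__(self, local_capacity):
        self.s3 = {}                        # stands in for the complete dataset
        self.local = OrderedDict()          # on-site cache, least recent first
        self.capacity = local_capacity

    def _remember(self, block_id, data):
        self.local[block_id] = data
        self.local.move_to_end(block_id)
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict the least recently used block

    def write(self, block_id, data):
        self.s3[block_id] = data            # simplification: synchronous upload
        self._remember(block_id, data)

    def read(self, block_id):
        if block_id in self.local:
            data = self.local[block_id]     # served from the on-site cache
        else:
            data = self.s3[block_id]        # cache miss: fetch from S3
        self._remember(block_id, data)
        return data


volume = CachedVolume(local_capacity=2)
for block in range(3):
    volume.write(block, f"data-{block}")
print(list(volume.local), len(volume.s3))
```

A stored volume would be the opposite arrangement: the complete dataset lives on site and S3 only holds the backups.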

Snowball

  • Previously Import/Export Disk
  • 3 types of Snowballs:
    • Snowball
    • Snowball Edge
    • Snowmobile

Snowball: designed to streamline bringing large amounts of data into AWS while bypassing the internet. You physically send the appliance back to AWS. About 1/5 the cost of using the internet. 80TB Snowball available. Secure: tamper-resistant enclosure, 256-bit AES encryption.

Snowball Edge: compute capabilities and 100TB of storage. A little AWS data center. Can even run Lambda functions on it.

Snowmobile: petabytes or exabytes of data. 6 months to bring in an exabyte: just get 10 Snowmobiles, at 100 petabytes per truck.
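The numbers above are easy to sanity-check with a little arithmetic. The sketch assumes decimal units (1 TB = 10^12 bytes) and a hypothetical 100 Mbps internet link running flat out. Both are my assumptions, chosen only to show why shipping a device can beat the network.

```python
TB = 10**12   # decimal terabyte, in bytes
PB = 10**15
EB = 10**18


def devices_needed(data_bytes: int, device_capacity_bytes: int) -> int:
    """Number of devices required, rounding up (integer ceiling division)."""
    return -(-data_bytes // device_capacity_bytes)


def internet_transfer_days(data_bytes: int, link_bits_per_second: float) -> float:
    """Days needed to push the data over a link at full, uninterrupted speed."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / 86_400


snowmobiles = devices_needed(1 * EB, 100 * PB)
days_for_one_snowball = internet_transfer_days(80 * TB, 100e6)

print(f"Snowmobiles for one exabyte: {snowmobiles}")
print(f"Days to upload 80TB at 100 Mbps: {days_for_one_snowball:.1f}")
```

At 100 petabytes per truck, an exabyte does work out to 10 Snowmobiles, which matches the note.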

Exam Tips

  • What a Snowball is
  • What Import/Export is
  • Snowball can
    • Import to S3
    • Export from S3
  • If the data is in Glacier, you need to restore it to S3 first

Snowball Lab

  • Just hilarious

S3 Transfer Acceleration

  • Uses the CloudFront edge network to speed up uploads to S3. Traffic then travels over the AWS backbone network.
  • Distinct URLs look like: acloudguru.s3-accelerate.amazonaws.com
  • AWS has optimized protocols on their own network.

Lab: Create a Static Website Using S3

  • need to know the format of the s3 url:
  • mywebsite.s3-website.us-east-1.amazonaws.com

S3 Summary

  • Object based storage: files, images. Not a place for an OS or database. Need block based storage for that.
  • Files can be 0 to 5TB.
  • Unlimited storage
  • Files are stored in buckets
  • Universal namespace.
  • https://s3-us-east-1.amazonaws.com/mybucket-name
  • Consistency model
    • read-after-write consistency for PUTs of new objects
    • eventual consistency for updates and deletes
  • S3 storage tiers:
    • S3 (durable), S3 Infrequently Accessed, S3 Reduced Redundancy Storage (RRS). RRS doesn't have 11 9's of durability, only 4 nines.
    • Glacier - cheapest of all storage tiers. 3-5 hour wait.
  • Fundamentals of S3
    • key, value, version id, metadata, ACL and torrent
    • object based storage
  • Versioning
    • cannot be disabled, can only be suspended
    • integrates with lifecycle
    • versioning MFA delete capability
    • cross region replication requires versioning to be enabled on the source and destination buckets
  • Lifecycle management
    • transition to S3 IA: 128KB and 30 days after it's been created
    • archive to Glacier: 30 days after IA
    • permanently delete
  • CloudFront
    • Edge location
    • Origin vs Distribution
    • Web and RTMP distribution
    • Edge locations are not just READ only
    • Objects are cached for the TTL
    • Can clear cached objects, but will be charged
  • Securing Buckets
    • by default all private
    • Security
      • Bucket policies
      • ACL
    • Can be configured to log
  • Encryption
    • In transit: SSL/TLS
    • At rest, server side:
      • SSE-S3 - S3 master key.
      • SSE-KMS - allows for a separate KMS key. Allows for an audit trail.
      • SSE-C - customer provided encryption key
    • At rest, client side:
      • User encrypts before uploading to S3
  • Gateways:
    • File Gateway
    • Volume Gateway
      • Stored Volumes
      • Cached Volumes
    • Gateway Virtual Tape Library
  • Snowball:
    • 3 types: Snowball, Snowball Edge, Snowmobile
    • What Import/Export is
    • Snowball can import/export to S3
  • S3 Transfer Acceleration
    • uploads go through CloudFront edge locations
  • S3 Static Websites
    • serverless
  • Last tips
    • S3 - HTTP 200 on a successful write
    • Can load files via multipart upload
    • Read through the S3 FAQ before taking the exam
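For the multipart tip, it can help to see how a large file gets cut into parts. The sketch below only plans the byte ranges. It assumes the documented S3 limits of at least 5 MiB per part (except the last) and at most 10,000 parts. The function name and the 8 MiB default part size are my own choices.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB      # every part except the last must be at least this big
MAX_PARTS = 10_000


def plan_parts(object_size: int, part_size: int = 8 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part of a multipart upload."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    part_count = max(1, -(-object_size // part_size))   # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError("too many parts; use a larger part size")
    parts = []
    for number in range(1, part_count + 1):
        offset = (number - 1) * part_size
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
    return parts


for number, offset, length in plan_parts(20 * MIB):
    print(f"part {number}: offset={offset} length={length}")
```

Each planned range would be uploaded as its own part and the parts combined at the end. That upload step is left out here because it needs real credentials.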