 
  • What is an S3 bucket?
  • What are the different storage classes?
  • Features of Amazon S3
  • How does Amazon S3 work?
  • Alternatives to Amazon S3
  • Monitoring Amazon S3 bucket activity
 

Amazon S3 (Simple Storage Service) is a scalable, secure cloud storage service provided by Amazon Web Services (AWS) that lets you store data and retrieve it under the access permissions you define. Known for its high durability (99.999999999%) and high availability, S3 is suited to a wide range of use cases, including backups, web hosting, data lakes, and big data analytics.

What is an S3 bucket?

In Amazon S3, data is organized into containers called buckets, and within each bucket, files are stored as objects. Each object is uniquely identified by a key (or key name) within its bucket. This structure allows you to store vast amounts of data across multiple buckets while also enabling fine-grained access control. You can define permissions to control who can create, read, update, or delete objects within your buckets. Additionally, S3 provides features like access logging to monitor requests and the ability to specify the geographical region where your data is stored. A bucket itself has no size limit; the 5 TB limit applies to an individual object. Objects larger than 5 GB cannot be sent in a single upload request and must be divided into parts and uploaded with multipart upload.
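
The chunking mentioned above corresponds to S3 multipart upload. As a rough sketch, the part layout for a large file could be planned as shown below. The function name `plan_parts` and the 100 MiB default part size are illustrative assumptions; the limits used (at most 10,000 parts, each between 5 MiB and 5 GiB except the last) are the multipart upload limits S3 documents.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART_SIZE = 5 * MIB   # every part except the last must be at least this large
MAX_PART_SIZE = 5 * GIB   # largest allowed single part
MAX_PARTS = 10_000        # maximum number of parts in one multipart upload


def plan_parts(object_size: int, part_size: int = 100 * MIB) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that cover object_size bytes.

    Hypothetical helper for illustration; it only plans the byte ranges
    and does not talk to S3.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")

    # Grow the part size if the requested size would need more than 10,000 parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for one multipart upload")

    return [
        (offset, min(part_size, object_size - offset))
        for offset in range(0, object_size, part_size)
    ]
```

For example, `plan_parts(250 * MIB)` yields three parts of 100 MiB, 100 MiB, and 50 MiB.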

What are the different storage classes?

Depending on the type of data stored and how often you need to access it, Amazon provides different storage classes:

[General purpose] S3 Standard
  • S3 Standard is general purpose storage for frequently accessed data.
  • Use cases: Cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics.
S3 Intelligent-Tiering
  • S3 Intelligent-Tiering is used for data with unknown or unpredictable access patterns. It monitors access patterns and automatically moves objects across three tiers: Frequent Access, Infrequent Access (objects not accessed for 30 days), and Archive Instant Access (objects not accessed for 90 days). This helps users save money by paying only for the tier their data actually needs. If an object in the Archive Instant Access tier is accessed, it is immediately moved back to the Frequent Access tier.
  • Use case: For virtually any workload, especially data lakes, data analytics, new applications, and user-generated content.
Express One Zone
  • Express One Zone is for latency-sensitive applications and stores data in a single Availability Zone (AZ) within an AWS Region. In this storage class, data is stored in a different bucket type, an S3 directory bucket, which is built to handle hundreds of thousands of requests per second, making it suitable for high-performance workloads.
  • Use cases: Machine learning, querying large datasets, big data processing, and data preparation.
[Infrequent access] Standard-Infrequent Access (S3 Standard-IA)
  • Standard-Infrequent Access (S3 Standard-IA) is for less frequently accessed data that still needs rapid access when requested. You can set up S3 Lifecycle policies to move objects between storage classes automatically over time. For example, 30 days after creation, your objects could be moved from S3 Standard to S3 Standard-IA to save costs without changing the way your applications work.
  • Use cases: For backup data and disaster recovery.
[Infrequent access] One Zone-Infrequent Access (S3 One Zone-IA)
  • One Zone-Infrequent Access (S3 One Zone-IA) is for less frequently accessed data that still needs rapid access when requested. It is similar to S3 Standard-IA, but it stores your data in one AZ instead of at least three, which makes it 20% cheaper than S3 Standard-IA.
  • Use cases: Secondary backup and replicated data.
[Archive] Glacier Instant Retrieval
  • Glacier Instant Retrieval is archive storage for data that is rarely accessed but must be retrievable quickly. It provides retrieval within milliseconds for archived data, which is the fastest among the archive options (the others being S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive).
  • Use cases: Compliance archives, digital preservation, and long-term backups.
[Archive] Glacier Flexible Retrieval (Formerly S3 Glacier)
  • Glacier Flexible Retrieval (formerly S3 Glacier) is archive storage for data that is rarely accessed and can be retrieved within minutes to hours, depending on the retrieval option. It suits data that is accessed only once or twice a year.
  • Use cases: Backup storage and offsite data storage.
[Archive] Glacier Deep Archive
  • Glacier Deep Archive is the lowest-cost storage class provided by Amazon S3. It is used for storing data for an extended period of time (7–10 years). While S3 Glacier Flexible Retrieval can retrieve some data in minutes, S3 Glacier Deep Archive takes longer, up to 12 hours with standard retrieval, to restore your data.
  • Use cases: It’s designed for industries like finance, healthcare, and the public sector, where companies are required to keep data for a long time (e.g., legal records, medical files, financial documents).
Outposts
  • S3 on Outposts is used for on-premises data. It uses a single storage class called OUTPOSTS, which works like other S3 storage classes but is specifically designed to handle data in the local environment.
  • Use cases: For industries where regulations require certain data to remain physically on-site, as well as for businesses operating a hybrid model where some workloads run on-premises and some in the cloud.
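
The tiering rule described for S3 Intelligent-Tiering above can be illustrated with a small function. This is only a sketch of the rule as stated in this article (30 and 90 days without access); the function name is hypothetical, and the exact boundary handling is an assumption rather than a description of how S3 implements it.

```python
def intelligent_tiering_tier(days_since_last_access: int) -> str:
    """Return the Intelligent-Tiering access tier for an object.

    Assumption: an object moves tier once it has gone 30 (or 90) full days
    without being accessed. Any access resets the counter to 0, which places
    the object back in the Frequent Access tier.
    """
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access cannot be negative")
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"
```

An object last read 45 days ago would map to the Infrequent Access tier; reading it again resets the count and returns it to Frequent Access.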

Features of Amazon S3

Here are a few features of Amazon S3 that make it one of a kind:

Scalability, availability, and durability
  • Storage: Amazon S3 offers a virtually unlimited storage capacity, allowing users to store as much data as they need. Based on the type of storage class, the pricing will vary.
  • Availability: The S3 Standard storage class is designed for 99.99% availability over a given year, making your data accessible whenever you need it.
  • Durability: S3 is built to provide 99.999999999% (11 nines) durability. This means that your data is incredibly safe, and the chances of losing it are extremely low.
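
To put the availability and durability figures above in perspective, here is a short worked calculation. It assumes the 11-nines figure is read as an average annual per-object durability and that 99.99% is a design target rather than a guarantee; the function names are illustrative.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # 11 nines, read as per object per year
AVAILABILITY = Decimal("0.9999")       # 99.99% design target over a year


def expected_annual_object_loss(object_count: int) -> Decimal:
    """Average number of objects expected to be lost per year."""
    return (Decimal(1) - DURABILITY) * object_count


def annual_downtime_minutes() -> Decimal:
    """Minutes per (365-day) year that fall outside the availability target."""
    minutes_per_year = Decimal(365 * 24 * 60)
    return (Decimal(1) - AVAILABILITY) * minutes_per_year


if __name__ == "__main__":
    loss = expected_annual_object_loss(10_000_000)
    print(f"Expected objects lost per year: {loss}")
    print(f"Years per single lost object:   {1 / loss}")
    print(f"Downtime budget (min/year):     {annual_downtime_minutes()}")
```

Under these assumptions, storing 10 million objects corresponds to losing one object roughly every 10,000 years on average, and 99.99% availability leaves a budget of about 52.56 minutes of unavailability per year.
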
Bulk data management

With S3 Batch Operations, you can perform large-scale actions on bulk data with a single request, such as running AWS Lambda functions on many objects, copying objects, or modifying access controls in bulk.

Data security

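S3 can be reached through Amazon VPC endpoints, so traffic between your VPC and S3 does not have to cross the public internet. S3 also applies Block Public Access settings to new buckets by default, preventing them from being exposed to the public. Users can additionally restrict access points so that they accept requests only from specific VPCs. Data can be encrypted in two ways: client-side encryption, where you encrypt the data before uploading it, and server-side encryption, where S3 encrypts the data as it is written to storage.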
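
As an illustration of restricting access to a VPC, the sketch below builds a bucket policy that denies requests not arriving through a given VPC endpoint, using the `aws:SourceVpce` condition key. The function name, bucket name, and endpoint ID are placeholders invented for this example, and a real policy should be reviewed carefully before use.

```python
import json


def vpc_only_bucket_policy(bucket: str, vpc_endpoint_id: str) -> str:
    """Return a bucket policy (JSON text) that blocks traffic from outside
    one VPC endpoint. Hypothetical helper for illustration only."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Caution: an explicit Deny like this also blocks requests made
                # from the AWS console or any host outside the endpoint.
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values, not real resources.
    print(vpc_only_bucket_policy("example-bucket", "vpce-0123456789abcdef0"))
```

The resulting JSON is the kind of document that can be attached to a bucket as its bucket policy.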

Data lifecycle management

By automating the movement and expiration of objects using predefined rules, you can manage the data that is kept in the S3 bucket. After a certain period of time, you can have the data moved automatically to either Glacier or Standard-IA.
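
A lifecycle rule of the kind described above could be expressed as the following configuration. This is a minimal sketch: the function name, rule ID, prefix, and day counts are assumptions chosen for illustration. The dictionary follows the shape accepted by the `LifecycleConfiguration` parameter of boto3's `put_bucket_lifecycle_configuration`.

```python
def lifecycle_configuration(
    prefix: str,
    ia_after_days: int = 30,
    glacier_after_days: int = 90,
    expire_after_days: int = 365,
) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to Standard-IA, then to Glacier Flexible Retrieval, then expires them.
    Hypothetical helper; the defaults are example values only."""
    # S3 requires objects to be at least 30 days old before moving to Standard-IA.
    if ia_after_days < 30:
        raise ValueError("transition to STANDARD_IA needs at least 30 days")
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day counts must increase from IA to Glacier to expiry")

    return {
        "Rules": [
            {
                "ID": "archive-then-expire",       # illustrative rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }
```

With the defaults, objects under the chosen prefix move to Standard-IA after 30 days, to Glacier Flexible Retrieval after 90 days, and are deleted after a year.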

Data protection
  • Versioning: Allows you to keep multiple versions of an object. In case of any mishaps, such as accidental deletions or application errors, this feature helps you retrieve the data.
  • MFA Delete: To guard against accidental or malicious deletions, you can require multi-factor authentication before an object version is permanently deleted or the bucket's versioning state is changed.
  • Replication: Cross-Region Replication (copies objects to buckets in different Regions) and Same-Region Replication (copies objects to buckets in the same Region) help you ensure compliance and enhance disaster recovery strategies.
  • Object Lock: Enforces WORM (Write Once, Read Many) policies to protect data from deletion during retention periods.

How does Amazon S3 work?

After signing in to AWS, a user creates a bucket and chooses the Region in which it is deployed. Once the bucket is created, files can be uploaded through the AWS Management Console, the AWS Command Line Interface, or the Software Development Kits; large files can be sent in parts using multipart upload. The user also chooses the appropriate S3 storage class. Once the upload is complete, users can control and manage permissions on these buckets. They can define bucket policies to control access to the objects, such as allowing public access or restricting it to specific users. They can also enable lifecycle policies that automate transitions between storage classes or delete old data automatically. Other features that can be enabled include versioning, presigned URLs (which provide temporary access to files without making them publicly available), AWS Key Management Service (for managing encryption keys), and event notifications, for added security and visibility.
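
The workflow above can be sketched with the AWS SDK for Python (boto3). The function below is a minimal illustration, not a recommended production setup: its name and parameters are assumptions, and it expects the caller to pass in an S3 client (for example, one created with `boto3.client("s3", region_name=region)`), which requires boto3 and valid AWS credentials.

```python
def upload_and_share(
    s3_client,
    bucket: str,
    region: str,
    key: str,
    body: bytes,
    storage_class: str = "STANDARD",
    expires_in: int = 3600,
) -> str:
    """Create a bucket, enable versioning, upload one object, and return
    a presigned URL for it. Hypothetical helper for illustration."""
    # us-east-1 is the default Region and must not be sent as a LocationConstraint.
    if region == "us-east-1":
        s3_client.create_bucket(Bucket=bucket)
    else:
        s3_client.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )

    # Keep earlier versions of objects so overwrites and deletes are recoverable.
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Upload the object into the chosen storage class.
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=body, StorageClass=storage_class
    )

    # Temporary download link; the object itself stays private.
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object before pointing it at a real AWS account.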

Alternatives to Amazon S3

A few alternatives to Amazon S3 are:

  • Wasabi Hot Cloud Storage
  • Rabata.io Cloud Storage
  • Backblaze B2 Cloud Storage
  • Google Cloud Storage
  • Microsoft Azure Blob Storage
  • DigitalOcean Spaces

Although each product has a unique feature that makes it stand apart, AWS continues to be the market leader for cloud storage in 2024.

Monitoring Amazon S3 bucket activity

Since the S3 bucket stores all of the data, bucket actions must be monitored to maintain security. Log360's cloud monitoring module, Cloud Security Plus, helps keep track of bucket activity in the S3 architecture with reports that cover key actions, such as the creation and deletion of S3 buckets. With appropriate permissions, it also analyzes the logs written to an S3 bucket and provides S3 traffic analysis reports, helping you get complete visibility into all your AWS S3 activities.