In this blog article, we will look at how we can use the AWS S3 Lifecycle configuration rule to auto-delete objects within a given bucket to save on storage costs.
On one of my weekend projects, I have a set of collections, and each collection can hold anywhere from 50 to 20k items. Each item has an image attached to it, which is currently a third-party image served over HTTP. (I don't own any of the data in these collections; I enrich it and present a meaningful dashboard on top of it.)
A collection is not disclosed to the public immediately; it opens for viewing at a specific date and time set by the collection's owner. Once it opens, users want to view its items, which are ordered by value. The first 100 items in a collection are viewed the most because they are the most highly valued. So rather than storing images for every item in every collection, we save cost by loading images into the S3 bucket on demand, using on-demand image transformation (the original images are megabytes in size). The next time a different user views the same item, which is the case 99% of the time, we serve the image from our own S3 bucket rather than from the original HTTP source. However, as a collection gets older, counting from the day it opened to users, users lose interest in it. So rather than keeping those now-redundant images in our S3 bucket, we decided to auto-delete them. (You can archive them instead, but that depends on the use case.)
So we decided to use an S3 Lifecycle configuration rule to remove objects (images, in our case) older than 3 days to save on S3 storage. (The number of days is configurable.)
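The load-on-first-view flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: a plain mapping stands in for the S3 bucket, and `fetch_original` and `transform` are hypothetical callables for the third-party download and the image transformation step.

```python
from typing import Callable, MutableMapping


def get_image(
    key: str,
    bucket: MutableMapping[str, bytes],
    fetch_original: Callable[[str], bytes],
    transform: Callable[[bytes], bytes],
) -> bytes:
    """Return the image for `key`, filling the bucket on the first request."""
    cached = bucket.get(key)
    if cached is not None:
        # Someone already viewed this item: serve our own copy.
        return cached
    # First view: download the original, shrink it, and keep the result.
    image = transform(fetch_original(key))
    bucket[key] = image
    return image
```

With this shape, only items that somebody actually views ever take up space in the bucket, which is why deleting old objects later is safe: a deleted image is simply fetched and transformed again on its next view.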
- AWS S3 Bucket.
AWS S3 Lifecycle Configuration Rule
Setting up automatic deletion of S3 objects a certain period after their creation is fairly straightforward. Let's dive in.
First, let's open the AWS Console and search for S3 in Services.
Selecting S3 takes you to the bucket list, which shows the buckets you have access to. We have 18 buckets, so we filter for the bucket we want to put a retention policy on.
I also sort the object table by Last Modified, oldest first, so that we can tell whether the rule is working whenever it runs.
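If you prefer to do the same check outside the console, here is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured; the bucket name you pass in is your own, and the helper names are made up for this example.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List


def oldest_first(objects: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort S3 object entries by their LastModified timestamp, oldest first."""
    return sorted(objects, key=lambda obj: obj["LastModified"])


def list_bucket_objects(bucket_name: str) -> List[Dict[str, Any]]:
    """Collect all object entries of a bucket (needs boto3 and credentials)."""
    import boto3  # imported lazily so oldest_first() works without boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    objects: List[Dict[str, Any]] = []
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is absent when a page (or the whole bucket) is empty.
        objects.extend(page.get("Contents", []))
    return objects
```

Calling `oldest_first(list_bucket_objects("my-bucket"))[:10]` would then show the ten oldest objects, which are the ones the rule should remove first.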
Now go to the Management tab under your bucket. No rules are defined by default.
Let's create our first rule by clicking Create Lifecycle Rule.
We fill out a form for Rule Creation:
- Name: A meaningful name to differentiate multiple rules.
- Rule Scope: The entire bucket, or a subset of objects selected with filters such as prefix, object tags, or object size.
- Rule Actions: Our bucket doesn't have versioning enabled, so every object is always the current version, and we choose to expire current versions of objects. You can set this up as per your use case.
- Expiry Period: The available options depend on the chosen rule actions. In our case, we specify a retention of 7 days.
Once everything is filled in, we click Create Rule. Ting! The rule is created and active with ease.
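The same rule can also be created from code instead of the console. The sketch below is one way to do it with boto3, assuming boto3 is installed and credentials are configured; the rule ID `expire-images` and the helper names are made up for this example. Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so include any existing rules you want to keep.

```python
from typing import Any, Dict


def build_expiration_rule(rule_id: str, days: int, prefix: str = "") -> Dict[str, Any]:
    """Build a lifecycle configuration with one rule that expires objects.

    An empty prefix applies the rule to the entire bucket.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Expire current versions `days` days after creation.
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_rule(bucket_name: str, config: Dict[str, Any]) -> None:
    """Send the configuration to S3 (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=config,
    )
```

For the 7-day retention used here, that would be `apply_rule("my-bucket", build_expiration_rule("expire-images", 7))`, with `my-bucket` replaced by your bucket name.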
S3 lifecycle processing runs daily at 00:00 UTC, and all objects in the bucket that match the rule are marked for the action defined in the rule. The expiration or transition of those objects then occurs asynchronously, so there can be a delay before they actually disappear from the bucket.
Thank you for reading. If you have made it this far, please like the article; it will encourage me to write more like it. Do share your suggestions. I appreciate your honest feedback!
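To estimate when a given object becomes eligible, the AWS documentation describes the calculation as adding the rule's number of days to the object's creation time and rounding the result up to the next midnight UTC. Here is a small sketch of that arithmetic; the function name is made up, and the handling of a result that lands exactly on midnight is my assumption.

```python
from datetime import datetime, timedelta, timezone


def expiration_eligible_at(created: datetime, days: int) -> datetime:
    """Estimate when an object becomes eligible for expiration.

    Adds `days` to the creation time and rounds up to the next midnight UTC.
    """
    if created.tzinfo is None:
        raise ValueError("created must be a timezone-aware datetime")
    target = created.astimezone(timezone.utc) + timedelta(days=days)
    midnight = target.replace(hour=0, minute=0, second=0, microsecond=0)
    if target == midnight:
        # Assumption: a time already at midnight UTC is not pushed a day later.
        return target
    return midnight + timedelta(days=1)
```

For example, an object created on 15 January 2014 at 10:30 UTC with a 3-day rule becomes eligible on 19 January 2014 at 00:00 UTC. The actual removal can still happen some time after that, since it is asynchronous.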