S3 object count with Lambda

How to Monitor Real-Time Object Count in an S3 Bucket Using AWS CloudWatch

Introduction

Managing and monitoring your resources on AWS is crucial for maintaining an efficient and cost-effective cloud environment. For AWS S3 buckets, one useful metric to monitor is the number of objects stored in the bucket. Tracking it can help you anticipate storage and request costs, enforce any object-count limits your own policies set, and understand your data growth over time.

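In this guide, we’ll walk you through how to set up an AWS Lambda function and a CloudWatch dashboard to monitor the object count of a single S3 bucket in near real time. The count is refreshed on a schedule (every 5 minutes in this guide), so the dashboard lags the bucket by at most one interval. This is aimed at novice AWS users who want to gain better visibility into their S3 usage.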

Why Monitor S3 Object Count?

Monitoring the number of objects in your S3 bucket is important for several reasons:

  1. Cost Management: S3 charges mainly for the volume of data stored and the number of requests made, not for object count itself. Still, a rising object count usually signals growing storage and request activity, so tracking it helps you spot cost increases early.
  2. Data Growth Tracking: Understanding how your data is growing over time can help with capacity planning and identifying unusual spikes in data storage.
  3. Compliance and Management: If you’re required to keep track of how many objects you’re storing, this setup provides a clear view of your bucket’s contents over time.

Prerequisites

Before you begin, you should have the following:

  1. AWS Account: If you don’t have one, you can sign up at aws.amazon.com.
  2. Basic Knowledge of AWS Console: Familiarity with navigating the AWS Management Console will be helpful.
  3. An S3 Bucket: Make sure you have an S3 bucket that you want to monitor.

Step 1: Create a Lambda Function

AWS Lambda allows you to run code without provisioning or managing servers. We’ll use it to count the objects in your S3 bucket and push this data to CloudWatch.

1.1 Write the Lambda Function Code

Here’s a Python script that you can use in your Lambda function:

import os

import boto3

# Create the clients once so warm invocations of the function can reuse them
s3 = boto3.client('s3')
cloudwatch = boto3.client('cloudwatch')


def lambda_handler(event, context):
    # Single bucket name, read from an environment variable
    bucket_name = os.environ['S3_BUCKET_NAME']
    paginator = s3.get_paginator('list_objects_v2')

    # Each page holds at most 1,000 keys, so add up the count across all pages
    object_count = 0
    for page in paginator.paginate(Bucket=bucket_name):
        # 'Contents' is missing from the page when the bucket is empty
        object_count += len(page.get('Contents', []))

    # Publish the count as a custom metric, with the bucket name as a dimension
    cloudwatch.put_metric_data(
        Namespace='S3/ObjectCount',
        MetricData=[
            {
                'MetricName': 'ObjectCount',
                'Dimensions': [
                    {
                        'Name': 'BucketName',
                        'Value': bucket_name
                    }
                ],
                'Value': object_count,
                'Unit': 'Count'
            },
        ]
    )

    return {
        'statusCode': 200,
        'body': f'Pushed {object_count} objects to CloudWatch for bucket {bucket_name}.'
    }

This function:

  • Counts the number of objects in the specified S3 bucket.
  • Pushes the count to a custom CloudWatch metric called ObjectCount under the namespace S3/ObjectCount.
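
The counting logic can be checked locally without touching AWS. The sketch below pulls it into a helper (count_objects, a hypothetical name that does not appear in the function above) and feeds it hand-made pages shaped like list_objects_v2 responses. The keys and page sizes are made up for illustration.

```python
from typing import Any, Dict, Iterable


def count_objects(pages: Iterable[Dict[str, Any]]) -> int:
    """Sum the number of objects across list_objects_v2-style pages."""
    total = 0
    for page in pages:
        # 'Contents' is absent from a page when the listing is empty
        total += len(page.get('Contents', []))
    return total


# Hand-made stand-ins for the pages a paginator would yield (illustrative only)
fake_pages = [
    {'Contents': [{'Key': 'a.txt'}, {'Key': 'b.txt'}]},
    {'Contents': [{'Key': 'c.txt'}]},
    {},  # an empty page has no 'Contents' key
]

print(count_objects(fake_pages))
```

Inside the Lambda function, the same helper would take paginator.paginate(Bucket=bucket_name) as its argument.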

1.2 Deploy the Lambda Function

  1. Go to the AWS Management Console and navigate to Lambda.
  2. Click Create function.
  3. Choose Author from scratch.
  4. Give your function a name (e.g., S3ObjectCountMonitor).
  5. Choose Python 3.x as the runtime.
  6. Under Permissions, select an existing role or create a new one that can list the bucket’s objects (s3:ListBucket) and publish metrics to CloudWatch (cloudwatch:PutMetricData).
  7. Paste the above code into the function code editor.
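  8. Set the environment variable S3_BUCKET_NAME to the name of your S3 bucket:
     • Scroll down to Environment variables and click Add environment variable.
     • Set the key to S3_BUCKET_NAME and the value to your bucket name (e.g., my-s3-bucket).
  9. If your bucket holds many objects, raise the function timeout above the 3-second default (under Configuration > General configuration), since listing a large bucket takes time.
  10. Click Deploy to save and deploy your function.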
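
The role's permissions can be written down as a policy document. The following is a minimal sketch, not the only valid policy: build_policy is a hypothetical helper, my-s3-bucket is the example bucket name from the steps above, and restricting PutMetricData with the cloudwatch:namespace condition key is an optional tightening assumed here.

```python
import json


def build_policy(bucket_name: str, namespace: str = 'S3/ObjectCount') -> dict:
    """Return a narrowly scoped policy document for the monitoring function."""
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                # Listing objects requires s3:ListBucket on the bucket itself
                'Effect': 'Allow',
                'Action': 's3:ListBucket',
                'Resource': f'arn:aws:s3:::{bucket_name}',
            },
            {
                # PutMetricData uses "*" as its resource; the condition
                # limits publishing to the one custom namespace
                'Effect': 'Allow',
                'Action': 'cloudwatch:PutMetricData',
                'Resource': '*',
                'Condition': {
                    'StringEquals': {'cloudwatch:namespace': namespace}
                },
            },
        ],
    }


print(json.dumps(build_policy('my-s3-bucket'), indent=2))
```

The printed JSON can be pasted into the IAM policy editor. A role created through the Lambda console typically also carries permissions for writing logs, which this sketch leaves out.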

Step 2: Schedule the Lambda Function

You’ll need to run this Lambda function periodically to keep the CloudWatch metric updated.

  1. Go to CloudWatch in the AWS Management Console.
  2. Navigate to Rules under Events and click Create rule. The console may redirect you to Amazon EventBridge, the successor to CloudWatch Events.
  3. Set the Rule Name and Description.
  4. Set the Event Source to Schedule.
  5. Set a schedule expression (e.g., rate(5 minutes) to run the function every 5 minutes).
  6. Add a target, select Lambda function, and choose the Lambda function you created earlier.
  7. Review the settings and create the rule.
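
The same schedule can be set up from code instead of the console. The sketch below assumes boto3 clients for EventBridge and Lambda are passed in; schedule_function, the rule name S3ObjectCountSchedule, and the target Id are hypothetical names chosen for illustration.

```python
def schedule_function(events, lambda_client, function_name, function_arn,
                      rule_name='S3ObjectCountSchedule',
                      expression='rate(5 minutes)'):
    """Create a scheduled rule and point it at the Lambda function."""
    # Create (or update) the rule that fires on the schedule
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State='ENABLED',
    )
    # Allow EventBridge to invoke the function, for this rule only
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f'{rule_name}-invoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule['RuleArn'],
    )
    # Register the function as the rule's target
    events.put_targets(
        Rule=rule_name,
        Targets=[{'Id': 'object-count-lambda', 'Arn': function_arn}],
    )
    return rule['RuleArn']
```

With real clients, a call might look like schedule_function(boto3.client('events'), boto3.client('lambda'), 'S3ObjectCountMonitor', function_arn), where function_arn is the ARN shown on your function's page. Running it a second time with the same rule name would likely fail at add_permission, because the statement ID already exists.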

Step 3: Create a CloudWatch Dashboard

Next, you’ll create a CloudWatch dashboard to visualize the object count.

3.1 Create the Dashboard

  1. Go to CloudWatch in the AWS Management Console.
  2. Navigate to Dashboards and click Create dashboard.
  3. Give your dashboard a name (e.g., S3ObjectCountDashboard).

3.2 Add a Metric Widget

  1. Click Add widget.
  2. Choose Line for the widget type.
  3. Click Select metric.
  4. Navigate to Custom Namespaces > S3/ObjectCount.
  5. Select ObjectCount and choose the correct bucket from the dimension dropdown.
  6. Customize the graph as needed and click Create widget.
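
A dashboard can also be described as a JSON body rather than built by hand. The sketch below generates a body containing one line widget for the custom metric. build_dashboard_body is a hypothetical helper, and the region (us-east-1), widget size, Maximum statistic, and 300-second period are assumptions to adjust for your setup.

```python
import json


def build_dashboard_body(bucket_name: str, region: str = 'us-east-1') -> str:
    """Return a dashboard body with one line widget for the ObjectCount metric."""
    widget = {
        'type': 'metric',
        'x': 0, 'y': 0, 'width': 12, 'height': 6,
        'properties': {
            'view': 'timeSeries',
            'title': f'Object count: {bucket_name}',
            'region': region,
            # [namespace, metric name, dimension name, dimension value]
            'metrics': [['S3/ObjectCount', 'ObjectCount', 'BucketName', bucket_name]],
            'stat': 'Maximum',
            'period': 300,  # matches the 5-minute schedule
        },
    }
    return json.dumps({'widgets': [widget]})


body = build_dashboard_body('my-s3-bucket')
print(body)
```

The resulting string can be passed to boto3.client('cloudwatch').put_dashboard(DashboardName='S3ObjectCountDashboard', DashboardBody=body), or pasted into the dashboard's source editor in the console.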

Conclusion

By following this guide, you’ve set up a Lambda function to track the number of objects in your S3 bucket and visualized this data using a CloudWatch dashboard. This setup helps you monitor your S3 usage effectively, ensuring better cost management and data tracking.

As you grow more familiar with AWS, you’ll find many other useful metrics and insights you can generate using Lambda and CloudWatch. Happy monitoring!
