This is Part 2 of my Serverless Webhooks post. You can find Part 1 here.

In this continuation of the Serverless Webhooks design using AWS Lambda, we will look at processing the ingested data using containers. A Lambda function is a good fit for accepting the data payload, but if the data requires heavy computation, handing the work off to a process running on a container is a better and more practical approach.

Architecture

architecture-container

In the diagram above, we enhanced the architecture to do the data processing via a container (see the blue arrows). Instead of the Lambda function storing directly to DynamoDB, it can send the data, or just a data id, to a message queue like AWS SQS. The application poc-data-processor running on the ECS container regularly polls the poc-data-feed-queue SQS queue for new messages and processes the incoming data before storing it to the DynamoDB table.

For Part 2, we will focus on SQS and how to set it up with our Lambda function.

SQS setup

Linking the Lambda function to the data processor on the ECS container requires an integration service like SQS. The SQS queue asynchronously accepts the incoming data and holds it until the container app can process it. This way, the Lambda function can terminate immediately and does not need to wait for the entire data processing to complete.

Remember that AWS Lambda functions are billed by execution time, so it is best not to use them for long-running processes like heavy data analysis and calculations.

Creating the queue is straightforward. In the AWS console, go to Amazon Simple Queue Service, choose the Standard queue type, and enter poc-data-feed-queue as the queue name.

aws-sqs
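If you prefer to create the queue from code instead of the console, a minimal boto3 sketch along the lines below would create an equivalent standard queue (same queue name as above, default settings everywhere else).

import boto3

# Create the standard queue programmatically (equivalent to the console step above)
sqs = boto3.client('sqs')
response = sqs.create_queue(QueueName='poc-data-feed-queue')
print(response['QueueUrl'])  # keep this URL, the Lambda function will need it later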

Update the Lambda function to send the data to the new queue

Update the Lambda function poc-data-feed-handler created in Part 1.

To keep the previous Lambda function version, open the Actions menu of the Lambda in the AWS console and click Publish new version.

Add another boto3 client, this time for SQS.

boto3.client('sqs')

Get the queue URL of the newly created SQS queue. This is required by the boto3 client when sending the message. It will look something like https://sqs.yourregion.amazonaws.com/youraccount/poc-data-feed-queue.
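You can copy the URL from the SQS console, or look it up programmatically by queue name; a small boto3 sketch for the latter is shown below (assuming the queue name we created above).

import boto3

# Look up the queue URL by name instead of hard-coding it
sqs = boto3.client('sqs')
queue_url = sqs.get_queue_url(QueueName='poc-data-feed-queue')['QueueUrl']
print(queue_url)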

Below is the updated Python script that sends the data to the SQS queue. The function to invoke is send_message, which takes the queue URL, the message attributes, and the message body as parameters.

import json
import boto3
import uuid

def lambda_handler(event, context):
    # Parse the incoming webhook payload and assign a unique transaction id
    request = json.loads(event['body'])
    txn_id = uuid.uuid4()

    item = {
        'uuid': {'S': str(txn_id)},
        'name': {'S': request['name']},
        'age': {'N': str(request['age'])},
        'emailId': {'S': str(request['emailId'])},
        'referralId': {'S': str(request['referralId'])},
        'status': {'S': "PENDING"}
    }

    # Store the raw payload in DynamoDB with a PENDING status
    dynamodb = boto3.client('dynamodb')
    dynamodb.put_item(TableName='poc_data_feed', Item=item)

    # Notify the processor by sending the transaction id to the SQS queue
    sqs = boto3.client('sqs')
    queue_url = 'https://sqs.yourregion.amazonaws.com/youraccount/poc-data-feed-queue'
    response = sqs.send_message(
        QueueUrl=queue_url,
        MessageAttributes={
            'name': {
                'DataType': 'String',
                'StringValue': request['name']
            }
        },
        MessageBody=str(txn_id)
    )

    return {
        'statusCode': 200,
        'body': json.dumps({
            "txnId": str(txn_id),
            "message": "Data received."
        })
    }

One thing to note: we only pass the txn_id to the SQS queue, not the entire data payload. The Lambda function still stores the raw data payload directly to the DynamoDB table and just passes a unique id like the txn_id to the queue. This is usually the case when the payload is large and needs to be persisted first; it is also not practical to store the entire payload in the queue. It is up to the processor to pick up the full data from the DynamoDB table based on the txn_id it got from the queue.
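To make that flow concrete, here is a rough Python sketch of what the consumer side could look like. The actual processor will be a Spring application (covered in the next part), so treat the queue URL, table name, and polling loop below as illustrative assumptions only.

import boto3

sqs = boto3.client('sqs')
dynamodb = boto3.client('dynamodb')
queue_url = 'https://sqs.yourregion.amazonaws.com/youraccount/poc-data-feed-queue'

while True:
    # Long-poll the queue for new transaction ids
    messages = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for message in messages.get('Messages', []):
        txn_id = message['Body']

        # Fetch the full payload from DynamoDB using the transaction id
        record = dynamodb.get_item(TableName='poc_data_feed', Key={'uuid': {'S': txn_id}})
        process(record['Item'])  # placeholder for the heavy processing step

        # Remove the message from the queue once it has been processed
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])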

Before testing the function, make sure to add an SQS policy to the Lambda function's execution role so that it can access the SQS queue. For simplicity, since this is just a PoC, we will use the managed policy AmazonSQSFullAccess and give our Lambda function full access.

sqs-policy
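If you would rather attach the policy from a script than through the console, a boto3 sketch like the one below would do it; the role name here is hypothetical, so replace it with your Lambda function's actual execution role.

import boto3

iam = boto3.client('iam')

# Attach the managed SQS policy to the Lambda execution role (role name is hypothetical)
iam.attach_role_policy(
    RoleName='poc-data-feed-handler-role',
    PolicyArn='arn:aws:iam::aws:policy/AmazonSQSFullAccess'
)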

Test the Lambda and SQS connectivity

Using the same request format we used in Part 1, trigger a request to the Lambda via the API Gateway. Refer to the sample Postman request and response below.

postman-sqs
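If you prefer to test from a script instead of Postman, something along these lines would exercise the same endpoint. The invoke URL is a placeholder and the field values are made up; only the field names are taken from the Lambda handler above.

import requests

# Hypothetical API Gateway invoke URL from Part 1; replace with your own
url = 'https://yourapiid.execute-api.yourregion.amazonaws.com/prod/poc-data-feed'

payload = {
    "name": "Jane Doe",
    "age": 30,
    "emailId": "jane.doe@example.com",
    "referralId": "REF-001"
}

response = requests.post(url, json=payload)
print(response.status_code, response.json())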

Let’s verify that the message was delivered to the SQS queue. Open the queue in the AWS console and poll for messages. You should see the newly passed message in SQS with the txn_id as the message body.

sqs-poll-message

Our Lambda function can now store data to both the DynamoDB table and the SQS queue, where the message waits to be consumed by the processor container.

In the next part, we will show how to build the data processor application using Spring and consume the SQS message.