AWS Lambda Runtime Migration: One Step Back, Two Steps Forward

The journey towards optimal utilization is not always linear, sometimes requiring us to take a step back before we can move forward.

Introduction

The dynamic field of software engineering compels us to adapt and grow constantly. When the time arrived to update our Lambda functions from Node.js v14 to v18 and transition from AWS’s aws-sdk v2 to v3, we set out on an exciting journey filled with many learning opportunities. Using a phased approach, we effectively minimized disruption while notably improving performance. In this article, we share our experiences and insights, hoping to inspire others to embrace change and adapt to new technologies.

A Step-by-Step Strategy

As the deprecation date for Node.js v14 runtime in AWS Lambda functions approached, and with numerous benefits offered by the new modular AWS SDK, we decided to upgrade our Lambda runtime to Node.js v18. Additionally, we aimed to leverage the advantages of AWS SDK v3 in our Lambda functions.

We adopted a two-step process for these updates: first we upgraded the runtime from Node.js v14 to Node.js v18, and only then did we transition to aws-sdk v3. This approach allowed us to examine the impact of each individual change in isolation, leading to more accurate troubleshooting and more reliable optimizations.

Facing and Overcoming Challenges

Before moving forward, it’s crucial to be aware that AWS Lambda ships a different AWS SDK with the runtime depending on the Node.js version: for Node.js v14, the “aws-sdk” package (v2) is provided, while for Node.js v18, the “@aws-sdk/*” packages (v3) are natively available in Lambda.

This meant that after updating to Node.js v18, we initially bundled “aws-sdk” (v2) with our build. This kept our Lambdas working on the Node.js v18 runtime without necessitating any changes in our product code. Unexpectedly, the Lambda execution time surged from 22ms to 38ms.
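
To illustrate this intermediate stage, here is a minimal sketch of a handler still written against aws-sdk v2 and bundled into the build (the table name and event shape are illustrative, not taken from our actual code):

    // aws-sdk v2 is no longer provided by the Node.js 18 runtime,
    // so at this stage it had to be bundled into the deployment artifact.
    import AWS from "aws-sdk";

    const docClient = new AWS.DynamoDB.DocumentClient();

    export const handler = async (event: { id: string }) => {
      // Fetch a single item from a (hypothetical) "items" table.
      const result = await docClient
        .get({ TableName: "items", Key: { id: event.id } })
        .promise();
      return result.Item;
    };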

A dive into existing research, such as Kyle Higginson’s work, revealed that bundle size significantly impacts cold start performance. We found that bundling “aws-sdk” v2 increased our bundle size by a factor of 7.5, explaining the performance drop.

To address this, before moving on to the second phase of our strategy, we increased the memory and CPU resources of our Lambda functions. Raising the memory limit from 192MB to 256MB didn’t change the number of vCPUs allocated to the function, but it did raise the CPU cap, giving our service proportionally more computing power.
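
As an illustration, assuming an AWS CDK setup (the article doesn’t name our actual deployment tooling, so the stack and function names here are hypothetical), the change is a one-line bump of memorySize:

    import { Stack, StackProps } from "aws-cdk-lib";
    import { Construct } from "constructs";
    import * as lambda from "aws-cdk-lib/aws-lambda";

    export class ServiceStack extends Stack {
      constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);

        new lambda.Function(this, "ItemsFunction", {
          runtime: lambda.Runtime.NODEJS_18_X,
          handler: "handler.handler",
          code: lambda.Code.fromAsset("dist"),
          // Raised from 192 MB; the CPU cap scales proportionally with memory.
          memorySize: 256,
        });
      }
    }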

The number of vCPU cores allocated to a Lambda function is based on the amount of memory you configure: the more memory you allocate, the more cores your function gets. However, processing power is not proportional to the core count alone (e.g. you do not get 2x the processing power going from 1,768MB to 1,770MB). The reason is that “Adding more memory proportionally increases the amount of CPU, increasing the overall computational power available,” regardless of the allocated core count. The CPU cap is the maximum percentage of a CPU core (vCPU) that can be utilized at any point in time. “At 1,769 MB, a function has the equivalent of one vCPU (one vCPU-second of credits per second),” in other words, at 1,769 MB the CPU cap is 100%.
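
A rough back-of-the-envelope sketch of that proportionality (the 1,769 MB ≈ 1 vCPU figure comes from the AWS documentation quoted above; the helper function and the rounded fractions are our own approximation):

    // Approximate fraction of a vCPU a function can use, based on
    // the documented 1,769 MB ≈ one vCPU-second of credits per second.
    const vcpuShare = (memoryMb: number): number => memoryMb / 1769;

    console.log(vcpuShare(192).toFixed(3)); // ≈ 0.109 vCPU
    console.log(vcpuShare(256).toFixed(3)); // ≈ 0.145 vCPU, roughly 33% more compute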

Allocating more computing power allowed us to reduce the execution duration to ~28ms, which was still higher than the original ~22ms, but was acceptable for the time being.

Finally, we moved forward with the second phase of our strategy – adopting the modular aws-sdk v3, which lets us import only the functionality our applications actually need. On top of that, the modular aws-sdk v3 is already available in the Node.js v18 runtime, so we no longer needed to bundle it with our build.
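
For example, with esbuild (an assumption – any bundler with an “external” option works the same way), the v3 packages can be left out of the bundle because the runtime already provides them:

    import { build } from "esbuild";

    await build({
      entryPoints: ["src/handler.ts"],
      bundle: true,
      platform: "node",
      target: "node18",
      outfile: "dist/handler.js",
      // The Node.js 18 runtime ships aws-sdk v3, so keep it out of the bundle.
      external: ["@aws-sdk/*"],
    });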

Following the v3 transition, we achieved a notable reduction in bundle size and switched to the modular, focused, and more performant AWS SDK v3 packages. This drastically improved our Lambda functions’ efficiency, reducing the average execution time from 38ms to an astonishing 5ms.

PS: In our case we only needed to use @aws-sdk/client-dynamodb and @aws-sdk/lib-dynamodb packages.
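
As a rough sketch of what that looks like with the v3 document client (the table name and event shape are again illustrative):

    import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
    import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

    // Created outside the handler so the client is reused across invocations.
    const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));

    export const handler = async (event: { id: string }) => {
      // Fetch a single item from a (hypothetical) "items" table.
      const result = await docClient.send(
        new GetCommand({ TableName: "items", Key: { id: event.id } })
      );
      return result.Item;
    };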

Conclusion

This phased approach to embracing updates facilitated smooth management of potential disruptions and shed light on the impacts of individual improvements, allowing us to maximize performance gains. As a result, we now have efficient Lambda functions, improved user experiences, and reduced operational costs.

We hope that sharing our experience will inspire other teams to embrace change and adapt to new technologies, reminding them that every challenge encountered could be an opportunity for improvement itself. As we continually learn and grow, it’s clear that our journey towards optimal utilization is not always linear, sometimes requiring us to take a step back before we can move forward.
