Description
Describe the bug
We’ve encountered a significant performance issue when using wr.catalog.get_partitions(database=database_name, table=table_name) in AWS Lambda with Python version 3.12.
With Python 3.9 and 3.10 (using the corresponding awswrangler layers), the function executes in approximately 3.5 seconds.
In Python 3.12, however, the first iteration takes around 200 seconds.
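For anyone trying to put numbers on this, here is a minimal sketch of a handler that times the import and two consecutive calls separately. The `timed` helper and the `database`/`table` event keys are hypothetical names used only for illustration, and splitting out the import time is just one guess at where the delay might be, not a confirmed cause.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds) measured with a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def lambda_handler(event, context):
    # Import inside the handler so the import cost is reported on its own
    # (assumption: the library is not already imported at module level).
    _, import_s = timed(__import__, "awswrangler")
    import awswrangler as wr

    # The "database" and "table" event keys are hypothetical.
    kwargs = {"database": event["database"], "table": event["table"]}

    # Time the first call and a repeat call to see whether only the first is slow.
    partitions, first_s = timed(wr.catalog.get_partitions, **kwargs)
    _, second_s = timed(wr.catalog.get_partitions, **kwargs)

    return {
        "import_seconds": round(import_s, 3),
        "first_call_seconds": round(first_s, 3),
        "second_call_seconds": round(second_s, 3),
        "partition_count": len(partitions),
    }
```

Running the same handler on the 3.9/3.10 and 3.12 runtimes should show whether the extra time is in the import, the first call, or every call.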
How to Reproduce
Change the Lambda runtime from Python 3.9 to Python 3.12
Delete layer 3.9 (arn:aws:lambda:eu-central-1:336392948345:layer:AWSDataWrangler-Python39:1)
Add layer for 3.12 (arn:aws:lambda:eu-central-1:336392948345:layer:AWSSDKPandas-Python312:19)
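The same steps could be scripted with boto3 instead of the console. This is only a sketch: the function name `my-function` is a placeholder, and it relies on `update_function_configuration` replacing the whole layer list, so passing only the 3.12 layer also drops the 3.9 one.

```python
from typing import Any, Dict, List

# Layer ARN copied from the steps above.
PY312_LAYER_ARN = (
    "arn:aws:lambda:eu-central-1:336392948345:layer:AWSSDKPandas-Python312:19"
)


def build_runtime_update(
    function_name: str, runtime: str, layer_arns: List[str]
) -> Dict[str, Any]:
    """Build the keyword arguments for Lambda's update_function_configuration."""
    return {
        "FunctionName": function_name,
        "Runtime": runtime,
        # Layers is the complete list: anything not listed here is detached.
        "Layers": list(layer_arns),
    }


def apply_runtime_update(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the update to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("lambda")
    return client.update_function_configuration(**params)


if __name__ == "__main__":
    # "my-function" is a hypothetical function name.
    params = build_runtime_update("my-function", "python3.12", [PY312_LAYER_ARN])
    print(params)
    # apply_runtime_update(params)  # uncomment to actually change the function
```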
Expected behavior
wr.catalog.get_partitions(database=..., table=...) should execute consistently and efficiently across supported Python versions in AWS Lambda. Specifically, it should return partition data within a few seconds (e.g., ~3.5 s, as observed with Python 3.9 and 3.10), without significant delays or performance degradation.
Your project
No response
Screenshots
No response
OS
Windows
Python version
3.12
AWS SDK for pandas version
arn:aws:lambda:eu-central-1:336392948345:layer:AWSSDKPandas-Python312:19
Additional context
No response