After some number of invocations, importing a module fails, and then fails for all subsequent invocations #243

Closed
gavinmh opened this issue Sep 14, 2018 · 9 comments

@gavinmh

gavinmh commented Sep 14, 2018

This plugin is very helpful, thank you.

I have a Lambda function that is invoked every 30 minutes. After deployment, the function operates as expected. After some number of invocations, it fails with the following:

Unable to import module 'lambda_handler': No module named 'sklearn'

Every subsequent invocation then fails. After redeploying, it operates correctly again.

The following is my requirements.txt:

psycopg2-binary==2.7.5
SQLAlchemy==1.2.11
numpy==1.14.5
pandas==0.23.1
boto3==1.7.52
requests==2.18.4
nose==1.3.7
requests-mock==1.5.2
scikit-learn==0.19.1
scipy==1.1.0
xgboost==0.72.1
marshmallow==2.15.4

My lambda_handler.py starts with:

try:
    import unzip_requirements
except ImportError:
    pass

The following is from my serverless.yml:

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    zip: true
    dockerizePip: true
    slim: true
    noDeploy:
      - nosetests
      - requests-mock

package:
  include:
    - requirements.txt
  exclude:
    - aws-deployment*
    - .dockerignore
    - Dockerfile
    - README.md
    - .gitignore
    - venv/**
    - test/**

Any help is appreciated.

@AndrewFarley
Contributor

I'm not going to look into this right away, but for me or someone else, can you provide a little more context/details from the CloudWatch Logs? Specifically, I'm curious if this problem happens on a Lambda that is freshly booted, or if it's using a previously started/re-used container.

To help detect this, add a print('container_started') at the top level of your handler file, outside the lambda_handler function, and then share the CloudWatch logs. Thanks!
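A minimal sketch of that idea, for anyone following along. It assumes the handler is named lambda_handler as in the original report; the counter and the return value are illustrative additions, not something the plugin provides. Module-level code runs once per container, while the handler body runs on every invocation, so the first call in a container is the cold start.

```python
# Module-level code runs once per container, before any invocation.
print('container_started')

try:
    import unzip_requirements  # noqa: F401  (only present in the deployed package)
except ImportError:
    pass

# Illustrative counter (not part of the plugin): 1 means a fresh container.
_invocation_count = 0


def lambda_handler(event, context):
    global _invocation_count
    _invocation_count += 1
    cold_start = _invocation_count == 1
    print(f'invocation={_invocation_count} cold_start={cold_start}')
    return {'invocation': _invocation_count, 'cold_start': cold_start}
```

If the failure only shows up with cold_start=False, that points at a re-used container rather than a fresh one.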

@dschep
Contributor

dschep commented Sep 14, 2018

To add to @AndrewFarley's suggestion, make sure you put that print statement before the unzip_requirements import.

I'd guess that something is causing it to run out of disk space, but I reviewed the source for unzip_requirements and I'm not sure why that would happen 😖

@gavinmh
Author

gavinmh commented Sep 14, 2018

Thanks @AndrewFarley and @dschep. I added print('container_started') before the unzip_requirements import. I'll post logs when they're available.

@petergaultney

petergaultney commented Sep 20, 2018

Here are my own logs, and the code at the top of the handler file:

#!/usr/bin/env python

import argparse
from datetime import datetime
import logging
from typing import List

# this helps us use zipped dependencies on Lambdas,
# to work around numpy/scipy size requirements
# It must run before we import any non standard library requirements on AWS
try:
    print('attempting to unzip_requirements...')
    import subprocess
    print(subprocess.check_output(['df']))
    import unzip_requirements  # noqa
    print('successfully imported unzip_requirements to prepare the zipped requirements')
except ImportError:
    print('failed to import unzip_requirements - if you are running locally this is not a problem')

logs:

attempting to unzip_requirements...

Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/xvda1       8123812 3896308   4127256  49% /
/dev/loop0        538424     440    526148   1% /tmp
/dev/loop1        146176  146176         0 100% /var/task

START RequestId: 0cfac345-bc56-11e8-a40e-73543f84c427 Version: $LATEST
module initialization error: [Errno 28] No space left on device
END RequestId: 0cfac345-bc56-11e8-a40e-73543f84c427

I think unzip_requirements is supposed to extract to /tmp? I may try to catch the other error and see what df shows after unzip_requirements tries and fails to run...
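One way to catch "the other error" might look like the sketch below. The helper name import_with_disk_report is hypothetical. It relies on the fact that "[Errno 28] No space left on device" surfaces as an OSError with errno.ENOSPC, which an `except ImportError` clause does not catch, and it uses shutil.disk_usage from the standard library instead of shelling out to df.

```python
import errno
import shutil


def import_with_disk_report(do_import, path='/tmp'):
    """Run do_import(); on ENOSPC, print disk usage for path, then re-raise."""
    try:
        return do_import()
    except OSError as exc:
        # ImportError is not an OSError, so it passes through untouched.
        if exc.errno == errno.ENOSPC:
            usage = shutil.disk_usage(path)
            print(f'no space left: {path} free={usage.free} of {usage.total} bytes')
        raise
```

In a handler file this could wrap the import, for example import_with_disk_report(lambda: __import__('unzip_requirements')), still inside the usual try/except ImportError so that local runs keep working.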

@gavinmh
Author

gavinmh commented Sep 20, 2018

I discovered that I was mistakenly using version 3.3.1. I have not experienced this issue since upgrading to 4.2.1 five days ago. I apologize for the inconvenience.

@petergaultney

I'm on 4.1.1.

@petergaultney

FWIW, I've determined that this is simply the 512 MB limit of /tmp and has nothing to do with serverless-python-requirements.
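For anyone who wants to check this before deploying, here is a rough sketch that compares the uncompressed size of a requirements archive with the free space on a target directory. The function names and the idea of running it against the packaged zip are assumptions for illustration; nothing here is part of serverless-python-requirements.

```python
import shutil
import zipfile


def uncompressed_size(zip_path):
    """Total size in bytes of all archive members once extracted."""
    with zipfile.ZipFile(zip_path) as zf:
        return sum(info.file_size for info in zf.infolist())


def fits_in(zip_path, target_dir='/tmp'):
    """True if the extracted archive would fit in target_dir's free space."""
    return uncompressed_size(zip_path) <= shutil.disk_usage(target_dir).free
```

Run locally, uncompressed_size alone is probably the more useful number, since it can be compared directly against the 512 MB figure.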

It would be great if a solution could be devised where the built-in zipimport was used instead of forcing an unzip, but in my case this wouldn't work anyway, since I have .so modules as a result of using spaCy, SciPy, and NumPy.
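To illustrate what zipimport can do, here is a small self-contained sketch. The archive and module names (deps.zip, pure_module) are made up for the demo. A pure-Python module loads straight from a zip placed on sys.path, with no extraction step; the Python documentation states that importing dynamic modules (.so, .pyd) from a zip is disallowed, which is the limitation mentioned above.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip holding one pure-Python module (hypothetical name).
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'deps.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('pure_module.py', 'VALUE = 42\n')

# Putting the archive on sys.path lets the built-in zipimport machinery load it.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()
pure_module = importlib.import_module('pure_module')

# __file__ points inside the archive, showing nothing was unpacked to disk.
print(pure_module.VALUE, pure_module.__file__)
```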

@gavinmh gavinmh closed this as completed Sep 20, 2018
@dschep
Contributor

dschep commented Sep 20, 2018

Yup, .so files are exactly why I don't use zipimport 😞

@sjl070707

sjl070707 commented Apr 17, 2019

@gavinmh
Were you able to include all the packages in your requirements?
I'm facing a different issue: as soon as I include xgboost in requirements.txt, deployment fails with an 80MB size limit error.

#350

EDIT:

Just found out that you need to specify the xgboost version; otherwise it throws the size limit error. I don't know why, but it's fixed.
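If pinning is what makes the difference, a quick check for unpinned entries might help. This is only a sketch: the function name is hypothetical, it treats anything without an exact == pin as unpinned, and the thread does not explain why the pin affects the package size.

```python
def unpinned(requirement_lines):
    """Return requirement lines that lack an exact '==' version pin."""
    found = []
    for line in requirement_lines:
        line = line.split('#', 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith('-'):  # skip blanks and pip options
            continue
        if '==' not in line:
            found.append(line)
    return found
```

For example, unpinned(open('requirements.txt')) would list a bare xgboost entry but not xgboost==0.72.1.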
