Missing .so files in Lambda functions

Most of the time, adding a Python package to a Lambda function is a simple task. You pip install to a directory, and then copy that directory to the function, either directly or through a Lambda layer.

But sometimes, there’s extra work required.

Packages like opencv depend on shared libraries that live outside the directory you pip installed into. When you pip install opencv-python-headless, the package expects additional .so files to be present in the system library directories of your environment to provide the OpenCV functionality.

So if you copied across just the contents of your pip installed directory, you’ll run into a “libgthread-2.0.so.0: cannot open shared object file” error.

Klayers had 3 open issues for opencv around missing .so files, so today I decided to fix them and write about how you can do the same.

Step 1: Get a Lambda Environment

To get the missing .so files, we’ll need to replicate the environment in which Lambda functions run. Remember, Lambda functions run on Amazon Linux 2 inside a Firecracker microVM, so the .so files we collect have to come from a matching environment if they are going to work.

Fortunately, the folks over at lambci have a solution. They publish docker containers that mimic the environment of a Lambda function. By using their containers, I’m able to docker run a Lambda-equivalent environment on my MacBook.

First we pull down the relevant container:

$ docker pull lambci/lambda:build-python3.8

Next, start a shell in that container and pip install opencv-python-headless:

$ docker run -ti lambci/lambda:build-python3.8 /bin/bash

bash-4.2# pip install opencv-python-headless

Great, we’ve got it working. Now we need to copy out those .so files from our container to the local machine. At this point, we’ll need to keep the container running until we’ve copied out the required files.

Step 2: Get the .so files

First let’s locate where the relevant .so files are. We know from the error message that we’re looking for libgthread-2.0.so.0, but where is this file located?

Most of the time it’ll be in /usr/lib64, and we can verify this by listing the directory and grepping for the specific name.
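The same check can be sketched in Python, which is already available inside the build container. This is a rough equivalent rather than the exact command used here, and find_shared_objects is a hypothetical helper name of my own:

```python
from pathlib import Path


def find_shared_objects(directory, name_fragment):
    """Return (entry name, resolved file name) pairs for matching entries.

    An entry that is a symlink is reported alongside the file it points to,
    which makes it easy to spot a link and its versioned target.
    """
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if name_fragment in path.name:
            # resolve() follows symlinks all the way to the real file
            matches.append((path.name, path.resolve().name))
    return matches
```

Calling find_shared_objects("/usr/lib64", "libgthread") inside the container should list each matching entry next to the file it resolves to.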

So the file is here, but it also happens to be linked to another file, libgthread-2.0.so.0.5600.1. I’m no Linux expert, but I know enough to see that it’ll take both these files to make our Lambda function work. So if we copy out these files to our local machine, we should be good, right?

Well…not so.

You see, libgthread might have dependencies of its own. To ensure we copy out everything that is required, we run the ldd command on the file to see whether it needs any other files.
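As a hedged sketch of that step, the snippet below runs ldd and keeps only the dependencies that resolve to a real path. The helper names parse_ldd_output and list_dependencies are my own, and the regular expression assumes the usual glibc ldd line format of "name => path (address)":

```python
import re
import subprocess

# Assumed line shape: "libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f...)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd_output(text):
    """Map each library name to the path ldd resolved it to.

    Lines without a resolved path (the vdso, the dynamic loader,
    or "not found" entries) are skipped.
    """
    dependencies = {}
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            dependencies[match.group(1)] = match.group(2)
    return dependencies


def list_dependencies(shared_object):
    """Run ldd on a file and return its resolved dependencies."""
    result = subprocess.run(
        ["ldd", shared_object], capture_output=True, text=True, check=True
    )
    return parse_ldd_output(result.stdout)
```

Inside the container, list_dependencies("/usr/lib64/libgthread-2.0.so.0") would give a starting list of candidate files to copy out. Many of them will already exist in the Lambda runtime, so treat the result as a checklist rather than a list of files that must all be shipped.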

Once we identify the right files to copy out, we can copy them across by running the docker cp command.

The docker cp command allows us to copy files from a locally running container to our host machine, using this simple syntax:

$ docker cp <container>:<src-path> <local-dest-path> 

Note: to get the name of our running container, we run the docker ps command. In this example, the name of our container is quirky_colden; Docker assigns containers these odd two-word names to allow for easy identification.

Now it’s a simple matter of copying out the right files to our local machine, for example:

$ docker cp quirky_colden:/usr/lib64/libpcre.so.1.2.0 .
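With more than a handful of files, typing each docker cp by hand gets tedious. The sketch below builds one command per file and optionally runs them. The function names are hypothetical, and the follow_links option maps to docker cp's -L flag, which copies the file a symlink points to instead of the link itself:

```python
import subprocess


def docker_cp_commands(container, paths, dest=".", follow_links=False):
    """Build one `docker cp` argument list per path inside the container."""
    commands = []
    for path in paths:
        command = ["docker", "cp"]
        if follow_links:
            # -L makes docker cp follow a symlink in the source path
            command.append("-L")
        command += [f"{container}:{path}", dest]
        commands.append(command)
    return commands


def copy_out(container, paths, dest=".", follow_links=False):
    """Run the generated commands, stopping at the first failure."""
    for command in docker_cp_commands(container, paths, dest, follow_links):
        subprocess.run(command, check=True)
```

For example, copy_out("quirky_colden", ["/usr/lib64/libpcre.so.1.2.0"]) runs the same command as the one shown above.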

And we should have a folder full of required files like so:

Step 3: Package .so Files as a layer

Once we have our files, we need to package them as a layer. To do this, we place them all in a folder called lib and then zip the lib folder.

This is important, as layers are extracted into the /opt directory of our Lambda function. And since /opt/lib is a special directory where Amazon Linux will look for .so files, we need to ensure that the .so files are stored in a directory named lib. By default, no other directory name will work.
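If you would rather script the packaging, here is a minimal sketch using Python's zipfile module. It assumes the copied files sit flat in a single local folder, and build_layer_zip is a hypothetical helper name. Each file is stored under a lib/ prefix so that it ends up in /opt/lib once the layer is extracted:

```python
import zipfile
from pathlib import Path


def build_layer_zip(source_dir, zip_path):
    """Zip every file in source_dir under a top-level lib/ folder."""
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(source.iterdir()):
            if file.is_file():
                # arcname controls the path inside the archive
                archive.write(file, arcname=f"lib/{file.name}")
    return zip_path
```

Write the zip file somewhere outside source_dir, otherwise the archive would try to include itself.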

Note: the locations where Linux will look for .so files are specified by the LD_LIBRARY_PATH environment variable. In theory you could modify this variable and place the files anywhere you want, but following convention saves us multiple steps of work.
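If you want to confirm which directories are searched, a small sketch like the one below can be dropped into a function. The helper name library_search_path is my own; it simply splits the variable on colons:

```python
import os


def library_search_path(environ=None):
    """Return the directories listed in LD_LIBRARY_PATH, in order."""
    env = os.environ if environ is None else environ
    # Entries are colon separated; empty entries are dropped
    return [entry for entry in env.get("LD_LIBRARY_PATH", "").split(":") if entry]
```

Printing library_search_path() from inside a Lambda function shows whether /opt/lib is on the list for your runtime.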

Then we upload the zip file as a layer through the console (or CLI if you prefer).
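The upload can also be scripted. The sketch below swaps the console for boto3's publish_layer_version call, so it is an alternative to the steps in this post rather than part of them. The function name publish_layer and the runtime default are assumptions, and the client is passed in so that you create it yourself with boto3.client("lambda"):

```python
def publish_layer(client, zip_path, layer_name, runtime="python3.8"):
    """Publish a zip file as a new layer version and return its ARN.

    `client` is expected to be a boto3 Lambda client.
    """
    with open(zip_path, "rb") as handle:
        content = handle.read()
    response = client.publish_layer_version(
        LayerName=layer_name,
        Content={"ZipFile": content},
        CompatibleRuntimes=[runtime],
    )
    return response["LayerVersionArn"]
```

The returned ARN is what you attach to the function in the next step.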

Finally, we can test the functionality by creating a function that references both layers: the one holding the opencv package and the one holding the .so files.
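A minimal test function might look like the sketch below. It is only an illustration, and the shape of the returned dictionary is my own choice. Importing cv2 inside the handler means a missing .so file shows up as a readable error in the response instead of a failed function load:

```python
def handler(event, context):
    """Report whether OpenCV can be imported in this environment."""
    try:
        import cv2  # fails with ImportError if a required .so file is missing
    except ImportError as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "opencv_version": cv2.__version__}
```

If the layer with the .so files is missing, the error field should contain the same "cannot open shared object file" message from the start of this post.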

Conclusion

Copying across .so files is a bit of a manual process, but fortunately tools like lambci and docker make it easier than it would otherwise have been.

It’s also a one-time affair, as these .so files rarely change.

Hopefully this post helps you. And if you want to use opencv in a Python Lambda, both the package and the .so files are available as public layers from Klayers.
