Oct 25, 2023
AI Data Analyst in Cloud Sandbox with LangChain & E2B
We are E2B. We provide sandboxed cloud environments for AI-powered apps and agentic workflows. Check out our sandbox runtime for LLMs.
In this guide, we will build an example LangChain agent that uses the E2B cloud sandbox and GPT-4 to analyze data you upload.
See the final guide and code in the official LangChain documentation here.
Why build your own agent with E2B?
E2B's cloud environments are runtime sandboxes for LLMs. They are an ideal fit for building AI assistants like code interpreters or advanced data-analyzing tools. We can use E2B's Data Analysis Sandbox for our use case.
Compared to assistants running their code locally, e.g. via Docker, the Data Analysis Sandbox allows for safe code execution in a remote environment.
This is a secure way to run unpredictable LLM-generated code: the code executes remotely rather than on your computer, so it cannot harm your machine, e.g., by gaining unauthorized access to sensitive data.
We will create an assistant that will use OpenAI’s GPT-4 and E2B's Data Analysis sandbox to perform analysis on uploaded files using Python.
Let's get to hacking!
Get API keys, import packages
First, we have to ensure that we have the latest version of E2B.
We import E2BDataAnalysisTool and other necessary modules from LangChain. We get our OpenAI API key here and our E2B API key here, and set them as environment variables.
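A minimal sketch of this setup step. The package names and import paths are assumptions taken from the LangChain documentation at the time of writing (newer LangChain releases may have moved them), and the missing_api_keys helper is our own illustration, not part of either library.

```python
# Install the packages first (shell; package names are an assumption):
#   pip install --upgrade langchain e2b openai
import os

# Import paths assumed from the LangChain docs of the time. They are shown as
# comments so this snippet runs even before the packages are installed:
#   from langchain.tools import E2BDataAnalysisTool
#   from langchain.chat_models import ChatOpenAI
#   from langchain.agents import AgentType, initialize_agent

REQUIRED_KEYS = ("OPENAI_API_KEY", "E2B_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_api_keys()
if missing:
    print("Set these environment variables first:", ", ".join(missing))
```

Reading the keys from the environment keeps them out of your source code and notebooks.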
🔎 Find the full OpenAI API documentation here.
Initialize the E2B tool for the LangChain agent
💡When creating an instance of the E2BDataAnalysisTool, you can pass callbacks to listen to the output of the sandbox. This is useful, for example, when creating a more responsive UI, especially in combination with streaming output from LLMs.
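As a rough sketch, the callbacks could be wired up like this. The keyword names on_stdout, on_stderr, and on_artifact are assumptions based on the LangChain integration docs; sandbox_callbacks and create_tool are hypothetical helper names used only for illustration.

```python
def sandbox_callbacks(on_artifact=None):
    """Build the callback keyword arguments we pass to the tool."""
    return {
        # Each callback receives one piece of sandbox output as it arrives.
        "on_stdout": lambda stdout: print("stdout:", stdout),
        "on_stderr": lambda stderr: print("stderr:", stderr),
        # Called for each generated chart; None means "ignore artifacts".
        "on_artifact": on_artifact,
    }


def create_tool(on_artifact=None):
    """Create the sandbox tool (needs langchain, e2b, and E2B_API_KEY)."""
    # Imported lazily so this module loads even without langchain installed.
    from langchain.tools import E2BDataAnalysisTool

    return E2BDataAnalysisTool(**sandbox_callbacks(on_artifact))
```

In a UI you would replace the print calls with code that forwards each chunk to the frontend.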
We define a Python function save_artifact, which handles and saves charts created by Matplotlib. When a chart is generated with plt.show(), this function is called: it prints a message about the newly generated Matplotlib chart, downloads the chart as bytes, and saves it to a directory named "charts".
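A sketch of what save_artifact can look like. It assumes the artifact object exposes a name attribute and a download() method returning bytes, as in the LangChain docs example; the charts_dir parameter and the returned path are our additions.

```python
import os


def save_artifact(artifact, charts_dir="charts"):
    """Save a chart produced in the sandbox to a local directory."""
    print("New matplotlib chart generated:", artifact.name)
    # Download the chart as bytes; how to display it is up to the caller.
    data = artifact.download()
    # Create the target directory on first use.
    os.makedirs(charts_dir, exist_ok=True)
    local_path = os.path.join(charts_dir, os.path.basename(artifact.name))
    with open(local_path, "wb") as f:
        f.write(data)
    return local_path
```

The function takes a single required argument, so it can be passed directly as the artifact callback when the tool is created.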
Upload your data file
You can choose your own CSV data file to upload to the E2B sandbox. In our example, we chose a file about Netflix TV shows. You can download the file here.
The following code reads a file named "netflix.csv" from the local file system. It then uses the e2b_data_analysis_tool.upload_file method to upload the contents of this file to the sandbox and prints the path where the file is saved in the sandbox.
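The upload step might look like the sketch below. The file and description keyword arguments are assumed from the LangChain docs, and upload_csv is a hypothetical wrapper so the tool can be passed in explicitly.

```python
def upload_csv(tool, local_path, description):
    """Upload a local CSV to the sandbox and print where it was saved."""
    with open(local_path) as f:
        # The description tells the agent what the file contains.
        uploaded = tool.upload_file(file=f, description=description)
    print(uploaded)
    return uploaded
```

With the tool from the previous step you would call, for example, upload_csv(e2b_data_analysis_tool, "./netflix.csv", "Data about Netflix TV shows"). A precise description helps the agent pick the right columns later.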
Create tools and initialize the agent
Now we get to set up the LangChain agent. It will be using GPT-4 and the e2b_data_analysis_tool we created earlier.
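A sketch of the agent setup, assuming the LangChain API of the time (initialize_agent, AgentType.OPENAI_FUNCTIONS, ChatOpenAI, and the tool's as_tool() method). build_agent is a hypothetical wrapper name, and newer LangChain releases may expose these under different paths.

```python
def build_agent(e2b_data_analysis_tool):
    """Create a GPT-4 agent that can run code in the E2B sandbox."""
    # Imported lazily so this module loads even without langchain installed.
    from langchain.agents import AgentType, initialize_agent
    from langchain.chat_models import ChatOpenAI

    # Expose the sandbox to the agent as a regular LangChain tool.
    tools = [e2b_data_analysis_tool.as_tool()]
    # temperature=0 keeps the generated analysis code as repeatable as possible.
    llm = ChatOpenAI(model="gpt-4", temperature=0)
    return initialize_agent(
        tools,
        llm,
        agent=AgentType.OPENAI_FUNCTIONS,
        verbose=True,
        handle_parsing_errors=True,
    )
```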
Execute a query
We run the agent on a specific query or task.
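For example, assuming an agent object with the run method that LangChain agents exposed at the time, a query could be sent like this. The question text is only an illustration, and ask is a hypothetical helper.

```python
EXAMPLE_QUERY = (
    "What are the 5 longest movies on netflix released between 2000 and 2010? "
    "Create a chart with their lengths."
)


def ask(agent, question=EXAMPLE_QUERY):
    """Send one natural-language task to the agent and return its answer."""
    return agent.run(question)
```

Asking for a chart makes the agent call plt.show() in the sandbox, which is what triggers the artifact callback.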
Sandbox advanced features
E2B also allows you to install both Python and system (via apt) packages dynamically during runtime like this:
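A sketch of both installs. The method names install_python_packages and install_system_packages are assumed from the LangChain integration docs; the package names and the install_extras wrapper are just examples.

```python
def install_extras(tool):
    """Install one Python package and one apt package in the sandbox."""
    # Python package, installed inside the sandbox.
    tool.install_python_packages("pandas")
    # System package, installed via apt inside the sandbox.
    tool.install_system_packages("sqlite3")
```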
Additionally, you can download any file from the sandbox like this:
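A sketch of a download, assuming a download_file method that takes a path inside the sandbox and returns the file contents as bytes (as described in the LangChain docs). download_to is a hypothetical helper that also writes the bytes to disk.

```python
def download_to(tool, remote_path, local_path):
    """Copy a file from the sandbox to the local file system."""
    # remote_path is a path inside the sandbox, not on your machine.
    data = tool.download_file(remote_path)
    with open(local_path, "wb") as f:
        f.write(data)
    return len(data)  # number of bytes written
```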
Lastly, you can run any shell command inside the sandbox via run_command.
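For instance, the sketch below checks which SQLite version is available in the sandbox. It assumes run_command returns a mapping with stdout, stderr, and exit_code keys, as in the LangChain docs example; sqlite_version is a hypothetical helper.

```python
def sqlite_version(tool):
    """Run `sqlite3 --version` in the sandbox and print the result."""
    output = tool.run_command("sqlite3 --version")
    print("version: ", output["stdout"])
    print("error: ", output["stderr"])
    print("exit code: ", output["exit_code"])
    return output
```

A non-zero exit code here usually means the sqlite3 package has not been installed in the sandbox yet.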
Close the sandbox
When your agent is finished, don't forget to close the sandbox.
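One way to make sure this always happens is a try/finally wrapper. The sketch below assumes the tool exposes a close() method, as in the LangChain docs; run_and_close is a hypothetical helper.

```python
def run_and_close(tool, work):
    """Run `work(tool)` and close the sandbox even if `work` raises."""
    try:
        return work(tool)
    finally:
        # Release the remote sandbox so it does not keep running.
        tool.close()
```

Here work is any function that uses the tool, for example one that builds the agent and runs a query.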
Output
If you try this example with our Netflix file, the output should be saved automatically to your local directory.
See the final guide and code in the official LangChain documentation here. See E2B docs here.
Need help or want to share feedback? Join our Discord.
If you like the guide, please support us with a star on GitHub.
Follow us on X (Twitter).
You can also reach us at hello@e2b.dev.