Browserbase
Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.
Power your AI data retrievals with:
- Serverless Infrastructure providing reliable browsers to extract data from complex UIs
- Stealth Mode with included fingerprinting tactics and automatic captcha solving
- Session Debugger to inspect your Browser Session with network timeline and logs
- Live Debug to quickly debug your automation
Installation and Setup
- Get an API key and Project ID from browserbase.com and set them in environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
- Install the Browserbase SDK:
%pip install browserbase
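Before creating a loader, it can help to confirm that both environment variables from the setup step are actually set. The following is a minimal sketch using only the standard library; the helper name missing_env_vars is hypothetical and not part of the Browserbase SDK or LangChain.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_env_vars(environ=os.environ, names=REQUIRED_VARS):
    """Return the names from `names` that are unset or empty in `environ`.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Browserbase credentials found.")
```

Passing a plain dict as environ makes the check easy to exercise without touching the real environment.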
Loading documents
You can load webpages into LangChain using BrowserbaseLoader. Optionally, you can set the text_content parameter to convert the pages to a text-only representation.
import os

from dotenv import load_dotenv
from langchain_community.document_loaders import BrowserbaseLoader

# Load variables from a local .env file, if one exists
load_dotenv()
BROWSERBASE_API_KEY = os.getenv("BROWSERBASE_API_KEY")
BROWSERBASE_PROJECT_ID = os.getenv("BROWSERBASE_PROJECT_ID")
API Reference: BrowserbaseLoader
loader = BrowserbaseLoader(
    api_key=BROWSERBASE_API_KEY,
    project_id=BROWSERBASE_PROJECT_ID,
    urls=[
        "https://example.com",
    ],
    # Text mode: set to True to retrieve only text content
    text_content=False,
)
docs = loader.load()
# Print the first 61 characters of the first loaded page
print(docs[0].page_content[:61])
Loader Options
- urls: Required. A list of URLs to fetch.
- text_content: Retrieve only text content. Default is False.
- api_key: Browserbase API key. Default is BROWSERBASE_API_KEY env variable.
- project_id: Browserbase Project ID. Default is BROWSERBASE_PROJECT_ID env variable.
- session_id: Optional. Provide an existing Session ID.
- proxy: Optional. Enable/Disable Proxies.
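The options above can be gathered in one place before constructing the loader. This is a sketch under stated assumptions: build_loader_kwargs is a hypothetical helper, not part of LangChain or the Browserbase SDK, and it forwards session_id and proxy unchanged without assuming anything about their accepted values beyond what the list says.

```python
import os


def build_loader_kwargs(urls, text_content=False, session_id=None,
                        proxy=None, environ=os.environ):
    """Collect keyword arguments for BrowserbaseLoader (hypothetical helper).

    Only option names listed under "Loader Options" are used.
    """
    # Accept a single URL string as a convenience.
    if isinstance(urls, str):
        urls = [urls]
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")

    kwargs = {
        "urls": list(urls),
        "text_content": text_content,
        # Read the credentials from the documented environment variables.
        "api_key": environ.get("BROWSERBASE_API_KEY"),
        "project_id": environ.get("BROWSERBASE_PROJECT_ID"),
    }
    # Optional settings are included only when explicitly provided.
    if session_id is not None:
        kwargs["session_id"] = session_id
    if proxy is not None:
        kwargs["proxy"] = proxy
    return kwargs


# Intended usage (requires valid credentials, so it is left commented out):
# from langchain_community.document_loaders import BrowserbaseLoader
# loader = BrowserbaseLoader(**build_loader_kwargs(["https://example.com"], text_content=True))
# docs = loader.load()
```

Leaving the optional keys out when they are not set lets the loader fall back to its own defaults.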
Related
- Document loader conceptual guide
- Document loader how-to guides