Apify MCP
Connect to Apify MCP to run web scraping, browser automation, and data extraction Actors directly from AI workflows.
Connect to Apify MCP to discover and run web scraping and data extraction Actors, retrieve Actor run status and output, search the web for AI pipelines, and browse Apify and Crawlee documentation — all from within your AI agent workflows.
Supports authentication: API Key
What you can build with this connector
| Use case | Tools involved |
|---|---|
| Run a web scraper | apifymcp_search_actors → apifymcp_fetch_actor_details → apifymcp_call_actor → apifymcp_get_actor_output |
| Long-running extraction jobs | apifymcp_call_actor (async) → apifymcp_get_actor_run (poll) → apifymcp_get_actor_output |
| Real-time web research for RAG | apifymcp_rag_web_browser → feed Markdown content into LLM context |
| Find the right Actor for a task | apifymcp_search_actors with keywords → apifymcp_fetch_actor_details for input schema |
| Look up Apify or Crawlee docs | apifymcp_search_apify_docs → apifymcp_fetch_apify_docs for full page content |
Key concepts:
- Actors: Serverless cloud applications on the Apify platform. Each Actor has a specific input schema, so always call `apifymcp_fetch_actor_details` with `output: { inputSchema: true }` before calling an Actor.
- Sync vs async: `apifymcp_call_actor` runs synchronously by default and waits for the result. Pass `async: true` for long-running tasks, then poll with `apifymcp_get_actor_run` and retrieve output with `apifymcp_get_actor_output`.
- Datasets: Actor output is stored in a dataset. Use `apifymcp_get_actor_output` with `fields` and pagination (`limit`, `offset`) to retrieve large result sets efficiently.
- RAG web browser: `apifymcp_rag_web_browser` is a purpose-built tool for AI pipelines. It queries Google Search, scrapes the top N pages, and returns clean Markdown content ready for LLM grounding.
Set up the agent connector
Register your Apify API token with Scalekit so it can authenticate and proxy Actor requests on behalf of your users. Unlike OAuth connectors, Apify MCP uses API token authentication, so there is no redirect URI or OAuth flow.
1. Get an Apify API token
   - Go to console.apify.com and sign in or create a free account.
   - In the left sidebar, click your avatar → Settings → API & Integrations → API tokens.
   - Click + Create new token. Give it a name (e.g., `Agent Auth`) and click Create token.
   - Copy the token immediately. It will not be shown again.
2. Create a connection in Scalekit
   - In the Scalekit dashboard, go to Agent Auth → Connections. Find Apify MCP and click Create.
   - Note the Connection name. You will use this as `connection_name` in your code (e.g., `apifymcp`).
3. Add a connected account

   Connected accounts link a specific user identifier in your system to an Apify API token. Add them via the dashboard for testing, or via the Scalekit API in production.

   Via dashboard (for testing)
   - Open the connection you created and click the Connected Accounts tab → Add account.
   - Fill in:
     - Your User’s ID: a unique identifier for this user in your system (e.g., `user_123`)
     - Apify Token: the token you copied in step 1
   - Click Save.

Via API (for production)

Node.js:

    await scalekit.actions.upsertConnectedAccount({
      connectionName: 'apifymcp',
      identifier: 'user_123', // your user's unique ID
      credentials: { token: 'apify_api_...' },
    });

Python:

    scalekit_client.actions.upsert_connected_account(
        connection_name="apifymcp",
        identifier="user_123",  # your user's unique ID
        credentials={"token": "apify_api_..."},
    )

Connect a user’s Apify account and run web scraping and data extraction Actors through Scalekit. Scalekit handles token storage and tool execution automatically.
Apify MCP is primarily used through Scalekit tools. Use scalekit_client.actions.execute_tool() to discover Actors, fetch their input schemas, run them, and retrieve output — without handling Apify credentials in your application code.
Tool calling
Use this connector when you want an agent to run web scraping or data extraction tasks using Apify Actors.
- Use `apifymcp_search_actors` to discover Actors for a specific platform or use case before deciding which to run.
- Use `apifymcp_fetch_actor_details` to retrieve an Actor’s input schema before calling it. Always pass `output: { inputSchema: true }` to keep the response concise.
- Use `apifymcp_call_actor` to run an Actor synchronously, or with `async: true` for long-running tasks.
- Use `apifymcp_get_actor_run` to poll the status of an async run, and `apifymcp_get_actor_output` to retrieve results once complete.
- Use `apifymcp_rag_web_browser` when you need real-time web content for LLM grounding. It returns clean Markdown from the top search result pages.

Python:

    import os
    from scalekit.client import ScalekitClient

    # Credentials are read from environment variables.
    scalekit_client = ScalekitClient(
        client_id=os.environ["SCALEKIT_CLIENT_ID"],
        client_secret=os.environ["SCALEKIT_CLIENT_SECRET"],
        env_url=os.environ["SCALEKIT_ENV_URL"],
    )

    # Look up (or create) the connected account for this user.
    connected_account = scalekit_client.actions.get_or_create_connected_account(
        connection_name="apifymcp",
        identifier="user_123",
    )

    # Fetch details for an Actor through the connector.
    tool_response = scalekit_client.actions.execute_tool(
        tool_name="apifymcp_fetch_actor_details",
        connected_account_id=connected_account.connected_account.id,
        tool_input={
            "actor": "apify/web-scraper",
        },
    )
    print("Actor details:", tool_response)

Node.js:

    import { ScalekitClient } from '@scalekit-sdk/node';
    import 'dotenv/config';

    // Credentials are read from environment variables.
    const scalekit = new ScalekitClient(
      process.env.SCALEKIT_ENV_URL!,
      process.env.SCALEKIT_CLIENT_ID!,
      process.env.SCALEKIT_CLIENT_SECRET!
    );
    const actions = scalekit.actions;

    // Look up (or create) the connected account for this user.
    const connectedAccount = await actions.getOrCreateConnectedAccount({
      connectionName: 'apifymcp',
      identifier: 'user_123',
    });

    // Fetch details for an Actor through the connector.
    const toolResponse = await actions.executeTool({
      toolName: 'apifymcp_fetch_actor_details',
      connectedAccountId: connectedAccount?.id,
      toolInput: {
        actor: 'apify/web-scraper',
      },
    });
    console.log('Actor details:', toolResponse.data);

Tool list
apifymcp_search_actors
Search the Apify Store to discover Actors for a given use case or platform. Returns Actor names, IDs, descriptions, and usage stats. This tool does not run any scraping; use it to find the right Actor before calling it.
| Name | Type | Required | Description |
|---|---|---|---|
| `keywords` | string | No | Search terms (e.g., "instagram scraper", "google maps"). Leave empty to browse popular Actors. Default: "" |
| `limit` | integer | No | Number of results to return (1–100). Default: 5 |
| `offset` | integer | No | Number of results to skip for pagination. Default: 0 |
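
Paging through Store results comes down to deriving `offset` from a page number. The helper below is a minimal sketch of that arithmetic. The function name `search_actors_input` and its `page` and `page_size` arguments are illustrative and not part of the connector; only `keywords`, `limit`, and `offset` come from the parameter table.

```python
from typing import Any, Dict


def search_actors_input(keywords: str = "", page: int = 0, page_size: int = 5) -> Dict[str, Any]:
    """Build a tool_input dict for apifymcp_search_actors (hypothetical helper).

    `page` is zero-based; `page_size` maps to `limit` and must stay in 1..100.
    """
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    if page < 0:
        raise ValueError("page must be non-negative")
    return {
        "keywords": keywords,
        "limit": page_size,
        "offset": page * page_size,  # skip all results from earlier pages
    }
```

The returned dictionary is intended to be passed as `tool_input` when executing `apifymcp_search_actors`.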
apifymcp_fetch_actor_details
Retrieve detailed information about an Actor, including its input schema, README, pricing, and output schema. Always call this before `apifymcp_call_actor` to understand required and optional input parameters.
| Name | Type | Required | Description |
|---|---|---|---|
| `actor` | string | Yes | The Actor ID or name (e.g., `apify/instagram-scraper`) |
| `output.description` | boolean | No | Include a short description of the Actor |
| `output.inputSchema` | boolean | No | Include the full JSON input schema; use this before calling the Actor |
| `output.mcpTools` | boolean | No | Include MCP tool definitions for the Actor |
| `output.metadata` | boolean | No | Include Actor metadata (version, author, categories) |
| `output.outputSchema` | boolean | No | Include the output data schema |
| `output.pricing` | boolean | No | Include pricing information |
| `output.rating` | boolean | No | Include user ratings and review count |
| `output.readme` | boolean | No | Include the full README (can be very large; use sparingly) |
| `output.stats` | boolean | No | Include usage statistics (total runs, users) |
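
The dotted names in the table (`output.inputSchema` and so on) correspond to keys of a nested `output` object, as in `output: { inputSchema: true }`. The sketch below builds that object from a list of section names. `actor_details_input` and `OUTPUT_SECTIONS` are hypothetical names introduced for illustration; the section names themselves are taken from the table.

```python
from typing import Any, Dict, Iterable

# Section flags listed in the parameter table for apifymcp_fetch_actor_details.
OUTPUT_SECTIONS = {
    "description", "inputSchema", "mcpTools", "metadata", "outputSchema",
    "pricing", "rating", "readme", "stats",
}


def actor_details_input(actor: str, sections: Iterable[str] = ("inputSchema",)) -> Dict[str, Any]:
    """Build a tool_input dict that requests only the named sections (hypothetical helper)."""
    requested = list(sections)
    unknown = sorted(set(requested) - OUTPUT_SECTIONS)
    if unknown:
        raise ValueError(f"unknown output sections: {unknown}")
    # Each requested section becomes a `true` flag inside the nested `output` object.
    return {"actor": actor, "output": {name: True for name in requested}}
```

Defaulting to `inputSchema` alone follows the guidance to keep the response concise before calling an Actor.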
apifymcp_call_actor
Run an Actor from the Apify Store with the specified input. By default it runs synchronously and waits for the result. Use `async: true` for long-running tasks, then track progress with `apifymcp_get_actor_run` and retrieve output with `apifymcp_get_actor_output`.
| Name | Type | Required | Description |
|---|---|---|---|
| `actor` | string | Yes | The Actor ID or name to run (e.g., `apify/web-scraper`) |
| `input` | object | Yes | Input object matching the Actor’s input schema. Fetch the schema first with `apifymcp_fetch_actor_details` |
| `async` | boolean | No | Set to true to start the run and return immediately without waiting for results. Default: false |
| `previewOutput` | boolean | No | Set to true to include a preview of the output dataset in the response (sync mode only) |
| `callOptions.memory` | integer | No | Memory limit for the run in megabytes (e.g., 256, 512, 1024) |
| `callOptions.timeout` | integer | No | Timeout for the run in seconds |
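
One practical detail when building this input in Python: `async` is a reserved word, so it can only be set as a string key in a dictionary, not as a keyword argument. The sketch below assembles the payload under the assumption that `callOptions.memory` and `callOptions.timeout` are keys of a nested `callOptions` object, mirroring how `output` is nested for `apifymcp_fetch_actor_details`. `call_actor_input` and its argument names are hypothetical.

```python
from typing import Any, Dict, Optional


def call_actor_input(
    actor: str,
    actor_input: Dict[str, Any],
    run_async: bool = False,
    memory_mb: Optional[int] = None,
    timeout_secs: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a tool_input dict for apifymcp_call_actor (hypothetical helper)."""
    payload: Dict[str, Any] = {
        "actor": actor,
        "input": actor_input,
        "async": run_async,  # reserved word in Python, so it must be a string key
    }
    # Assumption: the dotted callOptions.* parameters nest under "callOptions".
    call_options: Dict[str, int] = {}
    if memory_mb is not None:
        call_options["memory"] = memory_mb
    if timeout_secs is not None:
        call_options["timeout"] = timeout_secs
    if call_options:
        payload["callOptions"] = call_options
    return payload
```

The contents of `actor_input` depend entirely on the chosen Actor; take the keys from the schema returned by `apifymcp_fetch_actor_details` rather than guessing them.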
apifymcp_get_actor_run
Get the current status and metadata for a specific Actor run. Use this to poll an async run until it completes. Returns run status, timestamps, performance stats, and storage resource IDs.
| Name | Type | Required | Description |
|---|---|---|---|
| `runId` | string | Yes | The ID of the Actor run to check (returned by `apifymcp_call_actor` when `async: true`) |
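
Polling an async run can be sketched as a loop that checks the run until it reaches a terminal state. This is an illustration under stated assumptions, not the connector's prescribed method: `get_run` is a hypothetical callable that executes `apifymcp_get_actor_run` for a `runId` and returns the run as a dictionary, and the status is assumed to be exposed under a `status` key. The terminal values are Apify's run statuses; check the actual tool response to confirm how they surface through the connector.

```python
import time
from typing import Any, Callable, Dict

# Apify's terminal run statuses; any other value means the run is still in progress.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    get_run: Callable[[str], Dict[str, Any]],
    run_id: str,
    poll_interval: float = 5.0,
    max_wait: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `get_run` until the run finishes or `max_wait` seconds of sleeping elapse."""
    waited = 0.0
    while True:
        run = get_run(run_id)
        # Assumption: the run dictionary carries its state under "status".
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if waited >= max_wait:
            raise TimeoutError(f"run {run_id} still not finished after {max_wait:.0f}s")
        sleep(poll_interval)
        waited += poll_interval
```

Once the returned run reports success, its `defaultDatasetId` can be passed to `apifymcp_get_actor_output`. The `sleep` parameter exists only so the loop can be exercised without real delays.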
apifymcp_get_actor_output
Retrieve output dataset items from a completed Actor run. Supports field selection to reduce response size, and pagination for large datasets.
| Name | Type | Required | Description |
|---|---|---|---|
| `datasetId` | string | Yes | The dataset ID to fetch output from (found in the `apifymcp_call_actor` response or `apifymcp_get_actor_run` result as `defaultDatasetId`) |
| `fields` | string | No | Comma-separated list of fields to include, with dot notation for nested fields (e.g., "title,url,metadata.description"). Returns all fields by default |
| `limit` | number | No | Maximum number of items to return. Default: 100 |
| `offset` | number | No | Number of items to skip for pagination. Default: 0 |
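
For large datasets, `limit` and `offset` can drive a simple paging loop. The generator below is a sketch of that pattern. `fetch_page` is a hypothetical callable that executes `apifymcp_get_actor_output` with the given input and returns a plain list of items; the real tool response may wrap the items differently, so adapt the unwrapping accordingly.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Sequence


def iter_output_items(
    fetch_page: Callable[[Dict[str, Any]], List[Dict[str, Any]]],
    dataset_id: str,
    fields: Optional[Sequence[str]] = None,
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every item in a dataset, requesting one page at a time."""
    offset = 0
    while True:
        tool_input: Dict[str, Any] = {
            "datasetId": dataset_id,
            "limit": page_size,
            "offset": offset,
        }
        if fields:
            # The tool expects a single comma-separated string of field names.
            tool_input["fields"] = ",".join(fields)
        items = fetch_page(tool_input)
        yield from items
        if len(items) < page_size:
            return  # a short (or empty) page means the dataset is exhausted
        offset += page_size
```

Selecting only the fields you need keeps each page small, which matters when the items are fed into an LLM context.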
apifymcp_rag_web_browser
Search Google and scrape the top N result pages, returning clean content for use in AI pipelines and RAG (Retrieval-Augmented Generation) workflows. Can also scrape a specific URL directly by passing it as the query.
| Name | Type | Required | Description |
|---|---|---|---|
| `query` | string | Yes | A Google Search query (e.g., "best vector databases 2025") or a specific URL to scrape directly |
| `maxResults` | integer | No | Number of top search result pages to scrape. Default: 3 |
| `outputFormats` | array | No | Content formats to return. Options: "text", "markdown", "html". Default: ["markdown"] |
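
A small validation step before the call can catch an unsupported format early. The sketch below builds the input from the parameters in the table; `rag_browser_input` and `ALLOWED_FORMATS` are hypothetical names, and the client-side checks are a convenience rather than something the connector requires.

```python
from typing import Any, Dict, Sequence

# Format options listed in the parameter table for apifymcp_rag_web_browser.
ALLOWED_FORMATS = ("text", "markdown", "html")


def rag_browser_input(
    query: str,
    max_results: int = 3,
    output_formats: Sequence[str] = ("markdown",),
) -> Dict[str, Any]:
    """Build a tool_input dict for apifymcp_rag_web_browser (hypothetical helper).

    `query` may be a search phrase or a full URL to scrape directly.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    unsupported = [fmt for fmt in output_formats if fmt not in ALLOWED_FORMATS]
    if unsupported:
        raise ValueError(f"unsupported output formats: {unsupported}")
    return {
        "query": query,
        "maxResults": max_results,
        "outputFormats": list(output_formats),
    }
```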
apifymcp_search_apify_docs
Search Apify and Crawlee documentation using full-text search. Returns matching page titles, URLs, and snippets. Follow up with `apifymcp_fetch_apify_docs` to retrieve the full content of a specific page.
| Name | Type | Required | Description |
|---|---|---|---|
| `query` | string | Yes | The search query (e.g., "dataset pagination", "proxy configuration") |
| `docSource` | string | No | Documentation source to search. Options: "apify" (default), "crawlee-js", "crawlee-py" |
| `limit` | number | No | Maximum number of results to return (1–20). Default: 5 |
| `offset` | number | No | Number of results to skip for pagination. Default: 0 |
apifymcp_fetch_apify_docs
Fetch the full content of an Apify or Crawlee documentation page by URL. Use after finding a relevant page with `apifymcp_search_apify_docs`.
| Name | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | The full URL of the documentation page to fetch (e.g., https://docs.apify.com/platform/actors) |
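
The two documentation tools are meant to be chained: search first, then fetch the page you want. The sketch below shows that chain under loudly stated assumptions. `execute_tool` is a hypothetical wrapper that takes a tool name and a `tool_input` dictionary and returns the parsed result, and the search result is assumed to be a list of dictionaries that each carry a `url` key. Neither the wrapper nor that response shape is confirmed by this page.

```python
from typing import Any, Callable, Dict, Optional

# Documentation sources listed in the parameter table for apifymcp_search_apify_docs.
DOC_SOURCES = ("apify", "crawlee-js", "crawlee-py")


def find_and_fetch_doc(
    execute_tool: Callable[[str, Dict[str, Any]], Any],
    query: str,
    doc_source: str = "apify",
) -> Optional[Any]:
    """Search the docs and fetch the top hit; return None when nothing matches."""
    if doc_source not in DOC_SOURCES:
        raise ValueError(f"doc_source must be one of {DOC_SOURCES}")
    results = execute_tool(
        "apifymcp_search_apify_docs",
        {"query": query, "docSource": doc_source, "limit": 1},
    )
    if not results:
        return None
    # Assumption: each search result exposes the page address under "url".
    return execute_tool("apifymcp_fetch_apify_docs", {"url": results[0]["url"]})
```

In an application, `execute_tool` would typically delegate to the Scalekit client's tool execution call with the user's connected account.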