
Apify MCP

Connect to Apify MCP to discover and run web scraping and data extraction Actors, retrieve Actor run status and output, search the web for AI pipelines, and browse Apify and Crawlee documentation — all from within your AI agent workflows.

Supports authentication: API Key

What you can build with this connector
| Use case | Tools involved |
| --- | --- |
| Run a web scraper | apifymcp_search_actors → apifymcp_fetch_actor_details → apifymcp_call_actor → apifymcp_get_actor_output |
| Long-running extraction jobs | apifymcp_call_actor (async) → apifymcp_get_actor_run (poll) → apifymcp_get_actor_output |
| Real-time web research for RAG | apifymcp_rag_web_browser → feed Markdown content into LLM context |
| Find the right Actor for a task | apifymcp_search_actors with keywords → apifymcp_fetch_actor_details for input schema |
| Look up Apify or Crawlee docs | apifymcp_search_apify_docs → apifymcp_fetch_apify_docs for full page content |

Key concepts:

  • Actors: Serverless cloud applications on the Apify platform. Each Actor has a specific input schema — always call apifymcp_fetch_actor_details with output: { inputSchema: true } before calling an Actor.
  • Sync vs async: apifymcp_call_actor runs synchronously by default and waits for the result. Pass async: true for long-running tasks, then poll with apifymcp_get_actor_run and retrieve output with apifymcp_get_actor_output.
  • Datasets: Actor output is stored in a dataset. Use apifymcp_get_actor_output with fields and pagination (limit, offset) to retrieve large result sets efficiently.
  • RAG web browser: apifymcp_rag_web_browser is a purpose-built tool for AI pipelines — it queries Google Search, scrapes the top N pages, and returns clean Markdown content ready for LLM grounding.
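The sync/async split above reduces to a simple poll loop. A minimal sketch, assuming a caller-supplied `fetch_status` callback that stands in for an apifymcp_get_actor_run call (the `wait_for_run` helper is illustrative, not part of the connector; the status strings are Apify's standard run states):

```python
import time

# Terminal statuses for an Apify Actor run (standard Apify status values).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}

def wait_for_run(fetch_status, interval=5.0, max_polls=120):
    """Poll until the run reaches a terminal state and return that status.

    fetch_status is a caller-supplied function standing in for an
    apifymcp_get_actor_run call; it returns the run's status string.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError("Actor run did not finish within the polling budget")
```

Once `wait_for_run` returns "SUCCEEDED", retrieve the results with apifymcp_get_actor_output using the run's defaultDatasetId.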

Register your Apify API token with Scalekit so it can authenticate and proxy Actor requests on behalf of your users. Unlike OAuth connectors, Apify MCP uses API token authentication — there is no redirect URI or OAuth flow.

  1. Get an Apify API token

    • Go to console.apify.com and sign in or create a free account.

    • In the left sidebar, click your avatar → Settings → API & Integrations → API tokens.

    • Click + Create new token. Give it a name (e.g., Agent Auth) and click Create token.

    • Copy the token immediately — it will not be shown again.

  2. Create a connection in Scalekit

    • In the Scalekit dashboard, go to Agent Auth → Connections. Find Apify MCP and click Create.

    • Note the Connection name — you will use this as connection_name in your code (e.g., apifymcp).

  3. Add a connected account

    Connected accounts link a specific user identifier in your system to an Apify API token. Add them via the dashboard for testing, or via the Scalekit API in production.

    Via dashboard (for testing)

    • Open the connection you created and click the Connected Accounts tab → Add account.

    • Fill in:

      • Your User’s ID — a unique identifier for this user in your system (e.g., user_123)
      • Apify Token — the token you copied in step 1
    • Click Save.

    Via API (for production)

    await scalekit.actions.upsertConnectedAccount({
      connectionName: 'apifymcp',
      identifier: 'user_123', // your user's unique ID
      credentials: { token: 'apify_api_...' },
    });

Connect a user’s Apify account and run web scraping and data extraction Actors through Scalekit. Scalekit handles token storage and tool execution automatically.

Apify MCP is primarily used through Scalekit tools. Use scalekit_client.actions.execute_tool() to discover Actors, fetch their input schemas, run them, and retrieve output — without handling Apify credentials in your application code.

Tool calling

Use this connector when you want an agent to run web scraping or data extraction tasks using Apify Actors.

  • Use apifymcp_search_actors to discover Actors for a specific platform or use case before deciding which to run.
  • Use apifymcp_fetch_actor_details to retrieve an Actor’s input schema before calling it — always pass output: { inputSchema: true } to keep the response concise.
  • Use apifymcp_call_actor to run an Actor synchronously, or with async: true for long-running tasks.
  • Use apifymcp_get_actor_run to poll the status of an async run, and apifymcp_get_actor_output to retrieve results once complete.
  • Use apifymcp_rag_web_browser when you need real-time web content for LLM grounding — it returns clean Markdown from the top search result pages.
examples/apifymcp_fetch_actor_details.py

import os

from scalekit.client import ScalekitClient

scalekit_client = ScalekitClient(
    client_id=os.environ["SCALEKIT_CLIENT_ID"],
    client_secret=os.environ["SCALEKIT_CLIENT_SECRET"],
    env_url=os.environ["SCALEKIT_ENV_URL"],
)

connected_account = scalekit_client.actions.get_or_create_connected_account(
    connection_name="apifymcp",
    identifier="user_123",
)

tool_response = scalekit_client.actions.execute_tool(
    tool_name="apifymcp_fetch_actor_details",
    connected_account_id=connected_account.connected_account.id,
    tool_input={
        "actor": "apify/web-scraper",
        # Request only the input schema to keep the response concise.
        "output": {"inputSchema": True},
    },
)

print("Actor details:", tool_response)
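Building on the example above, the full discover → inspect → run sequence can be sketched as the ordered `execute_tool` payloads an agent would send. The keywords, Actor name, and placeholder IDs below are illustrative; fetch the real input schema before constructing the `input` object:

```python
# Ordered (tool_name, tool_input) pairs for a discover-then-run workflow.
# Each pair would be passed to scalekit_client.actions.execute_tool().
workflow = [
    # 1. Find candidate Actors for the task.
    ("apifymcp_search_actors",
     {"keywords": "google maps scraper", "limit": 5}),
    # 2. Inspect only the input schema of the chosen Actor.
    ("apifymcp_fetch_actor_details",
     {"actor": "apify/google-maps-scraper",
      "output": {"inputSchema": True}}),
    # 3. Start a long-running run without blocking.
    ("apifymcp_call_actor",
     {"actor": "apify/google-maps-scraper",
      "input": {"searchStrings": ["coffee shops in Austin"]},
      "async": True}),
    # 4. Poll the run until it completes (runId comes from step 3's response).
    ("apifymcp_get_actor_run", {"runId": "<run-id-from-step-3>"}),
    # 5. Fetch the results (datasetId is the run's defaultDatasetId).
    ("apifymcp_get_actor_output",
     {"datasetId": "<default-dataset-id>", "limit": 100}),
]
```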

apifymcp_search_actors

Search the Apify Store to discover Actors for a given use case or platform. Returns Actor names, IDs, descriptions, and usage stats. Does not run any scraping — use this to find the right Actor before calling it.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| keywords | string | No | Search terms (e.g., "instagram scraper", "google maps"). Leave empty to browse popular Actors. Default: "" |
| limit | integer | No | Number of results to return (1–100). Default: 5 |
| offset | integer | No | Number of results to skip for pagination. Default: 0 |
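The `limit`/`offset` pair follows the usual pagination pattern, shared by several tools in this connector. A small hypothetical helper to advance a payload by one page:

```python
def next_page(params):
    """Return a copy of a tool_input payload advanced by one page.

    Works for any payload that paginates with limit/offset, e.g. the
    inputs for apifymcp_search_actors or apifymcp_search_apify_docs.
    The default limit of 5 matches this tool's documented default.
    """
    return {**params, "offset": params.get("offset", 0) + params.get("limit", 5)}

first = {"keywords": "instagram scraper", "limit": 10, "offset": 0}
second = next_page(first)  # same keywords, offset advanced to 10
```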

apifymcp_fetch_actor_details

Retrieve detailed information about an Actor, including its input schema, README, pricing, and output schema. Always call this before apifymcp_call_actor to understand required and optional input parameters.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| actor | string | Yes | The Actor ID or name (e.g., apify/instagram-scraper) |
| output.description | boolean | No | Include a short description of the Actor |
| output.inputSchema | boolean | No | Include the full JSON input schema — use this before calling the Actor |
| output.mcpTools | boolean | No | Include MCP tool definitions for the Actor |
| output.metadata | boolean | No | Include Actor metadata (version, author, categories) |
| output.outputSchema | boolean | No | Include the output data schema |
| output.pricing | boolean | No | Include pricing information |
| output.rating | boolean | No | Include user ratings and review count |
| output.readme | boolean | No | Include the full README (can be very large — use sparingly) |
| output.stats | boolean | No | Include usage statistics (total runs, users) |

apifymcp_call_actor

Run an Actor from the Apify Store with the specified input. By default runs synchronously and waits for the result. Use async: true for long-running tasks, then track progress with apifymcp_get_actor_run and retrieve output with apifymcp_get_actor_output.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| actor | string | Yes | The Actor ID or name to run (e.g., apify/web-scraper) |
| input | object | Yes | Input object matching the Actor's input schema. Fetch the schema first with apifymcp_fetch_actor_details |
| async | boolean | No | Set to true to start the run and return immediately without waiting for results. Default: false |
| previewOutput | boolean | No | Set to true to include a preview of the output dataset in the response (sync mode only) |
| callOptions.memory | integer | No | Memory limit for the run in megabytes (e.g., 256, 512, 1024) |
| callOptions.timeout | integer | No | Timeout for the run in seconds |
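As a sketch, a synchronous call with resource limits might use a payload like the following. The `startUrls` key is typical of apify/web-scraper but is an assumption here; always confirm the keys against the schema returned by apifymcp_fetch_actor_details:

```python
tool_input = {
    "actor": "apify/web-scraper",
    # Must match the Actor's input schema (fetch it first); the
    # startUrls shape below is an assumption for illustration.
    "input": {"startUrls": [{"url": "https://example.com"}]},
    "previewOutput": True,      # sync mode only: preview the output dataset
    "callOptions": {
        "memory": 512,          # cap the run at 512 MB
        "timeout": 300,         # abort the run after 5 minutes
    },
}
```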

apifymcp_get_actor_run

Get the current status and metadata for a specific Actor run. Use this to poll an async run until it completes. Returns run status, timestamps, performance stats, and storage resource IDs.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| runId | string | Yes | The ID of the Actor run to check (returned by apifymcp_call_actor when async: true) |

apifymcp_get_actor_output

Retrieve output dataset items from a completed Actor run. Supports field selection to reduce response size, and pagination for large datasets.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| datasetId | string | Yes | The dataset ID to fetch output from (found in the apifymcp_call_actor response or apifymcp_get_actor_run result as defaultDatasetId) |
| fields | string | No | Comma-separated list of fields to include, with dot notation for nested fields (e.g., "title,url,metadata.description"). Returns all fields by default |
| limit | number | No | Maximum number of items to return. Default: 100 |
| offset | number | No | Number of items to skip for pagination. Default: 0 |
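For datasets larger than one page, `limit`/`offset` support a simple accumulation loop. A minimal sketch, assuming a `fetch_page` callback that stands in for repeated apifymcp_get_actor_output calls:

```python
def fetch_all_items(fetch_page, page_size=100):
    """Accumulate every item in a dataset by paging with limit/offset.

    fetch_page(limit, offset) stands in for an apifymcp_get_actor_output
    call and must return a list of dataset items.
    """
    items, offset = [], 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        items.extend(page)
        if len(page) < page_size:  # a short page means we reached the end
            return items
        offset += page_size
```

Combine this with a `fields` selection (e.g., "title,url") to keep each page small when items are large.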

apifymcp_rag_web_browser

Search Google and scrape the top N result pages, returning clean content for use in AI pipelines and RAG (Retrieval-Augmented Generation) workflows. Can also scrape a specific URL directly by passing it as the query.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | A Google Search query (e.g., "best vector databases 2025") or a specific URL to scrape directly |
| maxResults | integer | No | Number of top search result pages to scrape (default: 3) |
| outputFormats | array | No | Content formats to return. Options: "text", "markdown", "html". Default: ["markdown"] |
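Downstream of this tool, the returned Markdown pages are typically concatenated into the LLM prompt. A hypothetical helper with a rough character budget (the payload keys match the table above; `build_context` and the budget value are illustrative, not part of the connector):

```python
tool_input = {
    "query": "best vector databases 2025",  # or a direct URL to scrape one page
    "maxResults": 3,
    "outputFormats": ["markdown"],
}

def build_context(markdown_pages, max_chars=8000):
    """Join scraped Markdown pages into one grounding block for an LLM,
    truncated to a rough character budget."""
    joined = "\n\n---\n\n".join(markdown_pages)
    return joined[:max_chars]
```

A token-aware truncation (counting model tokens rather than characters) would be more precise, but a character budget is often a good enough first cut.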

apifymcp_search_apify_docs

Search Apify and Crawlee documentation using full-text search. Returns matching page titles, URLs, and snippets. Follow up with apifymcp_fetch_apify_docs to retrieve the full content of a specific page.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The search query (e.g., "dataset pagination", "proxy configuration") |
| docSource | string | No | Documentation source to search. Options: "apify" (default), "crawlee-js", "crawlee-py" |
| limit | number | No | Maximum number of results to return (1–20). Default: 5 |
| offset | number | No | Number of results to skip for pagination. Default: 0 |

apifymcp_fetch_apify_docs

Fetch the full content of an Apify or Crawlee documentation page by URL. Use after finding a relevant page with apifymcp_search_apify_docs.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The full URL of the documentation page to fetch (e.g., https://docs.apify.com/platform/actors) |