API Documentation

Start a job with a single POST request, then poll, fetch, and export its results. This reference is public; no login is required to read it.

Introduction

All requests go to the base URL below. Authenticate every request with a bearer token: send your API key from the Developers page in the Authorization header.

Base URL: https://api.pulldata.com
Auth header: Authorization: Bearer pd_live_...
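The base URL and Authorization header can be set once in client code. The following is a minimal sketch using Python's standard urllib; the helper name build_request is hypothetical and not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://api.pulldata.com"


def build_request(api_key, method, path, body=None):
    """Build an authenticated request object without sending it.

    build_request is an illustrative helper, not part of the PullData API.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies need the Content-Type header used in the cURL examples.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

Pass the returned object to urllib.request.urlopen to send it.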

POST /jobs

Start a scraper job. Pass a scraperId and the scraper's inputParams. Returns a job_id you can poll.

cURL · POST /jobs
curl -X POST https://api.pulldata.com/jobs \
  -H "Authorization: Bearer pd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "scraperId": "google-maps",
    "inputParams": {
      "categories": ["restaurant"],
      "locations": ["Austin, TX"],
      "max_total_results": 200,
      "scrape_emails": "yes",
      "require_website": false
    }
  }'
200 OK
Response
{
  "job_id": "run_8f3k2a",
  "status": "queued"
}
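The same call can be made from Python. This sketch assumes only the request and response fields shown in the cURL example; the function name start_job and its opener parameter (there so the call can be stubbed in tests) are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.pulldata.com"


def start_job(api_key, scraper_id, input_params, opener=urllib.request.urlopen):
    """POST /jobs and return the job_id from the JSON response."""
    payload = {"scraperId": scraper_id, "inputParams": input_params}
    request = urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # opener defaults to urlopen; a stub can be passed in for offline testing.
    with opener(request) as response:
        body = json.load(response)
    return body["job_id"]
```

For example, start_job(api_key, "google-maps", {"categories": ["restaurant"], "locations": ["Austin, TX"]}) would return an identifier such as "run_8f3k2a" if the API responds as in the sample.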

GET /jobs/{id}/poll-results

Poll a running or finished job. Returns the current status, the running total of rows, the wallet amount deducted so far, the billing rates, and the page of results selected by limit and offset.

cURL · GET /jobs/{id}/poll-results
curl "https://api.pulldata.com/jobs/run_8f3k2a/poll-results?limit=100&offset=0" \
  -H "Authorization: Bearer pd_live_..."
200 OK
Response
{
  "status": "running",
  "total": 47,
  "amount_deducted": "$0.71",
  "amount_deducted_cents": 71,
  "billing": {
    "ratePerRowUsd": 0.015,
    "ratePerDollarRows": 66
  },
  "results": [
    {
      "id": 1,
      "name": "Alamo Plumbing Co.",
      "phone": "(512) 555-0142",
      "website": "alamoplumbing.com"
    }
  ]
}
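A polling loop around this endpoint might look like the sketch below. The reference shows only the "queued" and "running" statuses, so treating any other status as terminal is an assumption, as are the function names, the 5-second interval, and the poll cap.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://api.pulldata.com"
# Only these two statuses appear in the examples. Assumption: any other
# status means the job has stopped.
IN_PROGRESS = {"queued", "running"}


def poll_once(api_key, job_id, limit=100, offset=0, opener=urllib.request.urlopen):
    """Fetch one poll-results snapshot as a dict."""
    query = urllib.parse.urlencode({"limit": limit, "offset": offset})
    request = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/poll-results?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with opener(request) as response:
        return json.load(response)


def wait_for_job(api_key, job_id, interval_s=5.0, max_polls=120,
                 opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll until the job leaves the in-progress statuses, then return the snapshot."""
    for _ in range(max_polls):
        snapshot = poll_once(api_key, job_id, opener=opener)
        if snapshot["status"] not in IN_PROGRESS:
            return snapshot
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} still in progress after {max_polls} polls")
```

Each poll counts against the request rate limit of the key, so a longer interval is the safer choice for long jobs.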

GET /jobs/{id}/results

Fetch a single page of stored results. Paginate with limit and offset.

cURL · GET /jobs/{id}/results
curl "https://api.pulldata.com/jobs/run_8f3k2a/results?limit=100&offset=0" \
  -H "Authorization: Bearer pd_live_..."
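To walk every page, advance offset by limit until a page comes back short. The sketch below assumes the response carries rows under a "results" key, as the poll-results response does; this endpoint's response body is not shown in the reference, and iter_results is a hypothetical name.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.pulldata.com"


def iter_results(api_key, job_id, page_size=100, opener=urllib.request.urlopen):
    """Yield every stored row of a job, fetching one page at a time."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    offset = 0
    while True:
        query = urllib.parse.urlencode({"limit": page_size, "offset": offset})
        request = urllib.request.Request(
            f"{BASE_URL}/jobs/{job_id}/results?{query}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with opener(request) as response:
            page = json.load(response)
        # Assumption: rows arrive under "results", mirroring poll-results.
        rows = page.get("results", [])
        yield from rows
        # A short or empty page means there is nothing left to fetch.
        if len(rows) < page_size:
            return
        offset += page_size
```

Because it is a generator, rows can be processed as they arrive without holding the whole result set in memory.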

GET /jobs/{id}/export

Download a finished job as a file. The format query parameter accepts csv or json.

cURL · GET /jobs/{id}/export
curl -L "https://api.pulldata.com/jobs/run_8f3k2a/export?format=csv" \
  -H "Authorization: Bearer pd_live_..." \
  -o results.csv
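A Python equivalent of the download, as a sketch: it validates the format locally before calling the API and writes the body to disk. The function name download_export is hypothetical, and the whole file is read into memory, which suits modest exports only.

```python
import urllib.parse
import urllib.request
from pathlib import Path

BASE_URL = "https://api.pulldata.com"
SUPPORTED_FORMATS = ("csv", "json")  # the two formats the reference lists


def download_export(api_key, job_id, fmt, dest, opener=urllib.request.urlopen):
    """Save a finished job's export to dest and return the path written."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"format must be one of {SUPPORTED_FORMATS}, got {fmt!r}")
    query = urllib.parse.urlencode({"format": fmt})
    request = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/export?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # urlopen follows HTTP redirects by default, like curl -L.
    with opener(request) as response:
        path = Path(dest)
        path.write_bytes(response.read())
    return path
```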

Scrapers & wallet rates

Each scraper charges a fixed USD wallet amount per returned row. Failed runs deduct $0.00.

scraperId              Scraper                USD / row  Rows / $1
google-maps            Google Maps Leads      $0.015     66
google-search-results  Google Search Results  $0.020     50
contact-details        Contact Details        $0.050     20
airbnb-listings        Airbnb Listings        $0.030     33
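The per-row rates make it easy to estimate a job's cost before starting it. The sketch below copies the rates listed above; rounding half-cent amounts up is an assumption inferred from the sample poll response (47 rows at $0.015 reported as $0.71), and both function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# USD per returned row, copied from the rate list in this section.
RATE_PER_ROW_USD = {
    "google-maps": Decimal("0.015"),
    "google-search-results": Decimal("0.020"),
    "contact-details": Decimal("0.050"),
    "airbnb-listings": Decimal("0.030"),
}


def estimate_cost(scraper_id, rows):
    """Estimated wallet deduction in USD for a number of returned rows."""
    cost = RATE_PER_ROW_USD[scraper_id] * rows
    # Assumption: half-cent amounts round up, as the sample response suggests.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def rows_per_dollar(scraper_id):
    """Whole rows that $1.00 buys; the fractional part is dropped."""
    return int(Decimal(1) / RATE_PER_ROW_USD[scraper_id])
```

Under these assumptions, a google-maps job capped at max_total_results of 200 would cost at most estimate_cost("google-maps", 200), since only returned rows are charged.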

POST /scrapers/community-submissions

Developers can submit scrapers for review with a manifest, config schema, source URL, review worker URL, pricing, and test evidence. Passing submissions can be approved by PullData and published into the same scraper library.

cURL · Submit scraper
curl -X POST https://api.pulldata.com/scrapers/community-submissions \
  -H "Authorization: Bearer pd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "example-directory",
    "name": "Example Directory Scraper",
    "description": "Extracts public listings from Example Directory with stable pagination.",
    "costPerRow": 2,
    "workerUrl": "https://example-scraper.yourdomain.com",
    "sourceUrl": "https://github.com/your-org/example-directory-scraper",
    "manifest": {
      "runtime": "python",
      "healthEndpoint": "/health",
      "asyncJobEndpoint": "/scrape-client-api",
      "resultEndpoint": "/job/{jobId}"
    },
    "configSchema": {
      "fields": [
        { "key": "query", "type": "text", "required": true }
      ]
    },
    "testCommand": "pytest"
  }'
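Before submitting, a client-side check can catch a payload that omits a field used in the example. This is a sketch only: the reference does not state which fields the API treats as mandatory, so the key list simply mirrors the example body, and missing_submission_keys is a hypothetical helper.

```python
# Top-level keys used in the example submission. Assumption: the API may
# accept payloads without some of them; the reference does not say.
EXAMPLE_KEYS = (
    "slug", "name", "description", "costPerRow", "workerUrl",
    "sourceUrl", "manifest", "configSchema", "testCommand",
)


def missing_submission_keys(submission):
    """Return the example keys that are absent from a submission dict."""
    return [key for key in EXAMPLE_KEYS if key not in submission]
```

An empty list means the payload has the same top-level shape as the example; it says nothing about whether the values would pass review.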

Rate limits

Limits come from the plan that owns the key — requests per minute, rows per second, and concurrent jobs. When you exceed a limit, the API returns 429 Too Many Requests. See your current limits on the Developers page.

Rate: 120 req/min
Throughput: 20 rows/s
Concurrency: 5 jobs
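A client can absorb occasional 429 responses by retrying with exponential backoff. The sketch below is one way to do that with the standard library. Retry-After is a standard HTTP header, but this reference does not say whether the API sends it, so honoring it is an assumption, as are the function name and the default delays.

```python
import time
import urllib.error
import urllib.request


def send_with_backoff(request, max_attempts=5, base_delay_s=1.0,
                      opener=urllib.request.urlopen, sleep=time.sleep):
    """Send a request, retrying on 429 with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as exc:
            # Give up on other errors, or once the attempts are used up.
            if exc.code != 429 or attempt == max_attempts - 1:
                raise
            headers = exc.headers or {}
            retry_after = headers.get("Retry-After")
            if retry_after and retry_after.isdigit():
                # Assumption: the API may send Retry-After in whole seconds.
                delay = float(retry_after)
            else:
                delay = base_delay_s * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
```

Errors other than 429 are re-raised immediately, so authentication or validation failures are not retried.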