Scrape Jobs

Start async scrape jobs to collect fresh feature requests. Jobs run in the background and can be polled for progress.

Start a Scrape Job

POST /api/v1/scrape

Start an async scrape job to collect fresh feature requests from Reddit, X, and GitHub.

Request Body

Field      Type    Required  Description
topic      string  required  Topic to scrape (2-100 characters)
platforms  array   optional  Platforms to scrape (default: reddit, x, github)

Notes

  • Returns immediately with a jobId; poll GET /api/v1/scrape/{jobId} for status (see the request sketch after these notes)
  • Counts against your realtime quota (scraping is expensive)
  • Platforms default to all three if not specified

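A minimal sketch of starting a job with fetch in TypeScript. The base URL, Bearer auth header, and API_KEY environment variable are assumptions for illustration, not part of this spec.

// Hypothetical base URL and auth scheme; substitute your deployment's values.
const BASE_URL = "https://api.example.com";

async function startScrapeJob(topic: string, platforms?: string[]): Promise<string> {
  const res = await fetch(`${BASE_URL}/api/v1/scrape`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Auth header name is an assumption; use whatever your account requires.
      Authorization: `Bearer ${process.env.API_KEY}`,
    },
    // platforms is optional and defaults to all three when omitted.
    body: JSON.stringify(platforms ? { topic, platforms } : { topic }),
  });
  if (!res.ok) throw new Error(`Scrape request failed: ${res.status}`);
  const { data } = await res.json();
  return data.jobId; // poll GET /api/v1/scrape/{jobId} with this ID
}
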
Response Example

200 OK
{
  "data": {
    "jobId": "jh7abc123def456",
    "status": "pending",
    "topic": "graphene battery",
    "platforms": [
      "reddit",
      "x"
    ]
  },
  "usage": {
    "cached": {
      "used": 10,
      "limit": 1000,
      "remaining": 990
    },
    "realtime": {
      "used": 3,
      "limit": 500,
      "remaining": 497
    }
  }
}

Get Job Status

GET /api/v1/scrape/{jobId}

Get the status and progress of a scrape job.

Path Parameters

Name   Type    Required  Description
jobId  string  required  Job ID returned from POST /api/v1/scrape

Notes

  • Free - does not count against any quota
  • Poll this endpoint to track job progress (a polling sketch follows these notes)
  • Status values: pending, running, completed, failed
  • Progress per platform: pending, running, done, error, skipped

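A minimal polling sketch in TypeScript, assuming the same hypothetical base URL and Bearer auth header as above; it checks the job every few seconds until it reaches a terminal status.

const BASE_URL = "https://api.example.com"; // hypothetical; match your deployment

// Resolves with the final job payload once status is "completed" or "failed".
async function waitForScrapeJob(jobId: string, intervalMs = 5000) {
  while (true) {
    const res = await fetch(`${BASE_URL}/api/v1/scrape/${jobId}`, {
      headers: { Authorization: `Bearer ${process.env.API_KEY}` },
    });
    if (!res.ok) throw new Error(`Status check failed: ${res.status}`);
    const { data } = await res.json();
    if (data.status === "completed" || data.status === "failed") return data;
    // This endpoint is free, but a short delay between polls is still polite.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
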
Response Example

200 OK
{
  "data": {
    "jobId": "jh7abc123def456",
    "status": "completed",
    "topic": "graphene battery",
    "platforms": [
      "reddit",
      "x"
    ],
    "progress": {
      "reddit": "done",
      "x": "done"
    },
    "requestsCreated": 12,
    "startedAt": "2024-12-21T12:00:00.000Z",
    "completedAt": "2024-12-21T12:01:30.000Z"
  }
}