API Tutorial

How to scrape WooCommerce products with an API

This tutorial walks you through extracting products from any public WooCommerce store using a REST API: creating a key, sending your first request, paginating through a full catalog, reading product variations, and handling errors cleanly. Section 6 collects working examples in cURL, PHP, JavaScript and Python that you can copy and adapt.

Scraping a WooCommerce store by parsing HTML is fragile: themes change, markup shifts, and a single layout update can break your extractor overnight. A purpose-built API removes that fragility. You send a request with the store URL, and you receive clean, structured JSON containing titles, prices, SKUs, images, categories and variations. The rest of this guide assumes you are using the WooCommerce Scraper API at https://woocommerce-scraper.com, but the patterns (authentication, pagination, error handling) transfer to almost any well-designed scraping API.

1. Prerequisites and getting an API key

Before your first call, you need an account and an API key. Create an account, then open Profile > API Keys in the dashboard. Two kinds of keys are available:

  • Test key (prefixed wcs_test_): free, runs against the sandbox, ideal for development and integration testing. It is rate limited to 20 requests per minute.
  • Live key (prefixed wcs_live_): hits real stores and returns real data. It requires an active plan and is rate limited to 120 requests per minute.

Treat the key like a password. It carries your billing and quota, so never commit it to a public repository, never embed it in client-side JavaScript that ships to browsers, and never paste it into a screenshot or support ticket. Store it in an environment variable or a secrets manager and read it at runtime. If a key leaks, rotate it immediately from the same Profile > API Keys screen.
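
As a minimal sketch of reading the key at runtime, the helper below pulls it from an environment variable and checks the prefix so that a test key is never mistaken for a live one. The function name load_api_key is hypothetical, and WCS_API_KEY is simply the variable name the code examples in this tutorial use; neither is mandated by the API.

```python
import os

TEST_PREFIX = "wcs_test_"
LIVE_PREFIX = "wcs_live_"


def load_api_key(env=None, var_name="WCS_API_KEY"):
    """Return (key, mode), where mode is "test" or "live".

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise in tests. The variable name is an assumption.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if key.startswith(TEST_PREFIX):
        return key, "test"
    if key.startswith(LIVE_PREFIX):
        return key, "live"
    raise RuntimeError(f"{var_name} does not start with wcs_test_ or wcs_live_")
```

Failing fast at startup when the variable is missing tends to be easier to diagnose than a 401 on the first request.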

For the full reference on authentication, endpoints, limits and pricing, see the complete WooCommerce Scraper API guide. If you also have access to the official store, it is worth reading WooCommerce REST API vs scraping to understand when each approach fits.

2. Your first request

The products endpoint is GET /api/v1/products. It takes a single required query parameter, store, set to the base URL of the WooCommerce shop you want to read. Authentication is a bearer token passed in the Authorization header. Here is the minimal call:

GET https://woocommerce-scraper.com/api/v1/products?store=https://example-store.com
Authorization: Bearer wcs_live_...
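
The raw request above shows the store URL inside the query string as-is. When building the URL in code it is safer to percent-encode it, so that any characters in the store address cannot be confused with the API's own parameters. A small sketch using only the Python standard library follows; build_products_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://woocommerce-scraper.com/api/v1/products"


def build_products_url(store, **params):
    """Return the products URL with `store` and any extra params percent-encoded."""
    query = {"store": store, **params}
    return f"{BASE_URL}?{urlencode(query)}"


print(build_products_url("https://example-store.com"))
# https://woocommerce-scraper.com/api/v1/products?store=https%3A%2F%2Fexample-store.com
```

HTTP client libraries usually perform this encoding for you when parameters are passed separately from the base URL.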

Every response follows the same envelope. The actual products live under data, paging information lives under meta, and a request_id is returned for support and debugging. A trimmed example:

{
  "data": [
    {
      "id": 1042,
      "name": "Classic Cotton T-Shirt",
      "sku": "TSHIRT-CLS",
      "price": "19.90",
      "currency": "USD",
      "permalink": "https://example-store.com/product/classic-cotton-t-shirt",
      "images": ["https://example-store.com/wp-content/uploads/tshirt.jpg"],
      "categories": ["Apparel", "T-Shirts"],
      "variations": []
    }
  ],
  "meta": {
    "page": 1,
    "per_page": 20,
    "count": 20
  },
  "request_id": "req_8f3a1c7e"
}

If you get a JSON body shaped like this, your key works and the store is reachable. The next step is reading more than the first page.
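
One way to make that shape check explicit is a small guard that runs on every decoded body before the rest of your code touches it. This is only a sketch: unpack_envelope is a hypothetical name, and it checks nothing beyond the three top-level keys shown in the example above.

```python
def unpack_envelope(body):
    """Split a decoded response body into (products, meta, request_id).

    Raises ValueError when the body does not match the envelope
    described in this section.
    """
    if not isinstance(body, dict):
        raise ValueError("response body is not a JSON object")
    missing = [key for key in ("data", "meta", "request_id") if key not in body]
    if missing:
        raise ValueError("unexpected response shape, missing: " + ", ".join(missing))
    if not isinstance(body["data"], list):
        raise ValueError("'data' should be a list of products")
    return body["data"], body["meta"], body["request_id"]


sample = {
    "data": [{"id": 1042, "name": "Classic Cotton T-Shirt", "price": "19.90"}],
    "meta": {"page": 1, "per_page": 20, "count": 1},
    "request_id": "req_8f3a1c7e",
}
products, meta, request_id = unpack_envelope(sample)
print(len(products), meta["page"], request_id)  # 1 1 req_8f3a1c7e
```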

3. Paginating through the catalog

A single request returns one page of products. Control paging with two parameters: page (1-based) and per_page. The maximum value for per_page is 100, so request 100 to minimise the number of round trips:

GET /api/v1/products?store=https://example-store.com&page=1&per_page=100
GET /api/v1/products?store=https://example-store.com&page=2&per_page=100
GET /api/v1/products?store=https://example-store.com&page=3&per_page=100

The robust pattern is to loop: start at page=1, keep incrementing, and stop when a page comes back with an empty data array. Each response reports how many items it actually returned in meta.count. A count of 0 means you are past the end of the catalog; a count below your requested per_page means you have just read the final, partial page and can stop without sending another request. Do not hard-code a fixed page count: catalogs grow, and looping until the data runs out is the only reliable way to capture every product.
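
The stopping rule can be sketched independently of any HTTP library. In the generator below, fetch_page is a hypothetical callable that returns the decoded JSON envelope for one page; in real code it would wrap the request to /api/v1/products. The sketch counts the items in data directly, which should agree with meta.count as described above.

```python
from typing import Callable, Dict, Iterator


def iter_products(fetch_page: Callable[[int, int], Dict], per_page: int = 100) -> Iterator[Dict]:
    """Yield every product, calling fetch_page(page, per_page) until the catalog ends."""
    page = 1
    while True:
        products = fetch_page(page, per_page)["data"]
        yield from products
        # An empty or partial page means nothing follows it.
        if len(products) < per_page:
            break
        page += 1


# Demo against an in-memory catalog of 250 fake products (no network involved).
catalog = [{"id": i} for i in range(250)]


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return {"data": catalog[start:start + per_page]}


print(len(list(iter_products(fake_fetch))))  # 250
```

With 250 products and per_page=100 the loop makes three calls (100, 100 and 50 items). A catalog of exactly 200 products also takes three calls, because only the empty third page reveals the end.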

4. Reading product variations

WooCommerce stores mainly sell two kinds of products: simple products with a single price, and variable products that expose options such as size and color, each combination with its own price and SKU. In the API response, every product carries a variations array. For a simple product the array is empty. For a variable product it lists each purchasable combination:

{
  "id": 2087,
  "name": "Premium Hoodie",
  "price": "49.00",
  "variations": [
    { "id": 2088, "sku": "HOOD-S-BLK", "attributes": {"size": "S", "color": "Black"}, "price": "49.00", "stock_status": "instock" },
    { "id": 2089, "sku": "HOOD-M-BLK", "attributes": {"size": "M", "color": "Black"}, "price": "49.00", "stock_status": "instock" },
    { "id": 2090, "sku": "HOOD-L-RED", "attributes": {"size": "L", "color": "Red"},  "price": "52.00", "stock_status": "outofstock" }
  ]
}

Because variations are embedded in the same response, you do not need a second request per product to get pricing per size or per color. When you flatten the data into a spreadsheet or a database, decide up front whether one row equals one product (and you summarise the price range) or one row equals one variation (and you repeat the parent fields). If you plan to push the data into another platform, the CSV export already flattens variations for you.
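
Both flattening choices can be sketched in a few lines. The helpers below are hypothetical and rely only on the fields shown in the example response above; prices are parsed with Decimal because the API returns them as strings and floats would introduce rounding noise.

```python
from decimal import Decimal


def variation_rows(product):
    """One row per variation, repeating the parent fields on each row.

    A simple product (empty variations array) yields a single row of its own.
    """
    parent = {"product_id": product["id"], "name": product["name"]}
    variations = product.get("variations") or []
    if not variations:
        return [{**parent, "sku": product.get("sku"), "price": product["price"], "attributes": {}}]
    return [
        {**parent, "sku": v["sku"], "price": v["price"], "attributes": v["attributes"]}
        for v in variations
    ]


def price_range(product):
    """(lowest, highest) price across variations; falls back to the product price."""
    prices = [Decimal(v["price"]) for v in product.get("variations") or []]
    if not prices:
        prices = [Decimal(product["price"])]
    return min(prices), max(prices)


hoodie = {
    "id": 2087, "name": "Premium Hoodie", "price": "49.00",
    "variations": [
        {"id": 2088, "sku": "HOOD-S-BLK", "attributes": {"size": "S", "color": "Black"}, "price": "49.00"},
        {"id": 2090, "sku": "HOOD-L-RED", "attributes": {"size": "L", "color": "Red"}, "price": "52.00"},
    ],
}
print(len(variation_rows(hoodie)))  # 2
print(price_range(hoodie))          # (Decimal('49.00'), Decimal('52.00'))
```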

5. Handling errors and rate limits

Production code should branch on the HTTP status code rather than assuming success. The status codes you will encounter most often are:

  • 400 Bad Request: the store parameter is missing or is not a valid WooCommerce store URL. Check the URL and protocol.
  • 401 Unauthorized: the API key is missing, malformed or revoked. Verify the Authorization: Bearer ... header.
  • 402 Payment Required: the subscription tied to a live key is inactive or out of quota. The monthly quota is 100,000 products; once it is exhausted, calls return 402 until the plan renews or upgrades.
  • 429 Too Many Requests: you exceeded the rate limit (120 per minute on live, 20 per minute on test). The response includes a Retry-After header in seconds. Wait that long before retrying.
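
The branching described in this list can be sketched as a small decision function that maps a status code to what the client should do next. The action labels and the 5-second fallback are illustrative choices, not part of the API. The headers argument is treated as a plain dict, so the lookup is case-sensitive here; most HTTP libraries expose a case-insensitive mapping instead.

```python
DEFAULT_RETRY_SECONDS = 5  # fallback when Retry-After is absent or unreadable (assumption)


def next_action(status_code, headers=None):
    """Return (action, wait_seconds) for a response status.

    Actions: "ok", "retry", "fix_request", "fix_key", "check_plan", "fail".
    """
    headers = headers or {}
    if 200 <= status_code < 300:
        return ("ok", 0)
    if status_code == 429:
        try:
            wait = int(headers.get("Retry-After", DEFAULT_RETRY_SECONDS))
        except (TypeError, ValueError):
            wait = DEFAULT_RETRY_SECONDS
        return ("retry", max(wait, 0))
    if status_code == 400:
        return ("fix_request", 0)  # missing or invalid store parameter
    if status_code == 401:
        return ("fix_key", 0)      # missing, malformed or revoked key
    if status_code == 402:
        return ("check_plan", 0)   # inactive subscription or exhausted quota
    return ("fail", 0)


print(next_action(429, {"Retry-After": "7"}))  # ('retry', 7)
print(next_action(402))                        # ('check_plan', 0)
```

Only 429 is worth retrying automatically; the 400, 401 and 402 cases will keep failing until the request, key or plan changes.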

Every response, successful or not, also carries rate-limit headers so you can pace yourself before hitting 429: X-RateLimit-Limit (your ceiling), X-RateLimit-Remaining (calls left in the current window) and X-RateLimit-Reset (when the window resets). A good client reads X-RateLimit-Remaining and slows down as it approaches zero, then honours Retry-After if it still gets a 429. Always log the request_id from the body so support can trace any individual call.
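
A minimal pacing sketch, under stated assumptions: this tutorial does not say whether X-RateLimit-Reset is a timestamp or a number of seconds, so the function below ignores it and relies only on X-RateLimit-Limit and X-RateLimit-Remaining. Once the remaining budget drops to a threshold, it spaces calls evenly across the one-minute window. The function name and the low_water threshold of 5 are arbitrary illustrative choices.

```python
def pause_before_next_call(headers, low_water=5, window_seconds=60.0):
    """Seconds to sleep before the next request, based on rate-limit headers.

    Returns 0.0 while plenty of budget remains or when the headers are
    missing or unparseable.
    """
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, TypeError, ValueError):
        return 0.0
    if limit <= 0 or remaining > low_water:
        return 0.0
    # Even spacing: a 120-per-minute ceiling becomes one call every 0.5 s.
    return window_seconds / limit


print(pause_before_next_call({"X-RateLimit-Limit": "120", "X-RateLimit-Remaining": "3"}))   # 0.5
print(pause_before_next_call({"X-RateLimit-Limit": "120", "X-RateLimit-Remaining": "80"}))  # 0.0
```

In a loop you would call time.sleep() with the returned value after each response, and still honour Retry-After if a 429 arrives anyway.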

6. Code examples in 4 languages

The snippets below all do the same thing: call the products endpoint with a bearer key and iterate over the returned products. Replace wcs_live_... with your key (read from an environment variable in real code) and swap the store URL for your target.

cURL

curl -sS -G \
  -H "Authorization: Bearer wcs_live_..." \
  --data-urlencode "store=https://example-store.com" \
  --data-urlencode "per_page=100" \
  --data-urlencode "page=1" \
  "https://woocommerce-scraper.com/api/v1/products"

PHP

<?php
$apiKey = getenv('WCS_API_KEY'); // wcs_live_...
$store  = 'https://example-store.com';
$page   = 1;

do {
    $url = 'https://woocommerce-scraper.com/api/v1/products'
         . '?store=' . urlencode($store)
         . '&per_page=100&page=' . $page;

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Authorization: Bearer ' . $apiKey]);
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    // curl_exec returns false on transport errors (DNS failure, timeout, ...).
    if ($body === false || $code !== 200) {
        fwrite(STDERR, "Request failed with status $code\n");
        break;
    }

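    // Guard against a body that is not valid JSON or lacks the data key.
    $json = json_decode($body, true);
    $products = is_array($json) ? ($json['data'] ?? []) : [];
    foreach ($products as $product) {
        echo $product['name'] . ' - ' . $product['price'] . PHP_EOL;
    }

    $page++;
} while (!empty($products)); // an empty page means the catalog is exhausted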

JavaScript (fetch)

const apiKey = process.env.WCS_API_KEY; // wcs_live_...
const store  = 'https://example-store.com';

async function scrapeProducts() {
  let page = 1;
  while (true) {
    const url = 'https://woocommerce-scraper.com/api/v1/products'
      + '?store=' + encodeURIComponent(store)
      + '&per_page=100&page=' + page;

    const res = await fetch(url, {
      headers: { Authorization: 'Bearer ' + apiKey }
    });

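    if (res.status === 429) {
      // Fall back to 5 seconds if Retry-After is missing or not a number,
      // so a malformed header cannot turn this into a tight retry loop.
      const retryAfter = Number(res.headers.get('Retry-After'));
      const wait = Number.isFinite(retryAfter) && retryAfter > 0 ? retryAfter : 5;
      await new Promise(r => setTimeout(r, wait * 1000));
      continue;
    }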
    if (!res.ok) throw new Error('HTTP ' + res.status);

    const json = await res.json();
    if (json.data.length === 0) break;

    for (const p of json.data) {
      console.log(p.name, p.price);
    }
    page++;
  }
}

scrapeProducts().catch(err => {
  console.error(err);
  process.exitCode = 1;
});

Python (requests)

import os
import time
import requests

api_key = os.environ["WCS_API_KEY"]  # wcs_live_...
store   = "https://example-store.com"
base    = "https://woocommerce-scraper.com/api/v1/products"
headers = {"Authorization": f"Bearer {api_key}"}

page = 1
while True:
    params = {"store": store, "per_page": 100, "page": page}
    # Without a timeout, requests can wait indefinitely on a stalled connection.
    res = requests.get(base, headers=headers, params=params, timeout=30)

    if res.status_code == 429:
        time.sleep(int(res.headers.get("Retry-After", 5)))
        continue
    res.raise_for_status()

    data = res.json()["data"]
    if not data:
        break

    for product in data:
        print(product["name"], product["price"])
    page += 1

7. Scraping large stores with async jobs

Looping over synchronous pages is perfect for a few hundred or a few thousand products. For a full catalog of tens of thousands of items, it becomes slow and brittle: a dropped connection halfway through forces you to restart, and you burn rate-limit budget on a long sequence of calls. For that scale, prefer the asynchronous jobs endpoint.

Send a single POST /api/v1/jobs describing the store and the resource you want (products, categories or reviews). The API returns a job ID immediately, runs the full crawl on its own infrastructure, and notifies you (or lets you poll) when the result is ready to download as one file. You make one request instead of hundreds, and the heavy lifting happens server-side:

POST https://woocommerce-scraper.com/api/v1/jobs
Authorization: Bearer wcs_live_...
Content-Type: application/json

{ "store": "https://example-store.com", "resource": "products" }

Async jobs also support scheduling and webhooks, so you can re-scrape a catalog on a recurring basis and get notified when fresh data lands. The full workflow is covered in the guide on how to automate WooCommerce exports.
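
The job submission shown above can be sketched in Python with the standard library alone. The helper below only prepares the request and does not send it. The name build_job_request is hypothetical, and because this tutorial does not show the fields of the job response (the job ID name, the status or download URLs), the sketch stops short of parsing one.

```python
import json
import urllib.request

JOBS_URL = "https://woocommerce-scraper.com/api/v1/jobs"
RESOURCES = ("products", "categories", "reviews")  # resources named in this section


def build_job_request(api_key, store, resource="products"):
    """Prepare (but do not send) the POST /api/v1/jobs request."""
    if resource not in RESOURCES:
        raise ValueError(f"unsupported resource: {resource}")
    payload = json.dumps({"store": store, "resource": resource}).encode("utf-8")
    return urllib.request.Request(
        JOBS_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_job_request("wcs_live_...", "https://example-store.com")
print(req.get_method(), req.full_url)
# POST https://woocommerce-scraper.com/api/v1/jobs
```

To submit the job, pass the request to urllib.request.urlopen(req, timeout=30) and decode the JSON body; consult the API reference for the exact response fields before relying on any of them.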

8. Frequently asked questions

Do I need credentials for the target WooCommerce store?

No. The scraper reads publicly available product data, so you only need your own API key. You never enter the target store's admin login.

What is the maximum number of products per request?

The per_page parameter caps at 100. To collect more, loop through page until you receive an empty data array, or use an async job for whole catalogs.

What are the rate limits?

Live keys allow 120 requests per minute and test keys allow 20 per minute. When you exceed the limit you get a 429 with a Retry-After header. A monthly quota of 100,000 products applies on top of the per-minute limit.

Can I test without a paid plan?

Yes. A free wcs_test_ key runs against the sandbox so you can build and validate your integration before subscribing. Only the live wcs_live_ key requires an active plan.

Does the response include product variations?

Yes. Each product carries a variations array containing each size, color and price combination. Simple products return an empty array.

Ready to make your first call?

Grab a free test key, read the full API reference, and start pulling structured WooCommerce data in minutes.
