Jumbo Supermarket Product Scraper

Scrapes the complete product catalog and category tree from Jumbo, the second-largest supermarket chain in the Netherlands.

What does this Actor do?

  • Fetches the category tree via Jumbo's GraphQL API
  • Paginates through every sub-category (24 products per page) to collect all products
  • Returns product name, prices (regular and promotional), discount text, image URL, and product page URL for each item
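
The pagination step can be sketched as follows. This is only an illustration, assuming the API takes a zero-based offset per page; the function name `page_offsets` and the offset scheme are hypothetical and are not taken from the Actor's source. Only the page size of 24 comes from the description above.

```python
PAGE_SIZE = 24  # products per page, as stated in the list above


def page_offsets(total_products: int, page_size: int = PAGE_SIZE) -> list:
    """Return the starting offset of every page needed to cover total_products.

    Hypothetical helper: the real API may use page numbers or cursors instead.
    """
    if total_products < 0:
        raise ValueError("total_products must not be negative")
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return list(range(0, total_products, page_size))


if __name__ == "__main__":
    # A sub-category with 50 products needs three requests.
    print(page_offsets(50))
```

A sub-category whose size is an exact multiple of 24 needs no trailing empty page under this scheme.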

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| outputCategories | boolean | false | If true, category objects (with type: "category") are also pushed to the dataset |
| proxy | object | | Optional Apify proxy configuration. Recommended if Jumbo blocks cloud IPs. |

Example input

{
  "outputCategories": true
}

Output

Each item in the dataset represents a product or (optionally) a category.

Product item

{
  "type": "product",
  "external_id": "201896STK",
  "name": "Remia Kruidige & Romige Ravigote Tafelsaus 250 ml",
  "base_price": "1.79",
  "current_price": "1.79",
  "has_discount": false,
  "discount_text": null,
  "image_url": "https://www.jumbo.com/dam-images/fit-in/360x360/Products/20012025_1737339398958_1737339403704_201896_STK_08710448632900_C1N1.png",
  "website_url": "https://www.jumbo.com/producten/remia-kruidige-romige-ravigote-tafelsaus-250-ml-201896STK",
  "category_external_id": "conserven,-soepen,-sauzen,-olien/jus-en-maaltijdsauzen"
}

Category item (outputCategories: true)

{
  "type": "category",
  "external_id": "aardappelen,-groente-en-fruit",
  "name": "Aardappelen, groente en fruit",
  "parent_external_id": null
}
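
Because each category item carries a `parent_external_id`, the tree can be rebuilt from a flat dataset. The sketch below is one possible way to do that on the consumer side; `build_category_tree` is a hypothetical helper, and the sub-category in the sample is invented for illustration.

```python
from collections import defaultdict


def build_category_tree(items):
    """Map each parent_external_id (None for top level) to its child category IDs.

    Items whose type is not "category" (for example products) are ignored.
    """
    children = defaultdict(list)
    for item in items:
        if item.get("type") != "category":
            continue
        children[item.get("parent_external_id")].append(item["external_id"])
    return dict(children)


if __name__ == "__main__":
    sample = [
        {
            "type": "category",
            "external_id": "aardappelen,-groente-en-fruit",
            "name": "Aardappelen, groente en fruit",
            "parent_external_id": None,
        },
        {
            # Hypothetical sub-category, not taken from real output.
            "type": "category",
            "external_id": "aardappelen,-groente-en-fruit/fruit",
            "name": "Fruit",
            "parent_external_id": "aardappelen,-groente-en-fruit",
        },
    ]
    tree = build_category_tree(sample)
    print(tree[None])  # top-level category IDs
```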

Fields

| Field | Type | Description |
| --- | --- | --- |
| type | string | "product" or "category" |
| external_id | string | Product SKU (e.g. "201896STK") for products, or URL slug path (e.g. "aardappelen,-groente-en-fruit") for categories |
| name | string | Product or category name |
| base_price | string or null | Regular price in EUR (as decimal string, e.g. "1.49") |
| current_price | string or null | Current price: the discounted price if on sale, else the same as base_price |
| has_discount | boolean | true if the product is currently on promotion |
| discount_text | string or null | Promotion description (e.g. "2 voor €3", "25% korting") |
| image_url | string or null | Product image URL |
| website_url | string or null | Direct link to the product page on jumbo.com |
| category_external_id | string or null | The category this product belongs to |
| parent_external_id | string or null | (categories only) Parent category slug, or null for top-level |
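
Since `base_price` and `current_price` are decimal strings that may be null, a consumer would typically parse them with `Decimal` rather than `float`. The following is a minimal sketch under that assumption; `savings` is a hypothetical helper, not part of the Actor.

```python
from decimal import Decimal
from typing import Optional


def savings(item: dict) -> Optional[Decimal]:
    """Return base_price minus current_price, or None if either price is missing."""
    base = item.get("base_price")
    current = item.get("current_price")
    if base is None or current is None:
        return None
    # Decimal keeps the exact cent values that float arithmetic could distort.
    return Decimal(base) - Decimal(current)


if __name__ == "__main__":
    product = {
        "type": "product",
        "external_id": "201896STK",
        "base_price": "1.79",
        "current_price": "1.79",
        "has_discount": False,
    }
    print(savings(product))
```

For the product item shown earlier, both prices are equal, so the result is zero.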

Notes

  • Prices are in EUR as decimal strings to preserve precision (Jumbo's API returns prices in cents internally).
  • The scraper pins locale to nl-NL / Netherlands to avoid being redirected to the Belgian store.
  • Jumbo's GraphQL API may be protected by bot detection on some IP ranges. If requests fail or are blocked, configure a residential or datacenter proxy through the proxy input.
  • Apify may mark this Actor as under maintenance when a run takes longer than 5 minutes. This status is expected and does not indicate a problem with the Actor.
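
To illustrate the first note on price precision: converting an integer number of cents to a decimal string can be done with integer arithmetic alone, so no floating-point rounding is involved. This is a sketch of the idea, not the Actor's actual code; `cents_to_price_string` is a hypothetical name.

```python
def cents_to_price_string(cents: int) -> str:
    """Format a non-negative integer amount of cents as a decimal EUR string."""
    if cents < 0:
        raise ValueError("cents must not be negative")
    euros, remainder = divmod(cents, 100)
    # Pad the cent part to two digits so 5 cents becomes "0.05".
    return f"{euros}.{remainder:02d}"


if __name__ == "__main__":
    print(cents_to_price_string(179))
```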