Workopolis Job Scraper

Workopolis Job Scraper collects structured job listings from Workopolis.com so you can track hiring activity, analyze demand, and build reliable job datasets. It solves the pain of manual copy-paste by extracting consistent fields like title, company, location, and descriptions at scale. Built for analysts, recruiters, and developers who need clean Workopolis jobs data for search, alerts, or reporting.

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for workopolis-job-scraper you've just found your team — Let’s Chat. 👆👆

Introduction

This project scrapes job postings from Workopolis search results and (optionally) enriches each item by visiting the job detail page to capture full descriptions. It helps you automate job market monitoring, reduce time spent on repetitive research, and produce export-ready datasets for dashboards or pipelines. It’s designed for anyone who needs fast, repeatable access to Workopolis jobs data—without manual browsing.

Job Market Monitoring Workflow

Supports keyword + location searches or direct search URLs for precise targeting
Handles multi-page results with safety limits to prevent runaway pagination
Optional detail collection to capture complete job descriptions in text and HTML
Produces consistent, schema-friendly records suitable for databases and BI tools
Includes scrape metadata fields for auditing, freshness, and pipeline traceability

Features

Feature	Description
Flexible search	Search by keyword and location, or scrape directly from custom search URLs.
Posted-date filtering	Focus on recent roles using posted-date filters (e.g., 24h, 7d, 30d).
Detail enrichment	Optionally open each job page to extract full descriptions and richer context.
Pagination handling	Automatically collects jobs across multiple result pages with guardrails.
Structured dataset output	Produces consistent fields for easy JSON/CSV export and analysis.
Proxy-ready reliability	Works with proxy settings to reduce blocking and improve stability.
Performance-oriented crawling	Optimized for fast collection with concurrent requests and configurable limits.
Cookie support	Accepts cookies as a header string or JSON for sessions and edge cases.

What Data This Scraper Extracts

Field Name	Field Description
url	Direct link to the job posting.
title	Job title as shown on the listing or detail page.
company	Hiring company name associated with the role.
location	Job location (city/region/province) from the posting.
date_posted	Human-readable posting recency (e.g., "2 days ago").
description_html	Full job description captured as HTML when detail scraping is enabled.
description_text	Full job description captured as plain text when detail scraping is enabled.
_source	Source identifier for the dataset (static value for this scraper).
_fetchedAt	ISO timestamp indicating when the record was collected.
_from	Indicates whether the record came from a list page or a detail page.

Example Output

[
      {
            "url": "https://www.workopolis.com/job/abc123",
            "title": "Senior Software Engineer",
            "company": "Tech Company Inc.",
            "location": "Toronto, ON",
            "date_posted": "2 days ago",
            "description_html": "<div><h2>About the role</h2><p>Build and scale backend services...</p></div>",
            "description_text": "About the role\nBuild and scale backend services...",
            "_source": "workopolis.com",
            "_fetchedAt": "2025-10-22T10:30:00.000Z",
            "_from": "detail"
      }
]

Directory Structure Tree

Workopolis Job Scraper (IMPORTANT :!! always keep this name as the name of the apify actor !!! Workopolis Job Scraper )/
├── src/
│   ├── main.js
│   ├── config/
│   │   ├── input.schema.json
│   │   └── defaults.json
│   ├── crawlers/
│   │   ├── searchCrawler.js
│   │   └── detailCrawler.js
│   ├── extractors/
│   │   ├── extractSearchCards.js
│   │   ├── extractJobDetail.js
│   │   └── normalizeFields.js
│   ├── utils/
│   │   ├── buildSearchUrl.js
│   │   ├── dateFilters.js
│   │   ├── dedupe.js
│   │   ├── logger.js
│   │   └── validation.js
│   └── storage/
│       ├── pushDataset.js
│       └── stateStore.js
├── data/
│   ├── input.examples.json
│   └── output.sample.json
├── .env.example
├── .gitignore
├── package.json
├── package-lock.json
└── README.md

Use Cases

Recruiters use it to monitor new Workopolis postings for target roles, so they can respond faster to hiring demand and candidate sourcing opportunities.
Market analysts use it to track job volume and location trends across Canada, so they can generate labor market insights and forecasts.
Sales teams use it to build prospect lists from companies actively hiring, so they can prioritize outreach to high-intent organizations.
Career platforms use it to populate job feeds with normalized fields, so they can deliver better search and alert experiences to users.
Content creators use it to extract hiring signals by category and region, so they can publish data-backed career and salary insights.

FAQs

How do I choose between keyword/location search and custom URLs? Use keyword + location when you want a simple, repeatable configuration and quick iteration. Use custom URLs when you need exact filters already supported by the site (specific query parameters, sorting, or niche combinations). Custom URLs are also helpful if you’ve already validated a search in the browser and want the scraper to reproduce it precisely.

What happens if I disable detail collection? With detail collection off, the scraper focuses on speed and collects what’s available on result cards (typically URL, title, company, location, and posting recency). This is ideal for large-scale monitoring and alerting where full descriptions are not required. If you need description_text/description_html, enable detail collection.

How do I avoid duplicates across pages or repeated runs? The scraper deduplicates by job URL during a run to prevent repeated items when pagination overlaps. For recurring runs, store the last-seen URLs (or a URL hash) and filter them in your pipeline to keep only new postings.

Why am I getting blocked or seeing empty results sometimes? Blocks can happen due to rate limits or aggressive concurrency. Reduce the number of pages, lower concurrency, and enable proxy settings for higher success rates. If you rely on session-based content, provide cookies via the cookies or cookiesJson option.

Performance Benchmarks and Results

Primary Metric: With detail collection enabled, typical throughput is ~50–100 job records per minute depending on query complexity and network conditions.

Reliability Metric: With proxy settings enabled and conservative concurrency, runs commonly achieve 95%+ successful fetch rate across paginated searches, with most failures attributable to transient network timeouts.

Efficiency Metric: Detail scraping generally uses moderate resources (about 1–2 GB memory) and scales predictably with results_wanted and max_pages, making it suitable for scheduled monitoring.

Quality Metric: When detail collection is enabled, description completeness is high (full HTML + normalized text), and field consistency remains stable due to normalization and validation on each record.

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

"Exceptional results, clear communication, and flawless delivery.
Bitbash nailed it."

Syed
Digital Strategist
★★★★★

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Workopolis Job Scraper

Introduction

Job Market Monitoring Workflow

Features

What Data This Scraper Extracts

Example Output

Directory Structure Tree

Use Cases

FAQs

Performance Benchmarks and Results

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Workopolis Job Scraper

Introduction

Job Market Monitoring Workflow

Features

What Data This Scraper Extracts

Example Output

Directory Structure Tree

Use Cases

FAQs

Performance Benchmarks and Results

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages