triposat/walmart-price-monitor

Walmart Price Monitor

A scheduled price monitor for Walmart product pages that runs check_once.py every 30 minutes on free GitHub Actions. Each run appends new readings to price_history.json, and the workflow commits this file back to the repository so the history is kept between runs. Alerts land in Slack (or any other Apprise-supported channel) on real price drops.

The accompanying blog post walks through the full reasoning behind the stack choices (curl_cffi vs Selenium, the __NEXT_DATA__ parser, the CHALLENGE_MARKERS false-positive war story, how the alert thresholds were tuned, and the deployment story). This README focuses on getting the project running in your own account.

Why this stack

Walmart product pages embed the full pricing payload in a single Next.js JSON blob inside <script id="__NEXT_DATA__">. The scraper parses this blob and reads the full product object in one extraction: current price, prior price, list price, per-unit price, markdown flags (rollback / clearance / reducedPrice), seller name and type (Walmart-direct vs marketplace), event-pricing flags (priceFlip / specialBuy), brand, rating, review count, and availability.

Walmart has no JSON-LD or Open Graph price meta tags on product pages, and no public API for unauthenticated reads; the JSON blob is the only canonical source. CSS selectors are kept as a defensive fallback only.
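A minimal sketch of the extraction step described above, using only the standard library. The function names and the sample HTML are illustrative assumptions, not the code in scraper.py, and the key path inside the real blob is not shown here because it is not documented in this README.

```python
import json
import re
from typing import Any, Optional

# Matches the <script id="__NEXT_DATA__"> tag; other attributes are tolerated.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ JSON, or None if it is missing or invalid."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def dig(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts safely; any missing key yields the default."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data


# Hypothetical, heavily simplified page used only to exercise the helpers.
sample = (
    '<html><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"price": 99.0}}</script></html>'
)
blob = extract_next_data(sample)
print(dig(blob, "props", "price"))  # 99.0
```

Returning None instead of raising lets the caller decide whether to fall back to CSS selectors or treat the response as a failed fetch.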

The anti-bot layer is Akamai Bot Manager combined with HUMAN Security. Both serve their soft challenge page with HTTP 200 OK, so the status code alone cannot identify a block; the scraper has to inspect the body of every successful response. It checks for two markers: the literal challenge-UI text "activate and hold the button" and the title tag "<title>robot or human". Generic terms like "perimeterx" or "px-captcha" are deliberately not used, because Walmart's Content Security Policy lists those domains on every product page and matching on them would flag every real fetch as a challenge. A separate check covers Akamai's hard-block path (an HTTP 307 redirect to /blocked?url=<base64>). On any match, the request is retried through the next proxy in the pool.
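The two checks can be sketched as follows. The marker strings and the 307 / /blocked path come from the description above; the function names, the case-insensitive comparison, and reading the redirect target from a Location header value are assumptions made for this example.

```python
# Markers taken from the text above; compared case-insensitively (an assumption).
CHALLENGE_MARKERS = (
    "activate and hold the button",
    "<title>robot or human",
)


def is_soft_challenge(body: str) -> bool:
    """True when an HTTP 200 body is actually the challenge page."""
    lowered = body.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


def is_hard_block(status_code: int, location: str) -> bool:
    """True for the hard-block path: a 307 redirect to /blocked?url=..."""
    return status_code == 307 and location.startswith("/blocked")


# A normal page that merely mentions the vendor domains must not match.
print(is_soft_challenge('<meta content="px-captcha perimeterx">'))  # False
print(is_soft_challenge("<title>Robot or human?</title>"))          # True
print(is_hard_block(307, "/blocked?url=abc"))                        # True
```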

Alert thresholds

The workflow sends an alert when all of the following are true:

  1. The current price is lower than the lowest price recorded in the last 24 hours. This is a new 24-hour low, not a small change from the previous reading.
  2. The drop is at least 2% and at least $1. This filters out small changes of a few cents on low-cost items.
  3. No alert has been sent for the same product in the last 6 hours. This stops repeated notifications when prices change often.

If any condition fails, the script still saves the reading to history but does not send a notification.

The four limits, plus the history retention setting, are defined as constants near the top of storage.py:

MIN_DROP_PCT = 2.0
MIN_DROP_DOLLARS = 1.00
COOLDOWN_HOURS = 6
BASELINE_WINDOW_HOURS = 24
HISTORY_RETENTION_DAYS = 30  # readings older than this are deleted automatically

Adjust these values to change the alert behavior. For example:

  • To alert on drops of 0.1% or larger, set MIN_DROP_PCT = 0.1. The MIN_DROP_DOLLARS floor still applies, so lower it as well if small dollar drops should alert.
  • To allow at most one alert per product per day, set COOLDOWN_HOURS = 24.
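As a rough sketch, the three conditions and the constants above could combine like this. The function name, the (timestamp, price) history shape, and measuring the drop against the 24-hour low are assumptions for illustration; the actual decision layer lives in storage.py and may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

MIN_DROP_PCT = 2.0
MIN_DROP_DOLLARS = 1.00
COOLDOWN_HOURS = 6
BASELINE_WINDOW_HOURS = 24


def should_alert(
    current: float,
    history: List[Tuple[datetime, float]],  # hypothetical shape: (timestamp, price)
    last_alert_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Apply the three alert conditions in order."""
    window_start = now - timedelta(hours=BASELINE_WINDOW_HOURS)
    recent = [price for ts, price in history if ts >= window_start]
    if not recent:
        return False  # no baseline to compare against yet

    baseline = min(recent)
    if current >= baseline:
        return False  # condition 1: not a new 24-hour low

    drop = baseline - current
    drop_pct = drop / baseline * 100
    if drop_pct < MIN_DROP_PCT or drop < MIN_DROP_DOLLARS:
        return False  # condition 2: drop too small

    if last_alert_at is not None and now - last_alert_at < timedelta(hours=COOLDOWN_HOURS):
        return False  # condition 3: still in cooldown
    return True


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
history = [(now - timedelta(hours=2), 103.00), (now - timedelta(hours=1), 104.50)]
print(should_alert(99.00, history, None, now))                        # True
print(should_alert(102.50, history, None, now))                       # False
print(should_alert(99.00, history, now - timedelta(hours=1), now))    # False
```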

Slack alert format

When an alert fires, Slack receives a message like:

Price Drop: Apple AirPods 4

Apple AirPods 4

Previous: $103.00
Current:  $99.00
Drop:     $4.00 (-3.88%)
vs Was:   $129.99 (-$30.99 / -23.84%)
Per unit: $99.00/count
Tags:     ROLLBACK

https://www.walmart.com/ip/11381374703

The optional lines only appear when relevant:

  • vs Was / vs MSRP appears only when Walmart shows a higher prior price or list price.
  • Per unit appears for groceries and other unit-priced items.
  • Tags combines the offer label (ROLLBACK / CLEARANCE / REDUCED), the event-pricing flag (priceFlip or specialBuy set on the page), and a SOLD BY <name> tag for marketplace items where the seller is not Walmart-direct.
  • Stock appears only when the item is not currently in stock.

Walmart-direct prices are stable, while marketplace seller prices change more often. The SOLD BY tag is surfaced so the reader knows which kind of price they are looking at.
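The figures in the example message follow from simple arithmetic on the previous, current, and "was" prices. The helper names below are hypothetical and only reproduce the Drop and vs Was lines; they are not the formatting code in alerts.py.

```python
def format_drop(previous: float, current: float) -> str:
    """Dollar drop and percentage relative to the previous price."""
    drop = previous - current
    pct = drop / previous * 100
    return f"${drop:.2f} (-{pct:.2f}%)"


def format_vs_reference(reference: float, current: float) -> str:
    """Comparison against a higher reference price (Was or MSRP)."""
    diff = reference - current
    pct = diff / reference * 100
    return f"${reference:.2f} (-${diff:.2f} / -{pct:.2f}%)"


print("Drop:    ", format_drop(103.00, 99.00))          # $4.00 (-3.88%)
print("vs Was:  ", format_vs_reference(129.99, 99.00))  # $129.99 (-$30.99 / -23.84%)
```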

File structure

.
├── .github/workflows/monitor.yml   # cron schedule and run logic
├── config.py                       # loads proxies from environment, validates products
├── scraper.py                      # curl_cffi scraper + __NEXT_DATA__ parser
├── storage.py                      # TinyDB persistence + decision layer + thresholds
├── alerts.py                       # Apprise multi-channel alerts (Slack-friendly format)
├── check_once.py                   # entry point: runs one price check per execution
├── run_locally.py                  # local continuous runner (no proxies, for testing)
├── force_alert.py                  # one-shot alert sender to verify your Slack setup
├── products.json                   # list of Walmart item IDs to monitor
├── requirements.txt
├── .gitignore
└── price_history.json              # reading history; recreated automatically if missing

Setup (one-time, about 10 minutes)

Requirements: a GitHub account (free tier is enough). For optional local testing: Python 3.10 or newer.

1. Fork this repository

Click the Fork button in the top-right of this repo's GitHub page. GitHub creates a copy under your account. That copy is where you will configure secrets and where the scheduled workflow will run.

Make your fork private. The workflow auto-commits price_history.json, which records the products you monitor and their prices over time. A private fork keeps this data out of search-engine indexes:

  • Open your fork → Settings → scroll to the bottom of the General page → Change repository visibility → Make private → confirm.

If you would rather not fork, clone the repo locally (or download and unpack the ZIP), remove the existing .git directory if there is one, then run git init and push to a new repository of your own.

2. Create a Slack incoming webhook

  1. Open https://api.slack.com/apps and click Create New App → From scratch.
  2. Give it a name (e.g. "Walmart Price Monitor") and pick the workspace.
  3. In the left sidebar choose Incoming Webhooks and switch it on.
  4. Click Add New Webhook to Workspace, pick the channel that should receive alerts, and click Allow.
  5. Copy the webhook URL. It looks like https://hooks.slack.com/services/<TEAM_ID>/<BOT_ID>/<SECRET>.

3. Add GitHub Secrets

In your fork, go to Settings → Secrets and variables → Actions → New repository secret.

APPRISE_URLS (required)

For Slack via the webhook from step 2, convert the URL to Apprise's slack:// format. The simplest pattern is:

slack://TOKEN_A/TOKEN_B/TOKEN_C

where TOKEN_A, TOKEN_B, and TOKEN_C are the three path segments of your webhook URL. For example, https://hooks.slack.com/services/T0000/B0000/XYZ becomes:

slack://T0000/B0000/XYZ
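If you prefer not to do the conversion by hand, a small helper along these lines produces the same result. The function name is hypothetical; it only rearranges the three path segments as described above.

```python
from urllib.parse import urlparse


def webhook_to_apprise(webhook_url: str) -> str:
    """Convert a Slack incoming-webhook URL to the slack://A/B/C form."""
    parsed = urlparse(webhook_url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expect /services/<TOKEN_A>/<TOKEN_B>/<TOKEN_C>
    if parsed.netloc != "hooks.slack.com" or len(parts) != 4 or parts[0] != "services":
        raise ValueError("not a Slack incoming-webhook URL")
    return "slack://" + "/".join(parts[1:])


print(webhook_to_apprise("https://hooks.slack.com/services/T0000/B0000/XYZ"))
# slack://T0000/B0000/XYZ
```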

You can also use multiple channels by putting one URL per line in the secret:

slack://T0000/B0000/XYZ
discord://webhook_id/webhook_token
mailto://you:app_password@gmail.com?to=you@gmail.com

For the full list of supported notification channels and their URL formats, see the Apprise wiki: https://github.com/caronc/apprise/wiki
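A one-URL-per-line secret can be split into a list with a few lines of Python. This is a sketch of one plausible way to read it; the function name is an assumption and the repository's own loader may differ.

```python
from typing import List


def parse_apprise_urls(raw: str) -> List[str]:
    """Split a multi-line secret into URLs, ignoring blank lines and padding."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


secret = """
slack://T0000/B0000/XYZ
discord://webhook_id/webhook_token
"""
print(parse_apprise_urls(secret))
# ['slack://T0000/B0000/XYZ', 'discord://webhook_id/webhook_token']
```

Each resulting URL can then be registered with an Apprise instance via its add() method.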

You can verify the Slack URL works before deploying:

pip install apprise
apprise -vv -t "test" -b "Price monitor test" "slack://T0000/B0000/XYZ"

A message should appear in the chosen Slack channel within a few seconds.

To verify the full alert formatting (not just delivery), run force_alert.py, which calls send_alert() with test data matching the example shown above:

APPRISE_URLS="slack://T0000/B0000/XYZ" python force_alert.py

The resulting Slack message renders exactly what a real price-drop alert would look like, with all optional rows (vs Was, Per unit, Tags) populated.

PROXIES (optional)

For short-term testing or roughly under 100 product fetches per day, leave this secret unset. The scraper runs in direct mode via curl_cffi's Chrome impersonation, which passes Walmart's first-pass Akamai check from most IPs.

For sustained scale (anything beyond a few hours of continuous runs, or more than ~50 products at hourly cadence), add ISP proxies. One proxy per line, in the format host:port:user:password:

proxy1.example.com:8000:your_user:your_pass
proxy2.example.com:8001:your_user:your_pass
proxy3.example.com:8002:your_user:your_pass
...

Datacenter ASNs (AWS, OVH, Hetzner, DigitalOcean) hit the Akamai challenge on most Walmart requests; consumer-ISP IPs pass through far more often.
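For reference, a host:port:user:password line maps onto a standard proxy URL as sketched below. The function names and the http:// scheme are assumptions for this example; config.py is the authoritative loader.

```python
from typing import List
from urllib.parse import quote


def parse_proxy_line(line: str) -> str:
    """Turn host:port:user:password into a proxy URL (http scheme assumed)."""
    # maxsplit=3 keeps any ':' inside the password intact;
    # a line with fewer than four fields raises ValueError.
    host, port, user, password = line.strip().split(":", 3)
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def load_proxies(raw: str) -> List[str]:
    """Parse a multi-line PROXIES value, skipping blank lines."""
    return [parse_proxy_line(line) for line in raw.splitlines() if line.strip()]


print(parse_proxy_line("proxy1.example.com:8000:your_user:your_pass"))
# http://your_user:your_pass@proxy1.example.com:8000
```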

4. Edit products.json

Replace the example item IDs with the products you want to monitor. Each entry needs two fields: item_id and a recognizable name. The item_id is the numeric ID at the end of a Walmart product URL. For example, the URL https://www.walmart.com/ip/Apple-AirPods-4/11381374703 has the item ID 11381374703.

{"item_id": "11381374703", "name": "Apple AirPods 4"}

There is no target price field. An alert is sent when a price drop crosses the thresholds defined in storage.py. Commit and push your changes when you are done.
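If you are collecting many products, the item ID can be pulled out of each URL programmatically. The helper below is a hypothetical convenience, not part of the repository; it assumes the ID is the last numeric path segment after /ip/.

```python
import re
from typing import Optional

# Optional slug segment, then the numeric item ID at the end of the path.
ITEM_ID_RE = re.compile(r"/ip/(?:[^/?#]+/)?(\d+)(?:[/?#]|$)")


def item_id_from_url(url: str) -> Optional[str]:
    """Return the numeric item ID from a Walmart product URL, or None."""
    match = ITEM_ID_RE.search(url)
    return match.group(1) if match else None


print(item_id_from_url("https://www.walmart.com/ip/Apple-AirPods-4/11381374703"))
# 11381374703
print(item_id_from_url("https://www.walmart.com/ip/11381374703"))
# 11381374703
```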

Forking from this template? The repo ships with an inherited price_history.json containing demo readings from the original deployment. You will not get alerts on those products, only on the ones in your edited products.json. If you want to start with a clean baseline file, delete it and let the next workflow run recreate it:

git rm price_history.json
git commit -m "Clean inherited price history before first run"
git push

If you skip this, the inherited readings age out automatically after 30 days (per HISTORY_RETENTION_DAYS in storage.py).
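The retention rule amounts to dropping every reading older than a cutoff. A sketch under assumptions: the reading shape (a dict with an ISO 8601 "timestamp" field) and the function name are hypothetical, and the real pruning in storage.py goes through TinyDB.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

HISTORY_RETENTION_DAYS = 30


def prune_history(readings: List[Dict], now: datetime) -> List[Dict]:
    """Keep only readings newer than the retention cutoff."""
    cutoff = now - timedelta(days=HISTORY_RETENTION_DAYS)
    return [
        r for r in readings
        if datetime.fromisoformat(r["timestamp"]) >= cutoff  # hypothetical field name
    ]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
readings = [
    {"timestamp": "2024-01-15T00:00:00+00:00", "price": 103.00},  # 46 days old
    {"timestamp": "2024-02-20T00:00:00+00:00", "price": 99.00},   # 10 days old
]
print(len(prune_history(readings, now)))  # 1
```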

5. Trigger the first run manually

Open the Actions tab in your repository, select Walmart Price Monitor, and click Run workflow. This first run confirms that your secrets are configured correctly. After this, runs happen automatically every 30 minutes.

Test locally first (optional)

You can run one full scrape-and-alert cycle on your laptop before deploying to GitHub Actions. Useful for sanity-checking the scraper, the product list, and the Slack format end-to-end.

pip install -r requirements.txt
APPRISE_URLS="slack://T0000/B0000/XYZ" python run_locally.py --once

The --once flag runs a single cycle and exits. Drop it (and optionally pass --interval 30) to keep the loop running every 30 minutes on your machine.

No proxies needed for local runs: run_locally.py stubs PROXIES and the scraper falls back to direct mode via curl_cffi's Chrome impersonation.

Important limitations

  • Scheduled runs are not exact. GitHub may delay cron triggers by 10 to 30 minutes during periods of high load. For routine price monitoring this is acceptable. For time-sensitive cases such as flash sales, use a dedicated server instead.
  • GitHub pauses workflows after 60 days of repository inactivity. This workflow's automatic commits count as activity, so the schedule does not pause in normal use.
  • The GitHub free tier gives 2,000 Actions minutes per month for private repositories. Each run takes about 1 minute. With 48 runs per day across 30 days, monthly usage is about 1,440 minutes, which is below the free tier limit. Public repositories have unlimited minutes.
  • The workflow includes a concurrency lock. This stops two runs from writing to price_history.json at the same time.
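The minutes estimate above is easy to recompute before changing the schedule. This sketch assumes about one minute per run, as stated above; the function name is illustrative.

```python
FREE_MINUTES_PER_MONTH = 2000  # private-repository allowance quoted above


def monthly_minutes(interval_minutes: int, minutes_per_run: float = 1.0, days: int = 30) -> float:
    """Estimated Actions minutes per month for a given cron interval."""
    runs_per_day = 24 * 60 / interval_minutes
    return runs_per_day * days * minutes_per_run


print(monthly_minutes(30))  # 1440.0
print(monthly_minutes(15))  # 2880.0
print(monthly_minutes(15) <= FREE_MINUTES_PER_MONTH)  # False
```

Under the same assumptions, a 15-minute schedule in a private repository would exceed the free allowance, while a 30-minute schedule stays within it.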

Changing the monitored products

Edit products.json and push the change to the repository. The next scheduled run will use the updated list. No additional deployment step is needed.

Changing the run frequency

Edit the cron value in .github/workflows/monitor.yml:

- cron: "*/15 * * * *"   # every 15 minutes
- cron: "0 * * * *"      # every hour
- cron: "0 */6 * * *"    # every 6 hours

Troubleshooting

  • The workflow run failed. Open the failed run in the Actions tab, expand the failed step, and read the log output. Most failures are caused by a missing or incorrectly formatted PROXIES or APPRISE_URLS secret.
  • All requests return the "Robot or human?" challenge. Walmart's Akamai and HUMAN layers are more aggressive than Amazon's anti-bot system, so proxy quality matters more here. Make sure the proxies are consumer-ISP IPs, not datacenter IPs. Datacenter ASNs (AWS, OVH, Hetzner, DigitalOcean) hit the challenge on most Walmart requests.
  • No alerts arrive in Slack even when prices drop. First verify the Slack URL with the apprise -vv command from step 3. Then check that the price drop crosses the thresholds in storage.py. A $0.10 drop on a $30 item is below both the default 2% and $1 thresholds and will not trigger an alert.
  • The workflow is stuck in the "queued" state. GitHub's free runners can be delayed during periods of high demand. The queue usually clears within 5 to 15 minutes.
