A scheduled price monitor for Walmart product pages that runs check_once.py
every 30 minutes on free GitHub Actions. Each run appends new readings to
price_history.json, and the workflow commits this file back to the
repository so the history is kept between runs. Alerts land in Slack (or any
other Apprise-supported channel) on real price drops.
The accompanying blog post walks through the full reasoning behind the stack choices (curl_cffi vs Selenium, the __NEXT_DATA__ parser, the CHALLENGE_MARKERS false-positive war story, how the alert thresholds were tuned, and the deployment story). This README focuses on getting the project running in your own account.
Walmart product pages embed the full pricing payload in a single Next.js JSON
blob inside <script id="__NEXT_DATA__">. The scraper parses this blob and
reads the full product object in one extraction: current price, prior price,
list price, per-unit price, markdown flags (rollback / clearance / reducedPrice),
seller name and type (Walmart-direct vs marketplace), event-pricing flags
(priceFlip / specialBuy), brand, rating, review count, and availability.
Walmart has no JSON-LD or Open Graph price meta tags on product pages, and no public API for unauthenticated reads; the JSON blob is the only canonical source. CSS selectors are kept as a defensive fallback only.
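A minimal sketch of the blob extraction, assuming the script tag holds plain JSON. The function name `extract_next_data` and the key path in the sample page are illustrative assumptions, not the layout that scraper.py actually walks:

```python
import json
import re
from typing import Optional

# Matches the script tag even when it carries extra attributes such as type="application/json".
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)

def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if it is absent or malformed."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None

# Hypothetical, heavily trimmed page; real payloads are far larger and nested differently.
sample_html = (
    '<html><head><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"product": {"price": 99.0}}}'
    "</script></head></html>"
)
payload = extract_next_data(sample_html)
```

Returning None instead of raising lets a caller fall through to the CSS-selector fallback when the blob is missing.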
The anti-bot layer is Akamai Bot Manager combined with HUMAN Security. Both
serve their soft challenge page with HTTP 200 OK, which means status code
alone cannot identify a block; the scraper has to inspect the response body
on every successful response. The two markers checked are the literal
challenge-UI text "activate and hold the button" and the title-tag
"<title>robot or human". Generic terms like "perimeterx" or "px-captcha"
are deliberately not used. Walmart's Content Security Policy lists those
domains on every product page, which would produce false-positive challenges
on every real fetch.
A separate check covers Akamai's hard-block path (HTTP 307 redirect to
/blocked?url=<base64>). On any match, the request is retried through the
next proxy in the pool.
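The two checks can be sketched as one classifier. The marker strings are the ones quoted above; the `FetchResult` enum, the function signature, and the lowercasing of the body are assumptions made for this example:

```python
from enum import Enum
from typing import Optional

# Soft-challenge markers quoted above, compared against a lowercased body (an assumption).
CHALLENGE_MARKERS = (
    "activate and hold the button",
    "<title>robot or human",
)

class FetchResult(Enum):
    OK = "ok"
    SOFT_CHALLENGE = "soft_challenge"  # HTTP 200 with a challenge page in the body
    HARD_BLOCK = "hard_block"          # HTTP 307 redirect to /blocked?url=...

def classify_response(status: int, location: Optional[str], body: str) -> FetchResult:
    """Report whether a response looks blocked; OK only means no block was detected."""
    if status == 307 and location is not None and "/blocked" in location:
        return FetchResult.HARD_BLOCK
    lowered = body.lower()
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return FetchResult.SOFT_CHALLENGE
    return FetchResult.OK
```

A page that merely mentions "perimeterx" or "px-captcha" is classified as OK here, which is the behavior the marker choice is meant to guarantee.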
The workflow sends an alert when all of the following are true:
- The current price is lower than the lowest price recorded in the last 24 hours. This is a new 24-hour low, not a small change from the previous reading.
- The drop is at least 2% and at least $1. This filters out small changes of a few cents on low-cost items.
- No alert has been sent for the same product in the last 6 hours. This stops repeated notifications when prices change often.
If any condition fails, the script still saves the reading to history but does not send a notification.
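The three conditions can be sketched as a single predicate. This is an illustration, not the code in storage.py: the function name, the `(timestamp, price)` tuple shape, and the choice to measure the drop against the 24-hour low are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_DROP_PCT = 2.0
MIN_DROP_DOLLARS = 1.00
COOLDOWN_HOURS = 6
BASELINE_WINDOW_HOURS = 24

def should_alert(current: float, readings: list, last_alert_at: Optional[datetime], now: datetime) -> bool:
    """readings: earlier (timestamp, price) tuples, not including the current reading."""
    window_start = now - timedelta(hours=BASELINE_WINDOW_HOURS)
    recent = [price for ts, price in readings if ts >= window_start]
    if not recent:
        return False  # no baseline inside the window yet
    baseline = min(recent)
    drop = baseline - current
    if drop <= 0:
        return False  # not a new 24-hour low
    if drop / baseline * 100 < MIN_DROP_PCT or drop < MIN_DROP_DOLLARS:
        return False  # too small in percent or in dollars
    if last_alert_at is not None and now - last_alert_at < timedelta(hours=COOLDOWN_HOURS):
        return False  # still inside the cooldown
    return True
```

In this sketch the caller saves the reading to history whether or not the function returns True, matching the behavior described above.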
The four limits, plus the history retention setting, are defined as constants
near the top of storage.py:
MIN_DROP_PCT = 2.0
MIN_DROP_DOLLARS = 1.00
COOLDOWN_HOURS = 6
BASELINE_WINDOW_HOURS = 24
HISTORY_RETENTION_DAYS = 30 # readings older than this are deleted automatically
Adjust these values to change the alert behavior. For example:
- To alert on any drop of 0.1% or larger, set MIN_DROP_PCT = 0.1. The MIN_DROP_DOLLARS floor still applies, so lower it as well if sub-dollar drops should alert.
- To allow at most one alert per product per day, set COOLDOWN_HOURS = 24.
When an alert fires, Slack receives a message like:
Price Drop: Apple AirPods 4
Apple AirPods 4
Previous: $103.00
Current: $99.00
Drop: $4.00 (-3.88%)
vs Was: $129.99 (-$30.99 / -23.84%)
Per unit: $99.00/count
Tags: ROLLBACK
https://www.walmart.com/ip/11381374703
The optional lines only appear when relevant:
- vs Was / vs MSRP appears only when Walmart shows a higher prior price or list price.
- Per unit appears for groceries and other unit-priced items.
- Tags combines the offer label (ROLLBACK / CLEARANCE / REDUCED), the event-pricing flag (priceFlip or specialBuy set on the page), and a SOLD BY <name> tag for marketplace items where the seller is not Walmart-direct.
- Stock appears only when the item is not currently in stock.
Walmart-direct prices are relatively stable, while marketplace seller prices
change more often. The SOLD BY tag is surfaced so the reader knows what kind
of price they are looking at.
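The arithmetic behind the Drop and vs Was rows in the example message can be reproduced with a short sketch. The helper names are hypothetical and the layout is copied from the sample above, so alerts.py may build these rows differently:

```python
def format_drop_line(previous: float, current: float) -> str:
    """Dollar and percent change relative to the previous price."""
    drop = previous - current
    pct = drop / previous * 100
    return f"Drop: ${drop:.2f} (-{pct:.2f}%)"

def format_was_line(was: float, current: float) -> str:
    """Dollar and percent change relative to the higher prior price."""
    diff = was - current
    pct = diff / was * 100
    return f"vs Was: ${was:.2f} (-${diff:.2f} / -{pct:.2f}%)"

print(format_drop_line(103.00, 99.00))  # Drop: $4.00 (-3.88%)
print(format_was_line(129.99, 99.00))   # vs Was: $129.99 (-$30.99 / -23.84%)
```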
.
├── .github/workflows/monitor.yml # cron schedule and run logic
├── config.py # loads proxies from environment, validates products
├── scraper.py # curl_cffi scraper + __NEXT_DATA__ parser
├── storage.py # TinyDB persistence + decision layer + thresholds
├── alerts.py # Apprise multi-channel alerts (Slack-friendly format)
├── check_once.py # entry point: runs one price check per execution
├── run_locally.py # local continuous runner (no proxies, for testing)
├── force_alert.py # one-shot alert sender to verify your Slack setup
├── products.json # list of Walmart item IDs to monitor
├── requirements.txt
├── .gitignore
└── price_history.json # created automatically on the first run
Requirements: a GitHub account (free tier is enough). For optional local testing: Python 3.10 or newer.
Click the Fork button in the top-right of this repo's GitHub page. GitHub creates a copy under your account. That copy is where you will configure secrets and where the scheduled workflow will run.
Make your fork private. The workflow auto-commits price_history.json,
which records the products you monitor and their prices over time. A private
fork keeps this data out of search-engine indexes:
- Open your fork → Settings → scroll to the bottom of the General page → Change repository visibility → Make private → confirm.
If you would rather not fork, download the ZIP (or clone the repo and remove
its .git directory), then run git init and push to your own new repository.
- Open https://api.slack.com/apps and click Create New App → From scratch.
- Give it a name (e.g. "Walmart Price Monitor") and pick the workspace.
- In the left sidebar choose Incoming Webhooks and switch it on.
- Click Add New Webhook to Workspace, pick the channel that should receive alerts, and click Allow.
- Copy the webhook URL. It looks like
https://hooks.slack.com/services/<TEAM_ID>/<BOT_ID>/<SECRET>.
In your fork, go to Settings → Secrets and variables → Actions → New repository secret.
APPRISE_URLS (required)
For Slack via the webhook from step 2, convert the URL to Apprise's slack://
format. The simplest pattern is:
slack://TOKEN_A/TOKEN_B/TOKEN_C
where TOKEN_A, TOKEN_B, and TOKEN_C are the three path segments of your
webhook URL. For example, https://hooks.slack.com/services/T0000/B0000/XYZ
becomes:
slack://T0000/B0000/XYZ
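The conversion is mechanical, so it can be scripted. A small sketch follows; the helper name `webhook_to_apprise` is made up for this example and is not part of the repository:

```python
from urllib.parse import urlparse

def webhook_to_apprise(webhook_url: str) -> str:
    """Convert https://hooks.slack.com/services/A/B/C into slack://A/B/C."""
    parsed = urlparse(webhook_url)
    parts = [segment for segment in parsed.path.split("/") if segment]
    if parsed.netloc != "hooks.slack.com" or len(parts) != 4 or parts[0] != "services":
        raise ValueError("not a Slack incoming-webhook URL")
    return "slack://" + "/".join(parts[1:])

print(webhook_to_apprise("https://hooks.slack.com/services/T0000/B0000/XYZ"))
# slack://T0000/B0000/XYZ
```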
You can also use multiple channels by putting one URL per line in the secret:
slack://T0000/B0000/XYZ
discord://webhook_id/webhook_token
mailto://you:app_password@gmail.com?to=you@gmail.com
For the full list of supported notification channels and their URL formats, see the Apprise wiki: https://github.com/caronc/apprise/wiki
You can verify the Slack URL works before deploying:
pip install apprise
apprise -vv -t "test" -b "Price monitor test" "slack://T0000/B0000/XYZ"
A message should appear in the chosen Slack channel within a few seconds.
To verify the full alert formatting (not just delivery), run force_alert.py,
which calls send_alert() with test data matching the example shown above:
APPRISE_URLS="slack://T0000/B0000/XYZ" python force_alert.py
The resulting Slack message renders exactly what a real price-drop alert
would look like, with all optional rows (vs Was, Per unit, Tags) populated.
PROXIES (optional)
For short-term testing or roughly under 100 product fetches per day, leave
this secret unset. The scraper runs in direct mode via curl_cffi's Chrome
impersonation, which passes Walmart's first-pass Akamai check from most IPs.
For sustained scale (anything beyond a few hours of continuous runs, or more
than ~50 products at hourly cadence), add ISP proxies. One proxy per line,
in the format host:port:user:password:
proxy1.example.com:8000:your_user:your_pass
proxy2.example.com:8001:your_user:your_pass
proxy3.example.com:8002:your_user:your_pass
...
Datacenter ASNs (AWS, OVH, Hetzner, DigitalOcean) hit the Akamai challenge on most Walmart requests; consumer-ISP IPs pass through far more often.
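For illustration, the host:port:user:password lines could be turned into proxy URLs as below. This is a sketch and not the parser in config.py: the function name and the http:// scheme are assumptions.

```python
from urllib.parse import quote

def parse_proxies(raw: str) -> list:
    """Turn host:port:user:password lines into http://user:password@host:port URLs."""
    urls = []
    for number, line in enumerate(raw.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the secret
        parts = line.split(":", 3)  # the password itself may contain colons
        if len(parts) != 4:
            # Report the line number only, so credentials never reach the log.
            raise ValueError(f"PROXIES line {number}: expected host:port:user:password")
        host, port, user, password = parts
        urls.append(f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}")
    return urls
```

Percent-encoding the credentials keeps characters such as "@" or "/" in a password from corrupting the resulting URL.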
Replace the example item IDs with the products you want to monitor. Each
entry needs two fields: item_id and a recognizable name. The item_id is
the numeric ID at the end of a Walmart product URL. For example, the URL
https://www.walmart.com/ip/Apple-AirPods-4/11381374703 has the item ID
11381374703.
{"item_id": "11381374703", "name": "Apple AirPods 4"}
There is no target price field. An alert is sent when a price drop crosses
the thresholds defined in storage.py. Commit and push your changes when
you are done.
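If you are copying IDs out of many product URLs, a helper like the one below can pull the item ID out. It is a sketch under the assumption that the ID is the numeric segment following /ip/ (optionally after a name slug); `item_id_from_url` is not part of the repository:

```python
import re
from typing import Optional

# /ip/<optional-slug>/<digits>, followed by the end of the URL, a slash, a query, or a fragment.
ITEM_ID_RE = re.compile(r"/ip/(?:[^/?#]+/)?(\d+)(?:[/?#]|$)")

def item_id_from_url(url: str) -> Optional[str]:
    """Return the numeric item ID from a Walmart product URL, or None if not found."""
    match = ITEM_ID_RE.search(url)
    return match.group(1) if match else None

print(item_id_from_url("https://www.walmart.com/ip/Apple-AirPods-4/11381374703"))
# 11381374703
```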
Forking from this template? The repo ships with an inherited
price_history.json containing demo readings from the original deployment.
You will not get alerts on those products, only on the ones in your edited
products.json. If you want to start with a clean baseline file, delete it
and let the next workflow run recreate it:
git rm price_history.json
git commit -m "Clean inherited price history before first run"
git push
If you skip this, the inherited readings age out automatically after 30 days
(per HISTORY_RETENTION_DAYS in storage.py).
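The age-out rule amounts to a cutoff filter. The sketch below assumes each reading is a dict with an ISO 8601 "timestamp" field; that field name and the plain-list storage are assumptions, since the actual TinyDB schema is not shown here:

```python
from datetime import datetime, timedelta, timezone

HISTORY_RETENTION_DAYS = 30

def prune_history(readings: list, now: datetime) -> list:
    """Keep only readings newer than the retention cutoff."""
    cutoff = now - timedelta(days=HISTORY_RETENTION_DAYS)
    return [r for r in readings if datetime.fromisoformat(r["timestamp"]) >= cutoff]

now = datetime(2024, 3, 1, tzinfo=timezone.utc)
history = [
    {"timestamp": "2024-01-01T00:00:00+00:00", "price": 103.0},  # 60 days old, dropped
    {"timestamp": "2024-02-20T00:00:00+00:00", "price": 99.0},   # 10 days old, kept
]
kept = prune_history(history, now)
```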
Open the Actions tab in your repository, select Walmart Price Monitor, and click Run workflow. This first run confirms that your secrets are configured correctly. After this, runs happen automatically every 30 minutes.
You can run one full scrape-and-alert cycle on your laptop before deploying to GitHub Actions. Useful for sanity-checking the scraper, the product list, and the Slack format end-to-end.
pip install -r requirements.txt
APPRISE_URLS="slack://T0000/B0000/XYZ" python run_locally.py --once
The --once flag runs a single cycle and exits. Drop it (and optionally pass
--interval 30) to keep the loop running every 30 minutes on your machine.
No proxies needed for local runs: run_locally.py stubs PROXIES and the
scraper falls back to direct mode via curl_cffi's Chrome impersonation.
- Scheduled runs are not exact. GitHub may delay cron triggers by 10 to 30 minutes during periods of high load. For routine price monitoring this is acceptable. For time-sensitive cases such as flash sales, use a dedicated server instead.
- GitHub pauses workflows after 60 days of repository inactivity. This workflow's automatic commits count as activity, so the schedule does not pause in normal use.
- The GitHub free tier gives 2,000 Actions minutes per month for private repositories. Each run takes about 1 minute. With 48 runs per day across 30 days, monthly usage is about 1,440 minutes, which is below the free tier limit. Public repositories have unlimited minutes.
- The workflow includes a concurrency lock. This stops two runs from writing to price_history.json at the same time.
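The Actions-minutes estimate in the list above can be rechecked for other schedules with a few lines of arithmetic. The sketch assumes about 1 minute per run, as stated above; your own run times may differ:

```python
FREE_TIER_MINUTES = 2000  # private-repository allowance quoted above

def monthly_minutes(interval_minutes: int, minutes_per_run: float = 1.0, days: int = 30) -> float:
    """Estimated Actions minutes consumed per month at a given cron interval."""
    runs_per_day = 24 * 60 / interval_minutes
    return runs_per_day * days * minutes_per_run

for interval in (15, 30, 60):
    used = monthly_minutes(interval)
    print(f"every {interval} min: {used:.0f} min/month, within free tier: {used <= FREE_TIER_MINUTES}")
```

Under these assumptions a 15-minute schedule comes to about 2,880 minutes per month, which exceeds the private-repository allowance, while the 30-minute and hourly schedules stay under it.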
Edit products.json and push the change to the repository. The next
scheduled run will use the updated list. No additional deployment step is
needed.
Edit the cron value in .github/workflows/monitor.yml:
- cron: "*/15 * * * *" # every 15 minutes
- cron: "0 * * * *" # every hour
- cron: "0 */6 * * *" # every 6 hours

- The workflow run failed. Open the failed run in the Actions tab, expand the failed step, and read the log output. Most failures are caused by a missing or incorrectly formatted PROXIES or APPRISE_URLS secret.
- All requests return the "Robot or human?" challenge. Walmart's Akamai and HUMAN layers are more aggressive than Amazon's anti-bot system, so proxy quality matters more here. Make sure the proxies are consumer-ISP IPs, not datacenter IPs. Datacenter ASNs (AWS, OVH, Hetzner, DigitalOcean) hit the challenge on most Walmart requests.
- No alerts arrive in Slack even when prices drop. First verify the Slack URL with the apprise -vv command shown earlier under APPRISE_URLS. Then check that the price drop crosses the thresholds in storage.py. A $0.10 drop on a $30 item is below both the default 2% and $1 thresholds and will not trigger an alert.
- The workflow is stuck in the "queued" state. GitHub's free runners can be delayed during periods of high demand. The queue usually clears within 5 to 15 minutes.