Fixed
Both Engines
Keyboard controls now work reliably. The P, R, S/Q, and W keys previously had no effect unless pressed during a narrow window between page-loop iterations.
Root cause: the listener thread stored keys in _key, but the main loop only called get_key() once per page (every 10–30 s), so most keypresses were missed.
Fix: P and S/Q now set threading.Event flags (_pause_event, _stop_event) directly inside the listener thread, so they take effect immediately, even when the main loop is blocked in time.sleep() or a tqdm iteration. R clears _pause_event immediately. W remains stored in _key and is consumed by the main loop to print stats.
InputController gains is_paused() and is_stopped() helper methods; both scrapers now poll these instead of a local paused variable.
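The flag-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names InputController, _key, _pause_event, _stop_event, get_key(), is_paused() and is_stopped() come from the notes above, while handle_key() and the lock are hypothetical stand-ins for whatever the real listener thread calls on each keypress.

```python
import threading


class InputController:
    """Sketch: the listener thread sets flags itself instead of queueing keys."""

    def __init__(self):
        self._pause_event = threading.Event()
        self._stop_event = threading.Event()
        self._key = None               # only W is still stored for the main loop
        self._lock = threading.Lock()  # assumption: guards _key across threads

    def handle_key(self, ch):
        """Hypothetical callback, run inside the listener thread per keypress."""
        ch = ch.lower()
        if ch == "p":
            self._pause_event.set()    # pause is visible to pollers right away
        elif ch == "r":
            self._pause_event.clear()  # resume
        elif ch in ("s", "q"):
            self._stop_event.set()     # stop request
        elif ch == "w":
            with self._lock:
                self._key = "w"        # consumed later by the main loop

    def get_key(self):
        """Return and clear the stored key (called once per page by the main loop)."""
        with self._lock:
            key, self._key = self._key, None
        return key

    def is_paused(self):
        return self._pause_event.is_set()

    def is_stopped(self):
        return self._stop_event.is_set()
```

A scraper loop would then check is_stopped() and is_paused() wherever it can react, rather than waiting for the next get_key() call.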
Dead-website retry: 0 → 1. Profile/crawl fetches now retry once on ReadTimeout and general connection errors.
ConnectTimeout (a TCP-level failure: the host is unreachable) still returns ("", 0) immediately with zero retries, so truly dead sites never stall the run.
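The retry policy above might look something like the sketch below. Several things here are assumptions made so the example runs on its own: the exception classes are local stand-ins that mirror the names in the notes (in the real code they would come from the HTTP library), fetch_with_retry() and its get parameter are hypothetical, and returning ("", 0) once the single retry is exhausted is a guess, since the notes only state that result for ConnectTimeout.

```python
class ConnectError(Exception):
    """Stand-in for a general connection error."""


class ConnectTimeout(ConnectError):
    """Stand-in: the TCP connection could not be established in time."""


class ReadTimeout(Exception):
    """Stand-in: connected, but the server did not respond in time."""


def fetch_with_retry(get, url, retries=1):
    """Call get(url) -> (text, status), retrying only on recoverable errors.

    ConnectTimeout is caught first because it subclasses ConnectError here;
    it returns at once so an unreachable host costs a single attempt.
    """
    for attempt in range(retries + 1):
        try:
            return get(url)
        except ConnectTimeout:
            return ("", 0)          # dead host: no retry
        except (ReadTimeout, ConnectError):
            if attempt == retries:  # assumption: give up with the same sentinel
                return ("", 0)
    return ("", 0)
```

The order of the except clauses matters in this sketch: if the broader connection error were listed first, an unreachable host would be retried like any other failure.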
Added
Documentation
docs/ folder added to the repository root for visual assets.
Contains docs/README.md with step-by-step instructions for recording terminal-demo.gif (asciinema) and capturing excel-preview.png.
docs/.gitkeep ensures Git tracks the empty folder.
README Output Preview — placeholder text replaced with an HTML comment
block that is ready to uncomment once assets are placed in docs/.
README Project Structure — updated to show the docs/ folder.