Fixes uchicago-dsi/core-facility#103 (out-of-date scrapers) #24
jpivarski wants to merge 6 commits into
AFDB
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows -k afdb -vv"
Before fix
Command output
After fix

IDB
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows -k idb -vv"
No changes
Although the issue says that there was an error fetching an authentication token, I didn't see that. Without making any changes, all tests pass.
Command output

DFC
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows::test_download -k dfc -vv"
Before fix
Although the issue says that this one was already fixed, I had to allow for a space in the
Command output
After fix
Command output

Proparco
Two tests:
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows::test_result_page_scrape -k pro -vv"
and
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows::test_project_page_scrape -k pro -vv"
The issue says that this one had a datetime format error that was already fixed. I must have been looking for the baseline code in the wrong place because I had to fix that format error myself.
Before fix
Command output
After fix
Command output

Testing everything
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py::TestWorkflows -vv"
Before fix
Command output

Just test the failing ones
docker run --rm -v "$PWD/services/extract/src:/app/src" --env-file "$PWD/services/extract/.env.dev" debit-scrapers bash -lc "cd /app/src/pipeline && uv run pytest './extract/tests/integration/banks.py::TestWorkflows::test_partial_download[ebrd]' './extract/tests/integration/banks.py::TestWorkflows::test_project_partial_page_scrape[ebrd-https://www.ebrd.com/home/work-with-us/projects/psd/36582.html]' './extract/tests/integration/banks.py::TestWorkflows::test_project_partial_page_scrape[ebrd-https://www.ebrd.com/home/work-with-us/projects/psd/56092.html]' './extract/tests/integration/banks.py::TestWorkflows::test_project_partial_page_scrape[ebrd-https://www.ebrd.com/work-with-us/projects/psd/52642.html]' './extract/tests/integration/banks.py::TestWorkflows::test_project_partial_page_scrape[ebrd-https://www.ebrd.com/work-with-us/projects/psd/54846.html]' './extract/tests/integration/banks.py::TestWorkflows::test_project_partial_page_scrape[ebrd-https://www.ebrd.com/work-with-us/projects/psd/technonicol-regional-expansion--resource-efficiency.html]' -vv"
After fix
The first of these tests failed with a Unicode decoding error. The others require a
Command output

Ran all tests again
And everything passed. (This is with a Gemini API key set up, though it also works without one.)
Command output

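Every command in this series shares the same docker wrapper and differs only in the pytest selection. As a convenience, that pattern could be captured in a small shell function. This is only a sketch: `run_bank_tests` and the `DRY_RUN` switch are hypothetical names introduced here, not something in the repository, while the image name, mount, env file, and test path are copied from the commands in this thread.

```shell
# Hypothetical helper, not part of this PR: wraps the docker invocation
# repeated in the comments in this thread. Run it from the repository root.
# Usage: run_bank_tests [NODE_SUFFIX] [PYTEST_ARGS...]
# Set DRY_RUN=1 to print the in-container command instead of running it.
run_bank_tests() {
    # Optional suffix appended to banks.py, e.g. "::TestWorkflows::test_download".
    selector="${1:-}"
    if [ "$#" -gt 0 ]; then shift; fi

    # Command executed inside the container; remaining arguments go to pytest.
    inner="cd /app/src/pipeline && uv run pytest ./extract/tests/integration/banks.py${selector}"
    if [ "$#" -gt 0 ]; then inner="$inner $*"; fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$inner"
        return 0
    fi

    docker run --rm \
        -v "$PWD/services/extract/src:/app/src" \
        --env-file "$PWD/services/extract/.env.dev" \
        debit-scrapers bash -lc "$inner"
}
```

With this defined, `run_bank_tests "::TestWorkflows" -k afdb -vv` would run the same command as the AFDB comment. The sketch does not re-quote its arguments, so the parametrized EBRD test IDs (which contain brackets and URLs) would still need the explicit single-quoted form used in the "Just test the failing ones" command.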
Fixes uchicago-dsi/core-facility#103
I've been collecting logs of the test commands and their output, but they were too large to put in one comment box, so I split them into a series of comments below.
Worth noting: IDB has self-healed since it was last tested.