Update BibTeX files and integrate extracted talks from tenure packet #73
Open
jpfairbanks wants to merge 5 commits into
Pull request overview
Adds an extracted-talks bibliography source and wiring in Quarto, plus updates multiple CSL-JSON bibliography files to incorporate revised metadata generated from UF Faculty Analytics exports.
Changes:
- Add a Pandoc Lua filter (lua/extract_csl.lua) to extract APA-style talk citations from DOCX into CSL-JSON.
- Add a new talks bibliography file (assets/bib/extracted_talks.json) and include it in bibtable.qmd/bibliography.qmd.
- Update existing bibliography JSON files with revised fields (venues, dates, identifiers), and remove the legacy bibliography.bib.
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| lua/extract_csl.lua | New Pandoc Lua filter to extract CSL-JSON entries from DOCX talk lists. |
| csl-output-testing.json | Added a sample extracted CSL-JSON output for testing/inspection. |
| bibtable.qmd | Adds assets/bib/extracted_talks.json to the table bibliography inputs/resources. |
| bibliography.qmd | Adds a new talks bibliography topic and a new rendered section for it. |
| bibliography.bib | Removes the legacy BibTeX file previously referenced by site config. |
| assets/bib/extracted_talks.json | Adds extracted talks dataset used by Quarto citeproc/multibib. |
| assets/bib/cv_talks.json | Removes a miscategorized non-talk entry from talks JSON. |
| assets/bib/cv_proceedings.json | Updates proceedings metadata (status, dates, identifiers, URLs/DOIs, etc.). |
| assets/bib/cv_preprints.json | Removes a preprint entry from the preprints JSON. |
| assets/bib/cv_posters.json | Adjusts poster metadata (but introduces typos/name-field swap). |
| assets/bib/cv_journals.json | Adds/updates journal metadata (but introduces a DOI formatting issue). |
Comment on lines +13 to +15

    nocite: |
      - "@*"
      - "@talks:*"
Comment on lines +18 to +25

    @@ -19,6 +22,7 @@ resources:
    - assets/bib/cv_posters.json
    - assets/bib/cv_preprints.json
    - assets/bib/apa-cv.csl
    - csl-output.json
    talk: assets/bib/cv_talks.json
    poster: assets/bib/cv_posters.json
    preprint: assets/bib/cv_preprints.json
    talks: assets/bib/extracted_talks.json
Comment on lines 57 to 59

      "family": "Evan",
      "given": "Patterson"
    }
Comment on lines +56 to +62

    else
      local parts = {}
      for k, v in pairs(val) do
        table.insert(parts, '"' .. json_escape(k) .. '":' .. encode(v))
      end
      return "{" .. table.concat(parts, ",") .. "}"
    end
Comment on lines +74 to +91

    -- --------------------------------------------------------------------
    -- APA presentation patterns (Lua patterns, not full PCRE)
    -- -------------------------------------------------------
    -- The patterns are deliberately permissive – they just need to capture
    -- the fields we care about. They are applied to the plain‑text content
    -- of a paragraph (i.e. after Pandoc has stripped formatting).
    -- --------------------------------------------------------------------
    local patterns = {
      -- General pattern for numbered APA entries produced by Pandoc from DOCX.
      -- Captures:
      --   1) author string (up to the period before the date parentheses)
      --   2) year
      --   3) month/day string (or just month)
      --   4) title (plain text, may contain commas, ends with a period)
      --   5) event (conference/journal etc.)
      --   6) location (city/state/country)
      "^%s*%d+%.%s*(.-)%s*%((%d%d%d%d),%s*([^%)]+)%)%.%s*(.-)%.%s*([^,]+),%s+(.+)%.$",
    }
I updated all my talks in the UF Faculty Analytics system, which can export APA-formatted Word files. Talks from these exports can be extracted to CSL-JSON and then used on the website.
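For orientation, a CSL-JSON entry for one talk might be assembled roughly as follows. This is a hedged Python sketch under stated assumptions, not the filter in this PR: the function `to_csl_item`, its parameters, and the sample values are hypothetical, and the real filter may emit different fields. Author names are left out deliberately, since splitting an APA author string into CSL `family`/`given` parts needs more care than a short example allows.

```python
import json

# English month names mapped to month numbers, for dates like "May 4".
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def to_csl_item(item_id, title, event, place, year, month_day):
    """Build one CSL-JSON item of type "speech" (hypothetical helper)."""
    date_parts = [int(year)]
    tokens = month_day.replace(",", " ").split()
    if tokens and tokens[0] in MONTHS:
        date_parts.append(MONTHS[tokens[0]])
        # Only a plain day number is kept; a range such as "4-6" is dropped.
        if len(tokens) > 1 and tokens[1].isdigit():
            date_parts.append(int(tokens[1]))
    return {
        "id": item_id,
        "type": "speech",
        "title": title,
        # CSL 1.0.2 name; processors targeting older CSL may expect "event".
        "event-title": event,
        "event-place": place,
        "issued": {"date-parts": [date_parts]},
    }


if __name__ == "__main__":
    item = to_csl_item(
        "doe2023example",            # made-up id
        "An example talk title",     # made-up title
        "Example Workshop",
        "Gainesville, FL",
        "2023",
        "May 4",
    )
    # A CSL-JSON bibliography file is a JSON array of such items.
    print(json.dumps([item], indent=2))
```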