Overview
This release publishes the v1.5 cleaned public dataset package for the Media and Academic Citations and Third-Party References Dataset.
This dataset contains structured citation, media-reference, academic-reference, finance-reference, tourism-reference, travel-media-reference, and public-reference records connected to the Samuel & Audrey Media Network.
It includes 523 third-party reference records connected to Nomadic Samuel, That Backpacker, Che Argentina Travel, Samuel & Audrey, Samuel y Audrey, Picture Perfect Portfolios, and related publishing projects.
The dataset is intended for citation tracking, media archive search, source review, creator-economy research, public-reference analysis, and non-commercial retrieval workflows.
Dataset contents
This release contains:
- 523 citation or third-party reference records
- 523 records with source URLs
- 363 distinct source domains
- media and press references
- academic and research references
- tourism, campaign, and industry references
- finance and investing references
- travel media references
- public profile and tertiary references
- source names, categories, titles, source URLs, parsed domains, and source-line references
- CSV and JSONL dataset exports
- supporting documentation, schema, citation, license, manifest, checksums, and llms files
Reference categories
The dataset includes records in the following categories:
- academic_or_research_reference: 230 records
- finance_media_or_research_reference: 129 records
- media_reference: 95 records
- tourism_campaign_or_industry_reference: 24 records
- travel_media_reference: 30 records
- third_party_reference: 12 records
- public_profile_or_tertiary_reference: 3 records
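The per-category counts above can be rechecked from the JSONL export. The following is a minimal sketch, not the package's own tooling: the field name `category` is an assumption (the actual field names are defined in DATA_DICTIONARY.md and SCHEMA.json), and the inline sample records are invented so the snippet runs without the real file.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_category(lines: Iterable[str], field: str = "category") -> Counter:
    """Count JSONL records per value of `field`.

    The default field name "category" is an assumption; check
    DATA_DICTIONARY.md for the real name before using this.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        counts[record.get(field, "<missing>")] += 1
    return counts


# Invented sample records, used only to make the sketch runnable.
sample = [
    '{"category": "media_reference", "source_url": "https://example.com/a"}',
    '{"category": "media_reference", "source_url": "https://example.org/b"}',
    '{"category": "travel_media_reference", "source_url": "https://example.net/c"}',
]
category_counts = count_by_category(sample)
print(category_counts.most_common())
```

Against the real export, passing an open file handle for media-and-academic-citations-and-third-party-references.jsonl in place of `sample` should give counts that sum to 523 if the field name matches.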
Included files
This release includes structured reference exports, documentation, and integrity files:
- media-and-academic-citations-and-third-party-references.jsonl: canonical structured citation and reference records
- media-and-academic-citations-and-third-party-references.jsonl.gz: compressed JSONL
- media-and-academic-citations-and-third-party-references.csv: spreadsheet-friendly export
- media-and-academic-citations-and-third-party-references.csv.gz: compressed CSV
- README.md: dataset overview, scope, files, limitations, license, and citation guidance
- DATA_DICTIONARY.md: field definitions
- SCHEMA.json: machine-readable schema
- CITATION.cff: citation metadata
- LICENSE.txt: license text
- MANIFEST.json: package manifest
- SHA256SUMS.txt: file checksums
- llms.txt: short machine-readable dataset guide
- llms.txt.gz: compressed short guide
- llms-media-and-academic-citations-and-third-party-references.txt: full plain-text JSONL export
- llms-media-and-academic-citations-and-third-party-references.txt.gz: compressed full plain-text export
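Because the package ships SHA256SUMS.txt, a download can be checked for integrity before use. The sketch below assumes the file follows the common `sha256sum` layout (one `<hex digest>  <filename>` pair per line); that layout is an assumption, not something this release states, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text: str) -> dict[str, str]:
    """Parse '<hex digest>  <filename>' lines into {filename: digest}.

    Assumes sha256sum-style lines; a leading '*' (binary-mode marker)
    on the filename is dropped.
    """
    expected: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        expected[name.lstrip("*").strip()] = digest.lower()
    return expected


def verify(directory: Path, sums_file: str = "SHA256SUMS.txt") -> dict[str, bool]:
    """Map each listed filename to True if it exists and its digest matches."""
    expected = parse_sums((directory / sums_file).read_text(encoding="utf-8"))
    return {
        name: (directory / name).is_file()
        and sha256_of(directory / name) == digest
        for name, digest in expected.items()
    }
```

Calling `verify(Path("path/to/package"))` returns a per-file pass/fail mapping; any `False` entry means the file is missing or differs from the published checksum.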
Release notes
This v1.5 release standardizes the public dataset package with:
- plain-English citation, media-reference, and third-party-reference framing
- cleaned README documentation
- structured CSV and JSONL exports
- machine-readable schema and package manifest files
- citation metadata and checksum files
- short and full llms exports
- consistent CC BY-NC 4.0 licensing
- reduced promotional or legacy framing language
- removal of older directive-style full-text exports
- clearer positioning as a source-review and public-reference index
Important limitations
This is a self-published reference index and is not exhaustive.
Records vary in strength and source type. Academic citations, major media references, tourism-board references, finance references, public profiles, directories, interviews, and tertiary references should not be treated as equivalent evidence.
Users should review each source URL and surrounding context before relying on individual records for formal research, citation analysis, media analysis, or public claims.
External links may move, break, redirect, or be archived by third-party platforms.
Potential uses
This dataset may be useful for:
- citation tracking
- academic and research reference review
- media archive search
- finance and investing reference analysis
- tourism and campaign reference review
- travel media reference analysis
- creator-economy research
- public-reference analysis
- source-domain analysis
- retrieval workflows over structured reference records
- maintaining a broader public reference index for a long-running media network
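As an illustration of source-domain analysis, the sketch below derives a domain from each source URL and counts records per domain. It is a hedged example only: the URLs are invented, and stripping a leading `www.` is one possible normalization choice. The rule used to produce the dataset's own parsed-domain field is not documented here, so results may not reproduce the 363 distinct domains exactly.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'.

    Returns an empty string when no host can be parsed.
    """
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def domain_counts(urls: Iterable[str]) -> Counter:
    """Count URLs per parsed domain, skipping URLs with no host."""
    return Counter(d for d in map(domain_of, urls) if d)


# Invented URLs, used only to make the sketch runnable.
sample_urls = [
    "https://www.example.com/article-1",
    "https://example.com/article-2",
    "https://news.example.org/story",
    "not a url",
]
counts = domain_counts(sample_urls)
print(len(counts), counts.most_common(1))
```

With the real data, the URLs would come from the source URL column of the CSV or JSONL export, and `len(counts)` gives the number of distinct domains under this normalization.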
Canonical dataset links
- Hugging Face dataset: https://huggingface.co/datasets/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
- Hugging Face DOI: https://doi.org/10.57967/hf/8894
- GitHub repository: https://github.com/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
- Zenodo archive: https://zenodo.org/records/20400632
- Zenodo DOI: https://doi.org/10.5281/zenodo.18664878
- Kaggle mirror: https://www.kaggle.com/datasets/samuelandaudreymedia/media-and-academic-citations-and-references
- Kaggle DOI: https://doi.org/10.34740/kaggle/dsv/16470419
- DagsHub mirror: https://dagshub.com/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
- Network website: https://samuelandaudrey.com
License
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
For commercial licensing, expanded usage rights, citation questions, media permissions, or partnership questions, contact:
Citation
Samuel & Audrey Media Network. (2026). Media and Academic Citations and Third-Party References Dataset. Hugging Face. https://doi.org/10.57967/hf/8894
Notes on cleanup and naming
Earlier internal files used legacy filename patterns and included older full-text exports with directive-style framing. This cleaned package uses the public dataset slug media-and-academic-citations-and-third-party-references and replaces older internal framing with plain descriptive JSONL, CSV, documentation, and llms exports.
The dataset should be understood as a structured source-review aid and public-reference index, not as a ranked or exhaustive record of every citation, media mention, public profile, campaign reference, or third-party reference connected to the network.