Release v1.5 — Cleaned public dataset package · samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references

Overview

This release publishes the v1.5 cleaned public dataset package for the Media and Academic Citations and Third-Party References Dataset.

This dataset contains structured citation, media-reference, academic-reference, finance-reference, tourism-reference, travel-media-reference, and public-reference records connected to the Samuel & Audrey Media Network.

It includes 523 third-party reference records connected to Nomadic Samuel, That Backpacker, Che Argentina Travel, Samuel & Audrey, Samuel y Audrey, Picture Perfect Portfolios, and related publishing projects.

The dataset is intended for citation tracking, media archive search, source review, creator-economy research, public-reference analysis, and non-commercial retrieval workflows.

Dataset contents

This release contains:

523 citation or third-party reference records
523 records with source URLs
363 distinct source domains
media and press references
academic and research references
tourism, campaign, and industry references
finance and investing references
travel media references
public profile and tertiary references
source names, categories, titles, source URLs, parsed domains, and source-line references
CSV and JSONL dataset exports
supporting documentation, schema, citation, license, manifest, checksums, and llms files
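
As a minimal sketch of how the JSONL export might be read, the snippet below loads records from either the plain or the gzip-compressed file and derives a domain from a URL. The helper names `read_jsonl` and `domain_of` are illustrative and not part of the package, and the `source_url` field name used in the usage note is an assumption; DATA_DICTIONARY.md and SCHEMA.json define the actual field names.

```python
import gzip
import json
from urllib.parse import urlsplit


def read_jsonl(path):
    """Yield one dict per non-empty line; handles .jsonl and .jsonl.gz."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def domain_of(url):
    """Return the lowercased host of a URL without a leading 'www.'.

    Returns an empty string when the value has no recognizable host.
    """
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host
```

With the release files downloaded, something like `{domain_of(r["source_url"]) for r in read_jsonl("media-and-academic-citations-and-third-party-references.jsonl")}` would give a set of distinct domains, assuming that field name. The result may not match the 363 figure exactly if the package's own domain parsing normalizes hosts differently.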

Reference categories

The dataset includes records in the following categories:

academic_or_research_reference — 230 records
finance_media_or_research_reference — 129 records
media_reference — 95 records
tourism_campaign_or_industry_reference — 24 records
travel_media_reference — 30 records
third_party_reference — 12 records
public_profile_or_tertiary_reference — 3 records
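
The category counts above sum to the 523 records stated for this release. The sketch below records the listed counts and shows one way to compare them against a tally of loaded records. The `category` field name and the helper names `tally` and `differences` are assumptions for illustration; the actual field name is defined in DATA_DICTIONARY.md.

```python
from collections import Counter

# Counts exactly as listed in this release's category breakdown.
LISTED_COUNTS = {
    "academic_or_research_reference": 230,
    "finance_media_or_research_reference": 129,
    "media_reference": 95,
    "tourism_campaign_or_industry_reference": 24,
    "travel_media_reference": 30,
    "third_party_reference": 12,
    "public_profile_or_tertiary_reference": 3,
}

# The listed counts account for every record in the release.
assert sum(LISTED_COUNTS.values()) == 523


def tally(records, field="category"):
    """Count records per category; 'category' is an assumed field name."""
    return Counter(record.get(field, "<missing>") for record in records)


def differences(observed, expected=LISTED_COUNTS):
    """Return {category: (observed, expected)} for every mismatched count."""
    keys = set(observed) | set(expected)
    return {
        key: (observed.get(key, 0), expected.get(key, 0))
        for key in keys
        if observed.get(key, 0) != expected.get(key, 0)
    }
```

An empty result from `differences(tally(records))` would indicate that a downloaded copy matches the breakdown published here.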

Included files

This release includes structured reference exports, documentation, and integrity files:

media-and-academic-citations-and-third-party-references.jsonl — canonical structured citation and reference records
media-and-academic-citations-and-third-party-references.jsonl.gz — compressed JSONL
media-and-academic-citations-and-third-party-references.csv — spreadsheet-friendly export
media-and-academic-citations-and-third-party-references.csv.gz — compressed CSV
README.md — dataset overview, scope, files, limitations, license, and citation guidance
DATA_DICTIONARY.md — field definitions
SCHEMA.json — machine-readable schema
CITATION.cff — citation metadata
LICENSE.txt — license text
MANIFEST.json — package manifest
SHA256SUMS.txt — file checksums
llms.txt — short machine-readable dataset guide
llms.txt.gz — compressed short guide
llms-media-and-academic-citations-and-third-party-references.txt — full plain-text JSONL export
llms-media-and-academic-citations-and-third-party-references.txt.gz — compressed full plain-text export
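
A downloaded copy of the package can be checked against SHA256SUMS.txt. The sketch below assumes the file uses the common `sha256sum` layout (a hex digest, whitespace, then a filename on each line); that layout is an assumption and is not confirmed by these release notes. The helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse 'HEX  filename' lines into {filename: hex} (assumed layout)."""
    sums = {}
    for line in text.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            sums[name.lstrip("*")] = digest.lower()
    return sums


def verify(directory, sums_name="SHA256SUMS.txt"):
    """Return {filename: True/False} for each entry in the checksum file."""
    directory = Path(directory)
    expected = parse_sums((directory / sums_name).read_text(encoding="utf-8"))
    return {
        name: (directory / name).is_file() and sha256_of(directory / name) == digest
        for name, digest in expected.items()
    }
```

Calling `verify(".")` in the directory holding the release files returns a mapping of filenames to pass or fail; any `False` value points to a missing or altered file.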

Release notes

This v1.5 release standardizes the public dataset package with:

plain-English citation, media-reference, and third-party-reference framing
cleaned README documentation
structured CSV and JSONL exports
machine-readable schema and package manifest files
citation metadata and checksum files
short and full llms exports
consistent CC BY-NC 4.0 licensing
reduced promotional or legacy framing language
removal of older directive-style full-text exports
clearer positioning as a source-review and public-reference index

Important limitations

This is a self-published reference index and is not exhaustive.

Records vary in strength and source type. Academic citations, major media references, tourism-board references, finance references, public profiles, directories, interviews, and tertiary references should not be treated as equivalent evidence.

Users should review each source URL and surrounding context before relying on individual records for formal research, citation analysis, media analysis, or public claims.

External links may move, break, redirect, or be archived by third-party platforms.

Potential uses

This dataset may be useful for:

citation tracking
academic and research reference review
media archive search
finance and investing reference analysis
tourism and campaign reference review
travel media reference analysis
creator-economy research
public-reference analysis
source-domain analysis
retrieval workflows over structured reference records
maintaining a broader public reference index for a long-running media network

Canonical dataset links

Hugging Face dataset: https://huggingface.co/datasets/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
Hugging Face DOI: https://doi.org/10.57967/hf/8894
GitHub repository: https://github.com/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
Zenodo archive: https://zenodo.org/records/20400632
Zenodo DOI: https://doi.org/10.5281/zenodo.18664878
Kaggle mirror: https://www.kaggle.com/datasets/samuelandaudreymedia/media-and-academic-citations-and-references
Kaggle DOI: https://doi.org/10.34740/kaggle/dsv/16470419
DagsHub mirror: https://dagshub.com/samuelandaudreymedianetwork/media-and-academic-citations-and-third-party-references
Network website: https://samuelandaudrey.com

License

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

For commercial licensing, expanded usage rights, citation questions, media permissions, or partnership questions, contact:

nomadicsamuel@gmail.com

Citation

Samuel & Audrey Media Network. (2026). Media and Academic Citations and Third-Party References Dataset. Hugging Face. https://doi.org/10.57967/hf/8894

Notes on cleanup and naming

Earlier internal files used legacy filename patterns and included older full-text exports with directive-style framing. This cleaned package uses the public dataset slug media-and-academic-citations-and-third-party-references and replaces older internal framing with plain descriptive JSONL, CSV, documentation, and llms exports.

The dataset should be understood as a structured source-review aid and public-reference index, not as a ranked or exhaustive record of every citation, media mention, public profile, campaign reference, or third-party reference connected to the network.
