v1.5 — Cleaned public dataset package

samuelandaudreymedianetwork released this on 26 May at 18:30 (7c51419)

Overview

This release publishes the v1.5 cleaned public dataset package for the Media and Academic Citations and Third-Party References Dataset.

This dataset contains structured citation, media-reference, academic-reference, finance-reference, tourism-reference, travel-media-reference, and public-reference records connected to the Samuel & Audrey Media Network.

It includes 523 third-party reference records connected to Nomadic Samuel, That Backpacker, Che Argentina Travel, Samuel & Audrey, Samuel y Audrey, Picture Perfect Portfolios, and related publishing projects.

The dataset is intended for citation tracking, media archive search, source review, creator-economy research, public-reference analysis, and non-commercial retrieval workflows.

Dataset contents

This release contains:

  • 523 citation or third-party reference records
  • 523 records with source URLs
  • 363 distinct source domains
  • media and press references
  • academic and research references
  • tourism, campaign, and industry references
  • finance and investing references
  • travel media references
  • public profile and tertiary references
  • source names, categories, titles, source URLs, parsed domains, and source-line references
  • CSV and JSONL dataset exports
  • supporting documentation, schema, citation, license, manifest, checksums, and llms files
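The headline counts above can be recomputed from the JSONL export. The following is a minimal sketch of one way to do that in Python, not the method used to build the package. The field name `source_url` is an assumption; the real field names are defined in DATA_DICTIONARY.md and SCHEMA.json. Counting distinct URL hosts may also differ slightly from the package's own parsed-domain field, for example in how a `www.` prefix is handled.

```python
import json
from urllib.parse import urlparse


def summarize_records(lines, url_field="source_url"):
    """Count records, records with a URL, and distinct URL hosts.

    `url_field` is a hypothetical field name; check DATA_DICTIONARY.md
    for the real one before running this against the release files.
    """
    total = 0
    with_url = 0
    hosts = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL input
        record = json.loads(line)
        total += 1
        url = record.get(url_field)
        if url:
            with_url += 1
            host = urlparse(url).hostname  # lowercased by urlparse
            if host:
                hosts.add(host)
    return {"records": total, "with_url": with_url, "domains": len(hosts)}


# Invented sample lines, only to show the call shape.
sample_lines = [
    '{"source_url": "https://example.com/article"}',
    '{"source_url": "https://example.org/paper"}',
]
print(summarize_records(sample_lines))
```

To run it against the release file, pass an open file handle instead of the sample list, for example `open("media-and-academic-citations-and-third-party-references.jsonl", encoding="utf-8")`, or `gzip.open(..., "rt", encoding="utf-8")` for the compressed copy.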

Reference categories

The dataset includes records in the following categories:

  • academic_or_research_reference — 230 records
  • finance_media_or_research_reference — 129 records
  • media_reference — 95 records
  • tourism_campaign_or_industry_reference — 24 records
  • travel_media_reference — 30 records
  • third_party_reference — 12 records
  • public_profile_or_tertiary_reference — 3 records
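As a quick consistency check, the per-category counts listed above should add up to the 523 records stated for the release. The short Python sketch below copies the counts from the list, confirms the total, and prints each category's share of the dataset.

```python
# Record counts per category, copied from the list above.
category_counts = {
    "academic_or_research_reference": 230,
    "finance_media_or_research_reference": 129,
    "media_reference": 95,
    "tourism_campaign_or_industry_reference": 24,
    "travel_media_reference": 30,
    "third_party_reference": 12,
    "public_profile_or_tertiary_reference": 3,
}

total = sum(category_counts.values())
assert total == 523, f"expected 523 records, got {total}"

# Share of the dataset held by each category, largest first.
ranked = sorted(category_counts.items(), key=lambda kv: kv[1], reverse=True)
for name, count in ranked:
    print(f"{name:42s} {count:4d} {count / total:6.1%}")
```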

Included files

This release includes structured reference exports, documentation, and integrity files:

  • media-and-academic-citations-and-third-party-references.jsonl — canonical structured citation and reference records
  • media-and-academic-citations-and-third-party-references.jsonl.gz — compressed JSONL
  • media-and-academic-citations-and-third-party-references.csv — spreadsheet-friendly export
  • media-and-academic-citations-and-third-party-references.csv.gz — compressed CSV
  • README.md — dataset overview, scope, files, limitations, license, and citation guidance
  • DATA_DICTIONARY.md — field definitions
  • SCHEMA.json — machine-readable schema
  • CITATION.cff — citation metadata
  • LICENSE.txt — license text
  • MANIFEST.json — package manifest
  • SHA256SUMS.txt — file checksums
  • llms.txt — short machine-readable dataset guide
  • llms.txt.gz — compressed short guide
  • llms-media-and-academic-citations-and-third-party-references.txt — full plain-text JSONL export
  • llms-media-and-academic-citations-and-third-party-references.txt.gz — compressed full plain-text export
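The checksum file can be used to confirm that downloaded files match the published package. The sketch below is one possible approach in Python. It assumes SHA256SUMS.txt follows the common `sha256sum` layout of one `<hex digest>  <filename>` pair per line, which this release page does not confirm; the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large exports need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse sha256sum-style lines into {filename: digest}.

    Assumed layout: "<hex digest>  <filename>", with an optional "*"
    before the filename (binary-mode marker).
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_package(directory, sums_name="SHA256SUMS.txt"):
    """Return {filename: True/False} for every file listed in the sums file."""
    root = Path(directory)
    expected = parse_sums((root / sums_name).read_text(encoding="utf-8"))
    results = {}
    for name, digest in expected.items():
        target = root / name
        # A listed file that is missing locally counts as a failure.
        results[name] = target.is_file() and sha256_of(target) == digest
    return results
```

Calling `verify_package(".")` from the folder that holds the downloaded files returns a dictionary in which any `False` value marks a missing or altered file.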

Release notes

This v1.5 release standardizes the public dataset package with:

  • plain-English citation, media-reference, and third-party-reference framing
  • cleaned README documentation
  • structured CSV and JSONL exports
  • machine-readable schema and package manifest files
  • citation metadata and checksum files
  • short and full llms exports
  • consistent CC BY-NC 4.0 licensing
  • reduced promotional or legacy framing language
  • removal of older directive-style full-text exports
  • clearer positioning as a source-review and public-reference index

Important limitations

This is a self-published reference index and is not exhaustive.

Records vary in strength and source type. Academic citations, major media references, tourism-board references, finance references, public profiles, directories, interviews, and tertiary references should not be treated as equivalent evidence.

Users should review each source URL and surrounding context before relying on individual records for formal research, citation analysis, media analysis, or public claims.

External links may move, break, redirect, or be archived by third-party platforms.

Potential uses

This dataset may be useful for:

  • citation tracking
  • academic and research reference review
  • media archive search
  • finance and investing reference analysis
  • tourism and campaign reference review
  • travel media reference analysis
  • creator-economy research
  • public-reference analysis
  • source-domain analysis
  • retrieval workflows over structured reference records
  • maintaining a broader public reference index for a long-running media network
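As an illustration of the source-domain and category analysis mentioned above, the sketch below tallies the most common values in one column of the CSV export using only the Python standard library. The column names `category` and `domain` and the sample rows are invented for the example; the actual column names are listed in DATA_DICTIONARY.md.

```python
import csv
import io
from collections import Counter


def top_values(csv_text, column, n=5):
    """Return the n most common non-empty values in one CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row[column].strip()
        for row in reader
        if (row.get(column) or "").strip()  # skip missing or blank cells
    )
    return counts.most_common(n)


# Invented sample rows; the column names are assumptions.
sample_csv = """category,domain
media_reference,example.com
academic_or_research_reference,example.org
media_reference,example.com
"""

print(top_values(sample_csv, "domain"))
print(top_values(sample_csv, "category", n=1))
```

For the real export, read the CSV file's text and pass it in place of `sample_csv`, substituting the documented column names.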

Canonical dataset links

License

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

For commercial licensing, expanded usage rights, citation questions, media permissions, or partnership questions, contact:

nomadicsamuel@gmail.com

Citation

Samuel & Audrey Media Network. (2026). Media and Academic Citations and Third-Party References Dataset. Hugging Face. https://doi.org/10.57967/hf/8894

Notes on cleanup and naming

Earlier internal files used legacy filename patterns and included older full-text exports with directive-style framing. This cleaned package names its files after the public dataset slug media-and-academic-citations-and-third-party-references and replaces the older exports with plain, descriptive JSONL, CSV, documentation, and llms files.

The dataset should be understood as a structured source-review aid and public-reference index, not as a ranked or exhaustive record of every citation, media mention, public profile, campaign reference, or third-party reference connected to the network.