EphraimAsad/VAE-Of-Mortality

Decomposing Sex Differences in Mortality Across Age and Cause in England and Wales, 1915–2015

Overview

This repository contains the full codebase, processed data, and analytical outputs for the study:

Decomposing Sex Differences in Mortality Across Age and Cause in England and Wales, 1915–2015

The project investigates the long-run structure of male and female mortality using interpretable latent variable modelling. A variational autoencoder (VAE) is applied to age-specific, cause-of-death mortality profiles across a century to uncover low-dimensional demographic mechanisms driving sex differences in mortality.

Rather than modelling causes independently, the approach captures the joint structure of mortality by age, cause, sex, and historical period, allowing male–female divergence to be decomposed into interpretable latent dimensions aligned with known demographic transitions.

All analyses are fully reproducible and rely exclusively on publicly available data.

Data Source

Mortality data are derived from the UK Office for National Statistics (ONS):

Office for National Statistics. Causes of death over 100 years. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/causesofdeathover100years/2017-09-18

Raw cause-of-death counts were harmonised across decades, age groups, and cause categories to ensure comparability across historical changes in registration and classification systems.

Analytical Framework

Latent Mortality Modelling

A variational autoencoder (VAE) is trained separately for males and females

Each observation corresponds to an age group × decade mortality profile

Inputs consist of cause-specific mortality counts

Sex is not included as a model input

Latent space dimensionality: 3

Learning sex-specific latent structures independently ensures that observed differences arise from empirical patterns rather than imposed modelling assumptions.
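The setup above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual model: only the latent dimensionality of 3 and the separate male and female models come from this README. The class name `MortalityVAE`, the hidden width, the number of causes, the squared-error reconstruction term, and the random placeholder input are all assumptions.

```python
import torch
from torch import nn
import torch.nn.functional as F

LATENT_DIM = 3  # stated in this README


class MortalityVAE(nn.Module):
    """Small VAE over one age group x decade profile of cause-specific counts.

    Layer sizes are illustrative assumptions, not the repository's architecture.
    """

    def __init__(self, n_causes: int, hidden: int = 32, latent_dim: int = LATENT_DIM):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_causes, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_causes)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar


def vae_loss(x, x_hat, mu, logvar):
    """Return (total, reconstruction, KL); squared error is an assumed choice."""
    recon = F.mse_loss(x_hat, x, reduction="sum")
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl, recon, kl


torch.manual_seed(0)
n_causes = 10                # hypothetical number of cause categories
x = torch.rand(8, n_causes)  # random placeholder batch, not ONS data

# One independent model per sex; sex itself is never an input feature.
models = {sex: MortalityVAE(n_causes) for sex in ("male", "female")}
x_hat, mu, logvar = models["male"](x)
total, recon, kl = vae_loss(x, x_hat, mu, logvar)
```

Returning the reconstruction and KL terms separately makes it straightforward to report both reconstruction and total loss when comparing models.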

Interpretation of Latent Dimensions

Latent dimensions are interpreted by correlating latent coordinates with observed cause-specific mortality counts across ages and periods.

Three stable and interpretable dimensions emerge:

Z1: Epidemiological transition axis. Captures the transition from infectious and acute causes of death to chronic disease mortality.

Z2: Cancer / modernization axis. Reflects long-run structural shifts in cancer mortality and accident exposure associated with social and economic change.

Z3: External and behavioural risk axis. Captures accidents, suicide, violence, and war-related mortality, exhibiting strong age specificity and pronounced sex differences.

Male–Female Divergence Analysis

Male–female divergence is quantified as the mean absolute distance in aligned latent space at each age–decade cell.

To identify mechanisms driving divergence:

Absolute differences are decomposed by latent axis

The dominant latent dimension is identified for each age and decade

Divergence is interpreted mechanistically rather than descriptively

This allows separation of divergence driven by baseline epidemiological conditions from divergence driven by behavioural and structural exposures.
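The divergence measure and its per-axis decomposition might look like the sketch below. It assumes the male and female latent coordinates have already been aligned to a common space (the alignment procedure is not described here), and it reads "mean absolute distance" as the mean of the per-axis absolute differences. The function name, the cell keys, and the toy values are hypothetical.

```python
def divergence_by_cell(z_male, z_female):
    """Decompose male-female divergence for every shared age-decade cell.

    z_male, z_female: dicts mapping (age_group, decade) -> latent coordinates,
    assumed to be already aligned to a common latent space.
    """
    out = {}
    for cell in sorted(z_male.keys() & z_female.keys()):
        # Absolute male-female difference on each latent axis
        per_axis = [abs(m - f) for m, f in zip(z_male[cell], z_female[cell])]
        # Index of the axis contributing the largest absolute difference
        dominant = max(range(len(per_axis)), key=per_axis.__getitem__)
        out[cell] = {
            "per_axis": per_axis,
            "divergence": sum(per_axis) / len(per_axis),
            "dominant_axis": f"Z{dominant + 1}",
        }
    return out


# Invented toy coordinates for a single cell.
toy_male = {("20-24", 1915): (0.5, 0.1, 1.5)}
toy_female = {("20-24", 1915): (0.4, 0.0, 0.3)}
cells = divergence_by_cell(toy_male, toy_female)
```

For the toy cell, the third axis has by far the largest absolute difference, so it is reported as the dominant dimension for that age and decade.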

Validation and Robustness Analyses

The repository includes multiple validation procedures designed to support interpretability and methodological rigour:

  1. Dimensionality justification

VAEs trained with 2–5 latent dimensions

Reconstruction and total loss evaluated

Three dimensions shown to capture dominant structure without instability

  2. Temporal stability

Separate models trained on 1915–1965 and 1965–2015

Consistent latent axes recovered across periods

  3. Quantitative decoding of latent axes

Synthetic mortality profiles generated at fixed latent positions

Decoded into cause-of-death space to clarify substantive meaning

  4. Temporal trajectories

Mean latent values tracked across decades

Reveals epidemiological transition and modernization trends

  5. Age-specific trajectories

Latent paths traced for selected age groups

Demonstrates age-specific and sex-specific divergence mechanisms
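The two trajectory analyses in the list above reduce to simple grouping operations over the latent coordinates. The sketch below shows one possible way to compute them with the standard library. The function names, the `(age_group, decade)` key layout, and the toy values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import fmean


def mean_latent_by_decade(latent_by_cell):
    """Average latent coordinates over all age groups within each decade."""
    grouped = defaultdict(list)
    for (_age_group, decade), z in latent_by_cell.items():
        grouped[decade].append(z)
    return {
        decade: tuple(fmean(axis) for axis in zip(*rows))
        for decade, rows in sorted(grouped.items())
    }


def latent_path_for_age(latent_by_cell, age_group):
    """Return the decade-ordered latent path for one age group."""
    return [
        (decade, z)
        for (age, decade), z in sorted(latent_by_cell.items(), key=lambda kv: kv[0][1])
        if age == age_group
    ]


# Invented toy coordinates for two age groups and two decades.
toy_latent = {
    ("0-4", 1915): (2.0, 0.0, 0.0),
    ("65-74", 1915): (0.0, 0.0, 1.0),
    ("0-4", 2005): (-1.0, 1.0, 0.0),
    ("65-74", 2005): (-1.0, 0.0, 0.0),
}
decade_means = mean_latent_by_decade(toy_latent)
infant_path = latent_path_for_age(toy_latent, "0-4")
```

Running the same functions on the male and female coordinates separately would give one trajectory per sex for comparison.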

Key Outputs

CSV tables enabling replication and secondary analysis

Publication-quality figures illustrating latent structure and divergence

Fully documented notebooks covering the entire analytical pipeline

Reproducibility

All analyses can be reproduced by:

Cloning the repository

Installing dependencies: pip install -r environment/requirements.txt

Running notebooks sequentially

The analysis was conducted using:

Python ≥ 3.10

PyTorch

NumPy, pandas, scikit-learn

matplotlib and seaborn

Random seeds are fixed throughout to ensure reproducibility.
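This README does not say how the seeds are set. A common pattern, shown here only as an assumed sketch, is a single helper called at the top of each notebook. The helper name `set_seed` and the default value 42 are hypothetical. NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed: int = 42) -> None:
    """Seed Python's RNG and, when available, NumPy's and PyTorch's RNGs."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)  # seeds NumPy's legacy global RNG
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)  # seeds PyTorch's random number generators
    except ImportError:
        pass


set_seed(0)
first_draw = random.random()
set_seed(0)
second_draw = random.random()  # same value as first_draw after reseeding
```

Fixed seeds make runs repeatable on one machine, but bitwise-identical results across different hardware or library versions are not guaranteed by seeding alone.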

Intended Use

This repository is intended for:

Demographic and population health research

Methodological work on mortality structure

Interpretable machine learning in social science

Peer review, replication, and visa assessment

The code prioritises clarity and interpretability over production optimisation.

Citation

If you use this work, please cite:

Decomposing Sex Differences in Mortality Across Age and Cause in England and Wales, 1915–2015. Preprint forthcoming.

License

This project is released under the MIT License. ONS data remain subject to their original usage terms.

Contact

For questions, replication issues, or collaboration inquiries, please open an issue or contact the author via GitHub.
