FT-Transformer EHR Retrieval

A controlled method study comparing an FT-Transformer-style tabular encoder with a baseline MLP encoder for multimodal retrieval.

The project is inspired by multimodal medical representation learning, but the public version uses controlled synthetic data instead of restricted clinical data. The goal is to study method behavior under known data-generating conditions, not to make clinical claims.

Research Report

A paper-style summary of this project is available here:

Read the paper-style report

Research Question

Can an FT-Transformer-style tabular encoder improve retrieval alignment compared with a simpler MLP encoder when the data contains feature interactions, noisy pairings, or larger sample sizes?

What This Repository Shows

This repository demonstrates:

how to build a two-tower contrastive retrieval pipeline
how to compare MLP and FT-Transformer tabular encoders
how Recall@K and lift-over-random can be used for retrieval evaluation
how model behavior changes under linear, interaction-based, and noisy-pair synthetic setups
why higher embedding similarity does not always guarantee better top-k retrieval
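
As a rough illustration of the two-tower contrastive objective mentioned above, the sketch below computes a symmetric InfoNCE-style loss over a batch of paired EHR and image embeddings. It is a minimal pure-Python sketch under assumptions: the function names, the temperature value of 0.07, and the choice of a CLIP-style symmetric loss are illustrative and are not taken from this repository's code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_z for x in row]


def symmetric_contrastive_loss(ehr_emb, img_emb, temperature=0.07):
    """Average of EHR-to-image and image-to-EHR cross-entropy.

    Row k of ehr_emb is assumed to be the true partner of row k of img_emb.
    The temperature is an assumed value, not one reported by the repository.
    """
    ehr = [l2_normalize(v) for v in ehr_emb]
    img = [l2_normalize(v) for v in img_emb]
    n = len(ehr)
    logits = [
        [sum(a * b for a, b in zip(e_vec, i_vec)) / temperature for i_vec in img]
        for e_vec in ehr
    ]
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    ehr_to_img = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    img_to_ehr = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (ehr_to_img + img_to_ehr)


# Toy check: correctly paired embeddings should score a lower loss than swapped ones.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = symmetric_contrastive_loss(aligned, aligned)
loss_swapped = symmetric_contrastive_loss(aligned, swapped)
```

In this setup, either tabular encoder (MLP or FT-Transformer) would only change how the EHR embeddings are produced; the loss itself stays the same, which is what makes the comparison controlled.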

Methods Compared

This benchmark compares a baseline MLP EHR encoder with an FT-Transformer-style EHR encoder inside the same contrastive retrieval pipeline.
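
To make the difference between the two encoders more concrete, the sketch below shows the feature-tokenization step that characterizes FT-Transformer-style models: each numeric feature is mapped to its own vector as value times a learned weight plus a learned bias, and a [CLS]-style token is added. An MLP, by contrast, consumes the raw feature vector directly. This is a minimal sketch, assuming numeric features only; the function names, the token dimension, the position of the [CLS] token, and the random initialization are illustrative and do not describe the code in src/ehr_transformer.py.

```python
import random


def init_tokenizer(n_features, d_token, seed=0):
    """Create per-feature weight and bias vectors plus one [CLS] vector (hypothetical init)."""
    rng = random.Random(seed)

    def small_vector():
        return [rng.gauss(0.0, 0.02) for _ in range(d_token)]

    return {
        "weight": [small_vector() for _ in range(n_features)],
        "bias": [small_vector() for _ in range(n_features)],
        "cls": small_vector(),
    }


def tokenize_row(row, params):
    """Turn one row of numeric features into a list of tokens.

    Token j is x_j * weight_j + bias_j, so each feature gets its own embedding
    that later attention layers can mix. The [CLS] token is placed first here.
    """
    tokens = [list(params["cls"])]
    for x_j, w_j, b_j in zip(row, params["weight"], params["bias"]):
        tokens.append([x_j * w + b for w, b in zip(w_j, b_j)])
    return tokens


params = init_tokenizer(n_features=4, d_token=8)
tokens = tokenize_row([0.5, -1.2, 0.0, 2.0], params)  # 1 [CLS] token + 4 feature tokens
```

Because every feature becomes a separate token, self-attention over these tokens can model pairwise feature interactions explicitly, which is the motivation for testing the interaction dataset mode.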

For the full method description, including encoder design and comparison setup, see the paper-style report.

Related Work

This project is inspired by FT-Transformer-style tabular encoding, TabTransformer, and contrastive retrieval for multimodal representation learning.

For the full related-work discussion and references, see the paper-style report.

Demo-Space Visualization

The plots below show the synthetic input spaces used in this repository. These are pre-training diagnostics, not learned FT-Transformer embeddings.

[Figure: three panels, left to right: EHR profile space, EHR train/eval split, Synthetic CXR image space]

The first two panels use the same EHR PCA coordinates but highlight different information: the first shows clinical feature structure, while the second shows whether Pool A and Pool B are spread across the same EHR space.

Panel	What to notice
EHR profile space	Nearby points represent structurally similar synthetic patient profiles. Color indicates hypertension status, and marker shape indicates sex.
EHR train/eval split	Pool A and Pool B should be mixed across the same broad EHR space, rather than forming isolated train/eval regions. This helps check that the split is not visually biased.
Synthetic CXR image space	The image-side inputs show visible structure before training, giving the retrieval model meaningful variation to learn from.

These plots are qualitative diagnostics. The main conclusions should be based on retrieval results and controlled experiments rather than visual inspection alone.

Dataset Setup

The public repository uses controlled synthetic multimodal data.

Three dataset modes are included:

Setup	Purpose
Linear	Tests whether a simple MLP is already sufficient when the signal is mostly linear
Interaction	Tests whether FT-Transformer helps when feature interactions matter
Noisy	Tests robustness when a fraction of image-EHR pairings are intentionally corrupted

The generated images are controlled synthetic visual patterns rather than clinical medical images. This makes the benchmark reproducible, privacy-safe, and suitable for method-behavior analysis. For full setup details, see the paper-style report.
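
The three modes can be illustrated with a small generator. The sketch below is hypothetical and is not the logic of src/make_demo_data.py: it assumes a scalar "image code" stands in for the synthetic image, that the interaction signal is a product of the first two features, that the noisy mode reuses the interaction signal, and that 20% of pairs are corrupted. All of these are assumptions chosen for illustration.

```python
import random


def make_pairs(mode, n_samples, n_features=6, noise_frac=0.2, seed=0):
    """Generate toy EHR rows and a scalar image code per row (hypothetical setup).

    Returns (ehr, image_code, image_source), where image_source[i] is the index
    of the EHR row whose signal actually drives image i.
    """
    if mode not in {"linear", "interaction", "noisy"}:
        raise ValueError(f"unknown mode: {mode}")
    rng = random.Random(seed)
    ehr = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]

    signal = []
    for row in ehr:
        value = sum(w * x for w, x in zip(weights, row))
        if mode != "linear":
            # Assumed interaction term: a product that no single feature explains.
            value += row[0] * row[1]
        signal.append(value)

    image_source = list(range(n_samples))
    if mode == "noisy":
        # Corrupt a fraction of pairings by rotating image sources among chosen rows.
        corrupted = rng.sample(range(n_samples), int(noise_frac * n_samples))
        rotated = corrupted[1:] + corrupted[:1]
        for dst, src in zip(corrupted, rotated):
            image_source[dst] = src

    image_code = [signal[src] for src in image_source]
    return ehr, image_code, image_source


ehr, image_code, image_source = make_pairs("noisy", n_samples=100)
n_corrupted = sum(1 for i, src in enumerate(image_source) if src != i)
```

Keeping the true source index alongside each pair is what makes the noisy setup analyzable: the fraction of corrupted pairs is known exactly rather than estimated.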

Experiments

The final benchmark compares:

encoder type: MLP vs FT-Transformer
dataset mode: linear, interaction, noisy
sample size: 500 (pilot), 1000, 2000
training duration: 10 epochs for the main benchmark

Metrics reported:

Recall@1
Recall@5
Recall@10
Recall@50
lift over random baseline
positive-pair cosine similarity
training loss
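
For readers unfamiliar with the retrieval metrics, the following sketch shows one common way to compute Recall@K and lift over a random baseline from a query-by-candidate similarity matrix. It assumes that the true match of query i is candidate i and that a random ranker achieves Recall@K of K/N with N candidates; the function names are illustrative, and the repository's exact definitions may differ.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose true candidate ranks within the top k.

    similarity[i][j] is the score of query i against candidate j, and the true
    match of query i is assumed to be candidate i. Ties are resolved in favor
    of the true candidate.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank: how many candidates score strictly higher than the true one.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(similarity)


def lift_over_random(recall, k, n_candidates):
    """Ratio of observed Recall@K to the K/N expected from random ranking."""
    random_recall = min(k, n_candidates) / n_candidates
    return recall / random_recall


similarity = [
    [0.9, 0.1, 0.2, 0.0],  # true candidate ranked first
    [0.8, 0.3, 0.1, 0.2],  # true candidate ranked second
    [0.1, 0.2, 0.7, 0.3],  # true candidate ranked first
    [0.5, 0.6, 0.4, 0.1],  # true candidate ranked last
]
r1 = recall_at_k(similarity, k=1)                   # 0.5
r2 = recall_at_k(similarity, k=2)                   # 0.75
lift1 = lift_over_random(r1, k=1, n_candidates=4)   # 2.0
```

Lift is useful when comparing runs with different candidate pool sizes, since raw Recall@K becomes easier to achieve as the pool shrinks.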

Key Findings

FT-Transformer did not universally outperform the MLP. Its behavior depended on the data setup.

Main observations:

In linear settings, the MLP was already highly competitive.
In the 1000-sample interaction setup, FT-Transformer improved retrieval over MLP.
In noisy-pair settings, both models degraded, suggesting that a better architecture cannot fully compensate for weak pair quality.
FT-Transformer often improved positive-pair similarity, but higher similarity did not always translate into better top-k retrieval.
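
The last observation can be reproduced with a toy example. The two similarity matrices below are invented for illustration and are not results from this benchmark: the first has high positive-pair similarity but even higher similarity to one wrong candidate per query, while the second has lower positive-pair similarity but a clear margin over every negative.

```python
def mean_positive_similarity(similarity):
    """Average score on the diagonal, i.e. between each query and its true match."""
    return sum(similarity[i][i] for i in range(len(similarity))) / len(similarity)


def recall_at_1(similarity):
    """Fraction of queries whose true match (the diagonal) is the strict top score."""
    hits = 0
    for i, row in enumerate(similarity):
        if all(row[i] > score for j, score in enumerate(row) if j != i):
            hits += 1
    return hits / len(similarity)


# Invented matrix A: positives are high (0.80) but one hard negative is higher (0.85).
high_similarity = [
    [0.80, 0.85, 0.10],
    [0.20, 0.80, 0.85],
    [0.85, 0.10, 0.80],
]
# Invented matrix B: positives are modest (0.50) but clearly above all negatives.
high_margin = [
    [0.50, 0.20, 0.10],
    [0.10, 0.50, 0.20],
    [0.20, 0.10, 0.50],
]

pos_a, rec_a = mean_positive_similarity(high_similarity), recall_at_1(high_similarity)
pos_b, rec_b = mean_positive_similarity(high_margin), recall_at_1(high_margin)
```

Here the first matrix wins on positive-pair similarity yet retrieves nothing correctly at rank 1, because retrieval depends on the rank of the positive relative to the negatives rather than on its absolute similarity.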

For the full findings and interpretation, see the paper-style report.

Repository Structure

ft-transformer-ehr-retrieval/
│
├── src/
│   ├── train.py
│   ├── make_demo_data.py
│   ├── model.py
│   ├── cxr_transforms.py
│   └── ehr_transformer.py
│
├── data_demo/
│   ├── demo_cxr_samples.csv
│   ├── demo_ehr_profiles.csv
│   ├── train_feature_schema.json
│   └── train_transform_stats.json
│
├── docs/
│   ├── method_overview.md
│   ├── experiment_design.md
│   ├── metric_explanation.md
│   └── reproducibility_notes.md
│
├── experiments/
│   └── results_summary.md
│
├── requirements.txt
└── README.md

Quick Start

Install dependencies:

pip install -r requirements.txt

Generate synthetic data:

python src/make_demo_data.py --mode interaction --n-samples 1000

Train the MLP baseline:

python src/train.py --ehr-encoder mlp --epochs 10 --batch-size 32 --num-workers 0 --output-dir outputs/example_mlp --checkpoint-dir checkpoints/example_mlp

Train the FT-Transformer model:

python src/train.py --ehr-encoder ftt --epochs 10 --batch-size 32 --num-workers 0 --output-dir outputs/example_ftt --checkpoint-dir checkpoints/example_ftt
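
To run both encoders across several dataset modes, the commands above can be generated programmatically. The sketch below only builds and prints the command lines; it does not execute them. It rests on assumptions: only the flags shown above are used, the mode names linear and noisy are assumed to be valid values for --mode (only interaction appears in the commands above), train.py is assumed to read the most recently generated data, and the output directory names are hypothetical.

```python
import shlex

MODES = ["linear", "interaction", "noisy"]  # "linear" and "noisy" are assumed --mode values
ENCODERS = ["mlp", "ftt"]


def build_commands(n_samples=1000, epochs=10):
    """Return one data-generation command per mode, each followed by one training command per encoder."""
    commands = []
    for mode in MODES:
        commands.append([
            "python", "src/make_demo_data.py",
            "--mode", mode,
            "--n-samples", str(n_samples),
        ])
        for encoder in ENCODERS:
            tag = f"{mode}_{encoder}"  # hypothetical naming scheme for run folders
            commands.append([
                "python", "src/train.py",
                "--ehr-encoder", encoder,
                "--epochs", str(epochs),
                "--batch-size", "32",
                "--num-workers", "0",
                "--output-dir", f"outputs/{tag}",
                "--checkpoint-dir", f"checkpoints/{tag}",
            ])
    return commands


# Print the commands for review; pass each list to subprocess.run to execute.
for command in build_commands():
    print(shlex.join(command))
```

Printing before executing makes it easy to confirm that each run writes to its own output and checkpoint directory, so results from different encoders and modes do not overwrite each other.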

Documentation

References

References are included in the paper-style report.

Limitations

This repository is a controlled method-analysis project rather than a clinical deployment study. The benchmark uses synthetic multimodal data so that data structure, pairing quality, feature interactions, and sample-size effects can be studied under known conditions.

For the full limitations and future work discussion, see the paper-style report.
