K=4 EM-warmup checkpoint (Pfam top-1000)
This release contains the trained TKF-DP checkpoint (the best-val-LL snapshot)
and the full training log from a single SVI run on the top-1000 Pfam families
with K_c=4, the multi-seed soft-EM warm start enabled, and per-class-pair side
potentials disabled.
Result
- Best val_LL = -298.41 at outer iter 55 (validation family PF00076)
- Early-stopped at outer iter 85 (patience 6)
- Total wall time: 16 751 s ≈ 4 h 39 min
- Per-outer mean: 197 s
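
The timing figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the training code. It assumes the per-outer mean is the total wall time divided by all 85 outer iterations. The `val_every` value is a guess at how "patience 6" relates to the 30-iteration gap between the best and final iterations; the source does not state a validation interval.

```python
# Figures copied from the Result list above.
TOTAL_WALL_S = 16_751
BEST_ITER = 55
STOP_ITER = 85
PATIENCE = 6


def split_hms(seconds):
    """Split a duration in seconds into whole hours, minutes, and seconds."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return hours, minutes, secs


hours, minutes, secs = split_hms(TOTAL_WALL_S)

# Assumption: the mean is taken over all 85 outer iterations.
mean_per_outer = TOTAL_WALL_S / STOP_ITER

# Assumption (not stated in the source): validation runs at a fixed interval,
# and patience counts validation checks without improvement.
val_every = (STOP_ITER - BEST_ITER) / PATIENCE

print(f"wall time: {hours} h {minutes} min {secs} s")
print(f"mean per outer iter: {mean_per_outer:.1f} s")
print(f"implied validation interval: {val_every:g} iters")
```

Under these assumptions the reported total, the 197 s mean, and the stop at iter 85 are mutually consistent.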
Configuration
| Flag | Value |
|---|---|
| --processed-dir | data/pfam_processed_top1000 |
| --n-families | 1000 |
| --K (K_c) | 4 |
| --em-warmup-iters | 500 |
| --em-warmup-seeds | 50 |
| --K-H-max | 10 (= K_c (K_c+1) / 2) |
| --alpha-z | 100 |
| --val-families | PF00076 |
| --use-side-potentials | off |
| Substitution model | LG08 |
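
The `--K-H-max` value follows from the formula K_c (K_c+1) / 2 given in the table. One way to read that formula, offered here as an assumption rather than something the source states, is as the number of unordered pairs of classes, self-pairs included. The helper name `unordered_class_pairs` is hypothetical.

```python
from itertools import combinations_with_replacement


def unordered_class_pairs(k_c):
    """Enumerate unordered pairs (i, j) with i <= j over k_c classes.

    Assumption: this pair-counting reading of K_c (K_c+1) / 2 is an
    illustration only; the source gives the formula without interpretation.
    """
    return list(combinations_with_replacement(range(k_c), 2))


K_C = 4
pairs = unordered_class_pairs(K_C)

# The enumeration agrees with the closed form used in the table.
assert len(pairs) == K_C * (K_C + 1) // 2
print(len(pairs), pairs)
```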
Files in the tarball
| Path | Size |
|---|---|
| _best_chkpt/state.npz | 2.6 MB |
| _best_chkpt/meta.json | 20 KB |
| _best_chkpt/trace.json | 10 KB |
| logs/exp2_v2_K4_top1000_tsb_emwarm.log | 115 KB |
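
Once the tarball is extracted, the checkpoint files can be inspected without knowing their internal layout. The sketch below assumes that extraction places `_best_chkpt/` in the current directory. It makes no assumption about array names or JSON fields, which this note does not document. The function name `summarize_checkpoint` is hypothetical. It relies on the fact that an `.npz` file is a zip archive with one `.npy` member per saved array.

```python
import json
import zipfile
from pathlib import Path


def summarize_checkpoint(chkpt_dir):
    """List the array names in state.npz and the top-level shape of the JSON files."""
    chkpt_dir = Path(chkpt_dir)
    summary = {}

    # Read only the archive index; no array data is loaded.
    with zipfile.ZipFile(chkpt_dir / "state.npz") as zf:
        summary["state_arrays"] = sorted(
            name[:-4] if name.endswith(".npy") else name
            for name in zf.namelist()
        )

    # The JSON schemas are unknown here, so report keys (for objects)
    # or the Python type name (for anything else).
    for fname in ("meta.json", "trace.json"):
        with open(chkpt_dir / fname, encoding="utf-8") as fh:
            obj = json.load(fh)
        summary[fname] = sorted(obj) if isinstance(obj, dict) else type(obj).__name__

    return summary


# Assumed extraction location; skipped quietly if the directory is absent.
if Path("_best_chkpt").is_dir():
    print(json.dumps(summarize_checkpoint("_best_chkpt"), indent=2))
```

With the array names in hand, `numpy.load` on `state.npz` returns the arrays themselves.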
Reproduction
gh release download results/K4-emwarm-top1000-2026-05-09 -R evoldoers/tkfdp \
  -p 'K4-emwarm-top1000-2026-05-09.tar.gz'
tar xzf K4-emwarm-top1000-2026-05-09.tar.gz
# Then plug into the BAliBASE eval (recipe step h in REPRODUCTION_MANIFEST.md)
This checkpoint backs Figure 3 (Potts atom log-odds), Figure 4 (LAMA1 holmes-tile),
Figure 5 (PF00053 MSA triangle), and the inf-PHMM-K=4 row of Table 1.