Complete Research Journey
A detailed technical walkthrough showing WHAT we did, WHICH features we used, WHY we made each decision, and HOW we achieved 0.848 AUC. Every number is verified and real.
The Complete Journey (7 Phases)
Follow our research from raw data to breakthrough findings, phase by phase.
Started with OASIS-1: 436 subjects → 205 usable after cleaning. Single-site (Washington University), homogeneous data. Task: CDR=0 vs. CDR≥0.5 (138 healthy / 67 dementia).
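The binary task above amounts to thresholding the Clinical Dementia Rating. A minimal sketch, assuming CDR is stored as a float and that subjects without a rating are among those removed during cleaning (the helper name `cdr_to_label` and the sample ratings are hypothetical):

```python
from typing import Optional


def cdr_to_label(cdr: Optional[float]) -> Optional[int]:
    """Map a CDR score to the task label: 0 = healthy (CDR=0), 1 = dementia (CDR>=0.5)."""
    if cdr is None:
        return None  # no rating available, so the subject cannot be labeled
    return int(cdr >= 0.5)


# Illustrative ratings only, not OASIS-1 data.
ratings = [0.0, 0.5, None, 1.0, 0.0, 2.0]
labels = [cdr_to_label(r) for r in ratings]
usable = [y for y in labels if y is not None]
print(usable)  # [0, 1, 1, 0, 1]
```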
Scaled to ADNI-1: 629 subjects (de-duplicated from 1,825 scans). Multi-site (57 sites), heterogeneous scanners. Limitation: the clinical features were only Age + Sex (2 dimensions).
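De-duplicating scans down to one per subject keeps the same person from appearing in both training and test splits. The sketch below assumes the earliest scan is the one kept; the source does not say which scan was retained, and the field names `subject_id` and `scan_date` are hypothetical.

```python
def one_scan_per_subject(scans):
    """Keep the earliest scan for each subject ID (ISO dates sort correctly as strings)."""
    first = {}
    for scan in scans:
        sid = scan["subject_id"]
        if sid not in first or scan["scan_date"] < first[sid]["scan_date"]:
            first[sid] = scan
    return list(first.values())


# Illustrative records only, not ADNI data.
scans = [
    {"subject_id": "S001", "scan_date": "2006-03-01"},
    {"subject_id": "S001", "scan_date": "2005-09-12"},
    {"subject_id": "S002", "scan_date": "2006-01-20"},
]
subjects = one_scan_per_subject(scans)
print(len(subjects))  # 2
```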
Experiment A (OASIS→ADNI): MRI-Only was the most robust (0.607 AUC, -20.7% drop); Fusion degraded more (-28.9% drop). Experiment B (ADNI→OASIS): Late Fusion was best (0.624 AUC). The winner depends on the transfer direction.
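The percentage drops above are presumably relative to in-dataset AUC. Assuming the definition drop = (AUC_transfer - AUC_in) / AUC_in, which the source does not state explicitly, the in-dataset AUC implied by the MRI-Only figures can be recovered as a sanity check:

```python
def relative_drop(auc_in, auc_transfer):
    """Relative change in AUC when moving from in-dataset to cross-dataset evaluation."""
    return (auc_transfer - auc_in) / auc_in


def implied_in_dataset_auc(auc_transfer, drop):
    """Invert relative_drop: recover the in-dataset AUC from the transfer AUC and the drop."""
    return auc_transfer / (1.0 + drop)


# MRI-Only, Experiment A: 0.607 AUC after a -20.7% drop.
mri_in = implied_in_dataset_auc(0.607, -0.207)
print(round(mri_in, 3))  # 0.765
```

Under this assumed definition, MRI-Only would have scored roughly 0.765 AUC in-dataset; that figure is derived here, not reported in the source.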
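Question: Is the MODEL broken or are the FEATURES weak? Test: we intentionally added MMSE + CDR-SB, cognitive test scores that are circular with respect to the diagnosis. Result: 0.988 AUC (almost perfect), so the model learns well when the features carry signal; the weak features were the bottleneck.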
Used REAL biological features (non-circular but still powerful), extracted from ADNIMERGE.csv. N=629 subjects. 35% of CSF values were missing (filled by median imputation), and 18% of volumes were missing.
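Median imputation replaces each missing value with the median of the observed values in that column. A minimal sketch, assuming the medians are computed on training rows only so that no test information leaks into them; the column names and values are illustrative, not taken from the actual table:

```python
from statistics import median


def fit_medians(rows, columns):
    """Per-column median over the observed (non-None) values."""
    return {c: median(r[c] for r in rows if r[c] is not None) for c in columns}


def impute(rows, medians):
    """Return copies of rows with None replaced by the fitted column median."""
    return [
        {c: (medians[c] if v is None and c in medians else v) for c, v in r.items()}
        for r in rows
    ]


# Synthetic example rows.
train = [
    {"ABETA": 180.0, "Hippocampus": 7000.0},
    {"ABETA": None, "Hippocampus": 6500.0},
    {"ABETA": 140.0, "Hippocampus": None},
    {"ABETA": 200.0, "Hippocampus": 7200.0},
]
medians = fit_medians(train, ["ABETA", "Hippocampus"])
filled = impute(train, medians)
print(medians)  # {'ABETA': 180.0, 'Hippocampus': 7000.0}
```

The same `medians` dictionary would then be applied unchanged to any held-out rows.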
Hypothesis: Tracking ResNet features over time can predict MCI→Dementia conversion. Data: 639 subjects, 2,262 scans (avg ≈3.5/subject). Model: LSTM on ResNet512 sequences.
Switched to EXPLICIT volumetric measurements from ADNIMERGE. Cohort: 341 MCI-only subjects (115 converters, 226 stable). Model: Random Forest (100 trees, max_depth=10, 5-fold CV). Why RF and not LSTM? With only 341 subjects there is too little data for deep learning, the features are tabular, and RF is interpretable.
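The model configuration above maps directly onto scikit-learn. The sketch below uses synthetic data with the same cohort shape (115 converters, 226 stable, 21 features), so the AUC it prints is not the reported result. Stratified, shuffled folds and the fixed random seeds are assumptions; the source only says "5-fold CV".

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the MCI cohort: 115 converters (1), 226 stable (0).
y = np.array([1] * 115 + [0] * 226)
X = rng.normal(size=(y.size, 21))
X[y == 1, 0] -= 1.0  # inject an artificial signal into one feature

# Hyperparameters as stated in the text: 100 trees, max_depth=10.
model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)

# Assumption: stratified folds preserve the converter/stable ratio in each split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")

print(f"mean AUC over {len(auc_scores)} folds: {auc_scores.mean():.3f}")
```

After fitting on the full data, `model.feature_importances_` gives the per-feature importances that make the forest easy to interpret.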
Key Discoveries
• Level-MAX biomarkers: 0.808 AUC
• Hippocampus atrophy rate: 34.2% feature importance
• Longitudinal tracking: +11.2% boost
• Random Forest: Best for N=341
• Feature content: 7× more important than architecture
• Age/Sex only: 0.598 AUC (near-random)
• ResNet for progression: 0.441 AUC
• LSTM sequences: Couldn't learn
• Cross-dataset transfer: 15-30% drop
• Attention fusion: Higher variance, worse robustness
• APOE4 carriers: 44% vs 23% conversion (≈2× risk)
• Hippocampus alone: 0.725 AUC
• Simple RF > Complex LSTM (0.848 vs 0.441 AUC)
• Feature upgrade: +21% AUC
• Architecture upgrade: <3% AUC
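Two of the ratios in the list above can be checked with simple arithmetic. The sketch assumes that 23% is the conversion rate among non-carriers and that the "7×" figure compares the +21% feature gain with the upper bound of 3% on the architecture gain:

```python
# APOE4: conversion rate among carriers vs. the comparison group.
apoe4_rate = 0.44
comparison_rate = 0.23  # assumed to be non-carriers
relative_risk = apoe4_rate / comparison_rate
print(round(relative_risk, 2))  # 1.91, i.e. roughly 2x

# Feature upgrade vs. architecture upgrade, in AUC percentage points.
feature_gain = 21.0
architecture_gain_upper_bound = 3.0  # the text says "<3%"
min_ratio = feature_gain / architecture_gain_upper_bound
print(min_ratio)  # 7.0, a lower bound since the architecture gain is below 3
```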
MCI → Dementia Progression Prediction
Using 21 volumetric features (baseline + follow-up + delta), Random Forest achieved 0.848 AUC (p<0.001, effect size d=2.14). Hippocampal atrophy rate is the strongest single predictor. Statistical validation: the cohort of N=341 exceeds the N=278 required for 95% power.
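One way to arrive at 21 features is 7 regional volumes × 3 views (baseline, follow-up, delta). The sketch below takes that decomposition as an assumption. The region list, the feature-name suffixes, and the definition of the annualized atrophy rate are all illustrative and not confirmed by the source.

```python
# Assumed set of 7 regional volumes; the actual list is not given in the text.
REGIONS = ["Hippocampus", "Ventricles", "Entorhinal", "Fusiform",
           "MidTemp", "WholeBrain", "ICV"]


def build_features(baseline, followup):
    """Build baseline, follow-up, and delta features for every region."""
    feats = {}
    for region in REGIONS:
        b, f = baseline[region], followup[region]
        feats[f"{region}_bl"] = b
        feats[f"{region}_fu"] = f
        feats[f"{region}_delta"] = f - b  # negative means volume loss
    return feats


def atrophy_rate_pct_per_year(b, f, years):
    """Annualized percent volume change relative to baseline (assumed definition)."""
    return 100.0 * (f - b) / b / years


# Synthetic volumes in mm^3, for illustration only.
baseline = {r: 10000.0 for r in REGIONS}
followup = {r: 10000.0 for r in REGIONS}
baseline["Hippocampus"], followup["Hippocampus"] = 7000.0, 6650.0

feats = build_features(baseline, followup)
rate = atrophy_rate_pct_per_year(7000.0, 6650.0, years=2.0)
print(len(feats), feats["Hippocampus_delta"], rate)  # 21 -350.0 -2.5
```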