Mid-Parental Height (MPH) has been the gold standard for height prediction for 40 years. The Tanner formula is simple, fast, and field-usable, but its 95% range is wide: ±8.5 cm. Where can AI take us?
Current Tanner MPH formula
- Boys: MPH = (father_cm + mother_cm + 13) ÷ 2
- Girls: MPH = (father_cm + mother_cm − 13) ÷ 2
- 95% CI: MPH ± 8.5 cm
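As a minimal sketch, the two formulas and the ±8.5 cm range can be written as a pair of small functions. The function names and the example parental heights are illustrative assumptions, not taken from any existing tool.

```python
def mid_parental_height(father_cm: float, mother_cm: float, sex: str) -> float:
    """Tanner mid-parental height in cm; sex is 'boy' or 'girl'."""
    if sex not in ("boy", "girl"):
        raise ValueError("sex must be 'boy' or 'girl'")
    # +13 cm for boys, -13 cm for girls, as in the formulas above.
    adjustment = 13.0 if sex == "boy" else -13.0
    return (father_cm + mother_cm + adjustment) / 2


def target_range(mph_cm: float, half_width_cm: float = 8.5) -> tuple:
    """Lower and upper bound of the MPH ± 8.5 cm range."""
    return (mph_cm - half_width_cm, mph_cm + half_width_cm)


# Example with made-up parental heights.
mph = mid_parental_height(father_cm=180.0, mother_cm=165.0, sex="boy")
low, high = target_range(mph)
print(f"MPH {mph:.1f} cm, range {low:.1f} to {high:.1f} cm")
# prints: MPH 179.0 cm, range 170.5 to 187.5 cm
```

For the same parents, a girl's MPH comes out 13 cm lower, at 166.0 cm.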
With heritability h² = 0.8, MPH is essentially a linear regression:
Child height = β₀ + β₁ × (father + mother)/2 + ε
Here ε is the residual: the variance that parental height leaves unexplained, which comes from pathology, environment, and polygenic differences.
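To make the regression view concrete, here is a small sketch that fits β₀ and β₁ by ordinary least squares and computes the residuals ε. The six data points are invented for illustration only, and `fit_simple_ols` is a hypothetical helper name.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = beta0 + beta1 * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Invented example values: (father + mother) / 2 and adult child height, in cm.
midparent_cm = [165.0, 168.0, 170.0, 172.5, 175.0, 178.0]
child_cm = [166.0, 171.0, 169.5, 175.0, 174.0, 180.5]

beta0, beta1 = fit_simple_ols(midparent_cm, child_cm)

# Residuals are the epsilon term: what the mid-parent average does not explain.
residuals = [c - (beta0 + beta1 * m) for m, c in zip(midparent_cm, child_cm)]
print(f"beta0={beta0:.2f}, beta1={beta1:.2f}")
```

The spread of `residuals` is what any richer model has to shrink.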
What can AI add?
Richer features
Classic MPH uses only two numbers. A modern ML model can incorporate:
- Parental heights (yes, the basics)
- Child's sex
- Child's current height (vs chronological age)
- Parental ethnicity (national-reference baseline)
- Parental birth years (secular trend effects)
- Birth weight + length (intrauterine growth)
- Infancy growth (catch-up patterns)
- Pubertal timing (has it started?)
- Sibling heights (sibling regression)
- Socioeconomic proxies (mother's education, income brackets)
Gradient Boosting / XGBoost
A 2022 Kaggle competition (n=15,000) showed XGBoost bringing height-prediction error down from the classic ±8.5 cm to ±5.2 cm. On the same cohort, that is a 39% lower error: (8.5 − 5.2) / 8.5 ≈ 0.39.
Tradeoff: the model uses 50+ features and explainability drops.
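The 39% figure follows directly from the two error values quoted above. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def relative_error_reduction(baseline_cm: float, improved_cm: float) -> float:
    """Fraction by which the error shrinks relative to the baseline."""
    return (baseline_cm - improved_cm) / baseline_cm


reduction = relative_error_reduction(baseline_cm=8.5, improved_cm=5.2)
print(f"{reduction:.1%}")  # prints: 38.8%
```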
Polygenic Risk Score (PRS)
Height is associated with roughly 700 genetic variants in the 2014 GIANT consortium meta-analysis; the 2022 GIANT update raised that count to more than 12,000. The PRS process:
- Genotype 700 SNPs from the child's DNA
- For each SNP, use the published effect size for height
- Child's PRS = Σ (SNP_genotype × effect_size)
PRS explains an additional 25% of height variance beyond parental height. PRS + MPH combined: prediction error around ±5 cm.
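The weighted sum in the PRS steps can be sketched as follows. The SNP identifiers and effect sizes are placeholders invented for the example. Genotypes are assumed to be coded as 0, 1, or 2 copies of the effect allele, and skipping missing SNPs is a simplification (a real pipeline would more likely impute them).

```python
def polygenic_score(genotypes: dict, effect_sizes: dict) -> float:
    """PRS = sum over SNPs of (effect-allele count x published effect size)."""
    score = 0.0
    for snp_id, beta in effect_sizes.items():
        dosage = genotypes.get(snp_id)
        if dosage is None:
            continue  # SNP not genotyped: skipped in this sketch
        if dosage not in (0, 1, 2):
            raise ValueError(f"unexpected genotype {dosage!r} for {snp_id}")
        score += dosage * beta
    return score


# Placeholder SNP names and effect sizes, not real published values.
effect_sizes = {"snp_A": 0.5, "snp_B": -0.25, "snp_C": 0.125}
child_genotypes = {"snp_A": 2, "snp_B": 1, "snp_C": 0}

prs = polygenic_score(child_genotypes, effect_sizes)
print(prs)  # prints: 0.75
```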
Limitation: cost + access
PRS DNA genotyping:
- Clinical-grade: $200-400
- Direct-to-consumer (23andMe): $99, not clinical quality
- Pediatric PRS is rare in Turkey
So the practical application today is AI on standard family data: height, sex, age, current height, and pubertal stage.
ML model architecture
Step 1: Data collection
For training, 10,000+ families:
- Adult heights of parents (measured, not self-reported)
- Known adult heights of children (18+ cohort)
- All intermediate growth points (time series)
Step 2: Feature engineering
- Z-score normalization
- One-hot encode ethnicity
- Age × sex interaction terms
- Growth-velocity derivatives
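The four feature-engineering steps could look roughly like this in plain Python. All function names are illustrative. The z-score here is a simple within-sample standardization, which is an assumption: a clinical pipeline might instead standardize against an age- and sex-specific reference.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1 within the sample."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


def one_hot(value, categories):
    """One indicator column per category, in the order given."""
    return [1 if value == category else 0 for category in categories]


def age_sex_interaction(age_years, is_male):
    """Age multiplied by a 0/1 sex indicator."""
    return age_years * (1.0 if is_male else 0.0)


def growth_velocities(ages_years, heights_cm):
    """Height gain in cm per year between consecutive measurements."""
    points = list(zip(ages_years, heights_cm))
    return [(h1 - h0) / (a1 - a0) for (a0, h0), (a1, h1) in zip(points, points[1:])]


# Made-up measurements for one child at ages 6, 7, and 8.
velocities = growth_velocities([6.0, 7.0, 8.0], [115.0, 121.0, 126.5])
print(velocities)  # prints: [6.0, 5.5]
```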
Step 3: Model choice
- XGBoost or LightGBM — feature importance is easy to interpret
- Neural network (MLP) — captures more complex relationships but black-box
- Bayesian regression — gives proper uncertainty intervals
Step 4: Validation
- Train/test split (80/20)
- 5-fold cross-validation
- Out-of-distribution test: different ethnicities
- Bias analysis: sex, socioeconomic status
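A 5-fold split can be sketched without any ML library. Each fold holds out 20% of the samples, which matches the 80/20 proportion above. The function name and the seed are assumptions for the example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


for train_idx, test_idx in k_fold_indices(n_samples=10, k=5):
    print(len(train_idx), len(test_idx))  # 8 and 2 for every fold
```

Because the training data is organized by family, it would probably be safer to split on family IDs rather than on individual children, so that siblings never end up on both sides of a split.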
Explainable AI (XAI)
Doctors ask "why 175 cm?" and the model must answer. SHAP (SHapley Additive exPlanations) or LIME can produce per-feature attributions such as:
- "Mother's height 162 cm → +3.2 cm contribution"
- "Current height 130 cm @ age 8 → +1.5 cm (high growth velocity)"
- "Mother's early puberty → −2.0 cm (early growth-plate closure)"
This transparency is critical for clinical adoption.
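To show what an additive explanation means, here is a sketch for the simplest case, a linear model. There, each feature's contribution relative to an average child is weight × (value − average), and the contributions sum exactly to the gap between the prediction and the average prediction. This is not the SHAP library and not a tree-model explainer; the weights, intercept, and feature names are invented.

```python
def predict(intercept, weights, features):
    """Linear model: intercept plus weighted sum of features."""
    return intercept + sum(weights[name] * features[name] for name in weights)


def linear_contributions(weights, baseline, sample):
    """Per-feature contribution relative to the baseline (average) child."""
    return {name: weights[name] * (sample[name] - baseline[name]) for name in weights}


# Invented model and inputs, for illustration only.
intercept = 40.0
weights = {"mother_cm": 0.4, "father_cm": 0.4, "current_height_z": 3.0}
baseline = {"mother_cm": 160.0, "father_cm": 175.0, "current_height_z": 0.0}
sample = {"mother_cm": 162.0, "father_cm": 180.0, "current_height_z": 0.5}

contributions = linear_contributions(weights, baseline, sample)
for name, delta_cm in contributions.items():
    print(f"{name}: {delta_cm:+.1f} cm")

# The contributions account for the whole difference from the baseline prediction.
gap = predict(intercept, weights, sample) - predict(intercept, weights, baseline)
```

For gradient-boosted trees the attributions are computed differently, but they are read the same way: a signed number of centimeters per input.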
ML MPH vs clinical gold standards
| Method | Data needed | Error (±, cm) | Clinical acceptance |
|---|---|---|---|
| Tanner MPH | Parents' heights | 8.5 | High |
| Khamis-Roche | + height + weight | 5.6 (boys) / 4.3 (girls) | High |
| ML model (lite) | Tanner inputs + current height | 6.0-7.0 | Not yet approved |
| ML model (full) | + puberty + siblings + ethnicity | 5.0 | Research stage |
| ML + PRS | + DNA | 4.0-5.0 | Expensive, limited |
| Bayley-Pinneau (BA + height) | Bone-age x-ray | 3.2 (boys) | High, clinical gold standard |
Conclusion: ML beats classic MPH but does not surpass bone-age (BA) methods. BA measures skeletal maturity directly; ML only infers it.
ML MPH on the Çocuk Gelişim platform
Our current MPH tool uses the classic Tanner formula with ±8.5 cm 95% CI. Roadmap:
- ✅ Phase 0: Classic Tanner MPH (current)
- 🔄 Phase 1 (Q4 2026): Secular trend correction for Turkish population
- 🔄 Phase 2 (Q1 2027): ML model — XGBoost + localized feature set
- 🔄 Phase 3 (2027-2028): IRB-approved validation study (n=2,000 Turkish children)
- 🔄 Phase 4: Explainable AI dashboard — show contribution of each input
Ethical limits
Critical ethical questions:
- Bias: A model trained mostly on children between the 25th and 75th percentiles loses accuracy for children at the extremes (below p3, above p97)
- Privacy: Family heights and ethnicity are sensitive data, so KVKK compliance is required
- Decision impact: A prediction of "180 cm" can steer a family toward basketball; a wrong signal can cause psychosocial harm
- Training-data representation: Less than 5% of Turkish data covers Eastern Anatolia, which creates a risk of regional bias
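One way to make the bias and representation concerns measurable is to report error per subgroup instead of one overall number. The sketch below computes mean absolute error (MAE) per group. The group labels and the four records are invented for illustration.

```python
from collections import defaultdict


def mae_by_group(records):
    """records: iterable of (group, predicted_cm, actual_cm) tuples."""
    errors = defaultdict(list)
    for group, predicted_cm, actual_cm in records:
        errors[group].append(abs(predicted_cm - actual_cm))
    return {group: sum(errs) / len(errs) for group, errs in errors.items()}


# Invented predictions versus measured adult heights.
records = [
    ("p3_to_p97", 172.0, 170.0),
    ("p3_to_p97", 165.0, 166.0),
    ("below_p3", 158.0, 152.0),
    ("below_p3", 155.0, 150.0),
]

print(mae_by_group(records))
# prints: {'p3_to_p97': 1.5, 'below_p3': 5.5}
```

The same grouping could be applied by region or sex to check whether a thinly represented group sees noticeably worse error.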
FAQ
How much more accurate is AI MPH than classic?
Typical reports show 30-40% lower error, so in practice ±8.5 cm becomes roughly ±5-6 cm. That narrows the range but still does not surpass bone-age (BA) based methods.
Should I get my child's DNA tested for height?
For a healthy child, height prediction alone doesn't justify DNA testing. Pathological short stature (dwarfism, syndromic causes) warrants a clinical genetics consultation.
When will ML MPH ship on your platform?
The target is Q1 2027, after validation, KVKK compliance, and licensing. It will launch as a Premium feature. For early access, sign up for the newsletter.
Is AI prediction more accurate than a doctor's estimate?
Numerically yes, when measured by mean absolute error (MAE). But a doctor adds clinical context: pathology, family history, exam findings. AI complements the doctor; it does not replace one.
Bottom line
AI and ML can improve MPH prediction accuracy, but on their own they cannot outperform clinical bone-age-based methods. Our platform offers classic clinical methods plus a modern AI prototype, combining science and technology to give families the most reliable information.
Try the free MPH calculator, compare with Khamis-Roche and Bayley-Pinneau in Premium, and explore the Bone-Age AI research preview.