Can AI improve Mid-Parental Height prediction? ML models explored

Tanner's 1986 MPH formula has ±8.5 cm error. Can machine learning reduce it? Gradient Boosting, polygenic risk scores, and data limits.

Çocuk Gelişim Scientific Board (Prof. Dr. Bülent Bayraktar) · May 26, 2026 · 5 min read

Mid-Parental Height (MPH) has been the gold standard for height prediction for 40 years. The Tanner formula is simple, fast, and field-usable, but its 95% range is ±8.5 cm around the predicted value. Where can AI take us?

Current Tanner MPH formula

  • Boys: MPH = (father_cm + mother_cm + 13) ÷ 2
  • Girls: MPH = (father_cm + mother_cm − 13) ÷ 2
  • 95% CI: MPH ± 8.5 cm
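As a minimal sketch, the formula and range above can be written as two small Python functions. The function names and the `"boy"` / `"girl"` labels are illustrative choices, not part of any published tool.

```python
def tanner_mph(father_cm: float, mother_cm: float, sex: str) -> float:
    """Mid-parental height in cm, using the Tanner formula quoted above."""
    if sex not in ("boy", "girl"):
        raise ValueError("sex must be 'boy' or 'girl'")
    # +13 cm for boys, -13 cm for girls, applied before halving.
    adjustment = 13.0 if sex == "boy" else -13.0
    return (father_cm + mother_cm + adjustment) / 2.0


def tanner_range(father_cm: float, mother_cm: float, sex: str,
                 half_width_cm: float = 8.5) -> tuple:
    """Lower and upper bound of the MPH +/- 8.5 cm range."""
    mph = tanner_mph(father_cm, mother_cm, sex)
    return (mph - half_width_cm, mph + half_width_cm)


boy = tanner_mph(180.0, 165.0, "boy")          # (180 + 165 + 13) / 2 = 179.0
girl = tanner_mph(180.0, 165.0, "girl")        # (180 + 165 - 13) / 2 = 166.0
boy_range = tanner_range(180.0, 165.0, "boy")  # (170.5, 187.5)
```

For the same parents, the boy and girl estimates always differ by exactly 13 cm.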

With heritability h² = 0.8, MPH is essentially a linear regression:

Child height = β₀ + β₁ × (father + mother)/2 + ε

ε is the unexplained variance: pathology, environment, and polygenic differences that parental height does not capture.
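As a worked step, the Tanner formula can be rewritten in this regression form. Splitting the 13 cm sex adjustment out of the average gives:

```latex
\begin{aligned}
\text{MPH}_{\text{boys}}  &= \frac{\text{father} + \text{mother} + 13}{2}
                           = \frac{\text{father} + \text{mother}}{2} + 6.5 \\
\text{MPH}_{\text{girls}} &= \frac{\text{father} + \text{mother} - 13}{2}
                           = \frac{\text{father} + \text{mother}}{2} - 6.5
\end{aligned}
```

Read this way, Tanner fixes β₁ = 1 and β₀ = +6.5 cm for boys or −6.5 cm for girls, with nothing fitted to data. A trained model would estimate these coefficients instead.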

What can AI add?

Richer features

Classic MPH uses only two numbers. A modern ML model can incorporate:

  1. Parental heights (yes, the basics)
  2. Child's sex
  3. Child's current height (vs chronological age)
  4. Parental ethnicity (national-reference baseline)
  5. Parental birth years (secular trend effects)
  6. Birth weight + length (intrauterine growth)
  7. Infancy growth (catch-up patterns)
  8. Pubertal timing (has it started?)
  9. Sibling heights (sibling regression)
  10. Socioeconomic proxies (mother's education, income brackets)

Gradient Boosting / XGBoost

A 2022 Kaggle competition (n=15,000) showed XGBoost bringing the prediction error down from the classic ±8.5 cm to ±5.2 cm. On the same dataset, that is a 39% reduction in error.

Tradeoff: the model uses 50+ features and explainability drops.
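The 39% figure follows directly from the two error values quoted for the competition. A short check, with a hypothetical helper name:

```python
def relative_error_reduction(baseline_cm: float, model_cm: float) -> float:
    """Fraction by which the model's error is smaller than the baseline's."""
    return (baseline_cm - model_cm) / baseline_cm


reduction = relative_error_reduction(8.5, 5.2)
print(f"{reduction:.1%}")  # 38.8%, which rounds to the 39% quoted above
```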

Polygenic Risk Score (PRS)

Height is associated with 700+ genetic variants (GIANT consortium meta-analysis, 2022). The PRS process:

  • Genotype 700 SNPs from the child's DNA
  • For each SNP, use the published effect size for height
  • Child's PRS = Σ (SNP_genotype × effect_size)
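The weighted sum in the last step can be sketched as follows. The SNP identifiers and effect sizes below are made-up placeholders, not published values, and the genotype is assumed to be coded as the count (0, 1, or 2) of the height-associated allele.

```python
def polygenic_score(genotypes: dict, effect_sizes: dict) -> float:
    """Sum of (effect-allele count x effect size) over all scored SNPs."""
    missing = [snp for snp in effect_sizes if snp not in genotypes]
    if missing:
        raise KeyError(f"no genotype for SNPs: {missing}")
    return sum(genotypes[snp] * beta for snp, beta in effect_sizes.items())


# Hypothetical effect sizes in arbitrary units (placeholders, not real data).
effect_sizes = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
# Effect-allele counts for one child: 0, 1, or 2 copies per SNP.
child_genotypes = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

prs = polygenic_score(child_genotypes, effect_sizes)  # 2*0.5 - 0.25 + 0 = 0.75
```

A real score would sum over hundreds of variants, but the arithmetic per variant is the same.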

PRS explains an additional 25% of height variance beyond parental height. PRS + MPH combined: prediction error around ±5 cm.

Limitation: cost + access

PRS DNA genotyping:

  • Clinical-grade: $200-400
  • Direct-to-consumer (23andMe): $99, not clinical quality
  • Pediatric PRS is rare in Turkey

So the practical application: AI + standard family data (height + sex + age + current height + pubertal stage).

ML model architecture

Step 1: Data collection

For training, 10,000+ families:

  • Adult heights of parents (measured, not self-reported)
  • Known adult heights of children (18+ cohort)
  • All intermediate growth points (time series)

Step 2: Feature engineering

  • Z-score normalization
  • One-hot encode ethnicity
  • Age × sex interaction terms
  • Growth-velocity derivatives
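The four transformations above can be sketched with the Python standard library alone. The function names, the category labels, and the 0/1 coding of sex are illustrative assumptions. A real pipeline would more likely use a dataframe library.

```python
import statistics


def z_scores(values):
    """Standardize to mean 0 and SD 1 (population SD; values must not all be equal)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]


def age_sex_interaction(age_years, sex):
    """Age multiplied by a 0/1 sex indicator (1 for boys, an arbitrary coding)."""
    return age_years * (1 if sex == "boy" else 0)


def growth_velocities(ages_years, heights_cm):
    """Height gain in cm per year between consecutive measurements."""
    pairs = list(zip(ages_years, heights_cm))
    return [(h2 - h1) / (a2 - a1)
            for (a1, h1), (a2, h2) in zip(pairs, pairs[1:])]


standardized = z_scores([160.0, 170.0, 180.0])   # middle value maps to 0.0
encoded = one_hot("group_b", ["group_a", "group_b", "group_c"])  # [0, 1, 0]
velocity = growth_velocities([7.0, 8.0, 9.5], [120.0, 126.0, 133.5])  # [6.0, 5.0]
```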

Step 3: Model choice

  • XGBoost or LightGBM — feature importance is easy to interpret
  • Neural network (MLP) — captures more complex relationships but black-box
  • Bayesian regression — gives proper uncertainty intervals

Step 4: Validation

  • Train/test split (80/20)
  • 5-fold cross-validation
  • Out-of-distribution test: different ethnicities
  • Bias analysis: sex, socioeconomic status
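A minimal sketch of the first two validation steps, using only the standard library. Here the indices are assumed to stand for families rather than individual children, so that siblings do not end up on both sides of a split. The function names and the seed are illustrative.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` of them for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


train, test = train_test_split_indices(100)   # 80 training, 20 held out
folds = list(k_fold_indices(train, k=5))      # 5 folds of 64 / 16
```

The held-out 20% is left untouched during cross-validation and used only for the final error estimate.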

Explainable AI (XAI)

Doctors ask "why 175 cm?" The model must answer. SHAP (SHapley Additive exPlanations) or LIME can produce outputs such as:

  • "Mother's height 162 cm → +3.2 cm contribution"
  • "Current height 130 cm @ age 8 → +1.5 cm (high growth velocity)"
  • "Mother's early puberty → −2.0 cm (early pubertal closure)"

This transparency is critical for clinical adoption.
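To illustrate what an additive explanation looks like, here is a sketch for the simplest case, a linear model. Assuming independent features, the SHAP value of each input reduces to its weight times its deviation from the cohort mean. The weights, intercept, and cohort means below are hypothetical. A tree model such as XGBoost would need a dedicated explainer instead of this shortcut.

```python
# Hypothetical linear model: every number here is made up for illustration.
WEIGHTS = {"mother_height_cm": 0.5, "father_height_cm": 0.5}
INTERCEPT = 6.5
COHORT_MEANS = {"mother_height_cm": 160.0, "father_height_cm": 175.0}


def predict(sample):
    """Linear prediction: intercept plus weighted sum of the inputs."""
    return INTERCEPT + sum(WEIGHTS[name] * sample[name] for name in WEIGHTS)


def linear_contributions(sample):
    """Per-feature contribution: weight x (value - cohort mean)."""
    return {name: WEIGHTS[name] * (sample[name] - COHORT_MEANS[name])
            for name in WEIGHTS}


child = {"mother_height_cm": 162.0, "father_height_cm": 179.0}
contributions = linear_contributions(child)
# {'mother_height_cm': 1.0, 'father_height_cm': 2.0}
baseline = predict(COHORT_MEANS)   # 174.0, the prediction for an average family
prediction = predict(child)        # 177.0 = baseline + 1.0 + 2.0
```

The contributions add up exactly to the gap between this child's prediction and the baseline, which is the property that makes the "+3.2 cm" style statements above possible.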

ML MPH vs clinical gold standards

Method | Data needed | Error (±, cm) | Clinical acceptance
Tanner MPH | Parents' heights | 8.5 | High
Khamis-Roche | + height + weight | 5.6 (boys) / 4.3 (girls) | High
ML model (lite) | Tanner inputs + current height | 6.0-7.0 | Not yet approved
ML model (full) | + puberty + siblings + ethnicity | 5.0 | Research stage
ML + PRS | + DNA | 4.0-5.0 | Expensive, limited
Bayley-Pinneau (BA + height) | Bone-age x-ray | 3.2 (boys) | High, clinical gold standard

Conclusion: ML beats classic MPH, but can't surpass BA-based methods. BA carries physiological truth; ML only infers.

ML MPH on the Çocuk Gelişim platform

Our current MPH tool uses the classic Tanner formula with ±8.5 cm 95% CI. Roadmap:

  1. Phase 0: Classic Tanner MPH (current)
  2. 🔄 Phase 1 (Q4 2026): Secular trend correction for Turkish population
  3. 🔄 Phase 2 (Q1 2027): ML model — XGBoost + localized feature set
  4. 🔄 Phase 3 (2027-2028): IRB-approved validation study (n=2,000 Turkish children)
  5. 🔄 Phase 4: Explainable AI dashboard — show contribution of each input

Ethical limits

Critical ethical questions:

  1. Bias: If the training data come mostly from the p25-75 range, children at the extremes (below p3, above p97) see degraded accuracy
  2. Privacy: Family heights, ethnicity are sensitive data — KVKK compliance required
  3. Decision impact: AI saying "180 cm" steers families toward basketball. Wrong signal causes psychosocial harm
  4. Training-data representation: Less than 5% of Turkish data covers Eastern Anatolia — risk of regional bias
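The bias and representation concerns above can be checked by reporting error per subgroup rather than as one overall number. A minimal sketch, where the group labels and the four records are invented purely for illustration:

```python
from collections import defaultdict


def mae_by_group(records):
    """Mean absolute error in cm per group.

    `records` is an iterable of (group, predicted_cm, actual_cm) tuples.
    """
    errors = defaultdict(list)
    for group, predicted_cm, actual_cm in records:
        errors[group].append(abs(predicted_cm - actual_cm))
    return {group: sum(errs) / len(errs) for group, errs in errors.items()}


# Made-up example records, not real measurements.
records = [
    ("p25-75", 170.0, 172.0),
    ("p25-75", 165.0, 164.0),
    ("below p3", 155.0, 149.0),
    ("below p3", 158.0, 150.0),
]

per_group = mae_by_group(records)  # {'p25-75': 1.5, 'below p3': 7.0}
```

The same grouping could be applied by region or sex. A large gap between groups would be the signal to collect more data before release.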

FAQ

How much more accurate is AI MPH than classic?

Typical reports show 30-40% lower error. In practice, ±8.5 cm becomes ±5-6 cm. Is that enough for clinical decisions? Not on its own: it still does not surpass BA-based methods.

Should I get my child's DNA tested for height?

For a healthy child, height prediction alone doesn't justify DNA testing. Pathological short stature (for example dwarfism or a syndromic condition) warrants a clinical genetics consult.

When will ML MPH ship on your platform?

The target is Q1 2027, after validation, KVKK compliance, and licensing. It will launch as a Premium feature. For early access, sign up for the newsletter.

Is AI prediction more accurate than a doctor's estimate?

Numerically yes (by MAE). But a doctor adds clinical context — pathology, family history, exam findings. AI complements, doesn't replace.

Bottom line

AI and ML can improve MPH prediction accuracy, but cannot alone outperform clinical bone-age-based methods. Our platform offers classic clinical methods + a modern AI prototype — combining science and technology to give families the most reliable information.

Try the free MPH calculator, compare with Khamis-Roche and Bayley-Pinneau in Premium, and explore the Bone-Age AI research preview.

Frequently asked questions

Who is "Can AI improve Mid-Parental Height prediction? ML models explored" for?

It is written for families, coaches and clinicians who need a clear educational summary before deciding whether a pediatric evaluation is needed.

Does this article replace a pediatrician?

No. It is educational content. Diagnosis, treatment and urgent medical concerns should be handled by qualified clinicians.

What is the main takeaway?

Machine learning can reduce the ±8.5 cm error of the Tanner MPH formula to roughly ±5-6 cm, but it does not outperform bone-age-based methods, and cost and data limits still constrain polygenic approaches.

When should families seek clinical advice?

Families should seek advice when growth velocity slows, percentiles change rapidly, puberty timing is unusual, symptoms persist, or nutrition concerns are present.

How should this content be used with calculators?

Use article context together with serial measurements and calculator warnings; do not make decisions from a single number.

⚕️ Medical disclaimer

The information in this article is for educational purposes only and does not constitute medical advice. For decisions about your child's growth, please consult a pediatrician or pediatric endocrinologist.