Who is "Can AI improve Mid-Parental Height prediction? ML models explored" for?

It is written for families, coaches and clinicians who need a clear educational summary before deciding whether a pediatric evaluation is needed.

Does this article replace a pediatrician?

No. It is educational content. Diagnosis, treatment and urgent medical concerns should be handled by qualified clinicians.

What is the main takeaway?

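Tanner's 1986 MPH formula has an error of about ±8.5 cm. Machine learning methods such as gradient boosting and polygenic risk scores can reduce that error, within the limits of cost and available data, but they do not surpass bone-age-based methods.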

When should families seek clinical advice?

Families should seek advice when growth velocity slows, percentiles change rapidly, puberty timing is unusual, symptoms persist, or nutrition concerns are present.

How should this content be used with calculators?

Use article context together with serial measurements and calculator warnings; do not make decisions from a single number.

Can AI improve Mid-Parental Height prediction?…

Mid-Parental Height (MPH) has been the gold standard for height prediction for 40 years. The Tanner formula is simple, fast, and field-usable, but it has a margin of error of ±8.5 cm (95% range). Where can AI take us?

Current Tanner MPH formula

Boys: MPH = (father_cm + mother_cm + 13) ÷ 2
Girls: MPH = (father_cm + mother_cm − 13) ÷ 2
95% CI: MPH ± 8.5 cm
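The formula above can be written as a small function. This is a minimal sketch: the function name `tanner_mph` and the `"boy"`/`"girl"` labels are illustrative choices, not part of any published tool.

```python
def tanner_mph(father_cm: float, mother_cm: float, sex: str) -> tuple[float, float, float]:
    """Return (target, lower, upper) in cm using the Tanner formula quoted above."""
    if sex not in ("boy", "girl"):
        raise ValueError("sex must be 'boy' or 'girl'")
    # +13 cm for boys, -13 cm for girls, applied before halving.
    adjustment = 13.0 if sex == "boy" else -13.0
    target = (father_cm + mother_cm + adjustment) / 2
    # The +/- 8.5 cm band is the range quoted in this article.
    return target, target - 8.5, target + 8.5


target, low, high = tanner_mph(180.0, 165.0, "boy")
print(f"{target:.1f} cm (range {low:.1f} to {high:.1f})")  # 179.0 cm (range 170.5 to 187.5)
```

For the same parents, a girl's target is (180 + 165 − 13) ÷ 2 = 166.0 cm.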

With heritability h² = 0.8, MPH is essentially a linear regression:

Child height = β₀ + β₁ × (father + mother)/2 + ε

ε = unexplained variance = pathology, environment, polygenic differences.
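To make the regression view concrete, the sketch below fits β₀ and β₁ by ordinary least squares on a handful of made-up data points. The numbers are purely illustrative, the function name `fit_line` is an assumption, and sex is ignored for simplicity.

```python
from statistics import mean


def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up illustrative data: midparent height and child's adult height, in cm.
midparent = [160.0, 165.0, 170.0, 175.0, 180.0]
child = [163.0, 166.0, 171.0, 174.0, 177.0]

b0, b1 = fit_line(midparent, child)
# The residuals play the role of epsilon: what the parents' heights do not explain.
residuals = [c - (b0 + b1 * m) for m, c in zip(midparent, child)]
print(f"slope={b1:.2f}, intercept={b0:.1f}")
```

With this toy data the slope comes out below 1, which is the usual regression-toward-the-mean pattern: children of very tall or very short parents tend to be closer to average than their parents.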

What can AI add?

Richer features

Classic MPH uses only two numbers. A modern ML model can incorporate:

Parental heights (yes, the basics)
Child's sex
Child's current height (vs chronological age)
Parental ethnicity (national-reference baseline)
Parental birth years (secular trend effects)
Birth weight + length (intrauterine growth)
Infancy growth (catch-up patterns)
Pubertal timing (has it started?)
Sibling heights (sibling regression)
Socioeconomic proxies (mother's education, income brackets)

Gradient Boosting / XGBoost

A 2022 Kaggle competition (n=15,000) showed XGBoost reducing prediction error from ±8.5 cm for classic MPH to ±5.2 cm. On the same dataset, that is a 39% lower error.
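The 39% figure follows directly from the two error values quoted above. A quick check, with `relative_reduction` as an illustrative helper name:

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which the error shrinks relative to the baseline."""
    return (baseline - improved) / baseline


# (8.5 - 5.2) / 8.5 = 0.388..., which rounds to 39%.
print(f"{relative_reduction(8.5, 5.2):.0%}")  # 39%
```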

Tradeoff: the model uses 50+ features and explainability drops.

Polygenic Risk Score (PRS)

Height is associated with 700+ genetic variants (GIANT consortium meta-analysis, 2022). The PRS process:

Genotype 700 SNPs from the child's DNA
For each SNP, use the published effect size for height
Child's PRS = Σ (SNP_genotype × effect_size)
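The weighted sum in the last step can be sketched as follows. The SNP identifiers and effect sizes here are hypothetical placeholders, not real GIANT values, and the genotype is assumed to be coded as the count (0, 1 or 2) of the height-associated allele.

```python
def polygenic_score(genotypes: dict[str, int], effect_sizes: dict[str, float]) -> float:
    """Sum of (effect-allele count * effect size) over SNPs present in both inputs."""
    score = 0.0
    for snp, beta in effect_sizes.items():
        dosage = genotypes.get(snp)
        if dosage is None:
            # SNP not genotyped: skipped in this sketch.
            continue
        if dosage not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1 or 2")
        score += dosage * beta
    return score


# Hypothetical SNP IDs and effect sizes, for illustration only.
effects = {"snp_a": 0.25, "snp_b": -0.125, "snp_c": 0.125}
child_genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

prs = polygenic_score(child_genotype, effects)
print(prs)  # 0.375
```

The raw score is only meaningful relative to a reference population, so in practice it would typically be standardized before being combined with other predictors.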

PRS explains an additional 25% of height variance beyond parental height. PRS + MPH combined: prediction error around ±5 cm.

Limitation: cost + access

PRS DNA genotyping:

Clinical-grade: $200-400
Direct-to-consumer (23andMe): $99, not clinical quality
Pediatric PRS is rare in Turkey

So the practical application is AI on standard family data: parental heights, sex, age, current height and pubertal stage.

ML model architecture

Step 1: Data collection

For training, 10,000+ families:

Adult heights of parents (measured, not self-reported)
Known adult heights of children (18+ cohort)
All intermediate growth points (time series)

Step 2: Feature engineering

Z-score normalization
One-hot encode ethnicity
Age × sex interaction terms
Growth-velocity derivatives
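The four transformations above can be sketched in plain Python. Everything here is illustrative: the helper names, the category labels and the sample values are assumptions. "Z-score" is taken in the generic ML sense of standardizing a column; clinical height z-scores are instead computed against a growth reference.

```python
from statistics import mean, pstdev


def z_scores(values: list[float]) -> list[float]:
    """Standardize to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(value: str, categories: list[str]) -> list[int]:
    """Encode a category as a 0/1 vector, one slot per known category."""
    return [1 if value == c else 0 for c in categories]


def growth_velocity(ages_yr: list[float], heights_cm: list[float]) -> list[float]:
    """Height gain in cm/year between consecutive measurements."""
    points = list(zip(ages_yr, heights_cm))
    return [(h2 - h1) / (a2 - a1) for (a1, h1), (a2, h2) in zip(points, points[1:])]


# Hypothetical example values.
father_z = z_scores([170.0, 176.0, 182.0])
ethnicity_vec = one_hot("group_b", ["group_a", "group_b", "group_c"])
is_male = 1
age_sex = 8.0 * is_male  # age x sex interaction term
velocity = growth_velocity([6.0, 7.0, 8.0], [118.0, 124.0, 129.0])
print(ethnicity_vec, age_sex, velocity)  # [0, 1, 0] 8.0 [6.0, 5.0]
```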

Step 3: Model choice

XGBoost or LightGBM — feature importance is easy to interpret
Neural network (MLP) — captures more complex relationships but black-box
Bayesian regression — gives proper uncertainty intervals

Step 4: Validation

Train/test split (80/20)
5-fold cross-validation
Out-of-distribution test: different ethnicities
Bias analysis: sex, socioeconomic status
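The 80/20 split and 5-fold cross-validation can be sketched with index lists alone. The function names below are illustrative, and the sketch splits individual records; with family data it would likely be safer to split by family so that siblings never land on both sides of a split.

```python
import random


def train_test_split(n: int, test_fraction: float = 0.2, seed: int = 0) -> tuple[list[int], list[int]]:
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices: list[int], k: int = 5):
    """Yield (train, validation) lists; each index is used for validation exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


train_idx, test_idx = train_test_split(100)
splits = list(k_fold(train_idx, k=5))
print(len(train_idx), len(test_idx), len(splits))  # 80 20 5
```

The held-out 20% is touched only once, at the end; the five folds are used for model selection on the remaining 80%.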

Explainable AI (XAI)

Doctors ask "why 175 cm?" and the model must answer. SHAP (SHapley Additive exPlanations) or LIME can produce per-feature outputs such as:

"Mother's height 162 cm → +3.2 cm contribution"
"Current height 130 cm @ age 8 → +1.5 cm (high growth velocity)"
"Mother's early puberty → −2.0 cm (early pubertal closure)"

This transparency is critical for clinical adoption.
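To show what such per-feature contributions mean, the sketch below computes them by hand for a linear model, where (assuming independent features) the Shapley contribution of a feature reduces to its coefficient times the feature's distance from the population average. It does not use the `shap` library, the coefficients and averages are hypothetical, and tree models such as XGBoost need a dedicated algorithm rather than this shortcut.

```python
def predict(intercept: float, coefs: dict[str, float], x: dict[str, float]) -> float:
    """Linear model: intercept plus the weighted sum of features."""
    return intercept + sum(coefs[name] * x[name] for name in coefs)


def linear_contributions(coefs: dict[str, float], baseline: dict[str, float],
                         x: dict[str, float]) -> dict[str, float]:
    """Per-feature contribution: coefficient * (value - population average)."""
    return {name: coefs[name] * (x[name] - baseline[name]) for name in coefs}


# Hypothetical model and population averages, for illustration only.
intercept = -30.0
coefs = {"mother_cm": 0.5, "father_cm": 0.5, "current_height_cm": 0.25}
baseline = {"mother_cm": 162.0, "father_cm": 176.0, "current_height_cm": 128.0}
child = {"mother_cm": 170.0, "father_cm": 176.0, "current_height_cm": 130.0}

contributions = linear_contributions(coefs, baseline, child)
for name, value in contributions.items():
    print(f"{name}: {value:+.1f} cm")

# The contributions add up to the gap between this prediction and the average one.
gap = predict(intercept, coefs, child) - predict(intercept, coefs, baseline)
```

The additive property is what makes the explanation auditable: the listed contributions account for the whole difference between this child's prediction and the average prediction.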

ML MPH vs clinical gold standards

Method	Data needed	Error (±, cm)	Clinical acceptance
Tanner MPH	Parents' heights	8.5	High
Khamis-Roche	+ height + weight	5.6 (boys) / 4.3 (girls)	High
ML model (lite)	Tanner inputs + current height	6.0-7.0	Not yet approved
ML model (full)	+ puberty + siblings + ethnicity	5.0	Research stage
ML + PRS	+ DNA	4.0-5.0	Expensive, limited
Bayley-Pinneau (BA + height)	Bone-age x-ray	3.2 (boys)	High, clinical gold standard

Conclusion: ML beats classic MPH, but can't surpass BA-based methods. BA carries physiological truth; ML only infers.

ML MPH on the Çocuk Gelişim platform

Our current MPH tool uses the classic Tanner formula with ±8.5 cm 95% CI. Roadmap:

✅ Phase 0: Classic Tanner MPH (current)
🔄 Phase 1 (Q4 2026): Secular trend correction for Turkish population
🔄 Phase 2 (Q1 2027): ML model — XGBoost + localized feature set
🔄 Phase 3 (2027-2028): IRB-approved validation study (n=2,000 Turkish children)
🔄 Phase 4: Explainable AI dashboard — show contribution of each input

Ethical limits

Critical ethical questions:

Bias: If the model is trained mostly on children between p25 and p75, marginal populations (below p3, above p97) see degraded accuracy
Privacy: Family heights, ethnicity are sensitive data — KVKK compliance required
Decision impact: An AI prediction of "180 cm" can steer families toward basketball. A wrong signal can cause psychosocial harm
Training-data representation: Less than 5% of Turkish data covers Eastern Anatolia — risk of regional bias

FAQ

How much more accurate is AI MPH than classic?

Typical reports show 30-40% lower error. In practice, ±8.5 cm becomes ±5-6 cm. Whether that is enough for clinical decisions remains open, and it does not surpass BA-based methods.

Should I get my child's DNA tested for height?

For a healthy child, height prediction alone doesn't justify DNA testing. Pathological short stature (for example dwarfism or a syndromic cause) warrants a clinical genetics consult.

When will ML MPH ship on your platform?

The target is Q1 2027, after validation, KVKK compliance and licensing. It will launch as a Premium feature. For early access, sign up for the newsletter.

Is AI prediction more accurate than a doctor's estimate?

Numerically yes (by MAE). But a doctor adds clinical context — pathology, family history, exam findings. AI complements, doesn't replace.

Bottom line

AI and ML can improve MPH prediction accuracy, but cannot alone outperform clinical bone-age-based methods. Our platform offers classic clinical methods + a modern AI prototype — combining science and technology to give families the most reliable information.

Try the free MPH calculator, compare with Khamis-Roche and Bayley-Pinneau in Premium, and explore the Bone-Age AI research preview.
