14 Systematic Reviews and Meta-Analyses of Ossiculoplasty
What pooled evidence across thousands of implants says about extrusion rates, hearing gain, and material performance.
Why pool the evidence at all?
Ask ten otologists which prosthesis hears best and you may get ten confident answers, each anchored to a personal run of good cases. That is the problem the systematic review was invented to solve. Any single surgeon’s series is small, retrospective and shaped by who walked through the door; the play of chance and local habit can make almost any material look excellent or poor. A systematic review answers a pre-specified question by searching the literature exhaustively, appraising every eligible study against fixed criteria, and, when the studies are similar enough, statistically pooling their results in a meta-analysis. The pay-off is precision: combining thousands of operated ears narrows the confidence interval around an estimate that no one centre could ever produce alone.
This is why the systematic review and meta-analysis sit at the apex of the conventional hierarchy of evidence, above the randomised trial, the cohort study, the case series and, at the base, expert opinion and bench reasoning. But the apex is not magic. A meta-analysis is only ever as trustworthy as the studies it gathers: pool flawed, mismatched series and you get a precise answer to the wrong question. Nowhere is that caveat sharper than in ossiculoplasty, where high-quality randomised data are scarce and the literature is dominated by small retrospective series with inconsistent reporting [2025]. The pyramid for our field is, frankly, top-heavy: pooled reviews sit at the top, but the rungs beneath them are thinly populated.
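The precision pay-off of pooling can be sketched numerically. The example below is a minimal illustration, not the method of any cited review: it assumes a simple fixed-effect, inverse-variance pooling of mean air-bone gap improvement, and every number in `SERIES` is invented for the demonstration.

```python
import math

# Hypothetical single-centre series, invented for illustration only:
# (mean air-bone gap improvement in dB, standard deviation, number of ears)
SERIES = [
    (10.0, 9.0, 25),
    (14.0, 10.0, 40),
    (12.0, 8.0, 30),
    (13.0, 11.0, 60),
]


def standard_error(sd, n):
    """Standard error of a mean from its SD and sample size."""
    return sd / math.sqrt(n)


def pool_fixed_effect(series):
    """Inverse-variance (fixed-effect) pooled mean and its standard error."""
    weights = [1.0 / standard_error(sd, n) ** 2 for _, sd, n in series]
    total_weight = sum(weights)
    pooled_mean = sum(w * m for w, (m, _, _) in zip(weights, series)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_mean, pooled_se


def ci95(mean, se):
    """Approximate 95% confidence interval using the normal quantile 1.96."""
    return mean - 1.96 * se, mean + 1.96 * se


pooled_mean, pooled_se = pool_fixed_effect(SERIES)

for mean, sd, n in SERIES:
    low, high = ci95(mean, standard_error(sd, n))
    print(f"single series (n={n:2d}): {mean:4.1f} dB, 95% CI {low:4.1f} to {high:4.1f}")

low, high = ci95(pooled_mean, pooled_se)
print(f"pooled estimate:        {pooled_mean:4.1f} dB, 95% CI {low:4.1f} to {high:4.1f}")
```

The pooled interval is narrower than that of any single series, which is the whole attraction. A fixed-effect model assumes every study estimates the same underlying effect, and that assumption is exactly what mismatched series violate.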
What the headline numbers say
Strip the field down to the figures every otologist should carry into a consent conversation, and the pooled data are reassuringly consistent. The largest contemporary meta-analysis of titanium ossicular reconstruction gathered 40 studies and found a mean air-bone gap improvement of about 12 dB for partial prostheses (PORP) and about 17 dB for total prostheses (TORP), with a postoperative gap closed to within 20 dB in roughly 70% of PORP ears and 57% of TORP ears [2023]. Expressed differently, a typical patient can expect a meaningful hearing gain and a better-than-even chance of a near-normal result, but not a guarantee — a message far more honest than “this usually works.”
The other headline number is extrusion: the prosthesis slowly working its way out through the drum. A focused review of eighty titanium series put the pooled extrusion-or-dislocation rate at about 5%, but with a startling reported range from 0 to 35% [2023]. That range is itself a lesson: extrusion is uncommon on average yet highly dependent on technique, the ear and whether a cartilage shield protects the interface. A clean way to summarise the pooled position for a patient is therefore: a good chance of useful hearing improvement, a small but real chance of the prosthesis extruding, and a result that depends as much on their ear as on the device.
PORP versus TORP: a real gap, the wrong lesson
The most reproducible single finding in the pooled literature is that PORP outperforms TORP for gap closure. The titanium meta-analysis showed 70% versus 57% within 20 dB [2023]; an independent pediatric meta-analysis of 11 studies and 449 children found 62.5% versus 48.3% [2023]. The temptation is to conclude that the PORP is simply the better device. That is the wrong lesson, and recognising why is the heart of reading this literature well.
The PORP–TORP difference is largely confounding by indication. A surgeon reaches for a TORP precisely because the stapes superstructure is gone, and an absent superstructure marks a more extensively diseased ear, a longer drum-to-footplate span to reconstruct, and a worse starting point. The pooled data make this explicit: PORP ears began with a smaller preoperative gap, and the pediatric analysis found no difference in extrusion between PORP and TORP (odds ratio 1.08) [2023]. If the prostheses were genuinely different in quality you would expect a difference in extrusion; you do not see one. The hearing gap reflects the severity of the defect each device is chosen to treat, not the merit of the hardware. The clinical corollary is liberating: when anatomy demands a TORP, you place a TORP. The pooled figures describe your patient’s prognosis; they do not condemn your choice.
Does the material matter? The pooled verdict
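A toy calculation shows how confounding by indication can manufacture a device effect. The counts below are entirely hypothetical, and the "favourable" and "unfavourable" strata are an assumed stand-in for defect severity. In real practice the device and the anatomy are tightly linked, so strata this clean are rarely available.

```python
# Hypothetical counts, invented for illustration only.
# (device, ear status) -> (ears closed to within 20 dB, total ears)
COUNTS = {
    ("PORP", "favourable"): (64, 80),
    ("PORP", "unfavourable"): (8, 20),
    ("TORP", "favourable"): (24, 30),
    ("TORP", "unfavourable"): (28, 70),
}


def stratum_rate(device, status):
    """Success rate for one device within one severity stratum."""
    successes, total = COUNTS[(device, status)]
    return successes / total


def crude_rate(device):
    """Success rate for one device, ignoring severity altogether."""
    successes = sum(s for (d, _), (s, _) in COUNTS.items() if d == device)
    total = sum(n for (d, _), (_, n) in COUNTS.items() if d == device)
    return successes / total


for device in ("PORP", "TORP"):
    print(
        f"{device}: crude {crude_rate(device):.0%}, "
        f"favourable ears {stratum_rate(device, 'favourable'):.0%}, "
        f"unfavourable ears {stratum_rate(device, 'unfavourable'):.0%}"
    )
```

In this invented data set the two devices perform identically within each stratum, yet the crude comparison favours the PORP simply because it is placed mostly in favourable ears. That is the pattern the pooled PORP versus TORP figures are consistent with, although the sketch cannot prove it is the whole explanation.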
Surgeons argue passionately about titanium versus hydroxyapatite versus the patient’s own incus. The pooled evidence is almost deflating: for hearing, the materials are hard to separate. A Bayesian network meta-analysis of 17 studies and 1,273 patients found hydroxyapatite and titanium gave comparable air-bone-gap closure (a mean effect around 16 dB), with autologous tissue and Teflon ranking marginally highest for air- and bone-conduction outcomes; no prosthesis emerged as definitively superior across all endpoints [2025]. Comparative series tell the same story: hearing results overlap so heavily that material is, at most, a second-order determinant of audiometric success [2018].
Where the materials do separate is extrusion and handling, and this is what earns titanium its “benchmark” reputation despite equivalent hearing. In a head-to-head comparison of Plastipore, hydroxyapatite and titanium, hearing did not differ significantly within each prosthesis type, but titanium showed the lowest extrusion — most clearly for total prostheses (titanium 4% vs hydroxyapatite 8% vs Plastipore 15%) [2018]. Titanium is also light, rigid, non-ferromagnetic (MRI-compatible) and easily trimmed at the microscope. Hydroxyapatite, by contrast, is exceptionally biocompatible and tolerates direct contact with the drum, but is brittle and awkward to shape — limitations that drove the hybrid hydroxyapatite-head designs in the first place [1992]. So the pooled verdict is nuanced: choose your material for its safety and usability profile, because the hearing it delivers is broadly interchangeable.
| Question to the pooled data | What the reviews actually show |
|---|---|
| Which prosthesis hears best? | No consistent winner; titanium ≈ hydroxyapatite ≈ autograft for air-bone-gap closure |
| PORP or TORP gives better gap closure? | PORP (about 70% vs 57%), but driven by defect severity, not device quality |
| How often does a prosthesis extrude? | Pooled titanium about 5% (range 0–35%); titanium lowest, Plastipore highest |
| What most predicts a good result? | The middle-ear environment and defect, more than the material chosen |
Reading a review critically: heterogeneity and bias
A meta-analysis presents a single tidy number, but the trainee must learn to look behind it. The dominant threat in ossiculoplasty pooling is heterogeneity: the studies being combined are not really alike. They differ in disease severity, in surgical technique, in whether a cartilage shield was used, in length of follow-up, and, critically, in how they report hearing. One paper averages 0.5/1/2/3 kHz against the postoperative bone line; another quietly drops the worst high frequency or references an old bone line that flatters the result. Pool those together and the headline figure blurs. Reviews of the field repeatedly conclude that the available evidence is insufficient to crown an ideal prosthesis and call for larger, standardised studies [2025].
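The reporting problem can be made concrete with a single ear. The sketch below assumes the air-bone gap is the difference between the air-conduction and bone-conduction pure-tone averages; the audiogram values are hypothetical and chosen only to show how the three reporting habits diverge.

```python
FOUR_FREQS_KHZ = (0.5, 1, 2, 3)


def pure_tone_average(thresholds, freqs):
    """Mean threshold in dB HL over the chosen frequencies."""
    return sum(thresholds[f] for f in freqs) / len(freqs)


def air_bone_gap(air, bone, freqs=FOUR_FREQS_KHZ):
    """Air-conduction average minus bone-conduction average, in dB."""
    return pure_tone_average(air, freqs) - pure_tone_average(bone, freqs)


# Hypothetical thresholds (dB HL) for one operated ear, keyed by frequency in kHz.
post_air = {0.5: 35, 1: 30, 2: 30, 3: 45}
post_bone = {0.5: 10, 1: 10, 2: 15, 3: 20}
pre_bone = {0.5: 15, 1: 15, 2: 20, 3: 25}  # older, poorer bone line

# Four frequencies against the postoperative bone line.
standard_gap = air_bone_gap(post_air, post_bone)
# Same ear with the worst high frequency (3 kHz) left out.
dropped_gap = air_bone_gap(post_air, post_bone, freqs=(0.5, 1, 2))
# Same ear referenced to the old preoperative bone line.
old_bone_gap = air_bone_gap(post_air, pre_bone)

print(f"4-frequency, postoperative bone line: {standard_gap:.1f} dB")
print(f"3 kHz dropped:                        {dropped_gap:.1f} dB")
print(f"old bone line:                        {old_bone_gap:.1f} dB")
```

With these invented thresholds the same ear reports a gap of roughly 21, 20 or 16 dB depending on the convention, which is enough to move it across a 20 dB success threshold.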
Three specific cautions deserve naming:
- Confounding by indication. As with PORP versus TORP, the device or technique is chosen because of the ear, so apparent device effects partly encode disease severity [2023].
- The environment dominates the material. Statistical prognostic staging (the OOPS index) shows that mucosal health, aeration, drainage, the ossicular remnant and prior surgery drive outcome more than the prosthesis, which is exactly why materials look so similar once pooled [2001].
- Wide ranges hide behind tidy means. An extrusion rate of “about 5%” spanning 0–35% tells you the average is unremarkable but the variance is everything; your technique sits somewhere on that spread [2023].
The skill, then, is to read the pooled estimate and its spread, to ask what was pooled, and to treat a confident-looking forest plot built from small retrospective series with appropriate humility.
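Heterogeneity can also be quantified rather than just suspected. The text does not name a statistic, so the following is an assumption: a minimal sketch of Cochran's Q and the I² statistic, one widely used pair of measures, applied to two hypothetical sets of study results with invented effects and standard errors.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared as a percentage)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond what chance alone would give.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical mean gap improvements (dB) and standard errors, for illustration only.
similar_studies = ([12.0, 12.5, 11.5, 12.0], [1.5, 1.5, 1.5, 1.5])
mismatched_studies = ([5.0, 12.0, 20.0, 9.0], [1.5, 1.5, 1.5, 1.5])

for label, (effects, ses) in (
    ("similar", similar_studies),
    ("mismatched", mismatched_studies),
):
    pooled, q, i_squared = heterogeneity(effects, ses)
    print(f"{label:10s}: pooled {pooled:.1f} dB, Q = {q:.1f}, I2 = {i_squared:.0f}%")
```

Both sets return a pooled mean near 12 dB, but only the first deserves to be read as a single number. A high I² is a signal to ask what was pooled before quoting the average.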
Using pooled evidence at the bedside
How should the clinician actually use this body of evidence? Not as a league table of prostheses, but as a source of honest, quantified counselling and risk stratification. A defensible synthesis runs as follows:
- Counsel with real numbers. Tell a PORP candidate they have roughly a two-in-three chance of closing the gap to near-normal and a small (about 5%) chance of extrusion; tell a TORP candidate the odds are a little lower because their ear is more diseased, not because the device is worse [2023, 2023].
- Choose material for safety and handling, not for hearing. Since pooled hearing is comparable, let extrusion profile, MRI compatibility, trimmability and cost decide, which in many hands means titanium with a cartilage shield, or a well-sculpted autograft when the tissue is available [2025, 2018].
- Optimise the ear before you blame the device. Because the environment outweighs the material, aeration, mucosal health and staging will move your results further than any change of prosthesis brand [2001].
- Audit yourself against the pooled benchmark. If your gap-closure rate sits well below the pooled 70% for PORP, or your extrusion well above 5%, the review gives you a reference line against which to interrogate your own technique and reporting [2023].
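The audit step in the list above can be sketched as a comparison between a personal series and the pooled benchmark. The example assumes a Wilson score interval as the way to express uncertainty in a small series, which is a choice made here rather than one taken from the reviews; the audit counts are hypothetical, and only the 70% PORP benchmark comes from the text.

```python
import math

POOLED_PORP_BENCHMARK = 0.70  # pooled proportion closing to within 20 dB, from the text


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (approximately 95% when z = 1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def audit(successes, n, benchmark=POOLED_PORP_BENCHMARK):
    """Classify a personal series relative to the benchmark."""
    low, high = wilson_interval(successes, n)
    if high < benchmark:
        verdict = "below benchmark"
    elif low > benchmark:
        verdict = "above benchmark"
    else:
        verdict = "compatible with benchmark"
    return low, high, verdict


# Hypothetical personal audits: (ears closed to within 20 dB, total PORP ears).
for successes, n in ((18, 30), (10, 30)):
    low, high, verdict = audit(successes, n)
    print(f"{successes}/{n}: 95% CI {low:.0%} to {high:.0%} -> {verdict}")
```

A series of 30 ears carries a wide interval, so a raw rate below 70% is not automatically a sign of poor technique. The comparison only becomes pointed when the whole interval sits below the benchmark, and even then case mix and reporting convention need checking first.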
The mature position is neither to worship the meta-analysis nor to dismiss it. Pooled across thousands of implants, the evidence delivers a stable, usable picture: good but imperfect hearing gains, low but variable extrusion, and materials that differ more in safety than in sound. Its limitations — heterogeneity, confounding and a thin randomised base — are reasons to read it carefully, not to ignore it. Used well, the systematic review turns a field of competing anecdotes into a set of numbers you can quote to a patient and measure yourself against [2025, 2025].
What is the most accurate reading of why PORP outperforms TORP across these pooled analyses?
A systematic review and meta-analysis sits near the top of the evidence hierarchy. What does it actually do that an individual case series does not?
Pooling thousands of implants across reviews, roughly what proportion of PORP ossiculoplasties close the air-bone gap to within 20 dB, and what is the approximate pooled extrusion rate for titanium prostheses?
Meta-analyses comparing titanium and hydroxyapatite consistently report no significant difference in hearing outcome, yet titanium is often described as the benchmark. What best reconciles these two statements?
When applying pooled ossiculoplasty evidence to your own practice, which limitation most threatens the validity of the headline numbers?