2026 · June

MegaCALR: AI-Driven, Single-Cell Quantification of CALRmut in Megakaryocytes

Thomas Fletcher

Machine Learning Scientist

At EHA 2026 in Stockholm, we presented MegaCALR IHC, a deep-learning pipeline that detects, segments, and classifies every megakaryocyte on a CALRmut immunohistochemistry (IHC) slide and reports per-cell counts and class proportions in place of a slide-level grade.

Its agreement with board-certified hematopathologists falls within the range of agreement among the pathologists themselves, and remains stable across cell size, stain intensity, diagnosis, and scanners.

Although validated here on CALRmut IHC, the underlying method generalises to quantitative IHC for other markers and targeted therapies.

Targeted therapies need quantitative tissue readouts

Developing a targeted therapy requires measuring its target: which patients express it, at what level, and whether expression changes under treatment. IHC makes the target visible in tissue, staining the protein cell by cell on a slide that a pathologist then reads.

The limitation lies in the readout itself. The pathologist typically returns a single grade, often no more than positive or negative for the whole slide. Such a grade is sufficient for diagnosis but cannot quantify a target across thousands of cells, track how expression changes over months of treatment, or serve as a trial endpoint. The relevant detail is present in the tissue, yet the grade discards most of it, and recovering it manually, by annotating every relevant cell on every slide, does not scale to a clinical trial.

We develop AI that recovers this detail: identifying every relevant cell, measuring how strongly each expresses the marker, and returning quantities that are comparable across slides. MegaCALR applies this approach to CALRmut.

Why we started with CALRmut

Calreticulin (CALR) mutations are the second most common driver of myeloproliferative neoplasms. Mutant CALR activates the thrombopoietin receptor and drives proliferation of megakaryocytes, the large platelet-producing cells of the bone marrow. With CALR-targeted therapies now in development, there is a near-term need to measure CALRmut expression in these cells accurately and at scale.

This is currently difficult. CALRmut IHC is reported as a binary, slide-level grade, and the only route to cell-level detail, the manual annotation of megakaryocytes, does not scale to a cohort, let alone a trial. MegaCALR removes this constraint by detecting, segmenting, and classifying every megakaryocyte on a CALRmut whole-slide image.

How MegaCALR works

MegaCALR processes a whole-slide image in three stages. Quality-control models first restrict the analysis to assessable tissue, excluding regions such as bone and haemorrhage. A segmentation model then delineates every megakaryocyte, and a classification model assigns each to a CALRmut staining class: Negative, Low Positive, or High Positive.

Figure: The MegaCALR pipeline. Quality control restricts analysis to assessable tissue; input tiles then pass through a megakaryocyte segmentation model and a classification model, producing a whole-slide map of classified megakaryocytes and per-class metrics. Figure from the EHA 2026 poster.

The output is a megakaryocyte count together with the per-class proportions, namely the fraction of cells that are CALRmut Negative, Low Positive, and High Positive. These quantities are directly comparable across patients, time-points, and trial arms.
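As a minimal sketch of this readout (not the MegaCALR implementation), the count and per-class proportions can be derived from a list of per-cell class labels. The function name `summarise_slide` and the flat-list input format are assumptions made for illustration.

```python
from collections import Counter

# The three CALRmut staining classes named in the text.
CLASSES = ("Negative", "Low Positive", "High Positive")


def summarise_slide(cell_classes):
    """Return the megakaryocyte count and per-class proportions for one slide.

    `cell_classes` is a hypothetical input: one class label per detected
    megakaryocyte, as a classifier might emit.
    """
    counts = Counter(cell_classes)
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unexpected class labels: {sorted(unknown)}")
    total = len(cell_classes)
    # Report 0.0 for every class on an empty slide rather than dividing by zero.
    proportions = {c: (counts[c] / total if total else 0.0) for c in CLASSES}
    return {"count": total, "proportions": proportions}


if __name__ == "__main__":
    labels = ["Negative", "High Positive", "High Positive", "Low Positive"]
    print(summarise_slide(labels))
```

Because the proportions are normalised by the slide's own megakaryocyte count, they can be compared between slides with very different cellularity.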

Validated against five pathologists

We evaluated MegaCALR against expert consensus. Five board-certified hematopathologists independently annotated the validation data. For detection, each pathologist outlined every megakaryocyte in each field of view, and the reference standard was defined only where readers agreed. For classification, each assigned every megakaryocyte to one of the three CALRmut classes (or to “not a megakaryocyte”), and the reference was taken as the majority vote, with ties excluded.
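The classification reference can be illustrated with a short sketch. It assumes "majority vote" means the most frequently assigned label, with a cell excluded when two or more labels tie for first place; the function name `consensus_label` is hypothetical, and the exact voting rule used in the study may differ in detail.

```python
from collections import Counter


def consensus_label(votes):
    """Return the most-voted label for one cell, or None if the top vote is tied.

    `votes` holds one label per reader, e.g. "Negative", "Low Positive",
    "High Positive" or "not a megakaryocyte". Cells that return None would be
    excluded from the reference standard.
    """
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    # A tie for first place means there is no single winning label.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    clear = ["High Positive", "High Positive", "Low Positive", "High Positive", "Negative"]
    tied = ["Negative", "Negative", "Low Positive", "Low Positive", "High Positive"]
    print(consensus_label(clear))
    print(consensus_label(tied))
```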

The pipeline was trained and validated on CALRmut whole-slide images from our own dataset of routine diagnostic samples, with validation performed on a held-out set that reflects routine practice rather than a curated subset.

One fact frames the evaluation: the pathologists do not fully agree with one another. Inter-observer variability is substantial, both in identifying megakaryocytes and in classifying their CALRmut expression. The relevant question is therefore not whether the model is perfect, but whether its agreement with consensus falls within the range observed among the experts.

Figure: Pathologists vary in which cells they call megakaryocytes (left: the same megakaryocytes segmented by each of five pathologists and by MegaCALR) and in the CALRmut class they assign to a given cell (right: a grid of per-cell classifications as Negative, Low Positive, or High Positive). On both, MegaCALR (MC) matches their consensus. Figure from the EHA 2026 poster.

Expert-level agreement

MegaCALR's agreement with consensus falls within the expert range on every axis we measured: a segmentation F1 of 0.76, a three-class quadratic weighted kappa (QWK) of 0.90, and a binary QWK of 0.95. MegaCALR agrees with the expert consensus about as closely as the experts agree with one another.
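For readers unfamiliar with these metrics, the sketch below shows their standard definitions in plain Python. It is an illustration rather than the evaluation code used in the study: the function names are hypothetical, classes are assumed to be integer-encoded in order of staining strength (0 = Negative, 1 = Low Positive, 2 = High Positive), and the rule for matching predicted to reference cells, which determines the true and false positive counts, is not specified here.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 from detection counts: the harmonic mean of precision and recall."""
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0


def quadratic_weighted_kappa(a, b, n_classes):
    """Cohen's kappa with quadratic weights for two ordinal ratings.

    `a` and `b` are equal-length sequences of integer class indices.
    Disagreements are penalised by the squared distance between classes.
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for i, j in zip(a, b):
        observed[i][j] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]        # observed disagreement
            den += weight * row[i] * col[j] / n   # disagreement expected by chance
    if den == 0:
        raise ValueError("kappa is undefined when all ratings fall in one class")
    return 1.0 - num / den


if __name__ == "__main__":
    model = [0, 1, 2, 2, 1, 0]
    consensus = [0, 1, 2, 1, 1, 0]
    print(quadratic_weighted_kappa(model, consensus, n_classes=3))
    print(f1_score(true_pos=8, false_pos=2, false_neg=2))
```

With only two classes the quadratic weights reduce to 0 and 1, so the binary QWK is the same quantity as unweighted Cohen's kappa.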

Figure: Per-field-of-view segmentation F1 (left) and three-class classification QWK (right), comparing MegaCALR (MC) with each of the five pathologists. MegaCALR falls within the range of the pathologists on both. Figure from the EHA 2026 poster.

Robustness across strata

A good average is of little use if performance collapses on difficult cases. MegaCALR remained within the pathologist range across megakaryocyte diameters, quartiles of stain (DAB) intensity, and diagnoses. Its weakest performance, on the smallest megakaryocytes (under 20 µm), coincided with the cases on which the pathologists themselves agreed least with consensus. Performance therefore degrades where the underlying task is intrinsically hard, not where the model is brittle.
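A stratified check of this kind can be sketched as follows. The example groups cells by diameter and reports, per bin, the fraction on which a rater matches consensus; the same function could be run for the model and for each pathologist to compare them stratum by stratum. Only the 20 µm boundary comes from the text: the 40 µm edge, the tuple input format, and the use of simple agreement instead of F1 or QWK are assumptions made to keep the illustration short.

```python
from bisect import bisect_right
from collections import defaultdict


def agreement_by_diameter(cells, edges_um=(20.0, 40.0)):
    """Fraction of cells matching consensus within each diameter bin.

    `cells` is an iterable of (diameter_um, rater_label, consensus_label).
    Bin 0 holds diameters strictly below edges_um[0], bin 1 the next
    interval, and so on; empty bins are omitted from the result.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for diameter, rater, consensus in cells:
        bin_index = bisect_right(edges_um, diameter)
        totals[bin_index] += 1
        hits[bin_index] += int(rater == consensus)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    # Made-up records, for illustration only.
    cells = [
        (15.0, "Negative", "Low Positive"),
        (18.0, "Negative", "Negative"),
        (25.0, "High Positive", "High Positive"),
        (45.0, "High Positive", "High Positive"),
    ]
    print(agreement_by_diameter(cells))
```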

Figure: Segmentation F1 broken down by diagnosis, stain (DAB) intensity quartile, and megakaryocyte diameter, alongside an inter-scanner agreement scatter plot. MegaCALR holds within the pathologist range across strata, and its counts agree closely between scanners. Figure from the EHA 2026 poster.

Reproducibility across scanners

Reproducibility is most critical in the trial setting, and is something a human reader cannot guarantee. We acquired the same slides on different scanners. Megakaryocyte counts were almost perfectly reproducible (intraclass correlation 0.99), and the CALRmut class proportions nearly so (0.94–0.99), with only a small, non-significant bias between scanners.
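The intraclass correlation can be computed from a slides-by-scanners table of measurements. The sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the text above does not state which ICC form was used, so this choice, like the function name `icc_2_1`, is an assumption.

```python
def icc_2_1(table):
    """ICC(2,1) for a table with one row per slide and one column per scanner."""
    n = len(table)       # number of slides
    k = len(table[0])    # number of scanners
    if n < 2 or k < 2 or any(len(row) != k for row in table):
        raise ValueError("need a rectangular table with >= 2 rows and >= 2 columns")
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]
    # Two-way ANOVA sums of squares: slides, scanners, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all measurements are identical")
    return (ms_rows - ms_error) / denom


if __name__ == "__main__":
    # Made-up megakaryocyte counts for three slides on two scanners.
    counts = [[10, 12], [20, 22], [30, 32]]
    print(icc_2_1(counts))
```

Because this form measures absolute agreement, a constant offset between scanners lowers the ICC slightly even when the slides are ranked identically, which is why it is a reasonable choice when the counts themselves, not just their ordering, must reproduce.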

A general method for quantitative IHC

CALR illustrates a general method: converting an IHC stain into single-cell, per-class quantification. Restrict the analysis to assessable tissue, identify the cells that carry the signal, classify how strongly each expresses the marker, and report reproducible counts and proportions in place of a slide-level grade.

The method is not specific to calreticulin or to megakaryocytes; it extends to other markers, cell types, tumour types, and targeted therapies. For a therapy in development that requires a scalable, reproducible readout of its target in tissue, whether for patient selection, a pharmacodynamic measure, or a response endpoint, this is the class of biomarker we build.

We are extending the approach to further stains and indications. If a quantitative tissue readout is on the critical path for a therapy you are developing, please get in touch.

Presented at the European Hematology Association (EHA) 2026 Congress, Stockholm, Sweden, June 11–14, 2026. MegaCALR IHC was developed at Ground Truth Labs on our own dataset, with independent megakaryocyte annotations contributed by five external hematopathologists.