Analysis

How many cells do you need to reliably detect a population of interest?

Summary
The answer to this common question is more complex than you'd think. We use real-world examples to break down how reliability depends on the confidence you need, the variability you can accept, and of course the frequency of the population you're interested in measuring.
Article

The answer depends on the confidence you need and the variability you can accept. Variability is expressed by the coefficient of variation (CV), which is simply the standard deviation divided by the mean. The higher this number, the more variable the measurement; the lower the number, the less variable the measurement. It also depends on how frequent the population is: intuitively, a population that appears a lot, say 10% of the time, needs fewer cells than a population that occurs 0.001% of the time.
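As a quick illustration of the CV definition above, here is a minimal Python sketch. The replicate values are hypothetical, made up purely for illustration, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the article does not say which one is used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return the CV in percent: standard deviation divided by the mean, times 100.

    Assumption: uses the sample standard deviation (n - 1 denominator).
    """
    return 100 * stdev(values) / mean(values)

# Hypothetical replicate measurements of one population's frequency (% of parent).
replicates = [0.45, 0.47, 0.50, 0.46, 0.48]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.2f}%")
```

A lower printed value means the replicates agree more closely with each other.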

For purposes of this answer, let’s consider two populations: intermediate monocytes (inMono) and T cells. These figures are derived from our mass cytometry validation report; however, we believe these reference ranges are applicable regardless of the instrument type (spectral cytometry or mass cytometry).

Table 1: Reference Range Values for PBMCs from Healthy Subjects, % of non-granulocytes

| Cell Population | Median | Inter-run CV% | Intra-run CV% |
|---|---|---|---|
| inMono | 0.47% | 16.09% | 4.63% |
| T cells | 37.28% | 0.39% | 1.27% |

Luckily, this question has been addressed in the work of Keeney et al. To wit, “for cell-based assays such as flow cytometry, a simple calculation can be used to determine the size of the database/sample that will provide a given precision: r = (100/CV)^2; where r is the number of events meeting the required criterion, and CV is the coefficient of variation of a known positive control.”
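The formula quoted above is easy to evaluate directly. The sketch below is only an illustration of that one equation; the function name `events_of_interest_needed` is our own and does not come from Keeney et al.

```python
def events_of_interest_needed(desired_cv_percent):
    """Keeney's formula r = (100 / CV)^2, with CV given in percent."""
    return (100 / desired_cv_percent) ** 2

# The four precision levels discussed in this article.
for cv in (1, 5, 10, 20):
    r = events_of_interest_needed(cv)
    print(f"Desired CV {cv:>2}% -> r = {r:,.0f} events of interest")
```

Note that r counts only the events belonging to the population of interest, not the total number of events acquired.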

We’ve adapted Keeney’s table below:

Table 2: “Determination of database/sample size that will provide a given precision in rare event analysis”, Keeney et al.

| Desired Coefficient of Variation (%) | | 1 | 5 | 10 | 20 |
|---|---|---|---|---|---|
| r = number of events of interest | | 10,000 | 400 | 100 | 25 |
| When occurring at a frequency of (fraction): | 1:n | Total number of events which must be collected | | | |
| 0.1 | 10 | 100,000 | 4,000 | 1,000 | 250 |
| 0.01 | 100 | 1,000,000 | 40,000 | 10,000 | 2,500 |
| 0.001 | 1,000 | 10,000,000 | 400,000 | 100,000 | 25,000 |
| 0.0001 | 10,000 | 100,000,000 | 4,000,000 | 1,000,000 | 250,000 |
| 0.00001 | 100,000 | 1,000,000,000 | 40,000,000 | 10,000,000 | 2,500,000 |
| 0.000001 | 1,000,000 | 10,000,000,000 | 400,000,000 | 100,000,000 | 25,000,000 |
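The totals in Table 2 follow from multiplying r by n for a population that occurs once in every n events. Here is a minimal sketch that regenerates those rows; the function names are hypothetical, and integer 1:n frequencies are used to keep the arithmetic exact.

```python
def events_of_interest(desired_cv_percent):
    """Keeney's r = (100 / CV)^2, with CV given in percent."""
    return round((100 / desired_cv_percent) ** 2)

def total_events(desired_cv_percent, one_in_n):
    """Total events to collect when the population occurs once per `one_in_n` events."""
    return events_of_interest(desired_cv_percent) * one_in_n

cvs = (1, 5, 10, 20)
for one_in_n in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    row = [total_events(cv, one_in_n) for cv in cvs]
    print(f"1:{one_in_n:<9,}", "  ".join(f"{n:>14,}" for n in row))
```

Each printed row corresponds to one frequency row of the table, with columns for 1%, 5%, 10%, and 20% CV.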

Now, let’s couple that with the instruments we have at our disposal: mass cytometry and spectral flow cytometry. Based on our experience, mass cytometry and spectral flow have recovery rates of 50% and 90%, respectively.

Table 3: Estimated events for Peripheral Blood Mononuclear Cell collection

| Drawn from Patient (mL) | Cells per mL (M) | Number of Cells in a Vial (M) | Instrument | Recovery Rate | Resulting Events (M) |
|---|---|---|---|---|---|
| 3 | 1.8 | 5 | Mass Cytometry | 50% | 2.63 |
| 3 | 1.8 | 5 | Spectral Flow | 90% | 4.73 |

Table 4: Estimated events for Whole Blood collection

| Drawn from Patient (mL) | Cells per mL (M) | Number of Cells in Whole Blood (M) | % Granulocytes | Approx. Number of Granulocytes (M) | Non-Granulocytes (M) | Instrument | Recovery Rate | Resulting Events (M) | Resulting Non-Granulocyte Events (M) |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 5 | 10 | 50% | 5 | 5 | Mass Cytometry | 50% | 5.00 | 2.50 |
| 2 | 5 | 10 | 50% | 5 | 5 | Spectral Flow | 90% | 9.00 | 4.50 |

For PBMCs, you can get 0.5M to 10.8M events, depending on the volume of blood collected. Table 3 shows only the average range, i.e. 2.63M to 4.73M. For whole blood, you can get 4M to 10.8M total events. Since at least 50% of the cells will be granulocytes, that leaves roughly 2M to 5.4M non-granulocytes. For readability, Table 4 shows only a narrower range, from 2.5M to 4.5M.
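The whole blood arithmetic in Table 4 can be sketched as follows. The input values are the ones shown in the table (2 mL drawn, 5M cells per mL, 50% granulocytes, and recovery rates of 50% and 90%). The variable names are our own, and this is an illustration of the table's arithmetic rather than a sample-planning tool.

```python
# Inputs taken from Table 4 (whole blood); all cell counts are in millions (M).
volume_ml = 2
cells_per_ml_m = 5
granulocyte_fraction = 0.50
recovery_rates = {"Mass Cytometry": 0.50, "Spectral Flow": 0.90}

total_cells_m = volume_ml * cells_per_ml_m                      # cells in the draw
non_granulocytes_m = total_cells_m * (1 - granulocyte_fraction)  # cells left after excluding granulocytes

estimates = {}
for instrument, recovery in recovery_rates.items():
    resulting_events_m = total_cells_m * recovery
    resulting_non_gran_m = non_granulocytes_m * recovery
    estimates[instrument] = (resulting_events_m, resulting_non_gran_m)
    print(f"{instrument}: {resulting_events_m:.2f}M events, "
          f"{resulting_non_gran_m:.2f}M non-granulocyte events")
```

Changing `volume_ml` or `granulocyte_fraction` shows how quickly the usable event count moves with the sample itself, before any instrument is involved.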

Now, let’s come back to the two populations from Table 1, inMono and T cells.

Table 5: Desired CV for inMono

| Desired CV | 1% | 5% |
|---|---|---|
| Teiko observed inMono Median % of non-Granulocytes | 0.47% | 0.47% |
| Total Number of events that must be collected, based on Keeney Table | 1,000,000 | 40,000 |
| Estimated inMono Population needed to achieve CV | 4,700 | 188 |
| Actual Teiko collected inMono events | 302 | 302 |
| Above Threshold? | No | Yes |
| Actual Teiko Intra-Run CV | 4.63% | 4.63% |
| Actual Teiko Inter-Run CV | 16.09% | 16.09% |
| Industry Standard CV Acceptance Criteria | 25–30% | 25–30% |

For a 5% CV on a 0.47% population, you would need about 188 inMono events. We collected 302, so we were above the threshold. Turns out, we were right in line with that target for intra-run CV, at 4.63%, while our inter-run CV came in at 16.09%, still within the industry-standard acceptance criteria of 25–30%.

Table 6: Desired CV for T cells

| Desired CV | 1% | 5% |
|---|---|---|
| Teiko observed T-cell Median % of non-Granulocytes | 37.28% | 37.28% |
| Total Number of events that must be collected, based on Keeney Table | 100,000 | 4,000 |
| Estimated T-cell Population needed to achieve CV | 37,280 | 1,491 |
| Actual Teiko collected T-cell events | 39,210 | 39,210 |
| Above Threshold? | Yes | Yes |
| Actual Teiko Intra-Run CV | 1.27% | 1.27% |
| Actual Teiko Inter-Run CV | 0.39% | 0.39% |
| Industry Standard CV Acceptance Criteria | 25–30% | 25–30% |

Now, T cells are much more populous. In our precision study, we collected 39,210 events, which exceeds both the 37,280 events needed for a 1% CV and the 1,491 events needed for a 5% CV. And, as it turns out, we clocked CVs of 1.27% intra-run and 0.39% inter-run.
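The threshold checks in Tables 5 and 6 can be reproduced with a short sketch. It follows the arithmetic used in those tables: take the total event count from the Keeney table (the totals shown match the 1:100 row for inMono and the 1:10 row for T cells), multiply by the observed median frequency, and compare with the events actually collected. The function and variable names are hypothetical.

```python
# Values copied from Tables 5 and 6; table_totals maps desired CV (%) to the
# total events taken from the Keeney table.
populations = {
    "inMono": {"fraction": 0.0047, "collected": 302,
               "table_totals": {1: 1_000_000, 5: 40_000}},
    "T cells": {"fraction": 0.3728, "collected": 39_210,
                "table_totals": {1: 100_000, 5: 4_000}},
}

def population_events_needed(table_total, fraction):
    """Estimated events of the population within `table_total` collected events."""
    return round(table_total * fraction)

results = {}
for name, p in populations.items():
    for cv, table_total in p["table_totals"].items():
        needed = population_events_needed(table_total, p["fraction"])
        above = p["collected"] >= needed
        results[(name, cv)] = (needed, above)
        print(f"{name}, desired CV {cv}%: need {needed:,}, "
              f"collected {p['collected']:,}, above threshold: {above}")
```

Swapping in your own observed frequency and collected event count gives the same comparison for any other population.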

So, hopefully this gives you an intuition for “How many cells do you need to reliably detect a population of interest?” Interested in capturing a specific population?

Contact us to discuss further!
