I have an informatics team. Why do I need you?

Teiko saves your scientists time and effort by streamlining data processing and analysis, ensuring organized and reliable results. Our advantage is we knit all the pieces together: specimen collection, tracking, panels, report validation, and data delivery. In data delivery, we provide an analytical pipeline that transforms complex cytometry data into actionable insights. 

Our analytical pipeline provides automated statistical analysis with access via a web app built upon a relational database. This enables you to spend more time finding answers to clinical trials. Reproducing our analytical pipeline (unit testing, relational database, and continuous integration/delivery), could take the better part of a year. Instead, insights are unlocked within weeks. With our focus on continuous integration and continuous deployment, you will also benefit from our ongoing enhancements in visualization, statistical analysis, and clustering.

If you work with us, you gain access to a team dedicated to optimizing and managing your data pipeline, freeing your informatics team to focus on other critical tasks. With automated unit-testing, peer-reviewed changes, and modern software engineering practices, we ensure that our process is not only reliable but also transparent and fully trackable.

Why not just use flat files?

To receive and digest data from complex, high-dimensional datasets, we implemented an analytical pipeline to extract, transform, and load this information into a relational database. The design of the relational database is crucial for high performance and rapid data extraction. Our approach ensures efficient data handling, scalability, and ease of querying. In Figure 1 we show the design of just a single portion of our relational database. This diagram shows the tables used to manage clustering results and gating statistics.

Figure 1: A portion of our relational database.

One of the major false economies of relying on flat files instead of a relational database is the initial perception of simplicity and lower cost. While flat files may seem straightforward and cost-effective for small-scale projects, they quickly become a liability as data complexity and volume increase. 

Flat files often suffer from slow access times, limited ability to handle concurrent queries, and increased risk of data corruption. This makes flat files inefficient for large-scale data operations. The lack of scalability and difficulty in managing data integrity can lead to significant delays and errors in data processing. In contrast, a well-designed relational database supports rapid data retrieval while maintaining data integrity. The upfront investment Teiko has made in a relational database system is offset by the long-term gains in efficiency, reliability, and scalability. This provides you with a sustainable solution for high-dimensional data analysis. 

For example, consider a research team using a flat file to store data from a large-scale clinical trial. As the data grows, they notice accessing specific subsets of the data becomes increasingly slow. Also running complex queries requires substantial manual effort. 

With our design complex queries such as:

  • “At day 42, is there a statistically significant decrease in state marker HLA-DR median channel value within B cells in complete responders but not in Partial or Non-Responders?”, or
  • “Among cohort 3 samples, do we see a statistically significant increase in B-cells between day 2 and day 3?”

…are answered easily.

Here’s a comparison between a flat file methodology and a relational database:

FeatureFlat FileRelational Database
ComplexitySimpleHighly Complex
Time costLowInitially high
SuitabilitySuitable for small scale data analysisSuitable for high-dimensional data analysis
Access timeSlowRapid
Data corruption riskHigh riskMaintains consistency and integrity
ScalabilityPoorHighly scalable
Data integrity managementDifficultEfficient and reliable
Long term efficiencyInefficient for large-scale operationsGains in efficiency overtime
ReliabilityProne to errors and delaysHighly reliable
SustainabilityIncrease in liability as data complexity growsRobust and sustainable solution
CollaborationDifficultDesigned for concurrent access
Table 1: A comparison of features between flat files and relational databases for data management.

How complex is the data, and can I trust your statistical approach?

We can free up weeks or months of time for your clinical and translational teams. One of the biggest complaints we hear from drug developers using conventional flow cytometry vendors is, “How am I supposed to analyze this data?” 

For an average clinical project where 162 samples are run through our pipeline, the gated data alone returns approximately 160,000 data points. These 160,000 data points result in approximately one million statistical comparisons for three doses across nine timepoints. This can easily double or triple as the number of endpoints increase. Imagine comparing adverse events, responder vs non responder, and complex dosing regimens. For even experienced informatics teams going from raw data to analysis can take weeks, with queues as long as months. With Core and Premier, analysis is delivered alongside the raw data, and we execute the statistical analysis automatically.

To showcase the complexity of the high dimensional data, our Immune Differences (marker expression) data table returns approximately 550 plots. Each plot shows the median channel expression of markers on all cell types. We perform statistical analysis and apply multiple test corrections to the results. This is crucial because as the number of tests increases, the probability of false positives also increases. Multiple test correction prevents false positives from being reported as significant results. 

Accessing these results is easy with our user interface, allowing rapid drill-down on significant findings. In Figure 2 we show the experience using the web-app to rapidly identify significant results. On our web-app it is simple to toggle between a paired and an unpaired analysis. Initially there are no significant differences from the unpaired analysis, but once the user returns a paired analysis multiple results are returned as significant.

Figure 2: An example on how to use the web app to drill down to significant results.

Why are you confident in the analytical software? 

Every step of our analytical pipeline is tested and organized into software packages. We employ automated unit-tests. Each function has a unit-test, which acts as a piece of code that tests the hypothesis of the original code. Unit-tests prove that the code performs as expected, which is why testing is crucial. Just like in science, proving the hypothesis is core to ensuring reliability.

For instance, consider our function convert_dataset_to_statistical_input, which converts data for statistical analysis. One unit test checks whether the function correctly handles cases where some values are missing in the input data. The test test_convert_gating_dataset_to_statistical_test_input_negative intentionally provides a dataset with missing values. If the function tries to process this data, it should raise a ValueError indicating which samples are missing. This unit test would catch issues during development, ensuring that our pipeline correctly identifies and handles missing data before it causes problems in a real-world analysis.

Another example is preparing tallies for statistical analysis. Tallies are composition of subjects and the definition of whether the statistic should be paired or unpaired. Our function generate_tallies has a unit-test called generate_tallies_for_paired_with_no_subject_pairs. The function should return a warning that no tally could be generated. If the function produces a tally, then the software is malfunctioning and the function that intakes this output would fail. This test protects other parts of the software from breaking.

Example TestQuick ScriptsSoftware Infrastructure
Statistical tests require at least 100 eventsDifficult to know accuracy of resultsBuilt-in
Tallies for paired statistics, but have subjects in both timepointsDifficult to detect downstream process that will breakBuilt-in
Table 2: Showing the difference between quick scripts and our in house infrastructure.

Automated unit-testing provides assurance that changes made do not break other parts of the codebase. If a piece of code requires an update, but this update causes ten tests in an unrelated portion of the codebase to fail, we are alerted at the time of implementation. This prevents breaking other parts of the codebase. This ensures that the web-app is always functioning when needed.

All code at Teiko has changes tracked. Using modern software engineering practices, all pieces of code have a history of changes that can be observed. This provides information about who made the changes and an easy way to roll back changes if needed. Compare this described experience to the experience in a point-and-click platform like Excel.

What about just doing the analysis in spreadsheets?

In Excel, an analyst makes a spreadsheet with complex operations that produce fast visualizations. However:

  • How are changes reviewed? Changes are often not reviewed, but occasionally it is required for the delivering party to show the underlying formulas. Describing and understanding the formulas are difficult in spreadsheets as one often needs to double-click the formulas to understand what data is being used.
  • How are changes tracked? In modern cloud architectures, all files are tracked every time a change is made. However, the shared Excel workbook used by the entire organization can change without review, or a copy of the book can be used elsewhere, leading to discrepancies. An example would be an excel workbook with a sheet of raw data. If a user accidentally adds a value to one cell, this change is hard to find and can completely change the result of the analysis.

Why not use Flowjo, GraphPad or off the shelf?

These are amazing power tools, but they are not integrated with a lab, and these solutions are not totally automatable. Using our offering, you’ll automatically get the most advanced cytometry analyses for clinical trials. 

Batch normalization: Imagine samples that arrive across multiple batches. You then need to figure out which batch normalization technique to apply. We’ve performed the research to select the most appropriate normalization algorithms. We also provide a Quality Control (QC) assessment of the normalization, and we post these reports through our integrated web-app. 

Clustering algorithms: The magnitude of the data can cause most standard clustering algorithms to choke, or run up a large computation bill. Teiko tracks the state of the art and implements fast, reliable, and cost effective algorithms. 

Trackable or reviewable experience: When running another analysis using tools like GraphPad or Flowjo, users must remember all the clicks and configurations applied. In contrast, with us you can just login and drill-down on answers to your clinical trial.

You can try to implement a do-it-yourself, but it comes with a real time and maintenance cost. Does this sound interesting to you? Contact us and let us know if we can support your clinical trial.

You Might Also Like

Understanding the immune setpoint – a new approach to cancer treatment

Everyone has a distinct “immune setpoint” that determines the way they respond to diseases and treatments. Matthew Spitzer, Associate Professor at UCSF and scientific cofounder of Teiko Bio, explains how understanding each patient’s individual immune setpoint could be the key to maximizing cancer treatment efficacy.

Page Anchor: #what-is-immune-setpoint – Not Visible On Front

What is an “immune setpoint” and how does it affect our responses to life-threatening diseases like cancer?

Each individual’s immune system is characterized by a unique homeostatic baseline, called the immune setpoint. The immune setpoint defines how our immune system behaves naturally, in the absence of a disturbance. This is based on our underlying genetics, historical exposures to different infectious diseases, and other types of events that may have impacted the state of our immune system. In the case of cancer, the immune setpoint determines how someone’s immune system will respond to the cancer cells and whether they can control the tumor.

“…the immune setpoint determines how someone’s immune system will respond to the cancer cells and whether they can control the tumor.”

Page Anchor: #characterizing-immune-setpoint – Not Visible On Front

How do we characterize an individual’s immune setpoint?

The most common approach is to measure the types of immune cells that are present in someone’s body and look at how those cells are behaving. Cytometry-based approaches have historically been the most common way to do that. We use mass cytometry to measure the expression of roughly 45 different proteins across hundreds of thousands of cells at the single-cell level. Some of those proteins help us identify different types of immune cells, while others give us insight into how those cells are behaving or functioning. It’s been a really powerful tool over the last decade to understand more about the human immune system and how it responds to different perturbations.

We also have technologies that similarly can measure gene expression at the single-cell level. Other types of analyses provide us with complementary information on the chromatin landscape of immune cells, the abundance of circulating proteins such as cytokines and chemokines, as well as T cell and B cell receptors expressed by someone’s adaptive immune cells. These are layers of information that we now have access to, which can give us additional insight into the state of someone’s immune system.

Page Anchor: #immune-setpoint-approach-today – Not Visible On Front

Are these technologies used by doctors today?

Not typically. There are other tools that doctors use to help decide whether a patient is likely to benefit from a drug. However, these approaches still leave a significant amount of uncertainty. For example, doctors currently look for specific mutations that correspond to the signaling pathways that therapies can target. The challenge is that cancers often find ways to escape by evolving and developing resistance. So, while a patient’s cancer might initially benefit from these drugs targeting a specific mutation, the effect may not be long-lasting.

In contrast to these targeted therapies, immunotherapies have the potential to activate immune cells that recognize cancer cells as foreign, and this can result in the formation of long-lived memory cells that protect against cancer recurrence for long periods of time. We have some clues as to which patients are more likely to benefit from immunotherapy, but often these responding patients are difficult to predict using the tools currently available to oncologists.

“…immunotherapies have the potential to activate immune cells that recognize cancer cells as foreign, and this can result in the formation of long-lived memory cells that protect against cancer recurrence for long periods of time.”

Page Anchor: #immune-setpoint-better-approach – Not Visible On Front

If that’s the case, what could be a better approach?

Most drugs are initially developed because the pathways that they target were thought to be important in cancer cells or in other types of cells within the local tumor microenvironment. As we learn more about what those drugs are doing in people and in mouse models, it has become apparent that many of them are acting outside of the tumor as well. So, we should really be thinking about interactions outside of the tumor that could be used to activate cells that have the potential to drive new immune responses. Understanding how immune responses outside the tumor associate with clinical response is a real opportunity to better identify those patients likely to benefit.

For example, one of the main types of immunotherapy to treat patients with solid tumors is called a checkpoint inhibitor. Many of these inhibitors block a protein called PD-1 or its ligand, PD-L1. Originally, it was thought that, how many T cells express PD-1, or how many cancer cells or myeloid cells express PD-L1, would be interpretable information that would predict efficacy. This turned out not to be quite as straightforward as we would have thought. For instance, scientists found that many patients who have high levels of PD-L1 don’t necessarily respond well to checkpoint inhibitors. Meanwhile, some patients who don’t have a high expression of PD-L1 respond well.

We’re still working to understand exactly why that is the case. But one of the things we’ve learned is that these checkpoint inhibitors are, at least in part, not only acting on cells that are in the tumor tissue, but also on cells outside of the tumor tissue, including cells in the lymph nodes. These peripheral cells can be activated outside of the tumor, circulate through the blood, infiltrate into the tumor, and kill cancer cells.

“…these checkpoint inhibitors are, at least in part, not only acting on cells that are in the tumor tissue, but also on cells outside of the tumor tissue, including cells in the lymph nodes.”

This suggests that looking only into the tumor tissue is limited because it doesn’t provide you with full insight into one’s immune responses. There is a real opportunity for immune monitoring or measuring the immune setpoint outside of the tumor to understand somebody’s baseline immune state.

Page Anchor: #immune-setpoint-future – Not Visible On Front

Do you think immune profiling will change the way we treat cancer patients in the future?

The immune system provides us a unique opportunity to fight cancer because of its ability to adapt, to keep up with changes that take place in cancer. It clearly has the potential to be very powerful if we can figure out how to effectively harness it across all patients.

I think there are opportunities that we might be missing when we don’t think about important pathways or interactions outside of the tumor. We need to be thinking critically about the variability that exists across patients in their immune states and to what extent they’re likely to be able to mount an immune response to different types of drugs. I think it is clear that there will be great utility for these immune profiling technologies in tailoring treatment strategies for particular patients and in different disease contexts.

The field has made tremendous progress in this area over the last 20 years, and many patients are now experiencing the long-term, durable benefit of immunotherapies. There is a very real opportunity for us as a scientific and medical community to continue making advancements in this area to improve patient outcomes with the latest technologies available.

You Might Also Like