Blood RNA-Sequencing data is processed using a variety of bioinformatics tools to map reads to human genes, cells, bacteria/viruses, immune repertoire, and many others as shown in the figure below.

COVID-19 Blood Transcriptome

This paper mapped multiple human gene signatures of hospitalized COVID-19 but determined there were actually two different groups of patients, those with immune activation and those suppressed.

Prokop JW, Bupp CP, Chesla D, Faber W, Love CP, Karam R, Abualkheir N, Feldmann B, Teng L, McBride T, Leimanis ML, English BK, Frisch A, Bauss J, Kalpage N, Derbedrossian A, Pinti RM, Hall N, Mills J, Eby A, VanSickle EA, Pageau SC, Shankar R, Chen B, Carcillo J, Olivero R, Hartog NL, Rajasekaran S. High-Density Blood Transcriptomics Reveals Precision Immune Signatures of SARS-CoV-2 Infection in Hospitalized Individuals. Frontiers in Immunology. 12:694243 (2021). PMID: 34335605.

Blood transcriptome gene signatures of SARS-CoV-2 patients. (A) Three dimensions of principal components (PC1-PC3) of first collection time point RNAseq gene annotations for samples of COVID-19 (red) or control (blue) patients. (B) Volcano plot of gene expression in COVID-19 or control patients with significant genes higher (red) or lower (blue) marked. The x-axis shows the log2 fold change, and the y-axis shows the -log10 of the adjusted p-value. Shadowed genes are involved in interferon signaling. (C) Box and whisker plot of the top three genes labeled in panel (B, D) Top gene ontology (GO) enrichment terms for up (red) or down (blue) genes. The term for each enriched description is shown first, followed by the name, and then in parentheses, the number of genes is significant relative to the number of genes within the genome annotated for the term. The x-axis shows the -log10 of the false discovery rate (FDR).

Schematic of two different groups of COVID patients relative to controls. Red represents the immune overactive COVID-19 patients (group A), cyan the immune suppressed patients (group B), and controls cluster in black (group C).

Infant RSV

Gene signatures of these patients showed shared response but similar to the COVID-19 study, each patient had unique signatures of infections.

Gupta R, Leimanis ML, Adams M, Bachmann AS, Uhl K, Bupp CP, Hartog NL, Kort EJ, Olivero R, Comstock SS, Sanfilippo DJ, Lunt SY, Prokop JW*, Rajasekaran S*. Balancing Precision vs. Cohort Transcriptomic Analysis of Acute and Recovery Phase of Viral Bronchiolitis. American Journal of Physiology-Lung Cellular and Molecular Physiology. 320(6):L1147-L1157 (2021). *Co-corresponding. PMID: 33851876

Figure. 2.
Transcriptome comparisons of viral bronchiolitis (VB) day 0 and 3 versus controls. A: sample segregation for all control (red), VB day 0 severe (blue), VB day 0 moderate (magenta), VB day 3 severe (yellow), and VB day 3 moderate (gray) transcriptomes for gene expression on three-dimensional reduction using t-SNE. B: volcano plot of genes with differential expression between nested control versus VB day 0 relative to VB day 0 versus VB day 3 for genes higher (blue) or lower (red). Cutoffs used were log2 fold change 2 and Adj-P < 0.01, with 319 identified genes. C: top four significant genes up or down shown as box and whisker plots with Adj-P value shown. D: STRING-based enriched pathways for genes higher (blue) or lower (red) in VB. Values are shown as the −log10 of P value and the term identification number is in parentheses. E: top segregating transcripts between controls and VB day 0 (blue) annotated at biotype level. The gray box shows the percent of mapped transcripts for each biotype and the blue box shows the percent of genes significantly different with log2 fold change 2 and Adj-P < 0.01. F: transcripts with multiple isoforms significant between controls and VB. The gray box shows the total number of mapped transcripts and the blue box shows how many were significantly changed. The percent of transcripts change for each gene is labeled in parentheses. G: each transcript is labeled with ENST, protein name in parenthesis, and biotype annotation with values shown as box and whisker plots. F: volcano plot of male control versus male day 0 relative to male day 0 versus female day 0. The top genes based on combined B metric are labeled. G and H: top two significant transcripts between controls and VB samples (G) or moderate versus severe VB (H) shown as a box and whisker plot. I: volcano plot for nested comparison of control versus VB relative to male versus female. J: top four genes for nested comparisons. Colors represent samples: cyan-F control, dark blue-F severe VB, light blue-F moderate VB, gray-M control, red-M severe VB, orange-M moderate VB. K: female versus male values for the highly significant CD177 using same colors as panel J. TPM, transcripts per million.
Figure. 4.
Gene outlier mapping for samples. A: expression levels of the top five severity genes for PERSEVERE scores for controls (gray), VB day 0 (red) and VB day 3 (cyan) with outliers labeled. B: top cytokines expressed in samples shown as a heat map (red is highest value of each cytokine and blue the lowest). Clustering of samples was performed using one minus Pearson’s correlation. To the right are the top significant cytokines segregating controls versus VB day 0 samples that are higher (red) or lower (black) in VB day 0C: additive cytokines from panel B for each sample that are down (x-axis) or up (y-axis) in VB day 0 samples. The highest expression of elevated factors are labeled. D: annotated of the normalized mapping value for cell-type-specific genes in patient 3 time point 0 relative to the max value of all other samples. The top three non patient 3 mapping values are labeled in red. E: outlier mapping for the top three in panel D, labeling the patient with highest mapping levels. F: genes significantly elevated (4 standard deviations above others) in each of the day 0 VB (red) or control (black) samples. Labeled above, each are GO enrichment terms identified from the sample elevated genes. VB, viral bronchiolitis.

Precision Medicine of Multiple Organ Dysfunction Syndrome

Patients showed broad insights with mapping data well correlated to clinical records. One of the patients we discovered the process termed viral induced genetics, where a virus dysregulates nonsense mediated decay pathways to activate dominant negative variants.

Shankar R, Leimanis M, Newbury PA, Liu K, Xing J, Nedveck D, Kort EJ, Prokop JW, Zhou G, Bachmann AS, Chen B, Rajasekaran S. Gene expression signatures identify pediatric patients with multiple organ dysfunction who require advanced life support in the intensive care unit. EBiomEdicine. 62:103122 (2020). PMID: 33248372.

Prokop JW, Shankar R, Gupta R, Leimanis ML, Nedveck D, Uhl K, Chen B, Hartog NL, Van Veen J, Sisco JS, Sirpilla O, Lydic T, Boville B, Hernandez A, Braunreiter C, Kuk CC, Singh V, Mills J, Wegener M, Adams M, Rhodes M, Bachmann AS, Pan W, Byrne-Steele ML, Smith DC, Depinet M, Brown BE, Eisenhower M, Han J, Haw M, Madura C, Sanfilippo DJ, Seaver LH, Bupp C, Rajasekaran S. Virus-induced genetics revealed by multidimensional precision medicine transcriptional workflow applicable to COVID-19. Physiological Genomics (2020) 52(6):255-268.  PMID: 32437232.

Fig. 2.
61 transcriptomes analyzed from 27 patients. Principal component analysis of transcriptomes annotated at the gene level for all 3 days (day 0 green, day 3 blue, day 8 magenta) (A) or on day 0 for the various groups (Sedation control blue, MODS green, MODS with ECMO red) (B). C: using the five PERSERVERE genes, we quantify the total transcripts per million (TPM, x-axis) and the percent of the five genes expressed (y-axis). Two outliers are labeled (sample:group:day:sex). Expression of various neutrophil genes (D) or cytokines (E) shown on the left as the sum of their TPM (x-axis) and the % of genes in the group expressed (y-axis). On the right is a heat map of each sample and the various genes. Clustering is based on one minus Pearson’s correlation. ECMO, extracorporeal membrane oxygenation; F, female; M, male; MODS, multiple organ dysfunction syndrome.
Fig. 7.
Patient unique transcripts for day 0–8 and patient to all genes. Left: genes up (gray) or down (red) in patients at day 0 RNA-Seq relative to day 8Right: genes up (gray) or down (red) in all 3 days RNA-Seq of patient relative to all other patients. Shown below is the plot of both day 0/8 (x-axis) and patient to all (y-axis). Labeled for each patient are the primary diagnoses and the GO enriched terms that overlap to phenotype (red and blue). The most surprising observation was the HHV-7 detected infection in patient 20, who had sarcoma, that aligns to genes decreased at day 0 that correlate to herpes-associated sarcoma. ECMO, extracorporeal membrane oxygenation; EBV, Epstein-Barr virus; FDR, false discovery rate; HLH, hemophagocytic lymphohistiocytosis; PPI, protein-protein interaction; SRP, signal-recognition particle. On the bar graph, A, B, and C are shown below for each patient.
Fig. 8.
Patient with active Epstein-Barr virus (EBV) and RNASEH2B splice variant. A: patient timeline: * marks multi-omic data collection points, red marks interventions, black marks major events, and blue marks phenotypes observed. B: clustering of control (red) and patient samples (green), including a 22 mo follow-up. C: second viral read caller to confirm EBV infection with EBER1 (gray) and EBER2 (red) reads from EBV detected at day 0 shown below. D: known structure of RNASEH2A (gray), RNASEH2C (magenta), and RNASEH2B (cyan). E: zoom-in view of V20 not falling at a critical site suggesting L would be functional. F: percent of reads from the RNASEH2B alleles at multiple RNA-Seq measurements suggesting allelic bias to the allele inherited from the father. G: reads aligned near the potential splice site variant inherited from mom that confirms a frameshift and early truncation. H: total RNASEH2B levels over the multiple days suggesting reduced levels relative to controls (red). I: pathways activated in the patient’s 2 yr follow-up, suggesting immune system dysfunction. J: immune system repertoire analysis performed on the patient follow-up for seven chains (TRB, T cell receptor beta; TRA, T cell receptor alpha; IGH, immunoglobulin heavy chain; TRD, T cell receptor delta; TRG, T cell receptor gamma; IGK, immunoglobulin light chain kappa; IGL, immunoglobulin light chain lambda). Shown is the percent of CDR3 identified in each group with a callout for further dissection of the IGH subgroups. K: diversity index (DI) for each of the sequenced chains. DI corresponds to elevated recombination sites, with lower levels suggesting an enrichment of certain sequences within the chain. The IGL is shown for diversity, suggesting the elevation of epitope recognition. This epitope is seen in 27 of the top 35 CDR3 sequences for IGL with a significant enrichment (8.9e-82) and composing 21.7% of all the CDR3 IGL sequences. CRP, C-reactive protein; ECMO, extracorporeal membrane oxygenation; ER, emergency room; HDVCH, Helen DeVos Children’s Hospital; HLH, hemophagocytic lymphohistiocytosis; PICU, pediatric intensive care unit; WES, whole exome sequencing.

CCR5, HIV resistance and Critical Care

We used multiple datasets to study the highly popular CCR5 in our critical care samples, elucidating that many of the worse patients carried the delta 32 allele.

Bauss J, Morris M, Shankar R, Olivero R, Buck LN, Stenger CL, Hinds D, Mills J, Eby A, Zagorski JW, Smith C, Cline S, Hartog N,  Chen B,  Huss J,  Carcillo JA,  Rajasekaran S,  Bupp C, Prokop JW. CCR5 and biological complexity: the need for data integration and educational materials to normalize genetic/biological reductionism at the interface of Ethical, Legal, and Social Implications. Frontiers in Immunology. 12:790041 (2021). PMID: 34925370

CCR5 expression in blood PAXgene tube RNAseq. (A) Box and whisker plots for the expression (TPM) of CCR5 from various NCBI BioProjects that were generated by Illumina paired-end RNA-Seq from human blood collected PAXgene tube samples. Listed next to each BioProject code is the number of samples, average CCR5 expression, and standard deviation of CCR5 expression. (B) The TPM expression for each sample (x-axis) of panel (A) relative to the BioProject normalized Z-score (y-axis). The top seven samples based on Z-score are labeled. (C) The Z-scores for our three pilot precision transcriptome datasets, with outlier samples labeled. (D) The COVID-19 study analysis of CCR5 normalized expression relative to CIBERSORTx absolute CD8 T-cell values. (E) The percent of transcripts containing the delta 32 variant relative to wild type (x-axis) for samples of panel (C) relative to the BioProject normalized Z-Score. All homozygous samples for delta 32 are labeled as is the heterozygous sample with the highest study Z-score.