Dimensionality Visualization

knitr::opts_chunk$set(eval = FALSE)

Introduction

Spectral flow cytometry data consist of many acquired cellular events, measured across an increasing number of markers. Many unsupervised analysis approaches rely on clustering the markers on the basis of median fluorescence intensity. A common workflow step is to then project these cells onto a lower-dimensional space for visualization, using one of the various available algorithms. While this can provide useful information, we agree with the points the Pachter lab raises in its paper "The Specious Art of Single-Cell Genomics", which highlights how easy it is to overinterpret what may be sheer artefact or noise.

Dimensionality visualization can be applied both to unmixed files and, in newer applications, to unstained samples still in their raw form. To help characterize what actually ends up in these islands (whether brightness, a unique signature, or random noise), we have implemented wrappers for the most frequently used algorithms to facilitate their exploration in the context of our paper.

While these are useful, we would like to reiterate that these wrappers are for exploratory and convenience purposes only; please refer to the original packages for any specialized arguments. We may modify them, so don’t build your own R package around our implementation. Instead, take advantage of the copyleft license and the nature of free and open-source software: fork it, modify it, and include it in your own package.

Getting Started

Load Libraries

library(Luciernaga)
library(flowCore)
library(flowWorkspace)
library(openCyto)
library(ggcyto)  
library(data.table)
library(dplyr)
library(purrr) 
library(stringr)
library(ggplot2)
library(gt)
library(plotly)
library(htmltools)

Create a GatingSet

File_Location <- system.file("extdata", package = "Luciernaga")
FCS_Pattern <- ".fcs$"
FCS_Files <- list.files(path = File_Location, pattern = FCS_Pattern,
                        full.names = TRUE, recursive = FALSE)
head(FCS_Files[10:30], 5)

UnstainedFCSFiles <- FCS_Files[grep("Unstained", FCS_Files)]
UnstainedCells <- UnstainedFCSFiles[-grep(
  "Beads", UnstainedFCSFiles)]

MyCytoSet <- load_cytoset_from_fcs(UnstainedCells,
                                   truncate_max_range = FALSE,
                                   transform = FALSE)
MyCytoSet

MyGatingSet <- GatingSet(MyCytoSet)
MyGatingSet

FileLocation <- system.file("extdata", package = "Luciernaga")
# file.path() only joins its arguments; it has no path/pattern parameters
MyGates <- fread(file.path(FileLocation, "Gates.csv"))
gt(MyGates)

MyGatingTemplate <- gatingTemplate(MyGates)
gt_gating(MyGatingTemplate, MyGatingSet)
MyGatingSet[[1]]

Create a GatingSet (Unmixed)

Unmixed_FullStained <- FCS_Files[grep("Unmixed", FCS_Files)]

UnmixedCytoSet <- load_cytoset_from_fcs(Unmixed_FullStained,
                                        truncate_max_range = FALSE,
                                        transform = FALSE)
UnmixedGatingSet <- GatingSet(UnmixedCytoSet)
UnmixedGatingSet

Markers <- colnames(UnmixedCytoSet)
KeptMarkers <- Markers[-grep(
  "Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]

MyBiexponentialTransform <- flowjo_biexp_trans(channelRange = 256,
                                               maxValue = 1000000,
                                               pos = 4.5, neg = 0,
                                               widthBasis = -1000)
TransformList <- transformerList(KeptMarkers, MyBiexponentialTransform)
UnmixedGatingSet <- transform(UnmixedGatingSet, TransformList)

FileLocation <- system.file("extdata", package = "Luciernaga")
# file.path() only joins its arguments; it has no path/pattern parameters
UnmixedGates <- fread(file.path(FileLocation, "GatesUnmixed.csv"))
UnmixedGating <- gatingTemplate(UnmixedGates)
gt_gating(UnmixedGating, UnmixedGatingSet)

Helper Functions

Utility_ColAppend

Originally an internal function supporting our dimensionality visualization functions, Utility_ColAppend() appends the columns of a data.frame to an .fcs file as new markers/parameters. The data.frame must contain the same number of rows as the file has events. To do this, you feed the function a cytoset object, a data.frame of the flowFrame exprs data, and a data.frame of the new data columns. Utility_ColAppend() then adds the appropriate parameters and returns a flowFrame object with these new parameters added.

Since it has been highly useful to us, we mention it here in case you need to append additional information or alternate dimensionality visualizations to your own .fcs files. We have tried to keep the returned .fcs file compliant for ease of use in other software applications, but bugs may arise when the function is used outside its original internal purpose. If you encounter one, please report it via GitHub.
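The core requirement described above can be illustrated without any cytometry packages. The base R sketch below is not the Utility_ColAppend() implementation; `append_columns()` and the toy data are hypothetical names used only for illustration. It shows the row-count check that any column append needs to pass before new parameters are bound to an expression matrix.

```r
# Hypothetical helper, for illustration only (not part of Luciernaga).
append_columns <- function(expr_matrix, new_columns) {
  new_columns <- as.matrix(new_columns)
  # Each new parameter needs exactly one value per event (row).
  if (nrow(new_columns) != nrow(expr_matrix)) {
    stop("new_columns must have one row per event in expr_matrix")
  }
  # Refuse to silently duplicate an existing parameter name.
  if (any(colnames(new_columns) %in% colnames(expr_matrix))) {
    stop("new column names must not clash with existing parameters")
  }
  cbind(expr_matrix, new_columns)
}

set.seed(1)
# Toy expression matrix: 5 events, 2 detectors (names are made up).
expr_matrix <- matrix(rnorm(10), nrow = 5,
                      dimnames = list(NULL, c("B3-A", "V7-A")))
new_columns <- data.frame(ExposureStatus = sample(1:3, 5, replace = TRUE))

appended <- append_columns(expr_matrix, new_columns)
colnames(appended)
dim(appended)
```

A real .fcs file also carries parameter metadata (names, ranges, keywords) that has to be updated alongside the matrix, which this sketch deliberately leaves out.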

ff <- gs_pop_get_data(UnmixedGatingSet[1], subsets="live",
                      inverse.transform = FALSE)
BeforeParameters <- ff[[1, returnType = "flowFrame"]]
BeforeParameters

# Main Expression Data From Original
MainDataFrame <- as.data.frame(exprs(ff[[1]]), check.names = FALSE)

# Creating Artificial Data To Mimic Metadata to Append
NewData <- MainDataFrame %>% mutate(
  ExposureStatus = sample(1:3, n(), replace = TRUE))

NewData <- NewData %>% select(ExposureStatus)

AfterParameters <- Utility_ColAppend(ff=ff, DF=MainDataFrame,
                                     columnframe = NewData)
AfterParameters

For anyone wanting to continue on and create an .fcs file from the flowframe, example code is provided below:

outpath <- file.path("C:", "Users", "JohnDoe", "Desktop")
Name <- flowWorkspace::keyword(AfterParameters, "GROUPNAME")
TheFileName <- paste0(Name, "_Appended.fcs")
fileSpot <- file.path(outpath, TheFileName)
fileSpot
# flowCore::write.FCS(AfterParameters, filename = fileSpot, delimiter="#")

Utility_Downsample

For many dimensionality visualization protocols, it’s recommended that you downsample so that each specimen has roughly similar representation in the final plot. This can be helpful when some specimens have 10,000 cells and others have a million, but it runs into problems when some specimens have millions of cells and others have just 200. Whether to downsample is therefore a decision that you need to make and justify. We have implemented Utility_Downsample() to facilitate the process.
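The trade-off can be seen in a few lines of base R. This is a minimal sketch under stated assumptions, not how Utility_Downsample() works internally: `downsample_events()` and the toy specimens are hypothetical, and the choice to return small specimens whole is one possible policy among several.

```r
# Hypothetical helper, for illustration only (not Utility_Downsample()).
downsample_events <- function(events, subsample) {
  n_events <- nrow(events)
  # A specimen with fewer events than requested is returned whole,
  # so it stays under-represented relative to the others.
  if (n_events <= subsample) {
    return(events)
  }
  events[sample.int(n_events, subsample), , drop = FALSE]
}

set.seed(42)
# Toy specimens with very different event counts.
specimens <- list(
  Large  = data.frame(signal = rnorm(100000)),
  Medium = data.frame(signal = rnorm(10000)),
  Small  = data.frame(signal = rnorm(200))
)

downsampled <- lapply(specimens, downsample_events, subsample = 2500)
sapply(downsampled, nrow)
# Large and Medium are reduced to 2500 events; Small keeps its 200.
```

Lowering the target to the smallest specimen would equalize representation, at the cost of discarding most events from every other specimen.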

# plot(UnmixedGatingSet)
CountData <- gs_pop_get_count_fast(UnmixedGatingSet)
CountData %>% filter(Population %in% "/singletsFSC/singletsSSC/singletsSSCB/nonDebris/lymphocytes/live") %>%
  select(name, Count)

#Single Sample Returned as a data.frame 
removestrings <- c("DTR_2023_ILT_15_Tetramers-",
                   "-Ctrl_Tetramer_Unmixed", ".fcs")
SingleSample <- Utility_Downsample(UnmixedGatingSet[1],
                                   sample.name = "GROUPNAME",
                                   removestrings=removestrings,
                                   subsets = "live", subsample = 2500,
                                   internal = FALSE, export = FALSE,
                                   inverse.transform=TRUE)
SingleSample

# Multiple Samples

MultipleSamples <- map(.x=UnmixedGatingSet, .f=Utility_Downsample,
                       sample.name =  "GROUPNAME", removestrings=removestrings,
                       subsets = "live", subsample = 2500, internal = FALSE,
                       export = FALSE, inverse.transform=TRUE)
MultipleSamples

Utility_Concatinate

A common feature of dimensionality visualization approaches for cytometry data is concatenating different samples into a single file. This sometimes includes first downsampling to an equivalent number of cells in a particular cell subset, so that no single cell subset overly influences the result. Similarly, concatenation can be done before or after cells are normalized using one of the various algorithms.

To facilitate this process, we have implemented the Utility_Concatinate() function. Unlike other approaches, we implemented it at the GatingSet level, so that only the cells at the gating node of interest are merged into active memory. The final file can then be saved as its own .fcs file. We have attempted to retain the FCS format so that it can be used across software without issues. If you encounter issues, please reach out!
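Conceptually, concatenation stacks the events of each specimen and records where each event came from. The base R sketch below illustrates that idea only; `concatenate_specimens()`, the `SpecimenID` column, and the toy data are hypothetical and are not taken from Utility_Concatinate().

```r
# Hypothetical helper, for illustration only (not Utility_Concatinate()).
concatenate_specimens <- function(specimen_list) {
  tagged <- Map(function(events, id) {
    # Store the specimen of origin as a number so it could sit alongside
    # the fluorescence parameters in a numeric expression matrix.
    events$SpecimenID <- id
    events
  }, specimen_list, seq_along(specimen_list))
  combined <- do.call(rbind, unname(tagged))
  rownames(combined) <- NULL
  combined
}

set.seed(7)
specimen_list <- list(
  SampleA = data.frame(`PE-A` = rnorm(3), check.names = FALSE),
  SampleB = data.frame(`PE-A` = rnorm(2), check.names = FALSE)
)

combined <- concatenate_specimens(specimen_list)
combined

# Lookup table mapping the numeric identifier back to specimen names.
data.frame(SpecimenID = seq_along(specimen_list),
           Specimen = names(specimen_list))
```

Keeping a numeric identifier plus a separate lookup table is one way to preserve the origin of each event after the specimens are merged.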

removestrings <- c("DTR_", ".fcs")
StorageLocation <- file.path("C:", "Users", "JohnDoe", "Desktop")

#Return Types: "data.frame", "flow.frame", "fcs"
ConcatinatedReturn <- Utility_Concatinate(gs=UnmixedGatingSet,
                                          sample.name = "GROUPNAME",
                                          removestrings=removestrings,
                                          subsets="live", subsample = 2000,
                                          ReturnType = "flow.frame",
                                          newName = "MyConcatinatedFile",
                                          outpath = StorageLocation,
                                          export = FALSE, inverse.transform=TRUE)

ConcatinatedReturn

Utility_tSNE

Now that we have a GatingSet containing our raw .fcs files, let’s figure out what markers/parameters are present and save the detectors as “KeptMarkers”. From there, let’s identify the unstained file within the GatingSet. We then run a tSNE on its data at the gating node passed to the subset argument ("nonDebris" in the code below), using the Utility_tSNE() function. This function can return an .fcs file (which is what we typically use). For the purpose of this vignette, it returns a flowCore flowFrame. For use with our visualization functions that work on GatingSet objects, we convert it to a cytoframe, add it to a new cytoset, and then convert that into a GatingSet. Finally, we visualize using Utility_ThirdColorPlots().
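The marker-selection step relies on a regular expression that drops time, scatter, width/height, and autofluorescence parameters. The sketch below uses made-up parameter names (an assumption; your instrument's names will differ) to show what the pattern used in this vignette removes, and why logical indexing with grepl() is a safer habit than negative indexing with grep() when a pattern might match nothing.

```r
# Hypothetical parameter names, for illustration only.
Markers <- c("Time", "FSC-A", "FSC-H", "SSC-A", "SSC-W",
             "UV1-A", "B3-A", "R1-A", "AF-A")
DropPattern <- "Time|FS|SC|SS|Original|-W$|-H$|AF"

# Logical indexing keeps every name that does not match the pattern.
KeptMarkers <- Markers[!grepl(DropPattern, Markers)]
KeptMarkers

# Pitfall with negative indexing: when nothing matches, grep() returns
# integer(0), and x[-integer(0)] selects nothing rather than everything.
OnlyDetectors <- c("UV1-A", "B3-A", "R1-A")
OnlyDetectors[-grep(DropPattern, OnlyDetectors)]   # empty vector
OnlyDetectors[!grepl(DropPattern, OnlyDetectors)]  # all three names kept
```

Because the pattern matches substrings, it is worth printing KeptMarkers (as the vignette does) to confirm that no fluorophore name containing "AF" or "SC" was dropped unintentionally.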

Markers <- colnames(MyCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
KeptMarkers

pData(MyGatingSet[[3]]) %>% pull(name)
nrow(MyGatingSet[[3]])
plot(MyGatingSet)

tSNE_Output <- Utility_tSNE(x=MyGatingSet[[3]], sample.name = "GUID",
                            removestrings=c("_Cells", ".fcs"),
                            subset = "nonDebris", columns=KeptMarkers,
                            export=FALSE)

cf <- flowFrame_to_cytoframe(tSNE_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

TSNEPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                     xaxis="tSNE_1", yaxis = "tSNE_2",
                                     zaxis ="B3-A", splitpoint = "continuous",
                                     sample.name = "TUBENAME",
                                     removestrings = c("Dimensionality", ".fcs"),
                                     thecolor = "orange", tilesize = 0.6)

TSNEPlot

We can now repeat the previous step, but using our Unmixed file GatingSet. The process is similar, identifying markers present in the .fcs file (fluorophores in this case), selecting our markers of interest, and from there passing to the Utility_tSNE() function.

Markers <- colnames(UnmixedCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
SubsetMarkers <- c("BUV496-A", "BUV805-A", "Pacific Blue-A", "BV711-A",
                       "BV786-A", "Spark Blue 550-A", "PE-A", "APC-Fire 750-A")

pData(UnmixedGatingSet[[3]]) %>% pull(name)
nrow(UnmixedGatingSet[[3]])
plot(UnmixedGatingSet)

removestrings <- c(".fcs")

tSNE_Output <- Utility_tSNE(x=UnmixedGatingSet[[3]], sample.name = "TUBENAME",
                            removestrings=removestrings, subset = "live",
                            columns=SubsetMarkers, export=FALSE)

#BUGGED: flowCore_$P42Rmax not contained in Text section!

cf <- flowFrame_to_cytoframe(tSNE_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

Sample_TSNEPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                            xaxis="tSNE_1", yaxis = "tSNE_2",
                                            zaxis ="Spark Blue 550-A",
                                            splitpoint = "continuous",
                                            sample.name = "GROUPNAME",
                                            removestrings = removestrings, 
                                            thecolor = "orange", tilesize = 0.6)

Sample_TSNEPlot

Utility_UMAP

As with the tSNE example, we first figure out what markers/parameters are present in the raw .fcs files and save the detectors as “KeptMarkers”, then identify the unstained file within the GatingSet. We then run a UMAP on its data at the gating node passed to the subset argument ("nonDebris" in the code below), using the Utility_UMAP() function. This function can return an .fcs file (which is what we typically use). For the purpose of this vignette, it returns a flowCore flowFrame. For use with our visualization functions that work on GatingSet objects, we convert it to a cytoframe, add it to a new cytoset, and then convert that into a GatingSet. Finally, we visualize using Utility_ThirdColorPlots().

Markers <- colnames(MyCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
KeptMarkers

pData(MyGatingSet[[3]]) %>% pull(name)
nrow(MyGatingSet[[3]])
plot(MyGatingSet)

UMAP_Output <- Utility_UMAP(x=MyGatingSet[[3]], sample.name="GUID",
                            removestrings=c("_Cells", ".fcs"), 
                            subset="nonDebris", columns=KeptMarkers,
                            export=FALSE)

cf <- flowFrame_to_cytoframe(UMAP_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

UMAPPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                     xaxis="UMAP_1", yaxis = "UMAP_2", 
                                     zaxis ="B3-A", splitpoint = "continuous",
                                     sample.name = "TUBENAME",
                                     removestrings = c("Dimensionality", ".fcs"),
                                     thecolor = "orange", tilesize = 0.3)

UMAPPlot

Markers <- colnames(UnmixedCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
SubsetMarkers <- c("BUV496-A", "BUV805-A", "Pacific Blue-A", "BV711-A",
                       "BV786-A", "Spark Blue 550-A", "PE-A", "APC-Fire 750-A")

pData(UnmixedGatingSet[[3]]) %>% pull(name)
nrow(UnmixedGatingSet[[3]])
plot(UnmixedGatingSet)

removestrings <- c(".fcs")

UMAP_Output <- Utility_UMAP(x=UnmixedGatingSet[[3]], sample.name="GUID",
                            removestrings=removestrings, subset="live",
                            columns=SubsetMarkers, export=FALSE)

cf <- flowFrame_to_cytoframe(UMAP_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

Sample_UMAPPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                            xaxis="UMAP_1", yaxis = "UMAP_2",
                                            zaxis ="Spark Blue 550-A",
                                            splitpoint = "continuous",
                                            sample.name = "GROUPNAME",
                                            removestrings = c("Dimensionality", ".fcs"),
                                            thecolor = "orange", tilesize = 0.3)

Sample_UMAPPlot

Utility_PaCMAP

Unlike the other dimensionality visualization algorithms, which are implemented in R, both PaCMAP and PHATE are primarily implemented in Python. Using the basilisk package, we implemented a method that generates these plots within an isolated Luciernaga environment. To use it, you need to install either the devel or the WithPython branch when downloading from GitHub.

Repeating similar steps as with the tSNE and UMAP examples above for both raw and unmixed data:

Markers <- colnames(MyCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
KeptMarkers

pData(MyGatingSet[[3]]) %>% pull(name)
nrow(MyGatingSet[[3]])
plot(MyGatingSet)

PaCMAP_Output <- Utility_PaCMAP(x=MyGatingSet[[3]], sample.name="GUID",
                            removestrings=c("_Cells", ".fcs"), 
                            subset="nonDebris", columns=KeptMarkers,
                            export=FALSE)

cf <- flowFrame_to_cytoframe(PaCMAP_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

PaCMAPPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                     xaxis="PaCMAP_1", yaxis = "PaCMAP_2", 
                                     zaxis ="B3-A", splitpoint = "continuous",
                                     sample.name = "TUBENAME",
                                     removestrings = c("Dimensionality", ".fcs"),
                                     thecolor = "orange", tilesize = 0.3)

PaCMAPPlot

Markers <- colnames(UnmixedCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
SubsetMarkers <- c("BUV496-A", "BUV805-A", "Pacific Blue-A", "BV711-A",
                       "BV786-A", "Spark Blue 550-A", "PE-A", "APC-Fire 750-A")

pData(UnmixedGatingSet[[3]]) %>% pull(name)
nrow(UnmixedGatingSet[[3]])
plot(UnmixedGatingSet)

removestrings <- c(".fcs")

PaCMAP_Output <- Utility_PaCMAP(x=UnmixedGatingSet[[3]], sample.name="GUID", 
                              removestrings=removestrings, subset="live",
                              columns=SubsetMarkers, export=FALSE)

cf <- flowFrame_to_cytoframe(PaCMAP_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

Sample_PaCMAPPlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                            xaxis="PaCMAP_1", yaxis = "PaCMAP_2",
                                            zaxis ="Spark Blue 550-A",
                                            splitpoint = "continuous",
                                            sample.name = "GROUPNAME",
                                            removestrings = c("Dimensionality", ".fcs"),
                                            thecolor = "orange", tilesize = 0.3)

Sample_PaCMAPPlot

Utility_Phate

And similarly for PHATE, with downsampling for brevity:

Markers <- colnames(MyCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
KeptMarkers

pData(MyGatingSet[[3]]) %>% pull(name)
nrow(MyGatingSet[[3]])
plot(MyGatingSet)

Phate_Output <- Utility_Phate(x=MyGatingSet[[3]], sample.name="GUID",
                            removestrings=c("_Cells", ".fcs"), 
                            subset="nonDebris", subsample=15000,
                            columns=KeptMarkers,
                            export=FALSE)

cf <- flowFrame_to_cytoframe(Phate_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

PhatePlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                     xaxis="Phate_1", yaxis = "Phate_2", 
                                     zaxis ="B3-A", splitpoint = "continuous",
                                     sample.name = "TUBENAME",
                                     removestrings = c("Dimensionality", ".fcs"),
                                     thecolor = "orange", tilesize = 0.0001)

PhatePlot

Markers <- colnames(UnmixedCytoSet)
KeptMarkers <- Markers[-grep("Time|FS|SC|SS|Original|-W$|-H$|AF", Markers)]
SubsetMarkers <- c("BUV496-A", "BUV805-A", "Pacific Blue-A", "BV711-A",
                       "BV786-A", "Spark Blue 550-A", "PE-A", "APC-Fire 750-A")

pData(UnmixedGatingSet[[3]]) %>% pull(name)
nrow(UnmixedGatingSet[[3]])
plot(UnmixedGatingSet)

removestrings <- c(".fcs")

Phate_Output <- Utility_Phate(x=UnmixedGatingSet[[3]], sample.name="GUID", 
                              removestrings=removestrings, subset="live",
                              subsample=5000, 
                              columns=SubsetMarkers, export=FALSE)

cf <- flowFrame_to_cytoframe(Phate_Output)
TheNewCS <- cytoset()
cs_add_cytoframe(cs=TheNewCS, sn="Test", cf=cf)
NewGatingSet <- GatingSet(TheNewCS)

Sample_PhatePlot <-  Utility_ThirdColorPlots(x=NewGatingSet[[1]], subset = "root",
                                            xaxis="Phate_1", yaxis = "Phate_2",
                                            zaxis ="Spark Blue 550-A",
                                            splitpoint = "continuous",
                                            sample.name = "GROUPNAME",
                                            removestrings = c("Dimensionality", ".fcs"),
                                            thecolor = "orange", tilesize = 0.0001)

Sample_PhatePlot

David Rach

1 June 2025