Unmixing with Luciernaga
David Rach
University of Maryland, Baltimore
drach@som.umaryland.edu
1 June 2025
Source: vignettes/Unmixing.rmd
Introduction
Unmixing is a black box for many spectral flow cytometry users: you adjust gates on your single-color controls, provide full-stained samples, unmix, and then evaluate the outputs with NxN plots. The golden rules of reference controls are: (1) single-color controls should be as bright as or brighter than the full-stained sample; (2) the unmixing single color should be the same fluorophore (even better, the same manufacturer and lot); (3) single-color controls should have autofluorescence subtracted using a matching or equivalent unstained sample; and (4) enough events should be acquired. These are useful guideposts that demonstrably work, but few come with a mechanistic explanation of why.
Building on examples from Jakob Theorell’s flowSpecs and Christopher Hall’s flowUnmix package, we implemented a way to take Luciernaga_QC() outputs of purified fluorophore signatures and unmix them using ordinary least squares (OLS), working from GatingSet objects and returning FCS 3.0 standard files. In combination with functional programming principles, we have been leveraging this to understand how variations in fluorophore signature and brightness impact the unmixing of full-stained samples. We hope our additive contribution enables users to push the limits of SFC and uncover new insights, write ways to handle issues arising from the relative heterogeneity of individual immune cells unmixed with combination outputs, and spare future graduate students from having to write their own R package to answer space-wormhole questions.
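For readers who have not seen OLS unmixing written out, here is a minimal base R sketch. It assumes each raw event is (approximately) a sum of fluorophore signatures weighted by abundance. The signatures, detector names, and the helper `ols_unmix()` are all hypothetical and are not part of Luciernaga; this only illustrates the underlying arithmetic.

```r
# Toy reference matrix: 3 hypothetical fluorophores on 5 hypothetical detectors.
signatures <- rbind(
  FluorA = c(1.00, 0.40, 0.10, 0.02, 0.00),
  FluorB = c(0.05, 0.30, 1.00, 0.35, 0.05),
  FluorC = c(0.00, 0.02, 0.10, 0.50, 1.00)
)
colnames(signatures) <- paste0("D", 1:5)

# Simulate 4 raw events: abundance %*% signatures, plus a little detector noise.
set.seed(1)
abundance <- matrix(runif(4 * 3, min = 0, max = 1000), nrow = 4,
                    dimnames = list(NULL, rownames(signatures)))
raw <- abundance %*% signatures + matrix(rnorm(4 * 5, sd = 1), nrow = 4)

# OLS via the normal equations: A_hat = R S^T (S S^T)^-1
ols_unmix <- function(raw, signatures) {
  raw %*% t(signatures) %*% solve(signatures %*% t(signatures))
}

unmixed <- ols_unmix(raw, signatures)
round(unmixed - abundance, 1)  # residual error per event and fluorophore
```

Because the estimate depends only on the reference matrix and the raw data, any change to a single signature row changes the unmixed values for every fluorophore that overlaps with it, which is the effect the rest of this vignette explores.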
Getting Started
This section uses the purified fluorophore signatures generated by Luciernaga_QC() in the previous vignettes.
Let’s first load the required packages by calling them with library.
library(Luciernaga)
library(flowCore)
library(flowWorkspace)
library(openCyto)
library(ggcyto)
library(data.table)
library(dplyr)
library(purrr)
library(stringr)
library(ggplot2)
library(gt)
library(plotly)
library(htmltools)
Then we can find the .fcs files stored within the Luciernaga package’s extdata folder and sort them by their respective type.
File_Location <- system.file("extdata", package = "Luciernaga")
FCS_Pattern <- ".fcs$"
FCS_Files <- list.files(path = File_Location, pattern = FCS_Pattern,
full.names = TRUE, recursive = FALSE)
head(FCS_Files[10:30], 20)
# Unstained controls, split into beads and cells
UnstainedFCSFiles <- FCS_Files[grepl("Unstained", FCS_Files)]
UnstainedBeads <- UnstainedFCSFiles[grepl("Beads", UnstainedFCSFiles)]
UnstainedCells <- UnstainedFCSFiles[!grepl("Beads", UnstainedFCSFiles)]
# Single-color controls, excluding the unstained files
BeadFCSFiles <- FCS_Files[grepl("Beads", FCS_Files)]
BeadSingleColors <- BeadFCSFiles[!grepl("Unstained", BeadFCSFiles)]
CellSingleColorFiles <- FCS_Files[grepl("Cells", FCS_Files)]
# str_detect() takes the string first and the pattern second
CellSingleColors <- CellSingleColorFiles[!str_detect(CellSingleColorFiles, "Unstained")]
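A note on filtering file lists by pattern: negative indexing with grep() silently returns an empty vector when the pattern has no match, whereas a logical filter with grepl() keeps everything. The short sketch below uses made-up file names to show the difference.

```r
# Hypothetical file names; none of them contain "Beads".
files <- c("CD4_Cells.fcs", "CD8_Cells.fcs")

# grep() returns integer(0) here, and files[-integer(0)] selects nothing.
dropped <- files[-grep("Beads", files)]

# grepl() returns a logical vector, so negating it keeps both files.
kept <- files[!grepl("Beads", files)]

length(dropped)
kept
```

On your own data this matters whenever a pattern might legitimately be absent, for example an experiment that has no bead controls.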
Now let’s create a GatingSet for our single-color cell unmixing controls.
MyCytoSet <- load_cytoset_from_fcs(CellSingleColors,
truncate_max_range = FALSE,
transform = FALSE)
MyCytoSet
MyGatingSet <- GatingSet(MyCytoSet)
MyGatingSet
FileLocation <- system.file("extdata", package = "Luciernaga")
MyGates <- fread(file.path(FileLocation, "Gates.csv"))
gt(MyGates)
MyGatingTemplate <- gatingTemplate(MyGates)
gt_gating(MyGatingTemplate, MyGatingSet)
MyGatingSet[[1]]
Now let’s create a GatingSet for our unstained cell unmixing controls.
MyUnstainedCytoSet <- load_cytoset_from_fcs(UnstainedCells,
truncate_max_range = FALSE,
transform = FALSE)
MyUnstainedCytoSet
MyUnstainedGatingSet <- GatingSet(MyUnstainedCytoSet)
MyUnstainedGatingSet
FileLocation <- system.file("extdata", package = "Luciernaga")
MyGates <- fread(file.path(FileLocation, "Gates.csv"))
gt(MyGates)
MyGatingTemplate <- gatingTemplate(MyGates)
gt_gating(MyGatingTemplate, MyUnstainedGatingSet)
MyUnstainedGatingSet[[1]]
Generate Luciernaga_QC Outputs
Now that the GatingSets are re-established, let’s continue where the last vignette left off by processing all the .fcs files with Luciernaga_QC() to characterize the fluorescent signatures within.
Let’s first provision the AFOverlap csv to handle conflicts.
FileLocation <- system.file("extdata", package = "Luciernaga")
pattern = "AutofluorescentOverlaps.csv"
AFOverlap <- list.files(path=FileLocation, pattern=pattern,
full.names = TRUE)
AFOverlap_CSV <- read.csv(AFOverlap, check.names = FALSE)
AFOverlap_CSV
And next generate a CellAF unstained signature that can be used when these fluorophore-autofluorescence overlap files are encountered:
# pData(MyUnstainedGatingSet[1])
removestrings <- c(".fcs")
TheCellAF <- map(.x=MyUnstainedGatingSet[1], .f=Luciernaga_QC, subsets="lymphocytes",
removestrings=removestrings, sample.name="GUID",
unmixingcontroltype = "cells", Unstained = TRUE,
ratiopopcutoff = 0.001, Verbose = FALSE,
AFOverlap = AFOverlap, stats = "median",
ExportType = "data", SignatureReturnNow = TRUE,
outpath = tempdir(), Increments=0.1, # TemporaryFolder was never defined; tempdir() is a stand-in
SecondaryPeaks=2, experiment = "FirstExperiment",
condition = "ILTPanel", SCData="subtracted",
NegativeType="default")
TheCellAF <- TheCellAF[[1]] #Removes list caused by map
gt(TheCellAF)
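To make the idea of an unstained signature concrete, the base R sketch below computes a normalized median autofluorescence signature from simulated unstained events and subtracts it from a simulated single-color control. All detector names and intensities are invented, and this is only an illustration of the general principle; it is not necessarily how Luciernaga_QC() performs the subtraction internally.

```r
set.seed(2)
detectors <- paste0("D", 1:4)
n <- 200

# Hypothetical unstained events: autofluorescence peaks on D2.
unstained <- matrix(rnorm(n * 4, mean = rep(c(50, 400, 120, 30), each = n), sd = 10),
                    ncol = 4, dimnames = list(NULL, detectors))

# Hypothetical single-color events: the same autofluorescence plus a dye peaking on D3.
dye <- outer(rep(2000, n), c(0.05, 0.30, 1.00, 0.20))
stained <- unstained + dye + matrix(rnorm(n * 4, sd = 10), ncol = 4)

# Median per detector, normalized to the brightest detector.
af_median <- apply(unstained, 2, median)
af_signature <- af_median / max(af_median)

# Subtract the unstained medians before normalizing the single-color signature.
subtracted <- apply(stained, 2, median) - af_median
dye_signature <- subtracted / max(subtracted)

round(rbind(af_signature, dye_signature), 2)
```

Without the subtraction step, the D2 autofluorescence peak would inflate the dye signature on that detector.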
Now let’s use Luciernaga_QC() with ExportType = "fcs" to export the data as individual .fcs files, and set Brightness = TRUE to save .csv files that can be used for Luciernaga_Tree(). For this vignette, we will be saving the .fcs files to a temporary folder. On your own workstation, save the outputs to a folder where you can retrieve them later by providing a file.path to the outpath argument.
Let’s start by processing the single-color unmixing controls.
#pData(MyGatingSet)
StorageLocation <- file.path(tempdir(), "LuciernagaOutputs")
if (!dir.exists(StorageLocation)) {
dir.create(StorageLocation)
}
SingleColor_Data <- map(.x=MyGatingSet, .f=Luciernaga_QC, subsets="nonDebris",
removestrings=removestrings, sample.name="GUID",
unmixingcontroltype = "cells", Unstained = FALSE,
ratiopopcutoff = 0.001, Verbose = FALSE,
AFOverlap = AFOverlap, stats = "median",
ExportType = "fcs", Brightness=TRUE, SignatureReturnNow = FALSE,
outpath = StorageLocation, Increments=0.1,
SecondaryPeaks=2, experiment = "FirstExperiment",
condition = "ILTPanel", Subtraction = "Internal",
CellAF=TheCellAF, SCData="subtracted",
NegativeType="default", minimalfcscutoff=0.01)
TheLuciernagaOutputs_FCS <- list.files(StorageLocation, pattern="fcs", full.names = TRUE)
head(TheLuciernagaOutputs_FCS, 4)
Let’s also process an unstained unmixing control specimen to characterize the autofluorescence that is present.
#pData(MyUnstainedGatingSet)
Unstained_Data <- map(.x=MyUnstainedGatingSet[1], .f=Luciernaga_QC, subsets="nonDebris",
removestrings=removestrings, sample.name="GUID",
unmixingcontroltype = "cells", Unstained = TRUE,
ratiopopcutoff = 0.001, Verbose = FALSE,
AFOverlap = AFOverlap, stats = "median",
ExportType = "fcs", Brightness=TRUE, SignatureReturnNow = FALSE,
outpath = StorageLocation, Increments=0.1,
SecondaryPeaks=2, experiment = "FirstExperiment",
condition = "ILTPanel", Subtraction = "Internal",
CellAF=TheCellAF, SCData="subtracted",
NegativeType="default", minimalfcscutoff=0.01)
Luciernaga_Tree
We generate a lot of clusters with Luciernaga_QC(). We can visualize them using the report and plotting functions described in the previous vignette, but selecting individual candidate output .fcs files for use in unmixing can be tiresome and confusing. Luciernaga_Tree() is our initial attempt to reduce the burden by instituting a decision tree to help filter the many outputs and return likely candidates that will work for unmixing. It relies on the Luciernaga_QC() Brightness=TRUE .csv outputs in making this decision.
We want to be upfront and say this is developmental. We have only recently created the tools that allow us to query how fluorophore brightness, signature, and relative abundance impact the unmixing of full-stained samples, and we have not yet had the time to delve into the outcomes at the depth needed to come up with a grand unified theory of perfect unmixing. That is on my to-do list as a postdoc/industry/whatever (hire me if this interests you or you want to avoid me working for your competitors). For the time being, it works well enough, but with occasional bugs in the final unmixing. We highly encourage your feedback and tinkering with the decision tree’s step methodology in order to achieve more consistent results. For now, the process works as follows (detailed explanation logic).
With that background out of the way, let’s continue.
ReferencePath <- system.file("extdata", package = "Luciernaga")
PanelPath <- file.path(ReferencePath, "UnmixingPanel.csv")
UnmixingPanel <- read.csv(PanelPath, check.names=FALSE)
MoveThese <- Luciernaga_Tree(BrightnessFilePath = StorageLocation, PanelPath = PanelPath)
gt(head(MoveThese, 5))
By scrolling through the returned selections, we can screen based on our knowledge of the panel and decide if the outcomes seem reasonable. We will also verify that these decisions were correct before we reach the unmixing process by visualizing them against reference signatures in Luciernaga_SingleColors(). In the case that the wrong file output was selected, it is much easier to remove and replace one .fcs file than 30.
Luciernaga_Move
Once Luciernaga_Tree() has identified the Luciernaga_QC() output .fcs files that will likely produce the best unmixing outcome, it would be absolutely brutal to have to track them down within a folder of hundreds of .fcs files, all with some repetition of the CCR4BUV615_UV6_10-V7_08-B3_04 nomenclature (believe me, I did so initially). Luciernaga_Move() takes the Luciernaga_Tree() list of ideal candidates and copies these .fcs files to a designated folder, saving you the hassle and allowing you to simply point at that folder for use in the functions mentioned below.
Continuing from where we left off with Luciernaga_Tree() above, for this example we will create a different temporary folder where we will store the selected .fcs files for later use in the unmixing.
SortedStorageLocation <- file.path(tempdir(), "LuciernagaSelected")
if (!dir.exists(SortedStorageLocation)) {
dir.create(SortedStorageLocation)
}
UnmixingPanel <- read.csv(PanelPath, check.names=FALSE)
TheseFluorophores <- UnmixingPanel %>% pull(Fluorophore)
walk(.x=TheseFluorophores, .f=Luciernaga_Move, data=MoveThese, input=StorageLocation, output=SortedStorageLocation)
MovedFiles <- list.files(SortedStorageLocation, pattern="fcs", full.names=TRUE)
length(MovedFiles)
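If you are curious what this kind of selective copy looks like in plain R, the sketch below copies a chosen subset of files from one folder to another with file.copy(). The folder names and file names are placeholders created on the fly, and this is a generic illustration rather than the internals of Luciernaga_Move().

```r
# Two throwaway folders standing in for the QC output and selected folders.
in_dir  <- tempfile("qc_outputs_")
out_dir <- tempfile("selected_")
dir.create(in_dir)
dir.create(out_dir)

# Hypothetical output names following the detector-ratio nomenclature.
all_outputs <- c("CCR4BUV615_UV6_10-V7_08-B3_04.fcs",
                 "CCR4BUV615_UV6_10-V7_09-B3_04.fcs",
                 "CD16APC_R1_10-R2_05.fcs")
invisible(file.create(file.path(in_dir, all_outputs)))

# Pretend the decision tree picked the first and third files.
selected <- all_outputs[c(1, 3)]
copied <- file.copy(from = file.path(in_dir, selected), to = out_dir)

list.files(out_dir)
```

Copying rather than moving leaves the full set of variants in place, so a rejected candidate can be swapped for another later.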
With these steps completed, we are now ready to proceed to the steps to validate our choice in unmixing controls.
Luciernaga_LinearSlices
We had previously showcased the ability of Luciernaga_LinearSlices() to take an .fcs file and visualize variation in the signature based on quantile splits of the MFI brightness. What we saw for APC CD16 is replicated below:
#pData(MyGatingSet)
APC_Example <- subset(MyGatingSet, str_detect(name, "CD16_"))
RawSlices <- Luciernaga_LinearSlices(x=APC_Example[1], subset="lymphocytes",
sample.name="GUID", removestrings=removestrings,
stats="median", returntype="raw",
probsratio=0.1, output="plot", desiredAF="R1-A")
plotly::ggplotly(RawSlices)
#pData(MyGatingSet[6])
NormalizedSlices <- Luciernaga_LinearSlices(x=APC_Example[1], subset="lymphocytes",
sample.name="GUID", removestrings=removestrings,
stats="median", returntype="normalized",
probsratio=0.1, output="plot", desiredAF="R1-A")
plotly::ggplotly(NormalizedSlices)
For today, we highlight that Luciernaga_LinearSlices() can also do this with the Luciernaga_QC() outputs.
MovedFiles <- list.files(SortedStorageLocation, pattern="fcs", full.names=TRUE)
Selected_CS <- load_cytoset_from_fcs(MovedFiles, truncate_max_range = FALSE, transform = FALSE)
Selected_GS <- GatingSet(Selected_CS)
pData(Selected_GS)
ThePlots <- map(.x=Selected_GS, .f=Luciernaga_LinearSlices, subset="root", removestrings=".fcs",
sample.name="GUID", stats="median", returntype="normalized", output="plot")
plotly::ggplotly(ThePlots[[1]])
As we can see, with the exception of cells below the 30th percentile in brightness, most of the variation in signature we saw in the original file is gone, and the sorting within Luciernaga_QC() appears to have grouped cells of similar signature regardless of brightness.
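The quantile-slicing idea can be sketched in base R as follows: split events into brightness deciles on the peak detector, take the median per detector within each slice, and normalize each slice to its own maximum. The simulated data, detector names, and constant background are assumptions made for illustration, and the code is not taken from Luciernaga_LinearSlices().

```r
set.seed(3)
n <- 1000
base_signature <- c(D1 = 0.10, D2 = 1.00, D3 = 0.40, D4 = 0.05)

# Hypothetical events: one signature scaled by brightness, plus constant background.
brightness <- rlnorm(n, meanlog = 8, sdlog = 1)
events <- outer(brightness, base_signature) +
  matrix(rnorm(n * 4, mean = 100, sd = 20), nrow = n)
colnames(events) <- names(base_signature)

# Assign each event to a brightness decile on the peak detector.
peak <- "D2"
breaks <- quantile(events[, peak], probs = seq(0, 1, by = 0.1))
slice <- cut(events[, peak], breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Median per detector within each slice, then normalize each slice to its maximum.
slice_medians <- apply(events, 2, function(column) tapply(column, slice, median))
normalized <- slice_medians / apply(slice_medians, 1, max)

round(normalized, 2)
```

In this toy data the dimmest slices look flatter than the brightest ones, because the constant background is a larger share of their signal. That is one reason low-brightness events can be worth excluding when building a reference signature.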
By passing the generated plots to Utility_Patchwork(), we can set returntype = "pdf" or "patchwork" to generate a report for all fluorophores.
PatchworkObjects <- Utility_Patchwork(ThePlots, filename = "LinearSlices", outfolder=ReferencePath, returntype = "patchwork")
PatchworkObjects[1]
Luciernaga_SingleColors
The main purpose of Luciernaga_SingleColors() is to generate a reference matrix for use in unmixing full-stained samples. It additionally provides a mechanism by which we can visualize the signatures we plan on using and compare them to the reference signatures stored within Luciernaga. This allows us to screen out potential issues in the single-color reference matrix before we proceed to unmix the full-stained specimens.
To begin, we need to generate either a .csv file or a data.frame listing each of the fluorophores present, and specify a cutoff point for each, dictated by what we observed with Luciernaga_LinearSlices(). For this example, we will do this in R and set the intervals from 0.3 to 1.0 across the board. If you want to fine-tune things further, it would be easier to save the output as a .csv file, modify it, and then read it back into R.
UnmixingPanel <- read.csv(PanelPath, check.names=FALSE)
ThePanelCuts <- UnmixingPanel %>% select(-Detector) %>% mutate(From=0.3) %>% mutate(To=1)
head(ThePanelCuts, 5)
#write.csv(ThePanelCuts, file="SaveHere.csv", row.names=FALSE)
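If you would rather adjust a single fluorophore’s cutoff without the .csv round trip, a base R sketch might look like the one below. The small panel table is a stand-in with the same From/To columns as ThePanelCuts. Treating From and To as quantile bounds on the peak detector is my assumption about what the cutoffs mean, based on the quantile slices above.

```r
# Stand-in for ThePanelCuts with the same From/To columns.
panel_cuts <- data.frame(
  Fluorophore = c("BUV615", "APC"),
  Ligand = c("CCR4", "CD16"),
  From = 0.3,
  To = 1
)

# Raise the lower cutoff for one fluorophore only.
panel_cuts$From[panel_cuts$Fluorophore == "APC"] <- 0.5

# Illustration of the assumed meaning: keep events between the two quantiles.
set.seed(4)
peak_values <- rlnorm(500, meanlog = 7, sdlog = 1)  # hypothetical peak-detector intensities
apc <- panel_cuts[panel_cuts$Fluorophore == "APC", ]
limits <- quantile(peak_values, probs = c(apc$From, apc$To))
keep <- peak_values >= limits[1] & peak_values <= limits[2]

panel_cuts
mean(keep)  # fraction of events retained, roughly the brightest half
```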
It is important to correctly identify each fluorophore and ligand at this step, so that the names are correct in the unmixed full-stained .fcs file. In this case, we are using the keyword "TUBENAME" to distinguish between the .fcs files. This is what the original name looks like:
keyword(Selected_GS[1], "TUBENAME")
Luciernaga_SingleColors() will automatically clean up any portion of the name that has "(Cells)" or "(Beads)" present in it. However, that would leave the final name as "DR_CCR4 BUV615", which would be converted into Fluorophore = BUV615, Ligand = DR_CCR4. To clean this up, we will remove the author’s initials ("DR_") by providing them as part of a vector to the removestrings argument.
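As a rough illustration of the cleanup just described, the base R sketch below strips unwanted substrings from a tube name and splits what remains into ligand and fluorophore. The example keyword value and the splitting rule (ligand first, fluorophore second, separated by a space) are assumptions drawn from the "DR_CCR4 BUV615" example, not the package’s actual parsing code.

```r
tube_name <- "DR_CCR4 BUV615 (Cells)"  # hypothetical TUBENAME value
remove_these <- c("DR_", "(Cells)", "(Beads)")

# fixed = TRUE treats parentheses literally instead of as regex groups.
cleaned <- tube_name
for (pattern in remove_these) {
  cleaned <- gsub(pattern, "", cleaned, fixed = TRUE)
}
cleaned <- trimws(cleaned)

# Split on the space: ligand first, fluorophore second.
parts <- strsplit(cleaned, " ", fixed = TRUE)[[1]]
ligand <- parts[1]
fluorophore <- parts[2]

c(Ligand = ligand, Fluorophore = fluorophore)
```

Leaving "DR_" out of the removal vector would give a ligand of "DR_CCR4", which is exactly the mislabeling described above.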
#pData(Selected_GS)
removestrings=c("DR_", ".fcs")
SCs <- map(.x=Selected_GS[1], .f=Luciernaga_SingleColors, sample.name="TUBENAME",
removestrings=removestrings, subset="root", PanelCuts=ThePanelCuts,
stats="median", Verbose=TRUE, SignatureView=TRUE, returntype = "plots")
In the case above, we set returntype = "plots" to visualize the outcome compared to the reference signatures stored within Luciernaga. Let’s see what these look like:
plotly::ggplotly(SCs[[1]])
As you can tell, the provided signature originating from our output .fcs file closely resembles the reference signature. We can repeat this for all the fluorophores and, using Utility_Patchwork(), set returntype = "pdf" to examine all fluorophores and spot any issues. For this example, I will set the argument to "patchwork" to visualize.
SCs <- map(.x=Selected_GS, .f=Luciernaga_SingleColors, sample.name="TUBENAME",
removestrings=removestrings, subset="root", PanelCuts=ThePanelCuts,
stats="median", Verbose=TRUE, SignatureView=TRUE, returntype = "plots")
TheView <- Utility_Patchwork(x=SCs, filename="ReferenceMatches",
outfolder = ReferencePath, returntype = "patchwork")
TheView[3]
In this particular case, we can see that while most generated signatures aligned with their references, the .fcs files we are using for BUV737 and Pacific Blue deviate substantially and are likely to impact the final unmixing. Within our workflow, we would follow up by removing the current output .fcs file from the Selected Folder and replacing it with another variant.
While this highlights the visualization side of Luciernaga_SingleColors(), let’s go ahead and generate the reference matrix by changing returntype = "data".
SC_Reference <- map(.x=Selected_GS, .f=Luciernaga_SingleColors, sample.name="TUBENAME",
removestrings=removestrings, subset="root", PanelCuts=ThePanelCuts,
stats="median", Verbose=FALSE, SignatureView=FALSE, returntype = "data") %>%
bind_rows()
head(SC_Reference, 4)
While we are at it, we might as well check how far the troublesome Pacific Blue signature is, in terms of cosine similarity, from the reference signatures before we go and correct it:
PacificBlue <- SC_Reference %>% select(-Ligand) %>% rename(Sample=Fluorophore)
Results <- QC_WhatsThis(x="Pacific Blue", data=PacificBlue, NumberHits=10,
returnPlots = TRUE)
Results[[1]]
plotly::ggplotly(Results[[2]])
As you can tell, the signature more closely resembles BV510, with no Pacific Blue appearing in the list of hits. This suggests that the returned fluorescent signature in the .fcs file is mainly autofluorescence and should not be used.
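For reference, cosine similarity between two signatures is their dot product divided by the product of their lengths, so it compares shape and ignores brightness. The base R sketch below ranks a query signature against a few invented references. The names and values are hypothetical, and the code is independent of QC_WhatsThis().

```r
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical normalized reference signatures on 4 detectors.
references <- rbind(
  DyeX = c(1.00, 0.45, 0.10, 0.02),
  DyeY = c(0.30, 1.00, 0.55, 0.10),
  DyeZ = c(0.02, 0.10, 0.60, 1.00)
)

# Hypothetical query signature to identify.
query <- c(0.35, 1.00, 0.50, 0.12)

# Score the query against every reference row and rank the hits.
scores <- apply(references, 1, cosine_similarity, b = query)
sort(scores, decreasing = TRUE)
```

A query whose top hits do not include the fluorophore it is supposed to be is a sign that the signature was built from the wrong events.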
Back on topic: once we have corrected the Selected Folder, rerun Luciernaga_SingleColors(), and are satisfied with our results, we should save the resulting data.frame elsewhere for future reference, so that it can be edited to correct any typos or formatting issues that may have been missed before creating unmixed .fcs files.
Luciernaga_Unmix
Luciernaga_Unmix() is the unmixing function implemented within the Luciernaga package. What mainly distinguishes it from other R package implementations of ordinary least squares unmixing is that it works at the GatingSet level in terms of infrastructure (reducing active memory use) and is set up to allow us to rapidly iterate on, modify, or change the inputs and then evaluate the unmixed full-stained .fcs files for the impact those decisions had on the unmixing. We have put some effort into ensuring that the resulting unmixed files are compatible with the various software packages typically used by those who prefer a GUI for their flow data. This involved changes within the newly produced .fcs files’ exprs, parameters and description slots. It is possible we may have missed something, so if you encounter a bug, please reach out.
As previously stated, this remains experimental, and at the moment is just intended as a tool to allow me to query how the brightness/signature/abundance of individual single colors impacts the full unmixing. As I improve on my existing knowledge, the quality of inputs/outputs is likely to also improve as I figure out the things that I don’t yet know and correct for them. So consider this a work in progress for now, and feel free to reach out if you know something I don’t and want to collaborate on getting it implemented here.
For now, let’s identify the raw full-stained files and load them into a GatingSet:
File_Location <- system.file("extdata", package = "Luciernaga")
FCS_Pattern <- ".fcs$"
FCS_Files <- list.files(path = File_Location, pattern = FCS_Pattern,
full.names = TRUE, recursive = FALSE)
RawFullStainedFCSFiles <- FCS_Files[grep("Tetramer", FCS_Files)]
RawFullStainedFCSFiles <- RawFullStainedFCSFiles[!grepl("Unmixed", RawFullStainedFCSFiles)]
UnmixCytoSet <- load_cytoset_from_fcs(RawFullStainedFCSFiles, truncate_max_range = FALSE, transform = FALSE)
UnmixGatingSet <- GatingSet(UnmixCytoSet)
Let’s identify the Single Color Reference Data output from Luciernaga_SingleColors() that we have validated (correcting for any issues).
ReferencePath <- system.file("extdata", package = "Luciernaga")
ValidatedSCReferenceData <- file.path(ReferencePath, "ValidatedSCReferenceData.csv")
SingleColorReference <- read.csv(ValidatedSCReferenceData, check.names = FALSE)
And finally, let’s provide a file.path to the panel (to establish the correct ordering of fluorophores in the final file).
PanelPath <- file.path(ReferencePath, "UnmixingPanel.csv")
PanelNames <- read.csv(PanelPath, check.names=FALSE)
With these prerequisites prepared, we can go ahead. For this example, we will merge the "GROUPNAME" and "TUBENAME" keywords to form the final name, and use the addon argument to append "_Unmixed" at the end. As ordinary least squares (OLS) returns values close to 0, the multiplier argument scales all values up across the board, which allows the data to resemble that of other software when the bi-exponential transform is applied.
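To see why a multiplier helps on display, consider the sketch below. It uses asinh() with a cofactor of 150 as a simple stand-in for a bi-exponential transform. The cofactor and the toy unmixed values are my assumptions, and 50000 is the multiplier value used later in this vignette.

```r
# Hypothetical unmixed values spanning two decades but all close to 0.
unmixed_small <- c(0.002, 0.02, 0.2)
multiplier <- 50000
cofactor <- 150  # assumed display cofactor, stand-in for a bi-exponential

# Near zero, asinh() is almost linear, so the populations collapse together.
display_small <- asinh(unmixed_small / cofactor)

# After scaling, the same values spread across the log-like part of the transform.
display_scaled <- asinh(unmixed_small * multiplier / cofactor)

round(rbind(display_small, display_scaled), 3)
```

The multiplier does not change the relationships between events; it only moves the data into the range where the display transform can separate them.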
Luciernaga_IterativeUnmix
This function is an extension of Luciernaga_Unmix() using the same inputs, with the added provision that you give it a folder of variant Luciernaga_QC() .fcs files for a single fluorophore of interest. Luciernaga_IterativeUnmix() will then proceed one by one through the files in that folder, process each individually, swap it into the reference matrix, unmix the full-stained samples, and return the variant unmixed full-stained files to the outfolder. It repeats this until every variant has been processed. Subsequently, we use Utility_UnityPlots() and Utility_NxNPlots() and the workflow described in Vignette 1 to consolidate all the variant unmixed files and evaluate how the variation in that iterated single color impacted the final unmixing.
IterativePath <- file.path(ReferencePath, "DifferentialPerCP")
removestrings <- c("DR_", ".fcs")
iterate_removestrings <- c("DR_", "(Cells)", ".fcs", " ", "PerCP-Cy5.5", "CD26", "_")
TheSampleName <- c("GROUPNAME", "TUBENAME")
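# UnmixedOutpath is not defined earlier in this vignette; a temporary folder is assumed here
UnmixedOutpath <- file.path(tempdir(), "LuciernagaUnmixed")
if (!dir.exists(UnmixedOutpath)) {
  dir.create(UnmixedOutpath)
}
Luciernaga_IterativeUnmixing(IterativePath=IterativePath, iterate_removestrings=iterate_removestrings,
removestrings=removestrings, sample.name=TheSampleName, subset="root",
PanelCuts=ThePanelCuts, stats="median", Verbose=FALSE, SignatureView=FALSE,
FullStainedGS=UnmixGatingSet, controlData=SingleColorReference, multiplier=50000,
outpath=UnmixedOutpath, PanelPath=PanelPath)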
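The swap-and-unmix loop described above can be sketched with plain matrices: for each candidate signature, replace one row of the reference matrix, unmix the same raw data, and keep the result for comparison. Everything here (fluorophore names, signatures, the `ols_unmix()` helper) is hypothetical and only mirrors the concept, not the package’s implementation.

```r
# OLS via the normal equations, as a hypothetical helper.
ols_unmix <- function(raw, signatures) {
  raw %*% t(signatures) %*% solve(signatures %*% t(signatures))
}

# Hypothetical reference matrix: 3 fluorophores on 5 detectors.
reference <- rbind(
  FluorA = c(1.00, 0.40, 0.10, 0.02, 0.00),
  FluorB = c(0.05, 0.30, 1.00, 0.35, 0.05),
  FluorC = c(0.00, 0.02, 0.10, 0.50, 1.00)
)

# Two noise-free "full-stained" events with known abundances.
true_abundance <- rbind(c(500, 0, 200), c(0, 800, 50))
colnames(true_abundance) <- rownames(reference)
raw <- true_abundance %*% reference

# Candidate signatures for FluorB: the matching one and a shifted variant.
variants <- list(
  matching = reference["FluorB", ],
  shifted  = c(0.05, 0.45, 1.00, 0.20, 0.05)
)

# Swap each candidate into the reference matrix and unmix the same raw data.
results <- lapply(variants, function(candidate) {
  swapped <- reference
  swapped["FluorB", ] <- candidate
  ols_unmix(raw, swapped)
})

lapply(results, round, 1)
```

With the matching signature the known abundances are recovered, while the shifted variant spreads error into the other fluorophores. This is the kind of difference the NxN plots are meant to reveal.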
Conclusion
And with that, we conclude our tour of the current state of the unmixing functions within the Luciernaga package. They remain a work in progress, and we welcome any contributions/insights/bug-reports to continue improving on them. This entire project arose from curiosity about how placing a positive gate on a single-color unmixing control would alter the unmixing of that file, and the sum of Luciernaga’s functions has been geared toward allowing me to answer these questions so that no other graduate student will have to go through the horror of “it unmixed weird, no idea why” 20 years from now.