PBMC isolation and cryopreservation
Peripheral venous blood samples were collected by venipuncture from two voluntary blood donors, one male and one female. Blood samples were collected in ACD-tubes and stored at room temperature (RT) or 4 °C. In the former condition, peripheral blood mononuclear cells (PBMC) were isolated at 2, 8, 24, and 48 h. For samples stored at 4 °C, PBMC were isolated at 24 and 48 h. PBMC separation was performed using Ficoll density gradient centrifugation. For each condition, 12 ml of blood was diluted with an equal volume of pre-warmed RPMI 1640 culture medium (Lonza). The diluted blood was then carefully layered onto a Leucosep tube (Greiner Bio-One) prefilled with 15 ml of Ficoll-Plus (GE Healthcare Biosciences AB) and centrifuged for 15 min at 800×g and RT (without acceleration and brake). After centrifugation, PBMC were collected with a sterile Pasteur pipette into a 50-ml tube, diluted up to 10 ml with pre-warmed RPMI medium and centrifuged for 10 min, at 400×g and RT. Following a second washing step with 5 ml of RPMI medium and a 5-min centrifugation, PBMC were resuspended in 8 aliquots of freezing media. Freezing media consisted of RPMI 1640 with 20% heat-inactivated fetal bovine serum (Sigma-Aldrich), 10% DMSO (Sigma-Aldrich), and penicillin-streptomycin 1:1000 (Lonza). One-milliliter aliquots, with approximately 1 × 10e6 cells/ml, were gradually frozen using a commercial freezing box (Mr. Frosty, Nalgene) at − 80 °C for 24 h and then stored in a vapor-phase liquid nitrogen tank at − 150 °C.
Cryopreserved (− 80 °C) PBMC samples were rapidly thawed in a 37 °C water bath. Each sample was transferred into a 15-ml Falcon using a 1000-μl cut tip without mixing by pipetting. Next, 1 ml of 37 °C pre-warmed media (Hibernate-A supplemented with 10% FCS; ThermoFisher) was added dropwise with gentle swirling of the sample. After 1-min incubation, 2 ml of pre-warmed media was added and incubated for 1 min. Next, 5-ml pre-warmed media was gently added, inverted, and incubated (1 min). This step was repeated once. Finally, the samples were centrifuged at 700×g for 5 min (4 °C). The supernatant was removed, and the pellets re-suspended in 100 μl of Cell Staining Buffer (BioLegend). Of note, we did not observe an increase of damaged/dead cells (cell viability staining) towards the later time points with an average viability of 95% (range, 88–98%) across time points. An increased fraction of debris could be observed at 24 and 48 h, which, however, did not result in reduced data quality (e.g., proportion of excluded cells) during data analysis (Additional file 7: Table 6).
CLL patient samples were obtained from freshly extracted blood, stored either at RT or 4 °C. Mononuclear cells were isolated after Ficoll density gradient centrifugation, at 2, 4, 6, 8, and 24 h after patient blood extraction. The cells were directly cryopreserved with freezing media (RPMI 1640 with 20% FBS and 10% DMSO), in the concentration of 5–10 × 10e6 cells/ml, according to standardized protocol. The tumor cell content of all the samples was > 80%, as assessed by immunostaining of CD19, CD20, CD5, and CD45 followed by flow cytometry. All patients gave informed consent for their participation in the study according to International Cancer Genome Consortium (ICGC) guidelines.
Cell hashing was performed following the manufacturer’s instructions (Cell hashing and Single Cell Proteogenomics Protocol Using TotalSeq™ Antibodies; BioLegend). Therefore, samples were incubated 10 min at 4 °C with Human TruStain FcX™ Fc Blocking reagent (BioLegend). Next, sample-specific TotalSeq antibodies (anti-human Hashtag 1-8, Biolegend) were added with subsequent incubation on ice for 45 min. Cells were washed once with cold 1X PBS supplemented with 0.0005% BSA (ThermoFisher) and pelleted at 700×g for 5 min. A single-cell solution was obtained resuspending the pellet in 1X PBS (0.0005% BSA) and filtering it through a 40-μm cell strainer. The cells were counted in an automatic cell counter (Countess® v.2, ThermoFisher).
PBMC isolation and activation
Peripheral venous blood samples were collected by venipuncture from one voluntary donor (male). Blood samples were collected in 10-ml Vacutech Vacuum Blood Collection Tubes K2/K3 EDTA (Becton Dickinson) and stored at RT. Peripheral blood mononuclear cells (PBMC) were isolated at 0, 8, and 24 h after blood collection. PBMC separation was performed using Ficoll density gradient centrifugation. For each condition, 9 ml of blood was diluted with an equal volume of 1X PBS (Gibco). The diluted blood was then carefully layered onto 9 ml of Lymphoprep solution (STEMCELL Technology) and centrifuged for 15 min at 700×g and RT (without acceleration and brake). After centrifugation, PBMC were collected and washed twice with 10 ml of 1X PBS. The pellet was resuspended with 10 ml of 1X PBS, and cells were counted with a TC20™ Automated Cell Counter (Bio-Rad Laboratories). PBMC were again centrifuged for 5 min at 700×g and resuspended in an appropriate volume of freezing media (RPMI with 10% heat-inactivated fetal bovine serum and 10% DMSO). Aliquots of ~ 0.5 × 10e6 cells/ml were gradually frozen using a commercial freezing box (Mr. Frosty, Nalgene) at − 80 °C for 24 h and then stored in a vapor-phase liquid nitrogen tank at − 150 °C.
For T cell activation, cells were thawed in MACS buffer (1X PBS, 4% FBS, 2 mM EDTA), centrifuged during 5 min at 700×g and RT, and resuspended in pre-warmed culture media (RPMI, 1% Pyruvate, 20% FBS, Pen/Strep, DNase 100 U/ml). A TC20™ automated cell counter was used to assess cell number and viability. The number of only viable cells was used to calculate volumes for cell seeding. For each condition, 200,000 live cells were seeded into two wells of a 96-well round bottom plate (Sigma Aldrich) for a total of 400,000 cells per condition (time point). Dynabeads Human T-Activator CD3/CD28 (Thermo Fisher Scientific) were transferred to a 1.5-ml tube (5 μl/well), washed twice with 1 ml of cell culture media, and resuspended with 10 volumes of cell culture media. Fifty microliters of resuspended beads was added to each well for T cell activation and expansion. Cells were incubated during 24 h at 37 °C with 5% CO2 and 5% humidity. The remaining cells (~ 350,000 cells per condition) were used as a control (day 0) for T cell activation. Cells subjected to T cell activation protocol were collected in a 1.5-ml tube and stained with DAPI (Thermo Fisher Scientific) at 1 μM final concentration. DAPI-negative live individual cells were sorted with a BD FACSAria™ Fusion Flow cytometer (BD Biosciences) in 1X PBS supplemented with 0.05% BSA.
Samples subjected to T cell activation treatment, as well as corresponding control samples, were subjected to a Cell Hashing protocol before proceeding to scRNA-seq. Cell hashing was performed following the manufacturer’s instructions (Cell hashing and Single Cell Proteogenomics Protocol Using TotalSeq™ Antibodies; BioLegend). Cells were counted with a TC20™ Automated Cell Counter, and an equal number of cells was taken for each condition. Briefly, samples were resuspended in Cell Staining Buffer (BioLegend), incubated 10 min at 4 °C with Human TruStain FcX™ Fc Blocking reagent (Bio Legend). To each condition, a specific TotalSeq-A antibody-oligo conjugate (anti-human Hashtag 1-8, Biolegend) was added and incubated on ice for 1 h. Cells were then washed three times with cold PBS-0.05% BSA (ThermoFisher) and centrifuged for 5 min at 700×g at 4 °C. Finally, cells were resuspended in an appropriate volume of 1X PBS-0.05% BSA in order to obtain a final cell concentration > 500 cells/ul, suitable for 10x Genomics scRNA-seq. An equal volume of hashed cell suspension from each of the conditions (0 h, 8 h, and 24 h) was mixed and filtered with a 40-μm strainer. Cell concentration was verified by counting with a TC20™ Automated Cell Counter.
Single-cell RNA sequencing
Cells were partitioned into Gel Bead In Emulsions with a Target Cell Recovery of 10,000 total cells. Sequencing libraries were prepared using the single-cell 3′ mRNA kit (V2 for PBMC samples and V3 for CLL and cultured PBMC samples; 10x Genomics) with some adaptations for cell hashing, as indicated in TotalSeq™-A Antibodies and Cell Hashing with 10x Single Cell 3′ Reagent Kit v3 3.1 Protocol by BioLegend. Briefly, 1 μl of 0.2 μM HTO primer (Hashtag Oligonucleotides; GTGACTGGAGTTCAGACGTGTGC*T*C; *Phosphorothioate bond) was added to the cDNA amplification reaction in order to amplify the hashtag oligos together with the full-length cDNAs. A SPRI selection cleanup was done in order to separate mRNA-derived cDNA (> 300 bp) from antibody-oligo-derived cDNA (< 180 bp), as described in the abovementioned protocol. 10x cDNA sequencing libraries were prepared following 10x Genomics Single Cell 3′ mRNA kit protocol, while HTO cDNAs were indexed by PCR as follows. Briefly, 5 μl of purified hashtag oligo cDNA was mixed with 2.5 μl of 10 μM Illumina TruSeq D70X_s primer (IDT) carrying a different i7 index for each sample, 2.5 μl of SI primer from 10x Genomics Single Cell 3′ mRNA kit, 50 μl of 2X KAPA HiFi PCR Master Mix (KAPA Biosystem), and 40 μl of nuclease-free water. The reaction was carried out using the following thermal cycling conditions: 98 °C for 2 min (initial denaturation), 12 cycles of 98 °C for 20 s, 64 °C for 30 s, 72 °C for 20 s, and a final extension at 72 °C for 5 min. The HTO libraries were purified with 1.2 X SPRI bead selection. Size distribution and concentration of cDNA and HTO libraries were verified on an Agilent Bioanalyzer High Sensitivity chip (Agilent Technologies). Finally, sequencing of HTO and cDNA libraries was carried out on a HiSeq4000 or NovaSeq6000 system (Illumina).
Single-cell ATAC sequencing
For the single-cell ATAC-seq experiments, we analyzed one PBMC and one CLL sample isolated after 0 h, 8 h, and 24 h of blood storage at room temperature before cryopreservation. Frozen samples were rapidly thawed in a 37 °C water bath. Each sample was transferred into a 15-ml Falcon using a 1000-μl cut tip without mixing by pipetting. Next, 1 ml of 37 °C pre-warmed media (Hibernate-A supplemented with 10% FCS) was added dropwise with gentle swirling of the sample. After 1 min of RT incubation, additional 2 ml of pre-warmed media was added. The samples were again kept at RT for 1 min, before 5 ml of pre-warmed media was gently added. This step was repeated once. Then, samples were centrifuged at 700×g for 5 min. Supernatant was removed and the pellets resuspended in 500 μl of PBS supplemented with 0.05% BSA. Cell concentration and viability were determined with a TC20™ Automated Cell Counter.
Nuclei isolation was performed following the “Nuclei Isolation for Single Cell ATAC Sequencing demonstrated protocol” (10x Genomics). Briefly, 1,000,000 cells from the CLL sample and 300,000 cells from PBMC were transferred to a 1.5-ml microcentrifuge tube and centrifuged at 500×g for 5 min at 4 °C. The supernatant was removed without disrupting the cell pellet, and 100 μl of chilled Lysis Buffer (10 mM Tris-HCl (pH 7.4); 10 mM NaCl; 3 MgCl2; 0.1% Tween-20; 0.1% Nonidet P40 Substitute; 0.01% Digitonin and 1% BSA) was added and pipette-mixed 10 times. Samples were then incubated on ice during 3 min. Following lysis, 1 mL of chilled Wash Buffer (10 mM Tris-HCl (pH 7.4); 10 mM NaCl; 3 MgCl2; 0.1% Tween-20 and 1% BSA) was added and pipette-mixed. Nuclei were centrifuged at 500×g for 5 min at 4 °C, and the supernatant removed without disrupting the pellet. Based on the starting number of cells and assuming a 50% loss during the procedure, nuclei were resuspended into the appropriate volume of chilled Diluted Nuclei Buffer (10x Genomics) in order to achieve a nuclei concentration of 1540–3850 nuclei/μl, suitable for a Target Nuclei Recovery of 5000. The resulting nuclei concentration was determined using a with a TC20™ Automated Cell Counter.
scATAC-seq libraries were prepared according to the Chromium Single Cell ATAC Reagent Kits User Guide (10x Genomics; CG000168 Rev. B). Briefly, the transposition reaction was prepared by mixing the desired number of nuclei with ATAC Buffer (10X Genomics) and ATAC Enzyme (10X Genomics), before incubation for 60 min at 37 °C. Nuclei were partitioned into Gel Bead-In-Emulsions (GEMs) by loading the following into a Chip E: the master mix (previously added to the same tube of the transposed nuclei), the Chromium Single Cell ATAC Gel Beads (10X Genomics), and the Partitioning Oil (10X Genomics). After the run into the Chromium Controller, the DNA linear amplification was performed by incubating the GEMs at the following thermal cycling conditions: 72 °C for 5 min, 98 °C for 30 s, 12 cycles of 98 °C for 10 s, 59 °C for 30 s, and 72 °C for 1 min. GEMs were broken using the Recovery Agent (10X Genomics), and the resulting DNA was purified by Dynabeads and SPRIselect reagent (Beckman Coulter; B23318) bead cleanups. Indexed sequencing libraries were obtained by mixing the amplification product with the Sample Index PCR Mix (10X Genomics) and the Chromium i7 Sample Index (10x Genomics) and incubating at the following thermal cycling conditions: 98 °C for 45 s, 12 cycles of 98 °C for 20 s, 67 °C for 30 s, 72 °C for 20 s, and with a final extension of 72 °C for 1 min. Sequencing libraries were subjected to a final bead cleanup SPRIselect reagent and quantified on an Agilent Bioanalyzer High Sensitivity chip (Agilent Technologies). Finally, libraries were loaded on an Illumina HiSeq 2500 system in Rapid Run mode using the following read length: 50 bp Read 1N, 8 bp i7 Index, 16 bp i5 Index, and 50 bp Read 2N.
Primary processing and demultiplexing
We processed sequencing reads with CellRanger v3.0.0 for the PBMC data and v3.0.2 for the CLL and T cell activation data. We used the human GRCh38 assembly as reference genome. To specify the hashtag oligonucleotide (HTO) libraries, the cDNA libraries, and the HTO sequences, we followed the “Feature Barcoding Analysis” pipeline, available at https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis. We set the --chemistry and --expect-cells flags of cellranger count to “SC3Pv3” and “5000”, respectively. We demultiplexed cell hashtags as described in Stoeckius et al.  for each batch and donor separately. Briefly, we normalized hashtag oligonucleotide (HTO) counts using a centered log ratio (CLR), in which each count is divided by the geometric mean of a HTO across cells and log-transformed. We then clustered barcodes using k-medoids with k equal to the number of conditions (k = 4 for batch 03, and k = 8 for batch 04), which allowed us to identify the background distribution of each HTO. We re-clustered Male 04 with k = 3, as no clear signal for the HTO “24 h 4 °C” was detected. Subsequently, we considered the top 0.5% normalized HTO counts of the background distribution as outliers and excluded them. We classified barcodes to a given condition if the normalized HTO counts of that condition exceeded the 0.99 quantile. We discarded both barcodes that were assigned to more than one condition (multiplets) and barcodes that were not assigned to any condition (negatives). In subsequent datasets (CLL and T-cell activation), we demultiplexed the HTO with Seurat’s built-in functions . Of note, all sampling time point showed comparable dataset qualities (e.g., library size and mitochondrial read content; Additional file 7: Table S6).
Quality control and normalization
We performed quality control (QC) and normalization separately for each dataset (PBMC, CLL, T-cell activation). Following the guidelines from Luecken et al. , we inspected the distributions of three main QC metrics: library size (total UMI), library complexity (number of detected genes), and mitochondrial expression. Importantly, we analyzed these metrics jointly to ensure that cells with high mitochondrial expression were not metabolically active. Finally, we analyzed one of the CLL donors independently as it showed markedly different distributions in QC metrics. We classified as damaged cells those barcodes with an aberrantly low number of UMI and genes, or with an abnormally high mitochondrial expression. Likewise, we classified as doublets those barcodes that possessed and aberrantly large library size and complexity. We also leveraged DoubletFinder  to detect doublets that shared the same HTO and were not outliers in any qc metric. We ruled out genes that were detected in less than 10 (CLL) or 15 cells (T cell activation). Finally, we used the Scran 1.10.2 package  to normalize UMI counts with cell-based size factors. We provide a supplementary table with the number of cells before and after QC, the average library size, library complexity, and mitochondrial fraction stratified by time, donor, and temperature (Additional file 7: Table 6).
Cell type annotation
Cell type annotation was performed within the Seurat framework  (Additional file 1: Fig. S13). To cluster cells, we (i) identified overdispersed genes with the FindVariableFeatures function (using default parameters), (ii) scaled UMI counts and regressed out the batch effect, (iii) performed principal component analysis (PCA), (iv) used these PCs to create a k-nearest neighbors graph with the FindNeighbors function, and (v) clustered cells with the FindClusters function. We set the resolution parameter to 0.05, 0.01, and 0.1 for the PBMC, CLL, and T cell activation data, respectively. Finally, we used well-known marker genes to annotate each cluster to its specific cell type. We provide a supplementary table with the number of cells per cell type stratified by time, donor, and temperature (Additional file 7: Table 6).
To quantify and compare sampling time with other sources of variance, we followed two complementary approaches. First, we defined ρp as a universal measure of cell similarity, as Skinnider et al.  described this proportionality metric as the most accurate to measure cell-cell association. We then downsampled T cells and B cells to 50 cells per sampling time to ensure that different cell types and time points were equally represented. Likewise, we downsampled each cluster of leukemic cells (one per donor) in a similar manner. We obtained a distance matrix by computing all pairwise ρp as described in Skinnider et al. Finally, we sorted the distance matrix by cell type and time to allow for interpretation.
Second, we downsampled monocytes and NK cells as described before and merged it with the T and B cell dataset. We then regressed the scran-normalized gene expression values of 5282 genes (union of highly variable features of the merged and cell type-specific datasets) on one of four explanatory variables (cell type, time, donor, and batch), and extracted a distribution of r2 values for each variable. To conduct a similar analysis in the Smart-seq2 dataset (T cells), we used the plotExplanatoryVariables function from Scater .
Differential expression analysis (DEA)
To find the scRNA-seq signature, we divided cells in the PBMC and CLL datasets in time-biased (t > 2 h) and time-unbiased (t < =2 h). Subsequently, we performed a Wilcoxon signed-rank test to test for differential expression for each gene. Soneson et al.  reported that this test is among the best performing for scRNA-seq DEA analysis. Vieth et al.  showed that with Scran normalization, there is no need for scRNA-seq-tailored statistical tests. To define the sampling time signatures, we performed the same approach but setting the logfc.threshold of Seurat’s FindMarkers function to logfc.threshold = 0.25 to increase the specificity.
Gene Ontology (GO) enrichment analysis
To elucidate biological processes affected by sampling time, we conducted a GO enrichment analysis with the GOstats 2.48.0 package . We used as target set the entrez identifiers of the upregulated (log fold-change > 0) or downregulated (log fold-change < 0) genes in the sampling time signature, and as universe set the entrez identifiers of all genes that we included in the analysis. Finally, we filtered out GO terms that were too general (size ≥ 600), or too specific (size < 3). In addition, we only retained GO terms with a p value lower than 0.05 and an odds ratio greater than 2.
Integration of the sampling signatures
To compare the sampling time signatures detected for PBMC and CLL with other condition-specific signatures, we downloaded the DEG tables for the studies by Baechler et al.  and van den Brink et al. . We defined as signature those genes with an adjusted p value < 0.001. Furthermore, we subsetted the Baechler signature to the top 250 genes to harmonize the number of DEG across signatures. Finally, we calculated a signature-specific score using the AddModuleScore function from Seurat.
Prediction of sampling time-biased cells
To predict sampling time-biased cells, we used the AddModuleScore function of Seurat to compute a time score per cell using a signature calculated on the male donor (training set). We then fitted a logistic regression model using the time score as explanatory variable. Subsequently, we predicted the probability of being “biased” for every cell of the female donor (test set) and found the area under the curve (AUC) with the “AUC” function from the cvAUC v1.1.0 package. To test the significance of our signature, we repeated the process with a signature defined on random genes.
Computational correction sampling time signature
To correct the time-biased transcriptomes in the female PBMC dataset (test set), we regressed the expression of each gene on the time score and kept the scaled and centered residuals as the variance in gene expression not explained by time. We performed this process for each cell type independently to minimize Simpson’s paradox.
As all the analysis above used a similar proportion of “biased” and “unbiased” cells, we sought to test the effect of varying percentages of biased cells on the time score computation and regression. In this setting, we performed bootstrapping as follows: first, we sampled 300 cells with replacement from the Smart-seq2 dataset, enforcing an approximate percentage of time-affected cells. Second, we computed the average Silhouette width between affected and unaffected cells. Finally, we computationally corrected the transcriptome profiles and recalculated the average Silhouette width. We repeated this process 25 times for each set of percentages ranging from 10 to 90% of affected cells.
k-nearest-neighbor batch-effect test (kBET)
To assess the mixability between cells of different time points in the presence or absence of our corrections, we used the kBET metric . Briefly, kBET compares the label distribution of the local k-nearest neighborhood of a given cell with the global distribution with a Pearson’s χ2 test, with the null hypothesis that, if samples are well-mixed, both distributions are equal. We ran kBET with the cells embedded in UMAP space and with default parameters. We defined the acceptance rate as the percentage of tested cells with a p value > 0.05, and as rejection rate 100-acceptance rate.
To confirm that the results obtained from 10x Genomics-derived data were technology-independent, we profiled the transcriptome of 376 CD3+ T cells with Smart-seq2 . The cells originated from the same donors as in the 10x Genomics experiments and were distributed across four 96-well plates (all time points per plate). We discarded 60 cells that either had < 75,000 or > 1,000,000 total counts, < 435 detected genes or a mitochondrial expression > 20%. Similarly, we filtered out 6542 genes that had an average expression across cells < 1. We normalized gene counts with the Scran package, which removed the batch effect between plates. Finally, we clustered cells with the SC3 1.10.1 package , as it outperforms other tools for small datasets .
ATAC-seq data analysis
ATAC-seq data from 10x Genomics was processed with CellRanger-atac 1.1.0. Differential accessibility to detect changes in open chromatin sites was performed with Wilcoxon-Mann-Whitney rank sum test for high-throughput expression profiling data (R BioQC v1.0.0). Motif enrichment analysis was performed using the package motifcounter v.1.10.0  with default parameters, and the motifs were downloaded from JASPAR database (579 motifs from JASPAR CORE VERTEBRATES, http,//jaspar.genereg.net/downloads/). The background distribution was computed over the total peaks called in the datasets (56,627 in the PBMC and 80,861 in the CLL). We provide a supplementary table with multiple QC metrics (number of high-quality cells and detected peaks, among others) stratified by experiment, donor, and time (Additional file 7: Table 6).