Generation and maintenance of cell types
Human pluripotent stem cell maintenance
H9 human pluripotent stem cells (WiCell) were maintained in E8 media and passaged every 4 days onto matrigel-coated plates (Roche). ESCs, cardiac myocytes, epicardium, and endothelium were H9 ESC-derived. The remainder of the cells were H7 ESC-derived. H7 pluripotent stem cells (WiCell) were maintained in feeder-free conditions using mTeSR1 media (StemCell Technologies) + 1% penicillin/streptomycin (Thermo Fisher), and fresh media was added daily. Cells were cultured on tissue culture plastics coated with Geltrex basement matrix (Thermo Fisher; which was diluted 1:100 in DMEM/F12 media [Thermo Fisher] before being used to coat culture plastics). Prior to reaching confluence, H7 ESCs were dissociated using either Accutase (Thermo Fisher) or Versene (Thermo Fisher), and then were passaged onto new plates. Both H7 (WA07) and H9 (WA09) are included in the NIH Human Embryonic Stem Cell Registry and their identity has been authenticated. The genomic identity and mycoplasma status of H7 ESCs were not assessed. H9 ESCs were tested quarterly for mycoplasma contamination.
For all H7-derived cells, differentiation was conducted in serum-free media, either Chemically Defined Medium 2 (CDM2) or Chemically Defined Medium 3 (CDM3). The composition of CDM2 basal medium [25, 26] is 50% IMDM + GlutaMAX (Thermo Fisher, 31980-097) + 50% F12 + GlutaMAX (Thermo Fisher, 31765-092) + 1 mg/mL polyvinyl alcohol (Sigma, P8136-250G) + 1% v/v chemically defined lipid concentrate (Thermo Fisher, 11905-031) + 450 μM 1-thioglycerol (Sigma, M6145-100ML) + 0.7 μg/mL recombinant human insulin (Sigma, 11376497001) + 15 μg/mL human transferrin (Sigma, 10652202001) + 1% v/v penicillin/streptomycin (Thermo Fisher, 15070-063). Polyvinyl alcohol was brought into suspension by gentle warming and magnetic stirring, and the media was sterilely filtered (through a 0.22-μm filter) prior to use.
The composition of CDM3 basal medium [23] is 45% IMDM + GlutaMAX (Thermo Fisher, 31980-097) + 45% F12 + GlutaMAX (Thermo Fisher, 31765-092) + 10% KnockOut serum replacement (Thermo Fisher, 10828028) + 1 mg/mL polyvinyl alcohol (Sigma, P8136-250G) + 1% v/v chemically defined lipid concentrate (Thermo Fisher, 11905-031) + 1% v/v penicillin/streptomycin (Thermo Fisher, 15070-063). Polyvinyl alcohol was brought into suspension by gentle warming and magnetic stirring, and the media was sterilely filtered (through a 0.22-μm filter) prior to use.
Cardiac myocyte differentiation
On day 0 (start of differentiation), H9 human pluripotent stem cells were treated with 1 mg/ml Collagenase B (Roche) for 1 h, or until cells dissociated from plates, to generate embryoid bodies. Cells were collected and centrifuged at 300 rcf for 3 min and resuspended as small clusters of 50–100 cells by gentle pipetting in differentiation media containing RPMI (Gibco), 2 mM L-glutamine (Invitrogen), 4 × 10⁻⁴ M monothioglycerol (MTG, Sigma-Aldrich), and 50 μg/ml ascorbic acid (Sigma-Aldrich). Differentiation media was supplemented with 2 ng/ml BMP4 and 3 μmol Thiazovivin (Millipore). Embryoid bodies were cultured in 6-cm dishes (USA Scientific) at 37°C in 5% CO2, 5% O2, and 90% N2. On day 1, the media was changed to differentiation media supplemented with 30 ng/ml BMP4 (R&D Systems), 30 ng/ml Activin A (R&D Systems), 5 ng/ml bFGF (R&D Systems), and 1 μM Thiazovivin (Millipore). On day 3, embryoid bodies were harvested and washed once with DMEM (Gibco). Media was changed to differentiation media supplemented with 5 ng/ml VEGF (R&D Systems) and 5 μmol/L XAV (Stemgent). On day 5, media was changed to differentiation media supplemented with 5 ng/ml VEGF (R&D Systems). After day 8, media was changed every 3–4 days to differentiation media without supplements until approximately day 30.
Cardiac myocyte dissociation
Embryoid bodies were incubated overnight with 0.6mg/ml Collagenase Type II (Worthington) at 37°C. Dissociated cells were harvested and washed with Wash media (DMEM, 0.1% BSA) + 1 mg/ml DNase (VWR) twice and centrifuged at 300 rcf for 3 min. Cells were resuspended in differentiation media supplemented with 1 μM Thiazovivin (Millipore) and filtered.
Epicardium differentiation
The H9 human pluripotent stem cell cardiac myocyte protocol was followed up to day 3. On day 3, embryoid bodies were dissociated with TrypLE Express (Gibco). Dissociated cells were washed with Wash media (DMEM, 0.1% BSA) + 1mg/ml DNase (VWR) twice and centrifuged at 300 rcf for 3 min. Cells were resuspended in differentiation media supplemented with 1ng/ml BMP4 (R&D Systems) and filtered and counted using a hemocytometer. Cells were plated onto a matrigel-coated 96-well plate at 80,000 cells per well. On day 5, the media was changed to differentiation media supplemented with 5ng/ml VEGF (R&D Systems) and 1nM all-trans retinoic acid (Sigma-Aldrich). After day 5, media was changed every 2 days with the same day 5 differentiation media composition. On day 11, the media was changed to differentiation media supplemented with 5ng/ml VEGF (R&D Systems) and cells were fed with the same differentiation media every 2 days until day 15. On day 15, cells were dissociated with 1mg/ml Collagenase B (Roche) for 1 h, washed with Wash media (DMEM, 0.1% BSA) + 1mg/ml DNase (VWR), and centrifuged at 300 rcf for 3 min. Cells were further dissociated with 3ml TrypLE Express, washed with Wash media (DMEM, 0.1% BSA) + 1mg/ml DNase (VWR), and centrifuged at 300 rcf for 3 min. Cells were resuspended in differentiation media supplemented with 1 μM Thiazovivin (Millipore) and filtered and counted using a hemocytometer. Cells were plated in a matrigel-coated 6-well plate at 100,000 cells per well. On day 16, media was changed to differentiation media supplemented with 5ng/ml VEGF (R&D Systems) and cells were fed every 2 days until they reached confluence (approximately day 22). WT1 (WILMS-TUMOR 1) expression is indicative of successful epicardial differentiation [70].
Endothelial cell differentiation
The H9 human pluripotent stem cell cardiac myocyte protocol was followed up to day 5. On day 5, embryoid bodies were dissociated with TrypLE Express (Gibco). Dissociated cells were washed twice with Wash media (DMEM, 0.1% BSA) + 1 mg/ml DNase (VWR) and centrifuged at 300 rcf for 3 min. Cells were resuspended in differentiation media supplemented with 100 ng/ml VEGF (R&D Systems) and 50 ng/ml bFGF (R&D Systems), filtered, and counted using a hemocytometer. Cells were plated onto a matrigel-coated 96-well plate at 80,000 cells per well. Media was changed every 2 days using differentiation media supplemented with 100 ng/ml VEGF (R&D Systems) and 50 ng/ml bFGF (R&D Systems). Cells were harvested and sorted on days 14–15.
Fluorescence-activated cell sorting (FACS)
Dissociated H9 human pluripotent stem cell-derived cells were resuspended in differentiation media containing diluted antibodies (dilutions listed below) for 30 min on ice. Cells were washed with differentiation media and resuspended in differentiation media + DAPI (1.35μg/ml, Biolegend) for FACS (BD FACSAria). Human pluripotent stem cell-derived cardiac myocytes used for IF-FISH were sorted by gating for SIRPA+ (PE-Cy7 anti-human CD172a/b, Biolegend, 1:200) and CD90- (APC anti-human CD90 (Thy1) Antibody, Biolegend, 1:200) cells. PSC-derived endothelial cells were sorted by gating for CD31+ (PE anti-human CD31, Biolegend, 1:200) cells.
Ectodermal differentiation
The day prior to beginning differentiation, H7 ESCs were dissociated with Accutase (Thermo Fisher) for 10 min at 37°C. Accutase was neutralized through the addition of excess DMEM/F12 media, and then ESCs were pelleted via centrifugation and the supernatant was aspirated. Pelleted ESCs were resuspended in mTeSR1 + 1% penicillin/streptomycin + 1 μM of the ROCK inhibitor Thiazovivin (Tocris) (henceforth referred to as “cell-plating media”) and plated onto Geltrex-coated tissue culture plastics at a density of 4 × 10⁵ cells/cm² (i.e., 2.1 × 10⁶ cells per 10-cm dish). Twenty-four hours after seeding, the cell-plating media was aspirated, and cells were briefly washed with DMEM/F12 to remove all traces of cell-plating media.
For definitive ectoderm induction, H7 ESCs were differentiated through the modification of a previously described method [71], in CDM2 basal media, for 24 h.
For border ectoderm induction, H7 ESCs were differentiated into OTX2+ definitive ectoderm in 24 h (as described above), and then definitive ectoderm was briefly washed (with DMEM/F12) and then further differentiated into PAX3+ border ectoderm progenitors through the modification of a previously described method [42], in CDM2 basal media, for 24 h. Differentiation media was aspirated and added fresh every 24 h.
For midbrain induction, H7 ESCs were differentiated into definitive ectoderm in 24 h (as described above), and then definitive ectoderm was briefly washed (with DMEM/F12) and was further differentiated into neural progenitors through the modification of a previously described method [71], in CDM2 basal media, for 24 h. Neural progenitors were briefly washed (with DMEM/F12) and were then further differentiated into midbrain progenitors expressing PAX2, PAX5, EN1, and EN2 through a modification of a previously described method [72], in CDM2 media, for 48 h. Differentiation media was aspirated and added fresh every 24 h.
Endodermal differentiation
The day prior to beginning differentiation, H7 ESCs were dissociated with Accutase (Thermo Fisher) at 37°C. Accutase was neutralized through the addition of excess DMEM/F12 media, and then ESCs were pelleted via centrifugation and the supernatant was aspirated. Pelleted ESCs were resuspended in cell-plating media and plated onto Geltrex-coated tissue culture plastics at a 1:8–1:16 cell seeding ratio. Twenty-four hours after seeding, the cell-plating media was aspirated, and cells were briefly washed with DMEM/F12 to remove all traces of cell-plating media.
ESCs were then differentiated into anteriormost primitive streak (not profiled) through the addition of CDM2 basal medium supplemented with Activin A (100 ng/mL; R&D Systems), CHIR99021 (3 μM; Tocris), FGF2 (20 ng/mL; R&D Systems), and PI-103 (50 nM; Tocris), which was added for 24 h. Day 1 anteriormost primitive streak cells were briefly washed (with DMEM/F12) and then differentiated into day 2 definitive endoderm through the addition of CDM2 basal medium supplemented with Activin A (100 ng/mL; R&D Systems), LDN-193189 (250 nM; Tocris), and PI-103 (50 nM; Tocris), which was added for 24 h. Methods for anteriormost primitive streak and definitive endoderm formation have been described previously [23, 25, 27].
For liver differentiation, day 2 definitive endoderm cells were briefly washed (with DMEM/F12) and further differentiated into day 3 posterior foregut through the addition of CDM3 base media supplemented with FGF2 (20 ng/mL; R&D Systems), BMP4 (30 ng/mL; R&D Systems), TTNPB (75 nM; Tocris), and A8301 (1 μM; Tocris). Day 3 posterior foregut cells were briefly washed (with DMEM/F12), and then further differentiated on days 4–6 with CDM3 base media supplemented with Activin A (10 ng/mL; R&D Systems), BMP4 (30 ng/mL; R&D Systems), and Forskolin (1 μM; Tocris) to generate liver bud progenitors expressing HNF4A and TBX3. Methods for liver bud progenitor formation have been described previously [23, 27, 73].
For mid-hindgut differentiation, day 2 definitive endoderm cells were briefly washed (with DMEM/F12) and further differentiated into day 6 mid-hindgut progenitors expressing FOXA2, CDX2, and HOXA9 through the addition of CDM2 base media supplemented with FGF2 (100 ng/mL), BMP4 (10 ng/mL), and CHIR99021 (3 μM) for 4 days. Methods for mid-hindgut progenitor formation have been described previously [23, 25].
Mesodermal differentiation
The day prior to beginning differentiation, H7 ESCs were dissociated with Accutase (Thermo Fisher) at 37°C. Accutase was neutralized through the addition of excess DMEM/F12 media, and then ESCs were pelleted via centrifugation and the supernatant was aspirated. Pelleted ESCs were resuspended in cell-plating media and plated onto Geltrex-coated tissue culture plastics at a 1:8–1:16 cell seeding ratio. Twenty-four hours after seeding, the cell-plating media was aspirated, and cells were briefly washed with DMEM/F12 to remove all traces of cell-plating media.
ESCs were then sequentially differentiated into anterior primitive streak, paraxial mesoderm progenitors (“mesoderm progenitors” hereafter and in main text; enriched in TBX6, CDX2, and MSGN1 expression), and early somites (enriched in MEOX1 expression) as described previously [26]. Briefly, ESCs were differentiated into anterior primitive streak through the addition of CDM2 basal medium supplemented with Activin A (30 ng/mL; R&D Systems), CHIR99021 (4 μM; Tocris), FGF2 (20 ng/mL; R&D Systems), and PIK90 (100 nM; Calbiochem), which was added for 24 h, thus generating day 1 anterior primitive streak [26].
For mesoderm induction, ESCs were differentiated into anterior primitive streak in 24 h (as described above), and then anterior primitive streak was briefly washed (with DMEM/F12) and then treated with CDM2 basal media supplemented with A8301 (1 μM; Tocris), LDN193189 (250 nM; Tocris), CHIR99021 (3 μM, Tocris), and FGF2 (20 ng/mL; R&D Systems), which was added for 24 h, thus generating mesoderm progenitors [26].
For early somite induction, ESCs were differentiated into anterior primitive streak and then further differentiated into mesoderm (as described above). Mesoderm was briefly washed (with DMEM/F12) and then treated with CDM2 basal media supplemented with A8301 (1 μM; Tocris), LDN193189 (250 nM; Tocris), XAV939 (1 μM; Tocris), and PD0325901 (500 nM; Tocris) for 24 h, thus generating early somite progenitors [26].
IF-FISH, imaging, and quantification
IF and IF-FISH
ESCs and cardiac myocytes were grown and/or differentiated in culture, sorted (see above), and plated for FISH by direct growth on coverslips. Cells were fixed with 4% paraformaldehyde (PFA) for 10 min at room temperature (RT) and permeabilized with 0.5% Triton X-100 for 10 min at RT. Permeabilized cells were then blocked in 1% BSA in PBS-T (8mM Na2HPO4, 150mM NaCl, 2mM KH2PO4, 3mM KCl, 0.05% Tween 20, pH 7.4) and incubated with primary and secondary antibodies for 1 h each at RT with 3 PBS-T washes for 5 min each in between antibody incubations. Primary antibodies used were anti-Lamin B1 (1:1000, Abcam #ab16048) and anti-H3K9me2 (1:1000, Active Motif #39239). Secondary antibodies used were anti-rabbit AlexaFluor 488 (1:1000, Invitrogen #21206) and anti-rabbit AlexaFluor 568 (1:1000, Invitrogen #10042).
Following IF, cells were post-fixed with 2% PFA for 10 min at RT and permeabilized with 0.7% Triton X-100 for 10 min at RT. Cells were incubated in 2× SSC-T (3.0M NaCl, 0.3M Sodium Citrate, 0.1% Tween 20) for 5 min at RT, followed by washes in 2× SSC-T with 50% formamide for 5 min at RT, 2.5 min at 92°C, and 20 min at 60°C. Cells were hybridized with a Cy2, Cy3, or Cy5 directly labeled DNA probe diluted in a hybridization mix containing 50% formamide, 1× dextran sulfate sodium salt (Fisher Scientific #BP1585) with PVSA (poly(vinylsulfonic acid, sodium salt) solution; Sigma #278424), 10μg RNase A, 10mM dNTPs, and 2–5pmol probe for 30 min at 80°C, then overnight (minimum 16 h) at 37°C. Probes were designed in target chromosomal regions (see Additional file 8: Table 5) using a tiled oligo-based approach with 80-mer probes spaced at 4 probes/kb in designated chromosomal regions. Cells were washed with 2× SSC-T at 60°C for 15 min followed by washes in 2× SSC-T and 2× SSC for 10 min each at RT. Cells were counterstained with DAPI solution (Sigma #D9542) diluted in 2× SSC for 5 min at RT. Cells were mounted on coverglass with SlowFade Gold antifade mounting reagent (Invitrogen #S36936) prior to image acquisition. The nonLAD probe was not dye-conjugated; for this probe, IF-FISH was performed as above with the following modifications. The primary hybridization mix contained 5μg RNase A, 5mM dNTPs, and a probe concentration of 50pmol. Following primary probe hybridization, cells were washed with 2× SSC-T at 60°C for 15 min, followed by 2× SSC-T and 2× SSC for 10 min. A second hybridization mix was added containing 10% formamide, 1× dextran sulfate salt with PVSA, and 10pmol of a secondary probe conjugated to AlexaFluor 647. Cells were hybridized for 2 h at RT, then washed with 2× SSC-T at 60°C for 10 min, then 2× SSC-T and 2× SSC for 5 min each. Samples were then counterstained with DAPI and mounted, as above.
Imaging
Confocal 3D images were taken using 120nm step Z-stacks, with an approximate range of 10–70 Z-planes per cell. Obtained images were deconvolved using Leica Lightning Deconvolution Software. Representative images in Fig. 4 show a single focal plane with brightness and contrast adjusted equivalently across samples in ImageJ. 3D reconstructions were performed using IMARIS v.8.1.2 software (Bitplane AG, Switzerland). Nuclear lamina and H3K9me2 surfaces were created using the Surfaces tool with automatic settings based on the fluorescent signals from the anti-LB1 and anti-H3K9me2 antibodies. DNA FISH foci were generated using the Spots tool with a 500–1000-nm diameter, created at the intensity mass center of the fluorescent probe signal. Distance from the center of the FISH focus to the inner and outer edge of the nuclear lamina surface was quantified using the Measurement Points tool. Thickness of the H3K9me2-marked peripheral heterochromatin layer was calculated as the distance between the inner and outer edge of the H3K9me2 surface, quantified using the Measurement Points tool with 5 random measurements per cell, from 10 independent cells per cell type (total of 50 measurements per cell type). Depth of H3K9me2 is reported as the maximum value of those 50 measurements. In cases when the signal from a FISH focus was embedded into the nuclear lamina layer, the measurement returned negative distances. Statistical significance of the differences in localization between T1-, T2-, and nonLAD foci (Fig. 4C) was calculated by a Kruskal-Wallis test with post hoc Dunn test for multiple comparisons. Statistical significance of differences in distribution between T1-, T2-, and nonLADs was calculated by a Kolmogorov-Smirnov test (Fig. 4D). Statistical significance of localization between additional T1- and T2-LAD foci for each cell type was calculated by a Mann-Whitney test.
ChIP, ChIP-qPCR, and ChIP-seq library preparation
ChIP
Undifferentiated ESCs and all differentiated cell types were crosslinked in culture by addition of methanol-free formaldehyde (Thermo Fisher, final 1% v/v) and incubated at room temperature for 10 min with gentle rotation. Crosslinking was quenched by addition of glycine (final 125mM) and incubated at room temperature for 5 min with gentle rotation. Media was discarded and replaced with PBS; cells were scraped and transferred to conical tubes and pelleted by centrifugation (250×g, 5 min at room temperature). Resulting pellets were flash frozen on dry ice and stored at −80°C. For ChIP, 30μL protein G magnetic beads (per ChIP sample; Dynal) were washed 3 times in blocking buffer (0.5% BSA in PBS); beads were resuspended in 250μL blocking buffer and 2μg antibody (Lamin B1: ab16048 [Abcam]; H3K9me2: ab1220 [Abcam]) and rotated at 4°C for at least 6 h. Crude nuclei were isolated from frozen crosslinked cells as follows: cell pellet (from 10cm plate) was resuspended in 10mL cold Lysis Buffer 1 (50mM HEPES-KOH pH7.5, 140mM NaCl, 1mM EDTA, 10% Glycerol, 0.5% NP-40, 0.25% Triton X-100, and protease inhibitors) and rotated at 4°C for 10 min, followed by centrifugation (250×g, 5 min at room temperature). Supernatant was discarded and the pellet was resuspended in 10mL cold Lysis Buffer 2 (10mM Tris-HCl pH 8.0, 200mM NaCl, 1mM EDTA, 0.5mM EGTA, and protease inhibitors) and rotated at room temperature for 10 min, followed by centrifugation (250×g, 5 min at room temperature). Supernatant was discarded and nuclei were resuspended/lysed in 1mL cold Lysis Buffer 3 (10mM Tris-HCl, pH 8.0, 100mM NaCl, 1mM EDTA, 0.5mM EGTA, 0.1% Na-Deoxycholate, and protease inhibitors) and transferred to pre-chilled 1-mL Covaris AFA tubes (Covaris). Samples were sonicated using a Covaris S220 sonicator (high cell chromatin shearing for 15 min; Covaris). Lysates were transferred to tubes and Triton X-100 was added (final 1%) followed by centrifugation (top speed, 10 min at 4°C in microcentrifuge). 
Supernatant was transferred to a new tube; protein concentration was measured by Bradford assay. Antibody-conjugated beads were washed 3 times in blocking buffer, resuspended in 50μL blocking buffer and added to 500μg input protein for overnight incubation with rotation at 4°C. Fifty micrograms lysate was aliquoted and stored at −20°C for input. On day 2, beads were washed 5 times in 1mL RIPA buffer (50mM HEPES-KOH pH 7.5, 500mM LiCl, 1mM EDTA, 1% NP-40, 0.7% Na-Deoxycholate) with 2-min incubation at room temperature with rotation for each wash. Beads were washed in 1mL final wash buffer (1×TE, 50mM NaCl) for 2 min with rotation at room temperature before final resuspension in 210μL elution buffer (50mM Tris-HCl pH 8.0, 10mM EDTA, 1% SDS). To elute, beads were incubated with agitation at 65°C for 30 min. 200μL eluate was removed to a fresh tube, and all samples (ChIP and reserved inputs) were reverse-crosslinked overnight at 65°C with agitation for a minimum of 12 h, but not more than 18 h. Two hundred microliters 1×TE was added to reverse-crosslinked DNA to dilute SDS, and samples were treated with RNaseA (final 0.2mg/mL RNase; 37°C for 2 h) and Proteinase K (final 0.2mg/mL Proteinase K; 55°C for 2 h) before phenol:chloroform extraction and resuspension in 10mM Tris-HCl pH 8.0. ChIP and input DNA was quantified by Qubit (Thermo Fisher).
ChIP-qPCR
After quantification, ChIP DNA from ESCs was diluted 1:5 and used for qPCR assessment across 20 independent T1-LAD, T2-LAD, nonLAD and T1-KDD, T2-KDD, and nonKDD regions (primer sequences in Additional file 9: Table 6). qPCR was performed in 10μL reactions in 384-well format with 2μL 1:5 diluted template, 2× Power SYBR mastermix (Thermo Fisher), and 0.1μM each forward and reverse primer. qPCR reactions were run for 40 cycles using standard conditions [3 min at 95°C; 40× (15 s at 95°C; 1 min at 60°C)] on a QuantStudio 5 or QuantStudio 7 qPCR machine (Applied Biosystems). For qPCR assessments, average enrichment (average Ct ChIP/average Ct input) was quantified per primer set.
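The enrichment metric above is a simple ratio of average Ct values. The following is a minimal sketch of that arithmetic, assuming replicate Ct values for one primer set are held in plain lists; the function name and the example Ct values are hypothetical and are not taken from the study.

```python
from statistics import mean


def ct_ratio_enrichment(chip_cts, input_cts):
    """Return average Ct (ChIP) divided by average Ct (input) for one primer set."""
    if not chip_cts or not input_cts:
        raise ValueError("need at least one Ct value for both ChIP and input")
    return mean(chip_cts) / mean(input_cts)


# Hypothetical replicate Ct values for a single primer set.
ratio = ct_ratio_enrichment([24.0, 26.0], [20.0, 20.0])
print(ratio)
```

Because a lower Ct reflects more starting template, a smaller ratio under this metric corresponds to more ChIP DNA relative to input.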
Library preparation
ChIP-seq libraries were prepared using the NEBNext Ultra II DNA library prep kit (NEB). Samples were indexed for multiplex sequencing. Library quality was analyzed by BioAnalyzer (Agilent Genomics) and quantified using qPCR (Kapa Biosystems or NEB). Libraries were pooled for multiplex sequencing, re-quantified, and sequenced on the Illumina NextSeq500 platform (vII; 75 bp single-end sequencing; Illumina).
RNA isolation and RNA-seq library preparation
Cells were scraped from tissue culture plates with 1×PBS and centrifuged at 1500g for 5 min at room temperature. After discarding supernatant, cell pellets were flash frozen in dry ice and stored at −80°C until processing. RNA was isolated using QIAGEN RNeasy total RNA extraction kit (QIAGEN). RNA quality was analyzed by BioAnalyzer; samples with RIN scores >8 were chosen for further processing. RNA libraries were prepared using the NEBNext Ultra II DNA Library Prep kit (NEB) with the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB) to enrich for poly-A-tailed RNA molecules. RNA-seq library quality was analyzed by BioAnalyzer (Agilent Genomics) and quantified using qPCR (Kapa Biosystems). Libraries were pooled for multiplex sequencing, re-quantified, and sequenced on the Illumina NextSeq500 platform (vII; 75 bp single-end sequencing; Illumina).
ChIP-seq/RNA-seq processing and computational analyses
ChIP-sequencing data processing for LAMIN B1 and H3K9me2
Adapters were trimmed using Trimmomatic [v0.39] [74]. Sequencing reads were aligned to human reference hg38 using BWA-MEM [v0.7.17] [75]. Aligned reads were converted to BAM and sorted using Samtools [v0.1.19] [76], with quality filter (“-F”) set to 1804. Duplicates were removed using Picard [v2.18.7] MarkDuplicates. Sequencing reads from the ENCODE blacklist were removed using BEDTools [v2.29.0] [76, 77]. Two biological replicates were analyzed for each cell type. Track views represent one biological replicate dataset. The data for cell types based on combined replicates adhere to ENCODE3 standards (Additional file 1: Table 1) [78].
Identification of LADs and KDDs
LB1 and H3K9me2 ChIP-seq signals were calculated and converted into BedGraph files using deepTools bamCompare [v3.3.2] [79] with 20-kb bins, using the signal extraction scaling method [80] for sample scaling, followed by quantile normalization between cell types to decrease the impact of batch effects. The bin size of 20kb was chosen based on assessment of the literature and the aim of describing LADs at as fine a resolution as possible [19, 55, 57]. HMMs were implemented for each cell type using pomegranate [v0.11.1] [81, 82]. Each HMM was initialized using a normal distribution and k-means with a uniform transition matrix and trained using the Baum-Welch algorithm. Each cell type-specific model was then applied to predict LAD or KDD state genome-wide per 20-kb bin, using the median value from both replicates for each bin, for each cell type individually, excluding regions in the ENCODE blacklist from consideration. For the LAD predictions, states were labeled as T1-LAD, T2-LAD, or nonLAD based on the median LB1 signal of the bins with that state label, with the highest median LB1 signal being assigned T1-LAD, the second highest T2-LAD, and the lowest nonLAD. The same strategy was employed to assign T1-, T2-, and nonKDDs. For validation, the HMM was repeated using single biological replicates and replicability was measured using the BEDTools “intersect” command.
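The state-labeling rule described above (rank HMM states by the median LB1 signal of their bins) can be sketched as follows. This is an illustration only, not the analysis code for the manuscript: the function name, the integer state identifiers, and the toy signal values are hypothetical, and the HMM fitting step itself is not shown.

```python
from collections import defaultdict
from statistics import median


def label_states_by_median(signal, states, labels=("T1-LAD", "T2-LAD", "nonLAD")):
    """Map arbitrary HMM state ids to labels, highest median signal first."""
    by_state = defaultdict(list)
    for value, state in zip(signal, states):
        by_state[state].append(value)
    if len(by_state) != len(labels):
        raise ValueError("number of observed states must match number of labels")
    # Rank state ids from highest to lowest median signal.
    ranked = sorted(by_state, key=lambda s: median(by_state[s]), reverse=True)
    mapping = dict(zip(ranked, labels))
    return [mapping[s] for s in states]


# Toy per-bin signal and hypothetical predicted state ids.
bin_signal = [2.1, 1.9, 0.4, 0.6, -1.2, -0.8]
bin_states = [2, 2, 0, 0, 1, 1]
bin_labels = label_states_by_median(bin_signal, bin_states)
print(bin_labels)
```

Passing a different `labels` tuple (for example T1-KDD, T2-KDD, nonKDD) applies the same ranking to H3K9me2 states.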
ChIP-seq analyses
LB1 occupancy at LAD boundaries was computed using the computeMatrix --referencePoint and plotProfile tools from deepTools [v3.3.2] [79]. Binned read counts for Additional file 2: Fig. S3B and Fig. S8E were generated using the deepTools “bamCoverage” tool with a minimum mapping score of 10. To determine which bins were “enriched” for LB1, the number of LB1 reads in each bin was scaled so that the total LB1 reads matched the total input reads; any bin for which the LB1 count was higher than the input count was classified as enriched. To label each bin with its LAD classification, the genome was first broken down into 10-kilobase regions that were labeled with their respective LAD classifications. Then, each bin’s starting coordinate was rounded down to the nearest 10,000; this number was used to look up the classification of that region and also assigned to the bin. In cases where neighboring bins were automatically merged by bamCompare because they had the same score, these bins were split so each was an equal size. Bins in blacklisted regions were not classified by the HMM and were excluded from these analyses. For LAD validation by overlap of narrow regions of LB1 enrichment, LB1 peaks in ESCs and CMs were called using epic2 (version 0.0.16) with paired replicate ChIP and Input bam files, and parameters “-fs 200 -bin 600 -g 4 -fdr 0.05.”
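The two bookkeeping steps above (scaling LB1 counts to the input total, and looking up a bin's LAD class by rounding its start coordinate down to the nearest 10,000) can be sketched as below. This assumes per-bin counts are already available as lists and that classifications are stored in a dictionary keyed by chromosome and region start; both data layouts, the function names, and the example values are hypothetical.

```python
def enriched_bins(lb1_counts, input_counts):
    """Flag bins whose LB1 count, scaled to the input total, exceeds the input count."""
    total_lb1 = sum(lb1_counts)
    if total_lb1 == 0:
        raise ValueError("LB1 counts sum to zero; cannot scale")
    scale = sum(input_counts) / total_lb1
    return [lb1 * scale > inp for lb1, inp in zip(lb1_counts, input_counts)]


def lad_class_for_bin(chrom, start, lookup, region_size=10_000):
    """Round the bin start down to the region grid and look up its class."""
    region_start = (start // region_size) * region_size
    return lookup.get((chrom, region_start))


# Toy counts: LB1 total is 100 and input total is 50, so LB1 is scaled by 0.5.
flags = enriched_bins([30, 10, 60], [20, 20, 10])
# Hypothetical lookup table of 10-kb regions and their classes.
classes = {("chr1", 20_000): "T1-LAD", ("chr1", 30_000): "nonLAD"}
bin_class = lad_class_for_bin("chr1", 25_300, classes)
print(flags, bin_class)
```

A bin that falls in a region absent from the lookup (for example a blacklisted region) returns `None` here, which makes it easy to exclude from downstream counts.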
RNA-sequencing analysis
Transcriptome data were quantified using Kallisto [v0.44.0] quant with fragment length determined by BioAnalyzer, standard deviation of 10, and 30 bootstraps, assigning reads using the Ensembl [v96] genome annotation [83]. TPM values were quantile-normalized between cell types. Differentially expressed transcripts (q≤0.01) between cell types were identified using Sleuth [0.30.0] [84].
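Quantile normalization between cell types forces each cell type's values onto a shared distribution. A minimal pure-Python sketch is given below, assuming one equal-length list of values per cell type; the function name and the toy values are hypothetical, ties are broken by position rather than averaged, and the implementation actually used for the study may differ.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length lists (one list per cell type)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    sorted_cols = [sorted(col) for col in columns]
    # Mean across columns of the values at each rank.
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy TPM-like values for two hypothetical cell types.
normed = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(normed)
```

After normalization every column contains the same set of values, differing only in which transcript holds which rank.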
ATAC-seq analysis
ATAC peaks from H9-derived cells [39] were downloaded as BED files from GEO and lifted over from hg19 to hg38, taking the intersection of two replicates for each cell type.
CTCF analysis
Bigwig files were downloaded from GEO (see Data Access section). CrossMap was used to lift over bigwigs from hg19 to hg38.
Hi-C analysis
Hi-C data for CMs and ESCs were downloaded as Cooler files from the 4D Nucleome Data Portal [40]. A and B compartments were called using cooltools [v0.3.0] [85].
Enrichment analyses
Odds ratios were calculated from two-by-two tables of counts of 20-kb genomic bins, cross-classifying overlap with a category (T1-LAD, T2-LAD, or KDD) against overlap with a domain of interest (replication timing domain, gene, transposable element, etc.). P-values were calculated by Fisher’s exact test.
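The odds ratio and Fisher's exact test for a two-by-two table of bin counts can be sketched with the standard library alone. The table layout (a = in category and in domain, b = in category only, c = in domain only, d = neither), the function names, and the counts are assumptions for illustration; this direct enumeration is only practical for small tables, and for genome-scale counts a library routine such as `scipy.stats.fisher_exact` would be the more realistic choice.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]]; b and c must be nonzero."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value via the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts of 20-kb bins.
or_value = odds_ratio(8, 2, 1, 5)
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(or_value, p_value)
```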
Comparison with single cell DamID
Single-cell DamID data from 172 KBM-7 cells from clone 5-5 was downloaded from the Gene Expression Omnibus (GSE68260) in hg19 and lifted over using pybedtools to hg38, removing regions that did not lift over. These data were intersected with T1- and T2-LADs using pybedtools [86].
Comparison with DamID-seq
Human DamID-seq data from hESC cells was downloaded from the 4D Nucleome Data Portal (Data set Identifier: 4DNESNFNTUAO) [87, 88]. To evaluate LB1 enrichment, the Dam-only data (Data set Identifier: 4DNFIUYAKRND) and LB1 sequencing (Data set Identifier: 4DNFIJDN1FW4) from the same biological sample were used to create a bedgraph file using the deepTools “bamCompare” function with 10-kb bins. Each bin was then annotated with its LAD classification using a custom Python script available with the analysis code for this manuscript (see below). Overlaps in LAD calls between ChIP-seq and DamID-seq datasets were calculated using the BEDTools “intersect” command.
Gene ontology enrichment analyses
Enriched gene ontology terms for genes located in invariant T1- or T2-LADs were identified using the HumanBase Modules tool against a background set of genes that fall in T1- or T2-LADs in at least one cell type.
AT content analysis
AT genomic content was calculated for either invariant T1-LADs or the whole genome using BEDTools nuc. For the former, coordinates for invariant T1-LADs across all 12 cell types (20-kb bins) were used in a merged bed file, and AT content was calculated for each invariant T1-LAD. For the whole genome, a bed file with chromosomal coordinates was used, and the median AT content was calculated across chromosomes.
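The underlying quantity is the fraction of A and T bases in each region, summarized by a median. The sketch below computes it directly from sequence strings rather than through BEDTools; the function name and the toy sequences are hypothetical, and excluding N bases from the denominator is an assumption that may not match how BEDTools nuc reports its percentages.

```python
from statistics import median


def at_fraction(seq):
    """Fraction of A/T among called bases (A, C, G, T); N and other symbols are ignored."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no called bases")
    return at / (at + gc)


# Toy sequences standing in for two regions.
fractions = [at_fraction("ATGCAATT"), at_fraction("aaNNgc")]
median_at = median(fractions)
print(fractions, median_at)
```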
Supporting analyses
Gene annotations used throughout are from Ensembl v96. The reference genome used was human hg38, downloaded from the UCSC Genome Browser. Constitutive late, constitutive early, and switch domains were obtained from [44]. They defined replication timing domains by their consistency across multiple cell types; thus, the same domains are used in each cell type in our analysis. Transposable elements from RepeatMasker were downloaded from the UCSC Genome Browser. Plotting, statistical analyses, and supporting analyses were conducted in Python [v3.6] with packages Jupyter, matplotlib [89], seaborn [90], upsetplot [90], scikit-learn [91], numpy [92], pybedtools [77, 86], Circos [93], and deepTools [v3.3.2] [79] and in R [v4.1.0] [94] with packages ggplot2 [95], dplyr [96], tidyverse [97], VennDiagram [98], and ggalluvial [99].
Violin and box plots
Boxes represent standard median (center dot or line) and interquartile range (25th to 75th percentile). Whiskers denote 1.5× interquartile range.
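The box and whisker quantities can be computed as in the sketch below. It assumes the common Tukey convention in which whiskers extend to the most extreme data points lying within 1.5× the interquartile range of the box edges, and it uses the "inclusive" quartile method from the Python standard library; plotting libraries may compute quartiles slightly differently. The function name and the data are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Return median, quartiles, whisker ends, and outliers for a box plot."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Toy data with one value beyond the upper fence.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```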