### Brief introduction of the Stairway Plot method

The flexible multi-epoch model used in the skyline plot method [13, 14] is implemented for the Stairway Plot, which divides time into a series of blocks with each block starting and ending at the exact time of a particular coalescent event in the sampled sequences. The population size is assumed to remain constant within each block and to be able to change from one block to the next. A maximum of *n* − 1 time blocks can be defined given a sample of *n* DNA sequences, with block *k* corresponding to the *k*-coalescent time. Those *n* − 1 time blocks can be approximated to any demographic history. The Stairway Plot estimates a series of *θ*_{k}, *k* = 2, 3, …, *n*, maximizing the likelihood of the observed SFS. *θ*_{k} = 4*N*_{k}*μ*, where *N*_{k} is the effective size of the population during time block *k*, and *μ* is the mutation rate per bp per generation. In practice, adjacent blocks of time can be fused into one block to reduce the parameters to be estimated. More details of the algorithm can be found in [5].

One of the major improvements for Stairway Plot 2 is removing the requirement of polarizing SNPs by modeling folded SFS, and better model selection by using an unsupervised learning strategy. The major challenges are (1) whether the Stairway Plot framework will work with half the number of observations with folded SFS and (2) whether the loss of information can be compensated by better model selection strategy. The results showed that the Stairway Plot framework works with folded SFS and performs well with the new model selection strategy. This is partially contributed to the ensemble step: although each individual estimation can be coarse (fewer epochs), the ensemble estimation can be smooth and accurate.

### Brief introduction to Stairway Plot 2

Let *t*_{k} be the *k*-coalescent time of a random sample of *n* sequences, *N*_{k} is the effective size of the population during *t*_{k}, *θ*_{k} = 4*N*_{k}*μ*, and *μ* is the mutation rate per bp per generation. *θ*_{k} s are estimated for each of the *B* (default is 200) sub-samples of the SFS instead of bootstrap samples as in the Stairway Plot 1 [5]. In Stairway Plot 2, the effective population size trajectory is calculated for each SFS sub-sample, that is \( {N}_e^b(T)={\theta}_k^b/\left(4\mu \right) \) if \( {T}_i^b<T\le {T}_{i-1}^b \), where \( {\theta}_k^b \) is the *θ*_{k} estimation based on the sub-sample *b*, and \( {T}_i^b=\sum \limits_{k=i}^n\frac{\theta_k^b}{k\left(k-1\right)},i=2,3,\dots, n. \)

Then at each time point *T*, the median of a total of *B* estimates of the effective population size \( {N}_e^b(T) \) is used as the final estimate of *N*_{e} at *T* [9].

### Composite likelihood of folded SFS

Composite likelihood of the observed SFS was calculated as:

\( {L}_n={l}_n!\prod \limits_{i=0}^{n/2}\frac{p_i^{\eta_i}}{\eta_i!}, \)

where *n* is the sample size (number of haploids), *η*_{i} is the count of observed sites with a minor allele count of *i*, *p*_{i} is the frequency of *η*_{i} in the samples, and \( {l}_n=\sum \limits_{i=0}^{n-1}{\eta}_i \). This likelihood is calculated for both the training purpose (with training data) and testing/evaluating purpose (with testing data).

### SFS subsampling

Let *l*_{n} be the total number of sites observed, as defined above, where *n* is the sample size (number of haploids). A number \( {l}_n^{\prime } \) (by default \( {l}_n^{\prime }=2/3{l}_n \)) sites are randomly sampled from *l*_{n} sites and used as training data. The remaining \( {l}_n-{l}_n^{\prime } \) sites are used as testing data. SFSs, either folded or unfolded, can be obtained by summing the SNPs of a given ancestral allele count (unfolded) or minor allele count (folded).

### Constraint on “breakpoints”

For a sample of size *n*, a maximum of *n −* 1 different *θ*s that can be estimated. In an ordered serial of *θ*_{2}, *θ*_{3}, …, *θ*_{n}, “breakpoints” are inserted into the serial that separates the *θ*s into continuous groups. Any two consecutive *θ*s that are not separated by a breakpoint belong to the same group. The *θ* s within the same group have the same value, while those belonging to different groups may have different values. Therefore, there are *n* − 2 possible breakpoints that can be inserted. The actual number of breakpoints to be inserted into the serial is defined by the “blueprint” file. By default, four numbers approximately equal \( \frac{1}{4}\times \left(n-2\right) \), \( \frac{1}{2}\times \left(n-2\right) \), \( \frac{3}{4}\times \left(n-2\right) \), and *n* − 2 are used. Given a number *m*, for each training SFS, from the full set of breakpoints (i.e., 1, 2, …, *n* − 2) *m* of them are randomly picked. The best grouping of *θ*s fitting the training SFS follows the same procedure described in the Stairway Plot 1 paper [5] with the constraint that the actual breakpoints must be chosen from the *m* break points.

### Determine the best number of “breakpoints”

For *m* breakpoints defined above, the best estimations of *θ* s are obtained for each training SFS using the procedure described in the Stairway Plot 1 paper [5]. Then, the likelihood of this set of *θ*s using the corresponding testing SFS is calculated and used as the measurement of goodness-of-fit of those *θ*s. The average goodness-of-fit of a total of *B* testing SFS is used for the overall goodness-of-fit of using *m* break points, *G*_{m}. With a set of *m*, the best *m* is the one with the largest *G*_{m}. In practice, considering the variation of *G*_{m}, the best *m* is picked as the smallest number *m* that satisfies \( {G}_m>{G}_{m^{\prime }}+1.92 \) for all *m*^{′} < *m*, where *m* and *m*′ are both from the set of *m*.

### Simulation

SNP data were simulated using the ms [22] or MaCS [23] (Markovian Coalescent Simulator) programs. If not specified, all SNPs were simulated assuming a mutation rate (*μ*) of 1.2 × 10^{−8} per base pair per generation, a recombination rate of *ρ* = 1.2 × 10^{−8} per base pair per generation, and a generation time of 24 years. Simulation commands used for producing the data used in Additional file 1: Fig. S1 can be found in the Supplementary Note of the Stairway Plot 1 paper [5]. Other simulation commands are listed below.

zig-zag model: for /L %%i in (1, 1, 10) do (ms 200 200 -t 7156.0 -r 7156.0 10000000 -eN 0 5 -eG 0.000582262 1318.18 -eG 0.00232905 -329.546 -eG 0.00931619 82.3865 -eG 0.0372648 -20.5966 -eG 0.149059 5.14916 -eN 0.596236 0.5 >zig-zag-10M-%%i.out)

sharpCEU model: macs 200 30000000 -i 200 -t 0.0007156 -r 0.0007156 -eN 0.0 10.8300726663 -eN 0.00116452394261 1.08300726663 -eN 0.0174678591392 0.216601453326 -eN 0.0465809577045 1.08300726663 -eN 0.0873392956959 3.24902179989 -eN 0.232904788522 1.08300726663 2>/dev/null >sharpCEU.macs.out

sharpYRI model: macs 200 30000000 -i 200 -t 0.001 -r 0.001 -eN 0.0 8.25 -eN 0.0025 0.825 -eN 0.0416666666667 2.475 -eN 0.166666666667 0.825 2>/dev/null >sharpYRI.macs.out

### SFS of the Finnish individuals

The gnomAD project whole genomes sites and allele frequencies of the Finnish individuals were downloaded from http://gnomad.broadinstitute.org/downloads. A total of 650,351,035 likely neutral sites that are 50 kb away from any known coding genes yet within the 1000 Genomes Project phase 1 [24] strict mask were used for analyses [5].

### Parameters used in PSMC, SMC++, MSMC, MSMC2, Stairway Plot 1, and Stairway Plot 2

The PSMC estimations were conducted using the default parameters tuned for human populations: -N25 -t15 -r5 -p “4+25*2+4+6”. The composite likelihood with all individuals in the sample was used in PSMC. For SMC++, as suggested by its readme file, the composite likelihood using 10 distinguished individuals was used. The parameters “--regularization-penalty 5.0 --knots 16 --timepoints 35 100000” were used for SMC++ as suggested [6]. MSMC and MSMC2 used parameters “--skipAmbiguous”, and “-r 1” since we know the simulated recombination rate equals to the mutation rate. MSMC also used parameter “--fixedRecombination” as recommended by the authors [25]. For Stairway Plot 1 and 2, the default parameters were used.