Block bootstrap for genomic ranges: bootRanges (nullranges package)

Performs a block bootstrap, optionally with respect to a genome segmentation. Returns a BootRanges object, which is a GRanges object with all the ranges from the bootstrap iterations concatenated.

bootRanges(
  y,
  blockLength,
  R = 1,
  seg = NULL,
  exclude = NULL,
  excludeOption = c("drop", "trim"),
  proportionLength = TRUE,
  type = c("bootstrap", "permute"),
  withinChrom = FALSE,
  storeBlockLength = FALSE
)

Arguments

y: the GRanges to bootstrap sample
blockLength: the length of the blocks (for proportional blocks, this is the maximal length of a block)
R: the number of bootstrap samples to generate
seg: the segmentation GRanges, with a column ("state") indicating segmentation state (optional)
exclude: the GRanges of excluded regions (optional)
excludeOption: whether to "drop" or "trim" bootstrap ranges that overlap an excluded region
proportionLength: for the segmented block bootstrap, whether to use scaled block lengths (scaling by the proportion of the segmentation state out of the total genome length). With scaling, the resulting blocks are shorter than blockLength
type: the type of null generation (un-segmented bootstrap only)
withinChrom: whether to re-sample (bootstrap) ranges across chromosomes (default) or only within chromosomes (un-segmented bootstrap only)
storeBlockLength: whether to save blockLength as a metadata column
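
As a rough illustration of how the optional arguments above fit together, the sketch below calls bootRanges once with a segmentation and once with an excluded region. All data here are hypothetical toy values (the ranges, the two-state segmentation, and the excluded interval are made up for illustration), and the calls use only the arguments listed in the usage above.

```r
library(GenomicRanges)
library(nullranges)

set.seed(1)

# Toy data (hypothetical): 100 short ranges on one 1000-bp chromosome
sl <- c(chr1 = 1000)
y <- GRanges("chr1", IRanges(seq(1, 991, by = 10), width = 5),
             seqlengths = sl)

# Hypothetical two-state segmentation; the "state" column is required
seg <- GRanges("chr1", IRanges(c(1, 501), c(500, 1000)),
               state = c(1L, 2L), seqlengths = sl)

# Hypothetical excluded region
exclude <- GRanges("chr1", IRanges(301, 350), seqlengths = sl)

# Segmented block bootstrap: 5 iterations, block lengths scaled per state
br_seg <- bootRanges(y, blockLength = 100, R = 5, seg = seg,
                     proportionLength = TRUE)

# Un-segmented bootstrap that drops ranges overlapping the excluded region
br_excl <- bootRanges(y, blockLength = 100, R = 5,
                      exclude = exclude, excludeOption = "drop")
```

With excludeOption = "drop", none of the returned ranges should overlap the excluded region; "trim" would instead shorten the overlapping ranges.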

Value

a BootRanges object (a GRanges object) containing the bootstrapped ranges, with the iteration and, when storeBlockLength = TRUE, the block length recorded as metadata columns
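
The returned object can be inspected like any other GRanges. The following is a minimal sketch on toy data (the same small input as in the Examples section); it makes no assumption about the metadata column names and simply reads them off the result.

```r
library(GenomicRanges)
library(nullranges)

set.seed(1)

# Toy input: five 5-bp ranges on a 50-bp chromosome
gr <- GRanges("chr1", IRanges(0:4 * 10 + 1, width = 5),
              seqlengths = c(chr1 = 50))

# Three bootstrap iterations, also storing the block length
br <- bootRanges(gr, blockLength = 10, R = 3, storeBlockLength = TRUE)

is(br, "GRanges")    # the result extends GRanges
colnames(mcols(br))  # names of the recorded metadata columns
length(br)           # total number of ranges across all iterations
```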

Details

Note that this function requires that the input ranges have associated seqlengths, and that these are not shorter than blockLength. See the Seqinfo, seqlevels, and keepStandardChromosomes functions and their use in the Quick Start section of the vignette.
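
One way to check the seqlengths requirement before bootstrapping is sketched below. The chromosome names, lengths, and the blockLength value are hypothetical; in practice the lengths would normally come from a genome's Seqinfo rather than being typed by hand.

```r
library(GenomicRanges)

# Hypothetical ranges created without any sequence information
gr <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(c(10, 60, 20), width = 5))
seqlengths(gr)  # NA for every chromosome at this point

# Attach (made-up) chromosome lengths, named by seqlevel
seqlengths(gr) <- c(chr1 = 100, chr2 = 80)

# Check the two conditions described above for a chosen block length
blockLength <- 50
stopifnot(!anyNA(seqlengths(gr)),
          all(seqlengths(gr) >= blockLength))
```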

References

bootRanges manuscript:

Wancen Mu, Eric S. Davis, Stuart Lee, Mikhail G. Dozmorov, Douglas H. Phanstiel, Michael I. Love. 2023. "bootRanges: Flexible generation of null sets of genomic ranges for hypothesis testing." Bioinformatics. doi: 10.1093/bioinformatics/btad190

Original method describing the segmented block bootstrap for genomic features:

Bickel, Peter J., Nathan Boley, James B. Brown, Haiyan Huang, and Nancy R. Zhang. 2010. "Subsampling Methods for Genomic Inference." The Annals of Applied Statistics 4 (4): 1660–97. doi: 10.1214/10-AOAS363

Examples


set.seed(1)
library(GenomicRanges)
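library(nullranges)
# Five 5-bp ranges on a 50-bp chromosome, starting at 1, 11, 21, 31, 41
gr <- GRanges("chr1", IRanges(0:4 * 10 + 1, width=5),
              seqlengths=c(chr1=50))
# One un-segmented block bootstrap sample using 10-bp blocks
br <- bootRanges(gr, blockLength=10)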