In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” refers to the fact that this implementation does not consider segmentation of the genome for sampling of blocks, see the segmented block bootstrap vignette for the alternative implementation.

Timing on DHS peaks

First we use the DNase hypersensitivity peaks in A549 downloaded from AnnotationHub, and pre-processed as described in the nullrangesOldData package.

library(nullrangesData)
dhs <- DHSA549Hg38()

The following chunk of code evaluates various types of bootstrap/permutation schemes, first within chromosome, and then across chromosome (the default). The default type is bootstrap, and the default for withinChrom is FALSE (bootstrapping with blocks moving across chromosomes).

set.seed(5) # reproducibility
library(microbenchmark)
blockLength <- 5e5
microbenchmark(
  list=alist(
    p_within=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=TRUE),
    b_within=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=TRUE),
    p_across=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=FALSE),
    b_across=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=FALSE)
  ), times=10)
## Unit: milliseconds
##      expr      min       lq     mean   median       uq       max neval
##  p_within 801.7949 816.5089 838.6304 832.7102 860.4622  900.7043    10
##  b_within 721.9391 746.7398 850.5913 765.0534 843.2826 1404.8636    10
##  p_across 187.9344 199.2289 347.5174 206.4586 337.8437  879.2543    10
##  b_across 208.9729 218.1591 352.4104 220.2693 241.6383  876.0171    10

Visualize on synthetic data

We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.

library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
              IRanges(start=c(1,101,121,201,
                              101,201,216,231,401,
                              1,101),
                      width=c(20, 5, 5, 30,
                              20, 5, 5, 5, 30,
                              80, 40)),
              seqlengths=c(chr1=300,chr2=450,chr3=200),
              chr=factor(seq_nms))

The following function uses functionality from plotgardener to plot the ranges. Note in the plotting helper function that chr will be used to color ranges by chromosome of origin.

suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
  pageCreate(width = 5, height = 2, xgrid = 0,
                ygrid = 0, showGuides = FALSE)
  for (i in seq_along(seqlevels(gr))) {
    chrom <- seqlevels(gr)[i]
    chromend <- seqlengths(gr)[[chrom]]
    suppressMessages({
      p <- pgParams(chromstart = 0, chromend = chromend,
                    x = 0.5, width = 4*chromend/500, height = 0.5,
                    at = seq(0, chromend, 50),
                    fill = colorby("chr", palette=palette.colors))
      prngs <- plotRanges(data = gr, params = p,
                          chrom = chrom,
                          y = 0.25 + (i-1)*.7,
                          just = c("left", "bottom"))
      annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
    })
  }
}
plotGRanges(gr)

Within chromosome

Visualizing two permutations of blocks within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
  plotGRanges(gr_prime)
}

Across chromosome

Visualizing two permutations of blocks across chromosome. Here we use larger blocks than previously.

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps across chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
  plotGRanges(gr_prime)
}

Session information

## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] plotgardener_1.3.9          microbenchmark_1.4.9       
##  [3] nullranges_1.3.5            nullrangesData_1.3.0       
##  [5] InteractionSet_1.25.0       SummarizedExperiment_1.27.1
##  [7] Biobase_2.57.1              MatrixGenerics_1.9.1       
##  [9] matrixStats_0.62.0          GenomicRanges_1.49.0       
## [11] GenomeInfoDb_1.33.3         IRanges_2.31.0             
## [13] S4Vectors_0.35.1            ExperimentHub_2.5.0        
## [15] AnnotationHub_3.5.0         BiocFileCache_2.5.0        
## [17] dbplyr_2.2.1                BiocGenerics_0.43.1        
## 
## loaded via a namespace (and not attached):
##   [1] colorspace_2.0-3              rjson_0.2.21                 
##   [3] ellipsis_0.3.2                ggridges_0.5.3               
##   [5] mclust_5.4.10                 rprojroot_2.0.3              
##   [7] XVector_0.37.0                fs_1.5.2                     
##   [9] bit64_4.0.5                   interactiveDisplayBase_1.35.0
##  [11] AnnotationDbi_1.59.1          fansi_1.0.3                  
##  [13] mvtnorm_1.1-3                 codetools_0.2-18             
##  [15] cachem_1.0.6                  knitr_1.39                   
##  [17] jsonlite_1.8.0                speedglm_0.3-4               
##  [19] Rsamtools_2.13.3              png_0.1-7                    
##  [21] shiny_1.7.1                   BiocManager_1.30.18          
##  [23] compiler_4.2.0                httr_1.4.3                   
##  [25] assertthat_0.2.1              Matrix_1.4-1                 
##  [27] fastmap_1.1.0                 cli_3.3.0                    
##  [29] later_1.3.0                   htmltools_0.5.2              
##  [31] tools_4.2.0                   gtable_0.3.0                 
##  [33] glue_1.6.2                    GenomeInfoDbData_1.2.8       
##  [35] dplyr_1.0.9                   rappdirs_0.3.3               
##  [37] Rcpp_1.0.8.3                  jquerylib_0.1.4              
##  [39] pkgdown_2.0.4                 vctrs_0.4.1                  
##  [41] Biostrings_2.65.1             strawr_0.0.9                 
##  [43] rtracklayer_1.57.0            xfun_0.31                    
##  [45] stringr_1.4.0                 plyranges_1.17.0             
##  [47] mime_0.12                     lifecycle_1.0.1              
##  [49] restfulr_0.0.15               XML_3.99-0.10                
##  [51] zlibbioc_1.43.0               MASS_7.3-57                  
##  [53] scales_1.2.0                  ragg_1.2.2                   
##  [55] promises_1.2.0.1              parallel_4.2.0               
##  [57] RColorBrewer_1.1-3            yaml_2.3.5                   
##  [59] curl_4.3.2                    memoise_2.0.1                
##  [61] ggplot2_3.3.6                 yulab.utils_0.0.5            
##  [63] sass_0.4.1                    stringi_1.7.6                
##  [65] RSQLite_2.2.14                highr_0.9                    
##  [67] BiocVersion_3.16.0            BiocIO_1.7.1                 
##  [69] desc_1.4.1                    filelock_1.0.2               
##  [71] BiocParallel_1.31.12          rlang_1.0.3                  
##  [73] pkgconfig_2.0.3               systemfonts_1.0.4            
##  [75] bitops_1.0-7                  pracma_2.3.8                 
##  [77] evaluate_0.15                 lattice_0.20-45              
##  [79] purrr_0.3.4                   GenomicAlignments_1.33.1     
##  [81] ks_1.13.5                     bit_4.0.4                    
##  [83] tidyselect_1.1.2              plyr_1.8.7                   
##  [85] magrittr_2.0.3                R6_2.5.1                     
##  [87] generics_0.1.2                DelayedArray_0.23.1          
##  [89] DBI_1.1.3                     pillar_1.7.0                 
##  [91] KEGGREST_1.37.3               RCurl_1.98-1.7               
##  [93] tibble_3.1.7                  crayon_1.5.1                 
##  [95] KernSmooth_2.23-20            utf8_1.2.2                   
##  [97] rmarkdown_2.14                grid_4.2.0                   
##  [99] data.table_1.14.2             blob_1.2.3                   
## [101] digest_0.6.29                 xtable_1.8-4                 
## [103] httpuv_1.6.5                  gridGraphics_0.5-1           
## [105] textshaping_0.3.6             munsell_0.5.0                
## [107] ggplotify_0.1.0               bslib_0.3.1