Background

  • Depth of coverage (DOC), off-target CNV approaches have been applied to exome and ChIP-seq data [PMC9039557][PMC4396974].
  • DOC-off-target methods require filtering out peak regions when using non-input immunoprecipitation (IP) samples to avoid off-target binding signals.
  • The peak calling outcomes are highly dependent on the specific algorithm used.
  • Most ChIP-seq and ATAC-seq peak callers are not designed for detecting complex, non-symmetric peak patterns.
    • For instance, H3K4me3 peaks are typically sharply localized, while H3K4me1/3 peaks span broader domains.
    • H3K27ac marks both large regions, such as super-enhancers, and smaller, discrete regions like promoters, exhibiting both broad and narrow peak characteristics [35788238][ref..].
    • Pol2 ..
  • There are some CNV tools for HiC
    • LOIC [PMC6127909]
    • HiNT [PMC7087379]
    • HiCnv and OneD [ref]
  • No CNV tools for HiChIP data

Example of CNV calls on MG63.3 H3K27ac at Chr6 Regions image

Problem

Different CNVs with Different Data

HiChIP(black), ChIP-seq(blue), and ChIP-seq Input(Gold) image

Different Peaks with Different Algorithms

  • CopywriteR : non-parameteric, FDR-base, expand peaks within segment boundaries
  • HOMER : Simple poisson model, 4-fold greater than in the surrounding 10 kb region, The maximum distance used to stitch peaks together
  • MACS2 : Dynamic poisson parameters (λlocal = max(λBG, λ1k, λ5k, λ10k), bad at expantion

CNV score is measured after filtering peaks.

  • Peaks by CopywriteR (top), HOMER(p-53)(mid), MACS2(p-4)(bottom) image

Methods

To make the peak caller smarter

  • Understand CopywriteR Model

  • Modify the CopywriteR algorithm (reduce peak expantion)
  • Peak Expantion Algorithm: fullcode
    345                 retest.peak.ranges <- apply(test, 1, function(x) {
    346                     left.lower.boundary <- max(0, (as.integer(x["start"]) - (resolution + 1)))
    347                     left.higher.boundary <- max(0, (as.integer(x["start"]) - 1))
    348                     right.lower.boundary <- min(chromosomes[selection],
    349                                                 (as.integer(x["end"]) + 1))
    350                     right.higher.boundary <- min(chromosomes[selection],
    351                                                  (as.integer(x["end"]) + (resolution + 1)))
    352                     left.peakCutoff <- ceiling(.peakCutoff(cov.chr[left.lower.boundary:left.higher.boundary],fdr.cutoff=FDRT))
    353                     right.peakCutoff <- ceiling(.peakCutoff(cov.chr[right.lower.boundary:right.higher.boundary],fdr.cutoff=FDRT))
    

    The original algorithm uses FDR=0.1 for adding peaks Applied stringent FDRT=0.0001

Results

  • Recovered CNV calling at 5’ body of SCIRT image

Discussion

  • Loops are good indicators CNV? image

  • New model \(log(C) = \beta_0( GCcontent ) + \beta_1( Mappability ) + \beta_2(3Dcontact) \epsilon\)

  • Other examples image image