Background

Focal amplifications that include both enhancers and their target oncogenes have been observed in many cancers, for example EGFR in glioblastoma, MYC in group 3 medulloblastoma, and MYCN in both neuroblastoma and Wilms tumor cell2019scacheri.
- H3K27ac ChIP-seq, ATAC-seq, POLR2A ChIP-seq, and RNA-seq signals at two EGFR enhancers for four glioblastoma lines (GBM3565, GBM3094, GSC23, G459)
- used HiChIP: GSE73865 (O’Brien et al., 2016), GSE90683 (Boeva et al., 2017)
These oncogenes were co-amplified with super-enhancers, not only in contiguous regions but also in more complex, non-contiguous amplicons. On the linear reference genome, such amplicons are split into cis and trans loci associated with the oncogenes, consistent with a role for extrachromosomal DNAs (ecDNAs).
These regulatory elements are maintained and evolve within cells in a circular form, referred to as extrachromosomal circular DNA cell2013korbel.
Bioinformatics tools for analyzing whole-genome sequencing (WGS) data vary in performance depending on their underlying assumptions and the quality of the input data 38746056,39209966.

Methods

Convert contacts to network
Assortativity (https://networkx.org/nx-guides/content/algorithms/assortativity/correlation.html)
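
A minimal sketch of this step, assuming the contacts are available as a symmetric bin-by-bin count matrix. The function name `contacts_to_graph`, the `min_count` threshold, and the toy matrix are illustrative assumptions and do not come from any dataset or tool cited in these notes; `nx.degree_assortativity_coefficient` is the NetworkX function covered by the guide linked above.

```python
import networkx as nx
import numpy as np


def contacts_to_graph(contacts, bin_labels=None, min_count=1):
    """Build an undirected weighted graph from a symmetric contact matrix.

    Hypothetical helper: bins become nodes, and every off-diagonal entry
    with at least `min_count` contacts becomes an edge weighted by the count.
    """
    contacts = np.asarray(contacts, dtype=float)
    n = contacts.shape[0]
    if contacts.shape != (n, n):
        raise ValueError("contacts must be a square matrix")
    labels = list(bin_labels) if bin_labels is not None else list(range(n))

    graph = nx.Graph()
    graph.add_nodes_from(labels)
    # Only the upper triangle is read, so each bin pair is added once
    # and self-contacts on the diagonal are ignored.
    upper = np.triu(contacts, k=1)
    for i, j in np.argwhere(upper >= min_count):
        graph.add_edge(labels[i], labels[j], weight=float(upper[i, j]))
    return graph


# Toy example (made-up counts): bin 0 contacts every other bin,
# and the other bins do not contact each other.
toy = np.array([
    [0, 5, 3, 2],
    [5, 0, 0, 0],
    [3, 0, 0, 0],
    [2, 0, 0, 0],
])
graph = contacts_to_graph(toy)
r = nx.degree_assortativity_coefficient(graph)
print(graph.number_of_edges(), r)
```

A hub-and-spoke pattern such as the toy matrix is strongly disassortative (high-degree bins connect to low-degree bins), which is the kind of signal the assortativity coefficient summarizes in one number.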

HiNT: Gini ranking 32293513
- github: https://github.com/parklab/HiNT
Developed App go
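
The Gini ranking can be written out as follows. This is a reading of the HiNT code listed later in these notes, not a formula quoted from the HiNT publication, and the symbols $M_c$, $B_c$, $r_k$, $m_c$, and $N$ are notation introduced here. For chromosome pair $c$, $M_c$ is the observed contact matrix and $B_c$ the background matrix after rows and columns that are empty in either one have been removed, $\overline{\,\cdot\,}$ is the mean over all entries, the division is element-wise, $r_1, \dots, r_n$ are the entries of $R_c$, and $\operatorname{rank}$ is the ascending rank (ties averaged) among all $N$ chromosome pairs.

```latex
\begin{align}
R_c &= \frac{M_c / \overline{M_c}}{B_c / \overline{B_c}}, \\
G_c &= \frac{1}{2\,\bar{r}\,n^{2}} \sum_{k=1}^{n} \sum_{l=1}^{n} \lvert r_k - r_l \rvert, \\
m_c &= \max_{k} r_k, \\
\mathrm{RP}_c &= \frac{N - \operatorname{rank}(G_c)}{N} \cdot \frac{N - \operatorname{rank}(m_c)}{N}.
\end{align}
```

Chromosome pairs are then sorted by increasing $\mathrm{RP}_c$, so pairs with both a high Gini index and a high maximum ratio come first.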

Public Datasets

Database 35388171
ecDNA HiChIP datasets 31748743.
In a MYC-amplified colorectal cancer cell line, ecDNA hubs are tethered by the BET protein BRD4 34819668.
HiChIP datasets from SNU16 cells (amplified for MYC and FGFR2) 31748743.

Previous Results

Results

Methods

Code Analysis

The HiNT source code (https://github.com/parklab/HiNT):

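# Imports required by the functions below.
import os
from operator import itemgetter

import numpy as np
from scipy.stats import rankdata

def gini(x):
    # Warning: this is a concise implementation, but it is O(n**2)
    # in time and memory, where n = len(x). Do not pass in huge
    # samples.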

    # Mean absolute difference
    mad = np.nanmean(np.abs(np.subtract.outer(x, x)))
    # Relative mean absolute difference
    rmad = mad/np.nanmean(x)
    # Gini coefficient
    g = 0.5 * rmad
    return g

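def getGini(mat1,mat2):
    # Load the observed (mat1) and background (mat2) contact matrices
    # for one chromosome pair from tab-delimited text files.
    matrix1 = np.genfromtxt(mat1,delimiter="\t")
    matrix2 = np.genfromtxt(mat2,delimiter="\t")
    # Replace NaN and +/-inf entries with 0.
    matrix1[~np.isfinite(matrix1)] = 0
    matrix2[~np.isfinite(matrix2)] = 0
    rowsum1 = np.sum(matrix1,axis=1)
    rowsum2 = np.sum(matrix2,axis=1)
    colsum1 = np.sum(matrix1,axis=0)
    colsum2 = np.sum(matrix2,axis=0)
    # Indices of rows and columns that sum to zero in either matrix.
    ridx1 = np.where(rowsum1==0)
    cidx1 = np.where(colsum1==0)
    ridx2 = np.where(rowsum2==0)
    cidx2 = np.where(colsum2==0)
    ridx = np.union1d(ridx1[0], ridx2[0])
    cidx = np.union1d(cidx1[0], cidx2[0])

    # Drop those rows and columns from both matrices so they stay aligned.
    temp1 = np.delete(matrix1,ridx,0)
    temp2 = np.delete(matrix2,ridx,0)
    selectedData1 = np.delete(temp1,cidx,1)
    selectedData2 = np.delete(temp2,cidx,1)

    # Scale each matrix by its own mean, then take the element-wise
    # observed/background ratio. Zero background entries give inf or NaN.
    average1 = np.mean(selectedData1)
    average2 = np.mean(selectedData2)
    tm1 = np.divide(selectedData1,average1)
    tm2 = np.divide(selectedData2,average2)
    division = np.divide(tm1,tm2)
    # Summarize the flattened ratio matrix; NaN entries are ignored
    # by nanmean and nanmax, inf entries are not.
    giniIndex = gini(np.asarray(division).reshape(-1))
    maximum = np.nanmax(np.asarray(division).reshape(-1))

    return giniIndex,maximum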

def getRankProduct(matrix1MbInfo,background1MbInfo,outdir,name):
    rpout = os.path.join(outdir,name + '_chrompairs_rankProduct.txt')
    ginis = []
    maximums = []
    chrompairs = []
    # Score every chromosome pair against its background matrix.
    for chrompair in matrix1MbInfo:
        matrix1 = matrix1MbInfo[chrompair]
        matrix2 = background1MbInfo[chrompair]
        giniIndex,maximum = getGini(matrix1,matrix2)
        chrompairs.append(chrompair)
        ginis.append(giniIndex)
        maximums.append(maximum)
    # Descending 0-based ranks: the largest value gets rank 0.
    rankgini = len(ginis) - rankdata(ginis)
    rankmaximum = len(maximums) - rankdata(maximums)
    rps = (np.divide(rankgini,len(ginis)*1.0))*(np.divide(rankmaximum,len(maximums)*1.0))
    # Sort on the numeric rank product. Stacking names and numbers into one
    # NumPy array casts everything to strings, so the sort becomes lexicographic.
    sortedResult = sorted(zip(chrompairs,ginis,maximums,rps), key=itemgetter(-1))
    with open(rpout,'w') as outf:
        outf.write('\t'.join(['ChromPair',"GiniIndex","Maximum","RankProduct"]) + '\n')
        # The loop variable is named giniIndex so it does not shadow gini().
        for chrompair, giniIndex, maximum, rp in sortedResult:
            newres = [str(chrompair), str(giniIndex), str(maximum), str(rp)]
            outf.write('\t'.join(newres) + '\n')

    return rpout
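
The warning in `gini` points out the O(n**2) cost of the pairwise version. One possible alternative, sketched here under the assumption that the input contains no infinite values, sorts the data once and uses the equivalent closed form of the mean absolute difference. The names `gini_pairwise` and `gini_sorted` are illustrative and are not part of HiNT.

```python
import numpy as np


def gini_pairwise(x):
    """Reference version: same arithmetic as gini() above, O(n**2)."""
    x = np.asarray(x, dtype=float)
    mad = np.nanmean(np.abs(np.subtract.outer(x, x)))
    return 0.5 * mad / np.nanmean(x)


def gini_sorted(x):
    """Sketch of an O(n log n) equivalent based on sorted values.

    For ascending x_1..x_n the Gini coefficient equals
    2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n.
    NaN entries are dropped first, which matches how the pairwise
    version ignores them through nanmean.
    """
    x = np.asarray(x, dtype=float).ravel()
    x = np.sort(x[~np.isnan(x)])
    n = x.size
    if n == 0:
        return float("nan")
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n)


# Both versions should agree on a small sample.
sample = np.array([1.0, 2.0, 3.0, 4.0])
print(gini_pairwise(sample), gini_sorted(sample))
```

The sorted form needs memory proportional to n instead of n**2, which matters when the flattened ratio matrix of a chromosome pair has many entries.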