CAT allows you to easily compare your own called ChIP peaks with nearly all D. melanogaster, mouse, and human modENCODE/ENCODE ChIP peaks.

Comparison is performed using the Genomic Association Tester, which has been used to pre-compute the association between all chromatin proteins profiled in the ENCODE and modENCODE ChIP datasets.

Peaks that share a significant number of genomic binding sites score highly, making it easy to distinguish proteins with similar or identical genomic distributions from those with distinct ones.

How are ChIP association scores calculated?

Scores are calculated as:
Log2( observed overlap / expected overlap ) x -Log10( p-value ) x sensitivity coefficient

The sensitivity coefficient is what distinguishes overlapping ChIP peaks from highly similar ChIP peaks. It is determined from the percentage of total bases that overlap between the two ChIP tracks being tested.
If 100% of all bases in ChIP A overlap with 100% of all bases in ChIP B, then the sensitivity coefficient will be 4: log10(% of total A bases overlapping) x log10(% of total B bases overlapping) = log10(100) x log10(100) = 2 x 2.
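
The formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the CAT source code: the function and parameter names are made up for the example, and how CAT treats edge cases (a p-value of 0, or an overlap below 1% where log10 becomes negative) is not stated here, so the sketch simply rejects invalid input.

```python
import math


def sensitivity_coefficient(pct_a, pct_b):
    """log10(% of A's bases overlapping) x log10(% of B's bases overlapping).

    Percentages are on a 0-100 scale, so 100 and 100 give 2 x 2 = 4.
    """
    if not (0 < pct_a <= 100 and 0 < pct_b <= 100):
        raise ValueError("percent overlap must be in the range (0, 100]")
    return math.log10(pct_a) * math.log10(pct_b)


def association_score(observed, expected, p_value, pct_a, pct_b):
    """Log2(observed/expected) x -Log10(p-value) x sensitivity coefficient."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected overlap must be positive")
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the range (0, 1]")
    enrichment = math.log2(observed / expected)
    significance = -math.log10(p_value)
    return enrichment * significance * sensitivity_coefficient(pct_a, pct_b)


# Hypothetical inputs: 4-fold enrichment, p = 0.001, complete overlap both ways.
full_overlap = association_score(8, 2, 0.001, 100, 100)
# Same enrichment and p-value, but only 10% of track B's bases overlap.
partial_overlap = association_score(8, 2, 0.001, 100, 10)
```

With identical enrichment and p-value, the second call scores half as much as the first, because log10(10) = 1 instead of log10(100) = 2.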

For example, take the following four ChIP peak BED files, ChIP A, ChIP B, ChIP C, and ChIP D:
[Figure: track example explaining how the ChIP Peak Association Score is calculated]
Scoring for overlapping ChIP peaks will cluster ChIPs A, B, and C, since they all overlap highly with one another. This is likely the case for histone-modifying enzymes that deposit broad-peak histone marks.

Scoring for highly similar ChIP peaks will cluster ChIPs A and B, since they colocalise well (>90% of the total bases of each overlapping). This would be the case for, e.g., colocalising transcription co-factors.
ChIP C will still have a higher association score with A+B than ChIP D will, because the observed overlap of C with A/B is still high. However, since the percentage of C's total bases that overlap is low (e.g. <10%), the sensitivity coefficient will lower its association score relative to that observed between A and B when scoring for highly similar peaks.
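
To make the "percent of total bases" idea concrete, here is a small Python sketch that measures how many bases two tracks share. It is an illustration under stated assumptions rather than CAT's own method: the coordinates are invented, all peaks are taken to lie on one chromosome, and intervals follow the half-open BED convention.

```python
def merge(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_bases(intervals):
    """Number of distinct bases covered by the intervals."""
    return sum(end - start for start, end in merge(intervals))


def overlap_bases(a, b):
    """Number of bases covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def percent_overlap(a, b):
    """Percent of a's bases that are also covered by b."""
    return 100.0 * overlap_bases(a, b) / total_bases(a)


# Invented tracks: A and B are near-identical sharp peaks, C is one broad peak.
chip_a = [(100, 200), (500, 600)]
chip_b = [(105, 200), (500, 595)]
chip_c = [(0, 2500)]

a_in_b = percent_overlap(chip_a, chip_b)  # 95.0
a_in_c = percent_overlap(chip_a, chip_c)  # 100.0
c_in_a = percent_overlap(chip_c, chip_a)  # 8.0
```

In this toy case all of A lies within C, but only 8% of C's bases are shared with A, which is the situation in which the sensitivity coefficient pulls the score for C down.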

You can use this tool simply to extract the pre-computed modENCODE or ENCODE data, or you can input a BED-formatted file of your own ChIP data to compare its association with some or all of the ENCODE ChIP datasets.

High genomic association or "co-localisation" between pairs of ChIP data can highlight potential:

  •      functional complexes or co-factors acting on similar genes.
  •      antagonistic activators/repressors acting on similar genes.
  •      preference for a given chromatin-acting protein with one or more histone marks.
  •      co-occurring histone marks.

A note on user-supplied BED files...
  • Uploading your own BED file lets you compare the association of modENCODE ChIP peaks with the peaks or genomic regions in your file.
    • This takes approximately 10 minutes for comparison of 1000 ChIP peaks against 200 other factors, depending on the current server load.
  • The supplied BED file is usually a ChIP peak BED file which will compare "all ChIP peaks with all ChIP peaks".
    However, you may wish to test association of modENCODE ChIP peaks with individual genomic regions (e.g. genes).
    • If this is the case, you should check the option "Individual Genomic Regions", which will compare all the selected modENCODE ChIP peaks with each individual genomic region in the provided BED file.
  • Due to the time involved in calculating peak association, uploading your own ChIP peak BED file requires the input of a valid email address to which the results will be sent.
  • User-supplied ChIP peak BED files are limited to 5 ChIP peak BED files (unlimited regions), although if you require individual region testing (e.g. a gene BED file), then only 1 BED file is permitted, with a maximum of regions.

The CAT source code is available for download to host your own ChIP Association Tester :


Phenotype Search
Provide a list of genes and see if any have an allele presenting a given phenotype keyword.

If any of your genes have alleles that have the provided keyword in the phenotype description, the gene will show in the output along with the alleles showing that phenotype.
Convert Gene Symbol to FlyBase IDs
Convert gene symbols to current FlyBase IDs and vice versa.

Paste in your list of gene symbols or FBgn IDs and convert!
Genetic Interactions Search
Find if a given gene has any known genetic interactions with a list of any number of genes.

Enter your gene of interest as the main gene, paste in your list of genes to compare against (e.g. genes from a transcriptome analysis) and search.
Protein Interactions Search
Find if a given protein has any known protein interactions with a list of any number of proteins.

Enter your protein of interest (as a case-sensitive gene symbol) as the main protein, paste in your list of (case-sensitive) gene symbols to compare against, and search.
 
Protein Interacting Pairs
Find if there are any known protein-protein interacting pairs from a given list of proteins.

Paste in your list of (case-sensitive) gene symbols. If any of the proteins in the list are known to interact with any of the others, the interacting pair will be shown in a format compatible with Cytoscape.
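
The pair search amounts to checking every pair in your list against a table of known interactions. The Python sketch below shows that idea with placeholder gene names and a made-up interaction table. The exact Cytoscape-compatible format the tool emits is not specified here; Cytoscape's simple interaction format (SIF, "source relation target") is used as an assumption.

```python
from itertools import combinations


def interacting_pairs(symbols, known_interactions):
    """Return pairs from `symbols` that appear in `known_interactions`.

    Matching is case sensitive and ignores the order within a pair.
    """
    known = {frozenset(pair) for pair in known_interactions}
    unique = sorted(set(symbols))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if frozenset((a, b)) in known
    ]


def to_sif(pairs, relation="pp"):
    """Format pairs as tab-separated SIF lines ("pp" = protein-protein)."""
    return "\n".join(f"{a}\t{relation}\t{b}" for a, b in pairs)


# Placeholder data, not taken from any interaction database.
known = [("GeneB", "GeneA"), ("GeneC", "GeneD")]
query = ["GeneA", "GeneB", "GeneC", "genea"]

pairs = interacting_pairs(query, known)
sif_text = to_sif(pairs)
```

Here only GeneA and GeneB are reported: GeneC's partner is not in the query list, and "genea" does not match "GeneA" because symbols are compared case sensitively.
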
FlyBase ID to BED file
Create a genomic BED file from a list of FlyBase IDs.

Paste in your list of FBgn IDs and convert to BED-formatted genome positions.
Using the option "Convert FBgn's to gene symbol" will output the gene symbol in the BED file, rather than the FBgn ID.
Human Disease Models
Find if any given genes have alleles that are used as a model for human diseases.

Paste in your list of genes and find if any of them have alleles used as models of human diseases.
 

Convert Mouse UCSC IDs
Convert a list of Mouse UCSC gene IDs into their corresponding Gene Symbol, ENSEMBL ID, or RefSeq ID.
Convert Mouse RefSeq Accession
Convert a list of Mouse RefSeq (refGene) accession numbers into their corresponding Gene Symbol.
 

Convert Human UCSC IDs
Convert a list of Human UCSC gene IDs into their corresponding Gene Symbol, ENSEMBL ID, or RefSeq ID.
Convert Human RefSeq Accession
Convert a list of Human RefSeq (refGene) accession numbers into their corresponding Gene Symbol.
 

Basic search
A lightweight interface to PubMed.
Couple a search query with a list of keywords (e.g. your genes or proteins) and find out whether published literature already links the search term with each keyword.
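
One way to picture this coupling is as one PubMed query per keyword. The Python sketch below builds such queries as NCBI E-utilities ESearch URLs without sending them. How this tool actually talks to PubMed is not described here, so the use of E-utilities, the function names, and the example terms are all assumptions.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_term(search_term, keyword):
    """Combine the search term and one keyword with a boolean AND."""
    return f"({search_term}) AND ({keyword})"


def esearch_urls(search_term, keywords):
    """Map each keyword to an ESearch URL for the combined query."""
    urls = {}
    for keyword in keywords:
        params = {
            "db": "pubmed",
            "term": pubmed_term(search_term, keyword),
            "retmode": "json",
        }
        urls[keyword] = f"{ESEARCH_URL}?{urlencode(params)}"
    return urls


# Hypothetical search: one topic coupled with two gene symbols.
urls = esearch_urls("Polycomb", ["esc", "Pc"])
```

Fetching each URL would return the matching PubMed IDs for that keyword; a keyword with no hits has no literature linking it to the search term under this query.
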
Multiple Keyword Search
An advanced search tool for PubMed.
Enter your search term and retrieve an easy-to-view list of PubMed results.
 

Venn Diagrams
Find overlapping data through Venn diagrams.

Supports comparison of up to 5 different data sets, outputs PNG images, and provides the list of overlapping data.
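
The list of overlapping data behind a Venn diagram can be computed with plain set operations. The following Python sketch is an illustration only (the function name and the example gene lists are invented): for every combination of data sets it returns the items found in exactly those sets and no others, which corresponds to one region of the diagram.

```python
from itertools import combinations


def venn_regions(named_sets):
    """Map each combination of set names to the items exclusive to it."""
    sets = {name: set(items) for name, items in named_sets.items()}
    names = list(sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Items present in every set of this combination...
            inside = set.intersection(*(sets[n] for n in combo))
            # ...minus anything that also occurs in a set outside it.
            outside = set().union(*(sets[n] for n in names if n not in combo))
            exclusive = inside - outside
            if exclusive:
                regions[combo] = exclusive
    return regions


# Invented example lists.
data = {
    "A": ["g1", "g2", "g3"],
    "B": ["g2", "g3", "g4"],
    "C": ["g3", "g5"],
}
regions = venn_regions(data)
```

In this example g3 is assigned to the three-way region ("A", "B", "C") rather than to any pairwise region, and g2 is the only item shared by A and B alone.
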
Cut Data Columns
This is a lightweight version of the Galaxy Text manipulation > Cut tool.

Simply paste in data, select how the columns are delimited, and extract the column of your choice.
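
For readers who want the same operation in a script, a minimal Python equivalent might look like the following. It is a sketch, not the tool's implementation: the function name is hypothetical, columns are numbered from 1, and lines with too few fields yield an empty string.

```python
def cut_column(text, column, delimiter="\t"):
    """Return the 1-based `column` from every non-empty line of `text`."""
    if column < 1:
        raise ValueError("column numbers start at 1")
    values = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        values.append(fields[column - 1] if column <= len(fields) else "")
    return values


# Invented tab-delimited input.
pasted = "chr2L\t100\t200\nchr3R\t5\t50\n"
starts = cut_column(pasted, 2)
names = cut_column("a,b,c\nd,e", 3, delimiter=",")
```

The first call extracts the start coordinates ("100" and "5"); the second shows the comma delimiter and a short line, which produces an empty value.
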
Pad BED files
Pad a BED file with nucleotides up and/or down stream of the given genomic coordinates.

This tool works like the BEDTools slop command, but without the command line: it extends the coordinates in a BED file upstream and/or downstream, never going below the start or beyond the end of the chromosome.
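
The clamping behaviour can be sketched in Python as follows. This is an illustration rather than the tool's code: the function name, the `left`/`right` parameters, and the chromosome name and size are made up for the example, and strand is ignored, so "left" always means towards coordinate 0.

```python
def pad_bed(records, chrom_sizes, left=0, right=0):
    """Extend (chrom, start, end) records, clamped to the chromosome bounds."""
    padded = []
    for chrom, start, end in records:
        new_start = max(0, start - left)                 # never below 0
        new_end = min(chrom_sizes[chrom], end + right)   # never past the end
        padded.append((chrom, new_start, new_end))
    return padded


# Hypothetical 1,000 bp chromosome with one region near each end.
sizes = {"chrTest": 1000}
regions = [("chrTest", 50, 100), ("chrTest", 900, 980)]
padded = pad_bed(regions, sizes, left=100, right=100)
```

The first region is truncated at 0 rather than starting at -50, and the second stops at 1000 rather than 1080.
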
Colour Picker
A simple but effective way to obtain the RGB, HEX, and HSB values for a given colour.

Add colour to your custom UCSC genome tracks by selecting the RGB values and adding them to the custom track with the form color=R,G,B
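
The conversions involved are simple enough to sketch in Python with the standard library. The function names below are hypothetical and the rounding of HSB values to whole degrees and percentages is an assumption; HSB is treated as the same model as HSV.

```python
import colorsys


def colour_values(r, g, b):
    """Return the HEX string and (hue, saturation, brightness) for an RGB colour.

    Hue is in degrees (0-360); saturation and brightness are percentages.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("RGB channels must be between 0 and 255")
    hex_code = f"#{r:02X}{g:02X}{b:02X}"
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return hex_code, (round(h * 360), round(s * 100), round(v * 100))


def track_colour_attribute(r, g, b):
    """Format the colour for a UCSC custom track line: color=R,G,B."""
    return f"color={r},{g},{b}"


red_hex, red_hsb = colour_values(255, 0, 0)
blue_hex, blue_hsb = colour_values(0, 0, 255)
attribute = track_colour_attribute(0, 0, 255)
```

For pure blue this gives "#0000FF", an HSB of (240, 100, 100), and the attribute color=0,0,255 to add to the custom track.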