Motif Resources/Predictors | ||
ELM | [26] | |
To explore candidate functional sites in proteins and to learn about known motifs | ||
MiniMotif Miner | [88] | |
To analyse protein queries for the presence of short contiguous peptide motifs that have a known function in at least one other protein | ||
Scansite | [89] | |
To identify short protein sequence motifs that are recognized by modular signalling domains, phosphorylated by protein Ser/Thr- or Tyr-kinases or mediate specific interactions with proteins or phospholipids | ||
PePSite | [90] | |
To predict binding of a given peptide to a protein structure | ||
Motif Discovery | ||
DILIMOT | [39] | |
To find short, over-represented peptide patterns (linear motifs) in a set of proteins | ||
SLiMFinder | [91] | |
To find novel, significantly over-represented, short protein motifs | ||
Sequence Retrieval/Analysis | ||
BLAST | ||
To identify regions of local similarity between nucleotide or protein sequences, which can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families | ||
BioMART | [93] | |
Provides free software and data services to foster scientific collaboration and facilitate the scientific discovery process; the project adheres to the open source philosophy that promotes collaboration and code reuse | ||
Alignment | ||
Clustal | http://www.clustal.org/omega http://www.ebi.ac.uk/Tools/msa/clustalo | |
General purpose DNA or protein multiple sequence alignment program | ||
MAFFT | [95] | |
Multiple alignment program for amino acid or nucleotide sequences | ||
Jalview | [48] | |
Lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment | ||
Phylogenetic Tree/Orthology | ||
TreeFam | [96] | |
Database composed of phylogenetic trees inferred from animal genomes, providing orthology/paralogy predictions as well as the evolutionary history of genes | ||
EggNog | [97] | |
Database of orthologous groups of genes annotated with functional categories derived from COG/KOG categories | ||
COG | [98] | |
Database providing phylogenetic classification of proteins encoded in complete genomes | ||
Motif Conservation | ||
Conscore | [63] | |
Linear motif conservation filter | ||
Consurf | [99] | |
To identify functional regions in proteins | ||
SLiMPrints | http://bioware.ucd.ie/~compass/biowareweb/Server_pages/slimprints.php | [41] |
De novo motif discovery tool to identify relatively over-constrained proximal groupings of residues within intrinsically disordered regions, indicative of a putatively functional motif | ||
Protein Domains | ||
SMART | [52] | |
To identify and annotate genetically mobile domains and to analyse domain architectures | ||
PFAM | [51] | |
Database providing a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models | ||
InterPro | [53] | |
To classify sequences into protein families and to predict the presence of important domains and sites | ||
Structure/Disorder | ||
PDB | [55] | |
Single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids | ||
PDBsum | [100] | |
Pictorial database providing an at-a-glance overview of the contents of each 3D structure deposited in PDB | ||
IUPred | [54] | |
To predict intrinsically unstructured regions in proteins | ||
D2P2 | [101] | |
Community resource, providing pre-computed disorder predictions on a large library of proteins from completely-sequenced genomes | ||
MobiDB | [102] | |
Centralized resource for annotations of intrinsic protein disorder | ||
DISPROT | [103] | |
Database providing information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part | ||
Protein-Protein Interactions | ||
BioGRID | [104] | |
Online interaction repository with data compiled through comprehensive curation efforts | ||
STRING | [57] | |
Provides known and predicted protein-protein interactions | ||
IntAct | [105] | |
Freely available, open source database system and analysis tools for molecular interaction data; all interactions are derived from literature curation or direct user submissions | ||
PiSITE | [106] | |
Web-based database of protein interaction sites, providing information on interaction sites of a protein from multiple PDB entries | ||
DOMINO | [107] | |
Database of domain-peptide interactions | ||
ComPPI | [108] | |
Cellular compartment-specific database for protein-protein interaction network analysis | ||
iELM | [109] | |
Web server to explore short linear motif-mediated interactions | ||
KEGG | [110] | |
Database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies | ||
CORUM | [56] | |
Collection of experimentally verified mammalian protein complexes | ||
Subcellular Localization | ||
CELLO2GO | [59] | |
Web server for protein subcellular localization prediction with functional gene ontology annotation | ||
LocDB | [111] | |
Database that collects experimental annotations for the subcellular localization of proteins in Homo sapiens and Arabidopsis thaliana | ||
GeneOntology | [112] | |
Collaborative effort to address the need for consistent descriptions of gene products across databases | ||
Compartments | [113] | |
Database of protein subcellular localization data manually curated from the literature or obtained from high-throughput microscopy-based screens | ||
LOCATE | [114] | |
Curated database providing data that describe the membrane organization and subcellular localization of proteins from the RIKEN FANTOM4 mouse and human protein sequence set | ||
Tissue Expression | ||
Protein Atlas | [58] | |
Publicly available database with millions of high-resolution images showing the spatial distribution of proteins in 44 different normal human tissues and 20 different cancer types, as well as 46 different human cell lines | ||
TISSUES | [115] | |
Resource integrating evidence on tissue expression from manually curated literature, proteomics and transcriptomics screens, and automatic text mining | ||
Generic Resources | ||
UniProt | [116] | |
Manually annotated, non-redundant protein sequence and sequence isoform database; related information about the biological function of proteins is curated from the scientific literature | ||
Antibodypedia | [117] | |
Open-access database of publicly available antibodies against human protein targets; contains data on antibody efficacy in a range of biochemical and cell biological techniques | ||
IUPAC | [118] | |
Serves to advance the worldwide aspects of the chemical sciences and to contribute to the application of chemistry in science |