Molecular Biology Tools
AA Info

Author: Gianpiero Pescarmona
Date: 25/09/2007



A unified view of the current state of knowledge on a topic, including publications, authors, related concepts (Knowlet) and community annotations (Professional Wiki).

SDSC Biology Workbench gipi3 new43old

The Next Generation Biology Workbench

Promoter Databases

EPD The Eukaryotic Promoter Database

BioMart Project

BioMart is a query-oriented data management system developed jointly by the Ontario Institute for Cancer Research (OICR) and the European Bioinformatics Institute (EBI).

The system can be used with any type of data and is particularly suited for providing 'data mining' like searches of complex descriptive data. BioMart comes with an 'out of the box' website that can be installed, configured and customised according to user requirements. Further access is provided by graphical and text based applications or programmatically using web services or API written in Perl and Java. BioMart has built-in support for query optimisation and data federation and in addition can be configured to work as a DAS 1.5 Annotation server. The process of converting a data source into BioMart format is fully automated by the tools included in the package. Currently supported RDBMS platforms are MySQL, Oracle and Postgres.
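As an illustration of the programmatic access mentioned above, the sketch below builds the kind of XML query document that BioMart web services are generally understood to accept. It only constructs the query text and does not contact any server. The function name `build_biomart_query` is hypothetical, and the dataset, filter and attribute names in the usage note are assumptions that must be checked against the mart actually being queried.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, attributes, filters=None):
    """Build a BioMart-style XML query string (hypothetical helper).

    dataset    -- internal dataset name exposed by the target mart (assumption)
    attributes -- list of attribute (output column) names
    filters    -- optional dict of filter name -> value
    """
    # Root element; the settings ask for tab-separated output
    # without a header row and with duplicate rows removed.
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body
```

For example, `build_biomart_query("hsapiens_gene_ensembl", ["ensembl_gene_id"], {"chromosome_name": "21"})` returns a query string that could then be submitted to a mart's web service endpoint; the endpoint address depends on the installation and is not given here.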

Protein Aminoacids Percentage / Home Made

The Protein Aminoacids Percentage (the percentage of each amino acid in a protein) gives useful information on the local environment and the metabolic status of the cell (starvation, lack of essential amino acids, hypoxia).
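As a rough illustration of the calculation behind such a tool, the sketch below computes the percentage of each amino acid in a protein given as a one-letter sequence. The function name `aa_percentages` and the example peptide are hypothetical and are not taken from the tool itself.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"


def aa_percentages(sequence):
    """Return {amino acid: percentage} for a one-letter protein sequence.

    Characters outside the 20 standard residues (gaps, X, *) are ignored.
    """
    residues = [c for c in sequence.upper() if c in STANDARD_AA]
    if not residues:
        raise ValueError("no standard amino acids found in sequence")
    counts = Counter(residues)
    total = len(residues)
    # Counter returns 0 for residues that never occur.
    return {aa: 100.0 * counts[aa] / total for aa in STANDARD_AA}


if __name__ == "__main__":
    example = "MKKLLAAGG"  # hypothetical peptide, for illustration only
    for aa, pct in aa_percentages(example).items():
        if pct > 0:
            print(f"{aa}: {pct:.1f}%")
```

Comparing such percentages between proteins, or against an average composition, is one possible way to use the figures; the choice of reference is left to the reader.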

Cell Biology Wiki

mRNA tissues distribution




BRENDA - The Comprehensive Enzyme Information System

BRENDA is a collection of enzyme functional data. The enzymes are classified according to the Enzyme Commission list of enzymes. Some 4000 "different" enzymes are covered.

Metabolic Pathways

Free of Charge


NIC and Nature Pathways


The Nutritional Metabolomics Database is a global open-source library of small molecules for use in human nutrition metabolomics, as part of the NuGO Metabolomics Initiative. It is in its startup phase, with a limited number of trial pages operational. The next milestone will be the automated generation of some 2000 metabolite pages produced from information provided by the Human Metabolome Database.

Pathway Interaction Database

Biomolecular interactions and cellular processes assembled into authoritative human signaling pathways

Roche Applied Science's Biochemical Pathways

KEGG Pathways

KEGG FTP Site for Academic Users

Pathway Commons

Human Metabolome Database

SABiosciences Pathway Central

100 Free Signaling Pathway Maps in PowerPoint
Download free pathway maps in PowerPoint

R&D Interactive Pathways & Processes

Sigma Databases

Your Favorite Gene YFG

Research tool, powered by Ingenuity, that supplies more complete information on payment.

Sigma Pathways

Sigma Enzymes and Metabolites Tests

PathCase Pathways Database System

Reactome - a curated knowledgebase of biological pathways

Reactome - a curated knowledgebase of biological pathways Home

Reactome new interface

iPath is a web-based tool for the visualization and analysis of the metabolic pathways

MetaCyc (to be tested)


HumanCyc is a bioinformatics database that describes the human metabolic pathways and the human genome. By presenting metabolic pathways as an organizing framework for the human genome, HumanCyc provides the user with an extended dimension for functional analysis of Homo sapiens at the genomic level.


Chemical molecules



ARIADNE Pathways

Signal Transduction


The UCSD-Nature Signaling Gateway

Signal Transduction Knowledge Environment - STKE

Genes and chromosomes

NCBI Map Viewer chromosomes



GeneCards® is a searchable, integrated database of human genes that provides concise genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes. Information featured in GeneCards includes orthologies, disease relationships, mutations and SNPs, gene expression, gene function, pathways, protein-protein interactions, related drugs & compounds and direct links to cutting edge research reagents and tools such as antibodies, recombinant proteins, clones, expression assays and RNAi reagents.

Korean UniGene Information KUGI

Nice signalling diagrams

Atlas of Genetics and Cytogenetics in Oncology and Haematology

- The Atlas of Genetics and Cytogenetics in Oncology and Haematology is a peer reviewed on-line journal, encyclopaedia and database in free access on the Internet, devoted to genes, cytogenetics, and clinical entities in cancer, and cancer-prone diseases.
- The aim is to cover the entire field under study: as the task is huge, the Atlas is and will be incomplete by that very fact.
- It presents structured reviews (cards) or traditional review papers ('deep insights'), a portal towards genetics and/or cancer databases and journals, teaching items in Genetics for students in Medicine and in Sciences, and a case report in hematology section.
- It is made for and by: clinicians and researchers in cytogenetics, molecular biology, oncology, haematology, and pathology. Contributions are reviewed before acceptance.
- It deals with cancer research and genomics. It is at the crossroads of research, virtual medical university (university and post-university e-learning), and telemedicine. It contributes to "meta-medicine", the mediation, through new information technology, between the growing body of knowledge and the individual who has to use that information, towards a personalized medicine of cancer.

Quick Gene Ontology

QuickGO is a fast web-based browser for Gene Ontology terms and annotations, which is provided by the UniProtKB-GOA group at the EBI.

Human Chromosome DB from Santa Cruz - finds siRNAs

Apropos Database and Annotation Tool

Genetic diseases

Genetics Home Reference



The single-letter amino acid code

Information hyperlinked over Proteins

Expasy Proteomics Tools

Human Proteins Reference Database


Entrez Protein Clusters

Pfam database: a large collection of protein families

The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).

Proteins are generally composed of one or more functional regions, commonly termed domains. Different combinations of domains give rise to the diverse range of proteins found in nature. The identification of domains that occur within proteins can therefore provide insights into their function.

There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high quality, manually curated families. Although these Pfam-A entries cover a large proportion of the sequences in the underlying sequence database, in order to give a more comprehensive coverage of known proteins we also generate a supplement using the ADDA database. These automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found.

Pfam also generates higher-level groupings of related families, known as clans. A clan is a collection of Pfam-A entries which are related by similarity of sequence, structure or profile-HMM.

To be tested

LifeDB Autonomous Data Integration System Trial Release

It is with great excitement and anticipation that we
announce the release of LifeDB Autonomous Data Integration
system [1] for Life Sciences data management and workflow
querying. LifeDB has its own SQL-like query language
called BioFlow [2, 3] using which, most arbitrary
online deep web resources and tools can be used to develop
applications completely dynamically and on an ad hoc basis.
The goal in LifeDB is to allow application development by
end users using SQL-like declarative query language without
having to worry about schema heterogeneity and geographical
distribution of resources. LifeDB relies on two basic fully
autonomous sub-systems: FastWrap [4, 5] for wrapper
generation and table annotation, and OntoMatch [6] for
schema mapping. BioFlow supports horizontal and vertical
integration and autonomous record linkage through entity
identification and resolution. It is based on an extended
parameterized relational algebra called Integra [7] that is
capable of blurring the distinctions between web documents
and traditional SQL tables by uniformly treating both as

To support end user application development, we have also
developed a visual application development system called
VizBuilder [8] that allows writing BioFlow applications
visually without any knowledge of BioFlow. In our current
LifeDB release, VizBuilder is included as an alternate

Currently, LifeDB is in a trial phase while it undergoes
internal validation. We believe LifeDB is performing within
its design parameters. We invite interested researchers in
Databases and Life Sciences to register for LifeDB as end
users. Registered users will be able to either download
LifeDB binary for local use, or use LifeDB in our server
with limited user data space. Currently, our goal is to
compile bug reports, if any, and fix the currently unknown
bugs before its final release. Full information on LifeDB
can be found at

Our laboratory is fully committed to supporting registered
users and their applications on a long term basis on our
Wayne State Server. Registered users will have their own
data space in our server to store their data, applications
and other resources. For high-end users, we will also
consider offering dedicated processors to support computation
intensive applications, resource permitting. We are
considering a novel resource sharing model and it will be
announced when we finalize our plan and acquire the needed
resource. LifeDB will be an open source system for the
community, following the release of its final version. We
are now compiling user requests for data space over the
next 3-6 months based on which we plan to acquire the needed
hardware to support our users. A request for needed space
(and dedicated processor, subject to agreement) may be made
through an e-mail to the PI at Please use
subject heading "LifeDB Resource Request."

Publications related to LifeDB and other projects can be found
at our Integration Informatics Lab home page. A comprehensive LifeDB user manual
is also available for end users. The user manual will include
test examples and design parameters under which LifeDB is
designed and expected to function well.

LifeDB has been funded in part by National Science Foundation
grants SEIII IIS 0612203 and MRI CNS 0521454.


[1] Anupam Bhattacharjee, Aminul Islam, Mohammad Shafkat Amin,
Shahriyar Hossain, Shazzad Hosain, Hasan Jamil and Leonard Lipovich,
"On-the-fly Integration and ad hoc Querying of Life Sciences
Databases using LifeDB", 20th Database and Expert Systems
Applications, Linz, Austria, August 2009.

[2] Hasan Jamil, Bilal El-Hajj-Diab, "BioFlow: A Web-based
Declarative Workflow Language for Life Sciences", IEEE International
Workshop on Scientific Workflows, Hawaii, United States, July 2008.

[3] Hasan Jamil, Aminul Islam, "The Power of Declarative
Languages: A Comparative Exposition of Scientific Workflow Design
using BioFlow and Taverna", IEEE SCC International Workshop on
Scientific Workflows, Los Angeles, California, July 2009.

[4] Mohammad Shafkat Amin, Hasan Jamil, "FastWrap: An Efficient
Wrapper for Tabular Data Extraction from the Web", IEEE
International Conference on Information Reuse and Integration,
Las Vegas, Nevada, United States, August 2009.

[5] Mohammad Shafkat Amin, Anupam Bhattacharjee, Hasan Jamil,
"Wikipedia Driven Autonomous Label Assignment in Wrapper Induced
Tables with Missing Column Names", ACM International Symposium on
Applied Computing, Sierre, Switzerland, March 2010.

[6] Anupam Bhattacharjee, Hasan Jamil, "OntoMatch: A Monotonically
Improving Schema Matching System for Autonomous Data Integration",
IEEE International Conference on Information Reuse and Integration,
Las Vegas, Nevada, United States, August 2009.

[7] Shazzad Hosain, Hasan Jamil, "An Algebraic Language for Semantic
Data Integration on the Hidden Web", IEEE International Conference
on Semantic Computing, Berkeley, California, United States,
September 2009.

[8] Shahriyar Hossain, Hasan Jamil, "A Visual Interface for
on-the-fly Biological Database Integration and Workflow Design
using VizBuilder", International Workshop on Data Integration in
Life Sciences (DILS), Manchester, United Kingdom, July 2009.


SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 500 domain families found in signalling, extracellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for proteins containing specific combinations of domains in defined taxa. For all the details, please refer to the publications on SMART.



Post-translational modifications

An easy-to-use website that predicts the number and positions of ubiquitylation sites in a protein.
(The protein's FASTA sequence must be obtained before using it.)
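Since the site expects a FASTA sequence as input, a minimal sketch of reading FASTA-formatted text may help when preparing it. The function name `read_fasta` and the example record are hypothetical; the sketch only separates header lines (starting with ">") from sequence lines and performs no prediction of its own.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    example = ">demo_protein example\nMKTAYIAKQR\nQISFVKSHFS\n"
    for name, seq in read_fasta(example):
        print(name, len(seq))
```

The sequence string returned for a record is what would be pasted into the prediction form.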


Sigma Prestige

2011-06-15T12:13:08 - Annamaria Vernone


SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes.

The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein domains at the SCOP superfamily level. A superfamily groups together domains which have an evolutionary relationship. The annotation is produced by scanning protein sequences from over 1,400 completely sequenced genomes against the hidden Markov models.

ViPR Search Tools

ViPR search tools allow users to search a variety of different databases. Search results can be refined, sent to analysis tools, saved to your workbench, or downloaded to your local workstation.


Netility is a simple web-based network visualization tool enabling views of proteins and their interaction neighbours as found in iRefWeb.
It is part of Wodaklab, where other tools can be found, such as iRefWeb, a web interface to a broad landscape of data on protein-protein interactions (PPI) consolidated from major public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI, OPHID.

2008-11-04T23:56:56 - Gianpiero Pescarmona

Sigma Ingenuity

Why spend your valuable time looking at multiple sources when you can find the information
you need through the improved Your Favorite Gene research tool, powered by Ingenuity?

Access is free.

Make connections in Biological.

YFG Gives You:

* A Biologically Relevant Literature Search
* New Gene Regulation Viewers
* Expression Study Results
* Clinical Trial Information
* Biochemical Compounds Related to your Gene

Find validated Prestige Antibodies, Bioactive Small Molecules, shRNAs, siRNAs,
CompoZr® ZFNs and induced pluripotent stem cells within the context of your research.

Go to YFG at

AI model sarcoma

The Next Generation Biology Workbench

Version 1.3 has the following features (all in response to user requests):

Ability to import data from an existing Biology Workbench account
Task Creation page jobs can be initiated from the Toolkit Page
The tool Phy-Fi is added to permit users to view phylogenetic trees
The NCBI tool Spidey
Ability to create subfolders
We also posted some lessons designed for Bioinformatics instruction

Version 1.3 also includes these improvements:
Added user password reset capability
Resolution of several cosmetic issues (bugs)
Refactored data management area to improve performance

For folks who are interested in adding features themselves, we have
created a web server called webtooldev at

The server allows anyone to edit or create a tool interface using a
variant of the PISE XML standard.

The webtooldev site provides documentation on how to create the XML and
some sample files.
You upload your XML file to the server, and you can then do functional
testing in a functioning NGBW interface.
For security reasons, the webtooldev server requires a login, so contact
me if you would like an account.

Release 1.4 is currently planned for Jan 15, 2009.
This release is planned to continue to streamline the job creation
process, and improve the results viewing experience.

Specific features to be added include:
Bulk uploads of multiple protein/DNA sequences
New databases including REFSEQ, TPA, ENSEMBL, Uniprot100, and UNIMES
A set of protein sequence filtering tools.
Tools for handling and editing DNA sequences are expected to appear in
Spring 2009.

Please let us know if you have specific tools you would like to see



Mark A. Miller, PhD
Principal Investigator, Biology
San Diego Supercomputer Center
University of California, San Diego
La Jolla, CA, 92093-0505
Tel: 858-822-0866
Fax: 858-822-3610
