A Database of Drosophila Genes & Genomes

FB2008_06, released July 3, 2008
 

Reference Manual F - Links to and from FlyBase

Last Updated: 13 November 2006

FlyBase provides stable links that other databases can use to point to FlyBase, as well as links from FlyBase to other databases. Links to FlyBase data items, and links between data items in FlyBase and those in other databases, are described in the sections that follow. Drosophila Resources includes a linked list of additional databases likely to be of interest to users of FlyBase.

F.1. FlyBase Identifier Numbers

FlyBase assigns unique identifier numbers to several classes of object within the database. One reason for this is to allow unambiguous cross-references both within FlyBase and between FlyBase and other databases.

FlyBase identifier numbers have the general form FBxxnnnnnnn, where xx is an alphabetical code for the identifier class and nnnnnnn is a 7-digit number, padded with leading zeros.

The following classes of object are now publicly available in FlyBase data:

  • FBab - aberration
  • FBal - allele
  • FBba - balancer/genotype variant
  • FBcl - clone
  • FBgn - gene
  • FBim - image
  • FBmc - molecular construct
  • FBms - molecular segment
  • FBpp - polypeptide
  • FBrf - reference
  • FBst - stock
  • FBti - transposable element insertion
  • FBtp - transgenic construct or natural transposon
  • FBtr - transcript
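
As an illustration of the identifier format and class codes listed above, the following is a minimal Python sketch that parses and builds identifiers. The function names are hypothetical and are not part of any FlyBase software. The sketch assumes that class codes are written in lower case, as in the list above, and it rejects any code that is not in that list.

```python
import re

# Class codes copied from the list of publicly available classes above.
FLYBASE_CLASSES = {
    "ab": "aberration",
    "al": "allele",
    "ba": "balancer/genotype variant",
    "cl": "clone",
    "gn": "gene",
    "im": "image",
    "mc": "molecular construct",
    "ms": "molecular segment",
    "pp": "polypeptide",
    "rf": "reference",
    "st": "stock",
    "ti": "transposable element insertion",
    "tp": "transgenic construct or natural transposon",
    "tr": "transcript",
}

# "FB", a two-letter class code, then exactly seven digits.
# Lower-case codes are an assumption based on the list above.
_ID_PATTERN = re.compile(r"FB([a-z]{2})([0-9]{7})")


def parse_flybase_id(identifier):
    """Return (class_name, number) for a well-formed identifier, else None."""
    match = _ID_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    code, digits = match.groups()
    class_name = FLYBASE_CLASSES.get(code)
    if class_name is None:
        return None
    return class_name, int(digits)


def format_flybase_id(code, number):
    """Build an identifier, padding the number to seven digits with zeros."""
    if code not in FLYBASE_CLASSES:
        raise ValueError(f"unknown class code: {code!r}")
    if not 0 <= number <= 9_999_999:
        raise ValueError(f"number does not fit in seven digits: {number}")
    return f"FB{code}{number:07d}"
```

The identifier numbers used with this sketch are purely illustrative. For example, parse_flybase_id("FBgn0000123") yields the class name "gene" and the number 123, while a string with fewer than seven digits yields None.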

Each object has a single Primary identifier number that is used to uniquely identify it in the database.

An object may also have any number of Secondary identifier numbers. A secondary identifier number generally indicates that at some point the entry has been merged with or split from other entries in the database. This may have occurred because more data became available in the literature or because earlier errors in the database were corrected.

The rules for when primary identifier numbers become secondary are complex. Some examples are included below:

A merge:

If two entries A and B are found to refer to the same object, then a new primary identifier number will be given to the merged entry, and the old identifier numbers of entries A and B will be listed under this merged entry as secondary identifier numbers.

A split (case 1):

If one entry is found to correspond to two (or more) objects, e.g., entry A does, in fact, refer to objects X and Y, then X and Y, as new objects, each get new primary identifier numbers and the old primary identifier number of the suppressed entry A is listed as a secondary identifier number under both X and Y.

A split (case 2):

If one entry is found to correspond to two (or more) objects, e.g., entry A refers to objects A and X, then the existing entry for A and the new entry for X each get a new primary identifier number and the old primary identifier number of A is listed as a secondary identifier number under both A and X.

If an object is simply renamed, i.e. its valid symbol in FlyBase is changed without there being a merge or a split, its primary identifier number does not change.
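
The merge, split and rename examples above can be modelled with a small Python sketch. This is a toy illustration only, not the FlyBase implementation: the class name, method names and numbering scheme are hypothetical. It also assumes that secondary identifier numbers already attached to a retired entry are carried forward to the new entries, which the examples above do not state.

```python
from itertools import count


class IdRegistry:
    """Toy registry of primary and secondary identifiers for one class."""

    def __init__(self, class_code="gn", start=1):
        self._code = class_code
        self._counter = count(start)
        self.symbol = {}     # primary ID -> current valid symbol
        self.secondary = {}  # primary ID -> set of secondary IDs

    def add(self, symbol, inherited=()):
        """Create an entry with a fresh primary ID and return that ID."""
        new_id = f"FB{self._code}{next(self._counter):07d}"
        self.symbol[new_id] = symbol
        self.secondary[new_id] = set(inherited)
        return new_id

    def _retire(self, old_id):
        """Remove a primary ID; return it together with its own secondaries.

        Carrying the older secondaries forward is an assumption.
        """
        del self.symbol[old_id]
        return {old_id} | self.secondary.pop(old_id)

    def merge(self, id_a, id_b, symbol):
        """Merge: one new primary ID; both old IDs become secondary."""
        inherited = self._retire(id_a) | self._retire(id_b)
        return self.add(symbol, inherited)

    def split(self, old_id, symbols):
        """Split (cases 1 and 2): every resulting entry gets a new primary
        ID, and the old ID is listed as secondary under each of them."""
        inherited = self._retire(old_id)
        return [self.add(symbol, inherited) for symbol in symbols]

    def rename(self, primary_id, new_symbol):
        """Rename: the symbol changes, the primary ID does not."""
        self.symbol[primary_id] = new_symbol

    def resolve(self, any_id):
        """Return the sorted current primary IDs that an identifier maps to."""
        if any_id in self.symbol:
            return [any_id]
        return sorted(p for p, sec in self.secondary.items() if any_id in sec)
```

In this model, split case 2 is simply a split whose list of symbols includes the symbol of the existing entry. After a split, resolve() returns more than one primary identifier for the retired number, which is why a secondary identifier number cannot always be mapped to a single current entry.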

The following classes of identifier were previously used in FlyBase but are no longer in use as identifier numbers in the database:

  • FBan - annotation

F.2. Links to external databases

FlyBase includes "pointers" to data kept by other databases in two different ways.

FlyBase-curated links

These are accession numbers that are incorporated into the FlyBase database, for sequence and certain other molecular data, and for reference data.

Linkouts

These links derive from linking tables that are maintained and provided to FlyBase by the external database. Linkouts are combined with FlyBase data for reporting on FlyBase web pages.

F.2.1. FlyBase-curated links

Accession numbers from the following databases are currently incorporated into FlyBase records as FlyBase-curated links:

  • DDBJ/EMBL/GenBank - the nucleic acid sequence databases of Japan, Europe, and the U.S.
  • EPD - Eukaryotic Promoter Database (Bucher)
  • GPCRDB - The G protein-coupled receptor database
  • InterPro - a database of protein families, domains and functional sites
  • MEROPS - Protease database
  • miRBase - microRNA data
  • MitoDrome - A database of annotated Dmel nuclear genes encoding mitochondrial proteins.
  • PDB - Protein Data Bank (Brookhaven)
  • PubMed - biomedical literature citations and abstracts
  • Rfam - RNA families database of alignments and CMs
  • TRANSFAC - A database of transcription factors and their binding sites
  • UniProtKB/Swiss-Prot - UniProt Knowledgebase, Swiss-Prot section
  • UniProtKB/TrEMBL - UniProt Knowledgebase, TrEMBL section

F.2.2. Linkouts

FlyBase currently supports linkouts from Gene and Insertion Reports to external databases. Databases suitable for this kind of linking to FlyBase are those with mature data structures whose data are expressed either in terms of FlyBase genetic objects that carry stable identifiers, or as sequences that can be mapped to the reference sequence of a Drosophila species. FlyBase currently accepts linkout data as a simple spreadsheet table (tab-delimited, four columns), plus a summary record for the external database giving its name and link information. We are happy to consider additional linkout databases, and will support linkouts for more data classes in the future. Please contact us if you would like to contribute links to your database. Information for linkout providers is available here.
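
As a rough illustration of the tab-delimited, four-column table mentioned above, the following Python sketch checks that every line of a candidate table has exactly four fields. The meaning of each column is not described in this section, so the sketch does not name or interpret the columns. The function name is hypothetical, and this is not a FlyBase-supplied validator.

```python
import csv
import io

EXPECTED_COLUMNS = 4  # "tab-delimited, four columns", as stated above


def read_linkout_table(text):
    """Split tab-delimited text into (rows, problems).

    rows     -- list of 4-tuples of field values with whitespace stripped
    problems -- list of (line_number, field_count) for rejected lines
    """
    rows, problems = [], []
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    for line_number, fields in enumerate(reader, start=1):
        if not fields:  # blank line: skip it, but keep counting lines
            continue
        if len(fields) != EXPECTED_COLUMNS:
            problems.append((line_number, len(fields)))
            continue
        rows.append(tuple(field.strip() for field in fields))
    return rows, problems
```

A provider could run a check of this kind before submitting a table, so that lines with missing or extra tabs are caught early. Any further validation, such as checking identifiers or URLs, would depend on column definitions that are not given here.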

The databases that currently provide linkouts from FlyBase are:

  • BDGP in situ Gene Expression Database - Patterns of gene expression in Drosophila embryogenesis
  • BioGRID - General Repository for Interaction Datasets
  • DEDB - Drosophila Exon Database
  • Drosophila PIMRider - The Drosophila Protein Interaction map
  • DRSC - Drosophila RNAi Screening Center
  • FLIGHT - Integrating Genomic and High-Throughput data
  • FlyAtlas - the Drosophila adult expression atlas
  • FlyExpress - A database and image-matching search engine for Drosophila embryonic gene expression
  • FlyMine - An integrated database for Drosophila and Anopheles genomics
  • FlyView - A Drosophila Image Database
  • GenomeRNAi, Heidelberg - a database of phenotypes from systematic RNA interference (RNAi) screens in cultured Drosophila cells
  • InParanoid - Eukaryotic Ortholog Groups
  • Interactive Fly - A cyberspace guide to Drosophila development and metazoan evolution
  • GEO - NCBI's Gene Expression Omnibus
  • PANTHER - Protein Classification System
  • REDfly - Regulatory Element Database for Drosophila

FlyBase-curated links and linkouts are displayed on the Report Pages in the most appropriate section of the Report. Linkouts are indicated by a LinkOut label in parentheses after the field label. In addition, on the Gene Report, all FlyBase-curated links and linkouts are also grouped together in a single EXTERNAL CROSSREFERENCES & LINKOUTS section.