SWISS-PROT RELEASE 6.0 RELEASE NOTES
Date: January 2, 1988
Author: A. Bairoch
1. INTRODUCTION
1.1 Evolution
Release 6.0 of SWISS-PROT contains 6102 sequence entries,
comprising 1'653'982 amino acids abstracted from 6422
references. This represents an increase of 25% over release
5.0. The recent growth of the data bank is summarised
below:
Release Date Number of entries Nb of amino acids
3.0 11/86 4160 969 641
4.0 04/87 4387 1 036 010
5.0 09/87 5205 1 327 683
6.0 01/88 6102 1 653 982
1.2 Source of data
Release 6.0 has been updated using protein sequence data
from release 14.0 of the PIR (Protein Identification
Resource) protein data bank, as well as translation of
nucleotide sequence data from release 13.0 of the EMBL
nucleotide sequence Data Library.
As an indication to the source of the sequence data in the
SWISS-PROT data bank we list here the statistics concerning
the DR (Databank Reference) pointer lines:
Entries with pointer(s) to only PIR entri(es): 3490
Entries with pointer(s) to only EMBL entri(es): 1173
Entries with pointer(s) to both EMBL and PIR entri(es): 1367
Entries with no pointers lines (entered in house): 72
2. DESCRIPTION OF THE CHANGES MADE TO SWISS-PROT SINCE
RELEASE 5
2.1 New sequence data
We have continued to scan the EMBL Data Library for
sequences not yet available in the protein data bank and we
have translated and annotated some 500 additional
sequences. In addition to those new sequences, some 100
existing entries have been updated with sequence data
translated from EMBL entries.
2.2 Modifications in the feature keys
- Four new feature keys have been introduced with this
release:
DNA_BIND: extent of a DNA-binding region.
INIT_MET: to indicate that there was an initiator
methionine in front of the sequence first
residue.
MYRISTYL: myristylated residue.
TRANSMEM: extent of a transmembrane region.
- Two feature keys have been renamed:
PROPEP replace PROP
SIMILAR replace HOMOLOGY
2.3 Format modifications
- Before release 6.0, if an entry had more than one DT
line, these lines were listed starting with the oldest
one and ending with the most recent one. So as to follow
the conventions used by the EMBL Data Library we have
inverted that order: the first DT line is now the most
recent one.
- A DR line pointing to a PIR entry used that entry name
as the primary identifier and the PIR release version as
the secondary identifier. PIR having (starting with
release 13) introduced accession number, we now use the
first accession number as the primary identifier and the
entry name as the secondary identifier. Example:
what was: DR PIR; R5EC7; RELEASE 11, DECEMBER 1986.
is now: DR PIR; A02768; R5EC7.
2.4 Organism identification code changes
- The organism code for guinea-pig which formerly was
GUPIG was changed to CAVPO (CAvia PORcellus) so as to
fit with the general naming rules.
- We have modified some viruses organism identification
codes to make them compatible with the abbreviations
commonly used.
2.5 Other changes
- Keywords have been updated in most entries.
- The annotations of a number of protein families has been
updated and completed using recent reviews.
3. THE NEXT RELEASE
SWISS-PROT release 7.0 will be probably ready in beginning
of april 1987, we hope that at that stage we will have
caught up with the backlog of sequences to translate from
the EMBL Data Library nucleotide sequences.
4. WE NEED YOUR HELP !
We welcome any feedback from our users. We especially would
appreciate that you notify us if you find that sequences
belonging to your field of expertise are missing from the
data bank. We also would like to be notified about
annotations to be updated, as for example if the function
of a protein has been clarified or if new posttranslational
information has become available.
APPENDIX A: DISKS FOR SWISS-PROT
IBM PC/AT 1.2 Mb disks
SWISS-PROT is stored on seven 1.2 Mb disks. Each of these
disk contains a single bulk file (PROT6_01.BLK to
PROT6_07.BLK):
Disk First sequence Last Sequence
1 16K$TRVPS CRAA$LOXAF
2 CRAA$MACMU GMCS$MOUSE
3 GONL$HUMAN KRAF$MOUSE
4 KRAF$MSV36 PGMU$RABIT
5 PGMU$YEAST TESB$RAT
6 TETX$CLOTE YX$NPVGM
7 YXL2$EPBAR ZEIG$MAIZE
IBM PC 360 Kb disks
SWISS-PROT is stored on twenty one 360 Kb disks. Each of
these disk contains a single bulk file (PROT6_01.BLK to
PROT6_21.BLK):
Disk First sequence Last Sequence
1 16K$TRVPS ARA5$AMBEL
2 ARAA$SALTY CAPB$HUMAN
3 CAPP$ECOLI COX1$BOVIN
4 COX1$CHLRE CYC$HELAN
5 CYC$HELAS EGF$HUMAN
6 EGF$MOUSE FUCO$HUMAN
7 FULC$MYXFU H41$TETPY
8 H42$TETPY HEMA$INASW
9 HEMA$INATA IL3$HYLLA
10 IL3$MOUSE KPCA$BOVIN
11 KPCB$BOVIN LV1$CHICKL
12 LV1A$HUMAN NAC2$RAT
13 NACH$ELEEL OSTC$MACFA
14 OSTC$MOUSE POLG$FMDVA
15 POLG$FMDVC RBS$ANASP
16 RBS$TOBAC SCX3$BUTOC
17 SCX3$CENSC TINT$MOUSE
18 TLP1$MOUSE V55$BPT7
19 V57$HSV11 VMSA$HPBVZ
20 VMSA$WHV YMC4$YEAST
21 YMCA$ASPNI ZEIG$MAIZE