         UniProtKB/Swiss-Prot protein knowledgebase release 57.13 statistics


1.  INTRODUCTION

Release 57.13 of 19-Jan-10 of UniProtKB/Swiss-Prot contains 514212 sequence entries,
comprising 180900945 amino acids abstracted from 186149 references. 

393 sequences have been added since release 57.12, the sequence data of
105 existing entries has been updated and the annotations of
473244 entries have been revised.

Number of fragments: 8438
Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 28782


Protein existence (PE):           entries     %

1: Evidence at protein level        68107   13.2%
2: Evidence at transcript level     66344   12.9%
3: Inferred from homology          363910   70.8%
4: Predicted                        14325    2.8%
5: Uncertain                         1526    0.3%
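
The percentages above can be rechecked from the entry counts. The following is a minimal Python sketch, assuming each percentage is the share of all entries rounded to one decimal place. The names PE_COUNTS and pe_percentages are illustrative and not part of any UniProt tool.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
PE_COUNTS = {
    "1: Evidence at protein level": 68107,
    "2: Evidence at transcript level": 66344,
    "3: Inferred from homology": 363910,
    "4: Predicted": 14325,
    "5: Uncertain": 1526,
}


def pe_percentages(counts):
    """Return each level's share of all entries, rounded to one decimal."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


shares = pe_percentages(PE_COUNTS)
for level, pct in shares.items():
    print(f"{level:<32}{PE_COUNTS[level]:>9}{pct:>7.1f}%")
```

The five counts add up to the 514212 entries stated in the introduction.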

The growth of the database is summarized below.

   


2.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/Swiss-Prot: 12023

   The first twenty species represent 107121 sequences, or 20.8% of the
   total number of entries.
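
   The figure for the first twenty species can be reproduced from the first twenty rows of table 2.2. This is a small Python sketch; the variable names are illustrative, and the share is assumed to be taken relative to the 514212 entries given in the introduction.

```python
# Frequencies of the twenty most represented species (table 2.2, rows 1-20).
TOP20 = [
    20276, 16214, 8823, 7469, 6552, 5740, 4974, 4367, 4248, 4089,
    3278, 3187, 3052, 2597, 2350, 2206, 2151, 1993, 1782, 1773,
]
TOTAL_ENTRIES = 514212  # total entries in release 57.13 (section 1)

top20_total = sum(TOP20)
top20_share = 100 * top20_total / TOTAL_ENTRIES

print(f"{top20_total} sequences: {top20_share:.1f}% of all entries")
```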


   2.1 Table of the frequency of occurrence of species

        Species represented 1x: 5228
                            2x: 1699
                            3x:  895
                            4x:  573
                            5x:  418
                            6x:  342
                            7x:  245
                            8x:  210
                            9x:  179
                           10x:  105
                       11- 20x:  575
                       21- 50x:  367
                       51-100x:  176
                         >100x: 1011
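
   A table of this kind can be built by assigning each species to a bucket according to its number of entries. The sketch below is one way to do that in Python. The bucket boundaries follow the table above, while the function names and the toy input are hypothetical. It also checks that the bucket counts printed above add up to the total number of species.

```python
from collections import Counter


def bucket(n_entries):
    """Map a species' entry count to its row label in table 2.1."""
    if n_entries <= 10:
        return f"{n_entries}x"
    if n_entries <= 20:
        return "11-20x"
    if n_entries <= 50:
        return "21-50x"
    if n_entries <= 100:
        return "51-100x"
    return ">100x"


def frequency_table(entries_per_species):
    """Count how many species fall into each bucket."""
    return Counter(bucket(n) for n in entries_per_species.values())


# Hypothetical toy input: species name -> number of entries.
sample = {"Species A": 1, "Species B": 1, "Species C": 15, "Species D": 20276}
toy_table = frequency_table(sample)

# Bucket counts as printed in table 2.1, in row order.
TABLE_2_1 = [5228, 1699, 895, 573, 418, 342, 245, 210, 179, 105,
             575, 367, 176, 1011]
species_total = sum(TABLE_2_1)
print(species_total)
```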


   2.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      20276  Homo sapiens (Human)
       2      16214  Mus musculus (Mouse)
       3       8823  Arabidopsis thaliana (Mouse-ear cress)
       4       7469  Rattus norvegicus (Rat)
       5       6552  Saccharomyces cerevisiae (Baker's yeast)
       6       5740  Bos taurus (Bovine)
       7       4974  Schizosaccharomyces pombe (Fission yeast)
       8       4367  Escherichia coli (strain K12)
       9       4248  Bacillus subtilis
      10       4089  Dictyostelium discoideum (Slime mold)
      11       3278  Caenorhabditis elegans
      12       3187  Xenopus laevis (African clawed frog)
      13       3052  Drosophila melanogaster (Fruit fly)
      14       2597  Danio rerio (Zebrafish) (Brachydanio rerio)
      15       2350  Oryza sativa subsp. japonica (Rice)
      16       2206  Pongo abelii (Sumatran orangutan)
      17       2151  Gallus gallus (Chicken)
      18       1993  Escherichia coli O157:H7
      19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
      20       1773  Haemophilus influenzae
      21       1752  Salmonella typhimurium
      22       1668  Escherichia coli O6
      23       1665  Shigella flexneri
      24       1550  Mycobacterium tuberculosis
      25       1503  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
      26       1360  Sus scrofa (Pig)
      27       1341  Salmonella typhi
      28       1273  Pseudomonas aeruginosa
      29       1213  Mycobacterium bovis
      30       1159  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
      31       1015  Synechocystis sp. (strain PCC 6803)
      32        995  Yersinia pestis
      33        991  Archaeoglobus fulgidus
      34        940  Vibrio cholerae
      35        929  Salmonella paratyphi A
      36        922  Staphylococcus aureus (strain N315)
      37        922  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      38        911  Rhizobium meliloti (Sinorhizobium meliloti)
      39        909  Acanthamoeba polyphaga mimivirus (APMV)
      40        896  Staphylococcus aureus (strain COL)
      41        894  Staphylococcus aureus (strain MW2)
      42        888  Staphylococcus aureus (strain MSSA476)
      43        885  Staphylococcus aureus (strain MRSA252)
      44        882  Oryctolagus cuniculus (Rabbit)
      45        879  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
      46        879  Salmonella choleraesuis
      47        869  Shigella sonnei (strain Ss046)
      48        863  Yersinia pseudotuberculosis
      49        835  Escherichia coli O9:H4 (strain HS)
      50        829  Escherichia coli O139:H28 (strain E24377A / ETEC)
      51        823  Shigella boydii serotype 4 (strain Sb227)
      52        818  Escherichia coli (strain UTI89 / UPEC)
      53        817  Ashbya gossypii (Yeast) (Eremothecium gossypii)
      54        814  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
      55        800  Shigella dysenteriae serotype 1 (strain Sd197)
      56        795  Candida albicans (Yeast)
      57        794  Vibrio parahaemolyticus
      58        789  Kluyveromyces lactis (Yeast) (Candida sphaerica)
      59        785  Escherichia coli (strain SMS-3-5 / SECEC)
      60        778  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
      61        776  Pasteurella multocida
      62        771  Aquifex aeolicus
      63        771  Neurospora crassa
      64        765  Escherichia coli (strain K12 / DH10B)
      65        764  Canis familiaris (Dog)
      66        759  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
      67        759  Escherichia coli (strain K12 / BW2952)
      68        757  Escherichia coli (strain 55989 / EAEC)
      69        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
      70        756  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
      71        756  Escherichia coli O8 (strain IAI1)
      72        756  Staphylococcus epidermidis (strain ATCC 12228)
      73        750  Escherichia coli (strain SE11)
      74        750  Shigella flexneri serotype 5b (strain 8401)
      75        750  Escherichia coli O45:K1 (strain S88 / ExPEC)
      76        748  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
      77        747  Candida glabrata (Yeast) (Torulopsis glabrata)
      78        742  Escherichia coli O157:H7 (strain EC4115 / EHEC)
      79        738  Streptomyces coelicolor
      80        738  Photorhabdus luminescens subsp. laumondii
      81        731  Vibrio vulnificus
      82        730  Bacillus halodurans
      83        726  Escherichia coli O81 (strain ED1a)
      84        722  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
      85        721  Bacillus anthracis
      86        719  Salmonella enteritidis PT4 (strain P125109)
      87        715  Vibrio vulnificus (strain YJ016)
      88        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
      89        713  Salmonella paratyphi A (strain AKU_12601)
      90        712  Yersinia pestis bv. Antiqua (strain Nepal516)
      91        712  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
      92        711  Staphylococcus aureus (strain NCTC 8325)
      93        710  Salmonella newport (strain SL254)
      94        709  Salmonella heidelberg (strain SL476)
      95        709  Salmonella agona (strain SL483)
      96        708  Yersinia pestis bv. Antiqua (strain Antiqua)
      97        708  Salmonella schwarzengrund (strain CVM19633)
      98        705  Escherichia coli O1:K1 / APEC
      99        699  Salmonella dublin (strain CT_02021853)
     100        697  Enterobacter sp. (strain 638)
     101        696  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
     102        696  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
     103        687  Mycoplasma pneumoniae
     104        685  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
     105        684  Pseudomonas syringae pv. tomato
     106        683  Pan troglodytes (Chimpanzee)
     107        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
     108        682  Klebsiella pneumoniae (strain 342)
     109        676  Anabaena sp. (strain PCC 7120)
     110        670  Pseudomonas putida (strain KT2440)
     111        665  Staphylococcus aureus (strain USA300)
     112        665  Yersinia pestis (strain Pestoides F)
     113        664  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
     114        661  Mycobacterium leprae
     115        658  Rhizobium sp. (strain NGR234)
     116        653  Serratia proteamaculans (strain 568)
     117        645  Escherichia coli
     118        645  Bradyrhizobium japonicum
     119        642  Zea mays (Maize)
     120        641  Staphylococcus aureus (strain bovine RF122 / ET3-1)
     121        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
     122        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
     123        634  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
     124        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
     125        620  Shewanella oneidensis
     126        617  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
     127        615  Treponema pallidum
     128        612  Ralstonia solanacearum (Pseudomonas solanacearum)
     129        608  Staphylococcus haemolyticus (strain JCSC1435)
     130        608  Enterobacter sakazakii (strain ATCC BAA-894)
     131        602  Rhizobium loti (Mesorhizobium loti)
     132        602  Staphylococcus saprophyticus subsp. saprophyticus 
     133        600  Methanobacterium thermoautotrophicum
     134        598  Yersinia pestis bv. Antiqua (strain Angola)
     135        598  Salmonella paratyphi C (strain RKS4594)
     136        598  Emericella nidulans (Aspergillus nidulans)
     137        596  Listeria monocytogenes
     138        595  Photobacterium profundum (Photobacterium sp. (strain SS9))
     139        593  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     140        592  Yarrowia lipolytica (Candida lipolytica)
     141        590  Bacillus cereus (strain ATCC 10987)
     142        589  Xanthomonas campestris pv. campestris
     143        588  Listeria innocua
     144        585  Rickettsia prowazekii
     145        584  Helicobacter pylori (Campylobacter pylori)
     146        582  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
     147        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
     148        579  Neisseria meningitidis serogroup B
     149        576  Brucella suis
     150        572  Brucella melitensis
     151        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
     152        567  Bacillus thuringiensis subsp. konkukian
     153        565  Helicobacter pylori J99 (Campylobacter pylori J99)
     154        562  Buchnera aphidicola subsp. Schizaphis graminum
     155        560  Bacillus cereus (strain ZK / E33L)
     156        560  Pseudomonas syringae pv. syringae (strain B728a)
     157        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
     158        556  Neisseria meningitidis serogroup A
     159        555  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
     160        555  Xanthomonas axonopodis pv. citri (Citrus canker)
     161        553  Vibrio fischeri (strain ATCC 700601 / ES114)
     162        551  Pseudomonas fluorescens (strain Pf0-1)
     163        549  Oceanobacillus iheyensis
     164        545  Caulobacter crescentus (Caulobacter vibrioides)
     165        545  Clostridium acetobutylicum
     166        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
     167        538  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
     168        529  Listeria monocytogenes serotype 4b (strain F2365)
     169        523  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
     170        522  Sodalis glossinidius (strain morsitans)
     171        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     172        521  Xylella fastidiosa
     173        519  Streptococcus pneumoniae
     174        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
     175        509  Chromobacterium violaceum
     176        509  Thermotoga maritima
     177        509  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
     178        507  Bordetella parapertussis
     179        507  Buchnera aphidicola subsp. Baizongia pistaciae
     180        507  Pseudomonas aeruginosa (strain PA7)
     181        505  Bordetella pertussis
     182        504  Haemophilus ducreyi
     183        503  Staphylococcus aureus (strain Newman)
     184        503  Geobacillus kaustophilus
     185        500  Pseudomonas entomophila (strain L48)
     186        498  Brucella abortus
     187        497  Rickettsia conorii
     188        496  Bacillus clausii (strain KSM-K16)
     189        492  Haemophilus influenzae (strain 86-028NP)
     190        491  Deinococcus radiodurans
     191        490  Xanthomonas campestris pv. campestris (strain 8004)
     192        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
     193        490  Clostridium perfringens
     194        488  Bacillus amyloliquefaciens (strain FZB42)
     195        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     196        487  Shewanella sp. (strain MR-7)
     197        485  Aspergillus fumigatus (Sartorya fumigata)
     198        484  Pseudomonas aeruginosa (strain LESB58)
     199        484  Shewanella sp. (strain MR-4)
     200        483  Mannheimia succiniciproducens (strain MBEL55E)
     201        483  Mycoplasma genitalium
     202        483  Staphylococcus aureus (strain Mu3 / ATCC 700698)
     203        482  Streptomyces avermitilis
     204        481  Corynebacterium glutamicum (Brevibacterium flavum)
     205        479  Proteus mirabilis (strain HI4320)
     206        476  Caenorhabditis briggsae
     207        475  Oryza sativa subsp. indica (Rice)
     208        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
     209        474  Methanosarcina acetivorans
     210        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
     211        472  Pseudomonas putida (strain F1 / ATCC 700007)
     212        472  Brucella abortus (strain 2308)
     213        472  Thermosynechococcus elongatus (strain BP-1)
     214        468  Enterococcus faecalis (Streptococcus faecalis)
     215        465  Acinetobacter sp. (strain ADP1)
     216        465  Pseudomonas putida (strain GB-1)
     217        464  Rhodopseudomonas palustris
     218        464  Xanthomonas campestris pv. vesicatoria (strain 85-10)
     219        464  Shewanella frigidimarina (strain NCIMB 400)
     220        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
     221        462  Shewanella sp. (strain ANA-3)
     222        461  Burkholderia mallei (Pseudomonas mallei)
     223        461  Pyrococcus horikoshii
     224        460  Ralstonia eutropha  (Cupriavidus necator 
     225        458  Lactobacillus plantarum
     226        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     227        457  Pyrococcus abyssi
     228        457  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
     229        455  Methanosarcina mazei (Methanosarcina frisia)
     230        454  Staphylococcus aureus (strain JH1)
     231        454  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
     232        453  Rickettsia felis (Rickettsia azadi)
     233        453  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
     234        452  Shewanella baltica (strain OS185)
     235        452  Pseudomonas putida (strain W619)
     236        452  Halobacterium salinarium (Halobacterium halobium)
     237        448  Staphylococcus aureus (strain JH9)
     238        448  Thermoanaerobacter tengcongensis
     239        448  Streptococcus mutans
     240        446  Methylococcus capsulatus
     241        446  Ovis aries (Sheep)
     242        446  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
     243        446  Aeromonas salmonicida (strain A449)
     244        444  Vibrio fischeri (strain MJ11)
     245        443  Hahella chejuensis (strain KCTC 2396)
     246        443  Pseudomonas mendocina (strain ymp)
     247        441  Streptococcus pyogenes serotype M6
     248        441  Chlamydia trachomatis
     249        440  Dechloromonas aromatica (strain RCB)
     250        439  Rickettsia bellii (strain RML369-C)


   
   2.3  Taxonomic distribution of the sequences

   

   Kingdom        sequences (% of the database)
    Archaea           18172 (  4%)
    Bacteria         322942 ( 63%)
    Eukaryota        158269 ( 31%)
    Viruses           14829 (  3%)


   Within Eukaryota:

   

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  20277 ( 13%)           (  4%)
     Other Mammalia         44526 ( 28%)           (  9%)
     Other Vertebrata       15906 ( 10%)           (  3%)
     Viridiplantae          28551 ( 18%)           (  6%)
     Fungi                  25076 ( 16%)           (  5%)
     Insecta                 7624 (  5%)           (  1%)
     Nematoda                4028 (  3%)           (  1%)
     Other                  12281 (  8%)           (  2%)
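
   Both percentage columns can be derived from the sequence counts alone. This is a minimal Python sketch, assuming whole-number rounding and the database total of 514212 entries from the introduction; the names used are illustrative.

```python
TOTAL_ENTRIES = 514212  # all entries in release 57.13 (section 1)

# Sequence counts per category, copied from the table above.
EUKARYOTA = {
    "Human": 20277,
    "Other Mammalia": 44526,
    "Other Vertebrata": 15906,
    "Viridiplantae": 28551,
    "Fungi": 25076,
    "Insecta": 7624,
    "Nematoda": 4028,
    "Other": 12281,
}


def category_shares(counts, database_total):
    """Return (% of group, % of database) per category, as whole numbers."""
    group_total = sum(counts.values())
    return {
        name: (round(100 * n / group_total), round(100 * n / database_total))
        for name, n in counts.items()
    }


euk_shares = category_shares(EUKARYOTA, TOTAL_ENTRIES)
for name, (of_group, of_db) in euk_shares.items():
    print(f"{name:<18}{EUKARYOTA[name]:>7} ({of_group:>3}%) ({of_db:>3}%)")
```

   The category counts add up to the 158269 Eukaryota sequences listed in the kingdom table.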



3.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    8371             1001-1100     3459
                 51- 100   39770             1101-1200     2389
                101- 150   55711             1201-1300     1902
                151- 200   55767             1301-1400     1771
                201- 250   54341             1401-1500     1394
                251- 300   47811             1501-1600      626
                301- 350   48188             1601-1700      492
                351- 400   41207             1701-1800      407
                401- 450   33658             1801-1900      388
                451- 500   26943             1901-2000      321
                501- 550   19059             2001-2100      192
                551- 600   13679             2101-2200      261
                601- 650   11441             2201-2300      268
                651- 700    8143             2301-2400      168
                701- 750    6782             2401-2500      128
                751- 800    4766             >2500         1000
                801- 850    4115
                851- 900    4735
                901- 950    3601
                951-1000    2520
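
   The size classes above are 50 residues wide up to 1000, 100 residues wide up to 2500, and open-ended beyond that. The Python sketch below shows one way to assign a sequence length to its class; size_bin is a hypothetical helper, not a UniProt function. The sketch also checks the table total against the entry and fragment counts given in the introduction.

```python
def size_bin(length):
    """Return the size-class label for a sequence length in amino acids."""
    if length < 1:
        raise ValueError("sequence length must be at least 1")
    if length > 2500:
        return ">2500"
    width = 50 if length <= 1000 else 100
    start = (length - 1) // width * width + 1
    return f"{start}-{start + width - 1}"


# Counts from the table above: 50-wide classes, 100-wide classes, then >2500.
BINS_50 = [8371, 39770, 55711, 55767, 54341, 47811, 48188, 41207, 33658,
           26943, 19059, 13679, 11441, 8143, 6782, 4766, 4115, 4735, 3601,
           2520]
BINS_100 = [3459, 2389, 1902, 1771, 1394, 626, 492, 407, 388, 321, 192,
            261, 268, 168, 128]
OVER_2500 = 1000

non_fragment_total = sum(BINS_50) + sum(BINS_100) + OVER_2500
print(size_bin(351), non_fragment_total)
```

   The table total equals 514212 entries minus 8438 fragments, which agrees with the "excluding fragments" note.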

   


   The average sequence length in UniProtKB/Swiss-Prot is 351 amino acids.

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
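
   The average length reported above can be related to the totals in the introduction. The Python sketch below assumes the average is the total number of amino acids divided by the number of entries. Under that assumption the quotient is about 351.8, so the reported 351 looks truncated rather than rounded; this is an inference from the figures, not a documented rule.

```python
TOTAL_AMINO_ACIDS = 180900945  # from the introduction
TOTAL_ENTRIES = 514212         # from the introduction

average_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES

print(f"exact:     {average_length:.2f}")
print(f"truncated: {int(average_length)}")
print(f"rounded:   {round(average_length)}")
```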


4.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2037


   4.1 Table of the frequency of journal citations

        Journals cited 1x:  653
                       2x:  287
                       3x:  132
                       4x:  108
                       5x:   84
                       6x:   60
                       7x:   35
                       8x:   40
                       9x:   39
                      10x:   24
                  11- 20x:  161
                  21- 50x:  162
                  51-100x:   96
                    >100x:  156


   4.2  List of the most cited journals in UniProtKB/Swiss-Prot

   Nb    Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        17621   Journal of Biological Chemistry
    2         8170   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4976   Journal of Bacteriology
    4         4487   Gene
    5         4453   Biochemical and Biophysical Research Communications
    6         4279   Nucleic Acids Research
    7         3915   FEBS Letters
    8         3754   Biochemistry
    9         3699   The EMBO Journal
   10         3355   Molecular and Cellular Biology
   11         3178   Nature
   12         3078   European Journal of Biochemistry
   13         2979   Journal of Molecular Biology
   14         2952   Biochimica et Biophysica Acta
   15         2628   Cell
   16         2471   Genomics
   17         2146   Biochemical Journal
   18         2080   Science
   19         2007   Journal of Virology
   20         1739   Molecular Microbiology
   21         1544   Journal of Cell Biology
   22         1486   Plant Molecular Biology
   23         1339   Virology
   24         1336   Genes and Development
   25         1302   Molecular and General Genetics
   26         1299   Nature Genetics
   27         1289   Human Molecular Genetics
   28         1272   Plant Physiology
   29         1195   The American Journal of Human Genetics
   30         1161   Oncogene
   31         1153   Journal of Biochemistry
   32         1124   Development
   33         1065   Human Mutation
   34          999   Molecular Biology of the Cell
   35          993   Journal of Immunology
   36          971   Genetics
   37          876   Structure
   38          861   Journal of General Virology
   39          857   Infection and Immunity
   40          834   The Plant Cell
   41          810   Archives of Biochemistry and Biophysics
   42          786   Molecular Cell
   43          782   Blood
   44          755   Yeast
   45          738   Microbiology
   46          711   The Plant Journal
   47          707   Journal of Cell Science
   48          707   Developmental Biology
   49          658   Cancer Research
   50          647   FEMS Microbiology Letters
   51          630   Current Biology
   52          590   Human Genetics
   53          582   Nature Structural Biology
   54          577   Mechanisms of Development
   55          526   Acta Crystallographica, Section D
   56          524   Protein Science
   57          523   Current Genetics
   58          522   Journal of Neuroscience
   59          517   Applied and Environmental Microbiology
   60          500   Toxicon
   61          497   Journal of Clinical Investigation
   62          490   Neuron
   63          469   Mammalian Genome
   64          449   American Journal of Physiology
   65          440   Immunogenetics
   66          438   The Journal of Experimental Medicine
   67          431   Molecular Endocrinology
   68          419   Molecular and Biochemical Parasitology
   69          405   Journal of Neurochemistry
   70          396   The Journal of Clinical Endocrinology and Metabolism
   71          380   Endocrinology
   72          375   Journal of Molecular Evolution
   73          362   DNA and Cell Biology
   74          354   DNA Sequence
   75          351   Molecular Biology and Evolution
   76          350   Bioscience, Biotechnology, and Biochemistry
   77          345   Journal of Medical Genetics
   78          344   Proteins
   79          313   Brain Research. Molecular Brain Research
   80          289   Biological Chemistry Hoppe-Seyler
   81          289   Plant and Cell Physiology
   82          285   Nature Cell Biology
   83          284   Comparative Biochemistry and Physiology
   84          283   Experimental Cell Research
   85          282   Peptides
   86          278   Antimicrobial Agents and Chemotherapy
   87          275   Journal of Investigative Dermatology
   88          274   Cytogenetics and Cell Genetics
   89          263   Molecular Pharmacology
   90          253   Biology of Reproduction
   91          248   Tissue Antigens
   92          246   Journal of General Microbiology
   93          245   Genome Research
   94          240   Neurology
   95          237   RNA
   96          235   Developmental Dynamics
   97          231   Virus Research
   98          227   Developmental Cell
   99          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
  100          205   DNA Research
  101          202   Planta
  102          202   European Journal of Immunology
  103          201   Molecular Plant-Microbe Interactions
  104          199   Biochimie
  105          195   Annals of Neurology
  106          192   European Journal of Human Genetics
  107          191   Genes to Cells
  108          186   Eukaryotic Cell
  109          180   Immunity
  110          178   Journal of Human Genetics
  111          171   The New England Journal of Medicine
  112          170   Molecular and Cellular Endocrinology
  113          164   Archives of Microbiology
  114          164   Investigative Ophthalmology and Visual Science
  115          163   American Journal of Medical Genetics
  116          163   Molecular Phylogenetics and Evolution
  117          160   Nature Structural and Molecular Biology
  118          159   DNA
  119          155   Insect Biochemistry and Molecular Biology
  120          155   EMBO Reports
  121          153   Hemoglobin
  122          149   The FASEB Journal
  123          148   Bioorganicheskaia Khimiia
  124          148   Molecular Reproduction and Development
  125          148   Diabetes
  126          146   Molecular Immunology
  127          144   The FEBS Journal
  128          143   Archives of Virology
  129          142   Glycobiology
  130          140   Clinical Genetics
  131          136   General and Comparative Endocrinology
  132          135   Animal Genetics
  133          134   Molecular Genetics and Metabolism
  134          134   International Journal of Cancer
  135          131   Molecular and Cellular Neuroscience
  136          128   British Journal of Haematology
  137          128   Journal of Cellular Biochemistry
  138          122   Molecular Genetics and Genomics
  139          121   American Journal of Medical Genetics. Part A
  140          121   Biological Chemistry
  141          120   Agricultural and Biological Chemistry
  142          118   Nature Immunology
  143          118   Journal of Lipid Research
  144          116   BMC Genomics
  145          116   Journal of the American Chemical Society
  146          113   Thrombosis and Haemostasis
  147          113   Journal of Protein Chemistry
  148          112   Proteomics
  149          110   Circulation Research
  150          109   Journal of Neuroscience Research


5.  STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
as well as the number of entries with at least one such line, and the
frequency of the lines.

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----

References (RL)                       911549                 1.77                                         
   Journal                            719498     383525      1.40       1                                 
   Submitted to EMBL/GenBank/DDBJ     179482     166329      0.35       2                                 
   Submitted to other databases        10529       9158      0.02       3                                 
   Book citation                         632        618     <0.01       4                                 
   Plant Gene Register                   559        547     <0.01       5                                 
   Thesis                                394        392     <0.01       6                                 
   Unpublished observations              292        288     <0.01       7                                 
   Patent                                157        155     <0.01       8                                 
   Worm Breeder's Gazette                  6          6     <0.01       9                                 
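
The "Average per entry" column appears to be the total number of lines divided by the total number of entries in the release (514212), with values that would print as 0.00 shown as "<0.01". The Python sketch below works under that assumption; per_entry is a hypothetical helper.

```python
TOTAL_ENTRIES = 514212  # total entries in release 57.13 (section 1)


def per_entry(total_lines, entries=TOTAL_ENTRIES):
    """Format the average number of lines per entry as shown in the tables."""
    text = f"{total_lines / entries:.2f}"
    return "<0.01" if text == "0.00" else text


# Totals copied from the reference table above.
for label, total in [
    ("References (RL)", 911549),
    ("Journal", 719498),
    ("Submitted to EMBL/GenBank/DDBJ", 179482),
    ("Submitted to other databases", 10529),
    ("Book citation", 632),
]:
    print(f"{label:<34}{total:>8}{per_entry(total):>8}")
```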

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 283983

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Comments (CC)                        2156141                 4.19                                         
   ALLERGEN                              457        457     <0.01      26                                 
   ALTERNATIVE PRODUCTS                18620      18620      0.04      12                                 
   BIOPHYSICOCHEMICAL PROPERTIES        2876       2876      0.01      22                                 
   BIOTECHNOLOGY                         254        252     <0.01      28                                 
   CATALYTIC ACTIVITY                 214422     195652      0.42       5                                 
   CAUTION                              6752       6615      0.01      19                                 
   COFACTOR                            97463      89486      0.19       7                                 
   DEVELOPMENTAL STAGE                  8656       8656      0.02      16                                 
   DISEASE                              4492       3076      0.01      20                                 
   DISRUPTION PHENOTYPE                 2352       2352     <0.01      23                                 
   DOMAIN                              30691      27389      0.06      10                                 
   ENZYME REGULATION                    7662       7662      0.01      18                                 
   FUNCTION                           380132     364407      0.74       2                                 
   INDUCTION                           11375      11375      0.02      15                                 
   INTERACTION                         12027      12027      0.02      14                                 
   MASS SPECTROMETRY                    4194       3165      0.01      21                                 
   MISCELLANEOUS                       29608      27326      0.06      11                                 
   PATHWAY                            125209     114260      0.24       6                                 
   PHARMACEUTICAL                         83         83     <0.01      29                                 
   POLYMORPHISM                          765        735     <0.01      24                                 
   PTM                                 34974      28347      0.07       8                                 
   RNA EDITING                           589        589     <0.01      25                                 
   SEQUENCE CAUTION                    12637      12637      0.02      13                                 
   SIMILARITY                         596557     489527      1.16       1                                 
   SUBCELLULAR LOCATION               294783     289774      0.57       3                                 
   SUBUNIT                            217540     217540      0.42       4                                 
   TISSUE SPECIFICITY                  32275      32275      0.06       9                                 
   TOXIC DOSE                            409        398     <0.01      27                                 
   WEB RESOURCE                         8287       6577      0.02      17                                 

Total number of comment topics: 29


                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Features (FT)                        3168390                 6.16                                         
   ACT_SITE                           127270      75683      0.25       9                                 
   BINDING                            193500      55547      0.38       4                                 
   CA_BIND                              3645       1475      0.01      35                                 
   CARBOHYD                            95663      24520      0.19      13                                 
   CHAIN                              520761     509574      1.01       1                                 
   COILED                              18169      12210      0.04      26                                 
   COMPBIAS                            48463      25319      0.09      18                                 
   CONFLICT                           115165      40415      0.22      10                                 
   CROSSLNK                             4680       3053      0.01      34                                 
   DISULFID                            93802      24771      0.18      14                                 
   DNA_BIND                            10858       9993      0.02      29                                 
   DOMAIN                             141430      84104      0.28       6                                 
   HELIX                              130330      13633      0.25       8                                 
   INIT_MET                            14724      14724      0.03      27                                 
   LIPID                               10512       6694      0.02      30                                 
   METAL                              263239      64718      0.51       3                                 
   MOD_RES                            173214      58100      0.34       5                                 
   MOTIF                               31637      20442      0.06      22                                 
   MUTAGEN                             29521       7062      0.06      24                                 
   NON_CONS                             1540        631     <0.01      36                                 
   NON_STD                               347        272     <0.01      38                                 
   NON_TER                             11465       8699      0.02      28                                 
   NP_BIND                            102200      66954      0.20      12                                 
   PEPTIDE                              8407       5352      0.02      32                                 
   PROPEP                              10294       8668      0.02      31                                 
   REGION                              88772      49269      0.17      15                                 
   REPEAT                              87537      12968      0.17      16                                 
   SIGNAL                              33555      33545      0.07      21                                 
   SITE                                36315      21612      0.07      20                                 
   STRAND                             130873      12742      0.25       7                                 
   TOPO_DOM                           114529      23491      0.22      11                                 
   TRANSIT                              6434       6348      0.01      33                                 
   TRANSMEM                           332026      68306      0.65       2                                 
   TURN                                31103      10769      0.06      23                                 
   UNSURE                               1079        348     <0.01      37                                 
   VAR_SEQ                             38563      16551      0.07      19                                 
   VARIANT                             78797      16434      0.15      17                                 
   ZN_FING                             27971      12301      0.05      25                                 

Total number of feature keys: 38
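
The "Average per entry" column in the tables above appears to be the total
number divided by the 514212 entries in this release (the figure given in the
introduction). The Python sketch below reproduces a few values under that
assumption. The function name average_per_entry and the rule that values
rounding to 0.00 are printed as "<0.01" are illustrative guesses, not a
documented procedure.

```python
# Total entries in release 57.13, as stated in the introduction.
TOTAL_ENTRIES = 514212

def average_per_entry(total_number, total_entries=TOTAL_ENTRIES):
    """Format total_number / total_entries the way the tables seem to print it."""
    avg = total_number / total_entries
    # Assumption: anything that would round to 0.00 is shown as '<0.01'.
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"

# A few rows copied from the feature table: key -> total number.
features = {"CHAIN": 520761, "TRANSMEM": 332026, "METAL": 263239, "NON_STD": 347}

for key, total in features.items():
    print(f"{key:10s} {average_per_entry(total)}")
```

The same helper applied to the section totals (3168390 features, 12073112
cross-references) gives the per-entry figures shown on the header rows.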



                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank      Category
------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
Cross-references (DR)               12073112                23.48                                                           
   2DBase-Ecoli                           84         84     <0.01     113      2D gel databases                             
   Aarhus/Ghent-2DPAGE                   126         96     <0.01     110      2D gel databases                             
   AGD                                   823        817     <0.01      87      Organism-specific databases                  
   ANU-2DPAGE                             23         23     <0.01     120      2D gel databases                             
   ArachnoServer                         428        423     <0.01      96                                                   
   ArrayExpress                        58014      58014      0.11      35      Gene expression databases                    
   Bgee                                37622      37621      0.07      41      Gene expression databases                    
   BindingDB                             297        297     <0.01     104      Other                                        
   BioCyc                             160422     147567      0.31      18      Enzyme and pathway databases                 
   BRENDA                              65152      62356      0.13      30      Enzyme and pathway databases                 
   BuruList                              330        330     <0.01     103      Organism-specific databases                  
   CAZy                                 5645       5024      0.01      62      Protein family/group databases               
   CGD                                   554        550     <0.01      92      Organism-specific databases                  
   CleanEx                             30224      29576      0.06      43      Gene expression databases                    
   COMPLUYEAST-2DPAGE                     59         59     <0.01     115      2D gel databases                             
   Cornea-2DPAGE                          67         67     <0.01     114      2D gel databases                             
   CTD                                 61369      60823      0.12      34      Organism-specific databases                  
   CYGD                                 6628       6522      0.01      61      Organism-specific databases                  
   dictyBase                            4211       4089      0.01      70      Organism-specific databases                  
   DIP                                 10378      10272      0.02      54      Protein-protein interaction databases        
   DisProt                               397        394     <0.01      98      3D structure databases                       
   DOSAC-COBS-2DPAGE                     150        150     <0.01     109      2D gel databases                             
   DrugBank                             5317       1626      0.01      64      Other                                        
   EchoBASE                             4159       4124      0.01      72      Organism-specific databases                  
   ECO2DBASE                             351        299     <0.01     102      2D gel databases                             
   EcoGene                              4353       4350      0.01      68      Organism-specific databases                  
   eggNOG                             216331     216331      0.42      15      Phylogenomic databases                       
   EMBL                               844747     504537      1.64       3      Sequence databases                           
   Ensembl                             89983      69615      0.17      25      Genome annotation databases                  
   euHCVdb                                55         44     <0.01     116      Organism-specific databases                  
   FlyBase                              5390       5014      0.01      63      Organism-specific databases                  
   Gene3D                             235270     193172      0.46      14      Family and domain databases                  
   GeneCards                           21083      19821      0.04      47      Organism-specific databases                  
   GeneDB_Spombe                        4976       4931      0.01      66      Organism-specific databases                  
   GeneFarm                             2682       2667      0.01      79      Organism-specific databases                  
   GeneID                             466502     447505      0.91       6      Genome annotation databases                  
   Genevestigator                      64306      64306      0.13      33      Gene expression databases                    
   GenomeReviews                      369828     350327      0.72       9      Genome annotation databases                  
   GermOnline                          41931      41324      0.08      40      Gene expression databases                    
   GlycoSuiteDB                          280        280     <0.01     105      PTM databases                                
   GO                                2156980     480722      4.19       1      Ontologies                                   
   Gramene                              4269       4269      0.01      69      Organism-specific databases                  
   H-InvDB                             11249       9556      0.02      53      Organism-specific databases                  
   HAMAP                              306962     306819      0.60      13      Family and domain databases                  
   HGNC                                19531      19359      0.04      48      Organism-specific databases                  
   HOGENOM                            358862     358836      0.70      10      Phylogenomic databases                       
   HOVERGEN                            75023      75023      0.15      28      Phylogenomic databases                       
   HPA                                  8707       6564      0.02      56      Organism-specific databases                  
   HSC-2DPAGE                             85         85     <0.01     112      2D gel databases                             
   HSSP                                28846      28846      0.06      44      3D structure databases                       
   InParanoid                          65616      65616      0.13      29      Phylogenomic databases                       
   IntAct                              21322      21322      0.04      46      Protein-protein interaction databases        
   InterPro                          1593469     488549      3.10       2      Family and domain databases                  
   IPI                                 88171      63236      0.17      26      Sequence databases                           
   KEGG                               437439     415763      0.85       8      Genome annotation databases                  
   LegioList                             759        757     <0.01      88      Organism-specific databases                  
   Leproma                               664        661     <0.01      91      Organism-specific databases                  
   ListiList                            1185       1177     <0.01      84      Organism-specific databases                  
   MaizeGDB                              471        466     <0.01      94      Organism-specific databases                  
   MEROPS                               8465       8206      0.02      57      Protein family/group databases               
   MGI                                 16093      16042      0.03      50      Organism-specific databases                  
   MIM                                 15795      12437      0.03      52      Organism-specific databases                  
   MypuList                              203        203     <0.01     108      Organism-specific databases                  
   NextBio                             48682      48681      0.09      38      Other                                        
   NMPDR                              129910     129906      0.25      21      Genome annotation databases                  
   OGP                                   377        377     <0.01     100      2D gel databases                             
   OMA                                352715     352715      0.69      11      Phylogenomic databases                       
   Orphanet                             3675       2132      0.01      75      Organism-specific databases                  
   OrthoDB                             55287      55287      0.11      36      Phylogenomic databases                       
   PANTHER                            184715     169553      0.36      17      Family and domain databases                  
   Pathway_Interaction_DB               4569       1666      0.01      67      Enzyme and pathway databases                 
   PDB                                 64800      15302      0.13      32      3D structure databases                       
   PDBsum                              64800      15302      0.13      31      3D structure databases                       
   PeptideAtlas                         5168       5168      0.01      65      Proteomic databases                          
   PeroxiBase                            674        662     <0.01      90      Protein family/group databases               
   Pfam                               678247     477490      1.32       4      Family and domain databases                  
   PharmGKB                            15817      15806      0.03      51      Organism-specific databases                  
   PHCI-2DPAGE                           244        244     <0.01     107      2D gel databases                             
   PhosphoSite                         19295      19295      0.04      49      PTM databases                                
   PhosSite                              267        267     <0.01     106      PTM databases                                
   PhotoList                             738        738     <0.01      89      Organism-specific databases                  
   PhylomeDB                          120962     120962      0.24      22      Phylogenomic databases                       
   PIR                                114910     104966      0.22      23      Sequence databases                           
   PIRSF                               79953      79953      0.16      27      Family and domain databases                  
   PMAP-CutDB                           1395       1395     <0.01      82      Other                                        
   PMMA-2DPAGE                            52         52     <0.01     117      2D gel databases                             
   PptaseDB                               34         34     <0.01     118      Protein family/group databases               
   PRIDE                               53136      53136      0.10      37      Proteomic databases                          
   PRINTS                             136242     117832      0.26      20      Family and domain databases                  
   ProDom                              27768      27439      0.05      45      Family and domain databases                  
   ProMEX                                437        437     <0.01      95      Proteomic databases                          
   PROSITE                            454615     290111      0.88       7      Family and domain databases                  
   PseudoCAP                            1212       1203     <0.01      83      Organism-specific databases                  
   Rat-heart-2DPAGE                       28         28     <0.01     119      2D gel databases                             
   Reactome                             6997       4095      0.01      59      Enzyme and pathway databases                 
   REBASE                                376        355     <0.01     101      Protein family/group databases               
   RefSeq                             486655     447770      0.95       5      Sequence databases                           
   REPRODUCTION-2DPAGE                  1030        942     <0.01      86      2D gel databases                             
   RGD                                  7353       7349      0.01      58      Organism-specific databases                  
   SagaList                              389        388     <0.01      99      Organism-specific databases                  
   SGD                                  6640       6537      0.01      60      Organism-specific databases                  
   Siena-2DPAGE                          102        102     <0.01     111      2D gel databases                             
   SMART                              141568     109276      0.28      19      Family and domain databases                  
   SMR                                345533     345533      0.67      12      3D structure databases                       
   STRING                             203356     203353      0.40      16      Protein-protein interaction databases        
   SubtiList                            4191       4182      0.01      71      Organism-specific databases                  
   SWISS-2DPAGE                         1183       1183     <0.01      85      2D gel databases                             
   TAIR                                 8907       8794      0.02      55      Organism-specific databases                  
   TCDB                                 3282       3242      0.01      77      Protein family/group databases               
   TIGR                                33886      33120      0.07      42      Genome annotation databases                  
   TIGRFAMs                             2888       2865      0.01      78      Family and domain databases                  
   TubercuList                          1578       1542     <0.01      81      Organism-specific databases                  
   UCSC                                48481      39493      0.09      39      Genome annotation databases                  
   UniGene                             92395      81394      0.18      24      Sequence databases                           
   VectorBase                            403        389     <0.01      97      Genome annotation databases                  
   World-2DPAGE                          507        507     <0.01      93      2D gel databases                             
   WormBase                             3809       3724      0.01      74      Organism-specific databases                  
   WormPep                              4045       3269      0.01      73      Organism-specific databases                  
   Xenbase                              3615       3542      0.01      76      Organism-specific databases                  
   ZFIN                                 2506       2495     <0.01      80      Organism-specific databases                  

Total number of cross-referenced databases: 120
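
For readers who want to work with the cross-reference table programmatically,
the sketch below parses rows of this layout and sums the "Total number" column
per category. It assumes whitespace-separated columns and database names
without spaces, which holds for this table but not for the comment topics
table. The names ROW, parse_row and totals_by_category are hypothetical
helpers introduced only for this illustration.

```python
import re
from collections import defaultdict

# Assumed row layout: name, total number, number of entries, average, rank,
# then an optional free-text category.
ROW = re.compile(r"^\s+(\S+)\s+(\d+)\s+(\d+)\s+(<?\d+\.\d+)\s+(\d+)\s*(.*?)\s*$")

def parse_row(line):
    """Parse one table row; return None for header, rule, and blank lines."""
    match = ROW.match(line)
    if match is None:
        return None
    name, total, entries, _average, rank, category = match.groups()
    return {
        "name": name,
        "total": int(total),
        "entries": int(entries),
        "rank": int(rank),
        "category": category or None,  # some rows list no category
    }

def totals_by_category(lines):
    """Sum the 'Total number' column per category over the parsable rows."""
    totals = defaultdict(int)
    for line in lines:
        row = parse_row(line)
        if row is not None:
            totals[row["category"]] += row["total"]
    return dict(totals)

# A handful of rows copied from the table above.
sample = [
    "Cross-references (DR)               12073112                23.48",
    "   ArachnoServer                         428        423     <0.01      96",
    "   HGNC                                19531      19359      0.04      48      Organism-specific databases",
    "   MGI                                 16093      16042      0.03      50      Organism-specific databases",
    "   GO                                2156980     480722      4.19       1      Ontologies",
]
print(totals_by_category(sample))
```

Rows without a category (ArachnoServer in this release) are grouped under None
rather than being assigned a guessed category.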

6.  AMINO ACID COMPOSITION

   6.1  Composition in percent for the complete database

   Ala (A) 8.28   Gln (Q) 3.94   Leu (L) 9.67   Ser (S) 6.50
   Arg (R) 5.54   Glu (E) 6.77   Lys (K) 5.86   Thr (T) 5.32
   Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.42   Trp (W) 1.07
   Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.91
   Cys (C) 1.35   Ile (I) 5.99   Pro (P) 4.68   Val (V) 6.88

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

   

   [Figure: amino acid composition]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,
           blue = basic, black = aromatic, white = amide, yellow = sulfur


   6.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
   Phe, Tyr, Met, His, Cys, Trp
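
The ordering in 6.2 can be reproduced from the percentages in 6.1 by sorting
in descending order. The sketch below is a minimal illustration of that step,
with the values copied from the table above; the variable names are arbitrary.

```python
# Composition in percent, copied from section 6.1.
composition = {
    "Ala": 8.28, "Gln": 3.94, "Leu": 9.67, "Ser": 6.50,
    "Arg": 5.54, "Glu": 6.77, "Lys": 5.86, "Thr": 5.32,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.42, "Trp": 1.07,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.91,
    "Cys": 1.35, "Ile": 5.99, "Pro": 4.68, "Val": 6.88,
}

# Most frequent residue first, as in section 6.2.
ranked = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranked))

# The twenty rounded percentages add up to slightly less than 100.
print(f"sum = {sum(composition.values()):.2f}")
```

The rounded values sum to 99.90 rather than 100, which is consistent with
rounding each percentage to two decimals.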


7.  MISCELLANEOUS STATISTICS

4444 entries are encoded on a mitochondrion, and 3549 are encoded on a plasmid.

12104 entries are encoded on a plastid, of which 21 are encoded on apicoplasts,
11546 on chloroplasts, 44 on organellar chromatophores, 145 on cyanelles,
149 on non-photosynthetic plastids, and 199 on unspecified types of plastid.
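
As a quick consistency check, the plastid subtypes listed above should add up
to the stated plastid total. The short sketch below performs that sum using
the figures from this section; the dictionary keys are just labels chosen for
the example.

```python
# Plastid-encoded entries by plastid type, copied from the text above.
plastid_total = 12104
plastid_breakdown = {
    "apicoplasts": 21,
    "chloroplasts": 11546,
    "organellar chromatophores": 44,
    "cyanelles": 145,
    "non-photosynthetic plastids": 149,
    "unspecified plastids": 199,
}

subtotal = sum(plastid_breakdown.values())
print(subtotal, subtotal == plastid_total)
```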

Number of entries with at least one sequence correction: 68193