Scoring
The scoring is tunable: several parameters can be taken into account.
Each parameter is optional and, when one is used, you define its weight in the score.
There are two kinds of parameters: those at the peptide level and those at the protein level.
S = S1 * S2
S = final score for this entry.
S1 = sum of each peptide score.
S2 = protein level score.
A score is given to each identified peptide.
If no peptide-level parameter is selected, the default peptide score is 1.
The parameters at the peptide level are:
the intensity of the identified peak, the presence of a missed cleavage, the CTerm amino acid, and modifications.
The parameters at the protein level are:
agreement of pI and Mw with the sample values, if known, and coverage of the sequence by the identified peptides.
If no parameter at either level is used in the score, the final score is the number of identified peptides.
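The combination S = S1 * S2 can be sketched as follows. This is only an illustration of the formula above; the function name final_score and its arguments are assumptions, not names used by the program.

```python
def final_score(peptide_scores, protein_level_score=1.0):
    """Return S = S1 * S2, where S1 is the sum of the peptide scores."""
    s1 = sum(peptide_scores)
    return s1 * protein_level_score


# With no parameter selected, every peptide scores 1 and S2 is 1,
# so S is simply the number of identified peptides.
print(final_score([1.0, 1.0, 1.0]))  # 3.0
```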
Details for S1 calculation:
peptide score = intensity score * missed cleavage score * CTerm amino acid score * modification 1 score * modification 2 score...
S1 = sum of the peptide scores.
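As a minimal sketch, the peptide score can be read as the product of its component scores, and S1 as the sum of these products over all identified peptides. The names peptide_score and s1 are hypothetical and chosen only for this example.

```python
import math


def peptide_score(component_scores):
    """Product of the component scores of one peptide.

    An empty list gives 1, which matches the default peptide score
    when no parameter is used.
    """
    return math.prod(component_scores)


def s1(peptides):
    """Sum of the peptide scores; `peptides` is a list of component-score lists."""
    return sum(peptide_score(components) for components in peptides)


# Two peptides, each with an intensity score and a missed cleavage score.
print(s1([[0.5, 1.0], [0.25, 2.0]]))  # 1.0
```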
- Intensity:
The spectrum intensities are normalized to the range [0; 1].
The intensity score is the normalized intensity of the peak raised to the power of the user factor.
The user factor is the weight the user gives to the intensity in the final score:
if factor = 0: no weight, equivalent to not using the intensity in the score.
if factor = 2: the intensity is squared in the score.
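The intensity rule can be illustrated with a short sketch. It assumes the intensity has already been normalized to [0; 1]; the function name intensity_score is hypothetical.

```python
def intensity_score(normalized_intensity, factor):
    """Normalized peak intensity raised to the power of the user factor."""
    if not 0.0 <= normalized_intensity <= 1.0:
        raise ValueError("intensity must be normalized to the range [0, 1]")
    return normalized_intensity ** factor


print(intensity_score(0.5, 0))  # 1.0, the intensity has no effect
print(intensity_score(0.5, 2))  # 0.25, the intensity is squared
```

Because the normalized intensity is at most 1, a larger factor penalizes weak peaks more strongly.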
- Missed cleavage:
If the identified peptide includes a missed cleavage, missed cleavage score = factor;
otherwise missed cleavage score = 1.
The factor is the weight of the missed cleavage in the final score:
if factor = 1: no weight, equivalent to not using missed cleavages in the score.
if factor = 0: infinite weight, the score of this peptide falls to zero.
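A minimal sketch of this rule, with a hypothetical function name:

```python
def missed_cleavage_score(has_missed_cleavage, factor):
    """Return the factor for a peptide with a missed cleavage, 1 otherwise."""
    return factor if has_missed_cleavage else 1.0


print(missed_cleavage_score(True, 0.5))   # 0.5, the peptide score is halved
print(missed_cleavage_score(False, 0.5))  # 1.0, no effect
print(missed_cleavage_score(True, 0.0))   # 0.0, the peptide score falls to zero
```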
- CTerm amino acid:
High-intensity peaks are expected to end with an arginine rather than a lysine.
Cterm R increases the score of a peptide that ends with an arginine, and Cterm K decreases the score of a peptide that ends with a lysine.
This rule can be limited to a number of the highest-intensity peaks:
if Cterm limit is 5, only the 5 highest peaks are affected by these rules.
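The text does not say how Cterm R and Cterm K are expressed numerically. The sketch below assumes they are multiplicative factors (greater than 1 for R, smaller than 1 for K), which fits the product used for the peptide score. The function name, the default values and the (sequence, intensity) input format are all assumptions made for illustration.

```python
def cterm_scores(peptides, cterm_r=1.5, cterm_k=0.5, cterm_limit=5):
    """Return one CTerm score per peptide, in input order.

    `peptides` is a list of (sequence, intensity) pairs. Only the
    `cterm_limit` most intense peaks are affected by the rule.
    """
    # Indices of the peaks sorted by decreasing intensity.
    ranked = sorted(range(len(peptides)),
                    key=lambda i: peptides[i][1], reverse=True)
    affected = set(ranked[:cterm_limit])

    scores = []
    for i, (sequence, _intensity) in enumerate(peptides):
        score = 1.0
        if i in affected:
            if sequence.endswith("R"):
                score = cterm_r  # assumed bonus factor
            elif sequence.endswith("K"):
                score = cterm_k  # assumed penalty factor
        scores.append(score)
    return scores


peaks = [("AAAR", 0.9), ("CCCK", 0.8), ("DDDR", 0.1)]
print(cterm_scores(peaks, cterm_limit=2))  # [1.5, 0.5, 1.0]
```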
- Modifications:
Each modification has its own factor, and the peptide score is decreased by this factor for each unexpected locus:
For VARIABLE modifications, the unexpected loci are the modified loci.
For FIXED modifications, the unexpected loci are the unmodified loci.
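One possible reading of this rule is that the factor is applied multiplicatively, once per unexpected locus, so that a factor between 0 and 1 lowers the score. That reading is an assumption, as are the function and argument names in the sketch below.

```python
def modification_score(factor, modified_loci, unmodified_loci, fixed):
    """Score of one modification for one peptide.

    `modified_loci` and `unmodified_loci` count the possible sites of the
    modification on the peptide that are, respectively, modified and not.
    """
    # FIXED: unmodified sites are unexpected. VARIABLE: modified sites are.
    unexpected = unmodified_loci if fixed else modified_loci
    # Assumed: the factor is applied once per unexpected locus.
    return factor ** unexpected


print(modification_score(0.5, modified_loci=2, unmodified_loci=1, fixed=False))  # 0.25
print(modification_score(0.5, modified_loci=2, unmodified_loci=1, fixed=True))   # 0.5
```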
Details for S2 calculation:
S2 = coverage score * pI score * Mw score
- Coverage:
ratio = number of amino acids present in at least one peptide / number of amino acids in the protein
coverage score = ratio raised to the power of the coverage factor.
The factor is the weight of the coverage in the final score:
if factor = 0: no weight
if factor = 2: the coverage is squared in the score.
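The coverage computation can be sketched as follows. Each amino acid is counted once even when several peptides overlap. The function name and the peptide position format (0-based start, exclusive end) are assumptions made for this example.

```python
def coverage_score(protein_length, peptide_ranges, factor):
    """Coverage ratio raised to the power of the coverage factor.

    `peptide_ranges` is a list of (start, end) positions on the protein
    sequence, 0-based with an exclusive end (assumed convention).
    """
    covered = set()
    for start, end in peptide_ranges:
        covered.update(range(start, end))  # overlaps are counted once
    ratio = len(covered) / protein_length
    return ratio ** factor


# Two overlapping peptides cover positions 0..49 of a 100-residue protein.
print(coverage_score(100, [(0, 30), (20, 50)], 1))  # 0.5
print(coverage_score(100, [(0, 30), (20, 50)], 2))  # 0.25
```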
- pI:
a = protein pI
b = sample pI (if known, otherwise pI score = 1)
s = 1 - |a-b|/(a+b)
s scores the difference between the two pI values as a number in the range [0;1], with 1 meaning identical values.
pI score = s raised to the power of the pI factor.
The factor is the weight of the pI in the final score:
if factor = 0: no weight
if factor = 2: the pI is squared in the score.
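A short sketch of the pI score, assuming an unknown sample pI is passed as None. The function names are hypothetical.

```python
def closeness(a, b):
    """s = 1 - |a - b| / (a + b); equals 1 when the two values are identical."""
    return 1.0 - abs(a - b) / (a + b)


def pi_score(protein_pi, sample_pi, factor):
    """s raised to the power of the pI factor, or 1 if the sample pI is unknown."""
    if sample_pi is None:
        return 1.0
    return closeness(protein_pi, sample_pi) ** factor


# Protein pI 6.0 against sample pI 4.0: s = 1 - 2/10 = 0.8.
print(pi_score(6.0, 4.0, 1))   # approximately 0.8
print(pi_score(6.0, 4.0, 2))   # approximately 0.64
print(pi_score(6.0, None, 2))  # 1.0, sample pI unknown
```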
- Mw:
a = protein Mw
b = sample Mw (if known, otherwise Mw score = 1)
s = 1 - |a-b|/(a+b)
s scores the difference between the two Mw values as a number in the range [0;1], with 1 meaning identical values.
Mw score = s raised to the power of the Mw factor.
The factor is the weight of the Mw in the final score:
if factor = 0: no weight
if factor = 2: the Mw is squared in the score.
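Putting the three protein-level terms together, S2 can be sketched as below. The function and argument names are hypothetical, the coverage ratio is assumed to be already computed, and unknown sample values are assumed to be passed as None.

```python
def protein_level_score(coverage_ratio, protein_pi, sample_pi,
                        protein_mw, sample_mw,
                        coverage_factor=1.0, pi_factor=1.0, mw_factor=1.0):
    """S2 = coverage score * pI score * Mw score."""

    def closeness(a, b):
        # Unknown sample value: the corresponding score is 1.
        if b is None:
            return 1.0
        return 1.0 - abs(a - b) / (a + b)

    return (coverage_ratio ** coverage_factor
            * closeness(protein_pi, sample_pi) ** pi_factor
            * closeness(protein_mw, sample_mw) ** mw_factor)


# Coverage 0.5 squared, pI 6.0 against 4.0 (s = 0.8), identical Mw (s = 1):
# 0.25 * 0.8 * 1, approximately 0.2.
print(protein_level_score(0.5, 6.0, 4.0, 30000.0, 30000.0, coverage_factor=2))
```

Setting a factor to 0 turns the corresponding term into 1, so that parameter no longer affects S2.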