mGene.web Performance
Here we report the performance of mGene.web for different species and data set sizes. We evaluate prediction performance at the signal and content levels, using the area under the ROC curve (auROC) and the area under the precision-recall curve (auPRC) as evaluation measures. The performance measures shown for the signal and content predictors are out-of-sample estimates. We additionally evaluate mGene.web's gene prediction performance, reporting results at the nucleotide, exon, and transcript levels on a validation set. The exception is the "nGASP small" set, where the number of genes was too small to create a reliable validation set; for that set we give evaluation measurements on the training set instead.
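As an illustration of the two measures named above, the following is a minimal sketch of how auROC and auPRC can be computed from labeled predictor scores. It is not mGene.web's own evaluation code: the function names `au_roc` and `au_prc` are hypothetical, auPRC is approximated here by step-wise average precision, and the toy labels and scores are made up for the example. The auPRC sketch also assumes there are no tied scores.

```python
def au_roc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def au_prc(labels, scores):
    """Area under the precision-recall curve, as step-wise average precision.

    Assumption: scores are distinct (ties would make the ranking ambiguous).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    true_pos = 0
    area = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            area += true_pos / rank  # precision at each new recall level
    return area / n_pos


# Toy example (made-up data): 1 = true signal site, 0 = decoy position.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(au_roc(labels, scores), 3), round(au_prc(labels, scores), 3))
```

Because true signal sites are typically rare compared with decoy positions, auPRC tends to react more strongly to false positives than auROC does, which is one reason the two values can differ markedly for the same predictor.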
| Gene Signal | TSS auROC | TSS auPRC | TIS auROC | TIS auPRC | ACC auROC | ACC auPRC | DON auROC | DON auPRC | cdsStop auROC | cdsStop auPRC | Cleave auROC | Cleave auPRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Caenorhabditis elegans (40 genes) | 0.791 | 0.439 | 0.888 | 0.366 | 0.973 | 0.744 | 0.979 | 0.811 | 0.897 | 0.537 | 0.900 | 0.597 |
| Caenorhabditis elegans (nGASP confirmed) | 0.948 | 0.886 | 0.883 | 0.563 | 0.991 | 0.937 | 0.993 | 0.946 | 0.932 | 0.731 | 0.898 | 0.575 |
| Caenorhabditis elegans (nGASP all) | 0.961 | 0.933 | 0.886 | 0.566 | 0.991 | 0.941 | 0.988 | 0.932 | 0.919 | 0.694 | 0.889 | 0.817 |
| Drosophila melanogaster | 0.934 | 0.714 | 0.951 | 0.795 | 0.986 | 0.934 | 0.992 | 0.959 | 0.965 | 0.858 | 0.954 | 0.780 |
| Saccharomyces cerevisiae | 0.999 | 0.991 | 0.954 | 0.954 | 0.939 | 0.750 | 0.995 | 0.940 | 0.987 | 0.934 | 0.996 | 0.974 |
| Arabidopsis thaliana | 0.965 | 0.797 | 0.959 | 0.817 | 0.986 | 0.929 | 0.991 | 0.953 | 0.960 | 0.816 | 0.938 | 0.653 |
| Aspergillus nidulans | 0.999 | 0.988 | 0.946 | 0.760 | 0.965 | 0.827 | 0.987 | 0.927 | 0.960 | 0.806 | 0.998 | 0.973 |
| Tetraodon nigroviridis | 0.945 | 0.795 | 0.897 | 0.606 | 0.974 | 0.877 | 0.985 | 0.922 | 0.930 | 0.739 | 0.937 | 0.728 |
| Anopheles gambiae | 0.935 | 0.728 | 0.925 | 0.720 | 0.961 | 0.871 | 0.975 | 0.910 | 0.951 | 0.817 | 0.923 | 0.708 |
| Ciona savignyi | 0.779 | 0.321 | 0.848 | 0.428 | 0.947 | 0.832 | 0.964 | 0.872 | 0.930 | 0.699 | 0.852 | 0.475 |
| Gene Segment Type | Intergenic auROC | Intergenic auPRC | utr5_exon auROC | utr5_exon auPRC | cds_exon auROC | cds_exon auPRC | utr3_exon auROC | utr3_exon auPRC | intron auROC | intron auPRC |
|---|---|---|---|---|---|---|---|---|---|---|
| Caenorhabditis elegans (40 genes) | 0.983 | 0.829 | 0.759 | 0.145 | 0.862 | 0.847 | 0.794 | 0.531 | 0.702 | 0.119 |
| Caenorhabditis elegans (nGASP confirmed) | 0.864 | 0.772 | 0.838 | 0.205 | 0.980 | 0.969 | 0.760 | 0.126 | 0.851 | 0.588 |
| Caenorhabditis elegans (nGASP all) | 0.818 | 0.521 | 0.885 | 0.495 | 0.965 | 0.942 | 0.976 | 0.788 | 0.959 | 0.924 |
| Drosophila melanogaster | 0.436 | 0.258 | 0.818 | 0.297 | 0.960 | 0.945 | 0.822 | 0.509 | 0.716 | 0.180 |
| Saccharomyces cerevisiae | 0.949 | 0.895 | 0.892 | 0.704 | 0.932 | 0.928 | 0.978 | 0.854 | 0.956 | 0.869 |
| Arabidopsis thaliana | 0.778 | 0.736 | 0.864 | 0.422 | 0.955 | 0.908 | 0.813 | 0.263 | 0.940 | 0.856 |
| Aspergillus nidulans | 0.890 | 0.633 | 0.889 | 0.504 | 0.847 | 0.470 | 0.977 | 0.788 | 0.945 | 0.842 |
| Tetraodon nigroviridis | 0.704 | 0.387 | 0.861 | 0.394 | 0.957 | 0.925 | 0.949 | 0.645 | 0.948 | 0.911 |
| Anopheles gambiae | 0.818 | 0.157 | 0.607 | 0.091 | 0.883 | 0.565 | 0.596 | 0.017 | 0.871 | 0.723 |
| Ciona savignyi | 0.581 | 0.161 | 0.666 | 0.277 | 0.933 | 0.695 | 0.660 | 0.136 | 0.871 | 0.723 |
| Evaluation level | Exon SN | Exon SP | Transcript SN | Transcript SP |
|---|---|---|---|---|
| Caenorhabditis elegans (40 genes) | 0.675 | 0.691 | 0.231 | 0.310 |
| Caenorhabditis elegans (nGASP confirmed) | 0.860 | 0.841 | 0.429 | 0.478 |
| Caenorhabditis elegans (nGASP all) | 0.727 | 0.736 | 0.346 | 0.410 |
| Drosophila melanogaster | 0.812 | 0.825 | 0.536 | 0.612 |
| Saccharomyces cerevisiae | 0.929 | 0.912 | 0.917 | 0.932 |
| Arabidopsis thaliana | 0.887 | 0.894 | 0.634 | 0.680 |
| Aspergillus nidulans | 0.754 | 0.674 | 0.507 | 0.546 |
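The page does not define SN and SP. A common convention in gene-finding evaluation takes sensitivity as the fraction of annotated exons that are predicted exactly and specificity as the fraction of predicted exons that are exactly correct. The sketch below assumes that convention; it may not match mGene.web's evaluation setup. The function name `exon_sn_sp`, the `(start, end)` exon representation, and the coordinates are hypothetical.

```python
def exon_sn_sp(annotated, predicted):
    """Exon-level sensitivity and specificity under exact-boundary matching.

    Assumed definitions (not taken from the source page):
      SN = correctly predicted exons / annotated exons
      SP = correctly predicted exons / predicted exons
    """
    annotated, predicted = set(annotated), set(predicted)
    correct = len(annotated & predicted)  # both boundaries must match
    sn = correct / len(annotated) if annotated else 0.0
    sp = correct / len(predicted) if predicted else 0.0
    return sn, sp


# Made-up coordinates: exons as (start, end) on one sequence and strand.
annotated = [(100, 200), (300, 400), (500, 650)]
predicted = [(100, 200), (300, 410), (500, 650), (700, 800)]
sn, sp = exon_sn_sp(annotated, predicted)
print(f"SN={sn:.3f} SP={sp:.3f}")
```

The same counting scheme carries over to the transcript level if each transcript is represented by its full ordered tuple of exons, so that a transcript counts as correct only when every exon matches. This is stricter than exon-level matching, which fits the lower transcript-level values in the table.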
These are preliminary results generated with a beta version of mGene.web. They may change in the near future. Moreover, this evaluation uses a different setup than for instance .