Peer ranking supplementary material
Here is a gzipped tar file with an implementation
of the peer ranking method, implementations of some methods for comparison,
some data, and some scripts for evaluating the methods.
About the yeast protein-protein data:
- Each record corresponds to a protein pair.
- The fields of a record are separated by commas.
- The last field is the gold-standard designation of whether or not
there is interaction.
- The first five fields are the sources of evidence:
  - pulldown,
  - yeast two-hybrid,
  - co-expression,
  - functional annotation, and
  - essentiality.
- Missing data is indicated by "?".
- This dataset was consolidated from data obtained from
the Gerstein Lab web site.
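The record layout described above could be read with a short Python sketch such as the following. This is not part of the distributed software. The function name, the dictionary keys, and the sample line in the usage note are made up for illustration. The sketch keeps field values as strings because the encoding of the evidence values and of the gold-standard label is not specified here, and it assumes each record has at least six fields.

```python
import csv

# Hypothetical keys for the five evidence fields, in the order listed above.
EVIDENCE_FIELDS = [
    "pulldown",
    "yeast_two_hybrid",
    "co_expression",
    "functional_annotation",
    "essentiality",
]


def parse_records(lines):
    """Parse comma-separated records into (evidence, label) pairs.

    `lines` is any iterable of text lines, such as an open file.
    Evidence values equal to "?" are returned as None (missing data).
    All other values, including the label, are left as strings.
    """
    records = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        values = [field.strip() for field in row]
        if len(values) < len(EVIDENCE_FIELDS) + 1:
            raise ValueError("expected at least 6 fields, got %d" % len(values))
        evidence = {
            name: (None if value == "?" else value)
            for name, value in zip(EVIDENCE_FIELDS, values)
        }
        label = values[-1]  # the last field is the gold-standard designation
        records.append((evidence, label))
    return records
```

For example, calling parse_records(["0.5,?,0.1,?,1,1"]) on that invented line yields one record whose yeast_two_hybrid and functional_annotation entries are None and whose label is the string "1".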
The following paper describes the method:
P. M. Long, V. Varadan,
S. Gilman, M. Treshock and R. A. Servedio.
Unsupervised evidence integration. ICML'05.
After publishing this paper, we discovered a bug in the code
that evaluates the EM algorithm on the protein-protein
data. The software above includes a patch.
Here is a revision of Table 1 from the paper:
Algorithm | Protein-Protein                | Synth
Peer      | 0.947                          | 0.977
Eigen     | 0.862                          | 0.899
k-means   | 0.862                          | 0.724
EM        | 0.887 (buggy value was 0.848)  | 0.911
Here are the ROC curves on the protein-protein data:
Please send email to tell us about your experiences with the software. We are also glad to answer questions about it.