Quantifying Accuracy


#1

Hi guys,

I know it’s a whole field in itself, but do you have any advice on quantifying, or putting numbers to, the accuracy of my classifier?

I’m currently looking at 5 phenotypes, and have built a few classifiers from ~200 images, using ~100 cells for each class. I have also held some images back to use as previously unseen data for testing the classifier.
I usually assess accuracy by scoring all cells, then comparing correctly versus incorrectly classified cells. I can also look at the enrichment scores for the image sets, which always look better of course.

Now, I’m comparing the ability of a couple of classifiers against each other - i.e. one looks at all 5 phenotypes, another looks at a subset of 4, then I have two which look at 3 phenotypes.

I’d really like to be able to plot ROC curves for each of these classifiers so I can compare them directly using AUC. Is there a way to get data out to do this? I’ve tried using enrichment scores, but they aren’t really suited to it.

Any help much appreciated,
Cheers,
Paul


#2

Hi Paul,

This is a good question, and CPA’s “Check Progress” button is at least an attempt at answering it. It does a cross-validation of the training set (10-fold, I believe), though there are other metrics out there, as you say. With more than two classes, though, it is hard to tell from our output plot which classes are contributing the error. I forwarded this question to others in the group who may have code for this available, and we have a couple of students now who might have time to look into it, but we’ll see what they say.
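In the meantime, one way to see which classes are contributing the error is a confusion matrix computed outside CPA from your held-back cells. The following is only a minimal sketch in plain Python, not anything CPA provides: the class names and label lists are made up, and it assumes you can get the true and predicted class of each test cell into two lists by some means.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(matrix)]

# Hypothetical held-out labels (illustration only, not real data)
classes = ["A", "B", "C"]
y_true = ["A", "A", "A", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "B", "B", "B", "C", "A", "C"]

matrix = confusion_matrix(y_true, y_pred, classes)
recall = per_class_recall(matrix)

for name, row, r in zip(classes, matrix, recall):
    print(name, row, round(r, 3))
```

Off-diagonal counts show which pairs of phenotypes get confused, which a single overall accuracy number hides.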

Cheers,
David


#3

Follow-up reply from one of our group members:

[quote]XValidate in FastGentleBoosting.py will give the number of misclassified examples, but I don’t think there’s a way to get the information he wants with the current version. The only measure of accuracy right now is the cross-validation from Check Progress. We haven’t talked specifically about which metrics should be added, but we do want to improve in that area.

For future reference, if we want a Python implementation of multiclass ROC curves, see here: scikit-learn.org/dev/auto_exampl … lot-roc-py [/quote]
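To make the multiclass idea concrete, here is a rough one-vs-rest AUC sketch in plain Python. It is not the scikit-learn example linked above and not CPA code. It assumes you somehow have a per-class score for every held-out cell, which the current version may not export, and the labels and scores below are invented for illustration. Each class is treated as "this class versus everything else", and AUC is computed as the chance that a randomly chosen positive cell scores higher than a randomly chosen negative one (ties count as half).

```python
def binary_auc(is_positive, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive-over-negative wins; a tie contributes half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(true_labels, class_scores):
    """class_scores maps class name -> list of per-cell scores for that class."""
    return {c: binary_auc([t == c for t in true_labels], s)
            for c, s in class_scores.items()}

# Hypothetical true labels and per-class scores for six cells
y_true = ["A", "A", "B", "B", "C", "C"]
scores = {
    "A": [0.9, 0.6, 0.7, 0.1, 0.2, 0.3],
    "B": [0.1, 0.3, 0.2, 0.8, 0.4, 0.1],
    "C": [0.0, 0.1, 0.1, 0.1, 0.4, 0.6],
}

auc = one_vs_rest_auc(y_true, scores)
macro_auc = sum(auc.values()) / len(auc)  # unweighted mean over classes
print(auc, macro_auc)
```

The macro average gives one number per classifier, but note that comparing classifiers trained on different numbers of phenotypes this way is only a rough guide, since the class sets differ.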

Hope that helps, at least for now.
David