CPA workflow help


#1

I’m a newbie, so it isn’t clear to me from my results where I may be going wrong. The rules resulting from running CPA Classifier reference object location and number. I was expecting something more interesting, like texture, to be used for classification. I have two sets of U2OS cell images: (1) DMSO and (2) Cytochalasin D (blocks actin elongation), 128 images/set. I selected all measurements in CellProfiler/Export to Spreadsheet, renamed two of the output files to “Per_Image” and “Per_Object”, and amended column names to ‘Cells_Location_X’, etc. to match the properties file. Is this the proper workflow? I don’t think so b’c (1) the CellProfiler Export to Database references the Per_Object and Per_Image tables (but I don’t know where they are or how to edit them to coordinate w/the properties file, attached) and (2) I’m getting errors when fetching images by specific #. Is this something someone could kindly help me with? Thank you very much for your time.
SQLite - Copy.csv (6.25 KB)


#2

Hi,

Indeed, some “features” you do likely want to ignore. Set these columns in your properties file with the “classifier_ignore_columns” setting. Something like

classifier_ignore_columns = ImageNumber, ObjectNumber, .*_Location_Center_.*

As for the rest of your workflow, you seem to be overcomplicating it. So let’s simplify!

First

For CellProfiler Analyst to work, it expects a database, so you should use ExportToDatabase. However, you apparently already are, since your properties file says so? Perhaps you are using both ExportToSpreadsheet and ExportToDatabase, which is technically OK, but it will essentially duplicate your output. I would suggest using only ExportToDatabase > SQLite, as you are doing, and then downloading a SQLite browser so you can look at the output tables in the database listed in your properties file: Z:\Staff\Sheila\CellProfiler\20150817\SQLite.db

In the .db file, you will find the Per_Image and Per_Object tables already formed with all the requisite Cell object numbers and locations, etc. In fact, you shouldn’t NEED to look there for CPA to function, but it is good to inspect your data. Your main configuration for CPA is only your properties file.
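If you would rather not install a separate SQLite browser, the same inspection can be done with Python's built-in sqlite3 module. The following is only a minimal sketch: the helper name list_tables_and_columns and the tiny in-memory demo tables are made up for illustration, and in practice you would pass sqlite3.connect(...) the path to your own .db file instead.

```python
import sqlite3

def list_tables_and_columns(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return {t: [col[1] for col in conn.execute(f'PRAGMA table_info("{t}")')]
            for t in tables}

# Hypothetical stand-in for a real ExportToDatabase output file.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Per_Image (ImageNumber INTEGER)")
demo.execute("CREATE TABLE Per_Object (ImageNumber INTEGER, ObjectNumber INTEGER, "
             "Cells_Location_Center_X REAL)")

schema = list_tables_and_columns(demo)
for table, columns in schema.items():
    print(table, columns)
```

Running this against the real database should show at a glance whether Per_Object contains anything beyond the location and numbering columns.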

Hope that helps!
David


#3

Progress has been made! I now see the utility of the TableViewer module in CPA :smile: . And the workflow is much simpler now, thanks! :smiley: When you have a moment, could you please take a look at (1) the attached properties file and (2) the rules from CPA classifier, and let me know why I’m still getting Number_Object_Number rules though I’ve tried to have those ignored? :confused:
RulesFromClassifier - Copy.csv (1.27 KB)
SQLite - Copy.csv (6.3 KB)


#4

Hi,

Sounds like good progress.

As for the ignore columns issue, you have one extra underscore at the end of “.*_Number_Object_Number_.*” that causes this string to not match your columns.

Not as critical, but a good idea is to also change “image_number_key_column” to “ImageNumber” and “object_number_key_column” to “ObjectNumber” in your classifier_ignore_columns line. These “…key_column…” phrases were meant to be place holders for the actual column names. (It doesn’t appear like you have TableNumber which is the most common situation, so you can remove that one altogether.)

So you could change your classifier_ignore_columns line to something like

classifier_ignore_columns = ImageNumber, ObjectNumber, .*_Location_Center_.*,.*_Metadata_.*, .*_Number_Object_Number.*, .*Parent.*, .*Children.*
though you may want to keep the Parent and Children columns, depending on your situation.
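To see why the trailing underscore matters, here is a small Python sketch of regex-based column filtering. It assumes each pattern must match the whole column name (hence re.fullmatch); that is an assumption about how CPA applies the patterns, not something confirmed here. The helper filter_columns and the example column names are hypothetical.

```python
import re

def filter_columns(columns, ignore_patterns):
    """Keep only the columns that match none of the ignore patterns."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [c for c in columns if not any(rx.fullmatch(c) for rx in compiled)]

# Hypothetical column names in the style of a Per_Object table.
columns = [
    "ImageNumber",
    "ObjectNumber",
    "Cells_Location_Center_X",
    "Cells_Number_Object_Number",
    "Cells_Texture_Contrast_Actin_3",
]

common = ["ImageNumber", "ObjectNumber", ".*_Location_Center_.*"]

# Extra trailing underscore: the column name ends right after "Number",
# so there is no "_" left for the pattern to match.
kept_with_typo = filter_columns(columns, common + [".*_Number_Object_Number_.*"])

# Corrected pattern: the column is now ignored.
kept_fixed = filter_columns(columns, common + [".*_Number_Object_Number.*"])

print(kept_with_typo)
print(kept_fixed)
```

With the extra underscore, Cells_Number_Object_Number survives the filter and can still show up in the rules; with the corrected pattern, only the texture column remains.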

David


#5

In my continued quest for an interesting result…

Q1 :question: : When I include ImageNumber, ObjectNumber, .*_Location_Center_.*,.*_Metadata_.*, .*_Number_Object_Number.*, .*Parent.*, .*Children.* in the ignore_columns, I get this error: No columns were found to use for classification after filtering… check Per_Object table. When I check the Per_Object table, sure enough, the only data shown is location/Parent/Children related. In CellProfiler, ExportToDatabase (screenshot attached), I’d like to select All in ‘Export measurements for all objects to the database’, however, I can’t get past the ‘Create one table per object, a single object table or a single object view’ w/o a warning. I am missing something conceptual here. Would you kindly elaborate why I can’t get measurements besides ImageNumber, ObjectNumber, .*_Location_Center_.*,.*_Metadata_.*, .*_Number_Object_Number.*, .*Parent.*, .*Children.* into CPA? :open_mouth:

Q2 :question: : When I remove ignore_columns for analysis, after using ‘score all’ in the classifier module, how does one interpret the resulting Hit table? How are the images ranked? For example, in my attached result, Images 1-8 are treated, 9-16 non-treated. I can kinda see the images ordered loosely in that way, but I don’t see how the data to the right reflect this ranking.

Thank you for your expert guidance. :exclamation:
-sheila :confused:
ScoreAllTable.csv (1.07 KB)


#6

Hi Sheila,

Sorry for the late reply – have you made progress?

Quick answers to your questions:
(1) Have you added many Measure modules upstream of ExportToDatabase? You should see lots of “Measure_*” column headers in your Per_Object table. You can post your properties file here too for us to take a look at to verify it.
(2) You can read the CPA manual for the best description of “Enrichment score” cellprofiler.org/linked_files/Do … manual.pdf
Also, as far as I can determine, the data in the CSV are not sorted by anything. Note that CPA’s output table is sortable when you click the headers.

Hope that helps,
David