Simple batch file-list


#1

Hi -

I’m new to CellProfiler, so I’m trying to learn batch processing with the simplest example I could think of: porting the very first HumanCells example [1] to run in batch mode.

I’m running version 2.2.0rc3 on a Mac [2], and after getting Java set up I can run the whole example via the GUI and generate valid output images. But when I export the pipeline and the Image_file_list and try to run from the command line, I can’t get things working, even though I think I have the paths correct.

Here’s the command line I use:

./CellProfiler -r -c -p /tmp/humans/ExampleHuman.cppipe -o /tmp/humans/output --data-file=/tmp/humans/Image_file_list.csv -L 10

I get very verbose output with lots of apparently unrelated Java errors (they also appear in GUI mode, where this example works).

The key error message appears to be this one:

CP-JAVA 11:54:47.283 [Thread-1] WARN o.c.imageset.ChannelFilter - Empty image set list: no images passed the filtering criteria.

I’ve seen references to others seeing this problem but never an answer about the root cause.

I’ve uploaded the pipeline and other data here [3]. Would someone be able to try this and point me in the right direction?

Thanks in advance!
-Steve

[1] http://cellprofiler.org/examples.html#HumanCells

[2] $ ./CellProfiler --version
Mar 25 12:01:03 CellProfiler[66710] : CellProfiler 2.2.0rc3
Mar 25 12:01:03 CellProfiler[66710] : Git b91da8e
Mar 25 12:01:03 CellProfiler[66710] : Version 20160308191757
Mar 25 12:01:03 CellProfiler[66710] : Built 2016-03-08T19:17:57

[3] https://dl.dropboxusercontent.com/u/1686930/humans.zip


#2

I tried this (changing my local paths appropriately) and also get the same error. @LeeKamentsky?

@Steve_Pieper Have you tried the --get-batch-commands switch to help? Though this looks ok to me.


#3

I ran this same pipeline using the ‘--get-batch-commands’ switch (after running it locally to get the h5 file) and it produced the error below. A grouping issue in headless mode?

./CellProfiler -c -r -p /Users/dlogan/Projects_local/Forum/Simple_batch_file-list/humans/output/DefaultOUT.h5 -g ImageNumber=1

Mar 25 13:35:28 wme2f-11a CellProfiler[97641] : at javassist.CtClassType.getClassFile2(CtClassType.java:190)
Mar 25 13:35:28 wme2f-11a CellProfiler[97641] : … 37 more
Mar 25 13:35:30 wme2f-11a CellProfiler[97641] : Version: 2016-03-08T19:17:57 b91da8e / 20160308191757
Mar 25 13:35:30 wme2f-11a CellProfiler[97641] : Failed to stop Ilastik
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : vigra import: failed to import the vigra library. Please follow the instructions on
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : “http://hci.iwr.uni-heidelberg.de/vigra/” to install vigra
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : Traceback (most recent call last):
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/modules/classifypixels.py”, line 50, in
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : ImportError: No module named vigra
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : vigra import: failed to import the vigra library. Please follow the instructions on
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : “http://hci.iwr.uni-heidelberg.de/vigra/” to install vigra
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : Traceback (most recent call last):
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/modules/ilastik_pixel_classification.py”, line 47, in
Mar 25 13:35:31 wme2f-11a CellProfiler[97641] : ImportError: No module named vigra
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : CP-JAVA 13:35:33.024 [Thread-1] WARN o.c.imageset.ChannelFilter - Empty image set list: no images passed the filtering criteria.
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : Times reported are CPU times for each module, not wall-clock time
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : Uncaught exception in CellProfiler.py
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : Traceback (most recent call last):
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/main.py”, line 252, in main
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/main.py”, line 918, in run_pipeline_headless
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1650, in run
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1760, in run_with_yield
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1678, in group
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : ValueError: The grouping keys specified on the command line (ImageNumber) must be the same as those defined by the modules in the pipeline ()
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : Failed to stop Ilastik
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : Traceback (most recent call last):
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/boot.py”, line 377, in
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : _run()
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/boot.py”, line 358, in _run
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : exec(compile(source, path, ‘exec’), globals(), globals())
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/CellProfiler.py”, line 4, in
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : cellprofiler.main.main()
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/main.py”, line 252, in main
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : run_pipeline_headless(options, args)
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/main.py”, line 918, in run_pipeline_headless
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : initial_measurements = initial_measurements)
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1650, in run
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : initial_measurements = measurements):
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1760, in run_with_yield
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : in group(workspace):
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : File “/Applications/CellProfiler_beta_2.2.0.app/Contents/Resources/lib/python2.7/cellprofiler/pipeline.py”, line 1678, in group
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : ", ".join(grouping.keys()), ", ".join(keys)))
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : ValueError: The grouping keys specified on the command line (ImageNumber) must be the same as those defined by the modules in the pipeline ()
Mar 25 13:35:33 wme2f-11a CellProfiler[97641] : 2016-03-25 13:35:33.168 CellProfiler[97641:698058] CellProfiler Error

Then I tried without the ‘-g’ switch:

./CellProfiler -c -r -p /Users/dlogan/Projects_local/Forum/Simple_batch_file-list/humans/output/DefaultOUT.h5

and got output just fine!


#4

You should be using LoadData for loading a CSV. Then you can use the --data-file switch to tell LoadData to use this CSV. Just put the LoadData module at the top of the pipeline and ignore the “Legacy” warnings.

The alternative is to use the --file-list switch with the input modules. The file list should have one file per line, but it can’t carry any metadata.
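For anyone unsure how the two inputs differ, here is a minimal Python sketch that writes both. It is only an illustration: the function names are made up, the image path is a hypothetical placeholder, and the CSV header names (Image_FileName_<channel> and Image_PathName_<channel>) are my assumption about what LoadData expects, so check them against the LoadData help for your version.

```python
import csv
import io
import os


def file_list_text(image_paths):
    """Plain file list: one path per line, no header, no metadata."""
    return "".join(f"{path}\n" for path in image_paths)


def loaddata_csv_text(image_paths, channel):
    """CSV for LoadData with one file-name and one folder column per channel.

    The header naming convention used here is an assumption, not taken
    from this thread.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([f"Image_FileName_{channel}", f"Image_PathName_{channel}"])
    for path in image_paths:
        folder, name = os.path.split(path)
        writer.writerow([name, folder])
    return buf.getvalue()


# Hypothetical image path, for illustration only.
paths = ["/tmp/humans/images/example_dna.tif"]
print(file_list_text(paths), end="")
print(loaddata_csv_text(paths, "DNA"), end="")
```

The CSV form has room for extra metadata columns next to the file columns, which the plain file list does not.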


#5

Success! Thanks so much. Yes, adding the LoadData module to the pipeline was the key.

In case anyone else is seeing the same issue, here’s the command line and the pipeline + data that worked for me.

./CellProfiler -r -c -p /tmp/humans/ExampleHumanLoad.cppipe -o /tmp/humans/output --data-file=/tmp/humans/Input_file_list.csv -L 10

https://dl.dropboxusercontent.com/u/1686930/humans-with-LoadData.zip

Also, in case anyone is trying the same thing with the containerized CellProfiler, the following command worked for me with the same pipeline and data (just with the /tmp/humans paths changed to /humans everywhere).

docker run -v /home/pieper/humans:/humans cellprofiler/cellprofiler -r -c -p /humans/ExampleHumanLoad.cppipe -o /humans/output

Thanks again Lee and David!
Steve


#6

Hello,

We are also experimenting with the --get-batch-commands switch and ran into the same “grouping error”.

If we produce a Batch_data.h5 file without grouping, the printed commands look like this:

CellProfiler -c -r -p /.../Batch_data.h5 -g ImageNumber=69

Running this command raises the error that the grouping keys on the command line do not match the grouping keys defined in the module.

If we replace the command by the one below, it works just fine:

CellProfiler -c -r -p /.../Batch_data.h5 -f 69 -l 69

Could the code behind --get-batch-commands get changed to do this automatically?
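As a stopgap, the substitution described above can be scripted. This is only a sketch of a workaround, not what CellProfiler does internally: the function name is made up, the path in the example is a hypothetical placeholder, and it assumes the printed commands have exactly the form shown in this post. Commands that group on anything other than a lone ImageNumber are left untouched.

```python
import re

# Matches a lone "-g ImageNumber=<N>" argument; anything with extra
# grouping keys (e.g. "ImageNumber=1,Plate=P1") is deliberately skipped.
_IMAGE_NUMBER_GROUP = re.compile(r"(?<!\S)-g ImageNumber=(\d+)(?!\S)")


def rewrite_batch_command(command):
    """Replace '-g ImageNumber=N' with '-f N -l N' in one command line."""
    return _IMAGE_NUMBER_GROUP.sub(r"-f \1 -l \1", command)


# Hypothetical path, for illustration only.
example = "CellProfiler -c -r -p /data/Batch_data.h5 -g ImageNumber=69"
print(rewrite_batch_command(example))
```

To use it, pipe each line printed by --get-batch-commands through rewrite_batch_command before submitting it.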


#7

This looks like a bug. I’ll file an issue for you.

Issue: https://github.com/CellProfiler/CellProfiler/issues/2725