Object name in pipeline saved as utf-16


#1

Bit of a convoluted problem…

We have a pipeline that was created and saved with CellProfiler 3.0.0 on Windows, which is then run on a Linux cluster running CellProfiler v3.1.3 (forked roughly last week).

The pipeline runs fine until the final ExportToSpreadsheet module, which then produces the error:

Failed to complete post_run processing for module ExportToSpreadsheet.
Traceback (most recent call last):
  File "/exports/igmm/eddie/Drug-Discovery/scott/.conda_envs/cellprofiler/lib/python2.7/site-packages/cellprofiler/pipeline.py", line 2191, in post_run
    module.post_run(workspace)
  File "/exports/igmm/eddie/Drug-Discovery/scott/.conda_envs/cellprofiler/lib/python2.7/site-packages/cellprofiler/modules/exporttospreadsheet.py", line 643, in post_run
    self.run_objects(object_names, workspace, first_group)
  File "/exports/igmm/eddie/Drug-Discovery/scott/.conda_envs/cellprofiler/lib/python2.7/site-packages/cellprofiler/modules/exporttospreadsheet.py", line 708, in run_objects
    workspace, settings_group)
  File "/exports/igmm/eddie/Drug-Discovery/scott/.conda_envs/cellprofiler/lib/python2.7/site-packages/cellprofiler/modules/exporttospreadsheet.py", line 1098, in make_object_file
    ofeatures.sort()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 7: ordinal not in range(128)

So it looks like there’s a feature or object name containing a strange character. The pipeline is here, though I can’t find anything strange in there.

I stuck a print ofeatures in our cellprofiler build in ExportToSpreadsheet, which prints out:

[..., (u'FilteredNuclei', 'Parent_\xff\xfeN\x00u\x00c\x00l\x00e\x00i\x00'), (u'FilteredNuclei', u'Children_Cells_Count'), ...]

According to Python 2, "\xff\xfeN\x00u\x00c\x00l\x00e\x00i\x00".decode("utf16") is u'Nuclei'.
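
The traceback and the printed name seem to line up: 'Parent_' is 7 characters long, so byte 0xff sits at position 7, which is where the error points. Most likely the sort compares this byte string against a unicode string, and Python 2 tries an implicit ASCII decode to do so. Below is a minimal sketch of that, written for Python 3 (the original ran on Python 2.7). `normalize_name` is a hypothetical helper for illustration only, not part of CellProfiler.

```python
import codecs

# Byte string copied from the printed feature list above.
raw = b"Parent_\xff\xfeN\x00u\x00c\x00l\x00e\x00i\x00"

# Decoding as ASCII explicitly fails at the same offset the traceback reports.
try:
    raw.decode("ascii")
    fail_at = None
except UnicodeDecodeError as err:
    fail_at = err.start  # index of the first undecodable byte
print(fail_at)


def normalize_name(name):
    """Hypothetical helper: return text for a name that may embed UTF-16 bytes."""
    if isinstance(name, str):
        return name  # already text, nothing to do
    bom_at = name.find(codecs.BOM_UTF16_LE)  # b"\xff\xfe"
    if bom_at == -1:
        return name.decode("ascii")
    # ASCII prefix, then a BOM-prefixed UTF-16 tail.
    return name[:bom_at].decode("ascii") + name[bom_at:].decode("utf-16")


print(normalize_name(raw))
```

Something like this could be applied to the feature names before sorting as a stop-gap, assuming the UTF-16 part always starts with the little-endian byte order mark as it does here.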

I’ve tried:

  • converting the pipeline file to utf-8 with:
    iconv -f ascii -t utf-8 old_pipeline > new_pipeline
  • opening the pipeline in Linux CellProfiler 3.0.0, retyping the object names and saving it again

but I’m still getting the same error.
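
One possible reason the iconv step changes nothing: ASCII is a subset of UTF-8, so a file that is already pure ASCII comes out byte-for-byte identical. If the pipeline file stores the name as a literal escape sequence (a backslash followed by xff, xfe and so on) rather than as raw bytes, then the file itself is plain ASCII and re-encoding cannot touch it. That storage format is an assumption here, not something confirmed in this thread. A rough Python 3 sketch for spotting such sequences follows; `find_escaped_utf16` and the sample line are hypothetical.

```python
import re

# Literal text "\xff\xfe" followed by one or more "<char>\x00" pairs,
# i.e. an escaped UTF-16-LE string as it would appear in a text file.
ESCAPED_UTF16 = re.compile(r"\\xff\\xfe(?:[^\\]\\x00)+")


def find_escaped_utf16(text):
    """Return (escaped, decoded) pairs for each escaped UTF-16 run in text."""
    found = []
    for match in ESCAPED_UTF16.finditer(text):
        escaped = match.group(0)
        # Turn the escape sequences back into real bytes, then decode them.
        raw = escaped.encode("ascii").decode("unicode_escape").encode("latin-1")
        found.append((escaped, raw.decode("utf-16")))
    return found


# Hypothetical pipeline line; a real file may look different.
sample = r"Hypothetical setting:\xff\xfeN\x00u\x00c\x00l\x00e\x00i\x00"

# Pure-ASCII text encodes to the same bytes in ASCII and UTF-8.
print(sample.encode("ascii") == sample.encode("utf-8"))

for escaped, decoded in find_escaped_utf16(sample):
    print(escaped, "->", decoded)
```

Running this over each line of the pipeline file would show whether any object names are stored this way, and which plain names they correspond to.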

I’ve not had any issues with this 3.1.3 version of CellProfiler until we started using pipelines from other users.

I’m pretty stumped, so any thoughts or work-arounds would be great.


#2

One of our SWEs is looking at this now, but we think it might be related to a recent change we made to UTF encoding. If you want to roll back to 3.0.0 (or b4a10e2863611de8b406bbe5bed8f1787af9fef5, which is the last commit before that pull request), that might be a good workaround in the immediate future until we can get it fixed!


#3

Hello Swarchal!

Do you mind posting your LoadData file for this pipeline, and (if possible) any images associated with it? I’d love to try and reproduce the issue on my machine so I can pinpoint how to fix it.

Thanks!


#4

Hi bcimini, AetherUnbound,

I’ve made a sample LoadData here, and the associated images are here, though you’ll need to fiddle with the LoadData paths.

Edit: looks like the forum doesn’t like ftp links. Images are located at:
ftp://ftp.ed.ac.uk/edupload/becka_test_images.tar.gz