CellProfiler 2.0 FAQ


This thread lists some of the more commonly asked questions we come across in the forum. Feel free to add requests to this thread if you think of a general question that would be broadly applicable to many users!


"What happened to the module/data tool names?" or "I can’t find my module/data tool?"

Some of the modules have changed their names between CellProfiler 1.0 and 2.0. In some cases, the change was made to make the nomenclature more consistent; in others, to reflect new, broader functionality.

A pipeline created in CellProfiler 1.0 and loaded into CellProfiler 2.0 will have the appropriate modules converted to their new names automatically. However, if you are looking for a module in CellProfiler 2.0 and can’t find it using the familiar CellProfiler 1.0 name, consult this list of modules that have changed names (former name in parentheses):

  • ConserveMemory (from SpeedUpCellProfiler)

  • ConvertObjectsToImage (from ConvertToImage)

  • EnhanceEdges (from FindEdges)

  • EnhanceOrSuppressFeatures (from EnhanceOrSuppressSpeckles)

  • ExpandOrShrinkObjects (from ExpandOrShrink)

  • ExportToSpreadsheet (from ExportToExcel)

  • FlagImage (from FlagImageForQC)

  • FilterObjects (from FilterByObjectMeasurement)

  • IdentifyObjectsManually (from IdentifyPrimManual)

  • IdentifyPrimaryObjects (from IdentifyPrimAutomatic)

  • IdentifySecondaryObjects (from IdentifySecondary)

  • IdentifyTertiaryObjects (from IdentifyTertiarySubregion)

  • LoadData (from LoadText)

  • MaskObjects (from Exclude)

  • MeasureGranularity (from MeasureImageGranularity)

  • MeasureObjectSizeShape (from MeasureObjectAreaShape)

  • ReassignObjectNumbers (from RelabelObjects)

  • RelateObjects (from Relate)

  • Smooth (from SmoothOrEnhance)

  • UnmixColors (from DifferentiateStains)

The functionality of some modules has been superseded by others. These original modules are now deprecated and are no longer present. Below are the deprecated modules, with the modules that should be used in their place in parentheses:

  • LoadImageDirectory (use MakeProjection, along with LoadImages with metadata)

  • IdentifyPrimLoG (use IdentifyPrimaryObjects, with “Method to distinguish clumped objects” set to “Laplacian of Gaussian”)

  • KeepLargestObject (use FilterObjects)

  • Combine (use ImageMath)

  • PlaceAdjacent (use Tile)

  • GroupMovieFrames (use LoadImages with a movie file type selected and “Group the movie frames” checked)

  • FilenameMetadata (use LoadImages with metadata)

  • SubtractBackground (use ApplyThreshold with grayscale setting)

  • CalculateRatios (use CalculateMath with ‘divide’ as the operation)

A new module category Data Tools has been created and some modules have been moved from their original categories. These modules may also be found in the Data Tools menu option. The moved modules are listed below along with the former category in parentheses:

  • CalculateMath (from Measurement)

  • CalculateStatistics (from Measurement)

  • DisplayDataOnImage (from Other… was “ShowDataOnImage”)

  • ExportToDatabase (from File Processing)

  • ExportToSpreadsheet (from File Processing)

  • FlagImage (from Image Processing)

Some modules have yet to be implemented in CellProfiler 2.0. Loading these modules will currently produce an error message with the option of skipping the module and loading the rest of the pipeline.

  • DICTransform

  • DisplayGridInfo

  • DisplayMeasurement

  • Restart

Some modules will not be implemented in CellProfiler 2.0. Where possible, a workaround for the functionality is suggested.

  • SplitOrSpliceMovie: Use LoadImages with metadata

Some Data Tools have yet to be implemented in CellProfiler 2.0 but will be in the future. Where possible, a workaround for the functionality is suggested.

  • ViewData: Export your data using ExportToSpreadsheet and open the resulting spreadsheet in a spreadsheet program like Excel

  • AddData: Use the LoadData module when running your analysis

  • DataLayout, ShowPlateMapData: Use CellProfiler Analyst’s PlateMapViewer tool

  • PlotMeasurement: Some of the functionality is in the CP 2.0 data tool scatterplot and histogram, with the remaining types of plots to be added in the future.

  • SubmitBatch: Use the BatchProfiler tool

  • CalculateRatios and MeasurementCalculator: Use the CalculateMath data tool

Some Data Tools will not be implemented or supported in CellProfiler 2.0:

  • ClearData

  • ConvertBatchFiles: Any of the data tools will work on a batch file without needing to convert it first.

  • GenerateHistogramMovie

  • ExportLocations

Some Menu options have not been implemented in CellProfiler 2.0. Where possible, a workaround for the functionality is suggested.

  • Save pipeline as text: Unnecessary in CellProfiler 2.0; pipelines are now saved in a CellProfiler text format with the extension .cp.


What in the world does this output measurement name refer to: “mean_Bacteria_Intensity_IntegratedIntensity_MaskGreen”, or “mean, Bacteria, Intensity_IntegratedIntensity_MaskGreen”?

  • The first “mean_” is a bit complicated. If the measurement is from the Per_Image database table, or the Image spreadsheet, then it refers to the mean of that measurement across all objects in an image. However, if the measurement is from the Per_Object table, then the output is for a Parent object from a Relate module with the setting for “… generate per-parent means …” set to Yes, and it refers to the mean of the measurement for all the children across that particular parent object. To make it even more complicated, you can have the Mean across all parent objects in an image, of the Mean of the children within a parent (this measurement would be named something like “Mean, , Mean__…”). (We have debated whether to even include these “means of means”, as their scientific significance is unclear and they may be more confusing than useful, but we have left them in for now.)

  • “Intensity” is the measurement category

  • “IntegratedIntensity” is the intensity of the Bacteria object, summed across its extent. But note that since this whole measurement name begins with “Mean_”, the single number (in the Per_Object table) is the average of the individual objects’ summed intensities

  • “_MaskGreen” refers to the input image, which is the Green channel with a Mask applied at some point

This CP manual page may help as well.
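
As a rough illustration of the naming scheme described above, the example name can be split into its parts. This is only a sketch, not CellProfiler’s own code: the function name `parse_measurement_name` is hypothetical, and it assumes exactly five underscore-separated parts with no underscores inside any part, which holds for this example but not for every measurement name.

```python
def parse_measurement_name(name):
    """Split an aggregate measurement name into its parts.

    Assumes exactly five underscore-separated parts and that no part
    contains an underscore itself (true for the FAQ example only).
    """
    aggregate, object_name, category, feature, image_name = name.split("_")
    return {
        "aggregate": aggregate,   # e.g. Mean: averaged over objects
        "object": object_name,    # the object being measured
        "category": category,     # the measurement category
        "feature": feature,       # the specific measurement
        "image": image_name,      # the input image that was measured
    }


parts = parse_measurement_name(
    "Mean_Bacteria_Intensity_IntegratedIntensity_MaskGreen")
print(parts)
```

Names whose image or object names contain underscores would need a smarter split than this.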


Can CellProfiler handle my brightfield/phase contrast/DIC image?

CellProfiler was originally developed and optimized for fluorescent images. Automated cell segmentation from transmitted light images is notoriously tricky, not just for us but also for those in the biological computer vision community in general. However, there are some pre-processing steps that may help CellProfiler do the job:

  • IdentifyPrimAutomatic will typically be unsuccessful at finding cells since there is not a clear foreground and background as there is in fluorescence images. However, identifying cells may be possible depending on whether the cells are darker or lighter as compared to the background.

      • If they are lighter, then you may be able to use IdentifyPrimAutomatic to find the cells, provided you have “Fill holes…” set to “Yes” to fill in the (often) darker interior.

      • If the cells are darker, then you’ll want to use ImageMath to invert the image to make the cells lighter than the background, and then use IdentifyPrimAutomatic to find the cells. Sometimes, using top-hat filtering in the Smooth module can improve the contrast prior to object identification.

      • Often, it is the cell boundary that is darker (due to contrast/phase differences) and not the cell itself. In this case, you will still want to invert the image using ImageMath to make the edges lighter and use FindEdges to find the cell boundaries. Provided the edges form a closed loop, you can then use IdentifyPrimAutomatic to fill in the cell interior.

  • The trunk build version of CellProfiler has a module, DICTransform, that provides several algorithms for transforming a DIC image to enhance the brightness of cells relative to the background.

  • In transmitted images, obtaining a good illumination correction is important so you may want to try adjusting the microscope settings to get more uniform illumination, if possible, or use CorrectIllumination_Calculate and CorrectIllumination_Apply appropriately. This will make automatically identifying the cells with software much easier.


CellProfiler 2.0 takes a really long time to start on my Windows computer
We’ve noticed this can occur in Windows when CellProfiler is trying to run on a computer with antivirus software installed. To work around this issue, add the CellProfiler installation folder (which defaults to C:\Program Files\CellProfiler) to the list of folders that your antivirus software excludes from scanning; the location of this option will vary from package to package. For more technical details, see this post.


I’m running into memory problems upon loading an image in CellProfiler
We have several solutions that we have found to be helpful:

  • Use a 64-bit computer:
    64-bit machines can address more memory than 32-bit machines.

  • Downsample/crop your large images:
    For large images (e.g., >> 2000 x 2000 pixels), consider downsampling the image prior to loading in CellProfiler. Alternately, you might try splitting the image into non-overlapping tiles and processing each separately.

  • Use module windows sparingly:
    Each module window takes a certain amount of time and memory to produce. A single window that produces multiple panels (such as IdentifyPrimaryObjects), each containing a very large image, will increase the time and memory overhead, possibly until CellProfiler becomes unmanageable. Use Test mode to preview the results, but once you are ready for the full analysis run, select Window > Close all open windows to remove the windows that already exist, and then Window > Hide all windows on run to prevent window creation.

  • Use the ConserveMemory module:
    ConserveMemory removes selected intermediate images from memory, so it can be inserted into a pipeline to remove images that you know will no longer be used beyond that point.

  • Divide the workflow across several pipelines:
    Instead of making one pipeline that processes the images and then does the counting, measuring and so on, divide the work into a few pipelines. For example, the first pipeline might pre-process the images and output the pre-processed results to a folder. A second pipeline might contain the IdentifyPrimary modules for object identification. If this pipeline is successful, consider adding the measurement modules one at a time until a memory error occurs. If one does, each measurement module could be placed in a different pipeline if they cannot all be handled at once. A final step would be to merge the resultant spreadsheets. Modules that tend to be more memory-intensive include those that identify objects, and those that make the more complex measurements such as texture.
    Also keep in mind that the Run multiple pipelines feature can make it easier to run pipelines that depend on each other in sequence.

  • Batch processing your images:
    While CellProfiler clears the image cache with each cycle, it holds the cumulative measurements in memory. Therefore, it may be preferable to process your images in groups (e.g., if you have 30 images, process them in three separate groups of 10 images each). Running CellProfiler “headless” (i.e., without the graphical user interface) can assist in this task; the instructions shown are for running the source code, but the same applies if running the CellProfiler .exe or .app.

  • Reset CellProfiler and/or your computer:
    Occasionally, restarting CellProfiler afresh will help with memory. If you restart your computer, make sure that a minimum of other applications are active afterwards.
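
The suggestion above to split a large image into non-overlapping tiles can be sketched as follows. This is a minimal illustration rather than a CellProfiler feature: `tile_bounds` is a hypothetical helper that only computes the tile rectangles, and the actual cropping would be done in whatever image software you prefer.

```python
def tile_bounds(width, height, tile_size):
    """Return (left, top, right, bottom) boxes covering the image exactly once.

    Tiles on the right and bottom edges are smaller when the image size
    is not a multiple of tile_size.
    """
    boxes = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical 5000 x 3000 pixel image cut into tiles of at most 2000 x 2000
boxes = tile_bounds(5000, 3000, 2000)
for box in boxes:
    print(box)
```

Keep in mind that objects lying across a tile border will be cut in two, so measurements for those objects should be treated with care.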


I’m getting an error “Content is not allowed in prolog.” Is this bad?
To our knowledge, this is a Java warning. We’ve seen it ourselves, but have not noticed that it is actually “fatal.” You should be able to continue running without problems, but please let us know if something arises.


I don’t know what units CellProfiler uses for measurements/I’d like to convert the CellProfiler measurements to another scale (e.g., microns)
The units that CellProfiler uses are pixels. To convert CellProfiler measurements to another unit of scale, you would have to know the pixel-to-chosen unit conversion factor beforehand. You can do this using a stage micrometer in the following way:

  • Acquire a picture of the micrometer scale markings with the same magnification as for your cell images.

  • You can then measure the pixel distance between marks (which are a set number of microns apart) using the software of your choice. For example, you can open the image in CellProfiler by double-clicking on the filename in the lower-left panel, and measure a distance using Tools > Measure length in the menu bar; the pixel distance is shown in the lower right of the image. If you are more familiar with using Photoshop or ImageJ for this purpose, feel free to use that.

  • You should have a number of pixels for a given number of microns, so you can determine your conversion factor as (micron distance)/(pixel distance)

This conversion factor can then be used in one of two ways:

  • Use the CalculateMath module. Select the measurement that you want to convert with “None” as the operation, and enter the conversion factor in the “Multiply the above operand by” (or “Multiply the result by”) setting. You need to do this for each individual measurement you want converted. Keep in mind that for area-based measurements, you will need to square the conversion factor before entering it.

  • Export the measurements using ExportToSpreadsheet and perform the conversion in another program (e.g., Excel).
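
The arithmetic above can be worked through in a few lines. This is just a sketch with made-up numbers (marks 100 microns apart that span 250 pixels); the function names are hypothetical and not part of CellProfiler.

```python
def microns_per_pixel(micron_distance, pixel_distance):
    """Conversion factor: (micron distance) / (pixel distance)."""
    return float(micron_distance) / pixel_distance


def convert_length(length_in_pixels, factor):
    """Lengths scale with the conversion factor."""
    return length_in_pixels * factor


def convert_area(area_in_pixels, factor):
    """Areas scale with the square of the conversion factor."""
    return area_in_pixels * factor ** 2


# Made-up calibration: 100 microns measured as 250 pixels
factor = microns_per_pixel(100, 250)
length_um = convert_length(50, factor)   # a 50-pixel length
area_um2 = convert_area(200, factor)     # a 200-square-pixel area
print(factor, length_um, area_um2)
```

The same multiplication is what you would enter into CalculateMath or a spreadsheet formula.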


I’m getting a new workstation. Do you have any tips on what specifications it should have to run CellProfiler efficiently?
Right now, we have the following (albeit general) recommendations:

  • 64-bit processor:
    Without a doubt, your best bet for CellProfiler is the 64-bit version on a 64-bit Windows machine. Failing that, the Macintosh version is a little better than 32-bit Windows.

  • Single processor:
    CP is single-threaded for the most part, so multiple cores won’t get you much.

  • Memory:
    You may want to bulk up your memory, for three reasons:

      • CP gets bogged down if the images are much greater than 2000 x 2000 pixels or so. Our 64-bit virtual machine with 4GB RAM manages to run but slows down considerably with images that big.

      • CP keeps the cumulative measurements in memory and writes them out at the end, so if you have smaller images but a lot of them, low memory may also be a show-stopper.

      • If you use ilastik (bundled with CP v10997 and above) for pixel classification, that package takes a lot of memory in itself.



"On my Mac, I am seeing a ‘CellProfiler 2.0 error’ window on startup. Uninstalling and reinstalling doesn’t solve it."
This error (see image below) is produced when a folder that was used as the Default Input or Output folder no longer exists, or has been unmounted if it is a network drive.

We have noted and fixed this in the current release (v10997) so you should download a fresh copy from the website. Alternately, you can restore/remount the offending working directory prior to CP startup.

If this still doesn’t work, try the following (courtesy of ScottDaniel):

  • Uninstall MacPython per macpython.org’s uninstallation directions.

  • Delete CellProfiler (and CellProfiler Analyst if present)

  • Empty the trashcan

  • Restart the computer (even though it’s a Mac and shouldn’t matter)

  • Re-download Python 2.7.2 and CellProfiler

  • Re-install Python and CellProfiler


"I’m analyzing movies in CP, but I don’t know how to process each movie/sequence of frames with a single pipeline."

Typically, we see this issue posed in the context of object tracking, in which cells are tracked within a single movie file or a sequence of image frames that are organized into folders, one per sequence. The problem is that ordinarily CP processes all images within a directory, one after the other, without regard to whether they form distinct groups or not.

This challenge can be solved using the metadata grouping capability of CP. If you show LoadImages which text pattern in the filename (i.e., metadata) uniquely identifies each movie, CP can internally handle the files so that each movie is processed individually. In other words, each collection of movie frames is processed as a “group.”
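
To make the idea of grouping by a filename pattern concrete, here is a small sketch of the same logic in plain Python. It is not CellProfiler’s own code: the naming scheme (`<movie name>_t<frame number>.tif`), the metadata names `Movie` and `Frame`, and the helper `group_frames` are all assumptions made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: <movie name>_t<frame number>.tif
PATTERN = re.compile(r"^(?P<Movie>.+)_t(?P<Frame>[0-9]+)\.tif$")


def group_frames(filenames):
    """Group filenames by the Movie metadata and sort each group by Frame."""
    groups = defaultdict(list)
    for filename in filenames:
        match = PATTERN.match(filename)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        frame = int(match.group("Frame"))
        groups[match.group("Movie")].append((frame, filename))
    return {movie: [name for _, name in sorted(frames)]
            for movie, frames in groups.items()}


filenames = ["drugA_t002.tif", "drugA_t001.tif",
             "control_t001.tif", "control_t002.tif", "notes.txt"]
groups = group_frames(filenames)
for movie in sorted(groups):
    print(movie, groups[movie])
```

Each key of `groups` corresponds to one movie, and its frames would be processed together, in order, as one group.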

We have posted a (hopefully) helpful example on the Examples webpage, under the ExampleTrackObjects item. One note is that if your files are single movie files rather than image sequences, you would set the file type in LoadImages as a movie file. In this case, you don’t need to check the “Group images by metadata?” setting, since the grouping is done automatically for movie files.

You can refer to the Image grouping entry in the online manual for more pointers. Also, if you have access to a computer cluster, you can refer to our FAQ question on batch processing so that your movies can be processed in parallel.


"I’d like to know how to set up my pipeline for parallel processing"

We typically use batch processing to handle the analysis (note that this is an option only if you have multiple computers connected together or have access to a computing cluster). Once you have grouping set up and have inserted the CreateBatchFiles module to create a batch file, you can then run CP “headless” (i.e., without the graphical user interface) with a script of your own design and with the batch file as the input pipeline.

Please note that if you’re using image groups, you need to specify the grouping with the “-g” flag and then submit each batch to a single computer/node.

Here are some pointers for more help:

  • The Batch submission entry in the online manual

  • Running CP headless: You can type “CellProfiler.py --help” to get more info on the available flags, including “-g”.

  • Assuming that you’ve already svn checked out our source code, take a look at the NewBatch.py script under the BatchProfiler directory. This is the script we use for our own batch submissions and can guide you in extracting the groups from the batch file and putting together the submission arguments.
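
As a rough idea of what such a submission script does, the sketch below builds one headless command line per group. It is not the NewBatch.py script. Only “-g” and “--help” are confirmed by the text above; the “-c”, “-r” and “-p” flags (headless, run, pipeline) are our assumption of the usual options and should be checked against “CellProfiler.py --help”. The batch file name and the group key `Movie` are hypothetical.

```python
def build_commands(batch_file, groups):
    """Return one argument list per group, ready to hand to a job scheduler.

    Flags other than -g are assumptions; verify them with
    "CellProfiler.py --help" before use.
    """
    commands = []
    for group in groups:
        # -g takes comma-separated key=value pairs identifying one group
        group_arg = ",".join(
            "%s=%s" % (key, group[key]) for key in sorted(group))
        commands.append(["python", "CellProfiler.py",
                         "-c", "-r",          # assumed: no GUI, run pipeline
                         "-p", batch_file,    # assumed: pipeline/batch file
                         "-g", group_arg])
    return commands


# Hypothetical batch file and groups (one group per movie)
commands = build_commands("Batch_data.mat",
                          [{"Movie": "control"}, {"Movie": "drugA"}])
for command in commands:
    print(" ".join(command))
```

Each printed line would be submitted as a separate job, so that every group runs on a single computer/node.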


"What image formats are allowed in CellProfiler? Which one should I use?"

CellProfiler uses Bio-Formats to read a variety of image formats, including most of those used in imaging; see here for the formats available. If the format is not shown in CellProfiler, then it may not be enabled for reading, even though Bio-Formats permits it; please post in the forum if this is the case.

Some image formats are better than others for image analysis. Some are lossy (information is lost in the conversion to the format) like most JPG/JPEG files, others are lossless (no info is lost). Typically, we recommend either TIF or PNG.


"On the Mac, CellProfiler crashes on startup with a “Visit MacPython website” error message. How can I fix this?"

We think the culprit is a Default Input/Output folder that either has weird characters in its name or belongs to a network drive that was unmounted. The location of these folders is stored as a CellProfiler preference, and if it’s unreadable, CP dies horribly.

If your CP preferences are screwed up on the Mac, you can delete them and start fresh. Delete the file ~/Library/Preferences/CellProfilerLocal.cfg and then restart CellProfiler.


"On the PC, CellProfiler crashes (on startup or during a run) with a ‘UnicodeEncodeError: ascii codec can’t encode characters in position 39-43 [or similar numbers]: ordinal not in range’. How can I fix this?"

If this behavior occurs on startup:
This issue is similar to the one above. In this case, you will need to modify some values in your registry to deal with the problem. PLEASE NOTE: Modify your registry settings at your own risk.

You can get to your registry by following the instructions here. Once there, if you go to Computer\HKEY_CURRENT_USER\Software\CellProfilerLocal.cfg (or do a Find (Ctrl-F) for ‘CellProfilerLocal.cfg’), modify the key “DefaultImageDirectory” by clearing the value data. Restart CellProfiler and the Default Input Folder should reset itself.

If this behavior occurs during a run:
This is an error that we have seen occur when the default input or output folder contains non-ASCII characters, e.g., non-English characters with accents over them such as those listed here: extra.shu.ac.uk/emls/emlschar.html. Please try to ensure that the paths to your folders do not have such characters.
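
If you are not sure whether a folder path contains such characters, a quick check like the one below can point them out. This is a minimal sketch; `non_ascii_characters` is a hypothetical helper and the path is made up.

```python
def non_ascii_characters(path):
    """Return (position, character) pairs for every non-ASCII character."""
    return [(index, char) for index, char in enumerate(path)
            if ord(char) > 127]


# Made-up path containing an accented "e" (written as an escape sequence)
bad = non_ascii_characters(u"C:\\Users\\Jos\u00e9\\Images")
print(bad)
```

An empty list means the path should be safe in this respect; otherwise, rename the folders at the reported positions.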


"Can CellProfiler work with my 3-D/4-D/N-D image files?"

We are planning more extensive support for N-D, possibly starting work sometime in the next year. Currently, the multidimensional support in CellProfiler is somewhat limited, in that CellProfiler only works with the image planes for a single 2-D timepoint / z-stack location per cycle; in general, we try to stay away from looping through all images in CP during a given image cycle, aside from grouping, since looping would break many assumptions in the code and would make things complex for both the users and the code maintainers.

However, for N-D use, we often use grouping to create one group of all images for a site, typically a 3-D time series or z-stack, but possibly N-D. For assays that involve operations such as an N-D segmentation, it may be necessary to run one pipeline to produce the segmentation and a second to operate on the segmentation results.


"What version of MySQL should I install for use with CellProfiler/CellProfiler Analyst?"

We have used CellProfiler and CellProfiler Analyst only with MySQL 5.0 and 5.1. However, since the database requirements of both are fairly simple, they would probably work with any version. Version 5.1 seems to be slightly more stable than 5.5, but if you (or your sysadmin) have a reason to choose 5.5 instead, that should be fine.

Make sure to use the MyISAM storage engine, rather than InnoDB (which may be the default in newer versions).


"Why are the pixel intensities in CellProfiler scaled from 0 to 1, rather than to the actual bit-depth?"

The default behavior of CP is to scale all images between 0 and 1, by dividing all pixels in the image by the maximum possible intensity value for that bit-depth. This is in order to maintain consistency across image bit-depths and avoid having to re-evaluate various intensity-dependent settings for different image formats (e.g., the lower threshold bound in IdentifyPrimaryObjects).

Of note here is the “Rescale intensities?” box in LoadImages. Contrary to popular belief, it does not rescale the images to the native bit-depth (e.g., 255, 65535, etc.). Rather, it deals with the cases in which an image of a particular bit-depth is stored in a format of a different, higher bit-depth. For instance, a microscope might acquire images using a 12-bit A/D converter which outputs intensity values between zero and 4095, but stores the values in a field that can take values up to 65535. This setting is checked by default, resulting in the behavior described above. Uncheck this setting to ignore the image metadata and rescale the image to 0 - 1.0 by dividing by 255 or 65535, depending on the number of bits used to store the image.
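
The difference between the two cases can be seen with a little arithmetic. The sketch below is only an illustration of the division described above, not CellProfiler’s own code; `scale_intensity` is a hypothetical helper, and the pixel value is the brightest possible one from the 12-bit example.

```python
def scale_intensity(raw_value, bit_depth):
    """Scale a raw pixel value to 0 - 1 by the maximum for that bit-depth."""
    max_value = 2 ** bit_depth - 1   # 255 for 8-bit, 4095 for 12-bit, 65535 for 16-bit
    return float(raw_value) / max_value


raw = 4095  # brightest value a 12-bit A/D converter can produce

# Setting checked: the metadata says 12 bits, so divide by 4095
using_metadata = scale_intensity(raw, 12)

# Setting unchecked: the file stores 16 bits, so divide by 65535
using_storage = scale_intensity(raw, 16)

print(using_metadata, using_storage)
```

With the setting unchecked, the brightest pixel of a 12-bit image stored in a 16-bit file ends up at only about 6% of the 0 - 1 range, which matters for intensity-dependent settings such as thresholds.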