Improving speed (sluggish at higher cycles)


#1

Hello,
Firstly many sincere thanks for the excellent software you have put together. As a biologist, I especially appreciate the user-friendliness, logical modular approach and varied applications of CellProfiler.

The short version of my question is this:
How can I get CellProfiler to run faster, or at least not peter out, when it gets to higher cycle numbers (e.g. ~24, which equates to 3x96 images)?

Thanks in advance for any help you can lend,
ilan

========
Detailed version here:
I am using CellProfiler on a PC with 16 GB of RAM to analyze morphological features (i.e. eccentricity and FormFactor) of GFP-expressing cells in up to 300 pictures acquired using the Incell 3000 automated confocal microscope.

I have successfully modified a sample pipeline (human cells) to do the exact job I was hoping for, and have confirmed that the results match what was obtained on similar samples by counting by eye.

My pipeline essentially looks like this:

LoadImages
IdentifyPrimAuto
MeasureObjectArea
ExportExcel

This enabled me to identify 4 different types of images in the input folder, and ran for 12 cycles. (See below for the file naming. The 4 types are the numbers 2, 3, 4, 5 that follow the letters A, B, C, D. Letters are replicate rows on a 96-well plate; the numbers 2, 3, 4, 5 are different samples in columns on the plate; and the 0, 1, 2 before “Green” are image numbers within a single well.)

My goal is to expand this to identify up to 12 different types of images in the input folder, and I THINK I’ve achieved this by repeating the sequence above, i.e.

LoadImages
LoadImages
LoadImages
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
IdentifyPrimAuto
MeasureObjectArea
MeasureObjectArea
ExportExcel
ExportExcel

However, when I run this pipeline (which now goes up to 24 cycles), it runs MUCH slower, and gets more sluggish as it goes along. I noticed the same thing when I started out trying to analyze 9 images per sample (36 cycles) instead of the 3 here.

(1) Do you have recommendations for changing the above pipeline to speed up analysis? Since I am working with TIFs of about 3 MB each, would it speed things up to convert them all to JPEGs first, and if so, what program should I do this in?

(2) More long term, for the programmers: is there a way of exporting the generated data along the way and clearing the memory, so that it doesn’t get slower as it gets to later cycles?

==================
My files look like this:
_PLATE_2_0_A2_0Green
_PLATE_2_0_A2_1Green
_PLATE_2_0_A2_2Green
_PLATE_2_0_A3_0Green


_PLATE_2_0_A4_0Green


_PLATE_2_0_A5_0Green

(This is then repeated for rows B-D, which makes 12 cycles. Ideally I would be able to do rows A-H, which is 24 cycles, but that seems to be too much.)
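To make the naming scheme and the cycle arithmetic concrete, here is a minimal Python sketch, not part of CellProfiler. It assumes every file name ends in a row letter, a column number, an underscore, an image index and the word "Green" (optionally followed by an extension), as in the four names listed above. The names `parse_name`, `group_by_column` and `cycle_count` are made up for this illustration, and it assumes one image of each type is loaded per cycle.

```python
import re
from collections import defaultdict

# Assumed pattern: ..._<row letter><column number>_<image index>Green[.ext]
NAME_RE = re.compile(r"_([A-H])(\d+)_(\d+)Green(?:\.\w+)?$")


def parse_name(name):
    """Return (row, column, image_index) for one file name."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    row, column, index = match.groups()
    return row, int(column), int(index)


def group_by_column(names):
    """Group file names by plate column (the 'image type'), sorted by row and index."""
    groups = defaultdict(list)
    for name in names:
        row, column, index = parse_name(name)
        groups[column].append((row, index, name))
    return {col: [n for _, _, n in sorted(items)] for col, items in groups.items()}


def cycle_count(n_rows, images_per_well):
    """Cycles needed if each cycle loads one image of every type (assumption)."""
    return n_rows * images_per_well


print(cycle_count(4, 3))       # rows A-D, 3 images per well
print(cycle_count(8, 3))       # rows A-H, 3 images per well
print(cycle_count(8, 3) * 12)  # total images with 12 types
```

Under these assumptions, rows A-D with 3 images per well give 12 cycles, rows A-H give 24, and 24 cycles of 12 image types is 288 images, i.e. the 3x96 mentioned at the top.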


#2

It looks like the answer is in the help manual so I will try these changes first:
*

CellProfiler Help: MemoryAndSpeed
Help for memory and speed issues in CellProfiler:

There are several options in CellProfiler for dealing with out-of-memory
errors associated with analyzing images:

(1) Resize the input images
If the image is high-resolution, it may be helpful to determine
whether the features of interest can be processed (and accurate
data obtained) by using a lower-resolution image. If this is the
case, use the Resize module (under Image Processing) to scale down
the image to a more manageable size, and perform the desired
operations on the smaller image.

(2) Re-use the parameter names
Each image is associated with the unique name that you give it. If
you have many images, and many intermediate images created by the
modules you’ve added, the total space occupied by these images may cause
CellProfiler to run out of memory. In this case, a solution may be
to re-use names that you give to your parameters in later modules
in your pipeline.
For example, if you choose to resize your image and you know that you
don’t need the original image, you can give the resized image the same
name as the original. This will overwrite the original with the smaller,
resized image, thereby saving space.
Note: You must be certain that you have no use for the original image
later in the pipeline, since that data will be lost by this method.

(3) Running without display windows
When your images are being analyzed, the display windows created by
each module in your pipeline require memory to create. If you are
not interested in seeing the intermediate output as it is produced,
you can deactivate the creation of display windows. Under File > Set
Preferences > Display Mode, you can specify which (if any) windows you
want displayed.
Note: The status and error windows will still be shown so you can see
the pipeline progress as your images are analyzed.

(4) Use the SpeedUpCellProfiler module.
The SpeedUpCellProfiler module permits the user to clear the images
stored in memory with the exception of those specified by the user.
Please see the help for the SpeedUpCellProfiler module for more details
and caveats.

In addition to these, there are other options within MATLAB and within
the operating system of your choice in order to maximize memory. See the
MATLAB product support page “Avoiding Out of Memory Errors”
(mathworks.com/support/tech-n … /1107.html) for details.

Also, there are several options for speeding up the analysis of your
pipeline:

(1) Running without display windows
By setting the display mode under File > Set Preferences > Display
Mode, you can turn off the module display windows which gives a bit of
a gain in speed. Once your pipeline is properly set up, we recommend
running the entire cycle without any windows displayed.

(2) Use care in object identification
If you have a large image which contains a large number of small
objects, a good deal of computer time will be used in processing each
individual object, many of which you might not need. In this case, make
sure that you adjust the diameter options in IdentifyPrimAutomatic to
exclude small objects you are not interested in, or use a FilterObjects
module to eliminate objects that are not of interest.*


#3

Hi Ilan,

Looks like you found the resource I was going to point you to first. Please let us know which (if any) of these options work for you; I’m sure others would like to know as well!

Regards,
-Mark


#4

Hi Mark,
Yes it worked perfectly! Such a great program…

I did:
1 - Resize Images
2 - Do not display windows
3 - SpeedUpCellProfiler Module

By far the largest effect on speed was due to resizing my ~3 MB TIFs (I used the default scale factor of 0.25, which required that I scale my min and max values by the same factor). The next largest effect, I think, was due to not showing the display windows. I also added the SpeedUpCellProfiler module, though I didn’t fully compare how large an effect that was having.
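A rough way to see why resizing helps so much, as a small Python sketch. It assumes the 0.25 factor is applied to both width and height; the 1280 x 1024 image size is a made-up example, not the actual size of these images, and the function names are hypothetical.

```python
def resized_shape(width, height, factor):
    """Width and height (in pixels) after scaling both sides by `factor`."""
    return max(1, round(width * factor)), max(1, round(height * factor))


def pixel_reduction(factor):
    """How many times fewer pixels the resized image has."""
    return 1.0 / (factor * factor)


# Hypothetical 1280 x 1024 image, resized by the 0.25 factor used above.
print(resized_shape(1280, 1024, 0.25))
print(pixel_reduction(0.25))
```

With a factor of 0.25 on each side, every image (and every intermediate image derived from it) holds 16 times fewer pixels, which fits with resizing having the largest effect.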

ilan


#5

Good to hear that it worked!

One caveat is that any modules with size-dependent parameters may also need those parameters rescaled. For example, if you’re using IdentifyPrimAutomatic for object detection, you may want to adjust the entry for “Typical diameter of objects…” accordingly, since it is given in pixels of the input image, which would be your resized image and not your original.
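As an illustration of that adjustment, here is a minimal Python sketch. The 40 to 160 pixel diameter range is a made-up example, the function names are hypothetical, and it assumes the same factor was applied to both image dimensions. Lengths scale with the factor and areas with its square, so areas measured on the resized image need dividing by the factor squared to compare with the original.

```python
def scale_diameter_range(min_diameter, max_diameter, factor):
    """Diameter limits (pixels) to enter when working on the resized image."""
    return min_diameter * factor, max_diameter * factor


def area_in_original_pixels(area_resized, factor):
    """Convert an area measured on the resized image back to original pixels."""
    return area_resized / (factor * factor)


# Hypothetical settings: 40-160 px on the original image, resize factor 0.25.
print(scale_diameter_range(40, 160, 0.25))
print(area_in_original_pixels(100, 0.25))
```

Dimensionless shape measures such as eccentricity and FormFactor do not need this conversion, though they can still shift slightly because the resized objects are coarser.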

Regards,
-Mark