ATTENTION USERS: Help us with usability improvements!


#1

Hi all,

Later in the year, the software engineers on the CP team would like to improve the usability of two modules in particular: LoadImages and IdentifyPrimaryObjects. These modules are among the most powerful to use, and for the same reason, can often be the most confusing to use, especially for beginners. So we’re hoping to streamline things a bit, and we’d like your thoughts in these two areas:

We’re asking users to post to this thread their thoughts on any (or all) of the following issues.

How can we make LoadImages easier to use?
For example:

  • What current features/settings do you find easy to use? Difficult to use?
  • What software have you used that does a good job of showing the user how to load images? (e.g., ImageJ, Photoshop, etc)
  • Do you like the options presented in LoadImages as they are currently, or would you prefer a wizard (i.e., step-by-step guide), or something else entirely?
  • What do you wish LoadImages would just “magically” know about your images, and be able to fill in without your help? (e.g., metadata, channel info, etc)
  • For those who extract metadata from the filename, how could this process be made easier?
  • Is there anything about specifying the input/output folders that you would change?

How can we make IdentifyPrimaryObjects easier to use?
For example:

  • What current features/settings do you find easy to use? Difficult to use?
  • What software have you used that does a good job of showing the user how to identify objects? (e.g., ImageJ, MetaMorph, etc)
  • Would more user interaction help with configuration of settings? For example, a user could be prompted to click on the smallest/largest isolated cells and CP would use that info to fill in the min/max diameter settings.
  • Do you think a “Simple” and “Advanced” mode would be useful? (The Advanced mode could be the interface as it is now, while the Simple mode would be a more basic version, asking minimal questions for more straightforward assays.) What questions should the Simple mode ask?

Again, feel free to answer the questions above or just post any of your thoughts on the subject in general.

Thanks!
-The CellProfiler team


#2

Hello,

It is true that loading images remains a difficult step and I cannot think of anything to improve it. With time you get better and now I have no problem, as long as the format is compatible.
For finding objects, I have worked with the Harmony software of Perkin Elmer that is installed on the Operetta, and I find it extremely enabling. I have taught several people with no notion of image analysis how to segment objects and build pipelines. The trick is that several basic methods are calculated on an image and the outline of each segmentation is shown. By mousing over the method tabs, the different outlines are shown dynamically on the image, and the most promising segmentation method is then chosen. For each parameter adjustment, the software calculates a range of settings, and by mousing over them the change to the segmentation can be assessed dynamically. This very intuitive method makes it possible to achieve very good results. The trick is to pack several “most common” segmentation pipelines into methods and have the software guess at some parameters, such as the size of objects, so that there is a good starting point.

Hope this helps,
Marc


#3

Probably the most confusing part of LoadImages is the metadata extraction and more complicated matching of file names. Perhaps instead of just giving a regular expression matcher, allow people to build the expression. I’m envisioning a list of items that can be added: “Exact Text” “Repeated Character” “Metadata Field”, which the program can use to build the appropriate regular expression.
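
To make the builder idea concrete, here is a minimal Python sketch of how such a list of items could be turned into a regular expression with named groups. The helper names (`exact_text`, `repeated_character`, `metadata_field`, `build_expression`) and the example file name are hypothetical and are not part of CellProfiler; this only illustrates the concept.

```python
import re

def exact_text(text):
    """Literal text that must appear verbatim in the file name."""
    return re.escape(text)

def repeated_character(char_class, count=None):
    """A character class repeated `count` times, or one or more times if count is None."""
    return f"{char_class}{{{count}}}" if count else f"{char_class}+"

def metadata_field(name, pattern):
    """Named group that captures `pattern` under the metadata key `name`."""
    return f"(?P<{name}>{pattern})"

def build_expression(*parts):
    """Concatenate the chosen items, in order, into one regular expression."""
    return "".join(parts)

# Hypothetical file name used for illustration: Plate1_A01_s2_w1.tif
expression = build_expression(
    metadata_field("Plate", r"[^_]+"),
    exact_text("_"),
    metadata_field("Well", r"[A-P]" + repeated_character(r"[0-9]", 2)),
    exact_text("_s"),
    metadata_field("Site", repeated_character(r"[0-9]")),
    exact_text("_w"),
    metadata_field("Channel", repeated_character(r"[0-9]")),
    exact_text(".tif"),
)

match = re.match(expression, "Plate1_A01_s2_w1.tif")
print(expression)
print(match.groupdict())
```

A graphical version would let the user add, remove, and reorder these items and would show the extracted fields for a sample file name as the expression is built.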

For a first-time user, the pipeline concept may also be a source of confusion here. I’m not sure how much can be done about that, except that presenting some links to introductory documents the first time someone runs the program might be useful.

As for IdentifyPrimaryObjects, one thing is that some of its action is hidden. For instance, cropping a previous image based on another IdentifyPrimaryObjects module will allow you to use “Per Object” thresholding, and there’s no error indicator if you try to use Per Object on an image that hadn’t been previously cropped. This is a source of confusion because a novice user would assume that any object-finding based on a previous object would be considered a function of IdentifySecondaryObjects and not IdentifyPrimaryObjects. The biggest way this could be solved, perhaps, is to extend IdentifySecondaryObjects with this ability and strip per-object methods from IdentifyPrimaryObjects. Though I recognize this could break some previous pipelines, I don’t suppose it would be too hard to include code to check whether the loading pipeline has an IdentifyPrimaryObjects module using a Per Object method and then, with a notice to the user, automatically converting it to an IdentifySecondaryObjects module.

The other big problem here, as Marc mentioned, is that testing different options can be a pain. With so many different methods and options to choose from, it would be a big usability improvement if it were possible to see the results of an operation dynamically. Test mode can help to some extent, but it is hard to see how the objects change, especially since the result is not translucently overlaid onto the input image. It would also help to be able to flip between the loaded images, not just the one. And then if you want to test it against multiple image sets… It’s all really oriented to people who think like programmers, and not necessarily people who just want to do something.

Since we’re talking about usability here, I’ll also throw one other pitch in: a Module that acts as a divider to visually separate different parts of the pipeline from each other (which could perhaps also call ConserveMemory). I don’t know about others, but I’d like to have a visual way to separate operations on one loaded image from the next, or background correction steps from object-finding steps. And/or put the name of the output item next to the Module name in the Pipeline list, so that users can easily see which module does what if they have multiple modules of the same type.


#4

Hi all,

By way of a follow-up, we now have a roadmap on our developer’s wiki, with nascent plans and ideas. It’s not intended to replace our detailed bug-tracking/feature request system, but instead to stay big-picture and prioritize the next year’s plans: cellprofiler.org/wiki/index.php/ … r_Roadmap#

In particular, those interested in this thread may want to check out this section: cellprofiler.org/wiki/index.php/ … provements

Thanks!
Mark


#5

Hi all,

Since late last year, we have been working on an improved image loading interface for CellProfiler. It is still in development but can be previewed, and we would appreciate feedback! You can obtain this version in two ways:

  • Download a compiled installation package from here. On Windows, if you already have a CellProfiler installation, simply specify a different folder to install into.
  • Use git to check out the source code from the multiprocessing branch from our GitHub repository. If using Windows, we recommend using GitHub for Windows to do this.

Please note that since this version is experimental, you should consider any pipelines that you construct to be unusable with the current and future releases of CellProfiler.

Lastly, you can check out our roadmap to see where we are headed in our development.

Let us know what you think!
-The CellProfiler team


#6

Hello,

I have received and filled in the survey that was sent around. Of course, after clicking the send button, something occurred to me. The Matlab legacy of having intensity values between 0 and 1 is irritating. Most users expect to see 0-4095 values and cannot relate to the smaller numbers. Also, images are often first looked at with ImageJ, where intensities are given on the original scale. It would be great if future implementations could use the normal scale.
Thank you again for producing such great software, it’s been a game changer in our lab.
Marc
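
For anyone who needs to translate between the two scales in the meantime, here is a minimal Python sketch. It assumes the 0-1 value was produced by dividing the raw intensity by the maximum of the given bit depth (4095 for 12-bit data). That divisor is an assumption: the scaling actually applied depends on how the image was loaded, for example 12-bit data stored in a 16-bit file may have been divided by 65535 instead. The function names are hypothetical.

```python
def to_normalized(raw, bit_depth=12):
    """Map a raw integer intensity onto the 0-1 scale (assumed divisor: 2**bit_depth - 1)."""
    return raw / (2 ** bit_depth - 1)

def to_original_scale(normalized, bit_depth=12):
    """Map a 0-1 intensity back to the nearest integer on the original scale."""
    return round(normalized * (2 ** bit_depth - 1))

# A threshold reported on the 0-1 scale, shown on a 12-bit and a 16-bit scale.
threshold = 0.1
print(to_original_scale(threshold, bit_depth=12))
print(to_original_scale(threshold, bit_depth=16))
```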


#7

mbray, I wonder if some users have trouble with the LoadImages module because it’s not terribly clear what files the module will open in the first cycle.

I could envision the following improvement: given the default ‘open from’ location and the parameters given to identify the files, the program would display the full names of the files the module would open in the first cycle of the pipeline. That might clarify things.
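
As a rough sketch of what such a preview could compute, the Python below picks, for each channel, the first file name (in alphanumeric order) that contains the channel's identifying text. The function name, the channel names, and the "text contained in the file name" matching rule are assumptions made for illustration; they do not describe how LoadImages is actually implemented.

```python
def preview_first_cycle(file_names, channel_text):
    """Return {channel: first matching file name in sorted order, or None}."""
    preview = {}
    for channel, text in channel_text.items():
        matches = sorted(name for name in file_names if text in name)
        preview[channel] = matches[0] if matches else None
    return preview

# Hypothetical folder contents and channel settings.
files = ["B01_w2.tif", "A01_w1.tif", "A01_w2.tif", "B01_w1.tif"]
first_cycle = preview_first_cycle(files, {"DNA": "w1", "Actin": "w2"})
for channel, name in first_cycle.items():
    print(channel, "->", name)
```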


#8

We’ve done quite a bit of work on the loading interface, and one thing we’ve added is a preview table which is displayed when the names are assigned. This will show how channels are matched and the order they’ll be processed in. So stay tuned!
-Mark


#9

Hi-
Thanks for making such a great program! When the program is running, it would be very useful if the IdentifyPrimaryObjects display (where you see the raw image, the identified objects in green outline, and then the individual objects in different colors) also showed the name of the image file. Then, when the program is running and you see something odd by eye, you can write down which image it was.
Many thanks,
Richard Gomer
rgomer@tamu.edu


#10

Hi Richard,

Interesting idea. This seems simple on the face of it, but if you perform an operation on a loaded image, it becomes a different image altogether so it’s not clear what would be displayed in that case (perhaps the root image? But that would mean some bookkeeping if an image is transformed multiple times in different ways).

The workaround is to keep track of the image cycle number, which is displayed in the module display window. If you output the per-image spreadsheet, you can see which image set it corresponds to by looking at the ImageNumber column. Also, you can infer it by using Test mode and selecting Test > Choose image set from the main menu, but the entries aren’t numbered in the dialog box that lets you choose the image set. We’ve made a revision that does number them though; you’ll see that in the next release.

Regards,
-Mark


#11

The testing mode in the current version of CP works well, but could benefit from a feature which would accelerate pipeline development. It would be a significant improvement if the user could select an image from the list on which to test the pipeline. In particular, when working out thresholding/masking/filtering in positive and negative control images, it would be useful to select which image you would like to test. The current setup allows the user to test the pipeline only on the first image listed in the loaded images dialog interface. If a different image needs to be tested, you must remove the alphanumerically ordered entries preceding the alternate image of interest. One simple way might be to allow the user to prioritize images in the list with buttons like <move image up/down> and <move to top/bottom>. Perhaps an easier method would be to let the user select the image and right-click, which would provide an option “Use this image for testing”, or to let the selection of the image itself automatically prioritize it for test mode. This would allow users to toggle between two control images for testing and would accelerate pipeline testing and development. Warm Regards, Paul


#12

Hi Paul,

Thanks for this suggestion. In fact, right now in Test Mode you can select any particular image set by:
(1) Start Test Mode
(2) Test menu > Choose Image Set
This brings up a table with each image set on a row. Any metadata that you have extracted in the Metadata module is shown here too, so you can pick out Pos/Neg controls.

Does that work well enough for you? It is not the most “highly advertised” function – any other ideas on how to expose this to the user? I like your suggestion of a context menu (right-click) somewhere (in the pipeline window?) which has this function.

Best,
David


#13

Thank you David! Yes, this is a quick enough selection and works well … I did not spend enough time looking over the menus.

If you would like to improve visibility - I see that you have a button for in the “Test Mode” interface. This seems like a good fit for . Given you have a lot of excess real estate on the “Run” and “Step” buttons, you might be able to square them up and put the “Exit Test Mode” button on the same horizontal line. Then I might add the button to the second row next to the button. This may bring greater attention to the feature. NB, it seems the button may have limited use given that image names are by default loaded in alphanumeric order, and only by coincidence would this correlate to chronology or spatial position (z-slice) - the one exception where this would be useful is if you loaded only 2 image sets composing a positive and negative control. Thank you for the tip on selecting the image set, this speeds things up considerably. Kind Regards, PjB


#14

Your product is exceptionally well documented. I cannot thank you enough for taking the time to assemble good dynamic help to navigate the various image/data handling tools. One place that could use greater clarity: a help button describing the effects of choosing “Weighted Variance” as opposed to “Entropy” when thresholding in “IdentifyPrimaryObjects”, for example. Warm Regards, PjB


#15

A feature I found a bit confusing to use was the Measure Granularity tool. A clearer description would be helpful for the “range of granular spectrum” and how it relates to the “radius of the structuring element” and “subsampling factors”. I have been able to see this work using different settings, but I would like to understand the variables a bit better. Thanks for the help, PjB


#16

Thanks for the good suggestions, PjB! We’ll add them to our list.
Cheers,
David


#17

CP Team: I have been observing that CP 2.1.1 is easily overwhelmed by large image sets, such as those input from a screen. It seems CP tries to register all images at once and run them as a single process. This seems not to work well, particularly during Metadata extraction and Names and Types registering. Furthermore, the image analysis proceeds very slowly, if at all, when large image collections are loaded. I have observed this on well-equipped PC platforms running high-end i7 multicore processors, so I do not think it is a hardware problem. It might improve the function of CP to segregate CP runs/sessions by folder, such that the completion of each folder as a member of a batch process could force a memory dump (purge all file registries/indexes/open file handles etc.) or force a shutdown/restart. After the restart/dump, the batch processing would resume with the next folder in the list. I believe this would increase the capacity of CP to handle large image sets and the efficiency with which it processes the images from screening data sets. Other imaging packages implement a similar approach to avoid this problem of memory “saturation”. Warm Regards, Paul


#18

Hi Paul,
For screens, please see this page:

github.com/CellProfiler/CellPro … nvironment

We generally run about 10-30 image sets per job here (that’s one process per job and the process closes). You almost certainly have to use CreateBatchFiles if you have a lengthy image set or else you will pay the price of creating the image set (even for LoadData) for each process. If you are running with the GUI on a machine with many cores, you may have to limit the number of workers via the preferences. Finally, the trunk build has a --file-list switch so that if you are using the input modules, you can start a process headless and give it just the list of files that are in one folder to achieve what you describe.
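
To prepare the per-folder lists mentioned above, something like the following Python sketch could be used. It only groups image paths by their parent folder and writes one text file per folder. The one-path-per-line layout and the `file_list_NNN.txt` names are assumptions made for this example, so check what the `--file-list` switch in your build actually expects before relying on it. The function names are hypothetical.

```python
import os
from collections import defaultdict

def group_by_folder(paths):
    """Group file paths by parent folder; folders and files are sorted."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.dirname(path)].append(path)
    return {folder: sorted(files) for folder, files in sorted(groups.items())}

def write_file_lists(groups, out_dir):
    """Write one list file per folder (assumed format: one path per line)."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, files in enumerate(groups.values(), start=1):
        list_path = os.path.join(out_dir, f"file_list_{index:03d}.txt")
        with open(list_path, "w") as handle:
            handle.write("\n".join(files) + "\n")
        written.append(list_path)
    return written

# Hypothetical screen layout: one folder per plate.
example_paths = [
    "/screen/plate1/A01_w1.tif",
    "/screen/plate2/A01_w1.tif",
    "/screen/plate1/A01_w2.tif",
]
example_groups = group_by_folder(example_paths)
```

Each resulting list file could then be handed to its own headless process, so that every process sees only the files from one folder.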

–Lee