Headless workers always using at least 8GB of memory

headless

#1

Hi,

I keep running into MemoryErrors on CellProfiler runs on our cluster despite allocating really large amounts of RAM. I ran a test job counting nuclei on a small image, and it seems to always use at least 8GB of memory, no matter how simple the job.

From using the desktop version I know I can usually get by on 1-2GB per worker, so I’m guessing it’s something to do with how we have Java set up? I’m pretty clueless on anything Java-related, so if anyone has any ideas it would be great.

Thanks,
Scott.


#2

I’m also pretty clueless about the java stuff, but we’ll do our best to get you started. :smile:

In the meantime, can you tell us a bit more about the cluster? OS, what version of CP you’re running, etc?


#3

The cluster is running Scientific Linux 7.
The CellProfiler install is cloned from GitHub and cellprofiler --version returns 3.0.0rc1. I’m running it from a virtualenv.
I’m loading JDK 1.8.0_66 (don’t know if this is important).


#4

You can check if Java is the culprit by running:

java -XX:+PrintFlagsFinal -version | grep HeapSize

Reported sizes are in bytes.
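
If converting the byte counts by hand gets tedious, a small Python helper along these lines could do it. This is only a sketch: the name heap_flags_in_gib is made up for illustration, and it assumes the output keeps the column layout that JDK 8 prints (type, flag name, = or :=, value, {category}).

```python
import re

# Matches lines such as "uintx MaxHeapSize := 4294967296 {product}".
# Assumption: this column layout is what JDK 8 prints; other versions may differ.
FLAG_LINE = re.compile(r"^\s*\w+\s+(\w*HeapSize\w*)\s+:?=\s+(\d+)\s+\{")


def heap_flags_in_gib(flags_output):
    """Return {flag name: size in GiB} for every HeapSize flag in the text."""
    sizes = {}
    for line in flags_output.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            name, value = match.groups()
            sizes[name] = int(value) / 2**30  # bytes -> GiB
    return sizes
```

To use it, capture the text printed by the java command above (for example with subprocess.run(..., capture_output=True, text=True).stdout) and pass it in as a string.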


#5

So to my untrained eye am I right in thinking it’s only allocating ~2GB on start up, and this all looks pretty normal?

$ java -XX:+PrintFlagsFinal -version | grep HeapSize
            uintx ErgoHeapSizeLimit           = 0           {product}
            uintx HeapSizePerGCThread         = 87241520    {product}
            uintx InitialHeapSize            := 2147483648  {product}
            uintx LargePageHeapSizeThreshold  = 134217728   {product}
            uintx MaxHeapSize                := 4294967296  {product}
    java version "1.8.0_66"
    Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
    Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

#6

Yeah. That’s about 2GB initially allocated, 4GB maximum. This looks like Java’s “out of the box” configuration; my laptop has the same specs.

Still doesn’t explain why 8GB are allocated.

We can try using another tool to inspect what’s happening while CellProfiler is running. You’ll need the process ID (PID) for CellProfiler when it’s running. You can use jmap (should be available with Java) to inspect the heap as CellProfiler runs. Assuming you have a 64-bit Java, the command to run is:

jmap -J-d64 -heap PID

Drop the -J-d64 flag if your Java is 32-bit.

Does anything seem off?


#7

We had a similar problem where CellProfiler was killed on the cluster due to excessive memory usage.
However, in our case it turned out that the cluster was configured to limit virtual memory, not real memory.
That is a pretty strange (and actually useless) cluster setting, because it is much too
restrictive.

If you do not know the difference: virtual memory is the address space a process has reserved. It covers real
memory, but additionally includes memory locations that can be shared between applications. Also, the operating
system does not need to actually allocate all virtual memory. So it's easily possible to run 50 applications
with 4GB of virtual memory each on a machine that has just 16GB of RAM. With real memory, this would not work.

So just to double-check: are you certain it's real memory (sometimes reported as RSS)? If yes, then it's
actually strange and a problem. If it's virtual memory, then this is absolutely acceptable, but you may want
to change the cluster configuration.
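
One way to check which number you are looking at is to read both values for the running process straight from /proc. A minimal Python sketch, assuming a Linux node with /proc mounted (which should hold for Scientific Linux 7); the function names are made up for illustration:

```python
import re


def memory_from_status(status_text):
    """Pull VmSize (virtual) and VmRSS (resident) in kB out of /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_process_memory(pid):
    """Read the status file for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return memory_from_status(handle.read())
```

Call read_process_memory with the PID of the CellProfiler worker while it is running. If VmSize is around 8GB but VmRSS is much smaller, the 8GB figure is virtual memory.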