[Simgrid-user] Excessive Memory Use

Martin Quinson martin.quinson at loria.fr
Thu Jul 21 14:57:35 CEST 2011


Hello,

I'm posting back to the mailing list [cutting what you wrote to avoid
hurting your sensibilities] because I hate off-list discussions of
general matters. I'm responsive today, but if you mail me off list you
may or may not get an answer. Always keep stuff on the lists, please.



There was no problem with your previous mail; I was only being
provocative to get more information. There is no need to ask whether we
think SimGrid may have an issue. We release when we think that there is
none, and if we have any doubts, they are written in the changelog.

If you want to chat to get our feelings, try #simgrid on freenode.


About your problem, it is veeeery interesting (and it's a pity you
decided to go off list, since I'd like to get the advice of other
people). The thing is that the leaks reported by your terminating
version are produced by this code:
MSG_error_t MSG_task_cancel(m_task_t task) {
  xbt_assert((task != NULL), "Invalid parameter");
  
  if (task->simdata->compute) {
    SIMIX_req_host_execution_cancel(task->simdata->compute);
    return MSG_OK;
  }
  if (task->simdata->comm) {
    SIMIX_req_comm_cancel(task->simdata->comm);
    return MSG_OK;
  }
  THROW_IMPOSSIBLE;
}

It's the THROW_IMPOSSIBLE that leaks. So you are exercising a part of
the code which is not tested, and you're hiding the abort message. I
guess that you have a try/catch in your code swallowing exceptions, in
finish_all_task_copies (master.c:573).

This bug report is precious to me, and I'd really appreciate it if you
could give me what's needed to reproduce and investigate it myself.


As for your problem, did I mention that we managed to launch 2M nodes
by reducing the size of the system stack that each of them gets? Check
the contexts/stack_size command line configuration option.

And to finally answer your question, the routing code was changed
recently (as in "this year", not sure which stable version anymore),
but it only led to much more memory-efficient representations.

So there is nothing that I know of which could explain what you seem to
experience. Maybe you want to chat with us on IRC so that we can
investigate together.

Bye, Mt.

-- 
The trouble with the French is that they don't have a word for
Entrepreneur.            -- G.W. Bush


