[SimGrid-user] Are time-independent tracing ranks OK + recv with MPI_ANY_SOURCE ?
chaix at ics.forth.gr
Thu Nov 15 04:29:21 CET 2018
Indeed, nothing was wrong with rank-ids... but I think there is an issue
with recv with MPI_ANY_SOURCE.
First, we might just not want to support MPI_ANY_SOURCE for
time-independent traces, because this could introduce inaccuracies when
we change platform. But then, I think we should fail cleanly when we
Second, if we want to support it, there is a discrepancy between
Pt2PtTIData::print and SendRecvParser::parse. The first skips the
"partner" when it is MPI_ANY_SOURCE (which sounds OK). The second assumes
that if an event is missing an argument, the missing one must be the
datatype. As a consequence, the replay posts a recv that waits for the
source given by the tag value. I suppose we would prefer to change
SendRecvParser::parse to follow the former, but I may be missing
something here again?
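To make the mismatch concrete, here is a minimal standalone sketch in Python. It is not the SimGrid code: the field order (partner, tag, size, datatype), the numeric datatype code, and the value -1 for the wildcard are all assumptions made for illustration, and the function names are hypothetical.

```python
ANY_SOURCE = -1  # hypothetical stand-in for MPI_ANY_SOURCE


def print_recv(partner, tag, size, datatype):
    """Writer side: skip the partner field when it is the wildcard."""
    fields = ["recv"]
    if partner != ANY_SOURCE:
        fields.append(str(partner))
    fields += [str(tag), str(size), str(datatype)]
    return " ".join(fields)


def parse_assuming_datatype_missing(line):
    """Parser side: a short line is read as 'datatype was omitted'."""
    args = line.split()[1:]  # drop the action name
    partner, tag, size = int(args[0]), int(args[1]), int(args[2])
    datatype = int(args[3]) if len(args) > 3 else 0  # 0 = assumed default
    return partner, tag, size, datatype


def parse_assuming_partner_missing(line):
    """Alternative: a short line is read as 'partner was omitted'."""
    args = line.split()[1:]
    if len(args) == 3:
        args = [str(ANY_SOURCE)] + args
    return tuple(int(a) for a in args)


line = print_recv(ANY_SOURCE, tag=7, size=1024, datatype=2)
shifted = parse_assuming_datatype_missing(line)   # tag lands in the partner slot
restored = parse_assuming_partner_missing(line)   # wildcard is recovered
```

With the first parser every field shifts left by one, so the tag ends up as the source. The second parser only works if the writer always emits the datatype; otherwise a three-argument line stays ambiguous.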
Third, if I patch the first problem, things break in RecvAction::kernel
because we try to find the actor corresponding to MPI_ANY_SOURCE in order
to generate the trace. Is it OK not to trace a recv from MPI_ANY_SOURCE,
or should we add another actor for this case?
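For the "do not trace" option, a minimal sketch of the guard I have in mind, again in standalone Python rather than SimGrid code. The actor table, the -1 wildcard value, and the function name are hypothetical.

```python
ANY_SOURCE = -1  # hypothetical stand-in for MPI_ANY_SOURCE


def trace_label(source, actors):
    """Return the actor name to trace against, or None for a wildcard recv."""
    if source == ANY_SOURCE:
        return None  # nothing to look up: the caller skips the trace event
    return actors[source]


actors = {0: "rank-0", 1: "rank-1"}  # hypothetical rank-to-actor table
known = trace_label(1, actors)
wildcard = trace_label(ANY_SOURCE, actors)
```

The caller would emit the trace event only when the returned label is not None. Adding a dedicated placeholder actor instead would keep the event in the trace, at the cost of an extra entity that does not correspond to any real rank.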
On 11/14/2018 9:41 PM, Christian Heinrich wrote:
> Hi Fabien,
> the file names are in fact confusing: The process-ids (NOT the rank-ids!) start at 1 (0 is the
> maestro). If you open the file, you should see that the first column is the id-1.
> The main issue with using rank-ids is that they're not unique; each communicator defines a "rank 0".
> The whole thing should work as expected though.
> PS: I remember someone talking about SendRecv earlier already; maybe you should check with Augustin
> whether he already has something up his sleeve.
> On Wed, 2018-11-14 at 18:58 +0200, Fabien Chaix wrote:
>> I am trying to use time-independent traces on various MPI applications
>> with the latest git version.
>> I am looking into implementing SendRecv that is needed for some apps,
>> but this is not my main concern.
>> I am a bit confused about the ranks being used in the trace. I would
>> expect MPI ranks to go from 0 to $size-1, but the trace files start from
>> 1. I also get sendrecv events to $size that cause the simulation to
>> fail. Perhaps the f2c structure in the middle is messing things up (my
>> code is written in C)?
>> I think I can get to the bottom of this, but if someone knows the right
>> way, it will be faster and cleaner ;-)