[SimGrid-user] Are time-independent tracing ranks OK + recv with MPI_ANY_SOURCE ?

Fabien Chaix chaix at ics.forth.gr
Thu Nov 15 04:29:21 CET 2018


Hi Christian,

Indeed, nothing was wrong with the rank-ids, but I think there is an
issue with recv when the source is MPI_ANY_SOURCE.

First, we might simply not want to support MPI_ANY_SOURCE in
time-independent traces: which sender a wildcard recv matches depends on
message timing, so this could introduce inaccuracies when we change
platform. But in that case, I think we should fail cleanly when we
detect it?
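
A minimal sketch of what "failing cleanly" could look like on the
trace-writing side. This is not SimGrid code: kAnySource,
format_recv_event and the field order (rank, action, source, tag, size)
are assumptions made only for illustration.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for MPI_ANY_SOURCE; the real value is
// implementation-defined and comes from mpi.h.
constexpr int kAnySource = -1;

// Format one "recv" event line, refusing wildcard sources with an
// explicit error instead of silently writing an incomplete line.
// The field layout here is an assumption, not the actual trace format.
std::string format_recv_event(int rank, int source, int tag, long size)
{
  if (source == kAnySource)
    throw std::invalid_argument(
        "rank " + std::to_string(rank) +
        ": recv with MPI_ANY_SOURCE is not supported in time-independent traces");
  std::ostringstream out;
  out << rank << " recv " << source << ' ' << tag << ' ' << size;
  return out.str();
}
```

With this shape, the error surfaces when the trace is produced, with the
offending rank in the message, rather than later as a confusing failure
during replay.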

Second, if we do want to support it, there would be a discrepancy
between Pt2PtTIData::print and SendRecvParser::parse. The first skips
the "partner" field when the source is MPI_ANY_SOURCE (which sounds OK).
The second assumes that if an event is missing an argument, it must be
the datatype. As a consequence, the replay posts a recv that waits for
the source given by the tag value. I suppose we would prefer to change
SendRecvParser::parse to follow the former, but I may again be missing
something here?
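
The mismatch can be reproduced in a small standalone sketch. None of this
is the actual SimGrid code: RecvEvent, print_event and the two parsers
are hypothetical, the sentinel kAnySource is an assumption, and the
optional datatype field is left out for brevity.

```cpp
#include <sstream>
#include <string>
#include <vector>

constexpr int kAnySource = -1; // hypothetical stand-in for MPI_ANY_SOURCE

struct RecvEvent {
  int partner;
  int tag;
  double size;
};

// Writer side: skips the partner field for wildcard receives.
std::string print_event(const RecvEvent& ev)
{
  std::ostringstream out;
  if (ev.partner != kAnySource)
    out << ev.partner << ' ';
  out << ev.tag << ' ' << ev.size;
  return out.str();
}

// Split a line into whitespace-separated fields.
std::vector<std::string> split_fields(const std::string& line)
{
  std::istringstream in(line);
  std::vector<std::string> fields;
  for (std::string f; in >> f;)
    fields.push_back(f);
  return fields;
}

// Reader that always treats the first field as the partner, so a
// wildcard line has its tag read as the partner.
RecvEvent parse_positional(const std::string& line)
{
  std::vector<std::string> f = split_fields(line);
  RecvEvent ev{std::stoi(f.at(0)), std::stoi(f.at(1)), 0.0};
  if (f.size() > 2)
    ev.size = std::stod(f.at(2));
  return ev;
}

// Reader that follows the writer: two fields mean the partner was skipped.
RecvEvent parse_following_writer(const std::string& line)
{
  std::vector<std::string> f = split_fields(line);
  if (f.size() == 2)
    return RecvEvent{kAnySource, std::stoi(f.at(0)), std::stod(f.at(1))};
  return RecvEvent{std::stoi(f.at(0)), std::stoi(f.at(1)), std::stod(f.at(2))};
}
```

Note that counting fields only works here because the sketch has no
optional trailing field. Once an optional datatype is in play, a line
with one field missing is ambiguous unless the writer always emits the
datatype or writes an explicit marker for the wildcard source.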

Third, if I patch the first problem, things break in RecvAction::kernel
because we try to find the actor corresponding to MPI_ANY_SOURCE in
order to generate the trace. Is it OK not to trace a recv from
MPI_ANY_SOURCE, or should we add another actor for this case?

   Cheers,

Fabien


On 11/14/2018 9:41 PM, Christian Heinrich wrote:
> Hi Fabien,
>
> the file names are in fact confusing: The process-ids (NOT the rank-ids!) start at 1 (0 is the
> maestro). If you open the file, you should see that the first column is the id-1.
>
> The main issue with using rank-ids is that they're not unique; each communicator defines a "rank 0".
>
> The whole thing should work as expected though.
>
> Cheers
> Christian
>
> PS: I remember someone talking about SendRecv earlier already; maybe you should check with Augustin
> if he has something up his sleeves already.
>
> On Wed, 2018-11-14 at 18:58 +0200, Fabien Chaix wrote:
>> Hi,
>>
>> I am trying to use Time-independent traces on various MPI applications
>> with latest git version.
>>
>> I am looking into implementing SendRecv that is needed for some apps,
>> but this is not my main concern.
>>
>> I am a bit confused about the ranks being used in the trace. I would
>> believe that MPI ranks go from 0 to $size-1, but trace files start from
>> 1. I also get sendrecv event to $size that are causing the simulation to
>> fail. Perhaps the f2c structure in the middle is messing things up (my
>> code is written in c)?
>>
>> I think I can get to the bottom of this, but if someone knows the right
>> way, will be faster and cleaner ;-)
>>
>> Cheers,
>>
>> Fabien

