[SimGrid-user] Are time-independent tracing ranks OK + recv with MPI_ANY_SOURCE ?

Augustin DEGOMME adegomme at gmail.com
Thu Nov 15 11:18:33 CET 2018


I just pushed it:
https://framagit.org/simgrid/simgrid/commit/c619e9d16e3061f1d05b187d80bc65eb578fb868
Hope it helps.

On Thu, 15 Nov 2018 at 10:57, Augustin DEGOMME <adegomme at gmail.com> wrote:

> Hi,
> Indeed, the MPI_ANY_SOURCE case was forgotten. I'm patching replay soon to
> handle it, for now explicitly printing -555 in the traces, which is the
> value of MPI_ANY_SOURCE in SMPI. Having two optional fields in the trace
> would be potentially ambiguous.
> And yes, the tracing call inside the replay needs to be adapted, and the
> real sender extracted from the MPI_Status field, as is done in
> smpi_pmpi_request.cpp.
>
> I just need to test and I will push it.
>
> Best regards,
> Augustin
>
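The approach described above (always writing the source field, with -555 standing for MPI_ANY_SOURCE, and taking the real sender from the status once the receive completes) might look roughly like the sketch below. It is only an illustration under stated assumptions: the "recv <partner> <tag> <size>" line layout is simplified, and RecvEvent, Status, print_explicit, parse_explicit and traced_sender are hypothetical names, not SimGrid code.

```cpp
#include <sstream>
#include <string>

// Value the thread gives for MPI_ANY_SOURCE in SMPI.
constexpr int kAnySource = -555;

// Hypothetical stand-in for the part of MPI_Status that matters here.
struct Status {
  int source;  // rank of the process that actually sent the message
};

// Hypothetical, simplified receive event.
struct RecvEvent {
  int partner;  // source rank, or kAnySource for a wildcard receive
  int tag;
  long size;
};

// Always emit the partner field, so that field positions never shift.
std::string print_explicit(const RecvEvent& e) {
  std::ostringstream out;
  out << "recv " << e.partner << ' ' << e.tag << ' ' << e.size;
  return out.str();
}

// Positional parsing is unambiguous because no field is ever omitted.
RecvEvent parse_explicit(const std::string& line) {
  std::istringstream in(line);
  std::string name;
  RecvEvent e{};
  in >> name >> e.partner >> e.tag >> e.size;
  return e;
}

// For tracing, a wildcard receive reports the sender found in the status.
int traced_sender(const RecvEvent& e, const Status& st) {
  return e.partner == kAnySource ? st.source : e.partner;
}
```

With this layout a wildcard receive is written as "recv -555 7 1024", and the tracing side would report whatever rank the status carries rather than the wildcard itself.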
> On Thu, 15 Nov 2018 at 04:29, Fabien Chaix <chaix at ics.forth.gr> wrote:
>
>> Hi Christian,
>>
>> Indeed, nothing was wrong with rank-ids... but I think there is an issue
>> with recv with MPI_ANY_SOURCE.
>>
>> First, we might simply not want to support MPI_ANY_SOURCE for
>> time-independent traces, because it could introduce inaccuracies when
>> the platform changes. But in that case, I think we should fail cleanly
>> when we detect it?
>>
>> Second, if we want to support it, there is a discrepancy between
>> Pt2PtTIData::print and SendRecvParser::parse. The former skips the
>> "partner" field when the source is MPI_ANY_SOURCE (which sounds OK). The
>> latter assumes that if an event is missing an argument, it must be the
>> datatype. As a consequence, replay posts a recv that waits for the source
>> given by the tag value. I suppose we would prefer to change
>> SendRecvParser::parse to follow the former, but I may again be missing
>> something here?
>>
>> Third, if I patch the first problem, things break in RecvAction::kernel
>> because we try to find the actor corresponding to MPI_ANY_SOURCE in order
>> to generate the trace. Is it OK not to trace a recv from MPI_ANY_SOURCE,
>> or should we add another actor for this case?
>>
>>    Cheers,
>>
>> Fabien
>>
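The discrepancy in the second point above can be reproduced with a small self-contained sketch. It assumes a simplified "recv <partner> <tag> <size> [<datatype>]" line layout; RecvEvent, print_skipping_partner and parse_positional are hypothetical names that only mimic the behaviour described for Pt2PtTIData::print and SendRecvParser::parse, and are not the SimGrid sources.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Value the thread gives for MPI_ANY_SOURCE in SMPI.
constexpr int kAnySource = -555;

// Hypothetical, simplified receive event.
struct RecvEvent {
  int partner;   // source rank, or kAnySource
  int tag;
  long size;
  int datatype;  // -1 means "not given"
};

// Writer in the style described for Pt2PtTIData::print: the partner field
// is skipped when the receive uses a wildcard (negative) source.
std::string print_skipping_partner(const RecvEvent& e) {
  std::ostringstream out;
  out << "recv";
  if (e.partner >= 0)
    out << ' ' << e.partner;
  out << ' ' << e.tag << ' ' << e.size << ' ' << e.datatype;
  return out.str();
}

// Reader in the style described for SendRecvParser::parse: fields are
// positional and only the trailing datatype is treated as optional.
RecvEvent parse_positional(const std::string& line) {
  std::istringstream in(line);
  std::string name;
  in >> name;
  std::vector<long> fields;
  for (long v; in >> v;)
    fields.push_back(v);
  RecvEvent e{};
  e.partner  = static_cast<int>(fields.at(0));
  e.tag      = static_cast<int>(fields.at(1));
  e.size     = fields.at(2);
  e.datatype = fields.size() > 3 ? static_cast<int>(fields[3]) : -1;
  return e;
}
```

Writing a wildcard receive with tag 7, size 1024 and datatype 5 gives "recv 7 1024 5"; reading that line back yields partner 7, tag 1024 and size 5, which is the "source given by tag value" symptom.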
>>
>> On 11/14/2018 9:41 PM, Christian Heinrich wrote:
>> > Hi Fabien,
>> >
>> > the file names are in fact confusing: The process-ids (NOT the rank-ids!)
>> > start at 1 (0 is the maestro). If you open the file, you should see that
>> > the first column is the id-1.
>> >
>> > The main issue with using rank-ids is that they're not unique; each
>> > communicator defines a "rank 0".
>> >
>> > The whole thing should work as expected though.
>> >
>> > Cheers
>> > Christian
>> >
>> > PS: I remember someone talking about SendRecv earlier already; maybe you
>> > should check with Augustin if he already has something up his sleeve.
>> >
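The numbering described in the message above can be made concrete with a short sketch. Everything here is illustrative: RankRegistry, trace_column and is_application_process are hypothetical helpers, not SimGrid types, and the only facts taken from the thread are that process id 0 is the maestro, that the first trace column is the process id minus 1, and that every communicator has its own rank 0.

```cpp
#include <map>
#include <utility>

// Process ids as described in the thread: 0 is the maestro, so application
// processes start at 1.
constexpr long kMaestroPid = 0;

constexpr bool is_application_process(long pid) { return pid > kMaestroPid; }

// First column of a trace file as described in the thread: process id minus 1.
constexpr long trace_column(long pid) { return pid - 1; }

// Hypothetical registry: a rank only identifies a process together with
// its communicator, so the key is the pair (communicator id, rank).
class RankRegistry {
public:
  void add(int comm, int rank, long pid) { pids_[{comm, rank}] = pid; }

  // Returns the process id, or -1 if the pair is unknown.
  long pid_of(int comm, int rank) const {
    auto it = pids_.find({comm, rank});
    return it == pids_.end() ? -1 : it->second;
  }

private:
  std::map<std::pair<int, int>, long> pids_;
};
```

Two communicators can both register a rank 0 that points at different process ids, which is why a bare rank would not be a usable key for naming trace files.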
>> > On Wed, 2018-11-14 at 18:58 +0200, Fabien Chaix wrote:
>> >> Hi,
>> >>
>> >> I am trying to use time-independent traces on various MPI applications
>> >> with the latest git version.
>> >>
>> >> I am looking into implementing SendRecv that is needed for some apps,
>> >> but this is not my main concern.
>> >>
>> >> I am a bit confused about the ranks being used in the trace. I would
>> >> expect MPI ranks to go from 0 to $size-1, but the trace files start
>> >> from 1. I also get sendrecv events to $size that are causing the
>> >> simulation to fail. Perhaps the f2c structure in the middle is messing
>> >> things up (my code is written in C)?
>> >>
>> >> I think I can get to the bottom of this, but if someone knows the right
>> >> way, it will be faster and cleaner ;-)
>> >>
>> >> Cheers,
>> >>
>> >> Fabien
>> >>
>> >> _______________________________________________
>> >> Simgrid-user mailing list
>> >> Simgrid-user at lists.gforge.inria.fr
>> >> https://lists.gforge.inria.fr/mailman/listinfo/simgrid-user
>
>