[SimGrid-user] Are time-independent tracing ranks OK + recv with MPI_ANY_SOURCE ?

Augustin DEGOMME adegomme at gmail.com
Thu Nov 15 10:57:36 CET 2018


Hi,
indeed the MPI_ANY_SOURCE case was forgotten. I will patch replay soon to
handle it, for now by explicitly printing -555 in the traces, which is the
value of MPI_ANY_SOURCE in SMPI. Having two optional fields in the trace
could be ambiguous.
And yes, the tracing call inside the replay needs to be adapted, and the
real sender extracted from the MPI_Status field, as is done in
smpi_pmpi_request.cpp.
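
To illustrate why an explicit value avoids the ambiguity, here is a minimal
sketch of a parser for the arguments of a recv event. It is not the actual
SendRecvParser code: the RecvEvent struct, the parse_recv_args function and
the assumed field order (partner, tag, size, optional datatype) are
illustrative assumptions. Only the -555 value comes from SMPI.

```cpp
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Value of MPI_ANY_SOURCE in SMPI, as stated above.
constexpr int kAnySource = -555;

// Hypothetical record for one recv event; not SimGrid's own data structure.
struct RecvEvent {
  int partner;                          // sender rank, or kAnySource
  int tag;
  double size;
  std::optional<std::string> datatype;  // the only optional field
};

// Parses the arguments that follow "<rank> recv" in a trace line.
// Assumed layout: partner tag size [datatype].
// Because the partner is always present, a missing argument can only be the
// datatype, so three or four tokens are accepted and anything else is rejected.
std::optional<RecvEvent> parse_recv_args(const std::string& args) {
  std::istringstream in(args);
  std::vector<std::string> tok;
  for (std::string t; in >> t;) tok.push_back(t);
  if (tok.size() < 3 || tok.size() > 4) return std::nullopt;
  try {
    RecvEvent ev{std::stoi(tok[0]), std::stoi(tok[1]), std::stod(tok[2]),
                 std::nullopt};
    if (tok.size() == 4) ev.datatype = tok[3];
    return ev;
  } catch (const std::exception&) {
    return std::nullopt;  // a non-numeric partner, tag or size
  }
}
```

With this layout, a line whose partner is -555 is read as a recv from any
source, and a line with only two arguments fails cleanly instead of having
its tag interpreted as the source.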

I just need to test and I will push it.
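
For the tracing side, the idea is roughly the following sketch. StatusLike is
a hypothetical stand-in for MPI_Status (whose MPI_SOURCE field holds the
actual sender once the receive has completed), and traced_sender is an
illustrative helper, not a function from SMPI.

```cpp
// Value of MPI_ANY_SOURCE in SMPI, as stated above.
constexpr int kAnySourceValue = -555;

// Hypothetical stand-in for MPI_Status; only the sender is modelled here.
struct StatusLike {
  int source;  // plays the role of MPI_Status::MPI_SOURCE
};

// Returns the rank to use when tracing a completed recv: the sender reported
// by the status if the recv was posted with the any-source value, otherwise
// the source that was requested.
int traced_sender(int requested_source, const StatusLike& status) {
  return requested_source == kAnySourceValue ? status.source : requested_source;
}
```

This way the trace never needs an actor for MPI_ANY_SOURCE itself, since the
lookup always happens on a real rank.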

Best regards,
Augustin

On Thu, Nov 15, 2018 at 04:29, Fabien Chaix <chaix at ics.forth.gr> wrote:

> Hi Christian,
>
> Indeed, nothing was wrong with rank-ids... but I think there is an issue
> with recv with MPI_ANY_SOURCE.
>
> First, we might just not want to support MPI_ANY_SOURCE for
> time-independent traces, because this could introduce inaccuracies when
> we change platform.  But then, I think we should fail cleanly when we
> detect that?
>
> Second, if we want to support it, there is a discrepancy between
> Pt2PtTIData::print and SendRecvParser::parse. The first skips the
> "partner" field for MPI_ANY_SOURCE (which sounds OK). The second assumes
> that if the event is missing an argument, it must be the datatype. As a
> consequence, replay posts a recv that waits for the source given by the
> tag value. I suppose we would prefer to change SendRecvParser::parse to
> match the former, but I may again be missing something here?
>
> Third, if I patch the first problem, things break in RecvAction::kernel
> because we try to find the actor corresponding to MPI_ANY_SOURCE to
> generate the trace. Is it OK not to trace a recv from MPI_ANY_SOURCE, or
> should we add another actor for this case?
>
>    Cheers,
>
> Fabien
>
>
> On 11/14/2018 9:41 PM, Christian Heinrich wrote:
> > Hi Fabien,
> >
> > the file names are in fact confusing: The process-ids (NOT the rank-ids!)
> > start at 1 (0 is the maestro). If you open the file, you should see that
> > the first column is the id-1.
> >
> > The main issue with using rank-ids is that they're not unique; each
> > communicator defines a "rank 0".
> >
> > The whole thing should work as expected though.
> >
> > Cheers
> > Christian
> >
> > PS: I remember someone talking about SendRecv earlier already; maybe you
> > should check with Augustin if he has something up his sleeve already.
> >
> > On Wed, 2018-11-14 at 18:58 +0200, Fabien Chaix wrote:
> >> Hi,
> >>
> >> I am trying to use time-independent traces on various MPI applications
> >> with the latest git version.
> >>
> >> I am looking into implementing SendRecv that is needed for some apps,
> >> but this is not my main concern.
> >>
> >> I am a bit confused about the ranks being used in the trace. I would
> >> expect MPI ranks to go from 0 to $size-1, but the trace files start from
> >> 1. I also get sendrecv events to $size that cause the simulation to
> >> fail. Perhaps the f2c structure in the middle is messing things up (my
> >> code is written in C)?
> >>
> >> I think I can get to the bottom of this, but if someone knows the right
> >> way, it will be faster and cleaner ;-)
> >>
> >> Cheers,
> >>
> >> Fabien
> >>
> >> _______________________________________________
> >> Simgrid-user mailing list
> >> Simgrid-user at lists.gforge.inria.fr
> >> https://lists.gforge.inria.fr/mailman/listinfo/simgrid-user