[Simgrid-user] SimDAG troubles

Martin Quinson martin.quinson at loria.fr
Fri Jan 15 23:36:37 CET 2010


Hey Joao, Fred,

no need to apologize, nobody's hurt ;)

I'm very curious about this story now. We seem to have a bug here, and
it may be due to what I introduced in 3.3.4. That's strange; I thought
that I had only added stuff to SD, not changed any existing code.

If one of you guys manages to produce a [somewhat] reduced example, I'd
be glad to host it as a bug report on the gforge website. That's the best
way to ensure that the engineers won't be able to swallow the problem...

Once the issue is confirmed by 2 separate users, we can be sure that it's
in SimGrid, not in the user code. So even a relatively large example
(say, a 5-10k zip file) is OK then. I really don't want to track down
issues which turn out to be in user land, but that's not the case here.

To be clear: if your code is not sensitive, please submit the bug as is.

Thanks in advance, 
Mt.

On Thursday 14 January 2010 at 19:54 -0200, João Paulo P Tonelli wrote:
> Hi,
> I would like to apologize for the lack of information and the missing
> reduced example. Next time I shall provide as much info as I can in
> order to help you help me.
> By the way, I would also like to thank you for the tips.
> I partially solved the problem by forcing the serialization of the
> independent tasks, that is, adding dependencies among them, just like
> you said.
> 
> 2010/1/14 Frédéric Suter <frederic.suter at cc.in2p3.fr>
>         Hi,
>         
>         Even though I agree with Martin's remark, I have to say that I
>         experienced the same kind of problem recently. I didn't find the
>         cause on the SimGrid side, just a way to work around the problem
>         in my code.
>         
>         My guess is that the chain of task dependencies may be broken at
>         some point. In that case, the SD_simulate function stops when there
>         is no more action (execution or removal of a dependency) to do. You
>         can try to create two zero-cost tasks (a source and a sink), create
>         dependencies between these tasks and yours, and see whether some
>         tasks are still not simulated.
>         
>         One thing you also have to be aware of is that two independent
>         tasks scheduled sequentially on the same resource will be executed
>         concurrently by SD_simulate (as there is no dependency between
>         them). For instance, if task A is scheduled from t=0 to t=10 on
>         host H and task B is scheduled on H from t=10 to t=15, with A and B
>         independent, the results will be different with SD_simulate: A and
>         B will both start at t=0 and share the processing power of H. So A
>         will finish at t=15 and B at t=10. The solution is to add a
>         dependency between A and B before calling SD_simulate.
>         
>         It may not be the source of your problem, but to help you more, a
>         reduced example is needed.
>         
>         Cheers
>         
>         Fred
>         
>         
>         Martin Quinson wrote:
>         
>         > Hello Joao,
>         >
>         > you're asking a hard question. We are of course willing to help
>         > you use the tool, but unfortunately we don't have the time to
>         > debug the code of every user. What would really help is if you
>         > could come up with a reduced example that shows the problem. The
>         > smaller and simpler you make it, the easier it is for us to track
>         > down the issue and solve it. If you manage to do so, I guess that
>         > we can track the issue down very quickly.
>         >
>         > We need you to help us help you ;)
>         > Mt.
>         >
>         > On Wednesday 13 January 2010 at 12:08 -0200, João Paulo P
>         > Tonelli wrote:
>         >
>         >> Hello,
>         >> I have developed a scheduler for DAG applications using SimDAG.
>         >> Now I am trying to use it to schedule an application of 280 tasks
>         >> on 4 hosts, but the simulation finishes without executing all
>         >> tasks. I have checked whether all tasks were correctly scheduled,
>         >> and they were (I looked at the state of each task before the
>         >> "SD_simulate()" call).
>         >>
>         >> I have noticed that, after the simulation, the tasks which were
>         >> not executed either remain in the SD_SCHEDULED state or are in
>         >> the SD_IN_FIFO state. The tasks which remain in the SD_SCHEDULED
>         >> state were not executed because they depend indirectly on those
>         >> in the SD_IN_FIFO state.
>         >>
>         >> I do not know what to do.
>         >> Could someone help me?
>         >>
>         >>
>         >> Regards,
>         >> João Paulo Pereira Tonelli.
>         >> _______________________________________________
>         >> Simgrid-user mailing list
>         >> Simgrid-user at lists.gforge.inria.fr
>         >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/simgrid-user
>         >>
>         >
>         > -- I don't suffer from Insanity, I enjoy every minute of it...
>         >
>         >
>         
>         
>         
>         --
>         When things change, they change. Never let yourself get rattled!
>                                                        Maître Folace
>         
>         
>         
> 
> 
> 
> -- 
> Regards,
> João Paulo Pereira Tonelli. 
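
For reference, Fred's example of two independent tasks sharing host H can
be checked with a small stand-alone sketch. This is plain Python, not
SimGrid code: the function name `simulate`, the unit-speed host and the
equal split of its power among running tasks are assumptions made only
for illustration.

```python
def simulate(costs, deps=()):
    """Return the finish time of each task on a single host of speed 1.

    costs: dict mapping a task name to its amount of work
    deps:  iterable of (before, after) pairs of task names
    Assumption: tasks running at the same time share the host equally.
    """
    remaining = dict(costs)
    preds = {task: set() for task in costs}
    for before, after in deps:
        preds[after].add(before)

    finish = {}
    now = 0.0
    while remaining:
        # A task can run once all of its predecessors have finished.
        running = [t for t in remaining if preds[t].issubset(finish)]
        if not running:
            break  # no more action to do: the other tasks never execute
        rate = 1.0 / len(running)
        # Advance time until the running task with the least work is done.
        step = min(remaining[t] for t in running) / rate
        now += step
        for t in running:
            remaining[t] -= step * rate
            if remaining[t] <= 1e-9:
                finish[t] = now
                del remaining[t]
    return finish


# Independent tasks: both start at t=0 and share the host.
print(simulate({"A": 10, "B": 5}))                # {'B': 10.0, 'A': 15.0}
# With a dependency A -> B, the intended schedule is recovered.
print(simulate({"A": 10, "B": 5}, [("A", "B")]))  # {'A': 10.0, 'B': 15.0}
```

In this toy model, a task whose predecessor never completes is simply
absent from the result, which is analogous to the tasks left unexecuted
when the simulation ends.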

-- 
The pointer is to data what the while loop is to code.



