[SimGrid-user] Getting started with SimDAG

Kiril Dichev K.Dichev at qub.ac.uk
Thu May 10 21:23:14 CEST 2018


The snippet is missing includes and has some pseudocode lines.

Can you attach a source file that actually compiles? Source code is often the most self-explanatory piece of information you can provide.

Regards,
Kiril

> On 10 May 2018, at 16:39, Thomas Mcsweeney <thomas.mcsweeney at postgrad.manchester.ac.uk> wrote:
> 
> Hi Kiril,
> 
> Thanks for the reply!
> 
> I have taken your advice but still seem to be having difficulty.  I think the problem is that my tasks never seem to get beyond the state SD_RUNNABLE and so other tasks that depend on them never get scheduled.  
> 
> Below I have included a very simple version of the code that I am using to debug, in case anybody is willing to have a look and suggest where I might be going wrong. Is there something simple I am missing? Any advice would be greatly appreciated.
> 
> All the best,
> Tom
> 
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> #define number_episodes 1              
> // Number of times to load the DAG afresh and work through it - never gets past the first iteration so set it to 1 for convenience.
> 
> SD_task_t get_root(xbt_dynar_t dot){
>   // Returns the root of the DAG.
>   SD_task_t task;  
>   xbt_dynar_get_cpy(dot, 0, &task);
>   return task;
> }
> 
> xbt_dynar_t get_ready_tasks(xbt_dynar_t dag) {
>   // Returns an array of the tasks ready to be scheduled.
>   unsigned int i;
>   xbt_dynar_t ready_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
>   SD_task_t task;  
>   xbt_dynar_foreach(dag, i, task)
>     if (SD_task_get_state(task) == SD_SCHEDULABLE) {      
>       xbt_dynar_push(ready_tasks, &task);
>     }
>   return ready_tasks;
> }
> 
> bool simulation_complete(xbt_dynar_t dag) {  
>   // Check if all the tasks in the DAG are done.
>   SD_task_t task;
>   int task_index;
>   xbt_dynar_foreach(dag, task_index, task) {
>     if (SD_task_get_state(task) != SD_DONE) 
>       return false;
>   }
>   return true;  
> }
> 
> int main(int argc, char **argv) {  
>   
>   SD_init(&argc, argv); // Initialize SimDAG.   
>   SD_create_environment("./platform.xml");   // Define the environment.
> 
>   int task_index, i;
>   SD_task_t task; 
>   
>   const auto total_nworkstations = SD_workstation_get_number();
>   const auto workstations = SD_workstation_get_list();  
> 
>   // Run for number_episodes number of episodes.
>   for (i = 0; i < number_episodes; ++i) {     
>     
>     // Load the DAG from a dot file.
>     // Very basic DAG - three nodes c1, c2 and c3 with dependencies c1->c3 and c2->c3, all sequential computations of small amounts.
>     auto dag = SD_dotload("./task_graph.dot");    
> 
>     // Schedule the root task on a random workstation.
>     auto root = get_root(dag);
>     ... *find a random workstation* ...
>     SD_task_schedulel(root, 1, random_workstation);
> 
>     // Simulate an episode.
>     xbt_dynar_t changed_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
>     SD_simulate_with_update(-1.0, changed_tasks);   
>     while(!(xbt_dynar_is_empty(changed_tasks))) {         
> 
>       // Get the ready task queue.
>       auto ready_tasks = get_ready_tasks(dag);
>       if (xbt_dynar_is_empty(ready_tasks)) {        
>           continue;
>       }                            
> 
>       // Choose some task randomly from the ready tasks.
>       r = ...*random index*...        
>       xbt_dynar_get_cpy(ready_tasks, r, &task);      
> 
>       // Choose some workstation randomly.
>       workstation = ... *find a random workstation* ...           
> 
>       // Schedule the chosen task on the chosen workstation.    
>       SD_task_schedulel(task, 1, workstation);        
>       
>       // Check if all tasks in the DAG have been scheduled, and exit if that is the case.      
>       if (simulation_complete(dag))
>           break;               
>     }   
> 
>     // Tidy up at the end of each episode.
>     xbt_dynar_free_container(&changed_tasks);    
>   }
> 
>   // Exit SimDAG.
>   SD_exit();
> 
>   return 0;
> }
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
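
One observation on the control flow of the snippet above, independent of SimDAG itself: SD_simulate_with_update is called once before the while loop, and as far as the snippet shows nothing inside the loop modifies changed_tasks, so the loop condition cannot become false and the `continue` branch repeats without any state changing. The sketch below is a toy model of a loop that re-runs the simulation step on every iteration. It does not use SimGrid: `State`, `simulate_step`, `schedule_one` and `run` are hypothetical names, and whether the real SimDAG program should be structured this way is an assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, NOT SimGrid: all names below are hypothetical.
enum class State { Schedulable, Scheduled, Done };

// Advance the toy simulation: every Scheduled task completes.
// Returns the indices of the tasks whose state changed in this step.
std::vector<std::size_t> simulate_step(std::vector<State>& tasks) {
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        if (tasks[i] == State::Scheduled) {
            tasks[i] = State::Done;
            changed.push_back(i);
        }
    }
    return changed;
}

// Schedule the first Schedulable task found; returns false if there is none.
bool schedule_one(std::vector<State>& tasks) {
    for (State& s : tasks) {
        if (s == State::Schedulable) {
            s = State::Scheduled;
            return true;
        }
    }
    return false;
}

// Schedule, then simulate again, until a simulation step changes nothing.
// Returns the number of simulation steps taken.
int run(std::vector<State>& tasks) {
    int steps = 0;
    while (true) {
        schedule_one(tasks);
        // The list of changed tasks is recomputed on every iteration,
        // so the exit condition below can actually become true.
        std::vector<std::size_t> changed = simulate_step(tasks);
        ++steps;
        if (changed.empty())
            break;
    }
    return steps;
}
```

In this model the list of changed tasks is a fresh result of each simulation step, so the loop ends once a step changes nothing; calling `run` on a vector of Schedulable tasks leaves every task in State::Done.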
> 
> From: Kiril Dichev <K.Dichev at qub.ac.uk>
> Sent: Monday, May 7, 2018 1:37:21 PM
> To: Thomas Mcsweeney
> Cc: simgrid-user at lists.gforge.inria.fr; Mawussi Zounon
> Subject: Re: [SimGrid-user] Getting started with SimDAG
>  
> Hey Thomas,
> 
> I have been using SimDAG quite a lot recently, and one thing I noticed is this:
> 
> auto done = true;
> xbt_dynar_foreach(dag, task_index, task) {
>     if (SD_task_get_state(task) == SD_NOT_SCHEDULED) {
>         done = false;
>         break;
>     }
> }
> if (done) 
>     break;
> 
> Your tasks go through a lot more states than SD_NOT_SCHEDULED before they are done: SD_NOT_SCHEDULED -> SD_SCHEDULABLE -> SD_SCHEDULED -> SD_RUNNABLE -> SD_RUNNING -> SD_DONE (or SD_FAILED).
> One possible issue is that a task may be in any of the states between SD_NOT_SCHEDULED and SD_DONE. The code above then incorrectly reports "done" while tasks are still lingering in states before SD_DONE. I think you should instead check whether any task at all is in a state other than SD_DONE.
> 
> I use something like this for my simulation:
> 
> // count is assumed to be defined elsewhere as the number of tasks in kernel_tasks.
> bool sim_not_done(SD_task_t* kernel_tasks) {
>     for (int j = 0; j < count; j++)
>         if (SD_task_get_state(kernel_tasks[j]) != SD_DONE)
>             return true;
>     return false;
> }
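
The difference between the two checks discussed above can be shown with a small self-contained model. This is only a sketch: `TaskState` is a hypothetical stand-in for the SimDAG state constants listed above (it does not use SimGrid), and `none_unscheduled`, `all_done` and `demo` are illustrative names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the task states listed above.
// These are NOT the SimGrid definitions, only a model for illustration.
enum class TaskState {
    NotScheduled, Schedulable, Scheduled, Runnable, Running, Done, Failed
};

// Check from the original question: "done" means no task is NotScheduled.
bool none_unscheduled(const std::vector<TaskState>& tasks) {
    for (TaskState s : tasks)
        if (s == TaskState::NotScheduled)
            return false;
    return true;
}

// Check suggested in the reply: "done" means every task is Done.
bool all_done(const std::vector<TaskState>& tasks) {
    for (TaskState s : tasks)
        if (s != TaskState::Done)
            return false;
    return true;
}

// Small self-check: two tasks finished, one still waiting to run.
void demo() {
    std::vector<TaskState> tasks = {
        TaskState::Done, TaskState::Done, TaskState::Runnable};
    assert(none_unscheduled(tasks));  // reports "done" too early
    assert(!all_done(tasks));         // still sees unfinished work
}
```

Calling `demo()` from a `main` function exercises both checks on the same task list: the first accepts a list that still contains a Runnable task, while the second only accepts it once every entry is Done.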
> 
> Regards,
> Kiril
> 
>> On 2 May 2018, at 16:26, Thomas Mcsweeney <thomas.mcsweeney at postgrad.manchester.ac.uk> wrote:
>> 
>> Hello all,
>> 
>> I am a first-year PhD student at the University of Manchester, studying how we can apply techniques from reinforcement learning to design novel scheduling algorithms for applications on HPC systems, with a focus on linear algebra applications. (I have already corresponded with Frédéric and Arnaud; thank you again for your emails!)
>> 
>> I have been having some difficulty with some (in principle) simple SimDAG code that I have been working on and wonder whether anyone would be able to offer any help?
>> 
>> Having only recently begun to work with SimDAG, I first worked my way through this tutorial:
>> 
>> http://simgrid.gforge.inria.fr/tutorials/simdag-101.pdf
>> (Simulating DAG Scheduling Algorithms with SimDAG, by Frédéric Suter, Martin Quinson and Arnaud Legrand)
>> 
>> and then moved on to attempt to write a scheduler of my own. However, I have encountered a few problems.
>> 
>> Basically, I want to initialize a DAG (loaded from a DOT file) on every iteration of a loop (with a small number of iterations), then launch a simulation and work my way through the DAG in the body of the loop. After each iteration, I make use of data gathered to do some reinforcement learning things. The problem is that although my code compiles without a problem, when run it hangs forever after the first iteration and I am not sure why.
>> 
>> For debugging purposes, I created a simplified version of my code, with all the extraneous reinforcement learning bits removed, to illustrate where my problems seem to be. I am also using the simplest possible DAG (just three nodes, from the tutorial mentioned above) and cluster (again, the one from the tutorial), but I still have the same problem. The snippets quoted below come from this simplified code.
>> 
>> In each iteration of the loop, after loading the DAG with SD_dotload, I add watchpoints to all the tasks in it (as in the tutorial). I then schedule the root task on a random workstation:
>> 
>> auto root = get_root(dag);
>> int r = r_workstations(mt);       // r = random number chosen from indices of the workstations.
>> auto random_workstation = workstations[r];
>> SD_task_schedulel(root, 1, random_workstation); 
>> 
>> Then I begin to simulate:
>> 
>> xbt_dynar_t changed_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
>> SD_simulate_with_update(-1.0, changed_tasks); 
>> while(!(xbt_dynar_is_empty(changed_tasks)))
>> 
>> (I am not sure at all of the correct syntax for doing this so it wouldn’t surprise me if it is incorrect - the example in the tutorial doesn’t seem to work for me, so the syntax I used above I got from another example I found somewhere, although I can’t remember where.)
>> 
>> Then in the body of the while loop, I find the tasks ready to be scheduled, choose one at random, then choose a workstation at random and schedule the chosen task on the chosen workstation.
>> 
>> // Get the ready task queue.
>> auto ready_tasks = get_ready_tasks(dag);
>> if (xbt_dynar_is_empty(ready_tasks))
>>      continue;
>> auto n_ready_tasks = xbt_dynar_length(ready_tasks); 
>> 
>> // Choose some task from the ready tasks.
>> std::uniform_int_distribution<int> r_tasks(0, n_ready_tasks - 1);
>> r = r_tasks(mt);
>> xbt_dynar_get_cpy(ready_tasks, r, &task);
>> 
>> // Choose some workstation randomly.
>> auto r = r_workstations(mt);
>> auto workstation = workstations[r];
>> 
>> // Schedule the chosen task on the chosen workstation. 
>> SD_task_schedulel(task, 1, workstation);
>> 
>> Here, get_ready_tasks is a function taken from one of the examples in the tutorial:
>> 
>> xbt_dynar_t get_ready_tasks(xbt_dynar_t dag) {
>>     unsigned int i;
>>     xbt_dynar_t ready_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
>>     SD_task_t task;
>>     xbt_dynar_foreach(dag, i, task)
>>          if (SD_task_get_kind(task) == SD_TASK_COMP_SEQ && SD_task_get_state(task) == SD_SCHEDULABLE)
>>              xbt_dynar_push(ready_tasks, &task);
>>     return ready_tasks;
>> }
>> 
>> (At the moment, I check if the tasks are of kind SD_TASK_COMP_SEQ so I can use SD_task_schedulel, but this shouldn’t be a problem since all the tasks in my DAG are of this type.)
>> 
>> I then finish each iteration of the while loop by checking if we have finished scheduling all the tasks in the DAG:
>> 
>> auto done = true;
>> xbt_dynar_foreach(dag, task_index, task) {
>>     if (SD_task_get_state(task) == SD_NOT_SCHEDULED) {
>>         done = false;
>>         break;
>>     }
>> }
>> if (done) 
>>     break;
>> 
>> (NB: is there a better way to check if we have scheduled all the tasks in the DAG?)
>> 
>> As I said, the code compiles just fine but when I run it, it never seems to get past the first iteration of the loop and I eventually have to kill it, and I haven't been able to locate precisely what the problem is.  Again, I am very much a beginner so I am sure that it is just a silly, basic error on my part, but any guidance or suggestions that you may have would be greatly appreciated.
>> 
>> (Note that I can of course also provide a fuller code and more detail if anybody wishes.)
>> 
>> All the best,
>> Tom 
>> 
>> 
>> _______________________________________________
>> Simgrid-user mailing list
>> Simgrid-user at lists.gforge.inria.fr
>> https://lists.gforge.inria.fr/mailman/listinfo/simgrid-user