[SimGrid-user] Getting started with SimDAG

Kiril Dichev K.Dichev at qub.ac.uk
Mon May 7 14:37:21 CEST 2018


Hey Thomas,

I have been using SimDAG quite a lot recently, and one thing I noticed in your code is this:

auto done = true;
xbt_dynar_foreach(dag, task_index, task) {
    if (SD_task_get_state(task) == SD_NOT_SCHEDULED) {
        done = false;
        break;
    }
}
if (done) 
    break;

Your tasks go through several more states after SD_NOT_SCHEDULED before they are done: SD_NOT_SCHEDULED -> SD_SCHEDULABLE -> SD_SCHEDULED -> SD_RUNNABLE -> SD_RUNNING -> SD_DONE (or SD_FAILED).
The issue is that a task can be in any of the intermediate states between SD_NOT_SCHEDULED and SD_DONE. The code above will then incorrectly set “done” to true, because it only checks for SD_NOT_SCHEDULED and ignores tasks lingering in the other states before SD_DONE. I think you should instead check whether any task at all is in a state other than SD_DONE.

I use something like this for my simulation:

// Returns true while at least one of the count tasks has not reached SD_DONE.
bool sim_not_done(SD_task_t* kernel_tasks, int count) {
    for (int j = 0; j < count; j++)
        if (SD_task_get_state(kernel_tasks[j]) != SD_DONE)
            return true;
    return false;
}
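
To make the difference between the two checks concrete, here is a minimal standalone sketch. It does not use the SimGrid headers at all: TaskState is a hypothetical stand-in for the SimDAG task states, and both function names are made up for illustration. It only shows why looking for SD_NOT_SCHEDULED alone can report “done” too early.

```cpp
#include <vector>

// Hypothetical stand-in for the SimDAG task states (not the real SimGrid
// enum). The order follows the lifecycle described above.
enum class TaskState {
    NotScheduled, Schedulable, Scheduled, Runnable, Running, Done, Failed
};

// The check from the original post: it only looks for tasks that are
// still in the NotScheduled state.
bool done_by_not_scheduled_check(const std::vector<TaskState>& tasks) {
    for (TaskState s : tasks)
        if (s == TaskState::NotScheduled)
            return false;
    return true;
}

// The stricter check: every task must actually have reached Done.
bool done_by_all_done_check(const std::vector<TaskState>& tasks) {
    for (TaskState s : tasks)
        if (s != TaskState::Done)
            return false;
    return true;
}
```

With one task done, one running and one schedulable, the first check already reports “done” while the second does not. Note that in this sketch a failed task never counts as done, so a real loop would probably want to handle that state separately.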

Regards,
Kiril

> On 2 May 2018, at 16:26, Thomas Mcsweeney <thomas.mcsweeney at postgrad.manchester.ac.uk> wrote:
> 
> Hello all,
> 
> I am a first-year PhD student at the University of Manchester, studying how we can apply techniques from reinforcement learning to design novel scheduling algorithms for applications on HPC systems, with a focus on linear algebra applications. (I have already corresponded with Frédéric and Arnaud; thank you again for your emails!)
> 
> I have been having some difficulty with some (in principle) simple SimDAG code that I have been working on and wonder whether anyone would be able to offer any help?
> 
> Having only recently begun to work with SimDAG, I first worked my way through this tutorial: 
> 
> http://simgrid.gforge.inria.fr/tutorials/simdag-101.pdf
> (Simulating DAG Scheduling Algorithms with SimDAG, by Frédéric Suter, Martin Quinson and Arnaud Legrand),
> 
> and then moved on to try writing a scheduler of my own. However, I have encountered a few problems.
> 
> Basically, I want to initialize a DAG (loaded from a DOT file) on every iteration of a loop (with a small number of iterations), then launch a simulation and work my way through the DAG in the body of the loop. After each iteration, I make use of data gathered to do some reinforcement learning things. The problem is that although my code compiles without a problem, when run it hangs forever after the first iteration and I am not sure why.
> 
> For debugging purposes, I created a simplified version of my code, with all the extraneous reinforcement learning bits removed, to illustrate where my problems seem to be. I am also using the most basic DAG (with just three nodes, from the tutorial mentioned above) and cluster (again, the one from the tutorial) I possibly can, to simplify things further - but I am still having the same problem (this simplified code is what I have quoted from below).
> 
> In each iteration of the loop, after loading the DAG with SD_dotload, I add watchpoints to all the tasks in it (as in the tutorial). I then schedule the root task on a random workstation:
> 
> auto root = get_root(dag);
> int r = r_workstations(mt);       // r = random number chosen from indices of the workstations.
> auto random_workstation = workstations[r];
> SD_task_schedulel(root, 1, random_workstation); 
> 
> Then I begin to simulate:
> 
> xbt_dynar_t changed_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
> SD_simulate_with_update(-1.0, changed_tasks); 
> while(!(xbt_dynar_is_empty(changed_tasks)))
> 
> (I am not sure at all of the correct syntax for doing this so it wouldn’t surprise me if it is incorrect - the example in the tutorial doesn’t seem to work for me, so the syntax I used above I got from another example I found somewhere, although I can’t remember where.)
> 
> Then in the body of the while loop, I find the tasks ready to be scheduled, choose one at random, then choose a workstation at random and schedule the chosen task on the chosen workstation.
> 
> // Get the ready task queue.
> auto ready_tasks = get_ready_tasks(dag);
> if (xbt_dynar_is_empty(ready_tasks))
>      continue;
> auto n_ready_tasks = xbt_dynar_length(ready_tasks); 
> 
> // Choose some task from the ready tasks.
> std::uniform_int_distribution<int> r_tasks(0, n_ready_tasks - 1);
> r = r_tasks(mt);
> xbt_dynar_get_cpy(ready_tasks, r, &task);
> 
> // Choose some workstation randomly.
> auto r = r_workstations(mt);
> auto workstation = workstations[r];
> 
> // Schedule the chosen task on the chosen workstation. 
> SD_task_schedulel(task, 1, workstation);
> 
> Here, get_ready_tasks is a function taken from one of the examples in the tutorial:
> 
> xbt_dynar_t get_ready_tasks(xbt_dynar_t dag) {
>     unsigned int i;
>     xbt_dynar_t ready_tasks = xbt_dynar_new(sizeof(SD_task_t), NULL);
>     SD_task_t task;
>     xbt_dynar_foreach(dag, i, task)
>          if (SD_task_get_kind(task) == SD_TASK_COMP_SEQ && SD_task_get_state(task) == SD_SCHEDULABLE)
>              xbt_dynar_push(ready_tasks, &task);
>     return ready_tasks;
> }
> 
> (At the moment, I check if the tasks are of kind SD_TASK_COMP_SEQ so I can use SD_task_schedulel, but this shouldn’t be a problem since all the tasks in my DAG are of this type.)
> 
> I then finish each iteration of the while loop by checking if we have finished scheduling all the tasks in the DAG:
> 
> auto done = true;
> xbt_dynar_foreach(dag, task_index, task) {
>     if (SD_task_get_state(task) == SD_NOT_SCHEDULED) {
>         done = false;
>         break;
>     }
> }
> if (done) 
>     break;
> 
> (NB: is there a better way to check if we have scheduled all the tasks in the DAG?)
> 
> As I said, the code compiles just fine but when I run it, it never seems to get past the first iteration of the loop and I eventually have to kill it, and I haven't been able to locate precisely what the problem is.  Again, I am very much a beginner so I am sure that it is just a silly, basic error on my part, but any guidance or suggestions that you may have would be greatly appreciated.
> 
> (Note that I can of course also provide a fuller code and more detail if anybody wishes.)
> 
> All the best,
> Tom 