[SimGrid-user] Latency issues
frederic.suter at cc.in2p3.fr
Wed Jul 5 15:14:17 CEST 2017
This is easily explained. The underlying network model corrects the
latency declared in the platform file to reflect the impact of TCP slow
start on "large transfers". This correction is motivated and explained in
http://hal.inria.fr/hal-00646896/PDF/rr-validity.pdf (note that the
validity domain of this model is transfers larger than 100kB). It is
also mentioned in the output you get when adding the --help-models flag
to your command line:
Long description of the network models accepted by this simulator:
LV08: Realistic network analytic model (slow-start modeled by
multiplying latency by 10.4, bandwidth by .92; bottleneck sharing uses a
payload of S=8775 for evaluating RTT).
Unfortunately, these values are outdated. The latency correction factor
is now 13.01, which explains the observed results. The description given
by --help-models has been corrected, and the documentation on these
hidden parameters will be improved.
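As a quick sanity check, here is a minimal sketch of the arithmetic, assuming that for a tiny latency-bound message the simulated time is just the declared latency multiplied by the correction factor (the size/bandwidth term is treated as negligible). The helper name effective_latency is only illustrative and is not part of SimGrid.

```python
# Latency correction factor quoted in this message for the default model.
LATENCY_FACTOR = 13.01


def effective_latency(declared_latency_s, factor=LATENCY_FACTOR):
    """Illustrative helper (not a SimGrid API): latency seen by a small,
    latency-bound transfer, assuming the bandwidth term is negligible."""
    return declared_latency_s * factor


# Declared latencies taken from the original question:
# 1 ms (minimal platform) and 1.461517 ms (link 9 of small_platform.xml).
for declared in (1e-3, 1.461517e-3):
    simulated = effective_latency(declared)
    print(f"{declared:.6f} s declared -> {simulated:.6f} s simulated")
```

Under this assumption the two declared latencies give 0.013010 s and 0.019014 s, which are the ping times reported in the question.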
Coming back to your experiments, there are two situations:
1) if your workload comprises a majority of transfers larger than 100kB,
this factor is important to reflect the start-up cost of TCP.
2) if the transfers are smaller than 100kB, you may want to either
override this default parameter of the model with
--cfg=network/latency-factor:1.0 (to disable it completely) or switch to
the former default model that didn't apply corrections with
On 05/07/2017 at 03:52, Tristan Glatard wrote:
> We're trying to simulate a latency-bound application with Simgrid 3.16
> using MSG, and we're a bit puzzled by simulation times.
> In example "app-pingpong" the ping time between hosts Tremblay and
> Jupiter, i.e., the time for Jupiter to receive a 1-bit task from
> Tremblay, is 0.019014s while the latency of the link between Tremblay
> and Jupiter (id 9 in small_platform.xml) is only 1.461517ms (factor 13
> difference). And if we edit the platform to contain only the bare
> minimum (2 hosts and 1 link with 1ms latency, see attached), we still
> get a ping time of 13ms while we would expect 1ms:
>> [glatard at sapajou app-pingpong]$ ./app-pingpong ./smaller_platform.xml
>> [Tremblay:pinger:(1) 0.000000] [mag_app_pingpong/INFO] Ping -> Jupiter
>> [Jupiter:ponger:(2) 0.000000] [mag_app_pingpong/INFO] Pong -> Tremblay
>> [Jupiter:ponger:(2) 0.013010] [mag_app_pingpong/INFO] Task received :
>> small communication (latency bound)
>> [Jupiter:ponger:(2) 0.013010] [mag_app_pingpong/INFO] Ping time
>> (latency bound) 0.013010
>> [Jupiter:ponger:(2) 0.013010] [mag_app_pingpong/INFO] task_bw->data =
>> [Tremblay:pinger:(1) 150.166348] [mag_app_pingpong/INFO] Task
>> received : large communication (bandwidth bound)
>> [Tremblay:pinger:(1) 150.166348] [mag_app_pingpong/INFO] Pong time
>> (bandwidth bound): 150.153
>> [150.166348] [mag_app_pingpong/INFO] Total simulation time: 150.166
> The same factor-13 difference is still observed on a different but
> similar application.
> Do you have any idea what's going on?
One should, for example, be able to see that things are hopeless
and yet be determined to make them otherwise.
-- F. Scott Fitzgerald