[knem-devel] "unexisting region cookie" error

bin wang bighead521 at gmail.com
Mon Jul 11 22:30:40 CEST 2011


hello Brice,

Could you please tell me how to load the module with the statsverbose
parameter?
I'm not familiar with kernel modules and didn't find any further
information on the KNEM webpage.

Thanks.

On Mon, Jul 11, 2011 at 4:26 PM, Brice Goglin <Brice.Goglin at inria.fr> wrote:

> On 11/07/2011 22:20, bin wang wrote:
>
> hello Brice,
>
>  Thanks for your timely reply.
>
>  I checked the return value of the ioctl, and it is -1 whenever the process
> crashes.
>
>  In my code:
> 1. I didn't set the KNEM_FLAG_SINGLEUSE flag, and the sender process was
> still running, so I assume that the memory region had not been destroyed yet.
> 2. I always check the cookies on both the writer and reader sides, and they
> are exactly the same.
> 3. The module information suggests that the requests were rejected because
> of an unexisting region cookie.
>
>
>
>  Do you have any other suggestion?
>
>
> One thing you could do is:
> * Load the knem kernel module with the module parameter statsverbose=1
> * In your MPI code, when you get -1, insert a debug printf with the cookie
> value and then a "while (1);" so that the process stops progressing and
> waits.
> * When you see the above printf, run (as root) "cat /dev/knem" to get even
> more module information, including the existing cookie values in the driver
> (only printed if statsverbose=1).
>
> Brice
>
>  I'm looking forward to hearing from you.
>
>  On Mon, Jul 11, 2011 at 4:10 PM, Brice Goglin <Brice.Goglin at inria.fr> wrote:
>
>> Hello,
>>
>> From what I see in your knem module information, it complains that many
>> knem copy requests were invalid because the submitted cookie is invalid.
>> This usually suggests that the cookie value has been altered between
>> when it was created with the region create ioctl and when it is used in
>> a copy ioctl. Or it could be a cookie that has already been destroyed
>> (either explicitly or through the single-use flag).
>>
>> Are you developing your own MPI port over KNEM? If so, I suggest that
>> you check the return value of the copy ioctl. When it returns -1 with
>> errno=EINVAL, you should print the value of the cookie and check that it
>> matches the cookie value that was previously returned by a region
>> creation.
>>
>>
>
>
>> I am thinking of printing the available cookie values in the kernel logs
>> when such an invalid cookie is requested. It would be very verbose when
>> multiple processes are involved, but it may help you debug this kind of
>> problem.
>>
>> Brice
>>
>>
>>
>> On 11/07/2011 21:47, bin Wang wrote:
>> > hello All,
>> >
>> > I'm trying to use knem in MPI.
>> > With only two processes, knem works properly.
>> > With 3 processes, the code does not always work properly.
>> > With more than 3 processes, at least one process crashes
>> > without calling finalize.
>> >
>> > I don't know why it's not working properly for me.
>> > Below is the knem module information.
>> >
>> > $ cat /dev/knem
>> > knem 0.9.6
>> >  Driver ABI=0xd
>> >  Flags: forcing 0x0, ignoring 0x0
>> >  DMAEngine: KernelSupported Enabled NoChannelAvailable
>> >  Debug: NotBuilt
>> >  Requests submitted                           : 68
>> >  Requests processed (total)                   : 50
>> >           processed (using DMA)               : 0
>> >           processed (offloaded to thread)     : 0
>> >           processed (with pinned local pages) : 0
>> >  Requests rejected (invalid flags)            : 0
>> >           rejected (not enough memory)        : 0
>> >           rejected (invalid ioctl argument)   : 0
>> >           rejected (unexisting region cookie) : 19
>> >           rejected (failed to pin local pages): 0
>> >  Requests failed during memcpy from/to user   : 0
>> >           failed during DMA copy              : 0
>> >  DMA copy cleanup timeout                     : 0
>> >
>> >
>> > Can anyone help me out?
>> >
>>
>>
>
>
> --
> Bin WANG
>
>
>


-- 
Bin WANG
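
Brice's suggestion in the quoted messages above (check the copy ioctl for -1
with errno=EINVAL, print the cookie, then stop the process so that
"cat /dev/knem" can be run as root) could be sketched in C roughly as follows.
This is only a sketch under assumptions: knem_debug_copy_failed and
knem_debug_park are hypothetical helper names, not part of KNEM; the cookie is
assumed to fit in a uint64_t; and fd, request, cmd and cookie in the call-site
comment stand for whatever the MPI code already uses. It also sleeps in
pause() rather than spinning in "while (1);", which is a substitution for the
loop suggested in the thread.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of the KNEM API.
 * rc          : return value of the copy ioctl
 * saved_errno : errno captured immediately after the ioctl
 * cookie      : cookie value that was submitted with the copy request
 * Prints the cookie on failure and returns 1 when the failure is
 * rc == -1 with EINVAL (the case discussed in this thread), else 0. */
int knem_debug_copy_failed(int rc, int saved_errno, uint64_t cookie)
{
    if (rc != -1)
        return 0;   /* the ioctl did not fail: nothing to report */

    fprintf(stderr,
            "[pid %ld] copy ioctl failed: %s (errno=%d), cookie=0x%016" PRIx64 "\n",
            (long)getpid(), strerror(saved_errno), saved_errno, cookie);
    fflush(stderr);

    return saved_errno == EINVAL;
}

/* Park the process so that it stays alive while /dev/knem is inspected.
 * pause() sleeps until a signal arrives; the loop re-enters it afterwards. */
void knem_debug_park(void)
{
    for (;;)
        pause();
}

/* Call-site sketch (fd, request, cmd and cookie are placeholders):
 *
 *     int rc  = ioctl(fd, request, &cmd);
 *     int err = errno;
 *     if (knem_debug_copy_failed(rc, err, cookie))
 *         knem_debug_park();
 */
```

The printed cookie can then be compared with the value returned when the
region was created, and with the cookies listed by the driver when it was
loaded with statsverbose=1.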