[knem-devel] "unexisting region cookie" error

Brice Goglin Brice.Goglin at inria.fr
Mon Jul 11 22:35:17 CEST 2011


Just append the parameter and its value to the insmod or modprobe command line:

instead of doing
   insmod .....knem.ko
or
   modprobe knem

do
   insmod ....knem.ko statsverbose=1
or
   modprobe knem statsverbose=1

I am adding a small note about this to the documentation.

Brice
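
The debugging steps quoted below (check the ioctl return value, print the cookie when it is -1, then keep the process alive so that "cat /dev/knem" can be run as root) could be sketched in C roughly as follows. This is only a sketch under assumptions: checked_copy_ioctl is a hypothetical helper, not part of the KNEM API, and the cookie is assumed to fit in a uint64_t. The ioctl request number and argument are supplied by the caller, so no KNEM structure layout is assumed here.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of KNEM): issue an ioctl and, if it
 * fails, print the cookie that was submitted together with errno.
 * If hang_on_error is nonzero, the process then waits forever so the
 * driver state can be inspected from another terminal.
 */
static int checked_copy_ioctl(int fd, unsigned long request, void *arg,
                              uint64_t cookie, int hang_on_error)
{
    int ret = ioctl(fd, request, arg);

    if (ret < 0) {
        int saved_errno = errno; /* fprintf may overwrite errno */

        fprintf(stderr,
                "pid %ld: ioctl failed: %s (errno=%d), cookie=0x%016" PRIx64 "\n",
                (long)getpid(), strerror(saved_errno), saved_errno, cookie);
        fflush(stderr);

        if (hang_on_error) {
            /* Same purpose as "while (1);", but sleeps instead of spinning. */
            for (;;)
                pause();
        }
        errno = saved_errno; /* let the caller still test for EINVAL */
    }
    return ret;
}
```

In an MPI port, the call that submits the copy would go through this helper with the KNEM file descriptor, whatever copy command and argument structure your KNEM version defines, and the cookie stored in that argument. An errno of EINVAL on return matches the invalid-cookie case discussed in this thread.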




On 11/07/2011 22:30, bin wang wrote:
> hello Brice, 
>  
> Could you please tell me how to load the module with the parameter 
> statsverbose?
> I'm not familiar with this kernel module stuff and didn't find any
> further info on the KNEM webpage.
>
> Thanks.
>
> On Mon, Jul 11, 2011 at 4:26 PM, Brice Goglin <Brice.Goglin at inria.fr>
> wrote:
>
>     On 11/07/2011 22:20, bin wang wrote:
>>     hello Brice, 
>>
>>     Thanks for your timely reply.
>>
>>     I checked the return value of ioctl, and it was -1 whenever the
>>     process crashed.
>>
>>     In my code:
>>     1. I didn't set the KNEM_FLAG_SINGLEUSE flag, and the sender process
>>     was still running, so I assume that the memory region had not been
>>     destroyed yet.
>>     2. I always check the cookies on both the writer and reader sides,
>>     and they are exactly the same.
>>     3. The module info suggests that the requests were rejected because
>>     of an unexisting region.
>>
>>     Do you have any other suggestion?
>
>     One thing you could do is:
>     * Load the knem kernel module with the module parameter statsverbose=1
>     * In your MPI code, when you get -1, insert a debug printf with
>     the cookie value and then a "while (1);" so that the process stops
>     progressing and waits.
>     * When you see the above printf, run (as root) "cat /dev/knem" to
>     get even more module information, including the existing cookie
>     values in the driver (only printed if statsverbose=1).
>
>     Brice
>
>
>
>
>>
>>     I'm looking forward to hearing from you.
>>
>>     On Mon, Jul 11, 2011 at 4:10 PM, Brice Goglin
>>     <Brice.Goglin at inria.fr> wrote:
>>
>>         Hello,
>>
>>         From what I see in your knem module information, it complains
>>         that many knem copy requests were invalid because the submitted
>>         cookie is invalid. This usually suggests that the cookie value
>>         has been altered between when it was created with the region
>>         create ioctl and when it is used in a copy ioctl. Or it could be
>>         a cookie that has already been destroyed (either explicitly or
>>         through the single-use flag).
>>
>>         Are you developing your own MPI port over KNEM? If so, I suggest
>>         that you check the return value of the copy ioctl. When it
>>         returns -1 with errno=EINVAL, you should print the value of the
>>         cookie and check that it matches the cookie value that was
>>         previously returned by a region creation.
>>
>>
>>      
>>
>>         I am thinking of printing the available cookie values in the
>>         kernel logs when such an invalid cookie is requested. It would
>>         be very verbose when multiple processes are involved, but it may
>>         help you debug this kind of problem.
>>
>>         Brice
>>
>>
>>
>>         On 11/07/2011 21:47, bin Wang wrote:
>>         > hello All,
>>         >
>>         > I'm trying to use knem in MPI.
>>         > When there are only two processes, knem works properly.
>>         > When the number of processes is 3, the code does not work
>>         > properly all the time.
>>         > When the number of processes goes beyond 3, at least one
>>         > process crashes without calling finalize.
>>         >
>>         > I don't know why it's not working properly for me.
>>         > Below is the information of knem module.
>>         >
>>         > $ cat /dev/knem
>>         > knem 0.9.6
>>         >  Driver ABI=0xd
>>         >  Flags: forcing 0x0, ignoring 0x0
>>         >  DMAEngine: KernelSupported Enabled NoChannelAvailable
>>         >  Debug: NotBuilt
>>         >  Requests submitted                           : 68
>>         >  Requests processed (total)                   : 50
>>         >           processed (using DMA)               : 0
>>         >           processed (offloaded to thread)     : 0
>>         >           processed (with pinned local pages) : 0
>>         >  Requests rejected (invalid flags)            : 0
>>         >           rejected (not enough memory)        : 0
>>         >           rejected (invalid ioctl argument)   : 0
>>         >           rejected (unexisting region cookie) : 19
>>         >           rejected (failed to pin local pages): 0
>>         >  Requests failed during memcpy from/to user   : 0
>>         >           failed during DMA copy              : 0
>>         >  DMA copy cleanup timeout                     : 0
>>         >
>>         >
>>         > Can anyone help me out?
>>         >
>>
>>
>>
>>
>>     -- 
>>     Bin WANG
>>
>
>
>
>
> -- 
> Bin WANG
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gforge.inria.fr/pipermail/knem-devel/attachments/20110711/0ca2dc65/attachment-0001.htm>

