[knem-devel] "unexisting region cookie" error
Brice.Goglin at inria.fr
Mon Jul 11 22:26:34 CEST 2011
On 11/07/2011 22:20, bin wang wrote:
> hello Brice,
> Thanks for your timely reply.
> I checked the return value of ioctl, and it was -1 if the process
> In my code,
> 1. I didn't set the KNEM_FLAG_SINGLEUSE flag, and the sender process
> was still running.
> So I assume that the memory region was not destroyed yet.
> 2. I'm always checking the cookies on both the writer and reader sides, and
> they are exactly the same.
> 3. The module info suggests that the requests were rejected due to an
> unexisting region
> Do you have any other suggestion?
One thing you could do is:
* Load the knem kernel module with the module parameter statsverbose=1
* In your MPI code, when you get -1, insert a debug printf with the
cookie value and then a "while (1);" so that the process stops
progressing and waits.
* When you see the above printf, run (as root) "cat /dev/knem" to get
even more module information, including the existing cookie values in
the driver (only printed if statsverbose=1).
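
The debug step above could be sketched roughly as follows. This is only a minimal sketch, not KNEM code: format_copy_failure() and report_and_wait() are hypothetical helper names, the cookie is assumed to fit in a 64-bit integer, and pause() in a loop is swapped in for the literal "while (1);" so that the stopped process does not burn a core.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of KNEM): build a debug line for a failed
 * copy request. Returns 0 and leaves buf empty when rc is not -1. */
int format_copy_failure(char *buf, size_t len, int rc, int err,
                        uint64_t cookie)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (rc != -1)
        return 0;
    /* Assumption: the cookie can be printed as a 64-bit hex value. */
    return snprintf(buf, len,
                    "pid=%ld rc=%d errno=%d (%s) cookie=0x%016" PRIx64,
                    (long)getpid(), rc, err, strerror(err), cookie);
}

/* Hypothetical helper: print the debug line, then park the process so that
 * "cat /dev/knem" can be run while the process is still alive. */
void report_and_wait(int rc, int err, uint64_t cookie)
{
    char line[256];

    if (format_copy_failure(line, sizeof(line), rc, err, cookie) > 0) {
        fprintf(stderr, "%s\n", line);
        fflush(stderr);
        for (;;)
            pause(); /* sleep until a signal arrives, then sleep again */
    }
}
```

In the MPI code, report_and_wait(rc, errno, cookie) would be called right after the copy ioctl, passing errno immediately so that no later call overwrites it. The printed cookie can then be compared with the cookie values listed by "cat /dev/knem".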
> I'm looking forward to hearing from you.
> On Mon, Jul 11, 2011 at 4:10 PM, Brice Goglin <Brice.Goglin at inria.fr> wrote:
> From what I see in your knem module information, it complains that
> knem copy requests were invalid because the submitted cookie is
> This usually suggests that the cookie value has been altered between
> when it was created with the region create ioctl and when it is
> used in
> a copy ioctl. Or it could be a cookie that has already been destroyed
> (either explicitly or through the single-use flag).
> Are you developing your own MPI port over KNEM? If so, I suggest that
> you check the return value of the copy ioctl. When it returns -1 with
> errno=EINVAL, you should print the value of the cookie and check that it
> matches the cookie value that was previously returned by a region
> I am thinking of printing the available cookie values in the kernel logs
> when such an invalid cookie is requested. It would be very verbose
> multiple processes are involved, but it may help you debug this kind of
> On 11/07/2011 21:47, bin Wang wrote:
> > hello All,
> > I'm trying to utilize knem in MPI.
> > When there are only two processes, knem works properly.
> > When the number of processes is 3, the code does not work properly all
> > the time.
> > When the number of processes goes beyond 3, there will be at least one
> > that will crash without calling the finalize.
> > I don't know why it's not working properly for me.
> > Below is the information of knem module.
> > $ cat /dev/knem
> > knem 0.9.6
> > Driver ABI=0xd
> > Flags: forcing 0x0, ignoring 0x0
> > DMAEngine: KernelSupported Enabled NoChannelAvailable
> > Debug: NotBuilt
> > Requests submitted : 68
> > Requests processed (total) : 50
> > processed (using DMA) : 0
> > processed (offloaded to thread) : 0
> > processed (with pinned local pages) : 0
> > Requests rejected (invalid flags) : 0
> > rejected (not enough memory) : 0
> > rejected (invalid ioctl argument) : 0
> > rejected (unexisting region cookie) : 19
> > rejected (failed to pin local pages): 0
> > Requests failed during memcpy from/to user : 0
> > failed during DMA copy : 0
> > DMA copy cleanup timeout : 0
> > Can anyone help me out?
> Bin WANG