(usagi-users 02310) BUG check in latest (4.1) IPV6



Your kind help is requested.

I am seeing a BUG() check fire in schedule() while running IPv6 with the USAGI patches on a kernel derived from Linux 2.4.18.

I have worked through the backtrace and find the following:

schedule()
schedule_timeout()
sock_wait_for_wmem()
sock_alloc_send_pskb()
sock_alloc_send_skb()
ndisc_send_rs()
addrconf_rs_timer()
run_timer_list()
cpu_idle()

I am running a VLAN test at the time, with large numbers of VLAN devices being created and configured.

I can send the backtrace and stack info, but reading the source makes the problem apparent:

run_timer_list() calls the timer function from the timer bottom half, where in_interrupt() is true.
ndisc_send_rs calls sock_alloc_send_skb with the noblock parameter = 0.


Thus, when sock_alloc_send_pskb() finds no room for a new skb, it calls sock_wait_for_wmem(), because the call is blocking. That puts the "process" on the wait queue to wait for a timeout to expire and then calls the scheduler, which hits the check
if (in_interrupt()) BUG();
This is, of course, fatal.
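
To make the path concrete, here is a minimal userspace sketch of it. It is not kernel code: every model_* name is hypothetical, and the allocator is simplified to give up after one simulated sleep instead of looping as the real one does.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model only; all names here are hypothetical stand-ins. */
bool model_in_interrupt;    /* stands in for in_interrupt()          */
bool model_wmem_available;  /* false: no room for a new skb          */
int  model_bug_hits;        /* counts simulated BUG() checks         */

/* Models schedule(): sleeping from interrupt context trips the check. */
void model_schedule(void)
{
    if (model_in_interrupt)
        model_bug_hits++;   /* the real kernel calls BUG() here */
}

/* Models the allocation path. Returns 0 on success or -EAGAIN on failure. */
int model_alloc_send_skb(int noblock)
{
    if (model_wmem_available)
        return 0;
    if (noblock)
        return -EAGAIN;     /* non-blocking: fail without sleeping */
    model_schedule();       /* blocking: wait for write memory     */
    return -EAGAIN;         /* simplification: one wait, then give up */
}
```

With model_in_interrupt set and no memory available, model_alloc_send_skb(0) reaches model_schedule() and records a BUG hit, while model_alloc_send_skb(1) returns -EAGAIN without ever reaching it.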


The obvious solution is to call sock_alloc_send_skb() with noblock = 1, so that it does not attempt to sleep but fails with -EAGAIN instead. The ndisc_send_rs() routine will need to be recoded to check the error code as well as the NULL return, and to take appropriate action, such as rescheduling the timeout and returning.

However, it is not obvious to me how or where the function should be rescheduled. It looks as though the ndisc_send_rs() routine should return an error code so that the timer function can determine whether or not to reschedule the timeout.
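
One possible shape for this, again as a userspace sketch and not a patch: the send routine returns an error code, and the timer handler re-arms itself on -EAGAIN instead of sleeping. The struct and function names below are hypothetical and do not match the actual ndisc or addrconf code. The model also ignores the normal re-arming for further solicitations after a successful send.

```c
#include <errno.h>

/* Hypothetical model state; these are not kernel structures. */
struct rs_timer_model {
    int free_slots;   /* send-buffer slots currently available     */
    int probes_sent;  /* solicitations successfully sent           */
    int rearm_count;  /* times the handler re-armed itself to retry */
    int armed;        /* nonzero while a retry timer is pending    */
};

/* Models a non-blocking send routine that reports failure to its caller. */
int rs_send_model(struct rs_timer_model *m)
{
    if (m->free_slots <= 0)
        return -EAGAIN;   /* would have slept; report the error instead */
    m->free_slots--;
    m->probes_sent++;
    return 0;
}

/* Models the timer handler: on -EAGAIN, re-arm and return without sleeping. */
void rs_timer_model_fire(struct rs_timer_model *m)
{
    int err = rs_send_model(m);

    m->armed = 0;         /* this expiry has been consumed */
    if (err == -EAGAIN) {
        m->rearm_count++;
        m->armed = 1;     /* a real handler would re-arm its timer here */
    }
}
```

In this model the first expiry with no free slots only re-arms the timer; once a slot is free, the next expiry sends the solicitation and leaves the timer disarmed.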

Comments, please.

Thanks,
Mark Huth