
RE: Kernel oops using DMA on MCM


I think I have just fixed the issue (fingers crossed). I first implemented
the suggested workaround for the receive buffers, but that did not fix the
problem. After a lot more experimentation and digging around, I found that
the problem appeared as soon as DMA transmit was started, and the kernel
crashed even when I had not attached any interrupt handlers to it. I then
tried aligning my DMA transmit buffer to the cache line size (32 bytes), and
after that the problem disappeared. I can now reliably reproduce the kernel
oops by not aligning the transmit buffer, and once I align it everything
seems to work. I will run it overnight to see whether it keeps running, and
do a bit more testing, but I am relatively confident that this was the issue.
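
For anyone wanting to try the same thing, here is a rough sketch of the
alignment arithmetic. It is not the code from my driver: it is plain
user-space C using malloc(), and the names CACHE_LINE_BYTES, struct tx_buf,
tx_buf_alloc() and tx_buf_free() are made up for illustration. Padding the
length up to a whole number of cache lines is an extra assumption on my part;
what I actually changed was only the start alignment. A real driver would use
whatever allocator the kernel provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_BYTES 32u  /* cache line size mentioned above */

/* Round n up to the next multiple of the cache line size. */
static size_t cache_line_round_up(size_t n)
{
    return (n + CACHE_LINE_BYTES - 1u) & ~(size_t)(CACHE_LINE_BYTES - 1u);
}

/* Hypothetical holder for a transmit buffer. */
struct tx_buf {
    void *raw;            /* pointer returned by malloc(); pass to free() */
    unsigned char *data;  /* cache-line-aligned start of the usable area */
    size_t len;           /* usable length, padded to whole cache lines */
};

/* Allocate at least len bytes starting on a cache line boundary.
 * Returns 0 on success, -1 if malloc() fails. */
static int tx_buf_alloc(struct tx_buf *b, size_t len)
{
    size_t padded = cache_line_round_up(len);
    uintptr_t addr;

    /* Over-allocate so an aligned start always fits inside the block. */
    b->raw = malloc(padded + CACHE_LINE_BYTES - 1u);
    if (b->raw == NULL)
        return -1;

    addr = ((uintptr_t)b->raw + CACHE_LINE_BYTES - 1u)
           & ~(uintptr_t)(CACHE_LINE_BYTES - 1u);
    b->data = (unsigned char *)addr;
    b->len = padded;

    assert((uintptr_t)b->data % CACHE_LINE_BYTES == 0);
    return 0;
}

/* Release the buffer and clear the holder. */
static void tx_buf_free(struct tx_buf *b)
{
    free(b->raw);
    b->raw = NULL;
    b->data = NULL;
    b->len = 0;
}
```

With this, asking for 100 bytes gives a 128-byte area whose start address is
a multiple of 32, so the buffer never shares a cache line with neighbouring
data.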

This was probably not the cache bug, though, because according to its
documentation that bug is only supposed to affect DMA receive channels, not
transmit. Any thoughts?



> -----Original Message-----
> From: Mikael Starvik [mailto:mikael.starvik@xxxxxxx.com]
> Sent: Wednesday, 18 December 2002 21:04
> To: 'Fettahlioglu, Mahmut'; 'Mikael Starvik'
> Cc: dev-etrax
> Subject: RE: Kernel oops using DMA on MCM
> >If you are interested I can send you its source as well.
> Yes, that would be nice.
> >The kernel backtraces are not identical at all times. To be more accurate,
> >the initial
> >Trace; c0014e8a <do_generic_file_read+3a8/3ae>
> >Trace; c0026312 <search_binary_handler+52/fe>
> >Trace; c0025836 <copy_strings+0/1ae>
> >Trace; c002650e <do_execve+150/1aa>
> >Trace; c00455f0 <sys_execve+2a/42>
> >Trace; c00460be <system_call+50/58>
> >part is identical in each trace
> strange. The sequence above would of course occur if any process
> dies and init respawns it but your OOPS says that it is cat that
> has this calltrace and it is just bogus.
> >Actually, option two may also be worthwhile to consider as well. If the CPU
> >was just not caching the area, the driver would not need to be made any more
> >complex than it currently is. Do you think this is something feasible to do?
> This would imply modifications in several non-trivial places to set up
> the MMU to do this and to make sure that your data is really allocated
> in the uncached area. I really think it is easier to add the few lines
> in the synch serial driver. 
> Do you also have lots of network traffic when the problem happens?
> After how long time does the problem occur (seconds? minutes? hours?)
> /Mikael