
Re: MCM Crashes



Hi Karl-Johan,

See below for the output of ksymoops.
Really appreciate the support.

Regards,
Peter

--- ksymoops output begin ---
>>EIP; c0009a24 <unhash_process+80/bc>   <=====

>>IRP; c0009a24 <unhash_process+80/bc>
>>SRP; c0008d32 <release_task+3a/b0>
>>IRP; c0009a24 <unhash_process+80/bc>
>>SRP; c0008d32 <release_task+3a/b0>
>>r0; c1c18000 <_end+1af34c0/1edb4c0>
>>r1; c1c58000 <_end+1b334c0/1edb4c0>
>>r5; c1c59f84 <_end+1b35444/1edb4c0>
>>r10; c1c18000 <_end+1af34c0/1edb4c0>
>>r11; c00d9c80 <xtime+0/8>
>>r12; c1c1809d <_end+1af355d/1edb4c0>
>>oR10; c1c18000 <_end+1af34c0/1edb4c0>

Trace; c00085a6 <printk+0/14e>
Trace; c005bf0a <show_stack+0/8a>
Trace; c005c068 <show_registers+d4/13a>
Trace; c005c12e <die_if_kernel+34/46>
Trace; c00085a6 <printk+0/14e>
Trace; c005f136 <do_page_fault+222/2cc>
Trace; c0012066 <do_wp_page+23c/266>
Trace; c0012614 <handle_mm_fault+82/c8>
Trace; c005ef10 <handle_mmu_bus_fault+b0/b4>
Trace; c005bd22 <mmu_bus_fault+28/30>
Trace; c0008d32 <release_task+3a/b0>
Trace; c0009a24 <unhash_process+80/bc>
Trace; dbed0092 <END_OF_CODE+19ed0092/????>
Trace; c0008d32 <release_task+3a/b0>
Trace; c0009920 <sys_wait4+296/30c>
Trace; c005bc26 <system_call+50/58>

Code;  c0009a18 <unhash_process+74/bc>
00000000 <_EIP>:
Code;  c0009a18 <unhash_process+74/bc>
   0:   69 96 0c 30 0f 05 5f      imul   $0xa19d5f,0x50f300c(%esi),%edx
Code;  c0009a1f <unhash_process+7b/bc>
   7:   9d a1 00
Code;  c0009a22 <unhash_process+7e/bc>
   a:   ed                        in     (%dx),%eax
Code;  c0009a23 <unhash_process+7f/bc>   <=====
   b:   db 10                     fistl  (%eax)   <=====
Code;  c0009a25 <unhash_process+81/bc>
   d:   e0 6a                     loopne 79 <_EIP+0x79>
Code;  c0009a27 <unhash_process+83/bc>
   f:   96                        xchg   %eax,%esi
Code;  c0009a28 <unhash_process+84/bc>
  10:   5f                        pop    %edi
Code;  c0009a29 <unhash_process+85/bc>
  11:   ad                        lods   %ds:(%esi),%eax
Code;  c0009a2a <unhash_process+86/bc>
  12:   95                        xchg   %eax,%ebp
Code;  c0009a2b <unhash_process+87/bc>
  13:   00 69 9a                  add    %ch,0xffffff9a(%ecx)
Code;  c0009a2e <unhash_process+8a/bc>
  16:   5f                        pop    %edi
Code;  c0009a2f <unhash_process+8b/bc>
  17:   9d                        popf

Oops: bitten by watchdog
IRP: c000e544 SRP: c0009348 DCCR: 00000400 USP: 9ffffae0 MOF: 00000000
 r0: c1c18000  r1: 000000fd   r2: c1c58099  r3: c1c58000
 r4: c00f4000  r5: c000e41e   r6: c1c5802c  r7: c1c58064
 r8: c1c58010  r9: 0000000e  r10: 00000011 r11: 00000011
r12: c1c18095 r13: 00000000 oR10: 00000011
Process myscript (pid: 253, stackpage=c1c58000)
Stack from 9ffffae0:
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
Stack from c1c59ca8:
       c00085a6 c1c59d24 c005bf0a c005c068 00000011 00000000 c00f4000 c1c58000
       c1c58099 000000fd c1c59ce0 c005c0e8 c1c18000 c005bdac 00000000 00000011
       00000000 c1c18095 00000011 00000011 0000000e c1c58010 c1c58064 c1c5802c
Call Trace: [<c00085a6>] [<c005bf0a>] [<c005c068>] [<c005c0e8>]
[<c005bdac>] [<c000e41e>] [<c0009348>]
       [<c000e544>] [<c000e5bc>] [<c0020cce>] [<c0009348>]
[<c000963e>] [<c005c13c>] [<c00085a6>] [<c005f136>]
       [<c0012066>] [<c0012614>] [<c005ef10>] [<c005bd22>]
[<c0008d32>] [<c0009a24>] [<dbed0092>] [<c0008d32>]
       [<c0009920>] [<c005bc26>]
Code: 69 9a 14 e1 e9 9b 5f 0d f1 00 69 9a (1c) e1 e9 9b 5f 0d f5 00 69 9a 20 e1


>>EIP; c000e544 <do_notify_parent+2e/ae>   <=====

>>IRP; c000e544 <do_notify_parent+2e/ae>
>>SRP; c0009348 <exit_notify+1b2/236>
>>IRP; c000e544 <do_notify_parent+2e/ae>
>>SRP; c0009348 <exit_notify+1b2/236>
>>r0; c1c18000 <_end+1af34c0/1edb4c0>
>>r2; c1c58099 <_end+1b33559/1edb4c0>
>>r3; c1c58000 <_end+1b334c0/1edb4c0>
>>r4; c00f4000 <init_task_union+0/2000>
>>r5; c000e41e <kill_pg+0/16>
>>r6; c1c5802c <_end+1b334ec/1edb4c0>
>>r7; c1c58064 <_end+1b33524/1edb4c0>
>>r8; c1c58010 <_end+1b334d0/1edb4c0>
>>r12; c1c18095 <_end+1af3555/1edb4c0>

Trace; c00085a6 <printk+0/14e>
Trace; c005bf0a <show_stack+0/8a>
Trace; c005c068 <show_registers+d4/13a>
Trace; c005c0e8 <watchdog_bite_hook+1a/1e>
Trace; c005bdac <Watchdog_bite+1a/1c>
Trace; c000e41e <kill_pg+0/16>
Trace; c0009348 <exit_notify+1b2/236>
Trace; c000e544 <do_notify_parent+2e/ae>
Trace; c000e5bc <do_notify_parent+a6/ae>
Trace; c0020cce <fput+d0/f8>
Trace; c0009348 <exit_notify+1b2/236>
Trace; c000963e <do_exit+272/294>
Trace; c005c13c <die_if_kernel+42/46>
Trace; c00085a6 <printk+0/14e>
Trace; c005f136 <do_page_fault+222/2cc>
Trace; c0012066 <do_wp_page+23c/266>
Trace; c0012614 <handle_mm_fault+82/c8>
Trace; c005ef10 <handle_mmu_bus_fault+b0/b4>
Trace; c005bd22 <mmu_bus_fault+28/30>
Trace; c0008d32 <release_task+3a/b0>
Trace; c0009a24 <unhash_process+80/bc>
Trace; dbed0092 <END_OF_CODE+19ed0092/????>
Trace; c0008d32 <release_task+3a/b0>
Trace; c0009920 <sys_wait4+296/30c>
Trace; c005bc26 <system_call+50/58>

Code;  c000e538 <do_notify_parent+22/ae>
00000000 <_EIP>:
Code;  c000e538 <do_notify_parent+22/ae>
   0:   69 9a 14 e1 e9 9b 5f      imul   $0xf10d5f,0x9be9e114(%edx),%ebx
Code;  c000e53f <do_notify_parent+29/ae>
   7:   0d f1 00
Code;  c000e542 <do_notify_parent+2c/ae>   <=====
   a:   69 9a 1c e1 e9 9b 5f      imul   $0xf50d5f,0x9be9e11c(%edx),%ebx   <=====
Code;  c000e549 <do_notify_parent+33/ae>
  11:   0d f5 00
Code;  c000e54c <do_notify_parent+36/ae>
  14:   69 9a 20 e1 00 00 00      imul   $0x0,0xe120(%edx),%ebx
Code;  c000e553 <do_notify_parent+3d/ae>
  1b:   00 00 00
--- ksymoops output end ---
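
For reference, the symbol+offset/size annotations in the output above (e.g. <exit_notify+1b2/236>) come from looking each raw address up in the kernel's symbol table, which is why the System.map has to match the kernel that crashed. Below is a minimal Python sketch of that lookup. The start addresses are back-computed from the trace (address minus offset), not read from a real System.map; `hypothetical_next` is a made-up placeholder for whatever symbol follows do_exit, and `resolve` and `trace_addresses` are illustrative names, not part of ksymoops.

```python
import bisect
import re

# Start addresses derived from the decoded trace (address - offset).
# "hypothetical_next" is a placeholder that only marks where do_exit ends.
SYMBOLS = sorted([
    (0xC0009196, "exit_notify"),
    (0xC00093CC, "do_exit"),
    (0xC0009660, "hypothetical_next"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr):
    """Return 'name+offset/size' for addr, or None if it cannot be bounded."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0 or i + 1 >= len(SYMBOLS):
        return None  # below the first symbol, or no following symbol for a size
    start, name = SYMBOLS[i]
    size = STARTS[i + 1] - start  # size = distance to the next symbol
    return f"{name}+{addr - start:x}/{size:x}"


def trace_addresses(line):
    """Extract the [<xxxxxxxx>] addresses from a 'Call Trace:' line."""
    return [int(h, 16) for h in re.findall(r"\[<([0-9a-f]{8})>\]", line)]


for addr in trace_addresses("[<c0009348>] [<c000963e>]"):
    print(f"Trace; {addr:08x} <{resolve(addr)}>")
```

With a symbol table from a different kernel build the same addresses would land in the wrong functions, so the decoded trace would look plausible but be misleading.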



On 4/3/06, Karl-Johan Perntz <Karl-Johan.Perntz@xxxxxxx.com> wrote:
> Hi Peter,
>
> Could you please decode the oops you got? Instructions on how to do that can be found here: http://developer.axis.com/wiki/doku.php?id=oops
>
> Note that it is important that you use the vmlinux and System.map files that are associated with the kernel image that crashed.
>
> Regards,
> Karl-Johan Perntz
>
> -----Original Message-----
> From: owner-dev-etrax@xxxxxxx.com [mailto:owner-dev-etrax@xxxxxxx.com] On Behalf Of wretch
> Sent: den 29 mars 2006 09:18
> To: Dave Whittaker; dev-etrax
> Subject: Re: MCM Crashes
>
> Hi Dave
>
> I finally captured the serial port output (see below).
> I've repeated the test a couple of times during the past few days, all with the same result.
> Even running a script that only executes 'ps' every second (and thus does not access the flash)
> crashes the board.
>
> Hope you can point me in the right direction.
>
> Regards,
> Peter
>
>
> ------ Serial port output begin -----
> Unable to handle kernel access at virtual address 00d4c000
> Oops: 0002
> IRP: c0009a24 SRP: c0008d32 DCCR: 00000480 USP: 9ffffae0 MOF: 00000000
> r0: c1c18000 r1: c1c58000 r2: 00000000 r3: 00000000
> r4: 00005ad9 r5: c1c59f84 r6: 9ffffb10 r7: 00000000
> r8: 00000000 r9: 00d4d400 r10: c1c18000 r11: c00d9c80
> r12: c1c1809d r13: 00000000 oR10: c1c18000
> R_MMU_CAUSE: 00d4d139
> Process myscript (pid: 253, stackpage=c1c58000)
>
> Stack from 9ffffae0:
>  000d1d68 00000001 00089202 00000000 00000000 000c4150 00000000 000ad328
>  9fffffbc 000000fd 000d1d68 000d1d7c 00000000 9ffffb10 000890ee 9fffffbc
>  00000001 000d1d68 00000000 0008553a 00000000 00000000 000c4150 00000000
> Call Trace:
> Stack from c1c59de4:
>  c00085a6 c1c59f2c c005bf0a c005c068 c00d9c80 00000000 c1c59f2c c1c58000
>  c00cc340 00000002 c1c59ee8 c005c12e 00d4c000 c1c59ee8 c00085a6 c005f136
>  00000000 00000000 9ffffb10 c1c59f84 00005ad9 00000000 c1c90000 c1c59ee8
> Call Trace: [<c00085a6>] [<c005bf0a>] [<c005c068>] [<c005c12e>] [<c00085a6>] [<c005f136>] [<c0012066>]
>  [<c0012614>] [<c005ef10>] [<c005bd22>] [<c0008d32>] [<c0009a24>] [<dbed0092>] [<c0008d32>] [<c0009920>]
>
>  [<c005bc26>]
> Code: 69 96 0c 30 0f 05 5f 9d a1 00 ed db (10) e0 6a 96 5f ad 95 00 69 9a 5f 9d
> Oops: bitten by watchdog
> IRP: c000e544 SRP: c0009348 DCCR: 00000400 USP: 9ffffae0 MOF: 00000000
> r0: c1c18000 r1: 000000fd r2: c1c58099 r3: c1c58000
> r4: c00f4000 r5: c000e41e r6: c1c5802c r7: c1c58064
> r8: c1c58010 r9: 0000000e r10: 00000011 r11: 00000011
> r12: c1c18095 r13: 00000000 oR10: 00000011
> R_MMU_CAUSE: 35583010
> Process myscript (pid: 253, stackpage=c1c58000)
>
> Stack from 9ffffae0:
>  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> Call Trace:
> Stack from c1c59ca8:
>  c00085a6 c1c59d24 c005bf0a c005c068 00000011 00000000 c00f4000 c1c58000
>  c1c58099 000000fd c1c59ce0 c005c0e8 c1c18000 c005bdac 00000000 00000011
>  00000000 c1c18095 00000011 00000011 0000000e c1c58010 c1c58064 c1c5802c
> Call Trace: [<c00085a6>] [<c005bf0a>] [<c005c068>] [<c005c0e8>] [<c005bdac>] [<c000e41e>] [<c0009348>]
>  [<c000e544>] [<c000e5bc>] [<c0020cce>] [<c0009348>] [<c000963e>] [<c005c13c>] [<c00085a6>] [<c005f136>]
>
>  [<c0012066>] [<c0012614>] [<c005ef10>] [<c005bd22>] [<c0008d32>] [<c0009a24>] [<dbed0092>] [<c0008d32>]
>
>  [<c0009920>] [<c005bc26>]
> Code: 69 9a 14 e1 e9 9b 5f 0d f1 00 69 9a (1c) e1 e9 9b 5f 0d f5 00 69 9a 20 e1
>
> ------ Serial port output end -----
>
> On 3/24/06, wretch <the.wretch@xxxxxxx.com> wrote:
> The board requires a reset to get out of this state. No reflash required.
> Capturing the serial port output is a bit difficult at the moment.
> I will try to post the serial output on Monday.
>
>
> Peter
>
> On 3/24/06, Dave Whittaker <dwhittaker@xxxxxxx.com> wrote:
> Could you elaborate on what happens when the board dies? Can it be rebooted or does it require a reflash? Any output from the serial port?
>
> Dave
>
>
> From: owner-dev-etrax@xxxxxxx.com [mailto:owner-dev-etrax@xxxxxxx.com] On Behalf Of wretch
> Sent: Friday, March 24, 2006 3:54 AM
> To: dev-etrax@xxxxxxx.com
> Subject: MCM Crashes
> Hi group,
>
> We have a number of custom boards (based upon the dev server 83+ design) and we are having a problem with one particular board.
> The problem is that the MCM (4+16) on this board crashes after anywhere from a couple of minutes to a couple of hours of uptime.
>
> At first I thought it was the custom SW that caused this, so I ran a test without the SW and it ran (just idle) for more than a day.
> That would suggest the SW is to blame (I thought so too at first), but this morning I ran a test with a simple shell script (while true; do find /; done)
> and after a couple of minutes the MCM crashed, so that rules out the custom SW.
>
> We also experimented with different environments to test whether the problem was heat related, but saw no change.
>
> We have replaced one or two MCMs in the past because they refused to program.
> Those boards were X-rayed and no shorts were found :(
> However, this board shows a different behaviour.
>
> Any ideas, or has anyone seen similar problems?
>
> Regards,
> Peter
>
>
>
>