[Kittyhawk] Another trace

Eric Van Hensbergen ericvh at gmail.com
Sun Sep 18 22:10:56 EDT 2011


I made a slightly hacky patch for that specific problem -- but all it did was reveal another scenario (the bad page fault one).  I'll post more details tomorrow morning, once I've wrestled with it enough to understand it.
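
For reference, the lock-order theory from the quoted trace can be sketched as a wait-for graph. This is only a minimal sketch: the "waits" entries come from the IAR dump, but the "holder" entries are assumptions made for illustration (the trace does not show who actually holds bgcon->lock or tree->lock), and find_cycle is a hypothetical helper, not anything from the kernel tree.

```python
# Which lock each core is spinning on, as reported by the IAR dump.
waits_for = {
    "core0": "bgcon->lock",  # spin lock in con_flush
    "core2": "tree->lock",   # spin_lock_irqsave in bgtree_enable_inject_wm_interrupt
    "core3": "tree->lock",   # spin lock in bgtree_xmit
}

# ASSUMPTION: who holds each lock. The trace does not confirm this.
assumed_holder = {
    "bgcon->lock": "core3",  # assumed taken on the printk/console path
    "tree->lock": "core0",   # assumed taken on the tree interrupt path
}


def find_cycle(waits, holder):
    """Return a list of cores forming a wait-for cycle, or None."""
    for start in waits:
        path = [start]
        seen = {start}
        cur = start
        while True:
            lock = waits.get(cur)
            if lock is None:
                break  # cur is not waiting on anything
            nxt = holder.get(lock)
            if nxt is None:
                break  # nobody is known to hold that lock
            if nxt == start:
                return path  # walked back to the starting core: deadlock
            if nxt in seen:
                break  # loop that does not include the starting core
            path.append(nxt)
            seen.add(nxt)
            cur = nxt
    return None


if __name__ == "__main__":
    print(find_cycle(waits_for, assumed_holder))
```

Under those assumed holders, core 0 and core 3 each wait on a lock the other holds, which is the classic ABBA pattern; with different holders the cycle may not exist, so this needs checking against the real code.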

On Sep 18, 2011, at 8:46 PM, Jonathan Appavoo <jappavoo at bu.edu> wrote:

> This seems to corroborate the scenario that Dan and I were discussing but
> had not had time to verify yet.  If this trace and our theory check out, we
> should be able to verify with a simpler test case and also set up a quick fix
> along with a longer-term fix.  We will try to work on it tomorrow.
> 
> Jonathan.
> 
> On Sep 18, 2011, at 10:37 AM, Eric Van Hensbergen wrote:
> 
>> looks like maybe a deadlock if we print during a fault handler when
>> someone else is holding onto tree->lock (?)
>> Of course, the fourth core is timing out talking to one of the other
>> cores because they aren't responding to an IPI, so maybe some sort of
>> deadlock between bgcon->lock {5}.0 and tree->lock {5}.2
>> 
>>      -eric
>> 
>> 
>> haredebug[ANL-R00-M0-N14-64]>{5} dumpier
>> OK
>> {5}.0 IAR:  0xc029c18c    LR:  0xc01de72c      (spin lock in
>> con_flush)  (bgcon->lock)
>> {5}.1 IAR:  0xc0007b74    LR:  0xc0007b88      (cpu idle)
>> {5}.2 IAR:  0xc029c4c8    LR:  0xc01d8094      (spin lock_irqsave in
>> bgtree_enable_inject_wm_interrupt) (tree->lock)
>> {5}.3 IAR:  0xc029c4c8    LR:  0xc01d7a64      (spin lock in
>> bgtree_xmit) (tree->lock)
>> 
>> haredebug[ANL-R00-M0-N14-64]>{5} dump_stacks
>> OK
>> {5}.0 Dump stack
>> Core 0 Stack Dump
>> 0xeff89c90  0xc01de7d0     con_flush
>> 0xeff89cc0  0xc01d83f4     bgtree_inject_interrupt
>> 0xeff89ce0  0xc00576e4    handle_IRQ_event
>> 0xeff89d00  0xc0059754    handle_fasteoi_irq
>> 0xeff89d20  0xc0004418    do_IRQ
>> 0xeff89de0  0xc000fc04     ret_from_except
>> 0xeff89e10  0xc01deae0    bg_tty_write
>> 0xeff89e20  0xc019101c    tty_default_put_char
>> 0xeff89e30  0xc0196778    opost
>> 0xeff89f30  0xc0197908     n_tty_receive_buf
>> 0xeff89f60  0xc0191d60     flush_to_ldisc
>> 0xeff89f90  0xc0044bf0      run_workqueue
>> 0xeff89fd0  0xc0045314     worker_thread
>> 0xeff89ff0  0xc004a0d4      kthread
>> 0x00000000  0xc0010658
>> {5}.1 Dump stack
>> Core 1 Stack Dump
>> 0xeffc5fe0  0xc0007b88     cpu_idle
>> 0xeffc5ff0  0xc00109bc
>> 0x00000000  0xc0000258
>> {5}.2 Dump stack
>> Core 2 Stack Dump
>> 0xc049de10  0xc01d7a88   bgtree_xmit
>> 0xc049de20  0xc01de9ac   enqueue_retransmit calling
>> bgtree_enable_inj_wm_interrupt
>> 0xc049de50  0xc01deae0   bg_tty_write (calling enqueue_retransmit)
>> 0xc049deb0  0xc0198808  write_chan
>> 0xc049def0  0xc0193808   tty_write
>> 0xc049df10  0xc008c7e0   vfs_write
>> 0xc049df40  0xc008c9b8   file_pos_write
>> 0x7f8e6960  0xc000f5c4    ret_from_syscall
>> Invalid value in Link register: Oldframe=0xc049df40  Newframe=0x7f8e6960
>> {5}.3 Dump stack
>> Core 3 Stack Dump
>> 0xdfeebb80  0xc140692c
>> 0xdfeebbd0  0xc01dde44     do_write
>> 0xdfeebbf0  0xc01ddec0      bg_console_write
>> 0xdfeebc10  0xc002d9ec     call_console_drivers
>> 0xdfeebc50  0xc002df08     release_console_sem
>> 0xdfeebce0  0xc002e7ec     vprintk
>> 0xdfeebd20  0xc002e900     printk
>> 0xdfeebd70  0xc0010cf0      smp_call_function_map (timing out response
>> from other cpu)
>> 0xdfeebda0  0xc0010e08     smp_flush_tlb_page
>> 0xdfeebde0  0xc006d4c0     do_wp_page
>> 0xdfeebe70  0xc006ffa8       __handle_mm_fault
>> 0xdfeebf40  0xc00132a4      do_page_fault
>> 0x7f8e6940  0xc000fa00      handle_page_fault
>> Invalid value in Link register: Oldframe=0xdfeebf40  Newframe=0x7f8e6940
>> _______________________________________________
>> Kittyhawk mailing list
>> Kittyhawk at cs.bu.edu
>> http://cs-mailman.bu.edu/mailman/listinfo/kittyhawk
> 


