[Lerftest-ctrls] RF CPU reboot & iocConsole problems

Wesley Moore wmoore at jlab.org
Thu Sep 20 10:34:25 EDT 2018


Sonya,

Larry rebooted lcls-llrfcpu02.  He said it powered back up, but it isn't showing any connectivity.  It looks the same from my end.

lclsfs - can't ssh, but pingable
lclsapp1 - seems fine
lclsapp2 - can't ssh, but pingable
lcls-llrfcpu01 - seems fine
lcls-llrfcpu02 - can't ssh, can't ping
Control room hosts (lclsl01-03): can't ssh, but pingable
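
The ping/ssh sweep above could be scripted so it is easy to repeat after each reboot. This is a minimal sketch and not something taken from the thread: the function names are made up for illustration, the `ping` flags assume a Linux-style `ping`, and the ssh check only tests whether TCP port 22 accepts a connection. A host can pass that check and still drop the session afterwards (as in "Connection reset by peer"), so treat an open port as a weaker sign than a working login.

```python
import socket
import subprocess


def can_ping(host, timeout_s=2):
    """Return True if a single ICMP echo gets a reply.

    Assumes a Linux-style `ping` (-c count, -W timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping binary missing, or the command itself hung
        return False
    return result.returncode == 0


def ssh_port_open(host, port=22, timeout_s=3):
    """Return True if a TCP connection to the ssh port succeeds.

    This does not attempt a login; it only checks that something accepts
    the connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # covers DNS failures, refusals, and timeouts
        return False


def summarize(pingable, ssh_open):
    """Map the two checks onto the wording used in the list above."""
    if ssh_open:
        return "ssh port open"
    if pingable:
        return "can't ssh, but pingable"
    return "can't ssh, can't ping"


def sweep(hosts):
    """Run both checks for each host and return {host: summary}."""
    return {h: summarize(can_ping(h), ssh_port_open(h)) for h in hosts}
```

Calling `sweep(["lclsfs", "lclsapp1", "lclsapp2", "lcls-llrfcpu01", "lcls-llrfcpu02"])` from a machine on the same network would return one summary string per host; the short host names are the ones listed above and may need a domain suffix depending on where the script runs.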

Let me follow up with the guy who set up the fileserver and see if we can get that checked out first.  We may need to reboot stuff after that's sorted out.

Wesley

On 9/20/18, 9:39 AM, "Lerftest-ctrls on behalf of Wesley Moore" <lerftest-ctrls-bounces at jlab.org on behalf of wmoore at jlab.org> wrote:

    Looks like at least lcls-llrfcpu02 needs to be rebooted.  Other hosts likely need a reboot as well.  The control room hosts aren't connecting either.  Have you heard anything from Larry?
    
    Wesley
    
    On 9/19/18, 7:20 PM, "Lerftest-ctrls on behalf of Sonya Hoobler" <lerftest-ctrls-bounces at jlab.org on behalf of sonya at slac.stanford.edu> wrote:
    
        Hi Wesley, all,
        
        I just tried to reboot RF CPU lcls-llrfcpu02, and it never came 
        back up.
        
        I can't view the boot-up sequence because iocConsole is also no longer 
        working for either CPU:
        
        [softegr@lclsapp1 iocCommon]$ iocConsole lcls-llrfcpu01
          : ssh -x -t -l laci lclsapp2.acc.jlab.org bash -l -c " pyiocscreen.py -t HIOC lcls-llrfcpu01 lclsts1 2001 "
        Read from socket failed: Connection reset by peer
        [softegr@lclsapp1 iocCommon]$ iocConsole lcls-llrfcpu02
          : ssh -x -t -l laci lclsapp2.acc.jlab.org bash -l -c " pyiocscreen.py -t HIOC lcls-llrfcpu02 lclsts1 2002 "
        Read from socket failed: Connection reset by peer
        
        I tried a remote reboot of the terminal server.
        
        I also tried ipmitool (and EPICS ipmi) to remotely restart the CPU.
        
        But still no signs of life.
        
        Perhaps you could take a look at the network and/or locally? We may need a 
        local power-cycle of the CPU and the terminal server. I'm cc'ing Larry 
        Farrish, who may also be able to help with that.
        
        This is not super urgent. When you have a chance during your normal 
        working hours, I'd appreciate any help.
        
        Thanks,
           Sonya
        
        _______________________________________________
        Mailing List: Lerftest-ctrls at jlab.org
        https://mailman.jlab.org/mailman/listinfo/lerftest-ctrls
        Wiki: https://wiki.jlab.org/lerf/index.php/Network
        
    
    



