[Lerftest-ctrls] RF CPU reboot & iocConsole problems

Wesley Moore wmoore at jlab.org
Fri Sep 21 12:41:13 EDT 2018


We had a minor ssh issue.  It was just troublesome to sort out, since I couldn't ssh in to look around.


Sounds like something is still quite wrong with lcls-llrfcpu02.  What do we need to do to help troubleshoot it?


Wesley

________________________________
From: Sonya Hoobler <sonya at slac.stanford.edu>
Sent: Friday, September 21, 2018 11:54:31 AM
To: Wesley Moore
Cc: lerftest-ctrls at jlab.org; Curt Hovater; se_c at cox.net
Subject: Re: [Lerftest-ctrls] RF CPU reboot & iocConsole problems

Hi again,

I see that lcls-llrfcpu02 still does not boot--tftp timeout.

I'll go back to hands-off.

Sonya



On Fri, 21 Sep 2018, Sonya Hoobler wrote:

> Hi Wesley,
>
> I just logged in to look around and things seem improved.
>
> Was something done to address the problems?
>
> Thanks,
>  Sonya
>
>
> On Thu, 20 Sep 2018, Sonya Hoobler wrote:
>
>> Hi Wesley,
>>
>> Thank you for the update and for following up.
>>
>> I won't do anything until hearing back from you.
>>
>> Sonya
>>
>>
>>
>> On Thu, 20 Sep 2018, Wesley Moore wrote:
>>
>>> Sonya,
>>>
>>> Larry rebooted lcls-llrfcpu02.  Said it powered back up, but isn't showing
>>> any connectivity.  Looks the same from my end.
>>>
>>> lclsfs - can't ssh, but pingable
>>> lclsapp1 - seems fine
>>> lclsapp2 - can't ssh, but pingable
>>> lcls-llrfcpu01 - seems fine
>>> lcls-llrfcpu02 - can't ssh, can't ping
>>> Control room hosts (lclsl01-03): can't ssh, but pingable
>>>
>>> Let me follow up with the guy that set up the fileserver and see if we can
>>> get that checked out first.  We may need to reboot stuff after that's
>>> sorted out.
>>>
>>> Wesley
>>>
>>> On 9/20/18, 9:39 AM, "Lerftest-ctrls on behalf of Wesley Moore"
>>> <lerftest-ctrls-bounces at jlab.org on behalf of wmoore at jlab.org> wrote:
>>>
>>>    Looks like at least lcls-llrfcpu02 needs to be rebooted.  Others seem
>>>    likely as well.  The control room hosts aren't connecting either.  Have
>>>    you heard anything from Larry?
>>>
>>>    Wesley
>>>
>>>    On 9/19/18, 7:20 PM, "Lerftest-ctrls on behalf of Sonya Hoobler"
>>> <lerftest-ctrls-bounces at jlab.org on behalf of sonya at slac.stanford.edu>
>>> wrote:
>>>
>>>        Hi Wesley, all,
>>>
>>>        I just tried a reboot of RF CPU lcls-llrfcpu02 and it never
>>>        successfully booted back up.
>>>
>>>        I can't view the boot-up sequence because iocConsole is also no
>>>        longer working for either CPU:
>>>
>>>        [softegr@lclsapp1 iocCommon]$ iocConsole lcls-llrfcpu01
>>>          : ssh -x -t -l laci lclsapp2.acc.jlab.org bash -l -c " pyiocscreen.py -t HIOC lcls-llrfcpu01 lclsts1 2001 "
>>>        Read from socket failed: Connection reset by peer
>>>        [softegr@lclsapp1 iocCommon]$ iocConsole lcls-llrfcpu02
>>>          : ssh -x -t -l laci lclsapp2.acc.jlab.org bash -l -c " pyiocscreen.py -t HIOC lcls-llrfcpu02 lclsts1 2002 "
>>>        Read from socket failed: Connection reset by peer
>>>
>>>        I tried a remote reboot of the terminal server.
>>>
>>>        I also tried ipmitool (and EPICS ipmi) to remotely restart the CPU.
>>>
>>>        But still no signs of life.
>>>
>>>        Perhaps you could take a look at the network and/or locally? We may
>>>        need a local power-cycle of the CPU and the terminal server. I'm
>>>        cc'ing Larry Farrish, who may also be able to help with that.
>>>
>>>        This is not super urgent. When you have a chance during your normal
>>>        working hours, I'd appreciate any help.
>>>
>>>        Thanks,
>>>           Sonya
>>>
>>>        _______________________________________________
>>>        Mailing List: Lerftest-ctrls at jlab.org
>>>        https://mailman.jlab.org/mailman/listinfo/lerftest-ctrls
>>>        Wiki: https://wiki.jlab.org/lerf/index.php/Network
>>>
>>>
>>>
>>>
>