[ace] Incident INC0109183 - This solution caused odd issues
Anthony Cuffe
cuffe at jlab.org
Fri Sep 22 16:34:50 EDT 2023
Killing this process and restarting it corrupted the local cache and the “secrets” file that sssd creates. As a result, no one could log in and all Kerberos commands failed.
To fix this, I had to run the following:
rm -f /var/lib/sss/db/*cache*
yum reinstall sssd-common
systemctl restart sssd
reboot
While it seems like a logical process, I only arrived at these commands after a bunch of frogging around and trying various things. I left those steps out and simply give the commands here so this can be fixed in the future.
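The four commands above could be wrapped in a small script so they are run in the same order next time. This is only a sketch: the `run` and `recover_sssd` helpers and the `DRY_RUN` variable are made up for illustration and are not part of sssd or of the original fix. By default the script only prints what it would do.

```sh
#!/bin/sh
# Sketch of the recovery sequence from this message.
# DRY_RUN is a hypothetical switch: 1 (default) only prints the commands,
# 0 actually runs them (must be root; the last step reboots the machine).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run the four steps in order and stop at the first one that fails.
recover_sssd() {
    # The glob is passed through sh -c so it expands only when really executed.
    run sh -c 'rm -f /var/lib/sss/db/*cache*' &&
    # yum asks for confirmation here, as it did when run by hand.
    run yum reinstall sssd-common &&
    run systemctl restart sssd &&
    run reboot
}

recover_sssd
```

Saved under any name and run as-is, it prints the four steps; running it as root with `DRY_RUN=0` set in the environment executes them for real.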
Anthony
From: IT Service Desk <jlab at servicenowservices.com>
Sent: Thursday, September 21, 2023 4:07 PM
To: Erik Werlau <werlau at jlab.org>; Brian Bevins <bevins at jlab.org>; Theo McGuckin <tsm at jlab.org>; Brad Cumbia <cumbia at jlab.org>; Anthony Cuffe <cuffe at jlab.org>; Ryan Slominski <ryans at jlab.org>; Jessica Zamzow <zamzow at jlab.org>; accadm at jlab.org; Adam Carpenter <adamc at jlab.org>; Theo Larrieu <theo at jlab.org>
Subject: Incident INC0109183 has been assigned - Some process called Kerberos Cache Manager, sssd_kcm, is hogging a lot of CPU on my machine devl02.
Subject: Some process called Kerberos Cache Manager, sssd_kcm, is hogging a lot of CPU on my machine devl02.
Incident: INC0109183
An incident has been assigned to Theo McGuckin.
Additional Details:
Requestor: Gary Croke (gcroke at jlab.org) ext: 5097
Updated By: tsm
Opened: 09-21-2023 03:21:03 PM
Category: Accelerator Computing Environment (ACE)
Subcategory: Linux Support
Assigned To: Theo McGuckin
Comments:
________________________________
09-21-2023 04:06:44 PM EDT - Theo McGuckin Additional comments
Stopped and restarted sssd-kcm.socket process as root.
Let me know if the problem returns
________________________________
09-21-2023 03:21:03 PM EDT - Gary Croke Additional comments
Some process called Kerberos Cache Manager, sssd_kcm, is hogging a lot of CPU on my machine devl02. Can't seem to kill it, restarted, still there.
You can view all the details of the incident by following the link below:
Take me to the Incident<https://urldefense.proofpoint.com/v2/url?u=https-3A__jlab.servicenowservices.com_nav-5Fto.do-3Furi-3Dincident.do-3Fsys-5Fid-3De6c3126f1b15b510a552ed3ce54bcbb0&d=DwMFaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=39aRN8DBBjQIGUY49byo_w&m=JxWDGsrYLf_LiPPJTOYc2QrrXbWskoInxZYhuKYdbDNk3l4qtAHp650MqAR_9PzF&s=DVX1ttz52oNP81AUGs1MnqdVhoNX3eQZ-DbPYEfc4AQ&e=>
Original Request:
Some process called Kerberos Cache Manager, sssd_kcm, is hogging a lot of CPU on my machine devl02. Can't seem to kill it, restarted, still there.
Ref:MSG0936696_R3ScwvzPR9rKrjXQujNo