[Clas_offline] CLASDB To be rebooted tomorrow morning (5/7/2015) at 6:00 am
Marty Wise
wise at jlab.org
Wed May 6 14:29:23 EDT 2015
All,
CLASDB is currently configured to allow up to 4000 simultaneous connections.
Recently, 7000 farm jobs attempted to connect and many failed due to this
limit.
To allow CLASDB to handle so many connections, it needs more RAM and CPU.
The system will be shut down at 6:00 am on 5/7/2015 and the RAM and CPU
increased. The MySQL connection limit will also be increased to 7200 as the
system is restarted. In all, the system should be offline for about 15
minutes so should be back in service by 6:15 am.
If this plan will cause big problems for anyone, please let me know and I
will try to reschedule for a better time.
Thanks,
Marty Wise, JLab Computer Center
Newport News, Virginia
E Mail: wise at jlab.org
Phone: 757-269-7214
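
Editorial sketch: the announcement above ties the higher connection limit to more RAM, and the thread below notes that the current limit came from a memory calculation. A common rule of thumb for that kind of estimate is worst-case memory = global buffers + (connection limit x per-connection buffers). The buffer sizes below are hypothetical placeholders, not CLASDB's actual settings; only the limits of 4000 and 7200 come from the message.

```python
MIB = 1024 * 1024


def worst_case_memory_bytes(global_buffers, per_connection_buffers, max_connections):
    """Rule-of-thumb upper bound: every allowed connection uses its full buffers."""
    return global_buffers + max_connections * per_connection_buffers


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from clasdb).
    global_buffers = 2048 * MIB      # shared caches and buffer pools
    per_connection = 3 * MIB         # per-thread sort/read/join buffers and stack

    for limit in (4000, 7200):       # current and planned connection limits
        estimate = worst_case_memory_bytes(global_buffers, per_connection, limit)
        print(f"max_connections={limit}: ~{estimate / (1024 * MIB):.1f} GiB worst case")
```

Real usage is usually well below this bound, since few connections allocate all of their buffers at once, which matches the observation in the thread that actual usage was lower than predicted.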
-----Original Message-----
From: Harut Avakian [mailto:avakian at jlab.org]
Sent: Tuesday, May 5, 2015 2:40 PM
To: Marty Wise
Subject: Re: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
Hi Marty,
There are some cooks going on. I suggest you just send a note a day beforehand
to clas_offline at jlab.org with the exact time you want to make the changes.
Harut
On 4/30/15 10:40 AM, Harut Avakian wrote:
> On 4/30/15 10:33 AM, Marty Wise wrote:
>>
>> Harut,
>>
>> I am sorry, but I did some more checking on the current configuration
>> and several things indicate that the current configuration is correct
>> and that I probably should not increase the connection limit without
>> first adding more RAM.
>>
>> I can add more RAM and double the CPU available to clasdb, but I will
>> need to reboot the system to do so. It will only take a few minutes.
>> Just let me know when will be a good time. I can wait until there is
>> relatively little farm activity, but there are wikis and logbooks,
>> etc. that also depend on clasdb, so it would probably be best to wait
>> for a maintenance day or do it off-hours. I would be happy to do it
>> early some morning if that is convenient – maybe 5:00 am.
>>
>> Let me know if you would like me to make the change and when would be
>> a good time.
>>
> Hi Marty,
>
> I'll check with users and get back to you.
> We should be able to find some better time than 5am during the weekend.
> Harut
>>
>> JLab IT/CNI
>>
>> *From:*Marty Wise [mailto:wise at jlab.org]
>> *Sent:* Thursday, April 30, 2015 10:00 AM
>> *To:* Harut Avakian
>> *Subject:* RE: Re: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
>>
>> Harut,
>>
>> I think the problem is that the farm has grown and I need to increase
>> the connection limit on clasdb to match it. I just talked with Sandy
>> and it sounds like the farm will stay at its current size (about 7000
>> jobs) for a while. I’m checking what is needed to increase the limit.
>> The current limit is based on a calculation for the amount of memory
>> required. But, the actual usage on the system appears to be much less
>> than predicted so I may be able to safely increase the limit to 8000
>> or so connections. That should prevent problems like those seen
>> yesterday.
>>
>> If I don’t need to add more memory as I am speculating, I can make
>> this change without disrupting the running system. I will let you
>> know something more shortly.
>>
>> Sorry for the problems,
>>
>> Marty Wise (wise at jlab.org) JLab IT/CNI
>>
>> *From:*Harut Avakian [mailto:avakian at jlab.org]
>> *Sent:* Wednesday, April 29, 2015 5:50 PM
>> *To:* Marty Wise
>> *Subject:* Fwd: Re: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
>>
>> Hi Marty,
>> Is it possible that connections to the db stay open after the
>> job completes?
>> Harut
>>
>>
>>
>> -------- Forwarded Message --------
>>
>> *Subject:* Re: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
>> *Date:* Wed, 29 Apr 2015 17:29:43 -0400 (EDT)
>> *From:* Sandy Philpott <philpott at jlab.org>
>> *To:* Nathan Harrison <nathanh at jlab.org>
>> *CC:* Harut Avakian <avakian at jlab.org>
>>
>> Hi Nathan,
>>
>> Can you check that your farm jobs are closing the connections to clasdb
>> after they are finished being used? Your bulk start of farm jobs earlier
>> this afternoon grabbed all of the available 4000 connections and held
>> them.
>>
>> Thanks,
>> Sandy
>>
>> ----- Original Message -----
>> From: "Marty Wise" <wise at jlab.org>
>> To: monalert at jlab.org
>> Sent: Wednesday, April 29, 2015 3:52:19 PM
>> Subject: RE: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
>>
>> CLASDB is currently configured to allow up to 4000 simultaneous
>> connections and it has hit that limit. I assume jobs beyond the limit
>> will fail. But, it will be interesting to see if the remainder
>> complete as desired.
>>
>> Marty Wise (wise at jlab.org) JLab IT/CNI
>>
>> -----Original Message-----
>> From: Nagios [mailto:nagios at jlab.org]
>> Sent: Wednesday, April 29, 2015 3:45 PM
>> To: monalert at jlab.org
>> Subject: ** PROBLEM Service Alert: clasdb/MySQL is CRITICAL **
>>
>> ***** Nagios *****
>>
>> Notification Type: PROBLEM
>>
>> Service: MySQL
>> Host: clasdb
>> Address: clasdb
>> State: CRITICAL
>>
>> Date/Time: Wed Apr 29 15:45:06 EDT 2015
>>
>> Additional Info:
>>
>> Error: DBI connect(mysql:clasdb:3306,mon,...) failed: Too many
>> connections MySQL CRITICAL
>>
>
>
> --
> ___________________________________________________________________________
> Harut Avagyan e-mail - avakian at jlab.org
> 12000 Jefferson Ave. Suite 5 tel/fax - 1(757)269-7764/5800
> Newport News, VA 23606 http://www.jlab.org/~avakian
>
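
Editorial sketch: the thread asks whether farm jobs close their connections to clasdb once they are done with them. One way to make that hard to forget is to scope the connection to a `with` block, so it is released even if the job raises an error. The example below uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the function name `read_constant` and the query are hypothetical, and a real job would use its MySQL client library's connect call in place of the factory.

```python
import sqlite3
from contextlib import closing


def read_constant(connect=lambda: sqlite3.connect(":memory:")):
    """Open a connection, run one query, and always close the connection.

    `connect` is a hypothetical factory; sqlite3 stands in for the real
    database client here.
    """
    # closing() calls conn.close() on exit from the block, even on error,
    # so the job does not keep holding one of the server's connection slots.
    with closing(connect()) as conn:
        row = conn.execute("SELECT 42").fetchone()
    return row[0]


if __name__ == "__main__":
    print(read_constant())
```

Opening the connection as late as possible and closing it as soon as the data has been read keeps the number of simultaneous connections closer to the number of jobs actively querying, rather than the number of jobs running.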