[Lowq] Cooking Issue

Lamiaa El Fassi elfassi at jlab.org
Sat Mar 20 14:42:04 EDT 2010


Hi,

Here are some cooking statistics:

Completed jobs: 688
Good jobs:      249
Crashed jobs:   439
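
Since Auger reports these jobs as successful, a tally like the one above has to come from the log files themselves. A minimal sketch of one way to produce it, assuming the job logs sit in a single directory and match `*.log` (both the directory layout and the file pattern are hypothetical; only the "Segmentation fault" text comes from the logs described in this thread):

```python
from pathlib import Path

# Marker text reported in the crashed jobs' log files.
CRASH_MARKER = "Segmentation fault"


def classify_logs(log_dir, pattern="*.log"):
    """Split the log files in log_dir into (good, crashed) lists of names.

    The directory and the "*.log" pattern are assumptions for illustration;
    adjust them to wherever the farm jobs actually write their logs.
    """
    good, crashed = [], []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has stray bytes.
        text = path.read_text(errors="replace")
        if CRASH_MARKER in text:
            crashed.append(path.name)
        else:
            good.append(path.name)
    return good, crashed
```

Calling `good, crashed = classify_logs("/path/to/logs")` and printing `len(good)` and `len(crashed)` gives the good and crashed counts without relying on the exit code shown on the web page.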

Over the last hour, things have become worse. Below is an example of the
time stamp summary for file clas_41470.A40, taken from the Auger web
page. The log file of this job shows the same segmentation fault that
I mentioned in my last email.

 Time stamps

Submitted: Mar 18, 2010 10:30:17 PM
Cleared Dependencies: Mar 20, 2010 2:15:29 PM
Started Copying Input Files: Mar 20, 2010 2:22:08 PM
Started Executing: Mar 20, 2010 2:25:12 PM
Started Copying Output Files: Mar 20, 2010 2:25:19 PM
Completed: Mar 20, 2010 2:25:19 PM
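
The time spent in each stage can be read off these stamps. The sketch below is only an illustration: it parses the six stamps listed above and prints the gap between consecutive ones. The format string is an assumption about how the web page prints dates, and `%b` expects English month names, so it needs an English or C locale.

```python
from datetime import datetime

# Assumed format of the time stamps as copied from the web page.
FMT = "%b %d, %Y %I:%M:%S %p"

STAMPS = [
    ("Submitted", "Mar 18, 2010 10:30:17 PM"),
    ("Cleared Dependencies", "Mar 20, 2010 2:15:29 PM"),
    ("Started Copying Input Files", "Mar 20, 2010 2:22:08 PM"),
    ("Started Executing", "Mar 20, 2010 2:25:12 PM"),
    ("Started Copying Output Files", "Mar 20, 2010 2:25:19 PM"),
    ("Completed", "Mar 20, 2010 2:25:19 PM"),
]


def stage_durations(stamps):
    """Return (from_stage, to_stage, seconds) for consecutive stamps."""
    times = [(name, datetime.strptime(text, FMT)) for name, text in stamps]
    return [
        (a_name, b_name, (b_time - a_time).total_seconds())
        for (a_name, a_time), (b_name, b_time) in zip(times, times[1:])
    ]


durations = stage_durations(STAMPS)
for start, end, seconds in durations:
    print(f"{start} -> {end}: {seconds:.0f} s")
```

For this job the step from "Started Executing" to "Started Copying Output Files" comes out at 7 seconds, which would be consistent with a job that crashes almost immediately after it starts.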

Best regards,

Lamiaa


************************************************************
*  Lamiaa El Fassi             email: elfassi at jlab.org
*  Research Associate @ Rutgers University
*  Phone: (757) 269-7011 // Fax: (757) 269-5703
*  Jefferson Lab., 12000 Jefferson Ave.
*  Suite# 4, MS 12H3
*  Newport News, VA. 23606
************************************************************


On Sat, Mar 20, 2010 at 12:32 PM, Lamiaa El Fassi <elfassi at jlab.org> wrote:

> Hi,
>
> Upon request, I am reprocessing the elastic runs for the RTPC calibration.
> I have noticed that almost half of the jobs completed so far crashed during
> cooking. All of these crashed jobs show "status:success & exit code: 0"
> on the Auger web page, but when I check their output log files I find
> "Segmentation fault & size of raw data equal 0".
> Could this failure to get the raw data be caused by insufficient disk space
> on the farm machine, or by something else?
> In the submission script of each job I request 8 GB of disk space, which
> covers the size of the input and output files of each processed job.
> Is there any problem on the farm machines that may be causing this?
>
> Best regards,
>
> Lamiaa