[G12] farm local disks
Johann Goetz
jgoetz at ucla.edu
Tue Apr 12 11:34:06 EDT 2011
Hi g12ers,
OK, I have started the process to run gflux on a per-run basis. However, I
have a concern about runs with only a few files. In that case we are still
losing a large chunk of data -- it would be a 4% effect for a 10-file run,
of which we have several. It seems silly to remove these good events from
any analysis because of the flux calculation.
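
Just to put rough numbers on that 4%: with the 10 seconds gflux drops at each
end of a job, a run-level job loses a fixed ~20 seconds, and for a short run
that fixed cut is a sizeable fraction. A quick back-of-the-envelope sketch
(the ~50 s of beam per raw file is an assumed round number, not a measured
one):

    # back-of-the-envelope check of the fractional loss from the 10 s cuts
    cut_per_job = 2 * 10.0      # seconds removed: 10 s at the start + 10 s at the end
    seconds_per_file = 50.0     # assumed average live time per raw data file
    n_files = 10
    loss = cut_per_job / (n_files * seconds_per_file)
    print(f"fractional loss for a {n_files}-file run: {loss:.1%}")  # -> 4.0%
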
Also, more than half of our runs have one or two files that failed to cook
during pass1. So we'll have to run gflux only on the files that were
successful and keep a record of this somewhere, which will complicate any
analysis. That's not to say those files will never be cooked; it's just that
no one has taken the time to do it. If they do get cooked, we would have to
run gflux once again, since we treat runs as a whole. We could avoid this
problem if we had gflux for each raw data file.
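
To make the bookkeeping concrete, here is a rough sketch of the kind of
wrapper we'd need if we stay per-run: run gflux over only the files that
actually cooked, and write the file list down next to the flux output so a
later analysis (or a redo once the missing files get cooked) knows exactly
what the flux corresponds to. The paths, file naming, and the gflux
invocation below are placeholders, not our real layout:

    import glob, json, os, subprocess

    def gflux_for_run(run, cooked_dir, out_dir):
        """Run gflux over the cooked files of one run and record which files went in."""
        # placeholder naming scheme for the cooked files of this run
        cooked = sorted(glob.glob(os.path.join(cooked_dir, f"run{run}_A*.bos")))
        if not cooked:
            return
        # keep a record of the file list so the flux can be matched to
        # (and redone for) exactly these files later
        with open(os.path.join(out_dir, f"run{run}_flux_files.json"), "w") as f:
            json.dump(cooked, f, indent=2)
        # stand-in for the real gflux invocation
        subprocess.run(["gflux"] + cooked, check=True)
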
Is there no way we can modify gflux to be reliable on a per-file basis? Is
it a matter of the scaler banks? If so, can we get the first scaler bank in
the file and use it for the first part of the file (up to the first 10
seconds)?
Or, if we are running gflux on a single file, I could have the job cache the
previous data file, get its last scaler bank, and use that for the beginning
of the file. This would work out fine, I think, but then I am not sure how to
handle the A00 file of a run.
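
In pseudocode, the per-file scheme I have in mind would be something like the
following; the two scaler-reading helpers are made-up names just to make the
logic concrete, and A00 falls back to the first scaler bank inside the file
itself, as suggested above:

    def seed_scaler_bank(run, file_index, first_scaler_in, last_scaler_in):
        """Choose the scaler bank that seeds a per-file gflux job.
        first_scaler_in/last_scaler_in are hypothetical helpers standing in
        for whatever actually reads scaler banks out of a raw data file."""
        if file_index == 0:
            # A00: no earlier file to borrow from, so use the first scaler
            # bank found inside the file itself (covers roughly the first 10 s)
            return first_scaler_in(run, file_index)
        # otherwise cache the previous raw file and take its last scaler bank,
        # so the beginning of this file is not thrown away
        return last_scaler_in(run, file_index - 1)
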
I understand the 10 seconds that gflux cuts out at the beginning of a job,
but I don't know why it also cuts out the last 10 seconds.
On Mon, Apr 11, 2011 at 3:53 PM, Eugene Pasyuk <pasyuk at jlab.org> wrote:
> Good news from Sandy. All farm nodes have large disks. So we can submit
> large chunks of data in a single job.
>
> -Eugene
>
>
> -------- Original Message -------- Subject: Re: farm local disks Date: Mon,
> 11 Apr 2011 15:27:03 -0400 From: Sandy Philpott <sandy.philpott at jlab.org>
> To: Eugene Pasyuk <pasyuk at jlab.org>
>
>
> Hi Eugene,
>
> Most of the batch farm nodes have 500 GB disks, except the newest batch
> of farm11* systems, which all have 1000 GB (1 TB) disks. I don't think
> there's a maximum a job can request, so long as it physically fits under
> these sizes. We are looking at ways to make sure the local disk isn't an
> I/O bottleneck, as it can be for very I/O-intensive jobs.
>
> Sandy
>
> > Hi Sandy,
> >
> > What is the size of the local disks on the batch farm nodes these days
> > and what is the max a job can request for DISK_SPACE?
> >
> > Thanks,
> >
> > -Eugene
> >
>
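
(For reference, requesting the local disk is just one line in the Auger jsub
command file -- the tags below are from memory and the project name, job
name, sizes, and command are purely illustrative.)

    PROJECT: g12
    TRACK: analysis
    JOBNAME: gflux_runXXXXX
    DISK_SPACE: 60 GB
    COMMAND: run_gflux.sh
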
--
Johann T. Goetz, PhD.
jgoetz at ucla.edu
Nefkens Group, UCLA Dept. of Physics & Astronomy
Hall-B, Jefferson Lab, Newport News, VA
Office: 757-269-5465 (CEBAF Center F-335)
Mobile: 757-768-9999