[Halld-online] Hall-D Online Meeting Minutes

David Lawrence davidl at jlab.org
Wed Jan 11 17:05:00 EST 2017


Hi All,

Minutes from today’s online meeting are now posted:

https://halldweb.jlab.org/wiki/index.php/OWG_Meeting_11-Jan-2017


Minutes

Attendees: David L. (chair), Sergey F., Dave A., Carl T., Vardan G., Bryan M., Simon T., Curtis M., Hovanes E., Graham H.
Announcements

First run meeting of the Spring 2017 run is tomorrow morning
DAQ specs for upcoming run

Current expectation is that we should be able to run the DAQ at 50 kHz during the Spring run
This is 2.5 times larger than the original spec, but previous experience drives this expectation
The first limitation is the rate at which we can write to disk, which is roughly 1 GB/s
David showed a preliminary plot indicating that the 50 kHz rate may be achievable within 1 GB/s, but further investigation is needed since this appears somewhat inconsistent with what was extrapolated from Spring 2016 data
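
As a rough illustration of the numbers above, the sketch below works out what they imply. It assumes that "2.5 times larger" means a factor of 2.5, that 1 GB/s means 10^9 bytes per second, and that the disk write rate is the only limit; the variable and function names are hypothetical and not taken from any Hall-D software.

```python
# Back-of-the-envelope numbers taken from the minutes above.
TRIGGER_RATE_HZ = 50_000      # expected DAQ rate for the Spring run
DISK_WRITE_LIMIT_BPS = 1e9    # ~1 GB/s disk write limit (decimal GB assumed)
SPEC_RATIO = 2.5              # expected rate relative to the original spec (assumed factor)


def max_event_size_bytes(rate_hz, bandwidth_bps):
    """Largest average event size that still fits in the write bandwidth."""
    return bandwidth_bps / rate_hz


original_spec_hz = TRIGGER_RATE_HZ / SPEC_RATIO
event_budget = max_event_size_bytes(TRIGGER_RATE_HZ, DISK_WRITE_LIMIT_BPS)

print(f"original spec: {original_spec_hz / 1e3:.0f} kHz")   # original spec: 20 kHz
print(f"event size budget: {event_budget / 1e3:.0f} kB")    # event size budget: 20 kB
```

Under these assumptions the average event written to disk would need to stay at or below about 20 kB for 50 kHz to fit within 1 GB/s.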
gluonraid3 issues

Corrupted data observed on gluonraid3, partition 3
Fall run used partitions 3 and 4 for beam data
Each partition used 10 disks configured using software RAID0 and formatted with XFS
Most files currently on gluonraid3, partition 3 have md5 checksums that differ from those of the copies on tape
Looking closer at one file revealed that the copy on tape started with a valid EVIO block header while the copy on gluonraid3 was corrupted
The timestamp of the file on gluonraid3 was close to the timestamp of the mss stub file (from Dec. 16), suggesting the file has not been modified
Hovanes pointed out that the problem could be with the controller and not a disk
Current plan is to invest in 2 hardware RAID controller cards to replace the JBOD controllers
Will try to get them delivered and installed by the end of next week
Plan B: Use gluonraid1 and gluonraid2 at 800MB/s
Plan C: Try higher level of software RAID
Plan D: Get ZFS configuration optimized
Sergey noted that even if we do have RAID-level redundancy for data protection, if a disk fails the system performance will be degraded while the array rebuilds, likely making it unusable for high-rate I/O
Dave A. asked about spare disks for gluonraid3
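
The checksum comparison described above (disk copy versus tape copy) could be sketched as follows. This is only an illustration: the minutes do not say which tools were actually used, and the function names and the `reference` mapping (file path to the md5 recorded for the tape copy) are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(reference, paths):
    """List the paths whose on-disk md5 differs from the reference value.

    reference: dict mapping path -> expected md5 hex digest (hypothetical
    source, e.g. checksums recorded when the files were written to tape).
    A path missing from the reference is also reported as a mismatch.
    """
    return [p for p in paths if md5_of_file(p) != reference.get(p)]
```

Reading in chunks keeps memory use flat for large raw data files; a call such as `find_mismatches(reference, list_of_paths)` would return the files whose disk copy no longer matches.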
Run Preparations

FCAL10 issue has been resolved (bad f250 module)
CODA 3.0.7 may be available by end of next week. It could provide better performance due to:
More efficient object allocation leading to less garbage collection
Multi-threaded event building
Future Online Meetings

Next meeting is in 2 weeks, but we will likely be having regular run meetings at that time and will start deferring the Online meetings until after the run.
David noted that the original intent of the Online meetings was to provide tight coordination between Controls, Trigger, DAQ, and all other online systems (monitoring, copy to tape, gluon computer admin, ...). At this point, though, these topics have naturally broken off into separate meetings. The exception is DAQ, which is partially covered in the L1 meetings and partially in the Online meetings. The Online meetings are usually not attended by Sascha, who is the person primarily responsible for a large part of the DAQ. Thus, we should consider whether the current format of this meeting is the most efficient.


Regards,
-David