<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>All,</p>
<p>If you really want this as a production service, you are going to
have to say what you need, including network bandwidth, staging
area, and tape bandwidth. I'm happy to look into what it takes to
provision this as a resource. We might also need to look into
QoS. Hall B might be interested in having whatever GlueX
is granted, so we'll need to take that into account as well.<br>
</p>
<p>regards, Chip<br>
</p>
<br>
<div class="moz-cite-prefix">On 5/23/16 5:25 PM, Richard Jones
wrote:<br>
</div>
<blockquote
cite="mid:CABfxa3S7Jpt804k2MTYoG79aYvyVyaHyDt6WX_y5zSdBPwuZSg@mail.gmail.com"
type="cite">
<div dir="ltr">Dear Matt,
<div><br>
</div>
<div>Following up on your query regarding download speeds for
fetching secondary datasets from JLab to offsite storage
resources, I have the following experience to share.</div>
<div>
<ol>
<li><i>first bottleneck:</i> switch buffer overflows (last 6
feet) -- the data path was 10Gb from source to my server
until the last 6 feet, where it dropped to 1Gb. TCP
performance was highly asymmetric: 95 MB/s upload speed,
but a poor and oscillating download speed averaging <b>15
MB/s</b>. The asymmetry was due to buffer overflows at the
switch port where the path necks down from 10Gb to 1Gb --
TCP has no back-pressure mechanism other than packet loss,
which tends to be catastrophic over high-latency paths
with the standard Linux kernel congestion-control
algorithms (cubic, htcp).</li>
<li><i>second bottleneck:</i> disk speed on the receiving
server -- as soon as I replaced the last 6 feet with a
10Gb NIC / transceiver, I moved up to the next resistance
point, around <b>140 MB/s</b> on my server. Using
diagnostics I could see that my disk drives (2 commodity
1TB SATA drives in parallel) were both saturating their
write queues. At this speed I was filling up my disks
fast, so I had to start simultaneous jobs to flush these
files from the temporary filesystem on the receiving
server to permanent storage in my dcache. Once the drives
were interleaving reads and writes, the net download
performance dropped to around 70 MB/s for both drives.</li>
<li><i>third bottleneck:</i> fear of too much success -- to
see what the next limiting point might be, I switched to a
data transfer node that the UConn data center made
available for testing. It combines a 10Gb NIC connected to
a central campus switch with what Dell calls a
high-performance RAID (Dell H700, 500GB, probably a large
fraction of it SSD). On this system I never saw the disks
saturate their read/write queues. The throughput rose
quickly as the transfers started, however, and as soon as
I saw transfers exceeding <b>300 MB/s</b> I remembered
Chip's warning and cancelled the job. I then decreased the
number of parallel streams (from the Globus Online
defaults) to limit the impact on JLab infrastructure.
Using just 1 simultaneous transfer / 2 parallel streams
(the Globus default is 2 / 4) I saw a steady-state rate
between 150 and <b>200 MB/s</b> average download speed,
even while simultaneously downloading and pushing from the
fast RAID to my dcache (multiple parallel jobs) -- which
was necessary to keep from overflowing this 500GB
partition in a matter of minutes. Decreasing the Globus
options to just 1 / 1, I was able to limit the speed to
<b>120 MB/s</b>, which is still enough to make me happy
for now.</li>
</ol>
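<div>As an aside on the first bottleneck, a back-of-the-envelope sketch of
why a 10Gb-to-1Gb neck-down overwhelms a switch port's buffer: the
buffering needed to absorb a burst without loss scales with the
bandwidth-delay product of the path. The ~70 ms RTT below is an assumed
wide-area round-trip time for illustration, not a measured
JLab-to-UConn figure.</div>

```python
# Sketch: bandwidth-delay product (BDP) of a path -- the rough amount of
# in-flight data a switch port must buffer to absorb a full-rate burst
# without packet loss. The RTT is an assumption, not a measurement.

def bdp_bytes(rate_bits_per_s: float, rtt_s: float) -> float:
    """Bytes in flight on a path running at full rate for one RTT."""
    return rate_bits_per_s * rtt_s / 8

rtt = 0.070  # assumed wide-area round-trip time, seconds
print(f"1 Gb/s leg: ~{bdp_bytes(1e9, rtt) / 1e6:.1f} MB of buffering")
print(f"10 Gb/s leg: ~{bdp_bytes(10e9, rtt) / 1e6:.1f} MB of buffering")
```

A commodity 1Gb switch port typically has far less buffer than this, so
once the 10Gb side bursts, loss (and a cubic/htcp window collapse) is
the only feedback TCP gets.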
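<div>And the arithmetic behind the second and third bottlenecks -- how
quickly a staging partition fills at the rates quoted above, which is
why the flush-to-dcache jobs had to run alongside the downloads. A
small sketch using the figures from the message:</div>

```python
# Sketch: minutes to fill a staging partition at a sustained transfer
# rate, using the 500GB partition and the MB/s figures quoted above.

def fill_minutes(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Minutes until a partition of capacity_gb fills at rate_mb_per_s."""
    return capacity_gb * 1000.0 / rate_mb_per_s / 60.0

for rate in (70, 140, 300):  # MB/s rates observed in the transfers
    print(f"500 GB fills in ~{fill_minutes(500, rate):.0f} min at {rate} MB/s")
```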
<div>I know that without any fiddling you were able to get
somewhere between bottlenecks 1 and 2 above. From this log
of lessons learned, I suspect you will know what steps you
might take to increase your speed to the next resistance
point. One suggestion for the future: we should coordinate
this. For example, anyone who wants offsite access to the PS
triggers should get them from UConn, not fetch them again
from JLab, since we already have the full set of them from
Spring 2016 in gridftp-accessible storage at UConn. Likewise
for what you pull to IU. Perhaps we should set up a central
place where we record what GlueX data is available, where,
and by what protocol.</div>
<div><br>
</div>
<div>-Richard Jones</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>