<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body>
    I disassembled the optimized x86-64 code that gcc 4.8 emits on
    Linux. As a pleasant surprise, the optimizer detects the pattern of
    shift operations used in EVIO for byte-swapping and collapses it
    into a single bswap opcode (or a rol operation for 16-bit values). The
    optimized code does look efficient. Impressive. <br>
    <br>
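    A minimal sketch of the kind of shift pattern meant here. This is not
    the actual EVIO source: the function names are made up for
    illustration, and it assumes a 32-bit unsigned int and a 16-bit
    unsigned short, as on x86-64 Linux.<br>
    <pre>
```c
/* Illustrative only: names are hypothetical, not taken from EVIO.
   Assumes unsigned int is 32 bits and unsigned short is 16 bits. */

/* Reverse the four bytes of a 32-bit word using shifts and masks. */
unsigned int swap32_shifts(unsigned int x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Exchange the two bytes of a 16-bit word; this is equivalent to
   rotating the value by 8 bits. */
unsigned short swap16_shifts(unsigned short x)
{
    return (unsigned short)((x >> 8) | (x << 8));
}
```
    </pre>
    With optimization enabled, a compiler that recognizes this idiom can
    emit a single bswap for the first function and a rotate for the
    second. Whether it does depends on the compiler version and flags.<br>
    <br>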
    The performance of the optimized EVIO routines is good. I think the
    days when byte-swapping was a major bottleneck (it was in 2000!) can
    indeed be considered gone. Of course, it would be even better if
    we vectorized swapping for large buffers, which typically leads to
    speed gains of 5x to 20x according to benchmarks posted online. But
    it's already (almost) good enough. I'll put some numbers together.<br>
    <br>
    So while performance is not a major issue, obviously we don't want
    to create unnecessary work, and moreover, I think it is important to
    have a definitive and consistent data format. Raw data whose
    endianness is variable, possibly requiring external documentation to
    understand, is not something to aim for. To underscore: I have not
    seen any sort of "endian flag" in any of the EVIO headers. If there
    are such flags, the EVIO C-library does not use them at all. Maybe
    someone can point me to documentation where these bits are hiding?
    Or do we need to use the EVIO C++ library to get support for
    endianness flags?<br>
    <br>
    Ole<br>
    <br>
    <div class="moz-cite-prefix">On 3.10.21 at 14:57, Ole Hansen wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:8d7feb8a-2df3-db36-f81f-13cc2ba49d8f@jlab.org">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      Yes, x86 has byte-swap opcodes. But EVIO isn't using them. Using
      those instructions greatly alleviates the CPU cost. I wrote an
      assembly-optimized version of the byte-swapping code in the early
      2000s, which I can resurrect, although that was 32-bit assembly,
      not even using MMX, let alone SSE instructions. Somehow I would
      think the EVIO library should include such optimizations out of
      the box, like good video codecs do.<br>
      <br>
      Ole<br>
      <br>
      <div class="moz-cite-prefix">On 3.10.21 at 13:45, Benjamin Raydo
        wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:MN2PR09MB57561524563EC5EC864D0F43A8AD9@MN2PR09MB5756.namprd09.prod.outlook.com">
        <meta http-equiv="Content-Type" content="text/html;
          charset=windows-1252">
        <style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);"> Hmm, Dave Abbott can
          comment on this... but there is an endianness flag in the EVIO
          structure that we should be setting to indicate this; we may
          need to check that we are consistent about its use. Anyhow,
          does x86 have a CPU instruction that can do this swap for you?
          If so, is it really that much CPU power?<br>
        </div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);"> <br>
        </div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);"> Ben<br>
        </div>
        <hr style="display:inline-block;width:98%" tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
            face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Sbs_daq <a class="moz-txt-link-rfc2396E"
              href="mailto:sbs_daq-bounces@jlab.org"
              moz-do-not-send="true">&lt;sbs_daq-bounces@jlab.org&gt;</a>
            on behalf of Alexandre Camsonne <a
              class="moz-txt-link-rfc2396E"
              href="mailto:camsonne@jlab.org" moz-do-not-send="true">&lt;camsonne@jlab.org&gt;</a><br>
            <b>Sent:</b> Sunday, October 3, 2021 1:19 PM<br>
            <b>To:</b> Ole Hansen <a class="moz-txt-link-rfc2396E"
              href="mailto:ole@jlab.org" moz-do-not-send="true">&lt;ole@jlab.org&gt;</a><br>
            <b>Cc:</b> <a class="moz-txt-link-abbreviated"
              href="mailto:sbs_daq@jlab.org" moz-do-not-send="true">sbs_daq@jlab.org</a>
            <a class="moz-txt-link-rfc2396E"
              href="mailto:sbs_daq@jlab.org" moz-do-not-send="true">&lt;sbs_daq@jlab.org&gt;</a><br>
            <b>Subject:</b> [Sbs_daq] [EXTERNAL] Re: Big endian raw
            data?</font>
          <div> </div>
        </div>
        <div>
          <div dir="auto">
            <div>I think we might be able to choose. 
              <div dir="auto"><br>
              </div>
              <div dir="auto">Since we now mostly use Intel CPUs,
                little-endian sounds like it would be more efficient,
                unless it breaks any software. I'm not sure about the
                endianness of the VTP; it is an ARM processor, so is it
                big-endian?</div>
              <div dir="auto"><br>
              </div>
              <div dir="auto">Alexandre</div>
              <br>
              <br>
              <div class="x_gmail_quote">
                <div dir="ltr" class="x_gmail_attr">On Sun, Oct 3, 2021,
                  13:06 Ole Hansen &lt;<a href="mailto:ole@jlab.org"
                    moz-do-not-send="true">ole@jlab.org</a>&gt; wrote:<br>
                </div>
                <blockquote class="x_gmail_quote" style="margin:0 0 0
                  .8ex; border-left:1px #ccc solid; padding-left:1ex">
                  <div>Maybe our various front-ends differ in
                    endianness, so we write mixed-endian data?!? That
                    would be disastrous since it is not supported by
                    EVIO. A file can only be one or the other—a very
                    binary view. (I guess EVIO was written before we
                    became diversity-aware ;) ).<br>
                    <br>
                    Ole<br>
                    <br>
                    <div>On 3.10.21 at 13:03, Andrew Puckett wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div>
                        <p class="x_MsoNormal">Hi Ole, </p>
                        <p class="x_MsoNormal"> </p>
                        <p class="x_MsoNormal">This is interesting. The
                          GRINCH data are being read out by the new
                          VETROC modules, I don’t know if they differ
                          from the other modules in terms of
                          “endian-ness”. Maybe a DAQ expert can weigh in
                          here?</p>
                        <p class="x_MsoNormal"> </p>
                        <p class="x_MsoNormal">Andrew </p>
                        <p class="x_MsoNormal"> </p>
                        <div style="border:none; border-top:solid
                          #b5c4df 1.0pt; padding:3.0pt 0in 0in 0in">
                          <p class="x_MsoNormal"
                            style="margin-bottom:12.0pt"><b><span
                                style="font-size:12.0pt; color:black">From:
                              </span></b><span style="font-size:12.0pt;
                              color:black">Sbs_daq <a
                                href="mailto:sbs_daq-bounces@jlab.org"
                                target="_blank" rel="noreferrer"
                                moz-do-not-send="true">
                                &lt;sbs_daq-bounces@jlab.org&gt;</a> on
                              behalf of Ole Hansen <a
                                href="mailto:ole@jlab.org"
                                target="_blank" rel="noreferrer"
                                moz-do-not-send="true">
                                &lt;ole@jlab.org&gt;</a><br>
                              <b>Date: </b>Sunday, October 3, 2021 at
                              1:00 PM<br>
                              <b>To: </b><a
                                href="mailto:sbs_daq@jlab.org"
                                target="_blank" rel="noreferrer"
                                moz-do-not-send="true">sbs_daq@jlab.org</a>
                              <a href="mailto:sbs_daq@jlab.org"
                                target="_blank" rel="noreferrer"
                                moz-do-not-send="true">&lt;sbs_daq@jlab.org&gt;</a><br>
                              <b>Subject: </b>[Sbs_daq] Big endian raw
                              data?</span></p>
                        </div>
                        <p class="x_MsoNormal"
                          style="margin-bottom:12.0pt">Hi guys,<br>
                          <br>
                          Bradley reported a crash of the replay
                          (actually in EVIO) with
                          /adaq1/data1/sbs/grinch_72.evio.0 (see <a
                            href="https://logbooks.jlab.org/entry/3916105"
                            target="_blank" rel="noreferrer"
                            moz-do-not-send="true">
                            https://logbooks.jlab.org/entry/3916105</a>).<br>
                          <br>
                          When digging into the cause of this crash, I
                          discovered that these raw data are written in
                          big-endian format. How can this be? I thought
                          the front-ends are Intel processors. Are we
                          taking data with ARM chips that are configured
                          for big-endian mode? Is this a mistake, or is
                          there some plan to it?<br>
                          <br>
                          These big-endian data have to be byte-swapped
                          when processing them on x86, which is what all
                          our compute nodes run. That's a LOT of work.
                          It leads to significant and seemingly
                          completely unnecessary overhead. I.e. we're
                          burning CPU cycles for nothing good, it seems.<br>
                          <br>
                          Please explain.<br>
                          <br>
                          Ole</p>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                  _______________________________________________<br>
                  Sbs_daq mailing list<br>
                  <a href="mailto:Sbs_daq@jlab.org" target="_blank"
                    rel="noreferrer" moz-do-not-send="true">Sbs_daq@jlab.org</a><br>
                  <a
                    href="https://mailman.jlab.org/mailman/listinfo/sbs_daq"
                    rel="noreferrer noreferrer" target="_blank"
                    moz-do-not-send="true">https://mailman.jlab.org/mailman/listinfo/sbs_daq</a><br>
                </blockquote>
              </div>
            </div>
          </div>
        </div>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>