From jpcrafts at jlab.org Fri Jan 23 12:58:47 2026
From: jpcrafts at jlab.org (Joshua Crafts)
Date: Fri, 23 Jan 2026 17:58:47 +0000
Subject: [HydraTeam] Upcoming AI for Hall C meeting confirmation
Message-ID:

Hello all,

I wanted to reach out to confirm your group's participation in the upcoming AI for Hall C meeting, which we will be holding on Friday (Feb 6th), 3-5 PM EST.

We also currently plan to have participation from the NPS SRO group with Tanja/Kin and Brad/Casey regarding their AI work; there is no firm order as of yet.

Looking forward to getting this restarted. We'll be making a larger announcement for this at the Hall C meeting next week.

Thanks,
Joshua Crafts.

From cmorean at jlab.org Fri Jan 23 13:29:17 2026
From: cmorean at jlab.org (Casey Morean)
Date: Fri, 23 Jan 2026 18:29:17 +0000
Subject: [HydraTeam] Current Dev branches of Hydra and integration plans
Message-ID:

Hello Hydra Team,

I CC'd hydrateam for recording purposes. I am bringing in David, Brad and Anil as well.

Thomas gave me access to the hydra project on EPSCI's gitlab in mid-2025. I have been familiarizing myself with the data management tools found within GitLab for ML. I am working on creating documentation and best practices for using the GitLab model registry. The GitLab model registry has an MLflow integration - more information on MLflow can be found here. It is open-source software used to manage the full MLOps lifecycle. This tool was highlighted as a useful target by Hall D staff and Data Science staff in a recent presentation I gave to Hall D on storing models used within GlueX.

Since you are the primary team using machine learning in production for JLab operations, I would like to evaluate the hydra use case to ensure the documentation and best practices are up-to-date and validated with a real production workload. Hydra is the best fit!
For this purpose, I have reviewed the hydra_train.py script in the main branch of epsci/hydra/Hydra. This script mixes several steps of dataset curation, model preparation, and training into a single script. I have made some developments integrating MLflow into dataset curation with the SQL queries found in the training script. This is a separate repository I would like to incorporate into the hydra project.

I need to understand which branch(es) and features of hydra are currently being developed, what the current release of hydra is, and the development process for hydra. Can you please pass along which branch(es) or features you are currently working on, and what the currently deployed branch / patch is? Please also pass along any contribution guidelines for the project.

I would like to share the work I have done so far with you - should I make a new repo in the epsci/hydra project, or make a PR to a branch? There is currently no package management tool used within the main branch of Hydra. I am using the pyproject.toml format (PEP 621). My work touches several components of hydra, including the configuration file and the database connection system. At this time, a branch / PR would be a difficult integration point.

I am utilizing the SQL queries found in hydra_train.py, but I would like to use an ORM - Thomas mentioned you (Torri) had worked on the ORM a bit for the backend of hydra, and the frontend was using an ORM (Raiqa) - can you point me to that work? I am using pydantic to specify the configuration, and Data Transfer Objects (DTOs) to format the data manifest and accompanying MLflow metadata. I would like to standardize this with the current ORMs you have developed and expand upon them as needed. I also want to ensure the metadata aligns with the current metadata used in the hydra database / model training process.

This is a great opportunity to take some operational burden off the EPSCI team and automate model deployment and training.
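
A minimal sketch of the kind of DTO described above, for illustration only. It assumes nothing about the actual Hydra schema: the class names, field names, tag keys, and the SQL string are all hypothetical, and standard-library dataclasses stand in for pydantic so that the example runs without extra dependencies. It shows a data manifest being flattened into string tags of the sort an experiment tracker such as MLflow can store alongside a run.

```python
from dataclasses import asdict, dataclass, field
import hashlib
import json


@dataclass(frozen=True)
class ManifestEntry:
    """One curated sample (hypothetical fields, not Hydra's schema)."""
    path: str
    label: str


@dataclass(frozen=True)
class DataManifest:
    """Hypothetical DTO describing one curated training dataset."""
    dataset_name: str
    query: str  # the SQL used for curation, recorded for provenance
    entries: tuple = field(default_factory=tuple)

    def fingerprint(self) -> str:
        # Hash a canonical JSON form so identical manifests get identical IDs.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def to_tags(self) -> dict:
        # Flat string-to-string mapping; the key names are assumptions.
        return {
            "dataset.name": self.dataset_name,
            "dataset.num_entries": str(len(self.entries)),
            "dataset.sha256": self.fingerprint(),
        }


manifest = DataManifest(
    dataset_name="example",
    query="SELECT path, label FROM samples",  # placeholder, not a Hydra query
    entries=(ManifestEntry("a.png", "good"), ManifestEntry("b.png", "bad")),
)
tags = manifest.to_tags()
print(tags["dataset.name"], tags["dataset.num_entries"])
```

Keeping the manifest immutable and hashing a canonical serialization means two training runs can be compared by dataset fingerprint alone; a pydantic model could replace the dataclasses if validation of the fields is also needed.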
If you are at all interested, I would like to work with you towards a major version bump of Hydra with a common deployment. This project aligns well with the AmSC and Genesis mission by making data AI ready.

I look forward to working with you all. I understand this is a lot of information, and we may need to set up (yet another) meeting.

Thank you,
Casey Morean