Open Multimodal Datasets and Open-Source Software for Data-Driven Modeling of Multiphase Transport and Thermal Systems
Mirrored from arXiv — Machine Learning for archival readability. Support the source by reading on the original site.
Abstract: Data-driven modeling is becoming central to multiphase transport, electronics cooling, acoustic diagnostics, and thermal-fluid digital twins, but progress is limited by fragmented datasets and raw instrument files that are difficult to decode, reuse, or benchmark. This paper presents an open ecosystem of multimodal datasets and open-source software packages developed by the Nano Energy and Data-Driven Discovery (NED3) Laboratory for reproducible AI-enabled thermal-fluid research. We introduce a spatial-plus-temporal dimensionality framework, denoted S+TD, to classify datasets by the dimensionality of measured or simulated fields, including 0+0D point values, 0+1D time series, 1+0D profiles, 2+0D images, 2+1D videos, 3+0D volumetric fields, and multimodal combinations. We organize public NED3 datasets spanning boiling images, acoustic and thermal measurements, high-speed videos, infrared thermography, thermal-resistance measurements, CFD-generated fields, design files, and acoustic-emission data. We also describe complementary software packages, including BubbleID, SeqReg, CFDTwin, IRISApp, decode-wfs, AELab, and FlowLab, which support computer vision, sequence regression, surrogate modeling, infrared analysis, waveform decoding, acoustic-emission analysis, and multimodal diagnostics. Particular emphasis is placed on SeqReg, a general sequence-regression library for 0+1D, 1+1D, and 2+1D data, with applications such as nonintrusive heat-flux estimation. Finally, we discuss future community efforts to build interoperable thermal-fluid databanks and curated AI/ML tool libraries that connect datasets, metadata, decoders, baselines, benchmarks, and physically interpretable models.
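To make the S+TD notation concrete, the following is a minimal Python sketch of how a dataset's tag might be derived from an array shape. It is not taken from the paper or from any NED3 package: the names `STD` and `classify`, the convention that the time axis comes first, and the assumption that the array carries no channel axis are all hypothetical choices made for illustration. Only the tag-to-name pairs in `_LABELS` come from the abstract.

```python
from dataclasses import dataclass

# Tag names listed in the abstract; other combinations (e.g. 1+1D) are
# mentioned there but not given a descriptive name.
_LABELS = {
    (0, 0): "point value",
    (0, 1): "time series",
    (1, 0): "profile",
    (2, 0): "image",
    (2, 1): "video",
    (3, 0): "volumetric field",
}


@dataclass(frozen=True)
class STD:
    """Hypothetical container for a spatial-plus-temporal (S+TD) tag."""

    spatial: int   # number of spatial dimensions, 0 to 3
    temporal: int  # 1 if the data are time-resolved, else 0

    def __post_init__(self):
        if not 0 <= self.spatial <= 3:
            raise ValueError("spatial dimensionality must be between 0 and 3")
        if self.temporal not in (0, 1):
            raise ValueError("temporal dimensionality must be 0 or 1")

    @property
    def tag(self) -> str:
        return f"{self.spatial}+{self.temporal}D"

    @property
    def label(self) -> str:
        return _LABELS.get((self.spatial, self.temporal), "not named in the abstract")


def classify(shape: tuple, time_resolved: bool) -> STD:
    """Derive an S+TD tag from an array shape.

    Assumption (not from the paper): when time_resolved is True the first
    axis is time, every remaining axis is spatial, and there is no channel axis.
    """
    spatial = len(shape) - (1 if time_resolved else 0)
    return STD(spatial=spatial, temporal=int(time_resolved))


if __name__ == "__main__":
    examples = [
        ((1000,), True),         # e.g. a single-sensor signal
        ((500, 64, 64), True),   # e.g. a high-speed video clip
        ((64, 64), False),       # e.g. a single image
    ]
    for shape, timed in examples:
        std = classify(shape, timed)
        print(shape, std.tag, std.label)
```

Under these assumptions a multimodal dataset could be described as a collection of such tags, one per modality, although how the paper itself encodes multimodal combinations is not specified in the abstract.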
| Comments: | 23 pages, 7 figures |
| Subjects: | Machine Learning (cs.LG); Fluid Dynamics (physics.flu-dyn) |
| Cite as: | arXiv:2605.23037 [cs.LG] |
| | (or arXiv:2605.23037v1 [cs.LG] for this version) |
| | https://doi.org/10.48550/arXiv.2605.23037 (arXiv-issued DOI via DataCite, pending registration) |