Tag: Hardware
274 articles archived under #hardware
arXiv — Machine Learning research 30m ago scKDGM: KAN-guided Dynamic Graph Masked Learning for Single-Cell RNA-seq Clustering arXiv:2606.28459v1 Announce Type: new Abstract: Single-cell RNA sequencing (scRNA-seq) clustering is essential for identifying cell types, but high dimensionality, sparsity, dropout, and technical noise hinder robust expression representation and cell graph construction.… 27
arXiv — Machine Learning research 30m ago Improving Patient Subtyping on Longitudinal Data using Representations from Mamba-based Architecture arXiv:2606.28623v1 Announce Type: new Abstract: Effective sub-typing (also known as grouping or clustering) of patients using their electronic health record (EHR) data can greatly inform precision medicine efforts. However, subtyping temporal EHR datasets is known to be… 37
arXiv — Machine Learning research 30m ago Nonlinear mixture model motivated subspace clustering arXiv:2606.29261v1 Announce Type: new Abstract: We derive the linear union-of-subspaces (UoS) model for subspace clustering (SC) from the nonlinear mixture model (NMM) used in blind source separation (BSS) to represent a D-dimensional observation vector as an unknown… 7
r/LocalLLaMA community 13h ago Instead of decentralized training effort we should build the “One dataset” There are many threads here calling for united LLM training run of a new open model. Mainly, after govt. stunt of banning commercial frontier models. And also due to the lack of small-medium open-weight models releases lately. I genuinely believe at some point we’ll have “SETI… 38
Import AI (Jack Clark) community 15h ago Import AI 463: Self-improving robots; a 10k Chinese GPU cluster; and an elegiac essay for the human era What eras bookend our interregnum?
36
TechCrunch — AI news-outlet 15h ago Omen AI’s plan to optimize data centers is all wet Omen AI raised a $31 million Series A to monitor chip coolant and stop bacterial outbreaks in data centers. 8
llama.cpp releases dev-tools 18h ago b9840 DeepSeek V4 (#24162) convert: add dsv4 conversion add basic setup add llm_graph_input_dsv4 add save-load state add sinkhorn eps - correction by @fairydreaming add rope fix cleanup dead code fix bugs support pro model: added by @fairydreaming remove redundant V cache Chat… 26
arXiv — Machine Learning research 1d ago Dual-Learning based Penalized Multi-Align Clustering for Multi-View Incomplete and Disorderly Data arXiv:2606.27984v1 Announce Type: new Abstract: Multimodal feature fusion can effectively capture complex patterns in real-world data by integrating complementary information from different modalities. However, in many applications, such as boiler combustion monitoring,… 18
arXiv — NLP / Computation & Language research 1d ago Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability arXiv:2606.28116v1 Announce Type: new Abstract: Frontier large language model training consumes massive accelerator fleets and long wall-clock computation, making stability failures costly when they occur. After a numerical or a hyperparameter fault has already destabilized the… 31
arXiv — NLP / Computation & Language research 1d ago Cluster, Route, Escalate: Cascaded Framework for Cost-Aware LLM Serving arXiv:2606.27457v1 Announce Type: cross Abstract: Efficient deployment of large language models (LLMs) in production forces a trade-off between accuracy and cost.
Operators often default to a single model that is either expensive for easy queries or insufficient for hard ones.… 20
arXiv — NLP / Computation & Language research 1d ago DG^VoiC: Speaker Clustering for Fraud Investigation under Real Call-Centre Conditions arXiv:2606.28048v1 Announce Type: cross Abstract: Insurance fraud remains costly and operationally difficult, particularly in call-centre workflows where many customer interactions begin at FNOL. While recent fraud detection methods mainly rely on structured data, text, or… 19
r/LocalLLaMA community 1d ago Success story with MiMo-V2.5-GGUF:UD-Q5_K_XL I don't see many stories about this model, but after several attempts (after I finished finally reconfiguring my cluster) I did something useful with it: it wrote a built-in llama.cpp tool for executing C++ code and using the results. Here's an exercise that MiMo V2.5 gave me to… 27
Hacker News — AI on Front Page community 2d ago AMD Strix Halo RDMA Cluster Setup Guide Article URL: https://github.com/kyuz0/amd-strix-halo-vllm-toolboxes/blob/main/rdma_cluster/setup_guide.md Comments URL: https://news.ycombinator.com/item?id=48703258 Points: 207 # Comments: 61 22
TechCrunch — AI news-outlet 2d ago SoftBank’s CEO isn’t the only one with questions about Elon Musk’s orbital data center hype Not everyone is buying Elon Musk’s vision for orbital data centers. 19
r/MachineLearning community 2d ago Kicking off GPU Mode [D] Hey! I’m starting a series to document my work on GPU infrastructure, LLMs, and CV. Stop #1 is up: A brief look at why GPUs are the center of the industry, the CPU/GPU divide, and why nvidia-smi is the first place you check when things break. We’ll move past the basics quickly… 27
r/MachineLearning community 3d ago Roast my 3-year roadmap: Pivoting from Python/BaaS to AI Infrastructure & Go (Graduating 2029) [D] I'm a B.Tech student in India graduating in mid-2029.
Currently, I know Python, SQL, Docker, basic prompt engineering, and I've built a few LLM apps using BaaS like Supabase/Firebase. I’m running all this on an Intel i5 13th Gen laptop with an RTX 5050 (8GB VRAM). The Pivot: I… 10
TechCrunch — AI news-outlet 3d ago Why everyone from OpenAI to SpaceX is building their own chips (and turning up the heat on Nvidia) Nvidia has dominated the AI chip market for years, but the era of total dependence might be ending. OpenAI just shared its plans to spice things up with Jalapeño, its custom inference chip built with Broadcom, joining Google, Apple, and SpaceX in a growing list… 35
r/LocalLLaMA community 3d ago Why do people keep investing in Intel for AI? If you get a good deal on some Xeons with a lot of memory bandwidth, or a cheap GPU for home inference, that's cool, no disrespect. But how in the hell are Wall Street types considering Intel part of the "AI picks and shovels" play? Who's buying Intel for their AI data centers?… 17
r/LocalLLaMA community 3d ago 8 Tesla T4 Cards, what should it do? I have collected 8 Tesla T4 Datacenter Cards from a few retired VDI servers. I have one in a DEG1 and works ok on its own. What should we do with the rest? 7
r/MachineLearning community 3d ago Live Continual Learning in Machine Learning [D] My question on live continual learning use cases was removed by moderators here because they think I asked a basic level question about live continual learning which I thought is a frontier level research. But anyways. Is anyone interested in talking about continual learning… 30
TechCrunch — AI news-outlet 3d ago OpenAI’s Jalapeño chip is Big Tech’s spiciest move away from Nvidia Nvidia has dominated the AI chip market for years, but the era of total dependence might be ending.
OpenAI just shared its plans to spice things up with Jalapeño, its custom inference chip built with Broadcom, joining Google, Apple, and SpaceX in a growing list… 25
r/LocalLLaMA community 3d ago Ornith 1.0 - terminology and concepts explained (basic) I made a quick guide for myself while wanting to try the new models, so I share it with you. It's pretty basic, but it may be useful for new people here. I also published the repo with the open code config and the commands: https://github.com/facuHannoch/AI_Workflows-Ornith-1.0… 34
arXiv — Machine Learning research 4d ago SharQ: Bridging Activation Sparsity and FP4 Quantization for LLM Inference arXiv:2606.26587v1 Announce Type: new Abstract: Low-bit floating-point formats and semi-structured sparsity are increasingly supported by modern accelerators, yet combining them for LLM activation compression remains challenging: activations contain input-dependent outliers that… 29
arXiv — NLP / Computation & Language research 4d ago Axon: A Synthesizing Superoptimizer for Tensor Programs arXiv:2606.26344v1 Announce Type: cross Abstract: Writing high performance kernels for AI accelerators requires deep expertise in tiling, instruction selection, data layout, and operator fusion placing a significant burden on programmers. In this paper, we focus on tile based AI… 33
r/LocalLLaMA community 4d ago When you don't have a data center GPU Please don't tell me someone is going to (yet again) reply with the longest finetune-merge name in eternity... 4
ThursdAI news-outlet 4d ago GLM 5.2 total victory: the week open source won and nobody panicked From CoreWeave: A chill week, but a total Open Source victory for GLM 5.2 + Sakana Fugu, Krea Open Sources, OpenAI makes inference chips with broadcom, Karpathy gets heat about the new Claude Tag...
35
arXiv — Machine Learning research 5d ago Dense Supervision Is Not Enough: The Readout Blind Spot in Looped Language Models arXiv:2606.24898v1 Announce Type: new Abstract: Looped language models turn hidden states into runtime state: each state is decoded for prediction and fed back into future computation. This creates a basic supervision question: which state variables does cross-entropy actually… 37
arXiv — NLP / Computation & Language research 5d ago Is GraphRAG Needed? From Basic RAG to Graph-/Agentic Solutions with Context Optimization arXiv:2606.25656v1 Announce Type: new Abstract: As advanced RAG variants like GraphRAG and Agentic RAG emerge, one leading question is when and how to use them. Here, we introduce a framework for different RAG scenarios evaluation and comparison on semi-structured knowledge… 21
arXiv — NLP / Computation & Language research 5d ago Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining arXiv:2606.26050v1 Announce Type: cross Abstract: Midway through an ordinary pretraining run, a small language model learns the pronoun-gender rule: cued with a girl's name ("Sue cried because"), it resolves the next pronoun to she, generalizing to held-out probes (0.94 by step… 4
r/LocalLLaMA community 5d ago Locked Dell quote for 6x RTX PRO 6000 Max-Q at $8,960 — expires tonight. What would you do? Building an inference cluster to run GLM 5.2 locally. Got a Dell quote locked at $8,959.99/unit for 6x RTX PRO 6000 Blackwell Max-Q (300W). List price just jumped to $15,999 yesterday. Quote expires in ~3 hours and I can't swing all 6 right now. I have a second quote for 2 units… 36
r/LocalLLaMA community 5d ago Any chance I could cluster my DGX Spark (128GB unified memory) and my AMD Ryzen AI Max 395 (128GB unified memory) together to run 1 model? Hey all, So I have a Nvidia DGX Spark and an AMD Strix 395, both have 128GB of unified memory.
The Spark has 200Gbit network and the AMD Strix has 5Gbit ethernet (but it has a pcie gen 4x4 slot). Is there any chance I can cluster the 2 together to run a larger model that can fit… 30
r/LocalLLaMA community 5d ago OpenAI and Broadcom unveil LLM-optimized inference chip https://openai.com/index/openai-broadcom-jalapeno-inference-chip/ Quoted from the start of the blog post: Early testing shows that the first-generation accelerator will deliver performance per watt substantially better than current state-of-the-art Built from the ground up for… 11
Hacker News — AI on Front Page community 5d ago 45°C cooling design cuts data center water use to near zero Article URL: https://blogs.nvidia.com/blog/liquid-cooling-ai-factories/ Comments URL: https://news.ycombinator.com/item?id=48660178 Points: 206 # Comments: 157 22
OpenAI official-blog 5d ago OpenAI and Broadcom unveil LLM-optimized inference chip OpenAI and Broadcom introduce Jalapeño, a custom AI chip built for LLM inference to improve performance, efficiency, and scale across AI systems. 28
arXiv — NLP / Computation & Language research 6d ago ModTGCN: Modularity-aware Graph Neural Networks for Text Classification arXiv:2606.23694v1 Announce Type: new Abstract: Graph-based text classification models typically rely on local neighborhood aggregation and overlook global community structure, despite semantic document graphs exhibiting strong class-consistent clustering. Ignoring this can blur… 22
arXiv — NLP / Computation & Language research 6d ago Layer-wise Probing of wav2vec 2.0 and Whisper for Consonant Cluster Reduction in African American English arXiv:2606.23948v1 Announce Type: new Abstract: Self-supervised and supervised speech models are increasingly used to investigate which linguistic information their internal representations encode, and at what level of abstraction they encode it.
One underexplored phenomenon is… 6
Ars Technica — AI news-outlet 6d ago Oracle’s 21,000 layoffs help drive its debt-fueled AI investments Oracle is spending billions on data center infrastructure to support AI. 20
r/LocalLLaMA community 6d ago Is it possible to run a giant model like GLM5.2 on this cluster (4x servers with 512GB RAM + dual AMD Epyc)? 16 channel memory should hit 409GB/s per node. Hey all, I have a piece of hardware laying around which is pretty fast from a traditional (non-GPU) server viewpoint. The hardware is the following: Dell C6525 Server with Quad Node (4x server blades) with the following: 2x AMD EPYC 7702 64-Core Processors 8 memory channels per… 30
TechCrunch — AI news-outlet 7d ago Nvidia wants to cut data center water use, but that’s not the same as fixing AI’s water problem Nvidia announced a new cooling system that cuts water use inside the data center. But it does nothing to address AI's biggest water use — fossil fuel power plants. 5
TechCrunch — AI news-outlet 7d ago SpaceX inks compute deal with Reflection AI, an open-source AI lab Reflection AI will pay $150 million a month beginning July 1, 2026 through 2029 for immediate access to Nvidia's latest GB300 AI chips and supporting hardware across SpaceX's Colossus 2 data center near Memphis, Tennessee. 33
r/MachineLearning community 8d ago Data-centric debugging for teams training neural nets [P] We just did a big revamp of WeightsLab and wanted to share it here. If you’ve ever spent hours debugging a training run only to discover it was a data problem all along, this is for you. WeightsLab lets you pause training mid-run, inspect your live loss signals, and catch… 29
r/LocalLLaMA community 8d ago What‘s your local „Haiku“-Replacement? Seriously looking for a reliable and fast local Haiku replacement. Basically it should be able to summarize technical stuff, code documentation, architectural descriptions Any suggestions?
Edit: sorry, totally forgot that my local machine is a M4 Max 128GB. But at the same time… 6
r/LocalLLaMA community 9d ago Deep Neural Network that can turn any Image into a Playable Game! BUT LOCALLY, NOT ON DATACENTER Hi everyone!! I really wanted to share my research what I've been working on. I wanted to build a nn that can simulate games, or at least start doing that Most video generators are too large to run on consumer hardware realtime, so I designed a model that does this from… 14
r/LocalLLaMA community 10d ago New Agentic Benchmark Out: Claude Fable and GLM 5.2 Top Their Cohorts You can read about it here: https://artificialanalysis.ai/articles/aa-briefcase This is a solid benchmark from Artificial Analysis. It basically tests an LLM's ability to plan and execute tasks. And more importantly, it is a new benchmark that is not saturated, so no one can… 32
r/LocalLLaMA community 10d ago EvoTensile: Evolutionary algorithms for AMD Tensile GEMM kernel tuning There has been an effort to tune kernels in hipBLASLt so the most basic matmuls can run faster. It's known that on Strix Halo (gfx1151), GEMM with NN and TN input layouts (used in inference) are already well-tuned, while NT and TT layouts (used in training) are not yet tuned.… 8
arXiv — Machine Learning research 11d ago Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland arXiv:2606.20034v1 Announce Type: new Abstract: Understanding urban spatial morphology is critical for climate modeling, risk assessment, and sustainable urban design, and Local Climate Zone (LCZ) mapping provides the basic framework for this.
However, many cities still use… 10
arXiv — NLP / Computation & Language research 11d ago Clusters are All You Need: Pre-Training the Tsetlin Machine with Semantic Clusters from Language Models for Interpretability arXiv:2606.19815v1 Announce Type: new Abstract: Pre-trained language models such as BERT achieve strong text classification performance but lack transparency, limiting their use in high-stakes settings. The Tsetlin Machine (TM) offers fully interpretable, clause-based reasoning… 25
arXiv — NLP / Computation & Language research 11d ago TransLaw: A Large-Scale Dataset and Multi-Agent Benchmark Simulating Professional Translation of Hong Kong Case Law arXiv:2507.00875v3 Announce Type: replace Abstract: Translating Hong Kong Court Judgments from English to Traditional Chinese is mandated by Articles 8-9 of the Basic Law, yet remains constrained by a shortage of parallel resources and rigorous demands on legal terminology,… 38
arXiv — NLP / Computation & Language research 11d ago ShoppingBench: A Real-World Intent-Grounded Shopping Benchmark for LLM-based Agents arXiv:2508.04266v4 Announce Type: replace Abstract: Existing benchmarks in e-commerce primarily focus on basic user intents, such as finding or purchasing products. However, real-world users often pursue more complex goals, such as applying vouchers, managing budgets, and… 22
Hacker News — AI on Front Page community 11d ago Show HN: Are You in the Weights? With more traffic moving off-web and into LLMs, I got curious about what traces we leave "in the weights". My design partner and I built a site in the past few weeks that checks recognition across frontier and small models. It queries many of them in parallel, clusters the… 37