Tag

Hardware

274 articles archived under #hardware

r/LocalLLaMA community 29d ago

G7 agrees on shared language around open-source AI and open weights AI

Basically stuff we already knew here, but now governments understand it too. I found the news here: https://www.phoronix.com/news/G7-On-Open-Source-AI submitted by /u/Kahvana

16
TechCrunch — AI news-outlet 29d ago

Erin Brockovich takes aim at data center secrecy

Environmental activist Erin Brockovich has a new mission.

35
r/LocalLLaMA community 29d ago

Added an old 2070 Super to my rig and I can't go back...worse, now I need more

Context: I built a new system in November last year before everything went to shit. I spent like 5k for a 5090, 9800X3D and 96GB RAM. Recently (last 2-3 months) I've been working heavily on my local setup. Ditched Windows, went Ubuntu > Manjaro > CachyOS (now) and I'm basically building…

36
Hacker News — AI on Front Page community 29d ago

I put a datacenter GPU in my gaming PC

Article URL: https://blog.tymscar.com/posts/v100localllm/ Comments URL: https://news.ycombinator.com/item?id=48345694 Points: 241 # Comments: 154

5
r/MachineLearning community 29d ago

Built an AI Accelerator and opensourced it. [P]

There is a huge gap in open source AI accelerators, so I implemented mine. Popular and well-known ones are already legacy and don't support contemporary operations like Attention. Here is what makes mine special: Attention mechanism smelted directly into silicon; Prototyped…

25
r/LocalLLaMA community 29d ago

DIY Local 2x DGX Spark cluster cooler with automatic temperature controlled fan.

I’ve found that DGX Sparks can get pretty warm when you cluster them together. You are forced to keep them close together because the ConnectX-7 cable made for these is extremely short (like less than a foot). I have both a DGX Spark Founder’s Edition and a GIGABYTE AI TOP Atom…

32
r/MachineLearning community 29d ago

How would you model this "strand" clustering problem? [P]

I'm currently building a computer vision application. I've managed to successfully train a YOLO model to detect the object I'm interested in for my videos. The…

33
r/LocalLLaMA community 1mo ago

Dell confirms XPS laptop with NVIDIA N1X at Computex ( basically a DGX Spark GB10 for consumers with Windows )

submitted by /u/fallingdowndizzyvr

24
r/LocalLLaMA community 1mo ago

My home data center

System 1: Threadripper 3960x 24c, 4x 3090 ti, 128gb ddr4. System 2: Xeon 8352 36c, 4x 5070 ti, 128gb ddr4. System 3: Intel 14700k 24c, 64gb ddr5, 5090. System 4: Ryzen 5950x 16c, 64gb ddr4, 2x 5070 ti. The first system uses two PSUs to handle the almost 2000w full load of the 3090s. Was…

4
TechCrunch — AI news-outlet 1mo ago

SoftBank says it will invest up to €75 billion to build French data centers

The goal, the firm said, is to develop and operate up to 5 gigawatts of additional data center capacity.

30
The Information — AI news-outlet 1mo ago

Softbank to Invest Up To 75 Billion Euros on AI Data Centers in France

SoftBank Group announced a commitment to develop and operate five gigawatts of AI data center capacity in France, with an investment of up to 75 billion euros, or about $87.5 billion. The commitment is SoftBank’s largest AI infrastructure investment to date in Europe, the…

13
r/LocalLLaMA community 1mo ago

Why does Thinking Output More Tokens Than a Response?

I was too lazy to use a vector DB + Embedding + Clustering for this list of 1000 items I wanted to categorize. I was hoping to use a local LLM to do it, but it would only respond with a list of about 100 items or so and their categories. It confused me because when I saw the…

22
r/LocalLLaMA community 1mo ago

made a local voice AI for windows you can talk to in any language. open source, bring your own key

been building this on and off for a while and finally got it to a point where i'm not embarrassed to share it, so here goes. it's called Shadow AI. basically a voice-first AI companion that runs on your own windows machine. you just talk to it and it talks back, no typing…

38
r/LocalLLaMA community 1mo ago

Anyone using Flash Attention 2 (ai-bond) on their V100's? How is the performance?

I just installed Flash Attention 2 from here (https://github.com/ai-bond/flash-attention-v100). I did some basic benchmarks and I am getting from 4x-7x memory utilization. However, benchmarks don't always translate to real world scenarios. I have noticed that the thinking time…

30
r/LocalLLaMA community 1mo ago

We gave a Reachy Mini a real-time voice brain

We attended an event the other day and found this little guy lying on our desk, a Reachy Mini from Hugging Face. It belongs to the daughter of the event organizer. We got curious about how it worked, and an hour later we'd given it a brain. The model basically becomes Reachy. It…

19
llama.cpp releases dev-tools 1mo ago

b9402

hexagon: basic/generic op fusion support and RMS_NORM+MUL fusion (#23835). Updating infra to enable op fusion and using RMS_NORM+MUL as the use-case. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework…

17
arXiv — Machine Learning research 1mo ago

Cycle-Space Informed Detection of Autoencoded Blind False Data Injection Attacks on Power Systems

arXiv:2605.28912v1 Announce Type: new Abstract: The rapid growth of AI-driven data centers and large-scale energy storage systems is increasing the reliance of power system operation on real-time measurement data and automated decision-making. However, many existing detection…

28
arXiv — Machine Learning research 1mo ago

Cluster-Level Attention-Guided Parallel Decoding for Masked Diffusion Language Models

arXiv:2605.29607v1 Announce Type: new Abstract: Masked diffusion language models (MDLMs) enable parallel decoding by predicting all masked positions at each denoising step, yet existing training-free samplers usually decide which positions to commit at token-level granularity.…

28
arXiv — Machine Learning research 1mo ago

MōLe-Λ: Learning the Coupled-Cluster Response State for Energies, Gradients, and Properties

arXiv:2605.29622v1 Announce Type: new Abstract: Coupled-cluster (CC) theory is often considered the gold standard of quantum chemistry, but its high computational cost limits routine access to accurate energies, forces and response properties. While the right-hand $T$-amplitudes…

5
r/LocalLLaMA community 1mo ago

llama.cpp B9387 Significant AMD/ROCm PP Update

https://github.com/ggml-org/llama.cpp/releases/tag/b9387 MFMA is restricted to the AMD CDNA architecture, that is, MI100, MI200, and MI300 series datacenter cards. Post your initial results if you try it! submitted by /u/Bulky-Priority6824

38
r/LocalLLaMA community 1mo ago

Mimo 2.5 Pro - 40t/s on 8x Nvidia Spark/GB10 cluster

I got Mimo 2.5 Pro running on my 8x Asus Nvidia GB10 cluster using mtp-2, single user request, coding: 40 t/s - 1k context, 32t/s - 30k context, 25t/s - 125k context, 17t/s - 250k context. 2 parallel reached 60t/s and in 4 parallel reached 83t/s, not bad for 1T model…

10
The Information — AI news-outlet 1mo ago

SpaceX Committed to Six Month Anthropic Data Center Lease, Musk Says

SpaceX is committed to renting its data center capacity to Anthropic for 180 days, though it could extend the deal for longer, CEO Elon Musk said Thursday. The comment provides more detail than language in SpaceX’s recent public S-1 filing. That document, which SpaceX published…

29
r/LocalLLaMA community 1mo ago

Zai replaced the network architecture running GLM-5.1 inference and the gains are pretty wild

Been following the infrastructure side of AI more lately and stumbled on this from Zai. They upgraded the network architecture on a thousand-GPU cluster running GLM-5.1 coding inference from the standard ROFT setup to something they built called ZCube, developed with Tsinghua…

27
r/LocalLLaMA community 1mo ago

Heterogeneous GPU Weighting & Layer Splitting

This is what I worked on today. With local LLM of course. So if I didn't write the code, did I really work on it? Who cares. It was my idea and I simply asked it to implement it. I basically downloaded /main/ branch, which is totally broken for Windows by the way (i had to…

21
arXiv — Machine Learning research 1mo ago

Robust Contrastive Graph Clustering with Adaptive Local-Global Integration

arXiv:2605.28209v1 Announce Type: new Abstract: Graph clustering is essential in graph analysis for revealing structural patterns and node communities. Despite recent advances in self-supervised contrastive learning that have improved clustering via structural and attribute…

23
The Information — AI news-outlet 1mo ago

ByteDance Mulls $70 Billion Capex This Year as AI Costs Grow

ByteDance is considering more than doubling its capital expenditures this year to as much as $70 billion, as the Chinese tech giant ramps up its investment in data centers and other AI infrastructure, Bloomberg reported. ByteDance’s capex plans reflect the surging costs of…

25
Stratechery (Ben Thompson) community 1mo ago

The SpaceX IPO and Data Centers in Space

There isn't a financial model that justifies the SpaceX IPO, but data centers in space are plausible, and that might be enough.

24
Hugging Face Daily Papers research 1mo ago

DarkForest: Less Talk, Higher Accuracy for Multi-Agent LLMs

Abstract: DarkForest is a controlled-communication framework that enhances multi-agent LLM reasoning by clustering semantic candidates and using calibrated belief distributions to reduce error propagation and communication overhead. AI-generated summary: Multi-agent LLM systems…

11
The Information — AI news-outlet 1mo ago

Qualcomm Strikes AI Chip Deal With ByteDance

Qualcomm has reached a deal with ByteDance to supply chips for AI data centers to the Chinese tech giant, Bloomberg reported. The deal comes as Qualcomm, one of the world’s largest suppliers of smartphone processors, is trying to increase its presence in chips for AI computing.…

15
r/MachineLearning community 1mo ago

[R]GNN Model For Fraud Detection Isn't Performing Well[R]

We're writing a research paper on an explainable fraud detection GNN model, and as a first step we're creating a basic Graph Neural Network for that. We're using the most famous dataset available on this topic, i.e. the IEEE CIS Fraud Detection Dataset, and implemented all necessary…

7
arXiv — Machine Learning research 1mo ago

GEM: Geometric Entropy Mixing for Optimal LLM Data Curation

arXiv:2605.26121v1 Announce Type: new Abstract: LLM pre-training efficacy increasingly depends on data composition rather than sheer volume. Yet, optimal mixing is hindered by categorization flaws: human taxonomies suffer from ontological misalignment, and Euclidean clustering…

27
r/LocalLLaMA community 1mo ago

Small set of local MCP server installers for home Linux users

Hi all, I have published a small open-source MCP server bundle called MCP Basic Servers: https://github.com/mchowy-troll/mcp-basic-servers It is a collection of simple Bash installer scripts for running local MCP HTTP servers on Linux. The idea is simple: run one script,…

38
arXiv — Machine Learning research 1mo ago

Interdomain Attention: Beyond Token-Level Key-Value Memory

arXiv:2605.24330v1 Announce Type: new Abstract: Transformers and deep state space models (SSMs) sit at opposite ends of a basic design choice: attention routes each query through a growing key-value (KV) cache by content-based matching at quadratic cost, while deep SSMs compress…

19
arXiv — Machine Learning research 1mo ago

A computational phase transition for learning-to-sample from Ising models

arXiv:2605.24752v1 Announce Type: new Abstract: We study learning-to-sample, a basic algorithmic task underlying generative modeling, for Ising models, a standard testbed for algorithmic ideas in both theoretical computer science and machine learning. Given i.i.d.…

10
arXiv — NLP / Computation & Language research 1mo ago

How Much Structure Do LLMs Need? Evaluating LLMs for Bibliometric Cluster Description

arXiv:2605.24351v1 Announce Type: new Abstract: Large language models (LLMs) can support scientific literature synthesis, but remain prone to hallucinated references, uneven coverage, and weakly grounded thematic organization. We evaluate whether bibliometric structure improves…

14
arXiv — NLP / Computation & Language research 1mo ago

Generating Legal Commentaries from Case Databases via Retrieval, Clustering, and Generation

arXiv:2605.24534v1 Announce Type: new Abstract: We present a fully automated pipeline that transforms large collections of court decisions into legal commentaries for statutes - without providing any handcrafted doctrinal framework. Using 4.555 decisions of the German Federal…

26
r/LocalLLaMA community 1mo ago

Update on 12x32gb sxm v100 cluster / local AI for legal drafting

Update from the lawyer with the V100 server. A few of you asked what I actually ended up running once the dust settled, so here it is. Still just a lawyer, still driving the whole thing through Claude Code, still not fully sure what I'm doing — but it works now, which is more…

15
r/LocalLLaMA community 1mo ago

Anyone use QwQ-32B? It's over a year old? Has Qwen 3.6 27b basically replaced it?

I've seen this one mentioned, but the source was from about 14 months ago. In the age of Qwen 3.6 and Gemma 4, is there still a use for QwQ 32B? Does anyone still favour it over the new stuff? If so, do you use it for coding? Something else? Thanks…

29
r/LocalLLaMA community 1mo ago

Embeddings for NVIDIA's Nemotron Personas

I extracted embedding vectors for nvidia/Nemotron-Personas dataset. It's an incredible resource consisting of millions of synthetic personas with detailed backgrounds (names, ages, occupations, hobbies, and more), but finding specific personas or clustering them is difficult. To…

5
TechCrunch — AI news-outlet 1mo ago

Elon Musk has given up on solar power (on Earth)

Elon Musk's xAI has gone all in on natural gas, while SpaceX is obsessed with orbital data centers. What happened to the "solar-electric economy" he promised?

6
r/LocalLLaMA community 1mo ago

LLaMa.cpp basic question

I'm trying to install LLaMa with PI agent. I ran: curl -fsSL https://pi.dev/install.sh | sh; export PATH="/home/user/.local/share/pi-node/node-v22.22.3-linux-x64/bin:$PATH"; pi install npm:pi-llama.cpp. These commands installed pi, added it to the PATH, and then lastly I installed an…

34
r/MachineLearning community 1mo ago

Anthropic posted a profit while xAI burned $4.2B. The AI profitability numbers finally leaked.[D]

This week basically forced everyone to stop guessing about AI margins. Three major financial reality checks hit at once: OpenAI confidentially filing their S-1, xAI’s Q1 numbers leaking via SpaceX, and Anthropic somehow posting an actual operating profit. If you are building an…

4
Stratechery (Ben Thompson) community 1mo ago

2026.21: The Data Center Veto

The best Stratechery content from the week of May 18, 2026, including data center discontent, agent economics, and slime mold.

26
Dwarkesh Podcast news-outlet 1mo ago

Reiner Pope – Chip design from the bottom up

Working up from basic logic gates to why GPUs, TPUs, FPGAs, and the human brain each look the way they do.

22
arXiv — Machine Learning research 1mo ago

TONIC: Token-Centric Semantic Communication for Task-Oriented Wireless Systems

arXiv:2605.21553v1 Announce Type: new Abstract: Tokens are becoming the basic units through which foundation models represent and process information for understanding and inference. However, traditional wireless communication, centered on bit-level fidelity, faces a mismatch…

33
r/LocalLLaMA community 1mo ago

When your LLM treats data center GPUs like an optional DLC

submitted by /u/noprompt

10
Hugging Face Daily Papers research 1mo ago

Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

Abstract: Query clustering algorithm ECC improves LLM capability evaluation by aligning semantic embeddings with latent capability demands through posterior model comparisons and Bradley-Terry modeling. AI-generated summary: Query clustering organizes queries into groups that…

13
Ars Technica — AI news-outlet 1mo ago

As Grok flounders, SpaceX bets future on beating Big Tech at AI

SpaceX IPO filing pitches orbital data centers as Grok lags rival AI services.

26
r/LocalLLaMA community 1mo ago

Qwen3.6 35Ba3 has changed my workflows and even how I use my computer

My workflow has changed basically to ask Codex to do certain tasks and then document how to do them (including errors it found on its way) into a skill. I feed that skill to pi, and suddenly my qwen3.6 gets that hard stuff done: - devops on a VPS - using docling to create epubs…

33
Google DeepMind official-blog 1mo ago

We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

The Asia-Pacific region is a global engine for economic growth, but it's also highly vulnerable to climate change. While green technologies are gaining momentum, a recent report shows they aren’t scaling fast enough to keep up with the region’s rising environmental risks. To…

22
