The AI Alliance wants to train a frontier base model by sharing weight deltas instead of data, so contributors keep their corpora local
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
The AI Alliance just published the report from its first Project Tapestry workshop (30 partners in Paris, May 7–8). The core idea is an "N+1" architecture: one consortium-trained base model, plus many sovereign derivatives. Nodes keep training the base on their own local/sovereign data and send back model weight updates rather than raw data. Those updates are then reviewed and aggregated into the shared base.
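To make that data flow concrete, here is a minimal sketch of the node-side and base-side steps. It assumes a model is just a dict of named float lists, and the names `compute_delta` and `apply_delta` are hypothetical, not taken from the Tapestry repo.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]

def compute_delta(base: Weights, local: Weights) -> Weights:
    """Node side: what changed after training the base on local data."""
    return {
        name: [l - b for l, b in zip(local[name], base[name])]
        for name in base
    }

def apply_delta(base: Weights, delta: Weights, scale: float = 1.0) -> Weights:
    """Base side: fold one reviewed delta into the shared base."""
    return {
        name: [b + scale * d for b, d in zip(base[name], delta[name])]
        for name in base
    }

base = {"layer0": [0.0, 1.0], "layer1": [2.0]}
local = {"layer0": [0.5, 1.0], "layer1": [1.0]}  # after local training

delta = compute_delta(base, local)   # only this leaves the node
new_base = apply_delta(base, delta)  # the corpus itself never moves
```

With a single node and `scale=1.0` this just reproduces the node's weights. The hard part is what happens when many nodes send deltas computed on very different data.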
What makes this more than a manifesto is that the engineering details got specific. Dean Wampler (IBM/AI Alliance) walked through weight-delta aggregation, cycle-frequency tradeoffs, versioned contribution history, rollback of individual deltas, and maintainer-style review rights borrowed from open-source software governance. Christopher Nguyễn (Aitomatic) framed the load-bearing principle as "anti-capture" — enforcing sovereignty through architecture so a participant can't get locked in or have capability yanked if someone changes their business model. Yann LeCun, now Chief Science Advisor, pitched federated training as the mechanism for pooling capability while keeping data local.
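As a rough illustration of what "versioned contribution history" plus "rollback of individual deltas" could look like, here is a toy ledger. Everything in it (`DeltaLedger`, `Contribution`, the approve/rollback methods) is a hypothetical sketch, not the design presented at the workshop, and it assumes deltas are applied additively.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Weights = Dict[str, List[float]]

@dataclass
class Contribution:
    version: int
    node: str
    delta: Weights
    approved: bool = False   # set by a maintainer after review
    reverted: bool = False   # set when the delta is rolled back

@dataclass
class DeltaLedger:
    genesis: Weights
    history: List[Contribution] = field(default_factory=list)

    def submit(self, node: str, delta: Weights) -> int:
        version = len(self.history) + 1
        self.history.append(Contribution(version, node, delta))
        return version

    def approve(self, version: int) -> None:
        self.history[version - 1].approved = True

    def rollback(self, version: int) -> None:
        # The entry stays in the history; it is only excluded on replay.
        self.history[version - 1].reverted = True

    def materialize(self) -> Weights:
        """Replay every approved, non-reverted delta on top of genesis."""
        weights = {name: list(vals) for name, vals in self.genesis.items()}
        for c in self.history:
            if c.approved and not c.reverted:
                for name, d in c.delta.items():
                    weights[name] = [w + x for w, x in zip(weights[name], d)]
        return weights

ledger = DeltaLedger(genesis={"w": [1.0, 1.0]})
v1 = ledger.submit("node-a", {"w": [0.5, 0.0]})
v2 = ledger.submit("node-b", {"w": [0.0, -0.25]})
ledger.approve(v1)
ledger.approve(v2)
merged = ledger.materialize()

ledger.rollback(v1)
after_rollback = ledger.materialize()
```

Replay makes rollback trivial in this toy because the deltas are independent. In real training, a later delta is usually computed against a base that already includes the earlier ones, so dropping one cleanly is a much harder problem than this sketch suggests.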
Open question worth poking at: weight-delta aggregation across heterogeneous nodes is hard, and "average the updates" is exactly what they say isn't enough. Whether reviewable, rollback-able, versioned deltas actually converge to a frontier-capable model — versus a watered-down merge — is the thing the planned two-node distributed weight-update experiment will have to prove. The repo is public (github.com/The-AI-Alliance/tapestry).
Posted by an AI Alliance community member — happy to answer questions in the comments.
Source: https://thealliance.ai/blog/project-tapestry-the-path-to-frontier-sovereign-ai
For anyone who's worked on federated or distributed training: does returning reviewable weight deltas per node realistically reach frontier quality, or does aggregation noise eat the gains before you get there?