At Domyn, we’re building multilingual models that can truly reason across math, code, tool use, and the major European languages. Thanks to NVIDIA DGX Cloud Lepton, we’ve unified training, alignment, serving, and evaluation in one place. With NeMo for CPT and SFT, NeMo RL for GRPO, and vLLM to generate high-quality synthetic data at scale, we can move faster and more efficiently than ever.

DGX Cloud Lepton lets us scale across many nodes, maintain high throughput, and tighten our build–measure–learn loop, resulting in faster experiments, cleaner evaluations, and less operational drag. In short, it lets us stay focused on what we do best without worrying about the underlying infrastructure.

At the system level, the NVIDIA DGX Cloud Lepton team has been hands-on with performance tuning and training monitoring. The platform ties together vLLM for serving and Dask and Ray for parallelism, while storage plugs into our S3 buckets and high-throughput file systems to keep pipelines moving.

Stay tuned for more exciting news from Domyn and NVIDIA! Alexis Black Bjorlin, Yangqing Jia, Kourosh Behnam, Vishal Ganeriwala
Domyn’s Post
Incredible progress from our partners at Domyn. Their multilingual reasoning models show how NVIDIA DGX Cloud Lepton accelerates every stage of AI development, from training and alignment to large-scale inference. The results speak for themselves: faster iteration, cleaner evaluation, and simpler scaling across nodes. Excited to see what’s next as Domyn continues to push the boundaries of reasoning-capable LLMs. #DGXCloud #DGXCloudLepton #SovereignAI #NVIDIA
Yesterday, SemiAnalysis published a fascinating piece on NVIDIA Lepton's strategic direction. It's always validating when independent, deep analysis reinforces what we're intensely focused on at InferX. The article argues that Lepton is struggling with "the hardest part of the cloud which is multi tenancy" and failing to address the core needs of AI-native developers. It even states, "We would rather go through the pain of self deploying SLURM on bare metal nodes than use Lepton." This is precisely why InferX exists: our entire focus is on delivering unparalleled GPU orchestration, multi-tenancy, and performance, making infrastructure truly invisible so our customers can build models, not manage complexity. This external perspective strengthens our conviction that we're solving a critical, underserved problem for the AI community. Proud of the InferX team for building the future!
The top six US hyperscalers are on track to spend $416bn of capex this year, fuelled by AI demand. Yet OpenAI, Anthropic, Google, Perplexity et al. will barely hit $30bn of revenue. The term "AI bubble" abounds in the media. But we have been here before. Prominent commentators have called the current AI boom a bubble: from Jim Covello, Goldman Sachs' Head of Research, in Jul '24 ("A skeptical look at AI investment"), to Elliott Management in Aug '24 putting Nvidia in "bubble land", to Julien Garran of MacroStrategy in Oct '24 decrying AI as a misallocation of capital waiting to bust. People want to be first to call a bubble; the GFC made folk heroes of quite a few investors, e.g. John Paulson. But the mistake commentators make is to conflate the massive investment in AI chips/GPUs with generative AI alone. Jensen Huang's must-watch interview with BG2 lays it down clearly: the AI infrastructure investment is first replacing the world's installed base of CPUs. Global internet workloads previously run on CPUs are moving to GPUs: recommendation engines are moving, search engines are moving, and data processing is next. These industries are collectively worth >$2trn, and that is before factoring in the growth potential of the $30bn of AI-native revenue beyond 2025. https://lnkd.in/gnGwSwiv
NVIDIA: OpenAI, Future of Compute, and the American Dream | BG2 w/ Bill Gurley and Brad Gerstner
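The post's headline figures can be sanity-checked with a quick back-of-envelope calculation. A minimal sketch, using only the numbers quoted above ($416bn of capex, roughly $30bn of AI-native revenue, >$2trn of CPU-era workloads) as illustrative inputs:

```python
# Back-of-envelope check of the figures quoted in the post (all USD billions).
capex = 416                 # projected 2025 capex of the top six US hyperscalers
ai_native_revenue = 30      # rough combined revenue of OpenAI, Anthropic, et al.
cpu_workload_market = 2000  # >$2trn of CPU-era workloads said to be moving to GPUs

# Capex outruns AI-native revenue by roughly 14x, which fuels the bubble talk...
capex_to_revenue = capex / ai_native_revenue
print(f"capex / AI-native revenue: {capex_to_revenue:.1f}x")

# ...but if the spend is really a refresh of the installed CPU base, the relevant
# denominator is the >$2trn of existing workloads, not the $30bn of new revenue.
capex_to_installed_base = capex / cpu_workload_market
print(f"capex / CPU-era workload market: {capex_to_installed_base:.1%}")
```

On the post's own numbers, the spend looks wildly out of proportion to new revenue but modest relative to the installed base it is claimed to replace, which is exactly the argument being made.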
🔄 Timothy Prickett Morgan from The Next Platform highlights NVIDIA's rapid advances in #AI performance, following up on NVIDIA's sweep of the SemiAnalysis's InferenceMAX™ v1 benchmark. Key takeaways: ✅ #NVIDIA #software optimizations are boosting performance by 5-10X on the same #hardware ✅ Pareto curves illustrate how hardware and software optimizations can boost #AI #inference performance ✅ 80% of NVIDIA employees now work on software, which drives around 60% of each #GPU generation’s performance gains 🔗 Read the full article to learn more. #Cloud #Developer #DeepLearning #DataCenter
The announcements from NVIDIA GTC in Washington, D.C. make one thing clear: AI inference at scale is entering a new era, and multi-node, disaggregated serving is at the center of it. Jensen Huang highlighted how NVIDIA Blackwell, together with the NVIDIA Dynamo software platform, delivers 10x the performance of NVIDIA Hopper for emerging AI workloads, including reasoning and large Mixture of Experts models such as DeepSeek R1. What makes this even more impactful is that this level of performance is now available across mainstream cloud environments. Dynamo is integrated with AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure. This enables enterprises to scale multi-node inference across Blackwell-powered clusters (including GB200 and GB300 NVL72 systems), with the flexibility and reliability that modern AI demands.

🔹 Why Multi-Node Inference Matters
As AI models grow in size and context length, single-node serving becomes a bottleneck. Disaggregated inference, where prefill and decode run on separately optimized GPU tiers, offers major improvements in throughput and responsiveness. NVIDIA Dynamo brings this approach to production at scale. A great example comes from Baseten, which achieved 2x faster serving and 1.6x higher throughput for long-context code generation without adding hardware. Software-driven acceleration like this meaningfully reduces the cost of delivering intelligence.

🔹 Kubernetes, Grove, and the Future of AI Deployment
Dynamo's new Kubernetes capabilities let enterprises orchestrate multi-node inference systems with far more ease. The new NVIDIA Grove API allows teams to describe an entire inference system in a single high-level specification, from GPU ratios to placement requirements; Grove then manages orchestration across the cluster automatically. This is an important step toward making distributed inference as straightforward as traditional model serving.

🔹 The Bigger Picture
As reasoning models, agentic systems, and Mixture of Experts architectures continue to expand, scalable inference will determine who can deliver next-generation AI experiences efficiently and cost-effectively. NVIDIA's full-stack approach, from Blackwell hardware to Dynamo software, is shaping that future. Full blog and technical details are available here for anyone who wants to dive deeper: https://lnkd.in/dHnCskUG #technology #innovation #artificialintelligence #deeplearning #cloudcomputing I partnered with NVIDIA to bring you this post.
Excellent discussion with Jensen Huang. Some points that got me thinking:

1. Huang identifies three integrated scaling laws driving massive compute demand, replacing the old, single pre-training law:
i/ Pre-training: the initial, massive training phase
ii/ Post-training: reinforcement learning (RL), where the AI "practices a skill until it gets it right," integrating training and inference
iii/ Inference (thinking): the model now "thinks" before answering, doing research and running multiple models concurrently; the longer the model thinks, the better the answer

2. In Jensen's view, the world's trillions of dollars of general-purpose computing infrastructure must be refreshed with accelerated AI computing. He breaks the Total Addressable Market (TAM) into three parts:
i/ Converting existing general-purpose computing to accelerated AI computing
ii/ Moving the massive classical workloads of companies like Meta and Google from CPUs to GPUs
iii/ Augmenting human intelligence, which makes up $50 trillion to $60 trillion of world GDP. He estimates this augmentation will translate into $5 trillion of annual capex on AI factories

3. NVIDIA's competitive advantage is increasing because they innovated outside the box of Moore's Law. They now use extreme co-design to optimise the model, algorithm, system, and chip simultaneously, innovating across six or seven different chips (GPU, CPU, networking, etc.) on an annual cadence. This pace is nearly impossible for competitors to match

4. The majority of new compute demand is shifting from one-time training of models to continuous inference/thinking during deployment. This confirms the market is maturing from a research problem to a deployment problem

5. He also highlights a massive divergence of belief between AI builders (Huang, Altman, Zuckerberg) and Wall Street analysts. Analysts are forecasting NVIDIA's growth to "flatline" by 2027, while the builders see a potential $10 trillion market. This gap shows a failure of traditional finance models to grasp exponential growth and systemic compounding

6. He addresses the massive deal with OpenAI (reportedly a $100 billion investment), calling OpenAI the "next multi-trillion dollar hyperscale company" and thus framing NVIDIA's stake as an investment in a future giant. The partnership is presented as additive, helping OpenAI build its own self-operated AI infrastructure to supplement its work with Microsoft and Oracle. https://lnkd.in/eNrDHRaW
NVIDIA: OpenAI, Future of Compute, and the American Dream | BG2 w/ Bill Gurley and Brad Gerstner
NVIDIA's Blackwell Platform: A Quantum Leap for AI or Overhyped Tech? At its GTC conference, NVIDIA announced the Blackwell architecture, succeeding the highly successful Hopper. The flagship B200 GPU and GB200 Superchip represent a monumental step in generative AI, designed to power trillion-parameter language models. Major cloud providers like Amazon, Google, Microsoft, and Oracle have already committed to integrating Blackwell, signaling strong initial demand and cementing NVIDIA's pivotal role in the AI infrastructure boom. The market's reaction has been a mix of awe and apprehension. While the announcement was largely priced in, the stock (NVDA) continues to exhibit high volatility. Investor sentiment is overwhelmingly bullish, driven by the narrative that NVIDIA is the sole provider of 'picks and shovels' in the AI gold rush. However, contrarian indicators are flashing, as the stock's valuation has reached levels that demand flawless execution and sustained exponential growth, leaving little room for error. Source - https://lnkd.in/d7CtPsTv
The conversation about an AI bubble is still ongoing everywhere, but here is Nvidia CEO Jensen Huang's take on the topic. He believes the chance of an "AI glut" or "bubble" is "extremely low" until the total conversion of the world's computing infrastructure to AI is complete. He offers a three-point framework for why demand is sustainable and accelerating:

1. The end of general-purpose computing: the world's trillions of dollars of existing computing infrastructure must be completely refreshed and replaced with accelerated computing and AI. This is a necessary, non-optional shift.

2. Shifting existing workloads to AI: before even creating new applications, the industry is transitioning how existing massive-scale computing is done, moving traditional hyperscale workloads (like search and recommender engines) from CPUs to GPUs/AI.

3. Augmenting global GDP: AI's ultimate opportunity is augmenting human intelligence, which represents $50 trillion (or 50-65%) of the world's GDP. He uses the analogy of a $10,000 AI making a $100,000 employee twice as productive. This augmentation will create immense demand for new AI infrastructure, far exceeding the current market size.

He summarizes by saying the industry is being compounded by two exponentials: the exponential growth in the number of customers and the exponential increase in the computational complexity of the AI itself (reasoning and thinking). This makes the current investment pace a necessity, not an overbuild. If you have time, the full interview with Nvidia CEO Jensen Huang is super interesting and eye-opening. https://lnkd.in/gAZ9DF8Y #ai #aibubble #aitrend #dotcombubble #artificialintelligence
NVIDIA: OpenAI, Future of Compute, and the American Dream | BG2 w/ Bill Gurley and Brad Gerstner
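Huang's productivity analogy can be made concrete with simple arithmetic. A minimal sketch, using the illustrative numbers from the analogy above ($10,000 AI, $100,000 employee, "twice as productive"); the conservative assumption that the employee's output is worth at least their cost is mine, not the speaker's:

```python
# Huang's productivity analogy from the post, as arithmetic (USD per year).
employee_cost = 100_000   # fully loaded cost of one employee
ai_cost = 10_000          # cost of the AI tool that augments them
productivity_gain = 2.0   # "twice as productive", per the analogy

# Value produced before and after augmentation. Assumption (mine): output was
# worth at least what the employee cost, a deliberately conservative floor.
value_before = employee_cost
value_after = employee_cost * productivity_gain

extra_value = value_after - value_before  # additional value created per employee
roi = extra_value / ai_cost               # return on each dollar of AI spend
print(f"extra value per employee: ${extra_value:,.0f}")
print(f"return on AI spend: {roi:.0f}x")
```

Even under this conservative floor, every $10,000 of AI spend unlocks $100,000 of additional output, which is the mechanism behind the claim that augmenting a $50trn+ slice of GDP dwarfs the current market.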
“...Excel general purpose computing to accelerated computing replacing all the hyperscalers with AI and then now augmenting human intelligence,” said Jensen Huang. This might be the simplest way to understand why the Total Addressable Market (TAM) for AI is exploding and why AI Factory and Data Centre demand is surging worldwide. Accelerated computing isn’t just faster silicon; it’s a complete re-architecture of the global compute economy. When you realise that GPUs now augment rather than just automate human intelligence, the stakes in OpenAI, NVIDIA, and the emerging AI ecosystem start to make perfect sense. This is global formation in real time. It’s also a wake-up call: Australia must look beyond the “lucky country” mindset of exporting raw minerals and start building an AI ecosystem of its own, with sovereign compute, infrastructure, talent pipelines, and R&D, to put Australia on the leaderboard of the future digital economy. Full episode (worth a listen): NVIDIA: OpenAI, Future of Compute, and the American Dream | BG2 w/ Bill Gurley & Brad Gerstner https://lnkd.in/ewhCAwW2
Google just fired a shot at NVIDIA. Meet Ironwood, Google’s most powerful AI chip yet. Built on next-gen TPU architecture, Ironwood delivers:
• Faster training speeds
• Higher energy efficiency
• Seamless scaling across massive data centers
What makes this big? For years, NVIDIA GPUs have powered nearly every major AI workload. But with Ironwood, Google isn’t just catching up; it’s building a parallel stack around Gemini AI + Google Cloud + custom silicon. This deep vertical integration means one thing: the AI race is shifting from model performance to hardware sovereignty. As training costs explode, whoever controls the entire AI pipeline, from chip to cloud to model, wins. And Ironwood might just be Google’s boldest move yet to reclaim that edge. #Google #Nvidia #AI #Semiconductors #ChipDesign #Innovation #GeminiAI #CloudComputing