    Nvidia claims 10x cost savings with open-source inference models

    By admin, February 13, 2026

    Nvidia noted that cost per token went from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to Blackwell’s native low-precision NVFP4 format further reduced the cost to just 5 cents, for an overall 4x improvement in cost per token over Hopper while maintaining the accuracy that customers expect.
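The arithmetic behind the 4x figure can be checked with a minimal sketch. The cent values are the ones quoted above; the dictionary and the function name `improvement_factor` are illustrative assumptions, not anything taken from Nvidia's post.

```python
# Cost figures as quoted in the article, in cents.
# The names used here are hypothetical and exist only for illustration.
costs_cents = {
    "Hopper": 20,
    "Blackwell": 10,
    "Blackwell + NVFP4": 5,
}


def improvement_factor(old_cost, new_cost):
    """Return how many times cheaper new_cost is than old_cost."""
    return old_cost / new_cost


baseline = costs_cents["Hopper"]
for platform, cost in costs_cents.items():
    factor = improvement_factor(baseline, cost)
    print(f"{platform}: {cost} cents ({factor:.0f}x vs Hopper)")
```

Each of the two steps halves the cost, which is why they compound to 4x rather than adding up to some other figure.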

    Nvidia outlined four industry deployments in a blog post showing how this combination of Blackwell infrastructure, NVFP4, optimized software stacks and open-source models delivers significant cost reductions. They break down like this:

    • Healthcare: Tedious, time-consuming tasks like medical coding, documentation and managing insurance forms cut into the time doctors can spend with patients. Sully.ai tackles this problem with AI agents that handle these routine tasks.

    The problem is that Sully.ai’s proprietary, closed-source models didn’t scale well. So Sully.ai used Baseten’s open-source Model API on Blackwell GPUs with the NVFP4 data format, the TensorRT-LLM library and the Dynamo inference framework. The result was a 90% drop in inference costs, representing a 10x reduction compared with the prior closed-source implementation, while response times improved by 65% for critical workflows like generating medical notes.
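A 90% drop and a 10x reduction are the same claim stated two ways. The short sketch below shows the conversion; the helper name `reduction_factor` is an assumption made for illustration and is not part of any Nvidia or Baseten tooling.

```python
def reduction_factor(percent_drop):
    """Convert a percentage cost drop into an 'N times cheaper' factor.

    A drop of p percent leaves (100 - p) percent of the original cost,
    so the original is 100 / (100 - p) times the new cost.
    """
    if not 0 <= percent_drop < 100:
        raise ValueError("percent_drop must be in the range [0, 100)")
    return 100 / (100 - percent_drop)


# The 90% drop in inference costs quoted above.
print(f"{reduction_factor(90):.0f}x")
```

The relationship is not linear: by the same formula a 50% drop is only a 2x reduction, so the last few percentage points account for most of the multiplier.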
