
    Nvidia claims 10x cost savings with open-source inference models

    By admin, February 13, 2026

    Nvidia noted that cost per token went from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to Blackwell’s native low-precision NVFP4 format further reduced the cost to just 5 cents. In total, the upgrade delivered a 4x improvement in cost per token while maintaining the accuracy that customers expect.
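
    To make the arithmetic behind those figures explicit, here is a minimal Python sketch. The function name `improvement_factor` and the variable names are illustrative assumptions, not anything from Nvidia; the three cost values are the ones quoted above, in cents.

```python
def improvement_factor(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


# Costs quoted in the article, in cents (hypothetical variable names).
hopper = 20.0
blackwell = 10.0
blackwell_nvfp4 = 5.0

print(improvement_factor(hopper, blackwell))           # 2.0 (hardware upgrade)
print(improvement_factor(blackwell, blackwell_nvfp4))  # 2.0 (NVFP4 format)
print(improvement_factor(hopper, blackwell_nvfp4))     # 4.0 (combined)
```

    The two 2x steps multiply, which is where the overall 4x figure comes from.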

    Nvidia outlined four industry deployments in a blog post showing how this combination of Blackwell infrastructure, NVFP4, optimized software stacks and open-source models delivers significant cost reductions. They break down like this:

    • Healthcare: Tedious, time-consuming tasks like medical coding, documentation and managing insurance forms cut into the time doctors can spend with patients. Sully.ai tackles this problem with AI agents that handle those routine tasks.

    The problem was that Sully.ai’s proprietary, closed-source models didn’t scale well. So Sully.ai moved to Baseten’s open-source Model API on Blackwell GPUs with the NVFP4 data format, the TensorRT-LLM library and the Dynamo inference framework. The result was a 90% drop in inference costs, a 10x reduction compared with the prior closed-source implementation, while response times improved by 65% for critical workflows like generating medical notes.
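
    The "90% drop" and "10x reduction" figures describe the same change in two ways. A minimal sketch of the conversion, assuming a hypothetical helper named `reduction_to_factor`:

```python
def reduction_to_factor(percent_drop: float) -> float:
    """Convert a percentage cost reduction into an 'N times cheaper' factor."""
    if not 0 <= percent_drop < 100:
        raise ValueError("percent_drop must be in the range [0, 100)")
    remaining = 100.0 - percent_drop  # share of the original cost still paid
    return 100.0 / remaining


print(reduction_to_factor(90))  # 10.0
```

    A 90% drop leaves 10% of the original cost, and paying one tenth of the old price is a 10x reduction.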


