
Local, Open-Source AI Models Launched by OpenAI, Equal in Quality to Paid Options

OpenAI's debut of the gpt-oss-120b and gpt-oss-20b models marks its first open language-model releases since GPT-2; the smaller model needs only 16 GB of memory to run, and both work offline without depending on cloud services.



OpenAI, the leading artificial intelligence research laboratory, has open-sourced two new language models under the Apache 2.0 license: gpt-oss-120b and gpt-oss-20b. Both models demonstrate impressive capabilities in reasoning tasks and tool use, setting a new bar for openly available models.

Capabilities

The gpt-oss-120b and gpt-oss-20b models boast strong instruction following, chain-of-thought reasoning, tool use, and structured outputs. They are compatible with OpenAI’s Responses API and with popular inference frameworks such as Transformers, vLLM, llama.cpp, and Ollama.
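For a concrete starting point, here is a minimal sketch of running gpt-oss-20b through the Transformers pipeline API. It assumes a recent Transformers release with chat-aware text-generation pipelines and uses the Hugging Face model id openai/gpt-oss-20b; adjust dtype and device settings to match your hardware.

```python
# Minimal sketch: running gpt-oss-20b with Hugging Face Transformers.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # Hugging Face repository id
    torch_dtype="auto",          # let Transformers pick a suitable dtype
    device_map="auto",           # spread the model across available devices
)

messages = [
    {"role": "user", "content": "Explain Mixture-of-Experts in one sentence."},
]
result = pipe(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1])  # last message is the assistant's reply
```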

The gpt-oss-120b offers superior reasoning and better performance on complex tasks, thanks to its deeper network and larger pool of experts. The gpt-oss-20b, on the other hand, is optimised for speed and accessibility, making it suitable for low-cost or on-device inference. Both models utilise a Mixture-of-Experts (MoE) architecture, which activates only a small subset of the total parameters on each forward pass to improve efficiency.
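To make the MoE idea concrete, here is a toy sketch of top-k expert routing. It is illustrative only, not the actual gpt-oss implementation: a router scores experts per token and only the top-k experts run, so active parameters stay far below total parameters.

```python
# Toy Mixture-of-Experts layer (illustrative, not the gpt-oss code).
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep only top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                    # run only chosen experts
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out
```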

Requirements for Running

The gpt-oss-20b requires approximately 16 GB of VRAM, while the gpt-oss-120b requires 80 GB for inference. Here's a breakdown of their specifications:

| Model | Total Parameters | Active Parameters per Token | Memory Requirement | Recommended Hardware | Notes |
|--------------|-------|-------|------------|--------------------------------------------------------------|----------------------------------------------------------------------------------|
| gpt-oss-20b | ~21B | ~3.6B | 16 GB VRAM | Single 16 GB GPU (e.g., consumer-grade GPU) | Ideal for on-device or low-cost server inference[2][3] |
| gpt-oss-120b | ~117B | ~5.1B | 80 GB VRAM | Single high-end GPU such as NVIDIA H100, or multi-GPU setup | Can run on a single 80 GB H100; vLLM recommended for best performance[1][2][3] |
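A rough back-of-envelope check on those memory figures, assuming the weights are stored at roughly 4 bits per parameter (as with the MXFP4 checkpoints OpenAI ships); activations, KV cache, and runtime overhead come on top of this:

```python
# Back-of-envelope weight-memory estimate (assumption: ~4 bits/parameter).
def weight_memory_gb(params_billion, bits_per_param=4):
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(f"gpt-oss-20b:  ~{weight_memory_gb(21):.0f} GB of weights")   # ~11 GB
print(f"gpt-oss-120b: ~{weight_memory_gb(117):.0f} GB of weights")  # ~59 GB
# Consistent with the 16 GB and 80 GB figures once runtime overhead is added.
```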

Training gpt-oss-120b took approximately 2.1 million H100 GPU hours, while gpt-oss-20b required roughly ten times less[1]. Note, however, that the 120B model's weights are large, and loading them from disk can be slow on systems limited by SSD read speeds[5].
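As the table above notes, vLLM is the recommended serving stack for the 120B model. Here is a minimal sketch using vLLM's offline Python API; the model id and defaults are assumptions taken from the Hugging Face listing, and tensor parallelism settings should be tuned to your GPUs.

```python
# Sketch: serving gpt-oss-120b with vLLM's offline API.
from vllm import LLM, SamplingParams

llm = LLM(model="openai/gpt-oss-120b")  # needs ~80 GB of GPU memory
params = SamplingParams(max_tokens=256, temperature=0.7)

outputs = llm.generate(
    ["Summarise the Apache 2.0 license in two sentences."], params
)
print(outputs[0].outputs[0].text)
```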

Safety Precautions

Because these models are open-weight, anyone can fine-tune them, including to strip their built-in safety behaviour, so developers deploying them must implement their own safeguards[1][2][3].

Summary

The gpt-oss-20b, designed for consumer GPUs with 16 GB of VRAM or for edge devices, is the more accessible option for inference. The gpt-oss-120b, in contrast, requires an 80 GB GPU such as the NVIDIA H100 (or a multi-GPU setup) and delivers higher reasoning capacity.

Both models are available on Hugging Face and support adjustable reasoning effort levels (low, medium, high). They have undergone evaluation by independent expert groups, with the 120B model demonstrating near-parity with OpenAI’s o4-mini on reasoning benchmarks.
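As a sketch of how the reasoning effort level might be selected, assuming the level is conveyed through the system prompt as in OpenAI's harmony chat format (check the model card for the exact convention), and reusing `pipe` from the earlier example:

```python
# Sketch: setting the reasoning effort level via the system prompt
# (assumption: the harmony chat format convention "Reasoning: <level>").
messages = [
    {"role": "system", "content": "Reasoning: high"},  # low | medium | high
    {"role": "user", "content": "How many primes are there below 100?"},
]
result = pipe(messages, max_new_tokens=512)  # `pipe` from the earlier sketch
print(result[0]["generated_text"][-1])
```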

[1] OpenAI Blog: Link
[2] GitHub: Link
[3] Apache 2.0 License: Link
[4] Codeforces: Link
[5] AMD Ryzen™ AI and Radeon GPU Systems: Link

