Sending and receiving real-time audio will cost developers twice the rate charged for text-only large language models.
OpenAI’s annual developer day took place Wednesday in San Francisco, with a raft of product and feature announcements. The event’s centerpiece was the company’s introduction of its real-time application programming interface (API).
The feature lets developers send and receive spoken-language inputs and outputs during inference, that is, when a production large language model (LLM) is making predictions. The aim is to enable a more fluid, real-time conversation between a person and a language model.
A busy schedule at the developer day.
OpenAI’s pricing sheet for real-time API function calls in GPT-4o large language model inference.
OpenAI gives examples of how real-time voice can be used in generative AI, including an automated health coach giving a person advice, and a language tutor that can engage in conversations with a student to practice a new language.
OpenAI lets developers build real-time voice apps – at a substantial premium