
More than three million developers are using OpenAI’s APIs as shorthand code to infuse apps and websites with an engine of advanced AI. And today, the company’s most popular API, called Chat Completions, is getting a significant sequel called Responses. Eight months in development, it will vastly expand upon and simplify the experience of plugging into OpenAI.

For developers, Responses will mean using less code to pose more complex questions to the AI: a hundred lines of code can shrink to just three, as the company courts a wider set of developers who don’t consider themselves LLM experts. For consumers, it will mean interacting with AI that’s faster, more fluid in handling media other than text, and more capable of taking multiple steps on your behalf.

“Completions were very much designed in a world where you could only put text into it, and you could only get text out of it. But now we have models that can do work across multiple mediums. We can put images in, we can get audio out of the model, and [users] can speak to the model in real time and have it speak to you back,” says Steve Coffey, an engineer at OpenAI. “[Completions] is just like the wrong tool for the job…so Responses was designed from the ground up.”


What is OpenAI’s new Responses API update? 

APIs are essentially software gateways that let developers use features from a service or platform inside their own. And to OpenAI, its APIs are as carefully designed as any product—even if we don’t tend to think of APIs as designed objects.

The iPhone has APIs for apps using its camera and accelerometers, for instance, while Stripe’s APIs make it possible for websites and apps to take payments—and in each case, the ease of integrating these APIs has been vital to courting developers and growing a business.

OpenAI created the modern API for AI in 2020 (and Chat Completions in 2023) so developers could plug into its AI platform. Its competitors have since copied OpenAI’s approach, making it something of an informal standard across the industry. Thousands of apps, ranging from Perplexity (an AI search engine) to Harvey (an AI for lawyers), currently integrate OpenAI’s APIs.

Today, OpenAI offers several different APIs, including one that generates images with DALL-E and another that exclusively summarizes or writes text from scratch. For this release, OpenAI is focusing on Responses, the evolution of its Chat Completions API—the popular way that app developers plug into the core conversational technology behind ChatGPT.

The way Chat Completions was designed, developers could send only one text query at a time and get a single text answer back. Practically speaking, that meant complicated questions could take several steps, and each new question took time, introducing more latency.
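That one-query-at-a-time design puts the burden of conversation state on the developer: every follow-up means resending the entire message history, one network round trip per step. A minimal sketch of the pattern (the assistant’s reply is a stand-in here; a real app would get it from a call to the OpenAI SDK):

```python
# Sketch of the Chat Completions pattern: the client owns the
# conversation state and must resend all of it on every turn.
# (Illustrative only; no actual API call is made.)

messages = [
    {"role": "user", "content": "What's the weather in San Francisco?"},
]

# Round trip 1 would be something like:
#   client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = "It's about 60°F and foggy."  # stand-in for the model's answer
messages.append({"role": "assistant", "content": assistant_reply})

# Round trip 2: the FULL history goes back over the wire again.
messages.append({"role": "user", "content": "And tomorrow?"})

print(len(messages))  # the payload grows with every turn
```

Multiply that bookkeeping by tool calls, retries, and multimedia, and the boilerplate adds up quickly.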

Now, a developer sends the Responses API a mix of code and natural-language queries, written more or less the way you or I would talk to ChatGPT. (A long user manual helps developers understand what they can and can’t do.)

OpenAI’s API will offer “multi-turn” conversations that understand context and conversational flow—even when you mix in multimedia like images and, soon to arrive, voice and sound. Responses can also juggle several processes at once, because with a single line of code you can connect “Tools” hosted by OpenAI into the process. These tools will include web search (so OpenAI’s responses can be grounded in more real-time data), a code interpreter to write and test code, and file search to analyze and summarize files. The new API will also let developers connect to Operator—OpenAI’s agentic tool that can analyze screens and actually take actions on the user’s behalf—and comes with a new software kit that helps developers juggle multiple AI agents at the same time.
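As described above, hooking a hosted tool into a Responses call is roughly a one-line addition. A hedged sketch of what such a request could look like with the OpenAI Python SDK — the model name and the `web_search` tool type string are assumptions based on the article, so check the SDK documentation before relying on them:

```python
# Hypothetical Responses request with a hosted web-search tool attached.
# The exact tool type string is an assumption; consult OpenAI's docs.
request = {
    "model": "gpt-4o",                  # assumed model name
    "input": "What's in the news about San Francisco weather today?",
    "tools": [{"type": "web_search"}],  # the one-line tool hookup
}

# With a configured API key, the live call would be roughly:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**request)
#   print(response.output_text)

print(sorted(request.keys()))
```

The point of the design is that the tool runs on OpenAI’s side: the developer declares it in the request rather than orchestrating the search-then-answer loop by hand.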

As the company explains, building APIs requires forecasting, years ahead, the functions developers may want, and if you squint, it’s not hard to see OpenAI’s own thesis about the future lurking in the feature set. The API vastly expands what’s possible when you plug into OpenAI as a developer: embracing fuzzy multimedia inputs, integrating information so responses are current, and, perhaps most notable of all, acting on behalf of the user to save them time and effort.

“I’m very excited for this year because of the agentic behavior that our models will unlock… the model is taking multiple steps on its own volition and giving you an answer,” says Atty Eleti, an engineer at OpenAI. “On the far end, [it makes way for] AI engineers, AI designers, AI auditors, AI accountants. Little junior interns that you can instruct and operate and ask them to go up and do these things. And I think we’re on the cusp of that becoming a very tangible reality.”

Still, these long-term possibilities are grounded in immediate efficiencies. The API updates mean that a simple question, “what’s the weather in San Francisco,” goes from taking a hundred lines of code to just three. Adding all of the aforementioned tools requires just one more line. This means coding AI apps should be faster for developers. And because multiple steps can be handled on OpenAI’s servers within a single request, responses should come back faster for end users.
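Taken at face value, the three-line version of that weather app could look like the sketch below. The model name and the `RUN_OPENAI_DEMO` guard (which keeps the snippet from making a live call) are illustrative assumptions; the `responses.create` method and `output_text` accessor follow OpenAI’s current Python SDK, but verify them against the docs:

```python
import os

def build_weather_request() -> dict:
    # The article's example question; the model name is an assumption.
    return {"model": "gpt-4o", "input": "What's the weather in San Francisco?"}

# Guarded live call: set RUN_OPENAI_DEMO (and OPENAI_API_KEY) to try it.
if os.environ.get("RUN_OPENAI_DEMO"):
    from openai import OpenAI  # requires `pip install openai`
    client = OpenAI()
    response = client.responses.create(**build_weather_request())
    print(response.output_text)  # aggregated text output of the response
else:
    print("Dry run:", build_weather_request()["input"])
```

The hundred lines the article mentions are what this replaces: the hand-rolled loops for tool calls, retries, and conversation state that Chat Completions left to the developer.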


The challenge of bringing developers along

Like any tool, APIs have to be designed for ease of use. They are not just about coding capabilities, the OpenAI team argues, but about designing for clarity and possibility.

“The education ladder of an API is something that has to be very consciously designed, because our target audience is not people who know how AI works or how LLMs work,” says Eleti. “And so we introduce them to AI in a sort of a ladder way, where you do a bit of work and you get some reward out of it. You do a bit more work, you understand some more concepts, and then over time, you can graduate to the more complex functionality.”

OpenAI delivers this instructive feedback to developers through their own mistakes: whenever a request generates an error, OpenAI tries to explain what went wrong in plain language, so the developer can actually understand how to improve their technique. The OpenAI team believes that such feedback, coupled with autocomplete coding tools, should make Responses easy for developers to learn.

“I think that really good APIs sort of allow you to start off with the gas pedal and the steering wheel and graduate slowly to the airplane cockpit by exposing more and more functionality in the form of knobs, in the form of like settings and these other things that are hidden from you first, but exposed over time,” says Coffey.

The tricky part of updating an API, however, is not just making it self-explanatory. The API also needs to be backward compatible, because software that’s already been built to connect to OpenAI can’t suddenly go dark after an update. So Responses is backward compatible with software built upon Chat Completions. Furthermore, the Completions API itself will continue working as it always has: OpenAI will keep supporting it into the future, offering updates that bring Completions as close to feature parity with Responses as it can. (But to use those nifty tools, you’ll need to graduate to Responses.)

Over time, the OpenAI API team bets, most of its developers will land on Responses, given the extra capabilities and equivalent pricing. Assuming OpenAI has bet on the right future, AI software is about to become faster, more capable, and more proactive than anything we’ve seen to date.
