news-14112024-222422

OpenAI recently announced a new feature for its ChatGPT desktop app for macOS that lets the chatbot read code in developer-focused apps such as VS Code, Xcode, TextEdit, Terminal, and iTerm2. The feature, called Work with Apps, eliminates the need for developers to manually copy and paste their code into ChatGPT for assistance. When it is enabled, ChatGPT automatically receives the section of code you are working on along with your prompt.

Unlike AI coding tools such as Cursor or GitHub Copilot, ChatGPT currently cannot write code directly into developer apps. However, OpenAI views this feature as a crucial step towards more advanced agentic systems. Understanding the different apps on a computer screen is a key challenge for AI agents, and OpenAI is working to overcome it.

Initially, the feature focuses on coding apps, reflecting the popularity of AI coding assistants powered by large language models (LLMs). It is currently available to Plus and Team users and will roll out to Enterprise and Edu users in the coming weeks. Moving forward, OpenAI plans to expand ChatGPT’s capabilities to work with other types of text-based apps for writing tasks.

To let ChatGPT read different apps, OpenAI relies on the macOS Accessibility API, the interface screen readers use, to extract text and pass it to the chatbot as context. While the macOS screen reader is reliable for most common apps, certain apps like VS Code require users to install specific extensions for compatibility. Additionally, the screen reader can only process text and cannot interpret visual elements like images, object orientation, or videos.

With Work with Apps enabled, ChatGPT analyzes either the last 200 lines of code or all the code in the foremost window, depending on the app. Users can highlight specific sections of code or text to direct ChatGPT’s attention, although the chatbot may still include surrounding text for context. Because so much text is sent with each prompt, the feature may consume a significant number of input tokens.
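
The selection behavior described above can be illustrated with a short sketch. This is not OpenAI's implementation, which has not been published; it only mirrors the rules reported here. The 200-line limit comes from the description above, while the function name `select_context`, the `(start, end)` selection format, and the `padding` value of 5 surrounding lines are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def select_context(
    lines: Sequence[str],
    selection: Optional[Tuple[int, int]] = None,
    tail: int = 200,
    padding: int = 5,
) -> List[str]:
    """Pick the lines of a document to send along with a prompt.

    selection: hypothetical (start, end) pair of 0-based line indices,
               end exclusive, marking the text the user highlighted.
    tail:      how many trailing lines to send when nothing is highlighted
               (200, per the reported behavior).
    padding:   assumed number of surrounding lines kept around a selection.
    """
    if selection is not None:
        start, end = selection
        # Widen the highlighted range, clamped to the document bounds.
        first = max(0, start - padding)
        last = min(len(lines), end + padding)
        return list(lines[first:last])
    # No highlight: fall back to the last `tail` lines of the document.
    return list(lines[-tail:]) if tail > 0 else []
```

Calling `select_context(lines)` on a 500-line file returns its final 200 lines, while `select_context(lines, selection=(100, 110))` returns the highlighted lines plus the assumed padding on each side. Either way, every returned line counts toward the prompt's input tokens.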

Looking ahead, it remains unclear whether OpenAI will extend this feature to apps that do not support Apple’s screen reader. Competitors like Anthropic have developed AI systems that analyze desktop screenshots to interact with various apps, a more general-purpose approach than OpenAI’s reliance on the Accessibility API.

Overall, OpenAI views the integration of ChatGPT with coding tools as a collaborative effort rather than the development of a standalone agent. The company is working towards enhancing ChatGPT’s ability to understand and interact with different types of content to provide more comprehensive assistance in the future. This move towards more advanced AI agents aligns with reports of OpenAI’s upcoming release of a general-purpose AI agent codenamed “Operator,” expected to debut in early 2025.

While the new feature is currently available on macOS, OpenAI plans to integrate ChatGPT with Apple’s upcoming releases in December. Whether Work with Apps will expand to Microsoft’s Windows remains uncertain at this time.