
OpenAI can think in images & more (April 17, 2025)

David Pawlan

Co-Founder

Apr 17, 2025

Hey friends,

This week in AI: OpenAI dropped not one but two new models with serious reasoning chops, Anthropic’s Claude is finally learning to talk, and Copilot is now your computer’s new mouse-clicking intern. Oh, and we may have just hit AGI’s doorstep.

Let’s dive in.

🔍 Next Wave of AI Might Be Geniuses

o3 and o4-mini could be our AGI moment

OpenAI released two new reasoning models, o3 and o4-mini, and the upgrade is about more than speed. They’re the first OpenAI reasoning models that can use every ChatGPT tool (search, Python, image generation, and more), and they can even “think with images” by working uploaded pictures into their reasoning.

  • o3 hit state-of-the-art across math, science, and multimodal benchmarks
  • o4-mini aced the AIME 2025 math comp and performs like a top-200 coder on Codeforces
  • Both can analyze images and diagrams in real time, solve related problems, and generate new ideas
  • The new open-source Codex CLI lets you run an AI coding agent from your terminal, and it accepts screenshots or rough sketches as input alongside your code

Greg Brockman says it’s “a GPT-4-level leap.” Some researchers are already saying o3 is AGI. 👀

Claude just got smarter — and more human

Anthropic rolled out a major Claude update, adding autonomous research powers and Google Workspace integration. You can now ask Claude to find info across the web and your own docs/emails.

  • Research mode: Claude autonomously searches your files + the internet
  • Workspace mode: Reads your calendar, docs, and inbox for better context
  • Voice mode is next: Airy, Mellow, and Buttery voices are reportedly launching this month

Anthropic is clearly catching up in the AI assistant race. With these updates, Claude could become your full-time researcher.

🤖 Agents Are Clicking Buttons Now

Copilot takes over your desktop

Microsoft’s Copilot Studio now lets you build agents that interact with your desktop apps—clicking buttons, filling forms, and navigating GUIs like a human.

  • Great for automating apps that don’t have APIs
  • Adapts to layout changes in real time
  • Keeps data private (enterprise data isn’t used for training)

Between OpenAI, Anthropic, and Microsoft, the agent era is here. Your software might be the next thing that gets outsourced… to software.

🛠️ Tools of the Week

  • Grok Studio – Canvas-style AI doc builder
  • ScrapeGraphAI – Scrape any site into usable data
  • SpreadSimple – Turn Google Sheets into full websites
  • Plus AI – AI presentation builder with slick slide edits
  • Nily – Compare results from 20+ LLMs for the best answer

🧠 Prompt of the Day

Uncover the truth behind a job posting

Act as a seasoned hiring manager in [insert field]. Analyze this job post and reveal the top 3 traits it values. What problems is this role solving? What keywords should I include on my resume? Craft one power-sentence I can use in a cover letter that proves I get what they want. Paste job post below.
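If you’d rather reuse this prompt from a script than retype it, here’s a minimal Python sketch. The names `PROMPT_TEMPLATE` and `build_prompt` are made up for illustration, and the code only builds the text; sending it to a model is up to whichever tool you use.

```python
# Hypothetical helper: fills the job-post prompt with a field and a posting.
PROMPT_TEMPLATE = (
    "Act as a seasoned hiring manager in {field}. "
    "Analyze this job post and reveal the top 3 traits it values. "
    "What problems is this role solving? "
    "What keywords should I include on my resume? "
    "Craft one power-sentence I can use in a cover letter "
    "that proves I get what they want.\n\n"
    "Job post:\n{job_post}"
)


def build_prompt(field: str, job_post: str) -> str:
    """Return the finished prompt text, ready to paste into a chat assistant."""
    field = field.strip()
    job_post = job_post.strip()
    # Refuse to build a prompt with a blank slot, since the result would be useless.
    if not field or not job_post:
        raise ValueError("both field and job_post are required")
    return PROMPT_TEMPLATE.format(field=field, job_post=job_post)


if __name__ == "__main__":
    print(build_prompt("product marketing", "Paste the job post text here."))
```

Swap in your own field and the full text of the posting, then paste the output into the assistant of your choice.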

TL;DR

OpenAI’s new models o3 and o4-mini may have just crossed the AGI line, Claude’s now a researcher with a voice, and Copilot is clicking through your desktop apps. Meanwhile, agents are rising—and they’re coming for your workflows.

Want this in your inbox every week? Just say the word.

— David 🧠⚡️
