Claude 3 has entered the chat

Hello, my name is Claude - Adam Lorton & Midjourney

On Monday, Anthropic made an announcement many thought improbable: it had shipped a model that outperforms GPT-4 (the model you get when you pay for ChatGPT Plus).

GPT-4 has been the undisputed heavyweight champion of large language models (LLMs) for nearly a year.

Anthropic

@AnthropicAI

Today, we're announcing Claude 3, our next generation of AI models. 
The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

9:07 AM • Mar 4, 2024

The Anthropic team is understandably proud, and they had a funny moment when testing the model...

Is this a test? Are you testing me right now??

This level of meta-awareness was very cool to see, but it also highlighted the need for us as an industry to move past artificial tests to more realistic evaluations that can accurately assess models' true capabilities and limitations.

The consensus among power users is: this is an excellent LLM.

Nick Dobos

@NickADobos

New Claude 1st impressions:
Feels really good.
Free tier sonnet is MILES ahead of GPT-3.5
Absolutely smashing my test coding questions
Opus vs GPT4 is harder to judge. Will need to play around with it more, but OpenAI has competition. I signed up, paid $20 & made an api key

11:36 PM • Mar 4, 2024

Ethan Mollick

@emollick

Claude 3 does a good job with the needle-in-a-Great-Gatsby test, where I load the entire text of the novel with a couple alterations into the context window.
Much better than Claude 2.1 (no hallucinations!), not quite as good as Gemini (not quite as insightful about content).  https://twitter.com/emollick/status/1760142889642852729

Ethan Mollick

@emollick

Gemini Pro 1.5 with 1M token window vs. Claude 2.1 with 100k vs. OpenAI GPT-4 with RAG I uploaded the Great Gatsby with 2 alterations (mentioning an "iphone-in-a-box" and a laser lawnmower) Gemini nails it (& finds one more thing). Claude does but hallucinates. RAG doesn't work

6:53 PM • Mar 6, 2024

Balaji Srinivasan even uploaded Morgan Stanley's financial statements and found Claude to be a shrewd financial analyst.

Spoiler alert: Claude was not surprised by Morgan Stanley's recent stock price decline.

Balaji

@balajis

I asked Claude, the Finance God.
Is Morgan in the red?
Here's what it said.
FROM CLAUDE
----------------
Based on the financial information provided in the images, there are several key observations and potential concerns regarding Morgan Stanley's financial position and risk… https://twitter.com/i/web/status/1765685404794290631 https://twitter.com/DarioCpx/status/1765560703434498464

JustDario 🏊‍♂️

@DarioCpx

#JustDarioDaily 🚨MORGAN STANLEY - BIG BALANCE SHEET LOSSES HIDDEN BEHIND EXOTIC DERIVATIVES CURTAINS (AGAIN)? 😳🚨 When you've spent enough time trading inside a bank and watching markets for hours a day sometimes you're able to spot if something suddenly starts behaving… https://twitter.com/i/web/status/1765560703434498464 https://twitter.com/dariocpx/status/1765388587280060434

5:26 AM • Mar 7, 2024

Do your technical reading with AI assistance

In one of my own tests, I uploaded a selection from the Medicare Inpatient Prospective Payment System Final Rule (a dense document full of circular references and legalese).

After a minute or so of reading, Claude was able to answer questions in plain English.

AI has become an essential tool for any technical reading I do, and Claude performs admirably.

With a context window (read: short-term memory) of 200,000 tokens (roughly 150,000 words), you can upload extremely long documents and still get good answers.
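If you want a quick sanity check before uploading a long document, here is a minimal Python sketch of that arithmetic. It assumes the rough ratio implied by the figures above (150,000 words per 200,000 tokens, or about 0.75 words per token) and counts words by splitting on whitespace. The function names and the 4,000-token reply budget are my own illustrative choices, not anything from Anthropic, and a real tokenizer will give different counts, especially for legalese, tables, or code.

```python
# Rough estimate only: the ratio below comes from the figures quoted above
# (150,000 words / 200,000 tokens), not from an actual tokenizer.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` from its whitespace-separated word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_reply: int = 4_000) -> bool:
    """Return True if the estimate leaves room for a reply.

    `reserved_for_reply` is a hypothetical budget for the model's answer.
    """
    return estimate_tokens(text) + reserved_for_reply <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "word " * 120_000  # stand-in for a 120,000-word document
    print(estimate_tokens(sample))
    print(fits_in_context(sample))
```

Treat the result as a ballpark figure: if a document lands near the limit, trim it or split it rather than trusting the estimate.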

Bottom Line

You can try Claude for free at claude.ai. For $20 / month, you get chat access similar to ChatGPT Plus.

Sadly, Claude does not yet have tools like web browsing and Code Interpreter.

If you're pondering the question 'Which LLM should I bet on for my team?', I recommend reading this excellent writeup:

Your guide to Google Gemini and Claude 3.0, compared to ChatGPT - Taren SK, AI Impact Lab

And finally, if you try Claude for yourself, hit 'Reply' and let me know what you notice. I read every email.

Until next time,
Adam

Whenever you're ready, here are 2 ways I can help you:

[Individual] One-off AI Coaching Call: Get unstuck! Solve a problem or address an opportunity with AI assistance

[Team] AI Accelerator Program: An interactive workshop series for teams. Accelerate your team’s AI adoption journey and start reaping the benefits of AI more fully

Adam Lorton

Has GPT-4 been dethroned?
