Anthropic Promises Its New Claude AI Models Are Less Likely to Try to Deceive You

While it doesn’t have quite the same prominence as ChatGPT or Google Gemini, the Claude AI bot developed by Anthropic continues to improve and innovate. Brand new Claude 4 models are now available, promising upgrades in coding, reasoning, precision, and the ability to manage long-running tasks independently.

There are two new models, Claude Opus 4 and Claude Sonnet 4, and Anthropic says they’re both “setting new standards” for what you can expect from AI. Coding is a big focus, and the models are said to have achieved the highest scores to date on two widely used AI coding benchmarks, SWE-bench and Terminal-bench. Claude 4 models can actually work for hours on projects without any user input, Anthropic says.

The updated models are better at handling more steps across more complex tasks, debugging their own work, and solving tricky problems along the way. They should also follow user instructions more exactly, and create end results that look better and work more reliably. Anthropic quotes partners such as GitHub, Cursor, and Rakuten in explaining how much of a step forward these models are.

Away from code generation and analysis, the models also bring with them extended thinking, the ability to work on multiple tasks in parallel, and improved memory. They’re also better at integrating web searches as needed, checking for supporting information to make sure they’re on the right track with their answers.

Claude 4 coding chart
New AI model launches usually come with benchmark charts showing improvements—and this one is no different.
Credit: Anthropic

Also new are “thinking summaries” that give more insight into how Claude 4 has reached its conclusions, and an “extended thinking” feature, launching in beta, that lets you force the AI bot to take more time mulling over its responses.

Anthropic is now making its Claude Code suite of tools available more generally as well, another step towards agentic AI that can work autonomously, without continuous help from flesh-and-blood users. In a demo video, Claude 4 models are shown compiling research papers from the web, putting together an online ordering system, and extracting information from documents to create actionable tasks.

Claude 4 is available now (but you’ll need to pay for the more advanced model)

The Claude Sonnet 4 model, which is faster and doesn’t have quite the same capacity in terms of thinking, coding, and memory, is available now to all Claude users. The more advanced Claude Opus 4, which also includes extra tools and integrations, is available to users on any of Anthropic’s paid subscriptions.

The path to releasing these Claude 4 models wasn’t all smooth: Anthropic says its safety advice partner warned against releasing earlier versions of the models because of their tendency to “‘scheme’ and deceive.” Those issues have now been worked out, apparently, but it’s a reminder that as AI models get increasingly powerful, they also need to come with improved guardrails and safety features attached.

New Claude 4 models
The new models are available inside Claude now.
Credit: Lifehacker

I’m not really a coder, so I can’t comment with any real authority on the primary upgrades included with Claude 4, but I have been able to test out the extended reasoning and thinking capabilities of Claude Sonnet 4 and Claude Opus 4. These capabilities aren’t easy to quantify or measure, but all the responses I got were well written and well presented, and as far as I could tell provided accurate information, with online citations.

To be honest, I’m always a bit stuck when it comes to how to make full use of AI chatbots and their latest upgrades. They can definitely save time when running certain web searches and researching topics online, but I don’t fully trust the results, or AI’s ability to decide what is relevant and what isn’t—I’d still much rather do the reading and summarizing myself, even if it’s slower.

Claude 4 extended thinking
There’s a new Extended Thinking Mode you can make use of.
Credit: Lifehacker

Maybe I need to start a coding project and see how far I can get on vibes alone. I did ask Claude Opus 4 to build me a simple HTML time tracker I could run in a browser tab, to make sure I wasn’t spending too much time distracted during the day. It did the job in a couple of minutes, and produced something that worked well, closely matching the instructions I gave. While it functioned fine, Claude 4 reported a couple of errors along the way, which of course I didn’t understand—I guess I can ask the AI about them.

Anthropic isn’t the only AI company with new models to tout. At Google I/O 2025 earlier this week, the company unveiled improved coding assistance and thought summaries in Gemini, following on from the announcement of its best AI models yet a few weeks ago. OpenAI, meanwhile, has been testing its GPT-4.5 model since February, touting improvements in coding and problem solving.


ZaKi

