
Even Grok AI Can 'See' Now

There are a lot of trends in generative AI right now. There are reasoning models like OpenAI’s o3, which “think” through each step of a problem before they answer. There are also “deep research” features that can compile information from across the web to generate reports for you.

But perhaps the trend that is most “futuristic” of all is Voice Mode. This is the future 2013’s Her promised: a chatbot that you can talk to like any other person. The chatbot doesn’t say anything differently than it would if you were chatting over text; however, it responds in a “realistic” and “natural” voice, which could create the illusion that you’re talking to a person, not a robot.

I’ve never found the feature to be particularly engaging, even from big names like ChatGPT. The tech is impressive, sure, but it’s still painfully obvious to my ear that I’m talking to a bot. AI companies haven’t been able to shake these identifying quirks, but that hasn’t stopped people from forming “relationships” with chatbots, even falling in love with them.

What’s more impressive to me is the feature’s “vision” component. Some chatbots can not only talk back to you, but also access your camera to see what you’re seeing and incorporate that information into their replies. Both ChatGPT and Gemini offer these features, and now, so does Grok.

Grok can see

Grok is the latest chatbot to gain this ability in its Voice Mode. xAI developer Ebby Amir announced the feature, dubbed “Grok Vision,” on X Tuesday, noting that Grok Vision supports multilingual audio as well as realtime search. Those latter features are exclusive to SuperGrok subscribers, however.

The feature is already live on my end. You can access it by tapping the existing Voice Mode option. If you haven’t used this feature already, you’ll need to grant Grok permission to access your device’s microphone. Following this, you’ll be able to start chatting immediately.

However, to access Vision, you’ll need to tap the camera icon in the bottom left corner. Here, allow Grok to access your camera. Once the feed is live, you can start asking Grok about what it sees.

I’m not super keen on sending my live video feed directly to xAI, so I kept my phone flat on the table, which left the video feed all black. Grok, to its credit, tried earnestly to help me fix the problem, suggesting there might be something wrong with the camera, or that my environment was too dark. When I informed it that I had actually taken my phone up to outer space with me, it “laughed,” and concluded that had to be the problem: “Ha, outer space, huh? That black feed makes sense now: no light out there, and the camera’s probably not designed for that environment. You might need a space-grade device to get a proper feed.”

This is the second big feature drop for Grok this month. Last week, xAI rolled out a memory feature for the bot, which allows it to access past conversations for more relevant responses.


ZaKi

