Gemini Live is (sort of) Alive

Aug 18, 2024

I have a Gemini Advanced account with Google, so I got an email the other evening prompting me (in the classic sense of prompting, not the new GenAI sense) to “Go Live with Gemini”, and I’ve been trying it out a bit over the weekend. The short story on my initial impressions is:
Gemini Live is quite limited - maybe even more than advertised in the image above - and a little confused about its abilities. Said differently, it is following in the tradition of just-launched Bard/Gemini tools.

Here’s a little more on the Limited, Confused, and Fun impressions I got while giving Gemini Live a spin:

Limited

I tried these things out on the Pixel 8 Pro and the Nothing 2. Gemini Live:

Can’t access Gmail, Photos, Drive, Google Tasks
You can’t upload anything to it right now
Can’t “see” web pages
Can’t be invoked to see what’s on my phone screen and talk about it, answer questions on it

Maybe I’m misreading the welcome blurb for it, but I thought the “Some features, like Extensions, aren’t available in Live yet” line was referring to third-party apps, not Google’s own core apps.

Gemini’s voice audio cuts out A LOT when responding; you can see the response in chat history, but that is no longer Live of course.

It’s also worth noting here that ChatGPT 4o is far better at continuous conversation right now than Gemini Live; and the Advanced Voice mode for ChatGPT has not come out yet. Apart from continuous conversations, I wonder whether Gemini Live will be able to match ChatGPT 4o’s Memory feature - which is a killer feature by the way.


Confused / Confusing

I asked Gemini Live to generate an image, and it replied, oddly, that it cannot “show” pictures.

So Gemini Live doesn’t know that Gemini Pro - which is sort of its foundation - can generate images, as I did using voice-to-text on the app.

I pasted a link to my recent Substack post on GenAI Overhyped, Bursting Bubble, Super Bloody Useful - and its summary was 100% backwards, wrong in terms of the views expressed in the post.

Maybe it’s just my writing style, I thought - so I gave it a link from The Guardian newspaper that starts like so:

OpenAI said on Friday it had taken down accounts of an Iranian group for using its ChatGPT chatbot to generate content meant for influencing the US presidential election and other issues.

Gemini’s summary is WAY wrong again:

OpenAI is reportedly circumventing its own sanctions and allowing users in Iran to access its ChatGPT service. This is despite US sanctions prohibiting the export of goods and services to Iran. Many users in Iran are utilizing ChatGPT for various purposes. OpenAI has not officially commented on the situation.

Fun

It’s the weekend; I needed to finish with some sort of fun result, even if not really with Gemini Live. After it told me it could not show pictures, I used Gemini (minus Live) and gave it this prompt:

Can you please generate an image of a black German Shepherd dreaming about killing squirrels?

Mostly because that is what my black Shepherd Mix dreams about every night :) And I really like these two images:

I’ve done enough criticizing of Gemini Live. I should note that it’s brand new, maybe more of an alpha version just now than a beta and most or all of the cool features touted and demo-ed are still somewhere on the horizon. Or maybe this will all be better/great on a Pixel 9 phone - I’ll know later this week.

Tech & Nonsense
