Greetings, readers. Today I’m trying an experiment with my scanner reports. Rather than giving an overview of AI, education, and the future across multiple domains, I’m going to try offering reports focused on one domain.
So today’s issue is all about technological developments and nothing else. As usual I’ll summarize each topic, link to supporting documentation, and offer some reflections.
Let’s start with Google’s recent releases, and appropriately enough with a visual from their upgraded image generator. ImageFX appeared last year, but now is much more powerful and also widely available. It’s very easy to use, following the now customary workflow of entering a text prompt, checking out multiple results, then iterating:
ImageFX stands out by suggesting alternative words and phrases for parts of your prompt, both within the text entry field (oddly called “chips”) and as tags floating below.
I find the quality to be comparable to that now offered by DALL-E. The generative AI image race is deepening.
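ImageFX itself is browser-only, with no public API, so here’s a toy Python sketch of the “chips” idea instead: suggest swappable alternatives for words in a prompt, producing variant prompts to try. The chip vocabulary below is invented for illustration, not anything from Google:

```python
# Toy sketch of ImageFX-style prompt "chips": propose one-word swaps
# for a prompt. The vocabulary here is invented; this is NOT a real
# ImageFX or Google API.
CHIPS = {
    "dramatic": ["moody", "cinematic", "soft"],
    "painting": ["photo", "sketch", "render"],
}

def prompt_variants(prompt: str) -> list[str]:
    """Return the original prompt plus its one-word chip substitutions."""
    variants = [prompt]
    for word, alts in CHIPS.items():
        if word in prompt:
            variants += [prompt.replace(word, alt) for alt in alts]
    return variants

print(prompt_variants("a dramatic painting of a lighthouse"))
# ['a dramatic painting of a lighthouse',
#  'a moody painting of a lighthouse', ...]
```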
You can find ImageFX on Google’s AI Test Kitchen site. Also there is MusicFX, which turns your text prompts into short audio clips. As with ImageFX, “chips” appear within your text prompt, along with tags below:
Here’s one result from that prompt, which turned out kinda fun:
You can find more information in this paper.
Google also added smaller AI-backed features to the Chrome browser, including suggestions for tab organization.
On the chatbot side, Amazon is rolling out a shopping assistant called Rufus, named after a Welsh corgi.
Rufus is an expert shopping assistant trained on Amazon’s product catalog and information from across the web to answer customer questions on shopping needs, products, and comparisons, make recommendations based on this context, and facilitate product discovery in the same Amazon shopping experience customers use regularly.
Why do this? Apparently the idea is to catch up with other tech giants in the AI races, but also to give users a different way to search for products. One executive described it as an alternative to search, according to a New York Times account: it “lets customers discover items in a very different way than they have been able to on e-commerce websites.”
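Amazon hasn’t published Rufus’s architecture, but the description above, an assistant grounded in a product catalog, maps naturally onto the familiar retrieval-augmented pattern. Here’s a minimal sketch of that pattern; the toy catalog, the question, and the final prompt hand-off are all my own illustrative assumptions:

```python
# Minimal retrieval-augmented sketch of a catalog-grounded shopping
# assistant, in the spirit of Amazon's description of Rufus. The catalog
# and the LLM hand-off are hypothetical stand-ins; Amazon has not
# published Rufus's actual design.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

catalog = [
    "Trail running shoes, waterproof, cushioned sole",
    "Road running shoes, lightweight, breathable mesh",
    "Hiking boots, ankle support, rugged outsole",
]

vectorizer = TfidfVectorizer().fit(catalog)
catalog_vecs = vectorizer.transform(catalog)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k catalog entries most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), catalog_vecs)[0]
    top = scores.argsort()[::-1][:k]
    return [catalog[i] for i in top]

question = "What should I look for in shoes for muddy trails?"
context = retrieve(question)
# A production system would now pass question + context to an LLM:
prompt = f"Answer using only this catalog context: {context}\nQ: {question}"
print(prompt)
```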
Microsoft has also been busy. It now offers tiered access to Copilot, one tier free and the other paid, following the now-established AI freemium business model. One small yet symbolic note: Microsoft is asking hardware makers to add a Copilot key to keyboards.
Meanwhile, Hugging Face launched a chatbot builder, letting users assemble their own assistants. As a test I quickly set up a futures bot, which you’re welcome to try out. So far it does a good, if generic, job of replying to my questions about the future.
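The builder itself is point-and-click in the browser, but Hugging Face’s huggingface_hub library offers a programmatic analogue. A minimal sketch, assuming access to a hosted chat model; the model choice and the futurist system prompt below are my own illustrative picks, not my bot’s actual configuration:

```python
# Minimal chat sketch via Hugging Face's Inference API. This is not the
# web builder itself, just a programmatic analogue; the model choice and
# system prompt are illustrative assumptions.
from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a futurist. Answer questions about the future."},
    {"role": "user", "content": "What might higher education look like in 2040?"},
]

response = client.chat_completion(messages=messages, max_tokens=300)
print(response.choices[0].message.content)
```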
Identifying AI-generated content: In response to fears of AI-generated content flooding markets or filling our minds with poor or heinous material, people have been scrambling. There are many “watermarking” projects out there, which would (in theory) attach an indelible stamp to any such document. Here’s a good analysis of the possibilities and problems. And Meta just announced it will add watermarks to content carried on Facebook, Instagram, etc., while penalizing users who fail to identify the AI-generated materials they post.
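To get a feel for what an “indelible stamp” means at the very simplest level, here’s a toy least-significant-bit watermark with numpy. Production schemes (including whatever Meta deploys) are far more robust, designed to survive cropping and recompression; this sketch only illustrates the basic embed-and-detect idea:

```python
# Toy invisible watermark: hide a bit pattern in the least significant
# bits of an image's pixels. Real AI-content watermarks are far more
# robust; this only illustrates the embed/detect concept.
import numpy as np

MARK = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)  # 8-bit tag

def embed(image: np.ndarray) -> np.ndarray:
    """Write MARK into the LSBs of the first len(MARK) pixel values."""
    out = image.copy()
    flat = out.reshape(-1)
    flat[: len(MARK)] = (flat[: len(MARK)] & 0xFE) | MARK
    return out

def detect(image: np.ndarray) -> bool:
    """Check whether the LSB tag is present."""
    flat = image.reshape(-1)
    return bool(np.array_equal(flat[: len(MARK)] & 1, MARK))

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
marked = embed(img)
print(detect(marked), detect(img))  # True, (almost certainly) False
```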
Hostile AI architecture: In November I introduced you to Nightshade, an academic project to poison generative AI. Specifically, it introduces altered images to confuse training sets. The project is now live, and you can download version 1 from this University of Chicago site.
Nightshade transforms images into "poison" samples, so that models training on them without consent will see their models learn unpredictable behaviors that deviate from expected norms, e.g. a prompt that asks for an image of a cow flying in space might instead get an image of a handbag floating in space.
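The team’s paper details a carefully optimized attack; as rough intuition, though, the core move is a small, bounded pixel perturbation that drags an image’s features toward another concept. Here’s a toy PyTorch sketch of that idea, with a random-weight feature extractor standing in for a real model; this is emphatically not Nightshade’s actual algorithm:

```python
# Toy illustration of perturbation-based data poisoning, NOT Nightshade's
# actual algorithm: nudge a "cow" image, within a small pixel budget, so
# a feature extractor maps it near a "handbag" image's features.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Random-weight stand-in; a real attack targets the victim model's encoder.
extractor = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, stride=2), torch.nn.ReLU(),
    torch.nn.Flatten(), torch.nn.Linear(8 * 15 * 15, 32),
)
for p in extractor.parameters():
    p.requires_grad_(False)

cow = torch.rand(1, 3, 32, 32)       # image labeled "cow"
handbag = torch.rand(1, 3, 32, 32)   # image of the target concept
target_feat = extractor(handbag)

delta = torch.zeros_like(cow, requires_grad=True)
eps = 0.03                            # keep the change imperceptible
opt = torch.optim.Adam([delta], lr=0.01)

for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(extractor(cow + delta), target_feat)
    loss.backward()
    opt.step()
    delta.data.clamp_(-eps, eps)      # enforce the pixel budget

poisoned = (cow + delta).clamp(0, 1).detach()  # cow to humans, "handbag" to the model
```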
How many people will use this? How many academics?
A few observations:
We’re still in the rapid development phase of generative AI. No sign of hype collapse or AI winter yet.
We continue to see generative AI woven into non-AI apps, like Microsoft’s offerings and Google’s Chrome browser. Again, it’s possible that AI-specific apps are a transitional stage toward a time when the normative user experience is AI woven throughout the digital world.
Music: generative AI continues to increase the size and complexity of its products. That is, text is smaller and simpler than images, which are smaller and simpler than audio, which in turn is smaller and simpler than video, and so on (see the rough byte math sketched after this list).
I do wonder how many chatbots we’ll grow accustomed to in our lives. Will we accept a chatbot in, say, grocery shopping, or buying movie tickets? What proportion of the human race will prefer having an anthropomorphic chatbot (perhaps plus other media) to interact with, à la Replika?
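Here’s that rough byte math: uncompressed, back-of-envelope figures for one “typical” output per medium. All the numbers below are assumptions, not measurements of any product:

```python
# Back-of-envelope, uncompressed sizes for one "typical" output per
# medium. All figures are rough assumptions, not product measurements.
outputs = {
    "paragraph of text (~500 chars)": 500,
    "1024x1024 RGB image": 1024 * 1024 * 3,                        # ~3 MB raw
    "30 s stereo audio (44.1 kHz, 16-bit)": 30 * 44_100 * 2 * 2,   # ~5 MB
    "30 s 1080p video (24 fps, RGB)": 30 * 24 * 1920 * 1080 * 3,   # ~4.5 GB
}
for name, size in outputs.items():
    print(f"{name}: {size:,} bytes")
```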
And that wraps up this experimental scanner. We just focused on technology this time. Next up, stories and observations about politics, business, medicine, and especially education.
Let me know if this new focus model works for you.
I do like this new format, Bryan. Your new focus model gives great "food for thought"!
Double down on the format, though I now have so many things to try that my focus is gone! The Hugging Face bot is interesting. I asked it a silly question:
https://hf.co/chat/r/De3uqA_
So I need to get better at the asking ;-) Appreciated stuff