Weeknotes 288 - Turing test for well-behaving bots

By Iskander Smit in weeknotes — May 14, 2024

This week, some thoughts from The Heart and the Chip, and lots of news on new AI model facelifts.

imagined by Midjourney

Hi, y’all! Welcome to the new subs and readers!

What about last week? A new iPad was introduced (super thin and with a new pencil), and most attention went to an ad… I did not follow all the discussions, but I think the conversation between John Gruber and Ben Thompson was a good summary. And it was an interesting thought that the same movie played in reverse might have worked ok. Opening up, being more positive. But maybe also a bit more boring… Overall, it is interesting to discover how sensitive it all is and how tech is treated more critically. And then the ad misses one thing that gets crushed: the creative brain. Maybe, as Apple plans to announce a full LLM integration, they will update the movie. :-)

On the other end of the world, the academic interaction design community gathers at the yearly CHI conference, the prestigious academic gathering for interaction designers and beyond. A lot was said about the location, as it maximises the need for travel offsets. I have only checked the Best Paper list for now. It's hard to keep up, but I will keep my eye on the accounts of people present.

Google I/O is happening today (as you receive the email; tomorrow for me), and it will probably be all about AI, Gemini AI. Like some years ago, it is promising, but the environment has changed a lot. An interview with Meta’s product guy is informative, with a little sneer at the AI devices debacle and the continuous focus on creating ad models for Gen AI, too.

I am not going into the whole Eurovision saga. Still, one thing is interesting for this newsletter: the fight between manipulating reality and the antidote of social media, especially if you make TikTok an official partner… I never had such a newsy event dominate the TikTok stream, and it turns out that you need specific strategies to disconnect from that black hole… In a column in the Dutch newspaper NRC, Floor Rusman shares a triggering thought: “The more we live in an online universe, the less local facts matter. In imaging, at least. But in political debate, image matters more than reality.” So, is there an immersive reality that goes beyond the usual framing and branding? Do we care at all about reality?

Ok, on to more reflective thoughts…

Triggered thought

In the book The Heart and the Chip, Daniela Rus proposes a reference board for robots and AI to test safety broadly.

Perhaps we need an equivalent agency to monitor robots and AI. I don’t want this regulatory step to impede or dampen innovation, but the establishment of a standardized testing and evaluation program which certifies that a robot or machine intelligence has met the requirements outlined above could be a tremendously powerful and positive force for shaping the future of robotics and ensuring the maximal benefit for humans.

How will this work? Can we build it into AI systems as a list of questions they reply to? This would give insights into decision-making, not directly but as a quiz. How a system responds to certain questions is telling and intriguing. It would be a Turing test for ethical behavior.

Also, she notes that the problems will come more from mistakes than from misdesigns, which contrasts with the focus in development. Of course, there is always the discussion of whether misdesign is not the reason for those same mistakes, directly (by the robot or AI) or indirectly (in responding to the robot).

Another follow-up thought, not new but triggered even more, is how to define robots. We see robots as broader than humanoids. But we can go beyond thinking of robots as a new type of thing and think of them as things enhanced with robotics. Think of a house that works as an ecosystem of decision-making. The house is a robot. Compare with Wijkbot and Crate-bot.

For the new subscribers or first-time readers (welcome!), thanks for joining! A short general intro: I am Iskander Smit, educated as an industrial design engineer, and I have worked in digital technology all my life, with a particular interest in digital-physical interactions and a focus on human-tech intelligence co-performance. I like to (critically) explore the near future in the context of cities of things. I also organise ThingsCon. I call Target_is_New my practice for making sense of unpredictable futures in human-AI partnerships. That is the lens I use to capture interesting news and share a paper every week.

Notions from the news

Human-AI partnerships

It was all around the internet last evening: the introduction of a new version of GPT: GPT-4o. It is not (yet) the big update to the next iteration of intelligence that is expected with GPT-5; the focus is on a much quicker responding system and improvements in the conversation that make the interaction even more human-like, especially when using the voice mode. Also, the other modalities, such as vision recognition, are much smoother. A desktop version has been announced to offer more fluid integrations. It also speaks Dutch quite well, as Peet experienced.

Watching the video makes an impression. It again confirms that the smoothness of the interaction is important for the tool's value. It also makes you wonder how far we are into a kind of AGI situation, or at least at the level of conversation in the movie Her. I am curious if Google I/O will bring back an updated version of the Duplex demo from years ago, too.

Also, the presentation ends with a hint about the next frontier, which will be announced later this year. That should be GPT-5. I think they will closely examine how 4o holds up and how it will be used, in good and bad ways, and whether guardrails are necessary, as it will be even more difficult to distinguish fact from fiction… That’s probably why they introduced the model spec.

The new version is free for all users, which could make it more mundane quickly. Next up is a life-existing tool to which you can delegate other services.

Maybe not a good idea as a single peer, but as a reference peer? Or as a way of making sense of multiple peers?

The future (that is here): Being prompted by prompt engineering tools.

Human-compatible, with a focus not on human values but on human preferences.

Personal AI for personalized interactions.

Limiting yourself can be a good way to make things sharper. Six-word stories. What would it tell us about the thinking of our AI companions if they had to come up with these six-word poems?

Will superhumans be an extension, the evolutionary next state of humans, or the best collaborations?

The new AI assistants and other tools of this week: Claude chatbot, Generative spies.

Robotic performances

Reinforcement learning with robots.

Robots for good.

That Tesla Optimus is zen…

Where are all the autonomous vehicles?

Immersive connectedness

Sometimes you forget there are still these existing smart products being incrementally improved. And AI-touched.

For my reading list

Is singularity now?

Tech soci(e)ties

A plea for analog tech in a digital immersive world. DJing.

Are normcore restaurants a sign of a search for The New Authenticity? (Is AuthentiCITY already claimed, btw?)

More tech revolt by Stack Overflow users

Water-based logistics in Amsterdam makes a lot of sense, except considering the congestion we experience on certain days.

The internet as we know it now will change phase

Paper for the week

Next week, looking back, one from CHI; for now, one on ridesourcing platforms:

Limited available market share data seems to suggest that ridesourcing platforms benefit from, even thrive on, socio-economic inequality. We suspect that this is associated with high levels of socio-economic inequality allowing for cheap labour as well as increasing the share of travellers with a considerably above-average willingness to pay for travel time savings and comfort. We test the relation between inequality and system performance by means of an agent-based simulation model representing within-day and day-to-day supply-demand interaction in the ridesourcing market.

de Ruijter, A., Cats, O. & van Lint, H. Ridesourcing platforms thrive on socio-economic inequality. Sci Rep 14, 7371 (2024). https://doi.org/10.1038/s41598-024-57540-x

Looking forward

I updated my personal website (Target is New as a company, or better, a practice). Curious to hear what you think :)

This week is dedicated to finishing the exploration project and having multiple conversations. I don’t have plans yet; who knows where I will end up:

14 May - Amsterdam - UX research topic
15 May - London - Interesting
15 May - Amsterdam - AI in product management
15 May - running LLMs locally, DIY workshop by Sensemakers
17 May - Amsterdam - Breaking the Meme: Critical Meme Reader III Launch
21 May - online - Ancestral AI - Postmortem Privacy, at SPUI25, Amsterdam.

Enjoy your week!