Weeknotes 288 - Turing test for well-behaving bots

This week, some thoughts from The Heart and the Chip, and lots of news on AI model facelifts.

Image: Turing test for well-behaving bots, imagined by Midjourney

Hi, y’all! Welcome to the new subs and readers!

What about last week? A new iPad was introduced (super thin and with a new pencil), and most attention went to an ad… I did not follow all the discussions, but I think the conversation between John Gruber and Ben Thompson was a good summary. One interesting thought was that the same movie played in reverse might have worked fine: opening up, being more positive. But maybe also a bit more boring… Overall, it is interesting to discover how sensitive it all is and how tech is treated more critically. And one thing is missing from what gets crushed: the creative brain. Maybe, as Apple plans to announce a full LLM integration, they will update the movie. :-)

Pluralistic: AI “art” and uncanniness (13 May 2024) – Pluralistic: Daily links from Cory Doctorow

On the other end of the world, the academic interaction design community gathers at the yearly CHI conference, the prestigious academic gathering for interaction designers and beyond. A lot was said about the location, as it maximises the travel to offset. I checked the Best Paper list for now. It's hard to keep up, but I will keep my eye on the accounts of people present.

Google I/O is happening today (as you receive the email; tomorrow for me), and it will probably be all about AI, Gemini AI. Like some years ago, it is promising, but the environment has changed a lot. An interview with Meta’s product guy is informative, with a little sneer at the AI devices debacle and the continuous focus on creating ad models for Gen AI, too.

I am not going into the whole Eurovision saga. Still, one thing is interesting for this newsletter: the fight between manipulating reality and the antidote of social media, especially if you make TikTok an official partner… I never had such a newsy event dominate the TikTok stream, and it turns out that you need specific strategies to disconnect from that black hole… In a column in the Dutch newspaper NRC, Floor Rusman shares a triggering thought: “The more we live in an online universe, the less local facts matter. In imaging, at least. But in political debate, image matters more than reality.” So, is there an immersive reality that goes beyond the usual framing and branding? Do we care at all about reality?

Ok, on to more reflective thoughts…

Triggered thought

In the book The Heart and the Chip, Daniela Rus proposes a reference board for robots and AI to test safety broadly.

Perhaps we need an equivalent agency to monitor robots and AI. I don’t want this regulatory step to impede or dampen innovation, but the establishment of a standardized testing and evaluation program which certifies that a robot or machine intelligence has met the requirements outlined above could be a tremendously powerful and positive force for shaping the future of robotics and ensuring the maximal benefit for humans.

How will this work? Can we build it into AI systems as a list of questions it replies to? This would give insights into decision-making, not directly but as a quiz. How it responds to certain questions is telling and intriguing. It would be a Turing test for ethical behavior.

Also, she notes that the problems will come more from mistakes than from misdesigns, which contrasts with the focus in development. Of course, there is always the discussion of whether misdesign is not the reason for those same mistakes, directly (by the robot or AI) or indirectly (in responding to the robot).

Another follow-up thought, not new but triggered even more, is how to define robots. We see robots as broader than humanoids. But go beyond thinking of robots as a new type of thing and think of them as things enhanced with robotics. Think of a house that works as an ecosystem of decision-making. The house is a robot. Compare with Wijkbot and Crate-bot.

For the new subscribers or first-time readers (welcome!), thanks for joining! A short general intro: I am Iskander Smit, educated as an industrial design engineer, and I have worked in digital technology all my life, with a particular interest in digital-physical interactions and a focus on human-tech intelligence co-performance. I like to (critically) explore the near future in the context of cities of things. I also organise ThingsCon. I call Target_is_New my practice for making sense of unpredictable futures in human-AI partnerships. That is the lens I use to capture interesting news and share a paper every week.

Notions from the news

Human-AI partnerships

It was all over the internet last evening: the introduction of a new version of GPT, GPT-4o. It is not (yet) the big leap to the next iteration of intelligence that is expected with GPT-5; the focus is on a much faster-responding system and on improvements in the conversation that make the interaction even more human-like, especially in voice mode. The other modalities, such as vision recognition, are also much smoother. A desktop version has been announced to offer more fluid integrations. It also speaks Dutch quite well, as Peet experienced.

Watching the video makes an impression. It again confirms that the smoothness of the interaction is important for the tool's value. It also makes you wonder how far we are into a kind of AGI situation, or at least at the level of conversation in the movie Her. I am curious if Google I/O will bring back an updated version of the Duplex demo from years ago, too.

Also, the presentation ends with a hint about the next frontier, which will be announced later this year. That should be GPT-5. I think they will closely examine how 4o holds up and how it will be used, in good and bad ways, and whether guardrails are necessary, as it will be even more difficult to distinguish fact from fiction… That’s probably why they introduced the model spec.

OpenAI’s Model (behavior) Spec, RLHF transparency, and personalization
Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects — shrinking the Overton window of RLHF bugs.

The new version is free for all users, which could make it mundane quickly. Next up is a life-assisting tool to which you can delegate other services.

This Is the Next Smartphone Evolution
OpenAI just killed Siri.

Maybe not a good idea as a single peer, but as a reference peer? Or as making sense of multiple peers.

Researchers warned against using AI to peer review academic papers | Semafor
Top AI conferences and academic publishers worry about intellectual integrity as more researchers use tools like ChatGPT

The future (that is here): Being prompted by prompt engineering tools.

Microsoft is ‘turning everyone into a prompt engineer’ with new Copilot AI features
Auto-complete and rewrite are coming to Copilot for Microsoft 365 prompts.

Human-compatible AI, with the focus not on human values but on human preferences.

Human-Compatible AI | NOEMA
“Putting values” in machines is risky business.

Personal AI for personalized interactions.

Personal vs. Personalized AI
There is a war going on. Humanity and nature are on one side and Big Tech is on the other. The two sides are not opposed. They are orthogonal. The human side is horizontal and the Big Tech side is …

Limiting yourself can be a good way to make things sharper. Six-word stories. What would it tell us about the thinking of our AI companions if they had to come up with these six-word poems?

Will superhumans be an extension, the evolutionary next state of humans, or the best collaborations?

Superhuman?
What does it mean for AI to be better than a human? And how can we tell?

Google DeepMind’s Groundbreaking AI for Protein Structure Can Now Model DNA
Move over, chatbots. This upgraded AI can model antibodies, DNA, and molecules from disease organisms. This next generation of AlphaFold, from Google DeepMind, is poised to significantly advance drug development.

The new AI assistants and other tools of this week: Claude chatbot, generative spies.

Robotic performances

Reinforcement learning with robots.

Exploration-focused training lets robotics AI immediately handle new tasks
Maximum Diffusion Reinforcement Learning focuses training on end states, not process.

Robots for good.

Researchers build microrobots to remove microplastics from water
When old food packaging, discarded children’s toys and other mismanaged plastic waste break down into microplastics, they become even harder to clean up from oceans and waterways. Researchers are turning to microrobots for help.

That Tesla Optimus is zen…

A video of Tesla’s new humanoid robot leaves actual humans less than impressed
Elon Musk says Tesla is a robotics and AI company now

Where are all the autonomous vehicles?

Where Are All the Autonomous Vehicles?
Three reasons why you shouldn’t sell your car just yet

Immersive connectedness

Sometimes you forget that existing smart products are still being incrementally improved. And AI-touched.

Looking for the Best Smart Scale? Step On Up
If you’re ready to start tracking your weight, BMI, and other critical health data on your phone, we’ve weighed in on some great options.

For my reading list

How to win at Enterprise AI - A playbook
The Service-as-a-software playbook

Is the singularity now?

The Singularity is Now
Rethinking an iconic idea

Tech soci(e)ties

A plea for analog tech in a digital immersive world. DJing.

The DJ Technology Edition
On Grimes, Steve Jobs, and returning to vinyl

Are normcore restaurants a sign of a search for The New Authenticity? (Is AuthentiCITY already claimed, btw?)

Normcore, Illegibility, and the Fear of Mid
The internet made the world more crowded - or does it just feel that way?

More tech revolt, this time by Stack Overflow users.

Stack Overflow Users Are Revolting Against an OpenAI Deal
Members of the software developer community have reported deleting or altering their posts to prevent them from being used by OpenAI.

Water-based logistics in Amsterdam makes a lot of sense, except considering the congestion we experience on certain days.

Using Amsterdam waterways for city logistics
A recent study by Dutch TNO explores the potential of shifting transportation modes from roads to waterways to mitigate adverse externalities like CO2 emissions, congestion, and air pollution associated with last-mile construction logistics in urban settings. The investigation was carried out within the framework of the Amsterdam Vaart! project, where the transportation activities of sixteen…

The internet as we know it now is about to change phase.

The death (again) of the internet as we know it
A few big changes are making the online world a more boring place to hang out.

Paper for the week

Next week, looking back, one from CHI; for now, one on ridesourcing platforms:

Limited available market share data seems to suggest that ridesourcing platforms benefit from, even thrive on, socio-economic inequality. We suspect that this is associated with high levels of socio-economic inequality allowing for cheap labour as well as increasing the share of travellers with a considerably above-average willingness to pay for travel time savings and comfort. We test the relation between inequality and system performance by means of an agent-based simulation model representing within-day and day-to-day supply-demand interaction in the ridesourcing market.

de Ruijter, A., Cats, O. & van Lint, H. Ridesourcing platforms thrive on socio-economic inequality. Sci Rep 14, 7371 (2024). https://doi.org/10.1038/s41598-024-57540-x

Looking forward

I updated my personal website (Target is New as a company, or better, a practice). Curious to hear what you think :)

This week is dedicated to finishing the exploration project and having multiple conversations. I don’t have plans yet; who knows where I will end up.

Enjoy your week!
