The last time I tried to reflect on a new Apple category here was shortly after the AirPods launched back in 2016. My thoughts then were about the potential impact of intelligence in an audio device on the phone ecosystem. I am not going to grade that ‘prediction’, but half a decade on, the impact of immersive audio devices as touchpoints is certainly felt. In that early period, the AirPods were still seen as weird, even ugly devices that could never become successful. I don’t have to make the case that it turned out differently.

The AirPods are now often mentioned in comparison with the responses to Apple’s introduction of the Vision Pro and visionOS. That makes partial sense from an emotional, receptive standpoint, but the more interesting comparison is about the role of technology in our lives: the computational filter in our hearing has become an accepted, integrated part of it. Many people wear AirPods (and other brands) all the time, even during conversations; it is almost fully accepted, and transparency mode often sounds clearer than reality. In the audio domain, we already live in a synthetic context.
To give away my conclusion on the Vision Pro up front: even more than with the AirPods, this is not the introduction of a new device; it is about a new relation to technology in our social context, in the way technology mediates our experience and creates a synthetic layer. It makes sense that it is called visionOS and not realityOS: it enhances our vision in a new way. The current goggle form factor, high-end or not, is not the ultimate device; it is, however, an ultimate implementation of the synthetic layer, and it will become the testbed for shaping how we will interact through that layer. That has two aspects: (1) the interaction with a device that intervenes in our vision, and (2) the role a synthetic living environment will come to play.
First: reality is the foundation to build on.
As Casey Newton framed it on the Hard Fork podcast, the essence is that we are looking at a moment where we are wearing computers on our faces. He may have intended the frame differently, but it captures the point: the current goggles are just one form factor for wearing computing capabilities on our faces. If you believe this is a sensible thing, the big question is how we will come to interact with this computing (not the computer, the computing capabilities).
The Vision Pro is the fully loaded version that aims to remove every possible barrier to understanding and experiencing what that will mean. I think Apple wants to create a starting position that adds the computational layer as fully as possible without interfering with your sight. Several testers who wore the device describe that the most impressive part is how unintrusive this filter on your head is to the experience of reality. The cameras, the sensors, and the computing power together create a point of departure that is seamless.
The next step is to add elements to this computational canvas, step by step. The first elements are probably rather mundane. Having an App Store is table stakes. Next is the possibility of placing large screens all over the place. All of it amounts to finding a nearby iteration of known concepts from the real world. In that sense, Apple is creating a desktop metaphor for spatial computing, just as the desktop metaphor started the era of desktop computing. And Apple stays as close as possible to the viewing concept in the interaction: looking at the icons to select them is the proposed interaction method.
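To make that spatial desktop metaphor a bit more concrete, here is a minimal sketch in SwiftUI, assuming the standard APIs available on visionOS (WindowGroup, hoverEffect). The app name, the tile names, and the grid layout are purely hypothetical illustrations, not Apple’s actual implementation: a floating window of app-like tiles where looking at a tile highlights it and the indirect tap (the pinch) selects it.

```swift
import SwiftUI

// Hypothetical sketch of the "desktop metaphor in spatial computing":
// a window with a grid of app-like tiles. On visionOS, looking at a
// focusable element highlights it (hover effect); a pinch activates it.
@main
struct SpatialDesktopSketch: App {
    var body: some Scene {
        WindowGroup {
            AppGrid()
        }
    }
}

struct AppGrid: View {
    // Placeholder tile names, purely illustrative.
    let tiles = ["Mail", "Safari", "Photos", "Notes"]

    var body: some View {
        LazyVGrid(columns: [GridItem(.adaptive(minimum: 120))], spacing: 24) {
            ForEach(tiles, id: \.self) { name in
                Button {
                    print("Opened \(name)")   // stand-in for opening another window
                } label: {
                    Label(name, systemImage: "app")
                        .padding()
                }
                .hoverEffect(.highlight)       // gaze highlight: "looking to select"
            }
        }
        .padding()
    }
}
```

The point of the sketch is how little is new at this stage: the grid of icons is the familiar desktop concept, and only the selection mechanism (gaze plus pinch instead of a pointer) changes.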
We will see new interaction concepts emerge, driven by app makers and other companies, and the ones that make sense will be merged into the OS, as always.
Apple has created a style guide for spatial computing. It is interesting to dive into earlier experiments, art projects, and academic research to develop the interaction framework further. I expect some of the presented concepts will already have been updated by the time the Vision Pro goes on sale next year.
Second: find the right agency in the synthetic environment.
So the canvas is created as a blank sheet for new concepts: fundamentals that are rigid enough to build upon but open enough to stimulate new ideas.
A synthetic layer that is truly immersive for the first time is the next step in the ongoing development of synthesizing our experience of the world. The Vision Pro is not only creating a truly immersive canvas; it is potentially the first real immersive synthetic way of viewing the world. We already filter our audio continuously, and of course there is a lot of discussion about the other side: synthetic media and the potential fake realities we now need to deal with.
I don’t want to dive too deep into the relationship between synthetic experience and the concept of technological mediation, but it is interesting to think about the consequences here. I was reminded of a short introduction to the thinking of Don Ihde by Peter-Paul Verbeek, and I think there will be a growing need for agency in a fully mediated experience. Who, in the end, controls our perception, our vision, if everything is mediated all the time? This is what is called hyperreality.
One important thing Apple is trying to do is secure trust in the technology. Where Meta’s introduction of VR was met with reluctance, Apple’s privacy reputation and the lack of a primary ad model might open up possibilities for creating and extending concepts in the synthetic world.
The new Vision is, here again, supported by the setup of a trusted framework to develop on. The uncanny valley of synthetic experiences will become part of those explorations.
What is super interesting here is the merger with generative AI tools for vision-related objects. Will Apple integrate generative AI tooling into the Vision toolkit? Will it acquire Midjourney to build an immersive, real-time way of shaping the world and the artifacts of the synthetic experience? I expect different scenarios have already been developed.
Finally, it is interesting to compare this with the boom of ChatGPT and the importance of the interface there. A system where interaction with generative AI both creates a popular service and helps make the AI more intelligent could be an outcome of using Vision too. Will interacting with generative images become just as powerful as interacting with large language models? Is spatial computing for this what chat is for GPT?
So Vision Pro is all about a new way of looking.
When Google Glass was introduced, what I found most interesting was the new model of timely information and interaction. That is also the case with Vision. It is not about the goggles, and not even about the apps themselves, but about the potential new vision we will have and interact with. That might be the biggest impact of Vision. As a side note: the refreshed introduction of interactive widgets in the OS might even be a preparation for a more timely interaction paradigm.
It can still go wrong. The strategy of letting the Apple Watch and AirPods grow into a social position only worked because people actually used them. The current first iteration might be too expensive for us to run into it regularly. But maybe an SE version will be introduced next to the Pro as soon as next year.
Goggles are not the future of computing; a computer on your face might be, though. I won’t say that we have entered the age of goggles, but we may very well have entered the age of a new synthetic vision.