The Apple Vision Pro headset became available in the US on February 2, marking a long-awaited day, especially for early tech adopters and those working in the spatial computing industry. There have been many thorough reviews of the Vision Pro - in fact, so many it’s hard to keep up. The general consensus is that the headset is a technical marvel, especially its passthrough display and hand and eye tracking. The main criticisms are that there aren’t many truly “spatial computing” apps and that the headset is uncomfortable for long-term wear. This week’s edition applies a user experience lens to this news and synthesizes three (perhaps) less obvious themes we’re seeing in these early days of Vision Pro use.
Theme 1: Interoperability, not virtual monitors, is what’s unlocking work use cases
A key Vision Pro use case people were excited about was putting up giant virtual monitors in their physical environment to get work done. However, as Nilay Patel of The Verge reported, “you can only have a single Mac display in visionOS. You can’t have multiple Mac monitors floating in space.” OK, what about multiple windows? You can do that, but as Ben Thompson of Stratechery writes, the “user interface is exceptionally difficult to manage once you have multiple windows on the screen”.
Where reviewers do give high scores across the board is interoperability within the Mac ecosystem. As the Wall Street Journal’s Joanna Stern says in her 24 hours in the headset video: “The real game changer for getting work done? Adding in a Mac”. She goes on to show how a giant virtual desktop appears above her MacBook, with Vision Pro apps running to the left of the desktop. This Apple interoperability was a high point even for Patel, who had just expressed disappointment at the lack of multiple Mac monitors: “Mac display sharing works really well, and Apple ecosystem tricks like Handoff and Continuity are pure magic in this context. You can copy on your Mac and paste in visionOS, and it just works.”

Putting this all together, interoperability within the Apple ecosystem seems to be what delivers user value when using the Vision Pro for work productivity. Factors that could increase this value include the ability to project two or more Mac monitors and a more intuitive app management system (especially when it comes to arranging apps depth-wise, or in “z-space”).
It’s worth a footnote that productivity isn’t limited to work tasks. There are also everyday tasks that fall under the productivity umbrella, and the Vision Pro was clearly designed with these use cases in mind. Apple itself wrote that the Vision Pro “is an ideal productivity tool for everyday tasks”, with numerous productivity apps built for the new headset (e.g., Planny for task management or Hold On to set spatial timers around your physical environment). These everyday-task experiences (especially the ability to place multiple timers over different pots as you’re making dinner) have been receiving positive feedback.
Theme 2: People wearing the Vision Pro outside - a “+1” for passthrough
Apple advises using the Vision Pro in a controlled indoor or outdoor space. Naturally, it only took a few hours for examples of use in very uncontrolled (and often dangerous) environments to emerge. Notable examples included use while riding an electric skateboard, at the gym, and in a Tesla (more examples here and here). People are often consuming media in these scenarios (think watching a movie at the gym). Why are people wearing their Vision Pros in these settings? What can we learn from this user behavior?
One obvious explanation is that the Vision Pro, with its $3,500 price tag, is a status symbol: some people will want to show it off in public regardless of Apple’s warnings, seeking to go viral wearing this shiny, new gadget. It’s also simply novel - people buy a new device and may want to try it out in different environments.
I hypothesize something else might also be at play: at the level of quality Apple has achieved, passthrough mixed reality (MR) can be a compelling user experience in a range of settings, including outside.
Let’s look at a quote from the electric skateboard wearer, YouTuber Casey Neistat: "something happened today that was completely unexpected...after a couple of hours of running around the streets of New York...my brain sort of clicked, and I just sort of forgot that I was looking through cameras and screens, and it took what it saw as reality."

This quote, and the many examples of outdoor use, suggest the promise of valuable MR passthrough use cases. While many people dismiss passthrough MR as an awkward evolutionary step or stop-gap on our quest towards optical augmented reality (AR) smart glasses, this early signal suggests that passthrough MR, at least at the level of quality Apple has achieved, holds real promise. (Optical AR is when light passes directly through unobtrusive glasses to your eyes, with digital information layered over the top of what you’re seeing.) As AR pioneer Tom Emrich writes, “It is very likely that we will see MR passthrough devices on the heads of people indoors and outdoors long before optical see-through AR devices find meaningful consumer adoption”.
What we’re observing with the Vision Pro’s MR passthrough is likely the “early PC days” of spatial computing headsets. The kernel of value these “extreme users” find in these scenarios will be worth tracking over time, as it provides signals about future use cases for both passthrough and optical AR. In the meantime, let’s remember that our user experience North Star is not a visually-cluttered Hyper-Reality where users have little situational awareness. And, it goes without saying, don’t do dangerous stuff that can harm you or others around you while wearing the Vision Pro.
Theme 3: The uncanny valley of Personas (for now)
The uncanny valley describes a person’s negative reaction to nearly-lifelike robots or digital representations of humans, and it is what many early Vision Pro users felt when using the Persona feature. Apple defines Persona (currently in Beta) as “a dynamic, natural representation of your face and hand movements that allows others to see you while you’re using Apple Vision Pro for FaceTime and other videoconferencing apps”. To capture their Persona, people scan their face using the Vision Pro, holding the device in front of them to capture their likeness for later use on calls, when their face would otherwise be obscured by the headset.

The resulting scan is firmly in the uncanny valley according to many early adopters, and highlights the challenges of photoreal avatars. Interestingly, Personas are supported in the recent launch of Zoom for Apple Vision Pro, “to make hybrid collaboration more immersive”. Until we either improve avatar quality or provide alternatives (e.g., representing your identity however you want, with an avatar that doesn’t strive for a 1-to-1 match with how you look IRL), I predict people will stick to 2D representations of themselves and save spatial sharing on calls for 3D objects (e.g., as part of media or design workflows). Apple seems to be acutely aware of this limitation, and released more detailed, lifelike Personas in their visionOS 1.1 update just yesterday.
Wrap-up
While it is exciting to read these reviews in the first week of the Vision Pro’s real-world use, I also take them with a grain of salt. The device is novel, and the use cases that deliver lasting user value remain to be seen. The feedback I care most about will come in six months, and will speak to whether and how early adopters find durable user value in the device. Questions I’m interested in then include: Who (if anyone) is still using the Vision Pro daily or weekly? How (if at all) are they using it? And how does this compare to the long-term use by early adopters of the Meta Quest 3?
I’m also tracking how the Vision Pro evolves within the broader landscape of AI-powered ubiquitous computing, with devices like the Rabbit R1 pocket companion boasting a 360-degree, contextually-aware camera to sense and help synthesize knowledge about your environment. To quote Patrick Keenan in a recent Machine is the Message podcast, we may start to care “less about the form factor of me looking at a screen, and more the form factor of the AI looking at the world”. Despite being a spatial computing device, the Vision Pro is still firmly rooted in Apple’s screen-based legacy, with app-based interfaces. And to Andrej Karpathy’s point, there currently aren’t that many truly “spatial computing” apps for the Vision Pro.
I’m looking forward to seeing spatial computing headsets help us better understand and navigate our environments, and leverage gen AI to streamline tasks and anticipate needs, rather than replicating the apps and taps required to complete tasks on laptops and mobile devices.
Human Computer Interaction News
February 2024 report on the world’s most used gen AI tools highlights adoption of chat buddies and homework help: FlexOS surveyed gen AI platforms to reveal which tools get used most, using web traffic and search rankings as proxies for usage. While you may be unsurprised to read that ChatGPT (including DALL-E) was #1, there were a few less obvious findings: Character AI came in #4, and three of the top 10 were education tools (Brainly for homework help at #6, CourseHero for tutoring at #7, and Turnitin for AI detection at #9). Character AI was disproportionately visited by the youngest measured visitor group (56.7% of visitors were 18-24), who confided in AI chat buddies for an average of two hours per day (!). Together with the high use of gen AI-powered education tools, this finding illustrates what a future workplace might look like, with a generation that has grown used to augmenting their learning and work with gen AI.
Nielsen Norman Group UX research provides insight into how people structure their gen AI chatbot prompts: People commonly used “response outlining” - including format or structure specifications in their prompts - to ensure that the chatbot’s response satisfies those requirements (e.g., numbering your prompt 1, 2, 3, 1a, 1b so that ChatGPT’s output follows the same numbering). While response outlining is effective, the tactic increases users’ effort, and people may only realize they need it after getting unsatisfactory responses from the AI tool. Helping users add extra specificity to their prompts can reduce this articulation barrier and improve the usability of generative AI interfaces. A minimal sketch of what response outlining looks like in practice appears below.
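To make response outlining concrete, here is a rough sketch of a prompt that spells out the structure the reply should follow, sent via the OpenAI Python SDK. The model name, topic, and numbering scheme are illustrative assumptions on my part, not details from the NN/g study.

```python
# pip install openai
# A minimal sketch of "response outlining": the prompt itself spells out the
# exact structure the answer should follow, so the model mirrors that numbering.
# The model name and example topic are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

outlined_prompt = """Summarize the Apple Vision Pro reviews using exactly this outline:
1. Display and passthrough
   1a. Strengths
   1b. Weaknesses
2. Comfort and ergonomics
3. Work/productivity use cases
Keep each numbered item to two sentences or fewer."""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; swap in whatever you use
    messages=[{"role": "user", "content": outlined_prompt}],
)

# Because the desired 1 / 1a / 1b outline is articulated up front, the reply
# typically follows the same structure - at the cost of the extra user effort
# (the "articulation barrier") the NN/g research describes.
print(response.choices[0].message.content)
```

The point is the prompt, not the API call: the user does the structuring work up front, which is exactly the effort the research suggests interfaces should help reduce.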
The Browser Company announced “Act II” for Arc, ‘The Browser That Browses For You’: Arc is targeting what their CEO Josh Miller has called ‘a post-Google Internet’ by implementing AI within the browsing experience. Current features include ‘Ask On Page’, which answers questions about the contents of webpages, and ‘5 Second Previews’, which summarizes the webpage at the other end of a link. Four new features have been announced, all around the theme of ‘the browser that browses for you’. One of these features involves typing a search query into the address bar and selecting ‘Browse For Me’. This yields a personalized webpage with key points on your query and links to webpages that might be useful. I can recommend this feature for getting a bird’s-eye view of the many Apple Vision Pro reviews.
Looking to measure user value of your Apple Vision Pro app? Sendfull can help. Reach out at hello@sendfull.com
That’s a wrap 🌯 . More human-computer interaction news from Sendfull next week.