Today’s mobile devices are incredibly sophisticated computers with multi-gigahertz, multi-core processors, hardware-accelerated graphics, network connectivity, and many gigabytes of high-speed memory. Indeed, the smartphones we carry in our pockets today would have been deemed supercomputers just twenty years ago.

Yet, the fundamental usability issue with mobile devices is apparent to anyone who has used one: they are small. Achieving mobility through miniaturization has been both their greatest success and most significant shortcoming.

Developments in human-computer interfaces have significantly trailed behind the tremendous advances in electronics. As such, we do not use our smartphones and tablets like our laptops or desktop computers. This issue is particularly acute with wearable devices, which must be even smaller in order to be unobtrusive.

Within the last year, the first standalone consumer smartwatches have emerged, a tremendous milestone in the history of computing, but the industry is still struggling with applications for this platform. At present, the most compelling uses are notifications and health tracking, a far cry from their true potential.

Developing and deploying a new interaction paradigm is high risk; it is easier and more profitable to quickly follow than it is to lead such a revolution. Few companies have survived the quest to lead a revolution. This has fostered an industry where players keenly watch one another and are quick to sue, and where there is little public innovation. Comparing a 2007 iPhone 1 to a 2014 iPhone 5s, it is easy to see that while computers have continued to get faster and better, the core interactive experience really hasn’t evolved at all in nearly a decade.

The Rich-Touch Revolution

“Each morning begins with a ritual dash through our own private obstacle course—objects to be opened or closed, lifted or pushed, twisted or turned, pulled, twiddled, or tied, and some sort of breakfast to be peeled or unwrapped, toasted, brewed, boiled, or fried. The hands move so ably over this terrain that we think nothing of the accomplishment.”—Frank Wilson (The Hand)

Touch interaction has been a significant boon to mobile devices, by enabling direct manipulation interfaces and allowing more of the device to be dedicated to interaction. However, in the seven years since multi-touch devices went mainstream, primarily with the release of the iPhone, the core user experience has evolved little.

Contemporary touch gestures rely on poking screens with different numbers of fingers: one-finger tap, two-finger pinch, three-finger swipe and so on. For example, a “right click” can be triggered with a two-fingered tap. On some platforms, moving the cursor vs. scrolling is achieved with one or two finger translations respectively. On some Apple products, four-finger swipes allow users to switch between desktops or applications. Other combinations of finger gestures exist, but they generally share one commonality: the number of fingers parameterizes the action.
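To make the pattern concrete, here is a minimal Python sketch of such a gesture table. The table entries and the name `resolve_gesture` are illustrative assumptions, not any platform's actual API; the point is only that the lookup key is little more than a finger count paired with a motion.

```python
# Hypothetical gesture table: the action is chosen from the finger count
# and the motion alone, mirroring the examples in the paragraph above.
GESTURES = {
    (1, "tap"): "select",
    (2, "tap"): "right click",
    (1, "drag"): "move cursor",
    (2, "drag"): "scroll",
    (2, "pinch"): "zoom",
    (4, "swipe"): "switch application",
}


def resolve_gesture(finger_count: int, motion: str) -> str:
    """Look up the action for a gesture; unknown combinations are ignored."""
    return GESTURES.get((finger_count, motion), "unrecognized")


print(resolve_gesture(2, "tap"))
print(resolve_gesture(4, "swipe"))
```

Under this scheme, adding a new command means claiming yet another unused finger count.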

This should be a red flag: the number of digits employed does not characterize actions we perform in the real world. For example, I do not two-finger drink my coffee, or three-finger sign my name. Finger count is simply not a human-centered dimension, nor is it particularly expressive! Instead, we change the mode of our hands in the world (and in turn, the tools we wield) by varying the configuration of our hands and the forces that our fingers apply. Indeed, the human hand is incredible, yet we boil this input down to 2-D location on today’s touch devices.

Fortunately, with good technology and design, we can elevate touch interaction to new heights. This has recently led to a new area of research, one that looks beyond multi-touch and aims to create a new category of “rich-touch” interactions. Whereas multi-touch was all about counting the number of fingers on the screen (hence the “multi”), rich-touch aims to digitize the complex dimensions of input our fingers and hands can express: shear force, pressure, grasp pose, part of finger, ownership of said finger and so on. These are all the rich dimensions of touch that make interacting in the real world powerful and fluid.

The Early Days of Rich-Touch Interaction

Initial research has already proven successful. One such technology, developed by my team when we were graduate students at Carnegie Mellon University’s Human-Computer Interaction Institute, is FingerSense.

The technology uses acoustic sensing and real-time classification to allow touchscreens to know not only where a user is touching, but also how they are touching: for example, with the fingertip, knuckle, or nail. The technology is currently being developed for inclusion in upcoming smartphone models to bring traditional “right-click” style functions into the mix, among many other features.

touch using finger


touch using knuckle


touch using nail
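The idea of classifying how a touch was made can be sketched in a few lines of Python. This is not FingerSense's actual algorithm: the two features, the centroid values, and the action mapping below are all made-up assumptions, and a nearest-centroid rule stands in for whatever classifier a real system would use.

```python
import math

# Made-up centroids in a hypothetical two-feature space:
# (spectral centroid in kHz, normalized peak amplitude). Not measured data.
CENTROIDS = {
    "fingertip": (1.0, 0.2),
    "knuckle": (3.0, 0.8),
    "nail": (6.0, 0.5),
}

# Hypothetical mapping from touch type to an application action.
ACTIONS = {
    "fingertip": "select",
    "knuckle": "context menu",
    "nail": "erase",
}


def classify_touch(features):
    """Return the touch type whose centroid is nearest to the features."""
    return min(CENTROIDS, key=lambda label: math.dist(features, CENTROIDS[label]))


def handle_touch(x, y, features):
    """Combine where the touch landed with how it was made."""
    touch_type = classify_touch(features)
    return (x, y, touch_type, ACTIONS[touch_type])


print(handle_touch(120, 340, (2.9, 0.7)))
```

The same screen location can now trigger different actions, so a knuckle tap can stand in for a “right click” without a second finger.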


Another illustrative project is TouchTools, which draws on users' familiarity and motor skills with physical tools from the real world. Specifically, users replicate a tool’s corresponding real-world grasp and press it to the screen as though it were physically present. The system recognizes this pose and instantiates the virtual tool (for example, a dry erase marker or a camera) as if it were being grasped at that position. Users can then translate, rotate, and otherwise manipulate the tool as they would its physical counterpart. For example, a marker can be moved to draw, and a camera’s shutter button can be pressed to take a photograph.
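As a rough Python sketch of the instantiation step, suppose a recognizer has already named the tool from the grasp pose. The recognizer itself is assumed and not shown, and `VirtualTool` and `place_tool` are hypothetical names rather than TouchTools code. The tool is placed at the centroid of the contact points and oriented along the line from the first contact to the last.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualTool:
    kind: str         # e.g. "marker" or "camera", supplied by the assumed recognizer
    x: float          # tool position: centroid of the contact points
    y: float
    angle_deg: float  # tool orientation in degrees


def place_tool(kind, contacts):
    """Instantiate a tool where the hand is, given (x, y) contact points."""
    cx = sum(px for px, _ in contacts) / len(contacts)
    cy = sum(py for _, py in contacts) / len(contacts)
    (x0, y0), (x1, y1) = contacts[0], contacts[-1]
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return VirtualTool(kind, cx, cy, angle)


tool = place_tool("marker", [(0.0, 0.0), (1.0, 1.5), (2.0, 2.0)])
print(tool)
```

Re-running `place_tool` as the contacts move would let the virtual tool translate and rotate with the hand.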


In the same way as using our hands in the real world, both FingerSense and TouchTools provide fast and fluid mode switching, which is generally cumbersome in today’s interactive environments.

Contemporary applications often expose a button or toolbar that allows users to toggle between modes (e.g., pointer, pen, eraser modes) or require use of a special physical tool, such as a stylus. FingerSense and TouchTools can utilize the natural modality of our hands, rendering these accessories superfluous.

Learning from Human-Computer Interaction

For the past half-century, we’ve believed that the manifestation of tools in computing environments means providing a toolbar to users (like those seen in illustrating programs) and, in general, having them click buttons to switch modes. However, this is incredibly simplistic, does not scale well to small device sizes, and requires constant mode switching. For example, on touchscreens, the inability to disambiguate between scrolling and selection has made something as commonplace as copy and paste a truly awkward dance of the fingers.

Instead, computers should utilize the natural modality and power of our fingers and hands to provide powerful and intuitive mode switching. If we are successful, the era of poking our fingers at screens will end up feeling rather archaic. Instead, we will have interactive devices that leverage the full capabilities of our hands, matching the richness of our manipulations in the real world.

Combined with the fact that the digital world allows us to escape many mundane physical limitations (e.g., items cannot disappear, or rewind in time), it seems likely we can craft interactive experiences that exceed the ability of our hands in the real world for the first time. For example, today I can’t sculpt virtual clay nearly as well as I can real clay with my bare hands. However, if we can match the capability through superior input technologies, it is inevitable that we will exceed those capabilities, which is the true promise of computing.

To realize the full potential of computing on the go, we must continue to innovate powerful and natural interactions between humans and mobile computers. This entails the creation of both novel sensing technologies and interaction techniques. Put simply: we either need to make better use of our fingers in the same small space, or give them more space to work within.


Image of elegant hands courtesy Shutterstock.

Article No. 929 | January 2, 2013
Article No. 1 248 | June 3, 2014
Article No. 1 244 | May 27, 2014

Hi Chris,

I found your concepts and studies interesting. If I got it right, you are proposing an enhancement of hand recognition on screens and also applying more natural gestures to mimic real-world actions. Don’t you think the industry is actually moving away from those paradigms (flat, metro design) and treating the users as more skilled and technology savvy in a way that they don't need obvious metaphors to interact with digital products?

Thanks for the article. 

"..treating the users as more skilled and technology savvy in a way that they don't need obvious metaphors.." 

Agreed. I'm actually a big fan of flat design. The prototypes shown in the article are nothing like what a real commercial product would look like. They are explorations of ideas to see if they are worth pursuing.

I would disagree on the point that industry treats users as skilled and savvy. Today, we're all experts on touchscreen smart devices. I'm not getting any better/faster; I've hit the ceiling. That's not good. We need to innovate strategies beyond poking different numbers of fingers at screens. That worked great for the iPhone 1, when it was all new terrain. Now we've mastered that limited vocabulary -- industry needs to raise that ceiling again. Unfortunately, it doesn't scale to six-finger swipes and seven-finger taps. So we need entirely new approaches, based on human-centered dimensions, like grasp, tap type, pressure, shear, etc. The things we use every day to skillfully interact with the world around us.