xCoAx 2020 8th Conference on Computation, Communication, Aesthetics & X 8–10 July, Graz online

Hot Summer Afternoon: Towards the Embodiment of Musical Expression in Virtual Reality

Keywords: Sound, Performance, Virtual Reality, Hand Tracking, Hand Gesture, Machine Learning, Musicality, Embodiment.

Motivation and Development Process

The Virtual Reality (VR) headset Oculus Quest was chosen in particular because of its hand tracking technology, released in December 2019. [1] With the hand tracking SDK, the headset tracks the user’s hands using its four front-facing cameras, without requiring tracking gloves or any additional tracking devices. In our experience, this feature removes a previously disembodying layer of VR interaction: controlling representational hand objects with handheld controllers.

The soundscape of Hot Summer Afternoon evokes a feeling of post-holiday melancholia (Fig. 1). It is set in the late afternoon, after everyone has left the main event of the holiday; the performer is the last person remaining. We were inspired by our very first immersive moment with the Oculus Quest headset, when we tried it in order to test the hand tracking feature. There was no object in the game scene other than the reflection of our hands. Because we did not need to hold controllers, we felt a deeper immersion in the empty scene. The soundscape was composed from this experience.

With the Oculus hand tracking SDK, it is possible to retrieve individual bone IDs to track more sophisticated hand motion. Each hand consists of twenty-four bones. Instead of mapping each bone position one by one, supervised machine learning (ML) models were used to detect gestures, and the detected gestures were mapped to different sound parameters. The positions of objects spawned in VR were also mapped to sound parameters. To train ML models on the position data of the spawned objects and the user’s hands, the free open-source software Wekinator was used. Wekinator communicates with Max via OSC to control the sound parameters created in Max.
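The pipeline described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it encodes one frame of hand-derived features (e.g. flattened bone positions) as an OSC message and sends it over UDP to Wekinator's default input address (`/wek/inputs`, port 6448). The feature layout and host/port values are assumptions based on Wekinator's documented defaults.

```python
import socket
import struct

def osc_pad(data: bytes) -> bytes:
    # OSC strings are null-terminated, then padded to a multiple of 4 bytes.
    return data + b"\x00" * (4 - len(data) % 4)

def osc_message(address: str, floats) -> bytes:
    # An OSC message: padded address, padded type-tag string (",fff..."),
    # then each argument as a big-endian 32-bit float.
    addr = osc_pad(address.encode())
    tags = osc_pad(("," + "f" * len(floats)).encode())
    args = b"".join(struct.pack(">f", f) for f in floats)
    return addr + tags + args

# Wekinator's default listening host, port, and input address (assumption:
# a default local setup, as described in the Wekinator documentation).
WEK_HOST, WEK_PORT = "127.0.0.1", 6448

def send_hand_features(features):
    # Send one frame of features (hypothetical example: 24 bones x 3
    # coordinates per hand, flattened into a list of floats) to Wekinator.
    msg = osc_message("/wek/inputs", features)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(msg, (WEK_HOST, WEK_PORT))
    sock.close()
```

In the same spirit, Wekinator's classifier outputs (sent by default to `/wek/outputs` on port 12000) would be received in Max via `udpreceive` and mapped onto sound parameters.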

Towards the Embodiment of Musical Expression

What is important is that the peculiar nature of our bodies shapes our very possibilities for conceptualization and categorization. (Lakoff and Johnson 1999)

Hand motion has been extensively explored in digital music composition and new instrument design. For example, Michel Waisvisz’s The Hands is an early example and a long-lasting research project (Torre, Andersen, and Baldé 2016), and the composer Laetitia Sonami explored the musicality of hand gestures with her Lady’s Glove (1994).

This project is not an attempt to add yet another standard musical language or mapping technique for hand gestures, but to explore embodied experience in a VR environment through real-time, controller-free hand tracking. Even though only real hand motion is embodied in VR so far, in our experience the sense of immersion is dramatically enhanced, allowing the performer a more intimate interaction with the VR world. Here, therefore, ML is used to capture the very personal and subjective musicality of the performer’s hand gestures, which can then be explored retrospectively by observing the classified gestures.

During the performance, the audience sees video streamed from the Oculus headset, showing the performer’s view, and listens to the musical expression imprinted in the VR space.


  1. “Hand Tracking SDK for Oculus Quest Available with v12 Release,” Oculus Developer Center, https://developer.oculus.com/blog/hand-tracking-sdk-for-oculus-quest-available/.

