An Overview Of Gesture Recognition


Interaction with an audience in a live digital music performance is sometimes restricted to a minimum, largely because the performer hides behind their medium, predominantly a computer screen. The main goal of a DJ or live artist is to share and enjoy music through live mixing and performance. Unfortunately, this gives the audience only a partial insight into the artist's interaction with their interfaces. Although this cannot be helped in certain aspects of live digital performance, in recent years more technologies have been developed to re-engage the audience, allowing the artist to step away from the screen.

This dissertation focuses on how gesture recognition has been developed for digital performances to help rediscover the engagement audiences once had with live acoustic performances, in which expressive gestures were used to engage the audience emotionally.

With increasing developments in gesture recognition software, these artists can perform gestural movements to change parameters on their equipment in a more intuitive way, giving the audience greater insight. However, do the gestural interfaces being developed for show actually engage the audience, and does the audience need to be engaged?

Introduction

An overview of gesture recognition

Gesture recognition is the interpretation of human motion by a computational device (Mitra, S. 2007). This means that a computer's software, containing mathematical algorithms, can infer our bodily motion or state from a series of inputs. Research into gesture recognition allows us to create systems which can identify specific human gestures and use them to convey information or to control devices. G. Kurtenbach states, "A gesture is a motion of the body that contains information" (Kurtenbach, G. & Hulteen, E. A. 1990). A gesture can therefore be commonly understood as a movement which embodies a special meaning. In computing, gestures are most often used for inputting commands (Turk, M. 2000).
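
To make this concrete, the following is a minimal sketch of how a system might classify a recorded motion against a set of stored gesture templates. The template names and the nearest-neighbour comparison are illustrative assumptions, not a description of any particular commercial recognizer.

```python
import numpy as np

def resample(points, n=32):
    """Resample a variable-length 2-D trajectory to n evenly spaced points."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)   # segment lengths
    dist = np.concatenate([[0.0], np.cumsum(seg)])          # cumulative arc length
    targets = np.linspace(0.0, dist[-1], n)
    x = np.interp(targets, dist, points[:, 0])
    y = np.interp(targets, dist, points[:, 1])
    return np.stack([x, y], axis=1)

def classify(trajectory, templates):
    """Return the name of the stored template closest to the input motion."""
    query = resample(trajectory)
    scores = {name: np.linalg.norm(query - resample(t)) for name, t in templates.items()}
    return min(scores, key=scores.get)

# Hypothetical templates: a horizontal swipe and a vertical raise
templates = {
    "swipe_right": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "raise_hand":  [(0, 0), (0, 1), (0, 2), (0, 3)],
}
print(classify([(0, 0), (1.1, 0.05), (2.9, -0.1)], templates))  # -> "swipe_right"
```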

Performers have been using sensors to trigger, control and manipulate sounds via computer software for over 50 years, such as the pioneering work by Max Mathews, who used an augmented radio baton to trigger and continuously control a MIDI synthesizer (Mathews, M. 2010). However, it was just after the end of the First World War that the first gesture recognition device was invented: the Theremin. The theremin is an electronic musical instrument that does not require touch but responds to a user's gestures to create music (Billinghurst, M. 2011). It was invented by a young Russian physicist, Lev Sergeevich Termen, known to most people in the western world as Leon Theremin. The theremin works using two metal antennas which sense the position of the player's hands: one hand controls the oscillator frequency (pitch) and the other controls the volume (Glinsky, A. 2000).
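
As a rough illustration of this control scheme, the sketch below maps two hand-distance readings onto the frequency and amplitude of a sine tone and writes a short audio file. The distance values and ranges are invented for the example and are not measurements of a real theremin.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100

def theremin_tone(pitch_distance, volume_distance, seconds=1.0):
    """Map two hand distances (0.0 = close, 1.0 = far) to a sine tone.

    Closer to the pitch antenna -> higher frequency;
    closer to the volume antenna -> quieter, mimicking the theremin's layout.
    """
    freq = 220.0 + (1.0 - pitch_distance) * 660.0   # 220-880 Hz
    amp = volume_distance                            # 0.0-1.0
    n = int(SAMPLE_RATE * seconds)
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

samples = theremin_tone(pitch_distance=0.3, volume_distance=0.8)
with wave.open("theremin_demo.wav", "w") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(SAMPLE_RATE)
    f.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```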

Recognizing gestures as an input makes computers more accessible to the physically impaired and also makes interaction more natural in several aspects of the digital world. Many emerging technologies, for example, can create socially challenging situations because the user has to learn a new GUI (Graphical User Interface), which often involves unfamiliar, unnatural interactions. Human communication, by contrast, is a combination of speech and gestures. Gestures are used for everything from pointing at a person to get their attention to conveying information about space.

"Gestures are part of the non-verbal conversation and are used consciously as well as subconsciously. Gestures are a basic concept of communication and were used by humans even before speech developed", (Nesselrath, R. Alexandersson, J. 2009).

What Nesselrath tells us here is that gestures have the potential to be a huge enrichment to intuitive human-computer interaction. Even so, given the social challenges mentioned above, gestures must be simple and universally acceptable. Gestures associated with speech are referred to as gesticulation, while gestures which function independently of speech are referred to as autonomous. The study of gestures and other non-verbal types of communication is known as "kinesics" (Birdwhistell, R. L. 1970). The term is said to have been coined in 1952 by Ray Birdwhistell, a ballet dancer turned anthropologist who wished to study how people communicate through posture, gesture, stance, and movement. Kinesics has been thoroughly researched and has been heavily linked with how body language within a musical performance carries sufficient dynamic information to be recognized by an audience according to the emotional intention of the artist (Salgado, A. 2007).

This brings us to the evolution of computer music and its connection with gesture recognition. There are several reasons why artists are using gesture recognition in their live performances. For example, there is no single permitted set of options, such as a menu, but rather a series of continuous controls, and there is an instant response to the user's movements. These reasons open up a plethora of possibilities and release the restrictions on the artist's performance. The evolution of computers, thanks to the production of inexpensive, ubiquitous sensor devices and exceptionally powerful hardware, has led to a surplus of methods of sound synthesis and allows a mass audience direct access to real-time computer-generated sound. "Computer Music has come to mean two things: the direct synthesis of sound by digital means and computer-assisted composition and analysis." (Baggi, 1991). What Baggi means here is, firstly, that the music can be played by any digital medium; today a major example of the modern composer of computer music is the live artist, who uses programs such as Ableton to compose and play live and pre-recorded sounds. Secondly, computer-assisted composition and analysis are grouped together rather than separated because this refers directly to software that can be used to aid composition while the composer is writing, and also when a fellow composer wants to study or learn another piece at the same time.

Marcelo Wanderley tells us of the growing maturity of the new technologies that have been developed for live digital performance incorporating gesture recognition: "Both signal and physical models have already been considered as sufficiently mature to be used in concert situations". This maturity has led us to discover multiple ways of capturing this data. R. Nesselrath explains that there are two ways in which modern technologies are used for recording gestures. Firstly, "Non-instrumental projects recognize hand and finger postures with cameras and image processing algorithms." (Nesselrath, R. Alexandersson, J. 2009). These hand and body gestures can be amplified by controllers which can recognize and interpret specific gestures; a wave of the hand, for instance, might terminate the program. A prime example is Microsoft's "Kinect". Kinect is a motion-sensing input device released in November 2010 and, according to the Guinness Book of Records, the fastest-selling consumer electronics device (Alexander, L. 2011). There are three innovative working parts to a Kinect sensor: a colour VGA video camera, a depth sensor, and a multi-array microphone. The hardware, together with the software behind the Kinect, is able to detect and track "48 points on each of the user's body, mapping them to a digital reproduction of that user's body shape and skeletal structure, including facial details" (Crawford, S.). The second way of capturing gestural information is through projects which use instruments for recording, for example sensor apparel or hand-held devices with integrated accelerometers, gyroscopes or contact microphones. The most obvious example of a hand-held device used as an instrument is the Wii controller; a more modern technique is Mogees (Mosaicing Gestural Surface). According to Bruno Zamborlin, creator of Mogees, "Mogees is a system that allows you to transform any object into a musical instrument just by placing a contact microphone on it." (Zamborlin, B. 2012). Mogees works by connecting a contact microphone to a computer, which analyses the audio input signal and extracts information about how the user is touching the surface.
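
As a hedged illustration of that last idea (not a description of Mogees' actual algorithm), the sketch below detects percussive onsets in a contact-microphone signal and uses the spectral centroid of each hit to distinguish, say, a dull tap from a brighter scratch. The frame size and threshold values are arbitrary assumptions.

```python
import numpy as np

def analyse_hits(signal, sr=44100, frame=1024, onset_threshold=0.2):
    """Find percussive onsets in a contact-mic signal and describe each one.

    Returns a list of (time_in_seconds, spectral_centroid_hz) pairs;
    the centroid gives a rough idea of how 'bright' the touch was.
    """
    hits = []
    prev_rms = 0.0
    for start in range(0, len(signal) - frame, frame):
        window = signal[start:start + frame]
        rms = np.sqrt(np.mean(window ** 2))
        # A sudden jump in energy is treated as the user striking the surface
        if rms - prev_rms > onset_threshold:
            spectrum = np.abs(np.fft.rfft(window))
            freqs = np.fft.rfftfreq(frame, 1.0 / sr)
            centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))
            hits.append((start / sr, centroid))
        prev_rms = rms
    return hits

# Synthetic test: silence with one short burst of noise in the middle
sig = np.zeros(44100)
sig[22050:22250] = np.random.uniform(-1, 1, 200)
print(analyse_hits(sig))
```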

Despite these mature technologies, the vast majority of artists who perform computer-generated music in popular culture still fail to step out from behind their laptop screens or decks. The use of digital media and technology for performing hides the physicality of the experience from the audience, unlike an acoustic performance, which makes music inherently physical. So what is our understanding of gestures in music? What is the history between gestures and human-computer interaction? Can the use of gesture recognition help re-establish the embodiment of music that current digital performances lack? Does it need to be saved? Has digital performance moved on? At this moment in time, many new styles of digital performance have broken through thanks to computer-generated music and gesture recognition. This dramatic change in the field of musical composition has been so fast that it has allowed the composer greater flexibility and a richer source of sounds. Current performances include the use of gestures to control devices such as production-line robotic arms to play acoustic instruments such as..... (Lost Research on Artist)

Chapter 1

Gestures used in Live Music

A musical gesture can be understood in quite a broad sense. It does not mean only movement, but a movement which expresses something. In musical performance, gestures are widely used in different aspects. It is a thoroughly researched field with a variety of in-depth analyses; however, there is a basic distinction between two kinds of gesture, known as physical gestures and mental gestures (Iazzetta, F. 1997).

Physical gestures encompass not only how a sound is produced from an instrument but also how the body moves and holds its posture while accompanying the sound. For example, the hands are a means of fine action due to their dexterity and the number of responsive nerve receptors in the fingertips; the feet are better suited to slower and more static movements, while other body parts serve as general support (stability) for the instrument (Cadoz, C. 1988), although these roles vary depending on the instrument. This brings us to more modern musical gestures. Some argue that actions such as turning knobs or pushing levers, which are common in the technology used by today's computer musicians and live electronic artists, are movements which cannot be considered gestures. Typing words on a computer keyboard has nothing to do with gesture, since the movement of pressing each key does not convey any special meaning.

"Pressing a key on a keyboard is not a gesture because the motion of a finger on its way to hitting a key is neither observed nor significant. All that matters is which key was pressed." (Kurtenbach and Hulteen, 1990)

However, the situation is completely different when a musician plays something on a piano keyboard: the result is a musical performance, and the motion itself carries meaning. This is understandable, yet electronic artists argue that there is a major difference between turning a knob at a given moment in time and pressing a key on a computer keyboard. Because modern practice has taken the traditional instrument away, the performer's body has come under closer attention, acquiring its own musical gesture. In an interview on Resident Advisor discussing the concept of 'Beyond the Laptop', Daniel Brandt of the Brandt Brauer Frick Ensemble asserts that there are good live acts that mainly just use a laptop, but that a laptop performance "doesn't really feel live and you can't see, as the audience, what this person's actually doing. The music can sound complicated but then he maybe just switches between different Ableton lines in the arrangement". This establishes the idea that artists who use laptop technologies in musical performance do not produce any musical gesture with embodied information (Keeling, R. 2012). Conducting is another example of physical gesture, as it can be viewed as a way of controlling high-level aspects of the performance of multiple instruments with physical gestures but without direct contact with the instruments. According to Paul Kolesnik's study of a conducting gesture recognition, analysis and performance system, an orchestral conductor has two principal functions. The first is to "indicate the timing information for the beats of the score in order to synchronize the performance of the musical instruments, and to provide the gestures to indicate his or her artistic interpretation of the performance" (Kolesnik, P. 2004). The second is the introduction of "a degree of variation and personal interpretation in the musical performance, and is represented by a number of gestures with a high degree of expressivity." (Kolesnik, P. 2004)

Mental gestures differ from physical gestures because they are closely related to "the processes of composition, interpretation and listening" (Iazzetta, F. 1997). A composer may use the term musical gesture to "designate a sequence of events within a space of musical parameters" (Wanderley, M. 2000). Mental gestures thus relate to physical gestures in that they occur as an idea or an image of another gesture. One explanation is that the mental gesture is learned through experience and stored in memory to be used later (Iazzetta, F. 1997). This means that the listener performs the mental gesture while the performer performs the physical gesture. Mental gesture affects the audience or listener through a multitude of perceptual and cognitive mechanisms which have yet to be fully described, though the human mirror neuron system is a key explanation of how these mental gestures work. The mirror neuron system is said to be "a mechanism allowing an individual to understand the meaning and intention of a communicative signal by evoking a representation of that signal in the perceiver's own brain" (Molnar-Szakacs et al, 2006). What we can understand from this is that listeners perceive speech and gestures in terms of the articulatory gestures they would perform themselves in order to produce a similar signal. This theory proposes that the brain is able to extract gestural information from the signal. Examples of such studies include work on music-related motor learning (Buccino et al., 2004; Calvo-Merino et al., 2004).

Buccino studied the role of the mirror system in motor learning through experiments on the motor system in both human and non-human primates. In primates, he reports that a set of neurons discharged during the "execution of both hand and mouth object-directed actions also respond when a monkey observes another monkey or an experimenter performing the same or similar action." (Buccino et al. 2006). In humans, Buccino strengthens this theory by describing various methods used to support the notion, for example experiments using neurophysiological, behavioural, and brain imaging techniques. One example Buccino describes was conducted at the beginning of the century, in 2002, by Craighero and associates. They assessed the reaction time of volunteers who were asked to prepare to grasp, as fast as possible, a bar orientated either clockwise or anti-clockwise, after being presented with a picture showing the right hand. Two experiments were carried out. In the first, the picture shown was a mirror image of the final position the hand needed to adopt to grasp the bar. In the second, two images were shown to the volunteers representing 90-degree rotations of the hand in leftward and rightward directions. Both experiments concluded that subjects responded faster when presented with the stimuli, but it was the level of similarity between the observed and executed movement that gave them a further advantage in the task (Buccino et al. 2006). These experiments reinforce the fact that mirror neurons exist, but they also support the idea that observed actions, such as gestures in music, are reflected in the observers' motor representation of the same action. When it comes down to our mirror neurons and music, Molnar-Szakacs and Overy explain it beautifully.

"The ability to create and enjoy music is a universal human trait and plays an important role in the daily life of most cultures. Music has a unique ability to trigger memories, awaken emotions and to intensify our social experiences. We do not need to be trained in music performance or appreciation to be able to reap its benefits already as infants, we relate to it spontaneously and effortlessly." (Molnar-Szakacs et al, 2006)

Chapter 2

The growth of HCI & Gestural Interaction

HCI, or Human-Computer Interaction, is an area of research and practice that came about in the early 1980s. The term "human-computer interaction" is commonly used interchangeably with terms such as "man-machine interaction" (MMI), "computer and human interaction" (CHI) and "human-machine interaction" (HMI), but it is predominantly known as HCI. An over-simplified definition of HCI might say that it is "the study of the interaction between humans and computers" (Carroll, J. 2013). From a general point of view this is an acceptable definition, but it by no means does justice to the actual complexity and multi-disciplinary nature of the subject. In previous decades the majority of computer users were themselves programmers and designers of computer systems. Consequently, a person using a computer system was likely to have been immersed in the same conventions and culture as the individual who designed it.

It was not until the mid-1990s that the study of Human-Computer Interaction (HCI) finally took centre stage, with the release of Windows 95 bursting upon the scene. Brad Myers believes that research in HCI has been incredibly successful because of Windows 95 and its "ubiquitous graphical interface"; he also notes that it was based on the Macintosh, which in turn was based upon the work at Xerox PARC, which in turn was based upon early research at the Stanford Research Laboratory (now SRI) and at the Massachusetts Institute of Technology (MIT) (Myers, B. 1996). However, it was the challenges of the 1960s and 1970s that started this research and practice. With the emergence of personal computing in the later 1970s, eminent change was taking place. Personal computing included both personal software (text editors, spreadsheets, and interactive computer games) and personal computer platforms, such as operating systems and programming languages. This saw a substantial growth in the number of users who were not computer professionals. The change focused attention on the needs of what Eason (Eason, K. D. 1976) termed the naive user, and on the lack of understanding of the naive user on the part of many designers. It also created awareness of the problems computers posed, with respect to usability, for those users who wanted to use them as tools. To take the naive user away from arcane commands and system dialogues, scientists delved deeper into cognitive science, which as a collective included cognitive psychology, artificial intelligence, linguistics, cognitive anthropology, and the philosophy of mind. Researchers made pioneering efforts studying how people interacted with technology, even if they were not quite sure what HCI initially was.

Nevertheless, while this discovery of HCI was being made, other designers and developers had already been diving deeper into the understanding of gestural interaction with computational devices, and therefore the recognition of gestures. This sub-field of HCI had been the focus of research throughout the development of early applications in the 1960s, such as Ivan Sutherland's Sketchpad (Sutherland, I. 1964), which used an early form of stroke-based gestures, using a light pen to grab and manipulate graphical objects on a tablet display. Warren Teitelman was one of the first researchers to develop a trainable gesture recognizer that could classify hand-drawn characters in real time. Several other pen-based recognition systems followed in the 1960s and 1970s, such as the GRAIL system (Ellis et al., 1969) and the AMBIT/G system (Christensen, 1968), with this form of interaction now being widely accepted throughout the HCI community (Karam, 2006).

With the progression of technology and HCI in the 1980s, hand-held devices such as the mobile phone and the laptop gave designers and developers greater scope for research. This brought about interaction using glove and magnetic sensor based systems. Examples of such work are Richard Bolt's 'Put-That-There' (1980) and Thomas Zimmerman's 'DataGlove' and 'Z-Glove' (1986). Bolt's 'Put-That-There' was a pioneering multi-modal application that combined speech and gesture recognition; the demo shows users commanding simple shapes about a large-screen graphics display surface, with speech augmented by simultaneous pointing (Bolt, 1984). Zimmerman's 'hand gesture input devices' were "lightweight cotton gloves containing flex sensors which measure finger bending, positioning and orientation systems, and tactile feedback vibrators" (Zimmerman et al, 1987). These devices and others received a small amount of attention from researchers throughout the 1980s and early 1990s, but the research was limited by the large expense and technical requirements of the sensor technology, although a commercial glove, the 'Power Glove', was released for the Nintendo Entertainment System. It was not until William Freeman and Craig Weissman (1995) demonstrated a camera-based system that enabled gestures to control the volume and channel functions of a television that the field of computer vision rapidly started to grow (Freeman, et al, 1995).

The area of gestural interaction that followed glove-based systems was CV (computer vision) based systems. These have an advantage over glove-based technologies in that the sensor technology is non-invasive and can be relatively cheap to purchase. However, some of the newest applications of CV are not cheap at all, such as autonomous vehicles, including UAVs (unmanned aerial vehicles). The progression from glove to CV based systems finally surfaced in a mainstream application of gesture recognition in 2006: the Nintendo Wiimote. The Wiimote strayed away from CV based systems, instead relying on an IMU (Inertial Measurement Unit). An inertial measurement unit is a device that measures an object's velocity, orientation and the gravitational forces acting on it as it moves, using a combination of sensors such as accelerometers, gyroscopes and magnetometers (Chow, R. 2011). Like CV-based systems, IMU-based systems can be built very cheaply, making the mainstream commercialisation of such systems possible. The small size and low cost of IMUs also makes them suitable for use in novel ways, such as within mobile phones. So gesture recognition has been used throughout HCI for over half a century; however, it is only in the last decade that gesture recognition based systems have been successfully integrated into commercial applications. This has been made possible, as stated in the introduction, by the ever-decreasing cost of sensor devices combined with extremely powerful hardware. This brings us to the next part of this dissertation, the discussion.
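
To illustrate what an IMU stream looks like in practice, here is a minimal sketch of shake detection from three-axis accelerometer samples; the sample data, threshold and window size are invented for the example rather than taken from the Wiimote.

```python
import math

def detect_shake(samples, threshold=2.5, window=5):
    """Return the indices where a 'shake' gesture is detected.

    samples: list of (ax, ay, az) accelerometer readings in g.
    A shake is flagged when the acceleration magnitude (minus gravity)
    exceeds the threshold for most readings inside a short window.
    """
    shakes = []
    for i in range(len(samples) - window):
        strong = 0
        for ax, ay, az in samples[i:i + window]:
            magnitude = math.sqrt(ax * ax + ay * ay + az * az)
            if abs(magnitude - 1.0) > threshold:   # 1 g subtracted for gravity
                strong += 1
        if strong >= window - 1:
            shakes.append(i)
    return shakes

# Invented data: resting readings followed by a burst of vigorous movement
rest = [(0.0, 0.0, 1.0)] * 10
shake = [(3.5, -2.0, 1.0), (-3.8, 2.2, 0.8), (4.0, -1.5, 1.2), (-3.6, 1.9, 1.1), (3.9, -2.1, 0.9)]
print(detect_shake(rest + shake + rest))
```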

Chapter 3

Gesture Recognition & its audience

The current state of gesture recognition in digital music performance finds most of its applications in a sub-field of HCI called MCI (musician-computer interaction). This area focuses specifically on technology that enables musicians to interact with computers. A performer, for example, may want to trigger a number of samples on and off, exercising discrete control, while at the same time continuously modulating the cut-off frequencies of a number of filters, exercising continuous control. It is this somewhat contradictory requirement for fine-grained, simultaneous control of multiple parameters that makes designing interfaces for MCI such an interesting and challenging research area.

In order to achieve this level of fine-grained, multiple-parameter real-time control, a large number of specifically designed commercial hardware and software products have been developed: hardware devices such as the Akai APC40 USB Performance Controller or the Korg MicroKontrol MC1, combined with software programs such as Ableton Live or Max/MSP. These developments enable a musician to interact with a computer in a real-time performance scenario in ways that would not be possible with more conventional HCI devices like the keyboard and mouse. This is because hardware devices like the APC40 provide a musician with both multi-functional discrete and continuous control in the form of toggle buttons, sliders and knobs. Dedicated MCI hardware devices also, importantly, allow the musician to map the output from the device to the input of the music software program being used.
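
A hedged sketch of what such a mapping layer does: the fragment below translates incoming MIDI control-change messages from a hypothetical controller knob into named software parameters. The CC numbers, parameter names and ranges are assumptions for illustration, not the APC40's actual factory mapping.

```python
# Map raw MIDI control-change messages to named synthesis parameters.
# CC numbers and parameter ranges here are illustrative assumptions.
CC_MAP = {
    16: ("filter_cutoff_hz", 20.0, 18000.0),
    17: ("reverb_mix", 0.0, 1.0),
}

def handle_cc(cc_number, cc_value, state):
    """Scale a 0-127 MIDI value into the target parameter's range."""
    if cc_number not in CC_MAP:
        return state
    name, low, high = CC_MAP[cc_number]
    state[name] = low + (cc_value / 127.0) * (high - low)
    return state

params = {}
handle_cc(16, 64, params)   # knob at half travel
handle_cc(17, 127, params)  # knob fully open
print(params)               # {'filter_cutoff_hz': ~9081.0, 'reverb_mix': 1.0}
```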

The trend toward user-specific systems is evident well beyond mainstream commercial controllers such as the APC40 or MicroKontrol, as a large body of musicians regularly experiment with designing and developing new MCI hardware and software systems. Leaders in these fields showcase their skills at annual conferences such as the New Interfaces for Musical Expression (NIME) conference, the International Computer Music Conference (ICMC) and the Sound and Music Computing (SMC) conference, all of which feature dozens of examples each year of new hardware and software developments specially designed for MCI.

Free, customizable music software, such as Pure Data, SuperCollider or ChucK, enables performers to create their own uniquely tailored audio systems. A large number of performers are now also making use of cheap open-source electronics platforms, like the Arduino, to create custom-built sensor interfaces that give them real-time control over their specific audio systems. The ever-decreasing cost of sensor devices, combined with the hacking of existing motion controllers such as the Nintendo Wii-mote or Microsoft Kinect, has made it possible for a large number of composers, performers and researchers to use the data from these sensors as controllers for their music software. This has made accessible an exciting interaction paradigm, previously only feasible for a minority of researchers and engineers, in which a performer can use their own body gestures to interact with a computer. The trend is further supported by the fact that most undergraduate and graduate courses in the broader area of 'music technology' now typically include modules on interaction design for music, providing student performers, composers and sound engineers with the skills to design and build their own digital musical instruments.
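
As a hedged example of this kind of home-built bridge, the sketch below reads a line of sensor values from an Arduino over a serial port and forwards them as OSC messages to Pure Data, SuperCollider or any other OSC-aware synth. It assumes the pyserial and python-osc packages; the port name, baud rate and OSC addresses are placeholders to adapt to your own setup.

```python
import serial                                        # pyserial: pip install pyserial
from pythonosc.udp_client import SimpleUDPClient     # pip install python-osc

# Placeholder settings: adjust the serial port and OSC destination to your setup.
arduino = serial.Serial("/dev/ttyUSB0", 9600, timeout=1)
osc = SimpleUDPClient("127.0.0.1", 9000)             # an OSC receiver in Pure Data or SuperCollider

while True:
    line = arduino.readline().decode("ascii", errors="ignore").strip()
    if not line:
        continue
    try:
        # Expecting the Arduino sketch to print comma-separated sensor readings,
        # e.g. "512,300,871" for three analogue inputs.
        values = [int(v) for v in line.split(",")]
    except ValueError:
        continue
    # Forward each reading as a normalised OSC message such as /sensor/0 0.5
    for i, v in enumerate(values):
        osc.send_message(f"/sensor/{i}", v / 1023.0)
```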

With all this technology screaming in artists' faces, can gesture recognition re-establish the embodiment that acoustic performances have and digital ones have lost? I believe it can: gestural interaction is of particular use to a live digital musician because it enables them to control a specific parameter or effect even when their hands are busy, for example playing a theremin. Gestural interaction enables a musician to use aesthetic, expressive gestures to control a computer, which is of great benefit in a live performance scenario. These expressive gestures are what acoustic performers display at every show, creating a theatrical musical performance. However, some theorists believe there is no need to re-establish the reification of musical gesture in digital performance. Ben Neill argued that one of the key ideas to come out of recent digital performance was the way traditional notions of performer and audience were "completely erased and redefined".

One key advantage of using gestural interaction for MCI is that it could enable the performer to control multiple parameters of a sound, such as pitch, timbre and onset amplitude, simultaneously. This real-time control over multiple degrees of freedom is difficult even with current commercial MCI devices, therefore making gestural control a rewarding research area. Gestural interaction would allow a musician to augment their own acoustic instrument with additional sensors, enabling them to control a synthesis program in real time by performing a number of musical gestures, creating what Wanderley calls a Digital Musical Instrument (DMI) (Wanderley and Battier, 2000). It would also, importantly, enable a performer to control a synthesis program on a machine without using any physical instrument at all, allowing the musician to play what Mulder calls a Virtual Musical Instrument (VMI) (Mulder, 1994). Alternatively, gestural interaction could allow a performer to use musical conducting gestures to interact simultaneously with a number of performers and a computer with one succinct movement.
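
As a hedged sketch of this kind of many-parameter mapping (not Wanderley's or Mulder's actual systems), the fragment below takes a few normalised gesture features, say hand height, hand speed and grip openness from any tracking source, and maps them simultaneously onto the pitch, brightness and loudness of a hypothetical synthesiser voice. Feature names and target ranges are assumptions for illustration.

```python
def map_gesture_to_synth(hand_height, hand_speed, grip_open):
    """Map normalised gesture features (0.0-1.0) onto three synth parameters.

    Illustrative assumptions: hand height -> pitch, hand speed -> brightness
    (filter cut-off), grip openness -> loudness.
    """
    midi_note = 36 + hand_height * 48            # C2 to C6
    cutoff_hz = 200.0 + hand_speed * 7800.0      # 200 Hz to 8 kHz
    amplitude = grip_open ** 2                   # perceptually gentler fade
    return {"midi_note": round(midi_note), "cutoff_hz": cutoff_hz, "amplitude": amplitude}

# A raised, fast-moving, open hand produces a high, bright, loud note
print(map_gesture_to_synth(hand_height=0.9, hand_speed=0.7, grip_open=1.0))
```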

Conclusion


