My research interests lie at the intersection of machine learning, signal processing, and psychophysics, particularly as they pertain to sound perception. I wish to understand how learning can occur in ecologically realistic environments rather than in a laboratory setting. I am also interested in using psychophysical and physiological observations of animal plasticity (and stability) in these systems toward two ends: (1) to develop novel machine learning approaches, and (2) to better understand the purpose and function of the observed biological properties and to generate new hypotheses about these observations.

Currently, this means studying how machines can learn to recognize particular sound events (e.g. is there a clarinet in this song?) when the training data (what the machine learns from) consists of mixtures of sounds in which different sounds overlap in time.

Current Projects

Learning sounds from mixtures of audio: 2007-now

More information about this project can also be found at my lab's website.

When a squirrel sees another squirrel eaten by a hawk, how does it come to realize that the hawk's cry that came before is associated with great danger? There are many sounds in the squirrel's environment: the sounds of other squirrels, the sound of passing traffic nearby, the sounds of other birds in the trees. In an informal survey by Barker and Newman [2004], two thirds of young infants were frequently in an environment where multiple people were talking while they were learning their native language. Yet somehow infants learn to recognize the sounds of their native language, and are better at picking out their own name [Newman, 2005] and their own mother's voice [Barker and Newman, 2004] from a mixture of sounds than they are at picking out other sounds. How can these links between a sound and an important label be learned in an environment with many concurrent sounds?

Broadly, my goal in this project is to understand how machine systems can learn to recognize sounds from mixtures of sound, and to understand how particular biological observations might help facilitate this. More specifically, I would like to design a machine system capable of learning to label sounds from mixtures of sounds, using only weak labels. A weak label is much like the label the squirrel or the infant gets: all the learner knows is that something, somewhere around a particular time is important, and the learner must figure out how to appropriately assign that label to future sounds. There may be multiple sounds during learning, which may overlap (in time) with the sound the label is associated with (which I call the target).
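To make the notion of a weak label concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the system described above: the sine-tone "sounds", the sample rate, and the names `make_tone`, `mix`, and `WeakExample` are all hypothetical. It shows what the learner receives: a mixture in which the target overlaps in time with another sound, plus a single flag saying the target occurs somewhere in the clip, with no information about when.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE = 8000  # assumed sample rate in Hz, chosen only for illustration


def make_tone(freq_hz, start_s, dur_s, total_s):
    """Return samples that are silent except for a sine tone starting at start_s."""
    n_total = int(total_s * SAMPLE_RATE)
    first = int(start_s * SAMPLE_RATE)
    last = min(n_total, first + int(dur_s * SAMPLE_RATE))
    samples = [0.0] * n_total
    for i in range(first, last):
        samples[i] = math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
    return samples


def mix(*sources):
    """Sum equal-length sources sample by sample to form a mixture."""
    return [sum(values) for values in zip(*sources)]


@dataclass
class WeakExample:
    audio: list       # the mixture the learner hears
    has_target: bool  # weak label: the target occurs somewhere in the clip


# Hypothetical target and distractor; they overlap in time from 0.5 s onward.
target = make_tone(440.0, start_s=0.5, dur_s=1.0, total_s=2.0)
distractor = make_tone(660.0, start_s=0.0, dur_s=1.2, total_s=2.0)

# The learner never sees `target` on its own, only mixtures and weak labels.
positive = WeakExample(audio=mix(target, distractor), has_target=True)
negative = WeakExample(audio=mix(distractor), has_target=False)
```

Note that the onset and duration of the target are used to build the mixture but are not stored in `WeakExample`; the learner must work out for itself which part of the audio the label refers to.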

For broader society, the ability to learn from mixtures of sounds could be used to teach a system the particular sounds and instruments that a music fan enjoys, so that it could then find new songs, unfamiliar to the user, that contain the same kinds of sounds. It might be used to train an aid for the deaf, which could identify dangerous sounds in a streetscape, such as a passing truck, screeching brakes, or a police siren. Such an ability could also be used to facilitate monitoring of an environment for particular rare species [Charif, 2009], to track certain individuals [Terry et al., 2005], or to provide a measure of biodiversity [Chesmore, 2004].

You can find examples of the sounds that one version of my system was evaluated with, as described in my ISMIR 2008 paper, here.

Related Papers

  • D. Little, B. Pardo, Learning musical instruments from mixtures of audio with weak labels. 9th International Conference on Music Information Retrieval (ISMIR) September 14th-18th, 2008, Philadelphia, USA. (PDF)

Computational modeling of acquisition and consolidation in human learning: 2009-now

In the spring of 2009 I received a fellowship to pursue this project over the course of the 2009-2010 school year. In this project I will model a series of experiments performed in Beverly Wright's lab examining perceptual learning when people are trained on two tasks in the same session. I will construct the model using machine learning methods that I have been exploring with my advisor, Bryan Pardo. I will be integrating information across two disciplines (perceptual learning and machine learning) in order to understand more about two stages of learning, acquisition and consolidation, in both human and machine learning systems.

The relevant experiments from Professor Wright's lab [Wright et al., 2009] suggest that the stage of acquisition is functionally distinct from consolidation. Acquisition is the stage of learning that occurs during practice of a task. Consolidation occurs after practice has ceased: memories change from a fragile, disruptable state to a stable state in long-term memory. Evidence of consolidation can be found across a range of learning scenarios and species (cf. [Dudai, 1996] and [McGaugh, 2000]). In Wright et al. [2009], the distinction between the two stages is suggested by the differing ways in which training on a second task or stimulus interferes with the learning that results from training on the first task or stimulus. The task or stimulus that interfered with learning during consolidation did not interfere with learning during acquisition, and vice versa (see Fig. 1).

Figure 1: The patterns of learning across different training conditions.

Although there are machine learners that replicate patterns of learning interference relevant to consolidation [Robins and McCallum, 1999], past studies have not explored the effects of acquisition as distinct from consolidation. The central question of this project will be to ask which of a set of machine learning models best explains the learning interference seen in the human data during both acquisition and consolidation. This study may lead to a better understanding of human learning and suggest new methods of organizing machine learners. The learning addressed here is over simple stimuli, which might suggest a limited scope. However, there is good reason to believe the studied phenomena, consolidation and acquisition, should apply across many kinds of stimuli, both simple and complex.

Both the machine learning and (human) perceptual learning communities could benefit from this study. It may lead to a better understanding of human perceptual learning and/or suggest promising directions for further multiple-task machine learning methods. More broadly, it may teach us something about acquisition and consolidation as distinct stages of learning. This in turn may help guide further human experiments and could ultimately lead to clinical applications of auditory training for the hearing impaired or developmentally disabled. New strategies for multiple-task machine learners could lead to systems capable of pooling experience from many domains, making it possible for them to utilize previous experience or input from multiple perceptual modalities. Researchers generally agree this will be an important component of next-generation systems capable of providing us with multimedia files (e.g. video, music, images) based on their content [Enser, 2008, p. 538].

Past Projects

Online learning for Query-by-Humming: 2006-2007

During my first year as a graduate student I worked on a project called VocalSearch, now named Tunebot. VocalSearch allowed a user to search for music by singing melodies into a microphone (also called Query-by-Humming). I showed that the system can improve results for particular users, after it has been deployed, based on feedback from the user. This feedback customizes the parameters of how notes are identified and compared in a person's sung melody.

Related Papers

  • D. Little, D. Raffensperger, B. Pardo, A Query By Humming System that Learns from Experience, 8th International Conference on Music Information Retrieval (ISMIR) September 23-27, 2007, Vienna, Austria. (PDF)
  • D. Little, D. Raffensperger, B. Pardo, User-specific Training for a Music Search Engine, 4th Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, June 28-30, 2007, Brno, Czech Republic. (PDF)
  • D. Little, D. Raffensperger, and B. Pardo. Machine Learning and Multimodal Interaction: Fourth International Workshop, MLMI 2007, Brno, CZ, June 28-30, 2007, Revised Selected Papers, chapter User specific training of a music search engine. Lecture Notes in Computer Science. Springer, 2007.

Hexagonal Metamorphic Robot Algorithms: 2003-2005

As an undergraduate, I worked for three years with Jennifer Walter at Vassar College developing algorithms for the rearrangement of hexagonal robots. The purpose of this research is to design a set of reconfigurable modular hexagonal robots, which would enable an adaptive system able to rearrange itself for various tasks in hostile environments. The work I did was largely theoretical: I looked at how to move a large collection of these robots efficiently and without error around various kinds of obstacles.

Related Papers

  • D. Little and J. Walter, Using Hexagonal Metamorphic Robots to Form Temporary Bridges, in Proc. of the IEEE International Conference on Intelligent Robotic Systems, Aug. 2005, Edmonton, Alberta, Canada, pages 2652-2657.
  • J. Walter, M. Brooks, D. Little, and N. Amato, Enveloping Multi-Pocket Obstacles with Hexagonal Metamorphic Robots, in Proc. of the IEEE Intl. Conf. on Robotics and Automation, Apr. 2004, New Orleans, LA, pages 2204-2209.
  • J. Walter and D. Little, Bridging Gaps in Traversal Surfaces with Hexagonal Metamorphic Robots, in Proc. of the American Nuclear Society 10th International Conference on Robotics and Remote Systems for Hazardous Environments, 28-31 March, 2004, Gainesville, FL.

References