Machine Perception of Music and Audio (EECS 352)

Winter Quarter 2008

Electrical Engineering and Computer Science

Northwestern University

Course Basics

CLASS LOCATION:  Tech MG 28   

DAYS AND HOURS:  Mon, Wed 2:00pm – 3:20pm

 

INSTRUCTOR:  Bryan Pardo

office location:  3-323 Ford Building

office phone number: 847 491 7184

office hours: 3:30 – 4:30pm, Monday

 

REQUIRED TEXTBOOK:  Signals, Sound, and Sensation by William M. Hartmann

ACCESS TO MATLAB: Lab assignments will be done in Matlab. Matlab is available in the T-lab, located at the north end of the connector between Tech and the Ford building. All of the T-lab machines are dual-boot Linux/Windows. If all of the machines that default to Linux are in use, simply reboot a Windows machine. Right after the BIOS screen, the GRUB bootloader will prompt for the OS to boot. Once you are logged in to Linux, Matlab can be started by typing the following command at the command prompt:

 

              /usr/local/matlab/bin/matlab &

 

The EECS tech support staff will use the class list and work with the lock shop to give students enrolled in the class access to the T-lab. Christopher Bachmann will also create accounts for students who do not already have one; those students will receive an email with instructions for accessing their account. Students who have forgotten their password, or whose password has expired, should email root@eecs.northwestern.edu. Additionally, any students who find that their WildCard does not unlock the door should stop by the EECS tech support office in Tech M334 so that we may update their access.

 

PREREQUISITES: Prior programming experience sufficient to do laboratory assignments in MATLAB is required. Completion of the Engineering Analysis (GEN_ENG 205-2) series, EECS 211, or EECS 231 would demonstrate sufficient experience.  A willingness to deal with math is also a prerequisite.

                                                                                                                                                            

COURSE GOALS:  How do you tell the sound of a clarinet from the sound of a kazoo? Is this song a waltz or a tango? If your friend likes Yo La Tengo, would she prefer a CD by the Flaming Lips or Bon Jovi? Can a computer answer these questions?

 

Researchers in computational music perception apply signal processing, psychology, music theory, machine learning, and natural language processing techniques to auditory user interfaces for human-computer interaction. Current application areas include vocal interfaces and search engines for music databases, machine accompaniment of human musicians, automated music recommendation systems, and tools for music production.

 

Machine Perception of Music will introduce students to the field of computational music perception through a combination of lectures, readings, and lab work in MATLAB. Students will learn the basics of how sound and music are recorded and encoded by computers as .wav and MIDI files. The class will also explore the basics of audio perception, including the relationship between pitch and frequency and the difficulties inherent in auditory scene analysis by humans and machines.  Basic classification and sequence alignment techniques will also be introduced.
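
The relationship between pitch and frequency mentioned above can be illustrated with a small sketch. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz and assigned MIDI note number 69, which is a common convention rather than anything specified in this syllabus. It is written in Python for brevity, although class labs use MATLAB, and the function names midi_to_hz and hz_to_midi are hypothetical.

```python
import math

A4_HZ = 440.0   # assumed reference tuning: A4 = 440 Hz
A4_MIDI = 69    # MIDI note number conventionally assigned to A4


def midi_to_hz(note: float) -> float:
    """Convert a MIDI note number to frequency in Hz (equal temperament)."""
    # Each semitone multiplies the frequency by 2**(1/12).
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a (possibly fractional) MIDI note number."""
    # Inverse mapping: 12 semitones per doubling of frequency.
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


if __name__ == "__main__":
    # A3, middle C, A4, A5
    for note in (57, 60, 69, 81):
        print(note, round(midi_to_hz(note), 2))
```

Moving up 12 note numbers (one octave) doubles the frequency, which is why pitch is usually modeled as a logarithmic function of frequency.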

 

Course Policies

Grading

Grading is straightforward. The total points for all projects sum to 100. Those receiving 93-100 points receive an A. Those with 90-92 receive an A minus, and so on.  All students will have the chance to earn 5 points of extra credit. This is equivalent to a ½ letter grade boost. No other alterations to grades will be made. There is no curve.

Submitting Work

Each homework assignment must be handed in as specified in that assignment. I am not responsible for homework left in mailboxes.

 

*NOTE* Assignments are due at the start of class on the day specified. Late assignments will not be graded. Thus, it is better to hand in a partial assignment on time than to receive zero credit for a complete assignment handed in late.

Attendance and Lateness

Attendance is not taken. Lateness is a disruption to the class. Do your level best not to be late. Late assignments will not be graded.

Academic Dishonesty

Do your own work. Academic dishonesty will be dealt with as laid out in the student handbook.

 

Course Calendar/Schedule (Subject to change)

 

Week | Day | Date   | Topic                               | Suggested Reading                  | Assigned               | Due                     | Points
-----+-----+--------+-------------------------------------+------------------------------------+------------------------+-------------------------+----------
1    | Mon | 7-Jan  | Intro to the class                  | SSS Chapters 1 & 2                 |                        |                         |
1    | Wed | 9-Jan  | Pure Tones, Power, Intensity, dB    | SSS Chapter 3                      |                        |                         |
2    | Mon | 14-Jan | Human auditory system, Loudness     | SSS Chapter 4                      | HW 1                   |                         |
2    | Wed | 16-Jan | Pitch, Musical Frequency            | SSS Chapters 6, 11, 12             | Paper Reviews          |                         |
3    | Mon | 21-Jan | NO CLASS: MLK DAY                   | NO CLASS: MLK DAY                  | NO CLASS               |                         |
3    | Wed | 23-Jan | Mathematics of Fourier Series       | SSS Chapter 8                      |                        | HW 1                    | 12
4    | Mon | 28-Jan | Mathematics of Fourier Series       | Reading on Audio Representations   |                        | Review 1,2              | 4
4    | Wed | 30-Jan | Spectrograms, Filters               |                                    | HW 2                   |                         |
5    | Mon | 4-Feb  | More on Filters, Chromagrams        | Beat Tracking Papers               |                        | Review 3,4              | 4
5    | Wed | 6-Feb  | Autocorrelation                     | SSS Chapters 9, 10                 |                        | HW 2                    | 15
6    | Mon | 11-Feb | Cepstrograms, Final Projects        | Readings for Cepstrograms          | Final Project Proposal |                         |
6    | Wed | 13-Feb | Similarity Measurement, Clustering  | Readings on Similarity Measurement | HW 3                   | Review 5                | 2
7    | Mon | 18-Feb | Midterm Review, Final Projects      |                                    |                        | Final Project Proposal  | 3
7    | Wed | 20-Feb | MIDTERM, Final Project              |                                    |                        | MIDTERM                 | 15
8    | Mon | 25-Feb | Audio Fingerprinting                | Papers on Audio Fingerprinting     |                        | Meeting with Professor  | 10
8    | Wed | 27-Feb | Melody Recognition                  | Papers on Melody Recognition       |                        | HW 3                    | 15
9    | Mon | 3-Mar  | Instrument Recognition              | Papers on Instrument Recognition   |                        |                         |
9    | Wed | 5-Mar  | Copyright, Fair play and the Law    | Papers on Copyright and the Law    |                        | Extra Credit Review 1-5 | (5 extra)
10   | Mon | 10-Mar | Audio Source Separation             | Paper on Audio Source Separation   |                        |                         |
10   | Wed | 12-Mar | Final Project Presentations         |                                    |                        | Project Presentations   | 10
10   | Fri | 14-Mar | Final Project Submissions (11:59pm) |                                    |                        | Project Submissions     | 10

 

Helpful Links

Links to Research Papers

The International Society of Music Information Retrieval (ISMIR) has many useful papers available HERE.

Making Music Programming Easier

Useful files to help you with Machine Perception of Music lab projects

An MPEG implementation in MATLAB

A good list of music tools used by computer music researchers    

The MIDI toolbox provides MIDI functionality to MATLAB.

CLAM is a full-fledged software framework for research and application development in the Audio and Music Domain. It offers a conceptual model as well as tools for the analysis, synthesis and transformation of audio signals.

Elias Pampalk’s MA Toolbox for Matlab: Implementing Similarity Measures for Audio.

The MATLAB Auditory Demo of the Speech & Hearing Group in the Computer Science Department at the University of Sheffield

Tools for dealing with music notation

The NETLAB toolbox for neural networks and other kinds of machine learning in MATLAB.

Dan Ellis’ Matlab Audio Processing Examples include MFCCs, LPCs, MP3 readers and more

Data sets

The University of Iowa Musical Instrument Samples

Music Technology Researchers

CNMAT is the UC Berkeley music tech lab.

CCRMA is the Stanford computer music lab.

IRCAM is the most famous music technology lab in France.

The music tech group at McGill has lots of cool projects.

Elaine Chew does music technology research at USC.

Roger Dannenberg is a music technology researcher at Carnegie Mellon.

Christopher Raphael is a music technology researcher at Indiana University.

David Temperley does automated harmonic analysis of music.

Masataka Goto does cool music technology stuff.

Researchers in Auditory Psychology

Nina Kraus is an auditory psychology researcher at Northwestern.

Beverly Wright does auditory psychology at Northwestern.

Diana Deutsch is an auditory psychology researcher at UC San Diego.

Dan Levitin does auditory psychology at McGill University.

Researchers in Music Cognition

Richard Ashley does music cognition at Northwestern.

David Huron does music cognition at Ohio State.

Free sound playback tools

Audacity is free, open source software for recording and editing sounds. It is available for Mac OS X, Microsoft Windows, GNU/Linux, and other operating systems.

Cool Informational Websites

This site at McGill University is good for examples of streaming and source segregation.

A wonderful course on music content analysis by machine, courtesy of Dan Ellis.

This is a FREE book on digital signal processing that is quite readable, by the standards of such books.

To hear Shepard’s Tones click on this:

To see and hear the work of Mark Bartsch on source separation of audio in music click on the following:

Ever wonder how a woodwind instrument works? Check out this site.

Want to find out more about bells and how they work? Check this out.

Find out more about famous and obscure musicians at www.allmusic.com.

Good reference books

Rabiner, Lawrence R., A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition

Bregman, Albert S., Auditory Scene Analysis: The Perceptual Organization of Sound is THE definitive book on auditory scene analysis and streaming.

Moore, F. Richard, Elements of Computer Music is a decent intro to digital audio with a computer music bent.

Oppenheim, Alan V., and Schafer, Ronald W., Discrete-Time Signal Processing is the textbook used at the University of Michigan to teach, well, discrete-time signal processing.

Rabiner, L.R. and Schafer, R.W., Digital Processing of Speech Signals is a standard book on digital processing of audio.

Yost, William A., Fundamentals of Hearing: An Introduction gives a lot of info on how hearing and the ear work.