Special Topic: Generative AI for Music and Audio

Course ID
15798

Description
In this seminar class, we will discuss state-of-the-art methods in generative AI for music and general audio (everyday sounds, speech, bioacoustics, etc.), with applications to both generation and understanding. We will examine and compare the two primary families of methods used in modern audio generation research: large language models applied to discrete audio tokens, and diffusion models applied to continuous audio representations. With an eye towards offering intuitive controls for music generation, we will also examine classic methods and tasks in music information retrieval such as spectral analysis, synchronization, beat detection, and transcription. Moreover, we will explore emerging topics in generative AI for music and audio such as new architectures, training data attribution, interaction, compression, multimodality, and evaluation. Finally, we will discuss the ethical and societal implications of music generation in particular, including its potential economic and cultural effects on music. Much of the course activity will center around (1) in-class lectures and demonstrations on small-scale datasets, (2) student-led discussions of research papers, and (3) an open-ended research project.

Key Topics
Generative AI, music generation, audio generation, language models, diffusion models, music information retrieval, multimodal learning

Required Background Knowledge
Solid math skills ("just okay" is fine), strong programming skills, understanding of probability, at least some musical background, some previous exposure to AI / ML

Course Relevance
- Upper-division CS undergrads, ideally those who have taken 15322 Intro to Computer Music
- Master's students across departments (music technology, CS, ECE)
- SCS PhD students

Course Goals
Students should emerge from the course with a better understanding of the following aspects of generative AI for music and audio: (1) the modern research landscape, (2) domain-specific considerations, including basic audio signal processing, (3) classical techniques in music information retrieval, and (4) research frontiers in generative AI for music and audio.

Learning Resources
Textbooks: Meinard Müller, Fundamentals of Music Processing; research papers
Software: Python, Google Colab, NumPy

Assessment Structure
Homework: 25%
Reading reflections: 25%
In-class reading presentations: 15%
In-class participation: 10%
Final project: 25%

Extra Time Commitment
n/a