Showing 1–21 of 21 results for author: Saitis, C

  1. arXiv:2407.04547  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS eess.SP

    Real-time Timbre Remapping with Differentiable DSP

    Authors: Jordie Shier, Charalampos Saitis, Andrew Robertson, Andrew McPherson

    Abstract: Timbre is a primary mode of expression in diverse musical contexts. However, prevalent audio-driven synthesis methods predominantly rely on pitch and loudness envelopes, effectively flattening timbral expression from the input. Our approach draws on the concept of timbre analogies and investigates how timbral expression from an input signal can be mapped onto controls for a synthesizer. Leveraging…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted for publication at the 24th International Conference on New Interfaces for Musical Expression in Utrecht, Netherlands

  2. arXiv:2403.07678  [pdf, other]

    cs.CL cs.CY

    MoralBERT: Detecting Moral Values in Social Discourse

    Authors: Vjosa Preniqi, Iacopo Ghinassi, Kyriaki Kalimeri, Charalampos Saitis

    Abstract: Morality plays a fundamental role in how we perceive information while greatly influencing our decisions and judgements. Controversial topics, including vaccination, abortion, racism, and sexuality, often elicit opinions and attitudes that are not solely based on evidence but rather reflect moral worldviews. Recent advances in natural language processing have demonstrated that moral values can be…

    Submitted 12 March, 2024; originally announced March 2024.

  3. arXiv:2310.14044  [pdf, other]

    cs.SD cs.AI eess.AS

    Composer Style-specific Symbolic Music Generation Using Vector Quantized Discrete Diffusion Models

    Authors: Jincheng Zhang, Jingjing Tang, Charalampos Saitis, György Fazekas

    Abstract: Emerging Denoising Diffusion Probabilistic Models (DDPM) have become increasingly utilised because of promising results they have achieved in diverse generative tasks with continuous data, such as image and sound synthesis. Nonetheless, the success of diffusion models has not been fully extended to discrete symbolic music. We propose to combine a vector quantized variational autoencoder (VQ-VAE) a…

    Submitted 21 October, 2023; originally announced October 2023.

  4. arXiv:2310.14040  [pdf, other]

    cs.SD cs.AI eess.AS

    Fast Diffusion GAN Model for Symbolic Music Generation Controlled by Emotions

    Authors: Jincheng Zhang, György Fazekas, Charalampos Saitis

    Abstract: Diffusion models have shown promising results for a wide range of generative tasks with continuous data, such as image and audio synthesis. However, little progress has been made on using diffusion models to generate discrete symbolic music because this new class of generative models are not well suited for discrete data while its iterative sampling process is computationally expensive. In this wo…

    Submitted 21 October, 2023; originally announced October 2023.

  5. arXiv:2309.06649  [pdf, other]

    cs.SD eess.AS

    Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis

    Authors: Jordie Shier, Franco Caspe, Andrew Robertson, Mark Sandler, Charalampos Saitis, Andrew McPherson

    Abstract: Differentiable digital signal processing (DDSP) techniques, including methods for audio synthesis, have gained attention in recent years and lend themselves to interpretability in the parameter space. However, current differentiable synthesis methods have not explicitly sought to model the transient portion of signals, which is important for percussive sounds. In this work, we present a unified sy…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: To be published in The Proceedings of Forum Acusticum, Sep 2023, Turin, Italy

  6. arXiv:2308.15422  [pdf, other]

    cs.SD eess.AS

    A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis

    Authors: Ben Hayes, Jordie Shier, György Fazekas, Andrew McPherson, Charalampos Saitis

    Abstract: The term "differentiable digital signal processing" describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music & speech synthesis. We catalogue applications to tasks including mu…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Under review for Frontiers in Signal Processing

  7. arXiv:2305.14867  [pdf, other]

    cs.SD cs.HC eess.AS

    Interactive Neural Resonators

    Authors: Rodrigo Diaz, Charalampos Saitis, Mark Sandler

    Abstract: In this work, we propose a method for the controllable synthesis of real-time contact sounds using neural resonators. Previous works have used physically inspired statistical methods and physical modelling for object materials and excitation signals. Our method incorporates differentiable second-order resonators and estimates their coefficients using a neural network that is conditioned on physica…

    Submitted 24 May, 2023; originally announced May 2023.

  8. arXiv:2304.09499  [pdf, ps, other]

    cs.LG cs.DM

    The Responsibility Problem in Neural Networks with Unordered Targets

    Authors: Ben Hayes, Charalampos Saitis, György Fazekas

    Abstract: We discuss the discontinuities that arise when mapping unordered objects to neural network outputs of fixed permutation, referred to as the responsibility problem. Prior work has proved the existence of the issue by identifying a single discontinuity. Here, we show that discontinuities under such models are uncountably infinite, motivating further research into neural networks for unordered data.

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: Accepted for TinyPaper archival at ICLR 2023: https://openreview.net/forum?id=jd7Hy1jRiv4

  9. arXiv:2304.07830  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    The language of sounds unheard: Exploring musical timbre semantics of large language models

    Authors: Kai Siedenburg, Charalampos Saitis

    Abstract: Semantic dimensions of sound have been playing a central role in understanding the nature of auditory sensory experience as well as the broader relation between perception, language, and meaning. Accordingly, and given the recent proliferation of large language models (LLMs), here we asked whether such models exhibit an organisation of perceptual semantics similar to those observed in humans. Spec…

    Submitted 4 May, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: 12 pages, 3 figures

  10. arXiv:2210.15306  [pdf, other]

    cs.SD cs.LG eess.AS

    Rigid-Body Sound Synthesis with Differentiable Modal Resonators

    Authors: Rodrigo Diaz, Ben Hayes, Charalampos Saitis, György Fazekas, Mark Sandler

    Abstract: Physical models of rigid bodies are used for sound synthesis in applications from virtual environments to music production. Traditional methods such as modal synthesis often rely on computationally expensive numerical solvers, while recent deep learning approaches are limited by post-processing of their results. In this work we present a novel end-to-end framework for training a deep neural networ…

    Submitted 28 October, 2022; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: 5 pages

  11. arXiv:2210.14476  [pdf, other]

    eess.SP cs.SD eess.AS

    Sinusoidal Frequency Estimation by Gradient Descent

    Authors: Ben Hayes, Charalampos Saitis, György Fazekas

    Abstract: Sinusoidal parameter estimation is a fundamental task in applications from spectral analysis to time-series forecasting. Estimating the sinusoidal frequency parameter by gradient descent is, however, often impossible as the error function is non-convex and densely populated with local minima. The growing family of differentiable signal processing methods has therefore been unable to tune the frequ…

    Submitted 18 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  12. arXiv:2209.01169  [pdf, other]

    cs.CY cs.CL cs.LG

    "More Than Words": Linking Music Preferences and Moral Values Through Lyrics

    Authors: Vjosa Preniqi, Kyriaki Kalimeri, Charalampos Saitis

    Abstract: This study explores the association between music preferences and moral values by applying text analysis techniques to lyrics. Harvesting data from a Facebook-hosted application, we align psychometric scores of 1,386 users to lyrics from the top 5 songs of their preferred music artists as emerged from Facebook Page Likes. We extract a set of lyrical features related to each song's overarching narr…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: Accepted to the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022)

  13. arXiv:2204.04651  [pdf, other]

    cs.SD cs.IR eess.AS

    Deep Conditional Representation Learning for Drum Sample Retrieval by Vocalisation

    Authors: Alejandro Delgado, Charalampos Saitis, Emmanouil Benetos, Mark Sandler

    Abstract: Imitating musical instruments with the human voice is an efficient way of communicating ideas between music producers, from sketching melody lines to clarifying desired sonorities. For this reason, there is an increasing interest in building applications that allow artists to efficiently pick target samples from big sound libraries just by imitating them vocally. In this study, we investigated the…

    Submitted 10 April, 2022; originally announced April 2022.

    Comments: Submitted to Interspeech 2022 (under review)

  14. arXiv:2204.04646  [pdf, other]

    cs.SD cs.IR eess.AS

    Deep Embeddings for Robust User-Based Amateur Vocal Percussion Classification

    Authors: Alejandro Delgado, Emir Demirel, Vinod Subramanian, Charalampos Saitis, Mark Sandler

    Abstract: Vocal Percussion Transcription (VPT) is concerned with the automatic detection and classification of vocal percussion sound events, allowing music creators and producers to sketch drum lines on the fly. Classifier algorithms in VPT systems learn best from small user-specific datasets, which usually restrict modelling to small input feature sets to avoid data overfitting. This study explores severa…

    Submitted 10 April, 2022; originally announced April 2022.

    Comments: Accepted at Sound and Music Computing (SMC) conference 2022

  15. arXiv:2110.09223  [pdf, other]

    cs.SD eess.AS

    Learning Models for Query by Vocal Percussion: A Comparative Study

    Authors: Alejandro Delgado, SkoT McDonald, Ning Xu, Charalampos Saitis, Mark Sandler

    Abstract: The imitation of percussive sounds via the human voice is a natural and effective tool for communicating rhythmic ideas on the fly. Thus, the automatic retrieval of drum sounds using vocal percussion can help artists prototype drum patterns in a comfortable and quick way, smoothing the creative workflow as a result. Here we explore different strategies to perform this type of query, making use of…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: Published in proceedings of the International Computer Music Conference (ICMC) 2021

  16. arXiv:2109.02096  [pdf, other]

    cs.SD cs.AI cs.CV cs.LG eess.AS

    Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks

    Authors: Russell Sammut Bonnici, Charalampos Saitis, Martin Benning

    Abstract: This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio can be converted to the timbre of a target audio with minimal loss in quality. The adopted approach combines Variational Autoencoders with Generative Adversarial Networks to construct meaningful representations of the source audio and produce realistic generations of the targe…

    Submitted 10 October, 2021; v1 submitted 5 September, 2021; originally announced September 2021.

    Comments: 12 pages, 3 main figures, 4 tables

  17. arXiv:2107.05050  [pdf, other]

    cs.SD cs.LG eess.AS eess.SP

    Neural Waveshaping Synthesis

    Authors: Ben Hayes, Charalampos Saitis, György Fazekas

    Abstract: We present the Neural Waveshaping Unit (NEWT): a novel, lightweight, fully causal approach to neural audio synthesis which operates directly in the waveform domain, with an accompanying optimisation (FastNEWT) for efficient CPU inference. The NEWT uses time-distributed multilayer perceptrons with periodic activations to implicitly learn nonlinear transfer functions that encode the characteristics…

    Submitted 27 July, 2021; v1 submitted 11 July, 2021; originally announced July 2021.

    Comments: Accepted to ISMIR 2021; See online supplement at https://benhayes.net/projects/nws/

  18. arXiv:2107.00349  [pdf, other]

    cs.CY

    Modelling Moral Traits with Music Listening Preferences and Demographics

    Authors: Vjosa Preniqi, Kyriaki Kalimeri, Charalampos Saitis

    Abstract: Music is an essential component in our everyday lives and experiences, as it is a way that we use to express our feelings, emotions and cultures. In this study, we explore the association between music genre preferences, demographics and moral values by exploring self-reported data from an online survey administered in Canada. Participants filled in the moral foundations questionnaire, while they…

    Submitted 1 July, 2021; originally announced July 2021.

  19. arXiv:2105.11836  [pdf, other]

    cs.SD cs.LG eess.AS

    A Modulation Front-End for Music Audio Tagging

    Authors: Cyrus Vahidi, Charalampos Saitis, György Fazekas

    Abstract: Convolutional Neural Networks have been extensively explored in the task of automatic music tagging. The problem can be approached by using either engineered time-frequency features or raw audio as input. Modulation filter bank representations that have been actively researched as a basis for timbre perception have the potential to facilitate the extraction of perceptually salient features. We exp…

    Submitted 25 May, 2021; originally announced May 2021.

  20. arXiv:2009.11706  [pdf]

    cs.SD eess.AS

    Timbre Space Representation of a Subtractive Synthesizer

    Authors: Cyrus Vahidi, George Fazekas, Charalampos Saitis, Alessandro Palladini

    Abstract: In this study, we produce a geometrically scaled perceptual timbre space from dissimilarity ratings of subtractive synthesized sounds and correlate the resulting dimensions with a set of acoustic descriptors. We curate a set of 15 sounds, produced by a synthesis model that uses varying source waveforms, frequency modulation (FM) and a lowpass filter with an enveloped cutoff frequency. Pairwise dis…

    Submitted 24 September, 2020; originally announced September 2020.

  21. Multimodal Classification of Stressful Environments in Visually Impaired Mobility Using EEG and Peripheral Biosignals

    Authors: Charalampos Saitis, Kyriaki Kalimeri

    Abstract: In this study, we aim to better understand the cognitive-emotional experience of visually impaired people when navigating in unfamiliar urban environments, both outdoor and indoor. We propose a multimodal framework based on random forest classifiers, which predict the actual environment among predefined generic classes of urban settings, inferring on real-time, non-invasive, ambulatory monitoring…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: IEEE Transactions on Affective Computing 2018