Inference through innovation processes tested in the authorship attribution task
Authors:
Giulio Tani Raffaelli,
Margherita Lalli,
Francesca Tria
Abstract:
Urn models for innovation capture fundamental empirical laws shared by several real-world processes. The so-called urn model with triggering includes, as particular cases, the urn representation of the two-parameter Poisson-Dirichlet process and the Dirichlet process, seminal in Bayesian non-parametric inference. In this work, we leverage this connection to introduce a general approach for quantif…
▽ More
Urn models for innovation capture fundamental empirical laws shared by several real-world processes. The so-called urn model with triggering includes, as particular cases, the urn representation of the two-parameter Poisson-Dirichlet process and the Dirichlet process, seminal in Bayesian non-parametric inference. In this work, we leverage this connection to introduce a general approach for quantifying closeness between symbolic sequences and test it within the framework of the authorship attribution problem. The method demonstrates high accuracy when compared to other related methods in different scenarios, featuring a substantial gain in computational efficiency and theoretical transparency. Beyond the practical convenience, this work demonstrates how the recently established connection between urn models and non-parametric Bayesian inference can pave the way for designing more efficient inference methods. In particular, the hybrid approach that we propose allows us to relax the exchangeability hypothesis, which can be particularly relevant for systems exhibiting complex correlation patterns and non-stationary dynamics.
△ Less
Submitted 5 July, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
Maximum entropy models capture melodic styles
Authors:
Jason Sakellariou,
Francesca Tria,
Vittorio Loreto,
François Pachet
Abstract:
We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the $n-$body interactions of $(n-1)-$order Markov models, traditionally used in automatic music generation, we use a $k-$nearest neighbour model with pairwise interactions o…
▽ More
We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the $n-$body interactions of $(n-1)-$order Markov models, traditionally used in automatic music generation, we use a $k-$nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don't need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. The results show that our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, this Maximum Entropy scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.
△ Less
Submitted 11 October, 2016;
originally announced October 2016.