-
Anomaly-aware summary statistic from data batches
Authors:
Gaia Grosso
Abstract:
Signal-agnostic data exploration based on machine learning could unveil very subtle statistical deviations of collider data from the expected Standard Model of particle physics. The beneficial impact of a large training sample on machine learning solutions motivates the exploration of increasingly large and inclusive samples of acquired data with resource-efficient computational methods. In this work, we consider the New Physics Learning Machine (NPLM), a multivariate goodness-of-fit test built on the Neyman-Pearson maximum-likelihood-ratio construction, and we address the problem of testing large samples under computational and storage resource constraints. We propose to perform parallel NPLM routines over batches of the data, and to combine them by locally aggregating over the data-to-reference density ratios learnt by each batch. The resulting data hypothesis defining the likelihood-ratio test is thus shared over the batches, and complies with the assumption that the expected rate of new physical processes is time invariant. We show that this method outperforms the simple sum of the independent tests run over the batches, and can recover, or even surpass, the sensitivity of the single test run over the full data. Besides the significant advantage for the offline application of NPLM to large samples, the proposed approach offers new prospects toward the use of NPLM to construct anomaly-aware summary statistics in quasi-online data streaming scenarios.
Submitted 1 July, 2024;
originally announced July 2024.
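The batch-aggregation idea described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names `nplm_statistic` and `aggregate_log_ratios` are hypothetical, the test statistic is written in the extended likelihood-ratio form commonly associated with NPLM, and the per-batch log density ratios are combined by a plain pointwise average, which may differ from the aggregation rule the authors actually use.

```python
import numpy as np

def nplm_statistic(log_ratio, data, reference, n_ref_expected):
    """Assumed NPLM-style statistic:
    t = 2 * [ sum_{x in data} f(x) - w * sum_{x in reference} (exp(f(x)) - 1) ],
    where f is the learnt log density ratio and w rescales the reference
    sample to the expected number of reference events."""
    w = n_ref_expected / len(reference)
    data_term = np.sum(log_ratio(data))
    ref_term = w * np.sum(np.expm1(log_ratio(reference)))
    return 2.0 * (data_term - ref_term)

def aggregate_log_ratios(batch_log_ratios):
    """Combine per-batch log-ratio functions into one shared function.
    The pointwise arithmetic mean is an illustrative assumption."""
    def aggregated(x):
        return np.mean([f(x) for f in batch_log_ratios], axis=0)
    return aggregated

# Toy usage: two hypothetical per-batch models on 1D inputs.
rng = np.random.default_rng(0)
reference = rng.normal(size=1000)
data = rng.normal(loc=0.1, size=200)
f_shared = aggregate_log_ratios([lambda x: 0.10 * x, lambda x: 0.12 * x])
t = nplm_statistic(f_shared, data, reference, n_ref_expected=200.0)
```

The point of the sketch is only that a single shared log-ratio function is evaluated on the full data, instead of summing independent per-batch statistics.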
-
Triggerless data acquisition pipeline for Machine Learning based statistical anomaly detection
Authors:
Gaia Grosso,
Nicolò Lai,
Matteo Migliorini,
Jacopo Pazzini,
Andrea Triossi,
Marco Zanetti,
Alberto Zucchetta
Abstract:
This work describes an online processing pipeline designed to identify anomalies in a continuous stream of data collected without external triggers from a particle detector. The processing pipeline begins with a local reconstruction algorithm, employing neural networks on an FPGA as its first stage. Subsequent data preparation and anomaly detection stages are accelerated using GPGPUs. As a practical demonstration of anomaly detection, we have developed a data quality monitoring application using a cosmic muon detector. Its primary objective is to detect deviations from the expected operational conditions of the detector. This serves as a proof-of-concept for a system that can be adapted for use in large particle physics experiments, enabling anomaly detection on datasets with reduced bias.
Submitted 3 November, 2023;
originally announced November 2023.
-
Fast kernel methods for Data Quality Monitoring as a goodness-of-fit test
Authors:
Gaia Grosso,
Nicolò Lai,
Marco Letizia,
Jacopo Pazzini,
Marco Rando,
Lorenzo Rosasco,
Andrea Wulzer,
Marco Zanetti
Abstract:
We propose a machine learning approach for monitoring particle detectors in real time. The goal is to assess the compatibility of incoming experimental data with a reference dataset, characterising the data behaviour under normal circumstances, via a likelihood-ratio hypothesis test. The model is based on a modern implementation of kernel methods, nonparametric algorithms that can learn any continuous function given enough data. The resulting approach is efficient and agnostic to the type of anomaly that may be present in the data. Our study demonstrates the effectiveness of this strategy on multivariate data from drift tube chamber muon detectors.
Submitted 9 March, 2023;
originally announced March 2023.
-
The Analytical Method algorithm for trigger primitives generation at the LHC Drift Tubes detector
Authors:
G. Abbiendi,
J. Alcaraz Maestre,
A. Álvarez Fernández,
B. Álvarez González,
N. Amapane,
I. Bachiller,
L. Barcellan,
C. Baldanza,
C. Battilana,
M. Bellato,
G. Bencze,
M. Benettoni,
N. Beni,
A. Benvenuti,
A. Bergnoli,
L. C. Blanco Ramos,
L. Borgonovi,
A. Bragagnolo,
V. Cafaro,
A. Calderon,
E. Calvo,
R. Carlin,
C. A. Carrillo Montoya,
F. R. Cavallo,
J. M. Cela Ruiz,
et al. (121 additional authors not shown)
Abstract:
The Compact Muon Solenoid (CMS) experiment is preparing its Phase-2 upgrade for the high-luminosity era of the LHC operation (HL-LHC). Due to the increase of occupancy, trigger latency and rates, the full electronics of the CMS Drift Tube (DT) chambers will need to be replaced. In the new design, the time bin for the digitisation of the chamber signals will be around 1 ns, and the totality of the signals will be forwarded asynchronously to the service cavern at full resolution. The new backend system will be in charge of building the trigger primitives of each chamber. These trigger primitives contain the chamber-level information about the muon candidate's position, direction, and collision time, and are used as input to the L1 CMS trigger. The added functionalities will improve the robustness of the system against ageing. An algorithm based on analytical solutions for reconstructing the DT trigger primitives, called Analytical Method, has been implemented both as a software C++ emulator and in firmware. Its performance has been estimated using the software emulator with simulated and real data samples, and through hardware implementation tests. Measured efficiencies are 96 to 98% for all qualities, and time and spatial resolutions are close to the ultimate performance of the DT chambers. A prototype chain of the HL-LHC electronics using the Analytical Method for trigger primitive generation was installed during Long Shutdown 2 of the LHC and operated in CMS cosmic data-taking campaigns in 2020 and 2021. Results from this validation step, the so-called Slice Test, are presented.
Submitted 3 February, 2023;
originally announced February 2023.
-
A fast and flexible machine learning approach to data quality monitoring
Authors:
Gaia Grosso,
Nicolò Lai,
Marco Letizia,
Jacopo Pazzini,
Marco Rando,
Andrea Wulzer,
Marco Zanetti
Abstract:
We present a machine-learning-based approach for real-time monitoring of particle detectors. The proposed strategy evaluates the compatibility between incoming batches of experimental data and a reference sample representing the data behavior in normal conditions by implementing a likelihood-ratio hypothesis test. The core model is powered by recent large-scale implementations of kernel methods, nonparametric learning algorithms that can approximate any continuous function given enough data. The resulting algorithm is fast, efficient and agnostic about the type of potential anomaly in the data. We show the performance of the model on multivariate data from a drift tube chamber muon detector.
Submitted 10 March, 2023; v1 submitted 21 January, 2023;
originally announced January 2023.
-
Learning new physics efficiently with nonparametric methods
Authors:
Marco Letizia,
Gianvito Losapio,
Marco Rando,
Gaia Grosso,
Andrea Wulzer,
Maurizio Pierini,
Marco Zanetti,
Lorenzo Rosasco
Abstract:
We present a machine learning approach for model-independent new physics searches. The corresponding algorithm is powered by recent large-scale implementations of kernel methods, nonparametric learning algorithms that can approximate any continuous function given enough data. Based on the original proposal by D'Agnolo and Wulzer (arXiv:1806.02350), the model evaluates the compatibility between experimental data and a reference model by implementing a hypothesis testing procedure based on the likelihood ratio. Model-independence is enforced by avoiding any prior assumption about the presence or shape of new physics components in the measurements. We show that our approach has dramatic advantages compared to neural network implementations in terms of training times and computational resources, while maintaining comparable performance. In particular, we conduct our tests on higher-dimensional datasets, a step forward with respect to previous studies.
Submitted 5 April, 2022;
originally announced April 2022.
-
Learning Multivariate New Physics
Authors:
Raffaele Tito D'Agnolo,
Gaia Grosso,
Maurizio Pierini,
Andrea Wulzer,
Marco Zanetti
Abstract:
We discuss a method that employs a multilayer perceptron to detect deviations from a reference model in large multivariate datasets. Our data analysis strategy does not rely on any prior assumption on the nature of the deviation. It is designed to be sensitive to small discrepancies that arise in datasets dominated by the reference model. The main conceptual building blocks were introduced in Ref. [1]. Here we make decisive progress in the algorithm implementation and we demonstrate its applicability to problems in high energy physics. We show that the method is sensitive to putative new physics signals in di-muon final states at the LHC. We also compare our performance on toy problems with that of alternative methods proposed in the literature.
Submitted 23 September, 2021; v1 submitted 27 December, 2019;
originally announced December 2019.
-
Study of the effects of radiation on the CMS Drift Tubes Muon Detector for the HL-LHC
Authors:
G. Abbiendi,
J. Alcaraz Maestre,
A. Álvarez Fernández,
B. Álvarez González,
N. Amapane,
I. Bachiller,
J. M. Barcala,
L. Barcellan,
C. Battilana,
M. Bellato,
G. Bencze,
M. Benettoni,
N. Beni,
A. Benvenuti,
L. C. Blanco Ramos,
A. Boletti,
A. Bragagnolo,
J. A. Brochero Cifuentes,
V. Cafaro,
A. Calderon,
E. Calvo,
A. Cappati,
R. Carlin,
C. A. Carrillo Montoya,
F. R. Cavallo,
et al. (118 additional authors not shown)
Abstract:
The CMS drift tubes (DT) muon detector, built to withstand the LHC expected integrated and instantaneous luminosities, will also be used in the High Luminosity LHC (HL-LHC) at an instantaneous luminosity 5 times larger and, consequently, at much higher levels of radiation, reaching about 10 times the LHC integrated luminosity. Initial irradiation tests of a spare DT chamber at the CERN gamma irradiation facility (GIF++), at a large acceleration factor (of order 100), showed ageing effects resulting in a degradation of the DT cell performance. However, full CMS simulations have shown almost no impact on the muon reconstruction efficiency over the full barrel acceptance and for the full integrated luminosity. A second spare DT chamber was moved inside the GIF++ bunker in October 2017. The chamber was irradiated at lower acceleration factors, and only 2 of the 12 layers of the chamber were held at working voltage while the radioactive source was active, with the other layers in standby. In this way, the non-aged layers are used as a reference and as a precise and unbiased telescope of muon tracks for computing the efficiency of the aged layers of the chamber, when set at working voltage for measurements. An integrated dose equivalent to two times the expected integrated luminosity of the HL-LHC run has been absorbed by this second spare DT chamber, and the final impact on the muon reconstruction efficiency is under study. Direct inspection of some extracted aged anode wires revealed a melted resistive deposit of material. Investigations into the outgassing of cell materials and of the gas components used at the GIF++ are underway. Strategies to mitigate the ageing effects are also being developed. From the long irradiation measurements of the second spare DT chamber, the effects of radiation on the performance of the DTs expected during the HL-LHC run will be presented.
Submitted 12 December, 2019;
originally announced December 2019.