Showing 1–49 of 49 results for author: Hong, J

  1. arXiv:2407.05744  [pdf, other]

    eess.AS cs.SD

    Automating Urban Soundscape Enhancements with AI: In-situ Assessment of Quality and Restorativeness in Traffic-Exposed Residential Areas

    Authors: Bhan Lam, Zhen-Ting Ong, Kenneth Ooi, Wen-Hui Ong, Trevor Wong, Karn N. Watcharasupat, Vanessa Boey, Irene Lee, Joo Young Hong, Jian Kang, Kar Fye Alvin Lee, Georgios Christopoulos, Woon-Seng Gan

    Abstract: Formalized in ISO 12913, the "soundscape" approach is a paradigmatic shift towards perception-based urban sound management, aiming to alleviate the substantial socioeconomic costs of noise pollution to advance the United Nations Sustainable Development Goals. Focusing on traffic-exposed outdoor residential sites, we implemented an automatic masker selection system (AMSS) utilizing natural sounds t…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 41 pages, 4 figures. Preprint submitted to an Elsevier journal

  2. arXiv:2407.04239  [pdf, other]

    eess.SY

    Enabling Multicast Transmission for Spatio-Temporally Asynchronous User Requests in Wireless Environments

    Authors: Hojung Lee, Jun-Pyo Hong, Wan Choi

    Abstract: The surge in wireless devices and data traffic volume necessitates more efficient transmission methods. Multicasting has garnered consistent attention as a means to fulfill the increasing demand for more efficient data transmission methods. Nevertheless, leveraging multicast wireless networks for spatio-temporally asynchronous data requests poses challenges. In this context, this paper introduces…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.03274  [pdf, other]

    eess.SP

    Using Photoplethysmography to Detect Real-time Blood Pressure Changes with a Calibration-free Deep Learning Model

    Authors: Jingyuan Hong, Manasi Nandi, Weiwei Jin, Jordi Alastruey

    Abstract: Blood pressure (BP) changes are linked to individual health status in both clinical and non-clinical settings. This study developed a deep learning model to classify systolic (SBP), diastolic (DBP), and mean (MBP) BP changes using photoplethysmography (PPG) waveforms. Data from the Vital Signs Database (VitalDB) comprising 1,005 ICU patients with synchronized PPG and BP recordings was used. BP cha…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 8 pages, 5 figures, 7 tables, 1 supplementary material

  4. arXiv:2406.05472  [pdf, other]

    cs.CR eess.SY

    A Novel Generative AI-Based Framework for Anomaly Detection in Multicast Messages in Smart Grid Communications

    Authors: Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong

    Abstract: Cybersecurity breaches in digital substations can pose significant challenges to the stability and reliability of power system operations. To address these challenges, defense and mitigation techniques are required. Identifying and detecting anomalies in information and communication technology (ICT) is crucial to ensure secure device interactions within digital substations. This paper proposes a…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 10 figures, Submitted to IEEE Transactions on Information Forensics and Security

  5. arXiv:2402.18076  [pdf, other]

    eess.SY

    Online Ecological Gearshift Strategy via Neural Network with Soft-Argmax Operator

    Authors: Xi Luo, Shiying Dong, Jinlong Hong, Bingzhao Gao, Hong Chen

    Abstract: This paper presents a neural network optimizer with soft-argmax operator to achieve an ecological gearshift strategy in real-time. The strategy is reformulated as the mixed-integer model predictive control (MIMPC) problem to minimize energy consumption. Then the outer convexification is introduced to transform integer variables into relaxed binary controls. To approximate binary solutions properly…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures, submitted to 8th IFAC Conference on Nonlinear Model Predictive Control

  6. arXiv:2402.11632  [pdf, other]

    eess.SP

    Reliable long timescale decision-directed channel estimation for OFDM system

    Authors: Xun Wang, Xin Xie, Cunqing Hua, Jianan Hong, Pengwenlong Gu

    Abstract: Decision-directed channel estimation (DDCE) is one kind of blind channel estimation method that tracks the channel blindly by an iterative algorithm without relying on the pilots, which can increase the utilization of wireless resource. However, one major problem of DDCE is the performance degradation caused by error accumulation during the tracking process. In this paper, we propose an reliable D…

    Submitted 18 February, 2024; originally announced February 2024.

  7. arXiv:2402.06777  [pdf, other]

    cs.HC cs.MM cs.SD eess.AS

    Capturing Cancer as Music: Cancer Mechanisms Expressed through Musification

    Authors: Rostyslav Hnatyshyn, Jiayi Hong, Ross Maciejewski, Christopher Norby, Carlo C. Maley

    Abstract: The development of cancer is difficult to express on a simple and intuitive level due to its complexity. Since cancer is so widespread, raising public awareness about its mechanisms can help those affected cope with its realities, as well as inspire others to make lifestyle adjustments and screen for the disease. Unfortunately, studies have shown that cancer literature is too technical for the gen…

    Submitted 9 February, 2024; originally announced February 2024.

  8. arXiv:2312.02669  [pdf, other]

    physics.optics eess.IV

    Deep-learning-driven end-to-end metalens imaging

    Authors: Joonhyuk Seo, Jaegang Jo, Joohoon Kim, Joonho Kang, Chanik Kang, Seongwon Moon, Eunji Lee, Jehyeong Hong, Junsuk Rho, Haejun Chung

    Abstract: Recent advances in metasurface lenses (metalenses) have shown great potential for opening a new era in compact imaging, photography, light detection and ranging (LiDAR), and virtual reality/augmented reality (VR/AR) applications. However, the fundamental trade-off between broadband focusing efficiency and operating bandwidth limits the performance of broadband metalenses, resulting in chromatic ab…

    Submitted 10 May, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 17 pages, 7 figures, 1 table

  9. arXiv:2311.13488  [pdf, other]

    eess.SY

    Machine Learning based Post Event Analysis for Cybersecurity of Cyber-Physical System

    Authors: Kuchan Park, Junho Hong, Wencong Su, HyoJong Lee

    Abstract: As Information and Communication Technology (ICT) equipment continues to be integrated into power systems, issues related to cybersecurity are increasingly emerging. Particularly noteworthy is the transition to digital substations, which is shifting operations from traditional hardwired-based systems to communication-based Supervisory Control and Data Acquisition (SCADA) system operations. These c…

    Submitted 7 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: Submitted to 2024 IEEE Power and Energy Society General Meeting

  10. arXiv:2311.06829  [pdf, ps, other]

    eess.SP

    Joint Design of Coding and Modulation for Digital Over-the-Air Computation

    Authors: Xin Xie, Cunqing Hua, Jianan Hong, Yuejun Wei

    Abstract: Due to its high communication efficiency, over-the-air computation (AirComp) has been expected to carry out various computing tasks in the next-generation wireless networks. However, up to now, most applications of AirComp are explored in the analog domain, which limits the capability of AirComp in resisting the complex wireless environment, not to mention to integrate the AirComp technique to the…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: This paper has been submitted to IEEE ICC 2024

  11. arXiv:2311.05462  [pdf, other]

    cs.CR eess.SY

    ChatGPT and Other Large Language Models for Cybersecurity of Smart Grid Applications

    Authors: Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong

    Abstract: Cybersecurity breaches targeting electrical substations constitute a significant threat to the integrity of the power grid, necessitating comprehensive defense and mitigation strategies. Any anomaly in information and communication technology (ICT) should be detected for secure communications between devices in digital substations. This paper proposes large language models (LLM), e.g., ChatGPT, fo…

    Submitted 25 February, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, Accepted, 2024 IEEE Power & Energy Society General Meeting (PESGM), Seattle, WA, USA

  12. arXiv:2310.14946  [pdf, other]

    cs.MM cs.SD eess.AS

    Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model

    Authors: Joanna Hong, Se Jin Park, Yong Man Ro

    Abstract: We present a novel approach to multilingual audio-visual speech recognition tasks by introducing a single model on a multilingual dataset. Motivated by a human cognitive system where humans can intuitively distinguish different languages without any conscious effort or guidance, we propose a model that can capture which language is given as an input speech by distinguishing the inherent similariti…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  13. arXiv:2310.05934  [pdf, other]

    cs.CV cs.AI cs.MM eess.IV

    DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion

    Authors: Se Jin Park, Joanna Hong, Minsu Kim, Yong Man Ro

    Abstract: Speech-driven 3D facial animation has gained significant attention for its ability to create realistic and expressive facial animations in 3D space based on speech. Learning-based methods have shown promising progress in achieving accurate facial motion synchronized with speech. However, one-to-many nature of speech-to-3D facial synthesis has not been fully explored: while the lip accurately synch…

    Submitted 23 August, 2023; originally announced October 2023.

  14. arXiv:2309.12566  [pdf, other]

    cs.RO eess.SY math.OC

    Recent Advances in Path Integral Control for Trajectory Optimization: An Overview in Theoretical and Algorithmic Perspectives

    Authors: Muhammad Kazim, JunGee Hong, Min-Gyeom Kim, Kwang-Ki K. Kim

    Abstract: This paper presents a tutorial overview of path integral (PI) control approaches for stochastic optimal control and trajectory optimization. We concisely summarize the theoretical development of path integral control to compute a solution for stochastic optimal control and provide algorithmic descriptions of the cross-entropy (CE) method, an open-loop controller using the receding horizon scheme k…

    Submitted 1 December, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 16 pages, 9 figures

    MSC Class: 68T40; 13P25 ACM Class: I.2.9; I.2.8; G.1.6; G.4

  15. arXiv:2308.07787  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding

    Authors: Jeongsoo Choi, Joanna Hong, Yong Man Ro

    Abstract: Recent research has demonstrated impressive results in video-to-speech synthesis which involves reconstructing speech solely from visual input. However, previous works have struggled to accurately synthesize speech due to a lack of sufficient guidance for the model to infer the correct content with the appropriate sound. To resolve the issue, they have adopted an extra speaker embedding as a speak…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  16. arXiv:2306.15212  [pdf, other]

    cs.SD cs.LG eess.AS

    TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection

    Authors: Jie Liu, Zhiba Su, Hui Huang, Caiyan Wan, Quanxiu Wang, Jiangli Hong, Benlai Tang, Fengjie Zhu

    Abstract: Thanks to recent advancements in end-to-end speech modeling technology, it has become increasingly feasible to imitate and clone a user's voice. This leads to a significant challenge in differentiating between authentic and fabricated audio segments. To address the issue of user voice abuse and misuse, the second Audio Deepfake Detection Challenge (ADD 2023) aims to detect and analyze deepfake spe…

    Submitted 27 June, 2023; originally announced June 2023.

  17. arXiv:2304.06237  [pdf, other]

    cs.LG eess.SP

    Deep learning based ECG segmentation for delineation of diverse arrhythmias

    Authors: Chankyu Joung, Mijin Kim, Taejin Paik, Seong-Ho Kong, Seung-Young Oh, Won Kyeong Jeon, Jae-hu Jeon, Joong-Sik Hong, Wan-Joong Kim, Woong Kook, Myung-Jin Cha, Otto van Koert

    Abstract: Accurate delineation of key waveforms in an ECG is a critical initial step in extracting relevant features to support the diagnosis and treatment of heart conditions. Although deep learning based methods using a segmentation model to locate the P, QRS, and T waves have shown promising results, their ability to handle signals exhibiting arrhythmia remains unclear. This study builds on existing rese…

    Submitted 6 September, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  18. arXiv:2304.01544  [pdf]

    physics.flu-dyn eess.SY physics.pop-ph

    Numerical Investigation of Airborne Infection Risk in an Elevator Cabin under Different Ventilation Designs

    Authors: Ata Nazari, Changchang Wang, Ruichen He, Farzad Taghizadeh-Hesary, Jiarong Hong

    Abstract: Airborne transmission of SARS-CoV-2 via virus-laden aerosols in enclosed spaces poses a significant concern. Elevators, commonly utilized enclosed spaces in modern tall buildings, present a challenge as the impact of varying heating, ventilation, and air conditioning (HVAC) systems on virus transmission within these cabins remains unclear. In this study, we employ computational modeling to examine…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: 38 pages, 14 figures

  19. arXiv:2303.08536  [pdf, other]

    cs.MM cs.CV cs.LG cs.SD eess.AS

    Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring

    Authors: Joanna Hong, Minsu Kim, Jeongsoo Choi, Yong Man Ro

    Abstract: This paper deals with Audio-Visual Speech Recognition (AVSR) under multimodal input corruption situations where audio inputs and visual inputs are both corrupted, which is not well addressed in previous research directions. Previous studies have focused on how to complement the corrupted audio inputs with the clean visual inputs with the assumption of the availability of clean visual inputs. Howev…

    Submitted 20 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR 2023. Implementation available: https://github.com/joannahong/AV-RelScore

  20. arXiv:2303.05732  [pdf]

    cs.SE cs.PF eess.SY

    Securing Safety in Collaborative Cyber-Physical Systems through Fault Criticality Analysis

    Authors: Manzoor Hussain, Nazakat Ali, Jang-Eui Hong

    Abstract: Collaborative Cyber-Physical Systems (CCPS) are systems that contain tightly coupled physical and cyber components, massively interconnected subsystems, and collaborate to achieve a common goal. The safety of a single Cyber-Physical System (CPS) can be achieved by following the safety standards such as ISO 26262 and IEC 61508 or by applying hazard analysis techniques. However, due to the complex,…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: This paper is an extended version of an article submitted to KCSE-2021

    Journal ref: KIPS Transactions on Software and Data Engineering, vol. 10, no. 8, pp. 287-300, 2021

  21. arXiv:2302.08841  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    Lip-to-Speech Synthesis in the Wild with Multi-task Learning

    Authors: Minsu Kim, Joanna Hong, Yong Man Ro

    Abstract: Recent studies have shown impressive performance in Lip-to-speech synthesis that aims to reconstruct speech from visual information alone. However, they have been suffering from synthesizing accurate speech in the wild, due to insufficient supervision for guiding the model to infer the correct content. Distinct from the previous methods, in this paper, we develop a powerful Lip2Speech method that…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted at ICASSP 2023. Demo available: https://github.com/joannahong/Lip-to-Speech-Synthesis-in-the-Wild

  22. arXiv:2301.09638  [pdf]

    physics.bio-ph eess.IV physics.ins-det physics.optics

    In situ Biological Particle Analyzer based on Digital Inline Holography

    Authors: Delaney Sanborn, Ruichen He, Lei Feng, Jiarong Hong

    Abstract: Obtaining in situ measurements of biological microparticles is crucial for both scientific research and numerous industrial applications (e.g., early detection of harmful algal blooms, monitoring yeast during fermentation). However, existing methods are limited to offer timely diagnostics of these particles with sufficient accuracy and information. Here, we introduce a novel method for real-time,…

    Submitted 14 January, 2023; originally announced January 2023.

    Comments: 18 pages, 9 figures

  23. arXiv:2212.06368  [pdf, other]

    cs.CV eess.IV

    Single Cell Training on Architecture Search for Image Denoising

    Authors: Bokyeung Lee, Kyungdeuk Ko, Jonghwan Hong, Hanseok Ko

    Abstract: Neural Architecture Search (NAS) for automatically finding the optimal network architecture has shown some success with competitive performances in various computer vision tasks. However, NAS in general requires a tremendous amount of computations. Thus reducing computational cost has emerged as an important issue. Most of the attempts so far has been based on manual approaches, and often the arch…

    Submitted 12 December, 2022; originally announced December 2022.

  24. arXiv:2211.08530  [pdf, ps, other]

    eess.SY

    Cyber-Attack Event Analysis for EV Charging Stations

    Authors: Mansi Girdhar, Junho Hong, Yongsik You, Tai-jin Song, Manimaran Govindarasu

    Abstract: Safe and secure electric vehicle charging stations (EVCSs) are important in smart transportation infrastructure. The prevalence of EVCSs has rapidly increased over time in response to the rising demand for EV charging. However, developments in information and communication technologies (ICT) have made the cyber-physical system (CPS) of EVCSs susceptible to cyber-attacks, which might destabilize th…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 5 Pages, 2 Figures, 2 Tables, 10 Mathematical Equations, PES GM Conference Paper

  25. arXiv:2211.00924  [pdf, other]

    cs.CV cs.AI eess.IV

    SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory

    Authors: Se Jin Park, Minsu Kim, Joanna Hong, Jeongsoo Choi, Yong Man Ro

    Abstract: The challenge of talking face generation from speech lies in aligning two different modal information, audio and video, such that the mouth region corresponds to input audio. Previous methods either exploit audio-visual representation learning or leverage intermediate structural information such as landmarks and 3D models. However, they struggle to synthesize fine details of the lips varying at th…

    Submitted 2 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted at AAAI 2022 (Oral)

  26. arXiv:2210.14297  [pdf, other]

    eess.IV cs.CV

    Progressively refined deep joint registration segmentation (ProRSeg) of gastrointestinal organs at risk: Application to MRI and cone-beam CT

    Authors: Jue Jiang, Jun Hong, Kathryn Tringale, Marsha Reyngold, Christopher Crane, Neelam Tyagi, Harini Veeraraghavan

    Abstract: Method: ProRSeg was trained using 5-fold cross-validation with 110 T2-weighted MRI acquired at 5 treatment fractions from 10 different patients, taking care that same patient scans were not placed in training and testing folds. Segmentation accuracy was measured using Dice similarity coefficient (DSC) and Hausdorff distance at 95th percentile (HD95). Registration consistency was measured using coe…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: This manuscript is currently under review at Medical Physics

  27. arXiv:2208.10644  [pdf, other]

    cs.CR eess.SY

    Machine Learning-Enabled Cyber Attack Prediction and Mitigation for EV Charging Stations

    Authors: Mansi Girdhar, Junho Hong, Yongsik Yoo, Tai-Jin Song

    Abstract: Safe and reliable electric vehicle charging stations (EVCSs) have become imperative in an intelligent transportation infrastructure. Over the years, there has been a rapid increase in the deployment of EVCSs to address the upsurging charging demands. However, advances in information and communication technologies (ICT) have rendered this cyber-physical system (CPS) vulnerable to suffering cyber th…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 5 pages, 4 figures, 11 mathematical equations

  28. arXiv:2207.06020  [pdf, other]

    cs.SD cs.AI cs.CV cs.MM eess.AS eess.IV

    Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition

    Authors: Joanna Hong, Minsu Kim, Daehun Yoo, Yong Man Ro

    Abstract: This paper focuses on designing a noise-robust end-to-end Audio-Visual Speech Recognition (AVSR) system. To this end, we propose Visual Context-driven Audio Feature Enhancement module (V-CAFE) to enhance the input noisy audio speech with a help of audio-visual correspondence. The proposed V-CAFE is designed to capture the transition of lip movements, namely visual context and to generate a noise r…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: Accepted at Interspeech 2022

  29. ARAUS: A Large-Scale Dataset and Baseline Models of Affective Responses to Augmented Urban Soundscapes

    Authors: Kenneth Ooi, Zhen-Ting Ong, Karn N. Watcharasupat, Bhan Lam, Joo Young Hong, Woon-Seng Gan

    Abstract: Choosing optimal maskers for existing soundscapes to effect a desired perceptual change via soundscape augmentation is non-trivial due to extensive varieties of maskers and a dearth of benchmark datasets with which to compare and develop soundscape augmentation models. To address this problem, we make publicly available the ARAUS (Affective Responses to Augmented Urban Soundscapes) dataset, which…

    Submitted 2 July, 2024; v1 submitted 3 July, 2022; originally announced July 2022.

    Comments: [v1, v2] 25 pages, 11 figures. [v3] 33 pages, 18 figures. v3 updated with changes made after peer review. in IEEE Transactions on Affective Computing, 2023. [v4] 33 pages, 18 figures. Fixed inaccurate author list in citation #90

    Journal ref: IEEE Trans. Affect. Comput., pp. 1-17, 2023

  30. arXiv:2206.07458  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection

    Authors: Joanna Hong, Minsu Kim, Yong Man Ro

    Abstract: The goal of this work is to reconstruct speech from a silent talking face video. Recent studies have shown impressive performance on synthesizing speech from silent talking face videos. However, they have not explicitly considered on varying identity characteristics of different speakers, which place a challenge in the video-to-speech synthesis, and this becomes more critical in unseen-speaker set…

    Submitted 20 July, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted by ECCV 2022

  31. arXiv:2206.03112  [pdf]

    cs.LG cs.SD eess.AS

    Singapore Soundscape Site Selection Survey (S5): Identification of Characteristic Soundscapes of Singapore via Weighted k-means Clustering

    Authors: Kenneth Ooi, Bhan Lam, Joo Young Hong, Karn N. Watcharasupat, Zhen-Ting Ong, Woon-Seng Gan

    Abstract: The ecological validity of soundscape studies usually rests on a choice of soundscapes that are representative of the perceptual space under investigation. For example, a soundscape pleasantness study might investigate locations with soundscapes ranging from "pleasant" to "annoying". The choice of soundscapes is typically researcher-led, but a participant-led process can reduce selection bias and…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: 23 pages, 8 figures. Submitted to Sustainability

    Journal ref: MDPI Sustainability. 2022; 14(12):7485

  32. arXiv:2204.01726  [pdf, other]

    cs.CV cs.AI eess.AS

    Lip to Speech Synthesis with Visual Context Attentional GAN

    Authors: Minsu Kim, Joanna Hong, Yong Man Ro

    Abstract: In this paper, we propose a novel lip-to-speech generative adversarial network, Visual Context Attentional GAN (VCA-GAN), which can jointly model local and global lip movements during speech synthesis. Specifically, the proposed VCA-GAN synthesizes the speech from local lip visual features by finding a mapping function of viseme-to-phoneme, while global visual context is embedded into the intermed…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Published at NeurIPS 2021

  33. arXiv:2204.01265  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video

    Authors: Minsu Kim, Joanna Hong, Se Jin Park, Yong Man Ro

    Abstract: In this paper, we introduce a novel audio-visual multi-modal bridging framework that can utilize both audio and visual information, even with uni-modal inputs. We exploit a memory network that stores source (i.e., visual) and target (i.e., audio) modal representations, where source modal representation is what we are given, and target modal representations are what we want to obtain from the memor…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Published at ICCV 2021

  34. arXiv:2111.07377  [pdf, other]

    eess.SY

    Eco-Coasting Strategies Using Road Grade Preview: Evaluation and Online Implementation Based on Mixed Integer Model Predictive Control

    Authors: Yongjun Yan, Nan Li, Jinlong Hong, Bingzhao Gao, Hong Chen, Jing Sun, Ziyou Song

    Abstract: Coasting has been widely used in the eco-driving guidelines to reduce fuel consumption by profiting from kinetic energy. However, the comprehensive comparison between different coasting strategies and online performance of the eco-coasting strategy using road grade preview are still unclear because of the oversimplification and the integer variable in the optimal control problems. Herein, two diff…

    Submitted 25 December, 2021; v1 submitted 14 November, 2021; originally announced November 2021.

    Comments: 13 pages, 18 figures

  35. arXiv:2110.10965  [pdf, other]

    eess.IV cs.CV

    2020 CATARACTS Semantic Segmentation Challenge

    Authors: Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio, et al. (15 additional authors not shown)

    Abstract: Surgical scene segmentation is essential for anatomy and instrument localization which can be further used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presenc…

    Submitted 24 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

  36. Data-driven yaw misalignment correction for utility-scale wind turbines

    Authors: Linyue Gao, Jiarong Hong

    Abstract: In recent years, wind turbine yaw misalignment that tends to degrade the turbine power production and impact the blade fatigue loads raises more attention along with the rapid development of large-scale wind turbines. The state-of-the-art correction methods require additional instruments such as LiDAR to provide the ground truths and are not suitable for long-term operation and large-scale impleme…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 24 pages, 9 figures

  37. arXiv:2109.05664  [pdf]

    cs.CV eess.IV

    Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

    Authors: Jin Hong, Simon Chun-Ho Yu, Weitian Chen

    Abstract: Liver segmentation on images acquired using computed tomography (CT) and magnetic resonance imaging (MRI) plays an important role in clinical management of liver diseases. Compared to MRI, CT images of liver are more abundant and readily available. However, MRI can provide richer quantitative information of the liver compared to CT. Thus, it is desirable to achieve unsupervised domain adaptation f…

    Submitted 24 February, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

  38. arXiv:2108.11364  [pdf, other]

    cs.CV eess.IV

    Blind Image Decomposition

    Authors: Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li

    Abstract: We propose and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing as well as the mixing mechanism are unknown. For example, rain may consist of multiple components, such as rain streaks, raindrops, snow, and haze. Rainy images can be tr…

    Submitted 18 July, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: ECCV 2022. Project page: https://junlinhan.github.io/projects/BID.html. Code: https://github.com/JunlinHan/BID

  39. Power Management of Nanogrid Cluster with P2P Electricity Trading Based on Future Trends of Load Demand and PV Power Production

    Authors: Sangkeum Lee, Hojun Jin, Luiz Felipe Vecchietti, Junhee Hong, Ki-Bum Park, Dongsoo Har

    Abstract: This paper presents the power management of the nanogrid clusters assisted by a novel peer-to-peer (P2P) electricity trading. In our work, unbalance of power consumption among clusters is mitigated by the proposed P2P trading method. For power management of individual clusters, multi-objective optimization simultaneously minimizing total power consumption, portion of grid power consumption, and tot…

    Submitted 2 December, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: This article is submitted for publication in Sustainable Cities and Society

  40. arXiv:2003.14373  [pdf]

    eess.IV physics.data-an

    Machine learning shadowgraph for particle size and shape characterization

    Authors: Jiaqi Li, Siyao Shao, Jiarong Hong

    Abstract: Conventional image processing for particle shadow image is usually time-consuming and suffers degraded image segmentation when dealing with the images consisting of complex-shaped and clustered particles with varying backgrounds. In this paper, we introduce a robust learning-based method using a single convolution neural network (CNN) for analyzing particle shadow images. Our approach employs a tw…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: 11 pages, 6 figures

  41. arXiv:2003.03053  [pdf, ps, other]

    eess.SP

    Experimental Demonstration of Location-aware Beam Alignment

    Authors: Junyeol Hong, Hyeonjin Chung, Sunwoo Kim

    Abstract: The main focus of beam alignment is to find the optimal beam which yields the largest received signal strength (RSS) with faster speed. In this paper, we demonstrate an efficient beam alignment scheme with our testbed. The algorithm we experiment uses the location information for the computation efficient beam alignment. The testbed transmits and receives the 13.8 GHz signal and steers a beam on bot…

    Submitted 6 March, 2020; originally announced March 2020.

    Comments: 4 pages, 6 figures

  42. arXiv:1912.13036  [pdf]

    physics.app-ph eess.IV

    Machine learning holography for measuring 3D particle size distribution

    Authors: Siyao Shao, Kevin Mallery, Jiarong Hong

    Abstract: Particle size measurement based on digital holography with conventional algorithms is usually time-consuming and susceptible to noises associated with hologram quality and particle complexity, limiting its usage in a broad range of engineering applications and fundamental research. We propose a learning-based hologram processing method to cope with the aforementioned issues. The proposed approach…

    Submitted 30 December, 2019; originally announced December 2019.

    Comments: 14 pages, 6 figures

  43. arXiv:1912.06258  [pdf, other]

    cs.CV cs.RO eess.SY

    Mcity Data Collection for Automated Vehicles Study

    Authors: Yiqun Dong, Yuanxin Zhong, Wenbo Yu, Minghan Zhu, Pingping Lu, Yeyang Fang, Jiajun Hong, Huei Peng

    Abstract: The main goal of this paper is to introduce the data collection effort at Mcity targeting automated vehicle development. We captured a comprehensive set of data from a set of perception sensors (Lidars, Radars, Cameras) as well as vehicle steering/brake/throttle inputs and an RTK unit. Two in-cabin cameras record the human driver's behaviors for possible future use. The naturalistic driving on sel…

    Submitted 12 December, 2019; originally announced December 2019.

  44. W-Net: Two-stage U-Net with misaligned data for raw-to-RGB mapping

    Authors: Kwang-Hyun Uhm, Seung-Wook Kim, Seo-Won Ji, Sung-Jin Cho, Jun-Pyo Hong, Sung-Jea Ko

    Abstract: Recent research on learning a mapping between raw Bayer images and RGB images has progressed with the development of deep convolutional neural networks. A challenging data set namely the Zurich Raw-to-RGB data set (ZRR) has been released in the AIM 2019 raw-to-RGB mapping challenge. In ZRR, input raw and target RGB images are captured by two different cameras and thus not perfectly aligned. Moreov…

    Submitted 21 November, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

    Comments: Accepted by ICCVW 2019

  45. arXiv:1911.00805  [pdf]

    eess.IV physics.optics

    Machine Learning Holography for 3D Particle Field Imaging

    Authors: Siyao Shao, Kevin Mallery, Santosh Kumar, Jiarong Hong

    Abstract: We propose a new learning-based approach for 3D particle field imaging using holography. Our approach uses a U-net architecture incorporating residual connections, Swish activation, hologram preprocessing, and transfer learning to cope with challenges arising in particle holograms where accurate measurement of individual particles is crucial. Assessments on both synthetic and experimental hologram…

    Submitted 2 November, 2019; originally announced November 2019.

    Comments: 12 pages, 7 figures

  46. arXiv:1910.04681  [pdf]

    physics.bio-ph eess.IV physics.optics

    Laser scanning reflection-matrix microscopy for label-free in vivo imaging of a mouse brain through an intact skull

    Authors: Seokchan Yoon, Hojun Lee, Jin Hee Hong, Yong-Sik Lim, Wonshik Choi

    Abstract: We present a laser scanning reflection-matrix microscopy combining the scanning of laser focus and the wide-field mapping of the electric field of the backscattered waves for eliminating higher-order aberrations even in the presence of strong multiple light scattering noise. Unlike conventional confocal laser scanning microscopy, we record the amplitude and phase maps of reflected waves from the s…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 14 pages, 4 figures

  47. arXiv:1904.04884  [pdf, ps, other]

    eess.IV physics.flu-dyn physics.optics

    Regularized Inverse Holographic Volume Reconstruction for 3D Particle Tracking

    Authors: Kevin Mallery, Jiarong Hong

    Abstract: The key limitations of digital inline holography (DIH) for particle tracking applications are poor longitudinal resolution, particle concentration limits, and case-specific processing. We utilize an inverse problem method with fused lasso regularization to perform full volumetric reconstructions of particle fields. By exploiting data sparsity in the solution and utilizing GPU processing, we dramat…

    Submitted 9 April, 2019; originally announced April 2019.

    Comments: 15 pages, 6 figures

  48. arXiv:1903.09495  [pdf, other]

    cs.OH eess.SY

    Substation One-Line Diagram Automatic Generation and Visualization

    Authors: Jing Hong, Yue Li, Yiran Xu, Chen Yuan, Hong Fan, Guangyi Liu, Renchang Dai

    Abstract: In Energy Management System (EMS) applications and many other off-line planning and study tools, one-line diagram (OLND) of the whole system and stations is a straightforward view for planners and operators to design, monitor, analyze, and control the power system. Large-scale power system OLND is usually manually developed and maintained. The work is tedious, time-consuming and easy to make mista…

    Submitted 20 March, 2019; originally announced March 2019.

    Comments: 6 pages, 6 figures, 1 table, accepted by 2019 IEEE PES ISGT ASIA

  49. arXiv:1805.00367  [pdf, other]

    eess.SP cs.LG

    A Multi-State Diagnosis and Prognosis Framework with Feature Learning for Tool Condition Monitoring

    Authors: Chong Zhang, Geok Soon Hong, Jun-Hong Zhou, Kay Chen Tan, Haizhou Li, Huan Xu, Jihoon Hong, Hian-Leng Chan

    Abstract: In this paper, a multi-state diagnosis and prognosis (MDP) framework is proposed for tool condition monitoring via a deep belief network based multi-state approach (DBNMS). For fault diagnosis, a cost-sensitive deep belief network (namely ECS-DBN) is applied to deal with the imbalanced data problem for tool state estimation. An appropriate prognostic degradation model is then applied for tool wear…

    Submitted 30 April, 2018; originally announced May 2018.

    Comments: 14 pages, 12 figures, 10 tables, submitted to IEEE Transactions on Cybernetics