research-article

SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter

Published: 20 August 2020
  • Abstract

    Personalized recommendation products at Twitter target a multitude of heterogeneous items: Tweets, Events, Topics, Hashtags, and users. Each of these targets varies in its cardinality (which affects the scale of the problem) and its "shelf life" (which constrains the latency of generating the recommendations). Although Twitter has built a variety of recommendation systems over the past decade, the broader problem has mostly been tackled piecemeal. In this paper, we present SimClusters, a general-purpose representation layer based on overlapping communities, in which users as well as heterogeneous content can be represented as sparse, interpretable vectors to support a multitude of recommendation tasks. We propose a novel algorithm for community discovery based on Metropolis-Hastings sampling, which is both more accurate and significantly faster than off-the-shelf alternatives. SimClusters scales to networks with billions of users and has been effective across a variety of deployed applications at Twitter.
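
    As a rough illustration of the idea of placing users and heterogeneous items in one sparse community space, the sketch below stores each vector as a mapping from community id to weight and ranks items for a user. The abstract does not say how vectors are compared, so the use of cosine similarity is an assumption, and the community ids, weights, and the names `cosine` and `rank_items` are hypothetical, not taken from the paper.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {community_id: weight}."""
    # Iterate over the shorter vector so the cost depends on the sparser side.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_items(user_vec, item_vecs, k=2):
    """Return the k items most similar to the user, best first."""
    scored = [(name, cosine(user_vec, vec)) for name, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical data: community ids and weights are made up for illustration.
user = {3: 0.8, 17: 0.6}
items = {
    "tweet_a": {3: 1.0},
    "topic_b": {17: 0.5, 42: 0.5},
    "hashtag_c": {99: 1.0},
}
top = rank_items(user, items)
for name, score in top:
    print(name, round(score, 3))
```

    Because every item type lives in the same community space, one scoring function can serve Tweets, Topics, and Hashtags alike. Each nonzero entry also names a community, which is what keeps such vectors interpretable.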

    Supplementary Material

    MP4 File (3394486.3403370.mp4)
    A brief explainer video with slides for the paper "SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter".


      Published In

      KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
      August 2020
      3664 pages
      ISBN:9781450379984
      DOI:10.1145/3394486

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. community detection
      2. personalization
      3. recommender systems

      Qualifiers

      • Research-article

      Conference

      KDD '20

      Acceptance Rates

      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
