Ryan Rogers

San Francisco Bay Area

About

I work on machine learning and data privacy at Apple on the Proactive team, focusing on…

Experience & Education

  • Apple

Publications

  • Differentially Private Histograms under Continual Observation: Streaming Selection into the Unknown

    Theory and Practice of Differential Privacy Workshop (TPDP) 2021

    We generalize the continuous observation privacy setting from Dwork et al. '10 and Chan et al. '11 by allowing each event in a stream to be a subset of some (possibly unknown) universe of items. We design differentially private (DP) algorithms for histograms in several settings, including top-k selection, with privacy loss that scales with polylog(T), where T is the maximum length of the input stream. We present a meta-algorithm that can use existing one-shot top-k DP algorithms as a subroutine to continuously release private histograms from a stream. Further, we present more practical DP algorithms for two settings: 1) continuously releasing the top-k counts from a histogram over a known domain when an event can consist of an arbitrary number of items, and 2) continuously releasing histograms over an unknown domain when an event has a limited number of items.

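    As background for the polylog(T) guarantee mentioned above, the sketch below shows the classical binary-tree counting mechanism of Dwork et al. '10 and Chan et al. '11 for a single running count under continual observation. It is illustrative background only, not the paper's histogram or top-k algorithms, and the class and parameter names are hypothetical.

    import math
    import random

    def laplace(scale):
        # Laplace(0, scale) noise as a difference of two exponentials.
        return scale * (random.expovariate(1.0) - random.expovariate(1.0))

    class BinaryTreeCounter:
        """Noisy running count of a 0/1 stream of length <= T under event-level
        epsilon-DP, with error growing only polylogarithmically in T."""

        def __init__(self, T, epsilon):
            self.levels = math.ceil(math.log2(max(T, 2))) + 1  # block sizes 1 .. 2^(levels-1)
            self.scale = self.levels / epsilon   # each event falls in <= levels blocks
            self.stream = []                     # raw events (kept for clarity only)
            self.noisy_block = {}                # (level, index) -> cached noisy block sum

        def update_and_release(self, x):
            self.stream.append(x)
            t, total, pos = len(self.stream), 0.0, 0
            # Decompose [0, t) into completed dyadic blocks, largest first, and
            # sum one cached noisy count per block.
            for level in reversed(range(self.levels)):
                block = 1 << level
                if pos + block <= t:
                    key = (level, pos // block)
                    if key not in self.noisy_block:
                        true_sum = sum(self.stream[pos:pos + block])
                        self.noisy_block[key] = true_sum + laplace(self.scale)
                    total += self.noisy_block[key]
                    pos += block
            return total

    counter = BinaryTreeCounter(T=1024, epsilon=1.0)
    releases = [counter.update_and_release(x) for x in (1, 0, 1, 1, 0, 1)]
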
  • A Members First Approach to Enabling LinkedIn's Labor Market Insights at Scale

    Theory and Practice of Differential Privacy Workshop (TPDP) 2021

    We describe the privatization method used in reporting labor market insights from LinkedIn's Economic Graph, including the differentially private algorithms used to protect members' privacy. The reports show the top employers, as well as the top jobs and skills, in a given country/region and industry. We hope this data will help governments and citizens track labor market trends during the COVID-19 pandemic while also protecting the privacy of our members.

  • Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics

    Algorithmic Learning Theory (ALT) 2021

    Differential privacy (DP) provides rigorous privacy guarantees on individuals' data while also allowing for accurate statistics to be computed over the overall, sensitive dataset. To design a private system, private algorithms must first be designed that can quantify the privacy loss of each outcome that is released. However, private algorithms that inject noise into the computation are not sufficient to ensure individuals' data is protected, because many noisy results ultimately concentrate around the true, non-privatized result. Hence there have been several works providing precise formulas for how the privacy loss accumulates over multiple interactions with private algorithms. However, these formulas either provide very general bounds on the privacy loss, at the cost of being overly pessimistic for certain types of private algorithms, or they can be too narrow in scope to apply to general privacy systems. In this work, we unify existing privacy loss composition bounds for special classes of differentially private (DP) algorithms along with general DP composition bounds. In particular, we provide strong privacy loss bounds when an analyst may select pure DP, bounded range (e.g. exponential mechanisms), or concentrated DP mechanisms in any order. We also provide optimal privacy loss bounds that apply when an analyst can select pure DP and bounded range mechanisms in a batch, i.e. non-adaptively. Further, when an analyst selects mechanisms within each class adaptively, we show a difference in privacy loss between different, predetermined orderings of pure DP and bounded range mechanisms. Lastly, we compare the composition bounds of Laplace and Gaussian mechanisms based on histogram datasets.

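    For context on the "general bounds" referenced above, here is a small sketch contrasting the two standard baselines for composing k identical pure ε-DP mechanisms: basic composition and the advanced composition bound of Dwork, Rothblum, and Vadhan '10. It is not the paper's improved bounds; the function names and example parameters are illustrative assumptions.

    import math

    def basic_composition(eps, k):
        # k-fold composition of eps-DP mechanisms is (k * eps)-DP.
        return k * eps

    def advanced_composition(eps, k, delta_prime):
        # For any delta' > 0, the k-fold composition is
        # (sqrt(2 k ln(1/delta')) * eps + k * eps * (e^eps - 1), delta')-DP.
        return math.sqrt(2 * k * math.log(1 / delta_prime)) * eps \
            + k * eps * (math.exp(eps) - 1)

    # With many low-epsilon mechanisms, the advanced bound is much tighter.
    print(basic_composition(0.1, k=100))                                 # 10.0
    print(round(advanced_composition(0.1, k=100, delta_prime=1e-6), 2))  # about 6.31
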
  • LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale

    Theory and Practice of Differential Privacy Workshop (TPDP) 2020

    We present a privacy system that leverages differential privacy to protect LinkedIn members' data while also providing audience engagement insights to enable marketing analytics related applications. We detail the differentially private algorithms and other privacy safeguards used to provide results that can be used with existing real-time data analytics platforms, specifically with the open-sourced Pinot system. Our privacy system provides user-level privacy guarantees. As part of our privacy system, we include a budget management service that enforces a strict differential privacy budget on the results returned to the analyst. This budget management service brings the latest research in differential privacy into a product to maintain utility given a fixed differential privacy budget.

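    The budget management service is described only at a high level above; the sketch below shows the general shape of such a component: a fixed ε budget, decremented per Laplace-noised count query, with queries refused once the budget is spent. It is a hypothetical illustration under basic sequential composition, not LinkedIn's actual service or API.

    import random

    def laplace(scale):
        return scale * (random.expovariate(1.0) - random.expovariate(1.0))

    class PrivacyBudgetManager:
        def __init__(self, total_epsilon):
            self.total_epsilon = total_epsilon
            self.spent = 0.0

        def remaining(self):
            return self.total_epsilon - self.spent

        def noisy_count(self, true_count, epsilon):
            # Answer a sensitivity-1 count query with Laplace noise if budget remains.
            if epsilon <= 0 or self.spent + epsilon > self.total_epsilon:
                raise RuntimeError("differential privacy budget exhausted")
            self.spent += epsilon            # basic (sequential) composition accounting
            return true_count + laplace(1.0 / epsilon)

    # Example: a per-analyst budget of epsilon = 1.0 split across queries.
    budget = PrivacyBudgetManager(total_epsilon=1.0)
    print(budget.noisy_count(true_count=4213, epsilon=0.25))
    print(budget.remaining())                # 0.75 left for later queries
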
  • Optimal Differential Privacy Composition for Exponential Mechanisms and the Cost of Adaptivity

    International Conference on Machine Learning (ICML) 2020

    Composition is one of the most important properties of differential privacy (DP), as it allows algorithm designers to build complex private algorithms from DP primitives. We consider precise composition bounds of the overall privacy loss for exponential mechanisms, one of the fundamental classes of mechanisms in DP. We give explicit formulations of the optimal privacy loss for both the adaptive and non-adaptive settings. For the non-adaptive setting in which each mechanism has the same privacy parameter, we give an efficiently computable formulation of the optimal privacy loss. Furthermore, we show that there is a difference in the privacy loss when the exponential mechanism is chosen adaptively versus non-adaptively. To our knowledge, it was previously unknown whether such a gap existed for any DP mechanisms with fixed privacy parameters, and we demonstrate the gap for a widely used class of mechanisms in a natural setting. We then improve upon the best previously known upper bounds for adaptive composition of exponential mechanisms with efficiently computable formulations and show the improvement.

    Other authors
    • David Durfee
    • Jinshuo Dong
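
    The sketch below implements the exponential mechanism itself, the DP primitive whose composition the paper analyzes; it shows the standard sampling rule (probability proportional to exp(ε·u(r)/(2Δ))), not the paper's optimal composition bounds. Function and parameter names are illustrative.

    import math
    import random

    def exponential_mechanism(utilities, eps, sensitivity=1.0):
        # utilities: dict mapping each candidate outcome to its utility score.
        outcomes = list(utilities)
        # Shift by the max utility for numerical stability; the sampling
        # distribution is unchanged because all weights share a common factor.
        u_max = max(utilities.values())
        weights = [math.exp(eps * (utilities[r] - u_max) / (2.0 * sensitivity))
                   for r in outcomes]
        return random.choices(outcomes, weights=weights, k=1)[0]

    # Example: privately select the most common item from a small histogram.
    print(exponential_mechanism({"a": 120, "b": 97, "c": 15}, eps=0.5))
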
  • Practical Differentially Private Top-k Selection with Pay-what-you-get Composition

    Conference on Neural Information Processing Systems (NeurIPS) 2019 - Spotlight

    We study the problem of top-k selection over a large domain universe subject to user-level differential privacy. Typically, the exponential mechanism or report noisy max are the algorithms used to solve this problem. However, these algorithms require querying the database for the count of each domain element. We focus on the setting where the data domain is unknown, which is different from the setting of frequent itemsets where an Apriori-type algorithm can help prune the space of domain elements to query. We design algorithms that ensure (approximate) (ϵ,δ>0)-differential privacy and only need access to the true top-k̄ elements from the data for any chosen k̄ ≥ k. This is a highly desirable feature for making differential privacy practical, since the algorithms require no knowledge of the domain. We consider both the setting where a user's data can modify an arbitrary number of counts by at most 1, i.e. unrestricted sensitivity, and the setting where a user's data can modify at most some small, fixed number of counts by at most 1, i.e. restricted sensitivity. Additionally, we provide a pay-what-you-get privacy composition bound for our algorithms. That is, our algorithms might return fewer than k elements when the top-k elements are queried, but the overall privacy budget only decreases by the size of the outcome set.

    Other authors
    • David Durfee
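
    As a point of comparison for the known-domain case mentioned above, the sketch below is one-shot top-k via Gumbel-perturbed counts, which is equivalent to k peeling rounds of the exponential mechanism. It does not implement the paper's unknown-domain algorithms or the pay-what-you-get composition; the names and the noise calibration follow the standard exponential-mechanism convention and are assumptions here.

    import math
    import random

    def gumbel(scale):
        u = max(random.random(), 1e-300)          # guard against log(0)
        return -scale * math.log(-math.log(u))

    def gumbel_noisy_top_k(counts, k, eps_per_selection, sensitivity=1.0):
        # Add Gumbel(2 * sensitivity / eps) noise to every count and keep the k
        # largest; total privacy cost is roughly k * eps_per_selection by basic
        # composition. Requires the full (known) domain of counts.
        scale = 2.0 * sensitivity / eps_per_selection
        noisy = {item: c + gumbel(scale) for item, c in counts.items()}
        return sorted(noisy, key=noisy.get, reverse=True)[:k]

    hist = {"apple": 900, "banana": 850, "cherry": 20, "dates": 15}
    print(gumbel_noisy_top_k(hist, k=2, eps_per_selection=0.5))
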
  • Lower Bounds for Locally Private Estimation via Communication Complexity

    Computational Learning Theory Conference

    We develop lower bounds for estimation under local privacy constraints---including differential privacy and its relaxations to approximate or Rényi differential privacy---by showing an equivalence between private estimation and communication-restricted estimation problems. Our results apply to arbitrarily interactive privacy mechanisms, and they also give sharp lower bounds for all levels of differential privacy protections, that is, privacy mechanisms with privacy levels ε∈[0,∞). As a particular consequence of our results, we show that the minimax mean-squared error for estimating the mean of a bounded or Gaussian random vector in d dimensions scales as (d/n) · d/min{ε, ε²}.

    Other authors
    • John Duchi
  • Protection Against Reconstruction and Its Applications in Private Federated Learning

    Apple Machine Learning Research Journal

    In large-scale statistical learning, data collection and model fitting are moving increasingly toward peripheral devices—phones, watches, fitness trackers—away from centralized data collection. Concomitant with this rise in decentralized data are increasing challenges of maintaining privacy while allowing enough information to fit accurate, useful statistical models. This motivates local notions of privacy—most significantly, local differential privacy, which provides strong protections against sensitive data disclosures—where data is obfuscated before a statistician or learner can even observe it, providing strong protections to individuals' data. Yet local privacy as traditionally employed may prove too stringent for practical use, especially in modern high-dimensional statistical and machine learning problems. Consequently, we revisit the types of disclosures and adversaries against which we provide protections, considering adversaries with limited prior information and ensuring that, with high probability, they cannot reconstruct an individual's data within useful tolerances. By reconceptualizing these protections, we allow more useful data release—large privacy parameters in local differential privacy—and we design new (minimax) optimal locally differentially private mechanisms for statistical learning problems for all privacy levels. We thus present practicable approaches to large-scale locally private model training that were previously impossible, showing theoretically and empirically that we can fit large-scale image classification and language models with little degradation in utility.

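    To make the setting concrete, the sketch below shows the general shape of local privatization in federated learning: each device clips its model update and adds noise before anything leaves the device. The paper's contribution is designing (minimax) optimal mechanisms calibrated against reconstruction adversaries; the plain clip-and-add-Gaussian-noise step and its parameters below are only an illustrative assumption, not the paper's mechanism.

    import numpy as np

    def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
        # Clip the local update to L2 norm clip_norm, then add isotropic Gaussian
        # noise with per-coordinate std = noise_multiplier * clip_norm.
        rng = rng or np.random.default_rng()
        norm = np.linalg.norm(update)
        clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
        return clipped + noise

    # Each device would run this before sending its update to the server.
    local_update = np.random.default_rng(0).normal(size=16)
    print(np.round(privatize_update(local_update), 2))
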
  • Learning with Privacy at Scale

    Apple Machine Learning Research Journal

    Understanding how people use their devices often helps in improving the user experience. However, accessing the data that provides such insights — for example, what users type on their keyboards and the websites they visit — can compromise user privacy. We design a system architecture that enables learning at scale by leveraging local differential privacy, combined with existing privacy best practices. We develop efficient and scalable local differentially private algorithms and provide rigorous analyses to demonstrate the tradeoffs among utility, privacy, server computation, and device bandwidth. Understanding the balance among these factors leads us to a successful practical deployment using local differential privacy. This deployment scales to hundreds of millions of users across a variety of use cases, such as identifying popular emojis, popular health data types, and media playback preferences in Safari.

    Other authors
    • Differential Privacy Team
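
    The deployed system relies on sketch-based local DP algorithms (for example, for emoji frequencies); as the simplest illustration of the same idea, the sketch below uses k-ary randomized response: each device reports one randomized value and the server debiases the aggregate to estimate item frequencies. This is a textbook stand-in, not the deployed algorithms, and all names and parameters are assumptions.

    import math
    import random
    from collections import Counter

    def randomize(item, domain, eps):
        # Report the true item with probability p, otherwise a uniform other item;
        # the ratio of report probabilities is exactly e^eps (eps-local DP).
        k = len(domain)
        p = math.exp(eps) / (math.exp(eps) + k - 1)
        if random.random() < p:
            return item
        return random.choice([d for d in domain if d != item])

    def estimate_frequencies(reports, domain, eps):
        k, n = len(domain), len(reports)
        p = math.exp(eps) / (math.exp(eps) + k - 1)
        q = (1.0 - p) / (k - 1)
        observed = Counter(reports)
        # Debias: E[observed[d]] = n*q + true_count[d]*(p - q).
        return {d: (observed[d] - n * q) / (p - q) for d in domain}

    domain = ["emoji_joy", "emoji_heart", "emoji_thumbs_up"]
    truth = ["emoji_joy"] * 6000 + ["emoji_heart"] * 3000 + ["emoji_thumbs_up"] * 1000
    reports = [randomize(x, domain, eps=2.0) for x in truth]
    estimates = estimate_frequencies(reports, domain, eps=2.0)
    print({d: round(v) for d, v in estimates.items()})
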
