research-article

Longshot: Indexing Growing Databases Using MPC and Differential Privacy

Published: 01 April 2023

Abstract

    In this work, we propose Longshot, a novel design for secure outsourced database systems that supports ad-hoc queries through the use of secure multi-party computation and differential privacy. By combining these two techniques, we build and maintain data structures (i.e., synopses, indexes, and stores) that improve query execution efficiency while maintaining strong privacy and security guarantees. As data owners upload new records, Longshot continually updates these data structures using novel algorithms that leverage bounded information leakage to minimize the use of expensive cryptographic protocols. Furthermore, Longshot organizes the data structures as a hierarchical tree based on when each update occurred, allowing for update strategies that provide logarithmic error over time. Through this approach, Longshot introduces a tunable three-way trade-off between privacy, accuracy, and efficiency. Our experimental results confirm that our optimizations are not only asymptotic improvements but also observable in practice. In particular, we see a 5x efficiency improvement when updating our data structures, even when the number of updates is less than 200. Moreover, the data structures significantly improve query runtimes over time: queries run about 10^3x faster than the baseline after 20 updates.
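The abstract attributes the logarithmic error to a hierarchical tree keyed by update time. The sketch below is not Longshot's algorithm. It illustrates the same idea with the classic binary-tree mechanism for differentially private continual counting, run in the clear with no MPC. The class name `BinaryTreeCounter`, its parameters, and the assumption that each update changes the count by at most 1 are all hypothetical choices made for illustration.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class BinaryTreeCounter:
    """Noisy running count over at most `horizon` updates (illustrative sketch).

    Assumes each update value has magnitude at most 1, so one update
    affects at most one tree node per level.
    """

    def __init__(self, horizon, epsilon, seed=None):
        self.horizon = horizon
        self.levels = horizon.bit_length()      # about log2(horizon) levels
        self.scale = self.levels / epsilon      # budget split across levels
        self.rng = random.Random(seed)
        self.t = 0
        self.exact = [0] * self.levels          # exact partial sums (never released)
        self.noisy = [0.0] * self.levels        # noisy partial sums (released)

    def update(self, value):
        if self.t >= self.horizon:
            raise ValueError("horizon exceeded")
        self.t += 1
        i = (self.t & -self.t).bit_length() - 1  # index of lowest set bit of t
        # Merge all lower-level nodes plus the new value into the level-i node.
        self.exact[i] = value + sum(self.exact[:i])
        for j in range(i):
            self.exact[j] = 0
            self.noisy[j] = 0.0
        self.noisy[i] = self.exact[i] + laplace(self.scale, self.rng)
        return self.query()

    def nodes_used(self):
        """Number of noisy nodes summed by the current query."""
        return bin(self.t).count("1")

    def query(self):
        # The set bits of t identify the nodes that cover updates 1..t.
        return sum(self.noisy[j] for j in range(self.levels) if (self.t >> j) & 1)
```

After 13 updates (binary 1101), a query sums only three noisy nodes rather than 13 separately noised values. This is why the error of such hierarchical schemes grows polylogarithmically, not linearly, in the number of updates.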


    Published In

    Proceedings of the VLDB Endowment, Volume 16, Issue 8
    April 2023
    257 pages
    ISSN: 2150-8097
    Publisher

    VLDB Endowment
