Gennady Pekhimenko

Toronto, Ontario, Canada

About

I am generally interested in the areas of systems and machine learning. My major research…


Experience & Education

  • CentML

Publications

  • A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Efficient Data Compression

    ACM

    Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and heterogeneous application requirements create imbalances in utilization of resources in the cores. For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive.

    This paper introduces the Core-Assisted Bottleneck Acceleration (CABA) framework that employs idle on-chip resources to alleviate different bottlenecks in GPU execution. CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency.

    CABA enables the use of idle computational units and pipelines to alleviate the memory bandwidth bottleneck, e.g., by using assist warps to perform data compression to transfer less data from memory. Conversely, the same framework can be employed to handle cases where the GPU is bottlenecked by the available computational units, in which case the memory pipelines are idle and can be used by CABA to speed up computation, e.g., by performing memoization using assist warps.

    We provide a comprehensive design and evaluation of CABA to perform effective and flexible data compression in the GPU memory hierarchy to alleviate the memory bandwidth bottleneck. Our extensive evaluations show that CABA, when used to implement data compression, provides an average performance improvement of 41.7% (as high as 2.6X) across a variety of memory-bandwidth-sensitive GPGPU applications.

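    The core idea, spending otherwise-idle compute to shrink data before it crosses the memory bus, can be illustrated with a small software model. The sketch below packs a cache line of 32-bit integers as a base value plus byte-sized deltas when the values are close together; it is a toy illustration of the compression idea only, not the assist-warp hardware or the specific compression algorithms evaluated in the paper, and the line size is an assumed placeholder.

        # Toy model: shrink a cache line before it crosses the memory bus so that
        # fewer bytes have to move. Illustrative base+delta-style packing of one
        # line of 32-bit integers; not the CABA framework itself.
        import struct

        LINE_WORDS = 32  # assumed 128-byte line of 32-bit values

        def compress_line(words):
            """Return (is_compressed, payload_bytes) for one cache line."""
            base = words[0]
            deltas = [w - base for w in words]
            if all(-128 <= d <= 127 for d in deltas):
                # Base (4 bytes) + one signed byte per word: 36 bytes instead of 128.
                payload = struct.pack("<i", base) + bytes(d & 0xFF for d in deltas)
                return True, payload
            return False, struct.pack(f"<{LINE_WORDS}i", *words)  # incompressible line

        def decompress_line(is_compressed, payload):
            if not is_compressed:
                return list(struct.unpack(f"<{LINE_WORDS}i", payload))
            base = struct.unpack("<i", payload[:4])[0]
            return [base + (b - 256 if b > 127 else b) for b in payload[4:]]

        if __name__ == "__main__":
            line = [1000 + i for i in range(LINE_WORDS)]      # narrow value range
            ok, blob = compress_line(line)
            assert decompress_line(ok, blob) == line
            print(f"compressed={ok}, bytes transferred: {len(blob)} vs {LINE_WORDS * 4}")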
  • Page Overlays: An Enhanced Virtual Memory Framework to Enable Fine-grained Memory Management

    ACM

    Many recent works propose mechanisms demonstrating the potential advantages of managing memory at a fine (e.g., cache line) granularity---e.g., fine-grained deduplication and fine-grained memory protection. Unfortunately, existing virtual memory systems track memory at a larger granularity (e.g., 4 KB pages), inhibiting efficient implementation of such techniques. Simply reducing the page size results in an unacceptable increase in page table overhead and TLB pressure.

    We propose a new virtual memory framework that enables efficient implementation of a variety of fine-grained memory management techniques. In our framework, each virtual page can be mapped to a structure called a page overlay, in addition to a regular physical page. An overlay contains a subset of cache lines from the virtual page. Cache lines that are present in the overlay are accessed from there and all other cache lines are accessed from the regular physical page. Our page-overlay framework enables cache-line-granularity memory management without significantly altering the existing virtual memory framework or introducing high overheads.

    We show that our framework can enable simple and efficient implementations of seven memory management techniques, each of which has a wide variety of applications. We quantitatively evaluate the potential benefits of two of these techniques: overlay-on-write and sparse-data-structure computation. Our evaluations show that overlay-on-write, when applied to fork, can improve performance by 15% and reduce memory capacity requirements by 53% on average compared to traditional copy-on-write. For sparse data computation, our framework can outperform a state-of-the-art software-based sparse representation on a number of real-world sparse matrices. Our framework is general, powerful, and effective in enabling fine-grained memory management at low cost.

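    The lookup rule, check the overlay for a cache line and fall back to the regular physical page otherwise, is easy to model in software. The sketch below is a minimal, purely illustrative model of that rule and of the overlay-on-write use case; the class, names, and sizes are assumptions for illustration, not the paper's hardware design.

        # A virtual page backed by a shared physical page plus an optional
        # "overlay" that holds only the cache lines that differ from it.
        PAGE_SIZE = 4096
        LINE_SIZE = 64
        LINES_PER_PAGE = PAGE_SIZE // LINE_SIZE

        class OverlayPage:
            def __init__(self, physical_page: bytearray):
                self.physical = physical_page      # shared, read-only base copy
                self.overlay = {}                  # line index -> private 64-byte line

            def read_line(self, line_idx: int) -> bytes:
                if line_idx in self.overlay:       # line was privatized
                    return bytes(self.overlay[line_idx])
                off = line_idx * LINE_SIZE
                return bytes(self.physical[off:off + LINE_SIZE])

            def write_line(self, line_idx: int, data: bytes) -> None:
                # Overlay-on-write: privatize only the written line instead of
                # copying the whole 4 KB page as traditional copy-on-write would.
                assert len(data) == LINE_SIZE
                self.overlay[line_idx] = bytearray(data)

        if __name__ == "__main__":
            shared = bytearray(PAGE_SIZE)          # e.g. a page shared after fork
            child_view = OverlayPage(shared)
            child_view.write_line(3, b"\xab" * LINE_SIZE)
            assert child_view.read_line(3) == b"\xab" * LINE_SIZE
            assert child_view.read_line(4) == b"\x00" * LINE_SIZE
            print(f"privatized {len(child_view.overlay)} of {LINES_PER_PAGE} lines")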
  • Toggle-Aware Compression for GPUs

    Computer Architecture Letters (Jan-Jun 2015)

  • PocketTrend: Timely Identification and Delivery of Trending Search Content to Mobile Users

    ACM

    Trending search topics cause unpredictable query load spikes that hurt the end-user search experience, particularly the mobile one, by introducing longer delays. To understand how trending search topics are formed and evolve over time, we analyze 21 million queries submitted during periods where popular events caused search query volume spikes. Based on our findings, we design and evaluate PocketTrend, a system that automatically detects trending topics in real time, identifies the search content associated to the topics, and then intelligently pushes this content to users in a timely manner. In that way, PocketTrend enables a client-side search engine that can instantly answer user queries related to trending events, while at the same time reducing the impact of these trends on the datacenter workload. Our results, using real mobile search logs, show that in the presence of a trending event, up to 13-17% of the overall search traffic can be eliminated from the datacenter, with as many as 19% of all users benefiting from PocketTrend.

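    To make the detection step concrete, here is a generic query-volume spike detector of the kind such a system needs: a topic is flagged when its count in the latest time window is far above its recent history. The windowing scheme and the mean-plus-three-sigma threshold are illustrative assumptions, not the detection algorithm PocketTrend actually uses.

        # Flag a topic as trending when its latest per-window query count is far
        # above its recent history. All thresholds here are illustrative.
        from collections import deque
        from statistics import mean, pstdev

        class TrendDetector:
            def __init__(self, history_windows: int = 24, sigma: float = 3.0):
                self.history = {}              # topic -> recent per-window counts
                self.history_windows = history_windows
                self.sigma = sigma

            def observe(self, topic: str, count: int) -> bool:
                """Record one window's query count; return True if it looks trending."""
                past = self.history.setdefault(topic, deque(maxlen=self.history_windows))
                trending = False
                if len(past) >= 3:
                    mu, sd = mean(past), pstdev(past)
                    trending = count > mu + self.sigma * max(sd, 1.0)
                past.append(count)
                return trending

        if __name__ == "__main__":
            det = TrendDetector()
            for c in [10, 12, 9, 11, 10]:
                det.observe("world cup", c)
            print(det.observe("world cup", 80))   # sudden spike -> True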
  • Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case

    21st International Symposium on High-Performance Computer Architecture (HPCA)

    In current systems, memory accesses to a DRAM chip must obey a set of minimum latency restrictions specified in the DRAM standard. Such timing parameters exist to guarantee reliable operation. When deciding the timing parameters, DRAM manufacturers incorporate a very large margin as a provision against two worst-case scenarios. First, due to process variation, some outlier chips are much slower than others and cannot be operated as fast. Second, chips become slower at higher temperatures, and all chips need to operate reliably at the highest supported (i.e., worst-case) DRAM temperature (85°C). In this paper, we show that typical DRAM chips operating at typical temperatures (e.g., 55°C) are capable of providing a much smaller access latency, but are nevertheless forced to operate at the largest latency of the worst-case.

    Our goal in this paper is to exploit the extra margin that is built into the DRAM timing parameters to improve performance. Using an FPGA-based testing platform, we first characterize the extra margin for 115 DRAM modules from three major manufacturers. Our results demonstrate that it is possible to reduce four of the most critical timing parameters by a minimum/maximum of 17.3%/54.8% at 55°C without sacrificing correctness. Based on this characterization, we propose Adaptive-Latency DRAM (AL-DRAM), a mechanism that adaptively reduces the timing parameters for DRAM modules based on the current operating condition. AL-DRAM does not require any changes to the DRAM chip or its interface.

    We evaluate AL-DRAM on a real system that allows us to reconfigure the timing parameters at runtime. We show that AL-DRAM improves the performance of memory-intensive workloads by an average of 14% without introducing any errors. We discuss and show why AL-DRAM does not compromise reliability. We conclude that dynamically optimizing the DRAM timing parameters can reliably improve system performance.

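    The mechanism boils down to programming tighter timing parameters when the module and its operating temperature leave margin, and falling back to nominal values otherwise. The sketch below captures that decision; the parameter names are standard DDR timings, but every number (the nominal values, the flat 20% reduction, the 55°C cutoff) is an illustrative placeholder rather than a characterization result from the paper.

        # Choose DRAM timing parameters for the current operating point: tightened
        # values when profiled margin exists, nominal (worst-case) values otherwise.
        NOMINAL_TIMINGS_NS = {"tRCD": 13.75, "tRP": 13.75, "tRAS": 35.0, "tWR": 15.0}

        def adaptive_timings(temperature_c: float,
                             safe_reduction: float = 0.20,    # assumed profiled margin
                             max_safe_temp_c: float = 55.0) -> dict:
            """Return the timing parameters to program for this operating point."""
            if temperature_c > max_safe_temp_c:
                # Near the worst-case temperature: no margin to exploit.
                return dict(NOMINAL_TIMINGS_NS)
            # Cooler than worst case: apply the reduction established by offline
            # profiling of this particular module (here a single flat percentage).
            return {name: round(t * (1.0 - safe_reduction), 2)
                    for name, t in NOMINAL_TIMINGS_NS.items()}

        if __name__ == "__main__":
            print(adaptive_timings(45.0))   # tightened timings
            print(adaptive_timings(85.0))   # worst-case temperature -> nominal timings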
  • Exploiting Compressed Block Size as an Indicator of Future Reuse

    21st International Symposium on High-Performance Computer Architecture (HPCA)

    We introduce a set of new Compression-Aware Management Policies (CAMP) for on-chip caches that employ data compression. Our management policies are based on two key ideas. First, we show that it is possible to build a more efficient management policy for compressed caches if the compressed block size is directly used in calculating the value (importance) of a block to the cache. This leads to Minimal-Value Eviction (MVE), a policy that evicts the cache blocks with the least value, based on both the size and the expected future reuse. Second, we show that, in some cases, compressed block size can be used as an efficient indicator of the future reuse of a cache block. We use this idea to build a new insertion policy called Size-based Insertion Policy (SIP) that dynamically prioritizes cache blocks using their compressed size as an indicator.

    We compare CAMP (and its global variant G-CAMP) to prior on-chip cache management policies (both size-oblivious and size-aware) and find that our mechanisms are more effective in using compressed block size as an extra dimension in cache management decisions. Our results show that the proposed management policies (i) decrease off-chip bandwidth consumption (by 8.7% in single-core), (ii) decrease memory subsystem energy consumption (by 7.2% in single-core) for memory-intensive workloads compared to the best prior mechanism, and (iii) improve performance (by 4.9%/9.0%/10.2% on average in single-/two-/four-core workload evaluations and up to 20.1%). CAMP is effective for a variety of compression algorithms and different cache designs with local and global replacement strategies.

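    The Minimal-Value Eviction idea can be sketched in a few lines: when a set is full, evict the block whose expected benefit per byte of cache space is lowest. The value function below (predicted reuse divided by compressed size) is an illustrative stand-in for the policy's actual formulation, and the reuse predictions are assumed inputs.

        # Minimal-Value Eviction sketch: evict the resident block with the lowest
        # value, where value weighs expected reuse against compressed footprint.
        from dataclasses import dataclass

        @dataclass
        class CacheBlock:
            tag: int
            compressed_size: int     # bytes the block occupies in the data array
            predicted_reuse: float   # assumed reuse-predictor output; higher = hotter

        def block_value(b: CacheBlock) -> float:
            # Small, hot blocks are the most valuable; large, cold blocks the least.
            return b.predicted_reuse / b.compressed_size

        def choose_victim(blocks: list) -> CacheBlock:
            """Pick the block with the minimal value as the eviction victim."""
            return min(blocks, key=block_value)

        if __name__ == "__main__":
            resident = [
                CacheBlock(tag=0x1A, compressed_size=8,  predicted_reuse=0.9),
                CacheBlock(tag=0x2B, compressed_size=64, predicted_reuse=0.4),
                CacheBlock(tag=0x3C, compressed_size=16, predicted_reuse=0.05),
            ]
            print(hex(choose_victim(resident).tag))   # -> 0x3c (lowest reuse per byte)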
  • Shifted Hamming Distance: A Fast and Accurate SIMD-Friendly Filter to Accelerate Alignment Verification in Read Mapping

    Oxford Bioinformatics

    Motivation: Calculating the edit-distance (i.e., minimum number of insertions, deletions, and substitutions) between short DNA sequences is the primary task performed by seed-and-extend based mappers, which compare billions of sequences.

    In practice, only sequence pairs with a small edit-distance provide useful scientific data. However, the majority of sequence pairs analyzed by seed-and-extend based mappers differ by significantly more errors than what is typically allowed. Such error-abundant sequence pairs needlessly waste resources and severely hinder the performance of read mappers. Therefore, it is crucial to develop a fast and accurate filter that can rapidly and efficiently detect error-abundant string pairs and remove them from consideration before more computationally expensive methods are used.

    Results: We present a simple and efficient algorithm, Shifted Hamming Distance (SHD), which accelerates the alignment verification procedure in read mapping, by quickly filtering out error-abundant sequence pairs using bit-parallel and SIMD-parallel operations. SHD only filters string pairs that contain more errors than a user-defined threshold, making it fully comprehensive. It also maintains high accuracy with moderate error threshold (up to 5% of the string length) while achieving a 3-fold speedup over the best previous algorithm (Gene Myers's bit-vector algorithm). SHD is compatible with all mappers that perform sequence alignment for verification.

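    A stripped-down illustration of the filtering idea: build one mismatch bit-vector per shift of the read against the reference, AND the vectors together, and reject the pair only if more mismatches survive than the error budget allows. The sketch uses plain Python integers as bit-vectors instead of SIMD registers and omits the amendment/refinement steps of the real SHD filter, so it is a conceptual model only.

        def shifted_mask(read: str, ref: str, k: int) -> int:
            """Mismatch bit-vector comparing read[i] with ref[i + k]; bit i <-> read position i.
            Positions with no counterpart in ref are left 0, i.e. treated as matches."""
            mask = 0
            n = len(read)
            for i in range(n):
                j = i + k
                if 0 <= j < n and read[i] != ref[j]:
                    mask |= 1 << i
            return mask

        def shd_passes(read: str, ref: str, max_errors: int) -> bool:
            """Filter check: True keeps the pair for full alignment verification."""
            combined = (1 << len(read)) - 1              # start with every position a mismatch
            for k in range(-max_errors, max_errors + 1):
                combined &= shifted_mask(read, ref, k)   # any shift that matches clears the bit
            return bin(combined).count("1") <= max_errors

        if __name__ == "__main__":
            print(shd_passes("ACGTACGTAC", "ACGAACGTAC", max_errors=2))  # near match -> True
            print(shd_passes("ACGTACGTAC", "TTTTTTTTTT", max_errors=2))  # junk pair  -> False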
  • Rollback-Free Value Prediction with Approximate Loads

    The 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT'14)

    This paper demonstrates how to utilize the inherent error resilience of a wide range of applications to mitigate the memory wall—the discrepancy between core and memory speed. We define a new microarchitecturally-triggered approximation technique called rollback-free value prediction. This technique predicts the value of safe-to-approximate loads when they miss in the cache without tracking mispredictions or requiring costly recovery from misspeculations. This technique mitigates the memory wall by allowing the core to continue computation without stalling for long-latency memory accesses. Our detailed study of the quality trade-offs shows that with a modern out-of-order processor, average 8% (up to 19%) performance improvement is possible with 0.8% (up to 1.8%) average quality loss on an approximable subset of SPEC CPU 2000/2006.

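    The control flow around the technique is simple to sketch: on a cache miss to a load that has been marked safe-to-approximate, hand the core a predicted value immediately and never roll back, even if the prediction later turns out wrong. The last-value predictor and the toy cache below are illustrative choices, not the predictor design from the paper.

        # Rollback-free flow: misses on approximable loads return a prediction
        # immediately instead of stalling, and no misspeculation recovery occurs.
        class ApproxLoadUnit:
            def __init__(self, cache: dict):
                self.cache = cache                  # address -> value (toy cache)
                self.last_value = {}                # per-load-PC last observed value

            def load(self, pc: int, addr: int, approximable: bool):
                if addr in self.cache:              # hit: normal, exact path
                    value = self.cache[addr]
                    self.last_value[pc] = value
                    return value, "exact"
                if approximable:
                    # Miss on a safe-to-approximate load: predict and keep executing.
                    # The real value is fetched in the background; nothing is rolled back.
                    return self.last_value.get(pc, 0), "predicted"
                return self.stall_and_fetch(addr), "exact"

            def stall_and_fetch(self, addr: int) -> int:
                value = 0                           # stand-in for a long-latency access
                self.cache[addr] = value
                return value

        if __name__ == "__main__":
            unit = ApproxLoadUnit(cache={0x100: 42})
            print(unit.load(pc=0x7F0, addr=0x100, approximable=True))  # (42, 'exact')
            print(unit.load(pc=0x7F0, addr=0x200, approximable=True))  # (42, 'predicted')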
  • Linearly Compressed Pages: A Low-Complexity, Low-Latency Main Memory Compression Framework

    MICRO 2013

  • Software Automatic Tuning: From Concepts to State-of-the-Art Results

    Springer

    Chapter 19: Gennady Pekhimenko and Angela Demke Brown, "Efficient Program Compilation Through Machine Learning Techniques"


Patents

  • Trend response management

    Issued US 14/175,934

  • Managing speculative assist threads

    Issued US 12/905,202

    An illustrative embodiment provides a computer-implemented process for managing speculative assist threads for data pre-fetching. The process analyzes collected source code and cache profiling information to identify a code region containing a delinquent load instruction and generates an assist thread, including a value for a local version number, at a program entry point within the identified code region. Upon activation of the assist thread, its local version number is compared to the global unique version number of the main thread for the identified code region, and the iteration distance of the assist thread relative to the main thread is compared to a predefined value. The assist thread is executed when its local version number matches the global unique version number of the main thread and its iteration distance relative to the main thread is within a predefined range of values.
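
    Read literally, the activation check in the claim reduces to two conditions, which the small helper below spells out; the parameter names and the distance bounds are illustrative, not values from the patent.

        def should_run_assist_thread(local_version: int,
                                     global_version: int,
                                     iteration_distance: int,
                                     min_distance: int = 4,
                                     max_distance: int = 64) -> bool:
            """Decide whether a speculative data-prefetching assist thread may execute."""
            versions_match = (local_version == global_version)   # same code-region epoch
            distance_ok = (min_distance <= iteration_distance <= max_distance)
            return versions_match and distance_ok

        if __name__ == "__main__":
            print(should_run_assist_thread(local_version=7, global_version=7,
                                           iteration_distance=16))   # True
            print(should_run_assist_thread(local_version=6, global_version=7,
                                           iteration_distance=16))   # stale version -> False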

Courses

  • Graduate Algorithms

    15-750

  • Graduate Computer Architecture

    15-740

  • Graduate Computer Networks

    15-744

  • Graduate Machine Learning

    15-781

  • Optimizing Compilers for Modern Architecture

    15-745

  • Parallel Computer Architecture

    18-742

  • Program Analysis

    15-819

  • Semantics of Programming Languages

    15-812

Honors & Awards

  • NVIDIA Graduate Fellowship, 2015-2016

    NVIDIA

  • Qualcomm Innovation Fellowship (QInF'13, Honorable Mention)

    Qualcomm

  • Microsoft Research Fellowship, 2013-2015

    Microsoft Research

  • Alexander Graham Bell Canada Graduate Scholarship, 2012-2014

    NSERC (Canada)

Languages

  • English

    Full professional proficiency

  • Russian

    Native or bilingual proficiency

  • Ukrainian

    Native or bilingual proficiency

  • German

    Elementary proficiency

Organizations

  • ACM

