- Taboola Blog
- Engineering
As a content discovery product, we need to be able to pace the campaign through its life on real time – spending the budget entirely without overspending. The team I am leading is responsible for serving Taboola’s video content. Our main goal is to enable growth of our business. Owning the entire serving process of the video content can be crucial to this end. Depending on 3rd parties serving systems with the core process would leave us vulnerable to rising prices, compromised features, serving latency, reduced performance and so on. In order to serve the video on our own we had to come up with a way to pace the amount of times we want to display the video and prevent instances in which the entire budget of the video would drain in a few seconds – common scenario in Taboola’s scale when the video is not targeted aggressively. We […]
In the following article, I describe how we came up with a way to improve the chances that our SDK library gets smoothly integrated in our customers’ Applications and reduce issues when going to production. The main idea is to take a number of significant clients’ applications and replace your existing SDK code with a new code, allowing you to see how the apps perform before you release a new SDK version. Why releasing a reliable SDK is so important Developing an SDK for mobile apps is very different from developing a standalone app. You can think of an SDK as a guest in someone else’s house. You need to behave, you can’t put your legs on the table or wipe your hands on the sofa (well, in most countries you can’t). So what I mean is that you can’t interfere with the app’s normal behavior, break some flows or […]
On our day to day lives, professional relationships matter. Theoretically, how QA should handle themselves with developers is very obvious. Or is it? Well, it’s not. Reporting to developers about an issue and leave it like that is not a good enough approach. It’s way too basic and distant. Professional relationship should include talking to the developer about the issue. Sometimes it requires further explanations. Other times, it will require helping them reproduce it. It will also have to involve a good set of interpersonal communication skills. “Us” vs. “Them” From the early days of my career, I never stopped hearing about the “Us” (QA) vs “Them” (developers) perception. I never joined those calls. Not because I feared to speak up my voice, but rather because I could never relate to it, even to this day. I think this perception is useless and has nothing to do with teamwork. The […]
About 8 months ago my team and I were facing the challenge of building our first Deep Learning infrastructure. One of my team members (a brilliant data scientist) was working on a prototype for our first deep model. The time arrived to move forward to production. I was honored to lead this effort. Our achievements: we built an infrastructure that ranks over 600K items/sec, our deep models have beaten the previous models by a large margin. This pioneer project has led the way for the subsequent Deep Learning projects at Taboola. So the prototype was ready, and I was wondering: how to go from a messy script to a production ready framework? In other words, if you are into establishing a deep model pipeline this post is for you. This blog post is focused on the training infrastructure, without the inference infrastructure. Prerequisites Assume you have basic knowledge in: Python […]
Delivering good product to live environment requires big effort from R&D. Under the software development life cycle, we can find 6 basic phases: Understanding the requirements, design, coding, testing, deployment (incl. A/B test, if necessary) and maintenance. But how can we measure product quality? By its stability? Scalability? Easy to maintain? Bug free code? There are probably many definitions for what is a good product, but in my opinion, the two foundation stones are product behavior & functionality as defined (be aligned with the product manager’s requirements), and zero critical bugs. The product can serve many goals, but if it doesn’t achieve the main one, it might not have a reason to exist. Naturally, customers are always expecting high quality from the product, so before releasing it to production QA should make sure that indeed critical bugs don’t exist. In order to respect these two, both R&D and QA should […]
Writing features as added chunks into an ever growing one bulk of code is unorganized and messy. Overtime, the tasks of testing new behaviors becomes harder and harder. Why is that? Chunky Code is hard to Navigate When working with developers on a new feature, we have to identify what we need to test. As the code grows, the challenge of finding the parts that were modified and should be tested is increasing. If a feature becomes problematic or irrelevant, reverting it becomes more difficult, since we need to go back to every line of code we changed. This way we endanger production environment and are affecting our end-users. Last but not least, it is irritating to impossible to manage when dealing with “legacy” code, that needs to be reverse engineered to find how to control. Coming to compare a buggy behavior with its intended fix or testing a new […]
General This post describes how to use Android ContentProvider to allow automatic system initialisation for your library, therefore help make your library easier to integrate and control its flow. While this article mostly demonstrates one use case, you can use the idea in other cases as well. What is it all about? Always Strive To Simplify Integration Let’s assume you are writing code for a software library that would be used by an Android application. Most common flows require the app using your library to manually call the initialisation of your library and usually, provide it with their own Context. This will require your client to write a code along these lines: This article suggests using a Content Provider to allow: Completely autonomous initialisation, liberating you from having to ask the client for init at all. Avoiding the necessity of asking your client for their Application Context. […]
What is the connection between kernel system calls and database performance, and how can we improve performance by reducing the number of system calls? Performance of any database system depends on four main system resources: CPU Memory Disk I/O Network Performance will increase while tuning or scaling each resource – this blog will cover the CPU resource.It’s important to note that whenever we release a bottleneck in the system, we might just encounter another one. For example, when improving CPU performance the database load shifts to IO, so unless our storage is capable of delivering more IOPS, we might not actually see the improvement we hoped for. But don’t be discouraged, performance tuning is sometimes a game of whack-a-mole… We all know that the more processing power available for your server, the better the overall system is likely to perform. Especially when the CPU spends the majority of its time […]
For the past year, my team and I have been working on a personalized user experience in the Taboola feed. We used Multi-Task Learning (MTL) to predict multiple Key Performance Indicators (KPIs) on the same set of input features, and implemented a Deep Learning (DL) model in TensorFlow to do so. Back when we started, MTL seemed way more complicated to us than it does now, so I wanted to share some of the lessons learned. There are already quite a few posts about implementing MTL in a DL model (1, 2, 3). In this post I will share some specific points to consider when implementing MTL in a Neural Network (NN). I will also present simple TensorFlow solutions to overcome the discussed issues. Sharing is caring We wanted to start with the basic approach of hard parameter sharing. Hard sharing means we have a shared subnet, followed by […]
If you happen to write code for a living, there’s a pretty good chance you’ve found yourself explaining another interviewer again how to reverse a linked list or how to tell if a string contains only digits. Usually, the necessity of this B.Sc. material ends once a contract is signed, as most of these low-level questions are dealt with for us under-the-hood of modern coding languages and external libraries. Still, not long ago we found ourselves facing one such question in real-life: find an efficient algorithm for real-time weighted sampling. As naive as it might seem at first sight, we’d like to show you why it’s actually not – and then walk you through how we solved it, just in case you’ll run into something similar. So buckle up, we’ve got some statistics and integrals coming up next! Why We Need Weighted Sampling in Production? At Taboola, our core business is to personalize […]