
In-House DevOps Infrastructure at DT

Jul 16, 2023
By: Ori Bracha

Our DevOps team at DT works intensely to keep our production environment up and running on continuously updated, cutting-edge technology. With tens of millions of transactions processed every minute, the team is responsible for creating and maintaining a healthy, efficient environment to support that data pipeline. To learn how new technologies are introduced, in this blog we talk to Ori Bracha, DevOps Team Lead.

One of the main principles guiding how we introduce new DevOps technologies is our level of reliance on external vendors for services. Being less coupled to external vendors allows DT’s DevOps team to be more cost-efficient, so we experiment with developing our own in-house solutions whenever we get the chance. This is what led us to create a Kubernetes infrastructure from scratch.

The infrastructure we built is a one-click environment provisioned with Terraform (by HashiCorp). It includes a wide variety of tools, each backed by different resources: monitoring (Prometheus), logging, visibility (Grafana and Rancher), and secret management (HashiCorp Vault). This infrastructure gave a significant boost to both our horizontal and vertical scaling abilities. We have also created a CI pipeline for Docker image builds with Jenkins and ECR. This custom-made in-house environment covers DT’s needs while reducing spend on external cloud resources.
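To give a rough idea of what such a CI step looks like, here is a minimal sketch in Python that builds a Docker image and pushes it to ECR. The region, repository name, and tag are hypothetical placeholders, and in practice this logic would run as a Jenkins pipeline step rather than a standalone script.

```python
"""Sketch: build a Docker image and push it to ECR (hypothetical names)."""
import base64

import boto3   # AWS SDK, used to fetch temporary ECR credentials
import docker  # Docker SDK for Python

REGION = "us-east-1"               # hypothetical region
REPOSITORY = "dt/example-service"  # hypothetical ECR repository
TAG = "latest"

def build_and_push() -> None:
    # Fetch a temporary ECR login token (valid for 12 hours).
    ecr = boto3.client("ecr", region_name=REGION)
    auth = ecr.get_authorization_token()["authorizationData"][0]
    user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
    registry = auth["proxyEndpoint"].removeprefix("https://")

    # Build the image from the local Dockerfile, then push it to ECR.
    client = docker.from_env()
    client.login(username=user, password=password, registry=registry)
    client.images.build(path=".", tag=f"{registry}/{REPOSITORY}:{TAG}")
    client.images.push(f"{registry}/{REPOSITORY}", tag=TAG)

if __name__ == "__main__":
    build_and_push()
```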

We have also recently released a new Druid-based in-house solution for price-slicing visibility. At DT, we process data at huge scale: 50 million transactions per minute, with over 1.5B unique users. Our goal is to develop and maintain a resilient infrastructure to support this massive data flow. Since our product works with real-time bidding (RTB) technology, each auction must determine a winner within milliseconds, after processing a large number of bids and data points. We built a resilient data pipeline on an Apache Kafka deployment we run in-house. From there, we continuously aggregate and enrich the data to produce insights about the bid results.
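As an illustration of how a bid result might enter such a pipeline, the sketch below publishes a JSON-encoded auction event to a Kafka topic using the kafka-python client. The broker address, topic name, and event fields are hypothetical examples, not DT’s actual schema.

```python
"""Sketch: publishing an auction-result event to a Kafka topic."""
import json
import time

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="kafka.internal:9092",  # hypothetical broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    acks="all",   # wait for full replication before acknowledging
    linger_ms=5,  # small batching window to improve throughput
)

event = {
    "auction_id": "a-12345",
    "winning_bid_usd": 0.42,
    "advertiser_id": "adv-678",
    "timestamp_ms": int(time.time() * 1000),
}

# Send asynchronously; get() blocks until the broker acknowledges the write.
future = producer.send("bid-results", value=event)
future.get(timeout=10)
producer.flush()
```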

Our Console provides the UI that lets users learn from this data for themselves. They can slice and dice the data across dozens of combinations of metrics and attributes, all in an organized, comfortable manner, and they can modify and customize metrics and dimensions to produce insights in line with their needs. This service is crucial to the success of both the app developers and the advertisers who work with us, allowing them to make well-informed, data-backed decisions.

Using Druid, the data is aggregated and presented in our UI. We have been using Druid since the technology’s early days, roughly eight years ago. Building this service on Apache Druid was followed by the creation of an in-house chef-solo setup to run these aggregations on the data efficiently. This infrastructure allows our Data Team to explore our data in depth and build the metrics we make accessible to our users, and it also enhances our internal performance analysis.
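To show the kind of slice-and-dice aggregation this enables, here is a minimal sketch that sends a SQL query to Druid’s SQL endpoint using Python’s requests library. The broker URL, datasource, and column names are hypothetical and stand in for whatever schema backs the Console.

```python
"""Sketch: querying Druid for sliced aggregates over the last hour."""
import requests

DRUID_SQL_URL = "http://druid-broker.internal:8082/druid/v2/sql/"  # hypothetical broker

# Group bid results by country and ad format, aggregating volume and revenue.
query = """
SELECT
  country,
  ad_format,
  COUNT(*)             AS auctions,
  SUM(winning_bid_usd) AS revenue_usd
FROM bid_results
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY country, ad_format
ORDER BY revenue_usd DESC
LIMIT 20
"""

response = requests.post(DRUID_SQL_URL, json={"query": query}, timeout=30)
response.raise_for_status()

for row in response.json():  # Druid returns one JSON object per result row
    print(row)
```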

While we remain critical about the balance between external vendors and in-house infrastructure, for some parts of our technology we actively choose to maintain a hybrid environment. We run extensive Kubernetes environments that communicate and sync with external services, and we similarly maintain a hybrid setup for configuration management. We keep reviewing these external services as part of the broader effort to migrate most of our services into Docker and Kubernetes.
