Frequently Asked Questions: Technology
RTBkit is a software framework that developers can use to build real-time bidders. The first step is to download and explore the code, then design your bidder around the RTBkit core and plugin architecture. If you need help, don't hesitate to join our Google Group or find a member of the RTBkit ecosystem to help you.
RTBkit is written in high-performance C++11 and runs on Linux (Ubuntu 12.04).
RTBkit is released under the Apache License, v2.0.
RTBkit is a collection of components meant to run behind a firewall which permits connections only to authorized users and exchanges. It is not intended to be exposed to the open internet. RTBkit is open-source software, which encourages open collaboration, especially around issues such as security.
The RTBkit core is horizontally scalable and was designed to support 20,000 queries per second per high-end commodity server.
RTBkit does not include a user interface. The RTBkit core is a collection of software components which expose HTTP and ZeroMQ APIs, which can be used as a back-end for an external user interface.
RTBkit is ad-server agnostic and is designed to work with any ad-server. RTBkit bidding agents specify which ad-server tags to bid with via a configuration message to the RTBkit core.
RTBkit is an exchange-agnostic framework and is meant to be compatible with any ad exchange. The RTBkit Core communicates with exchanges via Exchange Connector plugins. A connector for The Rubicon Project and for Gum Gum are already available, and connectors for AppNexus, Nexage, The Doubleclick Ad Exchange and others will be released in the coming weeks and months. Community contributions of Exchange Connector plugins are welcome!
We're maintaining a list of connectors available and being developed here.
RTBkit is an exchange-agnostic framework and is meant to be compatible with any ad exchange. The RTBkit Core communicates with exchanges via Exchange Connector plugins and these plugins depend on bid-request parsers. RTBkit's internal representation of a bid request is similar to OpenRTB's representation, with full round-trip serialization as a goal. This means that OpenRTB-formatted bid requests are easy to parse, although exchange-specific connectors or parser extensions may be required, depending on the specifics of each exchange's use of OpenRTB.
RTBkit is an exchange-agnostic framework and is meant to be compatible with any ad exchange, including video and mobile exchanges. The RTBkit Core communicates with exchanges via Exchange Connector plugins, and it is easy to write such plugins for OpenRTB-compatible exchanges. Connectors for major exchanges, including video and mobile exchanges, are expected in the coming weeks and months. Community contributions of Exchange Connector plugins are welcome!
The goal on the roadmap for RTBkit 2.0 is to open up Augmenters, Augmenter Data integration, logging and analytics plugins to be available to the RTBkit core through HTTP interfaces. The BidderInterface that is in the current master branch is an early taste of this functionality.
The core of RTBkit is written in C++ and many of the components (e.g. exchange connectors) can only be written in C++. Therefore, it is highly recommended that developers working on RTBkit have C++ experience.
Running an RTB system is not trivial.
You need to consider the following areas:
- Cloud vs. buying and hosting your own hardware (i.e. lease vs. buy)
- Hosting - in the cloud or dedicated data center
- 24-hour logging, monitoring and support
- Supporting a high-volume distributed system, running on commodity hardware and using basic protocols, connected to other partners on the public Internet. You want ops people with this kind of experience.
- Don't overspend on handling more bid requests than you need. Think about your planned spend, and the number of impressions you want to buy. A good rule of thumb is that you will win about 1 out of 100 bid requests you see incoming, so if you assume a CPM typical for your campaigns you can calculate a rough number for the expected volume you will need to support.
- From that you will want to prototype your system architecture and load test to understand how many hosts you project you will need to run.
- From that you can calculate a rough monthly operating cost for hosting.
- You should target 5% of media spend as the limit of what you spend on hosting.
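The sizing arithmetic in the list above can be sketched in a few lines of Python. This is only a rough illustration: the function name, the parameter names and the 30-day month are assumptions made for the example, while the 1-in-100 win rate, the 20,000 queries per second per server design target and the 5% hosting limit are the figures quoted in this FAQ. It computes an average rate only, so real traffic peaks still have to be covered by prototyping and load testing.

```python
import math

def plan_capacity(monthly_media_spend, cpm, win_rate=0.01,
                  qps_per_host=20_000, hosting_share=0.05,
                  seconds_per_month=30 * 24 * 3600):
    """Rough capacity estimate; all names here are illustrative."""
    # CPM is the price per 1000 impressions.
    impressions = monthly_media_spend / cpm * 1000
    # Rule of thumb: roughly 1 win per 100 bid requests seen.
    bid_requests = impressions / win_rate
    # Average rate only; peaks will be higher than this.
    avg_qps = bid_requests / seconds_per_month
    hosts = max(1, math.ceil(avg_qps / qps_per_host))
    return {
        "impressions": impressions,
        "bid_requests": bid_requests,
        "avg_qps": avg_qps,
        "hosts": hosts,
        "hosting_budget": monthly_media_spend * hosting_share,
    }

plan = plan_capacity(monthly_media_spend=100_000, cpm=2.0)
```

With a hypothetical monthly spend of 100,000 at a CPM of 2.00, this gives 50 million impressions, about 5 billion bid requests per month and an average of roughly 1,900 bid requests per second, against a hosting limit of about 5,000 per month.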
Yes, RTBkit uses Redis in the Banker and also ships with a Redis Augmenter connector. You can review the code here:
It is probably not advisable to log every incoming bid request you receive. This is a massive amount of data and the storage costs will quickly escalate, probably beyond any value you might get from having all of this data on hand. Depending on your use case and your reasons for wanting to log the data you might want to consider sampling the bid stream or logging the full bid request for each impression won, and/or click and/or conversion. You might also consider forwarding bid requests to a streaming endpoint that retains a rolling limited window of data, such as Kafka or Storm, if your goal is to feed bid requests into an analytical system looking at (and storing) data for a limited time-frame.
With all that said, bid requests can be logged via the RTBkit data logger by setting the Router option log-auctions to true.
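If you choose to sample the bid stream rather than log everything, one possible approach is to hash an identifier from each bid request and keep only the requests that fall below a threshold. The sketch below is not part of RTBkit; `should_log`, `auction_id` and `sample_rate` are hypothetical names, and the choice of SHA-256 is just one convenient stable hash.

```python
import hashlib

def should_log(auction_id: str, sample_rate: float) -> bool:
    """Keep roughly `sample_rate` of auctions, deterministically per id."""
    digest = hashlib.sha256(auction_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    value = int.from_bytes(digest[:8], "big")
    # Integer threshold avoids floating-point edge cases at rate 1.0.
    threshold = int(sample_rate * 2**64)
    return value < threshold

# Example: decide whether to log a request at a 5% sample rate.
keep = should_log("auction-12345", 0.05)
```

Because the decision depends only on the identifier, every component that sees the same auction makes the same choice, so a sampled bid request can later be joined with its win, click or conversion events.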
RTBkit should scale comfortably at least to low 100s of agents. The only issues we have heard about in production were when customers had 1000s of agents.
The recently released HttpBidderInterface and MultiBidderInterface may also help with scalability, because agents can now run outside the RTBkit core, in their own processes on separate hosts. This decoupling lets you scale agents horizontally, independently of the RTBkit core.
Also, if you have this many agents, and the reason is that you have many agents per campaign, you should perhaps think about your agent design. Often, non-RTB systems execute broad optimization strategies, such as segment-based targeting, by subdividing a campaign budget into many child campaigns, each with different targeting settings. If this is your reason for having so many agents, a different approach that puts more of the work of optimization into dynamic decision making by a smaller number of bidding agents might make more sense.
Examples of how to implement frequency capping are included in the RTBkit wiki and in the example code.
For pacing, the only commercial system we are aware of is our Datacratic Bid Optimizer, a commercial, hosted, RTB optimization service compatible with RTBkit through the HttpBidderInterface and with any third-party bidder. Bid Optimizer includes pacing as a feature.
For campaign management, RTBkit offers the Agent Configuration Service, but not a UI. The Agent Gateway (https://github.com/USMediaConsulting/agent_gateway) is a community-contributed service for managing this configuration.
There is not. But, as with frequency capping, you can implement this using an augmenter. The pattern is the same - upload the current state of your user data (timestamp of the last time they saw an impression from Campaign X), and read that per bid request in your augmenter. The augmenter would implement the business rule to set a filter per eligible campaign based on the last time the user had seen an impression from that campaign.
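The business rule described above can be sketched as a small function. This is an illustration of the logic only and does not use the RTBkit Augmenter API; `eligible_campaigns`, `last_seen` and `min_gap_seconds` are hypothetical names, and the per-user data is assumed to have already been read from your data store.

```python
import time

def eligible_campaigns(last_seen, campaigns, min_gap_seconds, now=None):
    """Return the campaigns this user may be shown again.

    last_seen maps a campaign id to the Unix timestamp of the last
    impression this user saw from that campaign.
    """
    now = time.time() if now is None else now
    return [
        campaign for campaign in campaigns
        # Never seen, or seen long enough ago: still eligible.
        if campaign not in last_seen
        or now - last_seen[campaign] >= min_gap_seconds
    ]

# Example: the user saw campaign "X" 10 minutes ago and "Y" 2 hours ago,
# and the rule requires at least 1 hour between impressions.
now = 1_000_000.0
seen = {"X": now - 600, "Y": now - 7200}
allowed = eligible_campaigns(seen, ["X", "Y", "Z"], 3600, now=now)
```

In this example "X" is filtered out, while "Y" and the never-seen "Z" remain eligible.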
There are not. The system boundary defined by RTBkit stops at providing an Augmenter base class that you can derive from and a RedisAugmenter class that you can use as is, both of which connect the RTBkit core to your Augmenter logic and data store.
That said, there are some general design issues that you should consider.
First, very fast key/value stores such as Redis or Aerospike have appropriate performance characteristics for an Augmenter, so you should think about a schema for your Augmenter data that does lookups based on a single key, for example a userId or deviceId. You want to minimize reads from the database, so a good design is often to have a long record of attributes associated with each key, for example, all the attributes you have available for a userId or deviceId. This way you can fetch all the data for the augmenter operation with one read against a key.
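As a rough sketch of the single-key design, the example below stores all attributes for a user as one JSON record under one key. The `InMemoryStore` class is a stand-in written for this example, not a Redis or Aerospike client, and the `user:` key prefix and attribute names are hypothetical.

```python
import json

class InMemoryStore:
    """Stand-in for a key/value store; counts reads for illustration."""
    def __init__(self):
        self._data = {}
        self.reads = 0

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        self.reads += 1
        return self._data.get(key)

def save_user(store, user_id, attributes):
    # One record per user: every attribute lives under the same key.
    store.set(f"user:{user_id}", json.dumps(attributes))

def load_user(store, user_id):
    # A single read returns everything the augmenter needs.
    raw = store.get(f"user:{user_id}")
    return json.loads(raw) if raw is not None else {}

store = InMemoryStore()
save_user(store, "abc123", {"segments": ["auto-intender"], "last_seen": {"X": 1700000000}})
profile = load_user(store, "abc123")
```

The trade-off is that updating one attribute means rewriting the whole record, which is usually acceptable when reads on the bidding path far outnumber writes.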
Second, you should think about how fresh your data needs to be. Data integrations are generally operationally complex and costly, and you should load your data only as often as your use cases require. If you are retargeting users very near time of purchase ("bottom of the funnel"), you might need near real-time data updates. If you are putting users into a segment targeting users considering a major purchase, for a 30-day campaign, you may only need to upload data daily.
This can cause significant performance issues in RTBkit core, depending on how many bidding agents you are running. bidProbability is a factor for the bid requests that pass all filtering and reach your agents. But if all agents bid, the router needs to run an internal auction over all of the responses, which causes additional load.
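To get a feel for the effect, the back-of-the-envelope sketch below estimates how many agent responses the router would have to compare per bid request. It assumes, as a simplification, that bidProbability acts as the fraction of filtered bid requests forwarded to each agent and that all agents share the same settings. The function and its parameter names are hypothetical and are not RTBkit configuration fields.

```python
def expected_bids_per_auction(num_agents, filter_pass_rate,
                              bid_probability, agent_bid_rate=1.0):
    """Expected number of responses entering the internal auction."""
    # Each agent sees a request only if it passes that agent's filters
    # and then survives the bidProbability draw; it then bids with
    # probability agent_bid_rate.
    return num_agents * filter_pass_rate * bid_probability * agent_bid_rate

# 200 agents, half of which match a given request on average.
everyone_bids = expected_bids_per_auction(200, 0.5, 1.0)
throttled = expected_bids_per_auction(200, 0.5, 0.1)
```

Under these assumptions, lowering the probability from 1.0 to 0.1 reduces the expected auction size from 100 responses to 10.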
RTBkit itself typically adds 0.5 to 2 ms of latency to a no-bid request for bid request parsing, filtering, the internal auction and responding to the exchange. Augmentation will add another 2 to 7 ms, and bidding agents will add extra time. Note that RTBkit attempts to respond to exchanges within the time guidelines in the bid request, irrespective of the behaviour of augmenters and bidding agents.
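Using the ranges quoted above, you can estimate how much time is left for your bidding agents within an exchange's response deadline. The sketch below is illustrative only: the function name and the 100 ms deadline are assumptions, and network round-trip time between the exchange and your hosts is not included.

```python
def remaining_agent_budget_ms(exchange_timeout_ms,
                              core_ms=(0.5, 2.0),
                              augment_ms=(2.0, 7.0)):
    """Return (worst_case, best_case) milliseconds left for agents."""
    # Worst case: core and augmentation both take their upper bound.
    worst = exchange_timeout_ms - core_ms[1] - augment_ms[1]
    # Best case: both take their lower bound.
    best = exchange_timeout_ms - core_ms[0] - augment_ms[0]
    return worst, best

# Hypothetical exchange deadline of 100 ms.
worst, best = remaining_agent_budget_ms(100.0)
```

For a 100 ms deadline this leaves between 91 and 97.5 ms before network time is subtracted, so in practice the agent budget is smaller than either figure.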