aicoder: November 2011

Tuesday, November 15, 2011

What Software Engineers should know about Control Theory

Over the years I've noticed an interesting gap in domain knowledge among CS and software people. Other than the few co-workers who majored in Electrical Engineering, almost no one has heard of the field of 'Control Theory'.

From Wikipedia

Control theory is an interdisciplinary branch of engineering and mathematics, that deals with the behavior of dynamical systems. The desired output of a system is called the reference. When one or more output variables of a system need to follow a certain reference over time, a controller manipulates the inputs to a system to obtain the desired effect on the output of the system.

Let's imagine that you write internet web services for a living: some REST or SOAP APIs that take an input and give an output.

Your boss walks up to you one day and asks for a system that does the following:

Calls another webservice (or three) for data/inputs, then does X with them.
Meters the usage of the other webservices.
Responds within Y milliseconds with good output or a NULL output.
Supports high concurrency, i.e. does not use too many servers.
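
To make the requirements above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `call_upstream` is a stand-in that simulates a third-party call with a sleep, `usage` is a toy meter, the sum stands in for "X", and `deadline_s` plays the role of Y.

```python
import asyncio
from collections import Counter

# Hypothetical per-upstream call counter, standing in for real usage metering.
usage = Counter()

async def call_upstream(name, delay_s, value):
    """Stand-in for a third-party webservice call (simulated with a sleep)."""
    usage[name] += 1
    await asyncio.sleep(delay_s)
    return value

async def handle_request(upstreams, deadline_s):
    """Call every upstream concurrently; return combined output or None.

    `upstreams` is a list of (name, simulated_delay_s, value) tuples.
    """
    calls = [call_upstream(n, d, v) for n, d, v in upstreams]
    try:
        results = await asyncio.wait_for(asyncio.gather(*calls), timeout=deadline_s)
    except asyncio.TimeoutError:
        return None  # the "NULL output" case: the deadline was missed
    return sum(results)  # "does X with them"; here X is just a sum

fast = [("a", 0.01, 1), ("b", 0.02, 2)]
slow = [("a", 0.01, 1), ("c", 0.50, 3)]
ok = asyncio.run(handle_request(fast, deadline_s=0.2))     # all upstreams answer in time
late = asyncio.run(handle_request(slow, deadline_s=0.05))  # upstream "c" is too slow
```

Note that this sketch only enforces the deadline. It does nothing to decide how hard to push each upstream, which is where a controller comes in.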

The problem is that these other third-party webservices are not your own. What is their response time? Will they give bad data? How should your webservice react to failures of the others?

Does this sound familiar? It should to many. This is the replicated-and-shared-connector problem (MySQL, memcached), the partitioned-services problem (federated search, and large scale search engines) and the API-as-a-service problem (Mashery, etc).

There are two basic types of controls relevant here:

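Open Loop, Feed-forward: Requires a good model of the system inputs and of the response of the system.

Closed Loop, Feed-back: Measures the system output and corrects the input based on the error against the reference.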
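
The difference between the two is easy to see in a toy simulation. The sketch below is only an illustration under assumed numbers: `plant` is a made-up service whose latency grows linearly with concurrency, the open-loop controller trusts a (wrong) model of it, and the closed-loop controller is a simple integral controller with a hand-picked gain.

```python
def plant(concurrency, ms_per_request=4.0, base_ms=20.0):
    """Simulated service: latency grows linearly with concurrency (assumed model)."""
    return base_ms + ms_per_request * concurrency

def open_loop(target_ms, model_ms_per_request, model_base_ms):
    """Feed-forward: pick the input once from a model, never look at the output."""
    return (target_ms - model_base_ms) / model_ms_per_request

def closed_loop(target_ms, steps=50, gain=0.1, concurrency=1.0):
    """Feed-back: measure the output and nudge the input by the error (integral control)."""
    for _ in range(steps):
        error = target_ms - plant(concurrency)
        concurrency += gain * error
    return concurrency

target = 100.0
# The model believes each request costs 2 ms, but the simulated plant costs 4 ms.
c_open = open_loop(target, model_ms_per_request=2.0, model_base_ms=20.0)
c_closed = closed_loop(target)
latency_open = plant(c_open)      # misses the target because the model is wrong
latency_closed = plant(c_closed)  # settles near the target despite knowing no model
```

With these numbers the open-loop choice lands at 180 ms instead of 100 ms, while the feedback loop converges to the target. The gain matters: too large a value here would make the loop oscillate or diverge, which is what stability analysis is for.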

Topics in feedback and adaptive control include the following:

Linear Feedback:
Stability Analysis
Frequency response
Response time

Adaptive Schemes:
Gain Scheduling
Model Reference Adaptive Systems
Self-tuning regulators
Dual Control
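
Of the adaptive schemes listed above, gain scheduling is probably the simplest to express as code: the controller gain is looked up from a table according to the current operating point. The sketch below assumes load in requests per second as the scheduling variable. The breakpoints, gains, and function names are hypothetical values chosen for illustration, not tuned for any real system.

```python
import bisect

# Hypothetical schedule: load regimes (req/s) and one integral gain per regime.
LOAD_BREAKPOINTS = [100.0, 500.0]  # boundaries between low / medium / high load
GAINS = [0.20, 0.08, 0.02]         # gentler corrections as load increases

def scheduled_gain(load_rps):
    """Look up the controller gain for the current operating point."""
    return GAINS[bisect.bisect_right(LOAD_BREAKPOINTS, load_rps)]

def control_step(concurrency, target_ms, measured_ms, load_rps):
    """One integral-control update using the gain scheduled for this load."""
    error = target_ms - measured_ms
    return concurrency + scheduled_gain(load_rps) * error

# At low load a 20 ms overshoot cuts concurrency by 4; at high load by only 0.4.
low = control_step(10.0, target_ms=100.0, measured_ms=120.0, load_rps=50.0)
high = control_step(10.0, target_ms=100.0, measured_ms=120.0, load_rps=1000.0)
```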

Here's one survey deck from a lecture. Unfortunately for software engineers, most of the presentations of the above are in linear system form rather than an algorithmic form.

Dr. Joe L Hellerstein of Google and co-workers taught a course at the University of Washington in 2008 that was more software focused. He has also written a textbook and a few papers on the subject.

http://research.microsoft.com/en-us/um/people/liuj/cse590k2008winter/
Joseph L Hellerstein et al., "Feedback Control of Computing Systems", Wiley, 2004
Hellerstein, "Challenges in Control Engineering of Computing Systems", IBM Tech Report, 2003
Hellerstein et al., "Research challenges in control engineering of computing systems", IEEE Trans. on Network and Service Management, Volume 6, Issue 4, 2010

The course page has a collection of great links to application papers on controllers for software systems.

I'd like to see a set of 'software patterns' created for easier use by software engineers. I'll attempt to present a couple of common forms as patterns in a future blog post.

Thursday, November 10, 2011

Open RTB panel - IAB Ad Ops Summit 2011

On Monday, November 7th, I was on an IAB Ops panel on OpenRTB.

The clip shows an exchange after Steve from the IAB asked a question about how webpage inventory is described in RTB. I described an example of differentiating a simple commodity, barley.

Two of the major uses of barley in the US are animal feed and malting for making beer. Malting barley has specific requirements in terms of moisture content, protein percentage and other factors. Farmers don't always know what quality their crop will finish at. They count on having two general markets: if the tested quality meets malting standards, the premium over feed prices can be healthy. A 2011 report noted that malting barley provided a 70% premium over feedstock barley. Growing specific varieties and/or using organic farming methods can provide additional premiums over generic feed barley. The curious can follow the links below.

How does this relate to publishers and advertising and OpenRTB? In my opinion we need several things standardized:

1) Inventory registration and description API. This would give publishers influence over how their inventory is exposed in various demand-side and trading-desk platforms. Publishers should fully describe their inventory in a common format. Buy-side GUIs and algorithms will benefit from increased annotation and categorization. This can also harmonize the brand-safety ratings, which are currently not connected between the sell and buy sides.

2) Standardization of the emerging 'Private Marketplace' (PM) models in RTB. A set of best practices and trading procedures for PM needs to be defined so that the market can grow properly.

While the main bid request/response API of OpenRTB has been criticized as being 'too late' given the large implementations in production, it is not too late to define standards for the above. These things will help the buy-side better differentiate quality inventory.