Primal and dual valuation of our natural resources using O.R.

There are many ways in which one can devise the importance of O.R. to protect our environment, many of which dealing with optimization problems related to directly reducing the costs to prevent its destruction and so on. However, what about the environmental impacts from our patterns of consumption? Shall we change our way of living dramatically or rather find a balance between what we want and what we can use from our environment? Maybe O.R. can help us on that.

Roughly speaking, Operations Research (O.R.) deals mostly with finding the best way of doing something subject to a lot of different kinds of restrictions. Thus, one can indirectly consider the protection of the environment whilst solving a wide range of different optimization problems related to the daily needs of our society. For that sake, I figure two possibilities to consider the protection of the environment:

  • pricing natural resources appropriately as a subtraction to the profit of the operation;
  • limiting their use so as to avoid that we steal the share that belongs to the future generations.

I’ve already written about the first possibility in my post about optimizing public policies for urban planning. My fiancee and I devised a model to consider the environmental costs of subsidizing low income housing units at different parts of the city in what comes to daily displacement to work. However, finding the right data to run the model turned out to be our biggest problem. When one defines a penalty to the environmental impact related to the profit of an operation – for instance, using the value of carbon credits – it represents a cost to the problem. However, sometimes we might not have data to price it. But if the consumption of a given natural resource is limited by a constraint instead of penalized in the profit, it is still possible to figure the economic importance of such resource through duality. Therefore, let’s take a look at the second possibility as an alternative to finding the price of natural resources – as well as avoiding an excessive use of them.

The concept of duality in linear programming allows us to associate costs to our constraints. Suppose that our optimization problem is about finding the amount of goods of each kind to produce in order to maximize the profit that they generate subject to the limited resources available. The dual of this problem consists of finding the price of each unity of our limited resources in order to define the minimum price at which it is worthier to sell them instead of processing subject to how much profit each finished good would give us. The relationship between those two problems is quite strong: if a resource is not used up to its limit, its dual cost is zero – meaning that it does not have an economic importance according to the model. Therefore, duality can help us devising how much a limited resource is worth (if it is worth something) and thus provide a way of valuating resources according to their limitation and importance.

As a matter of fact, the more you understand the relation between primal and dual problems, the easier it becomes to talk face-to-face to economists. Indeed, this topic has a lot to do with the 1975 Nobel Prize in Economics. If you want to know more about that, the prize lectures from Kantorovich and Koopmans are a good starting point.

This post was written to the September’s INFORMS blog challenge: O.R. and the Environment.

Drug discovery optimization: a meeting point for data mining, graph theory and operations research

How can I help finding the best compound to save a life?

In a glance, the answer is what research in optimizing drug design is about. Its main goal is to cheapen and speed-up drug development whilst producing more effective and less toxic pharmaceuticals. It is an interesting topic that involves much of what business analytics and optimization professionals use daily to solve problems in many other areas. For that reason, I decided to write about that for April’s INFORMS Blog Challenge: O.R. in Health Care.

Drug design basics: in vitro vs. in silico methods and QSAR analysis

Before a drug can be prescribed to humans, it must pass through a careful evaluation. However, it would not be reasonable or even feasible to test every possible compound for each desired effect, especially if such tests involve animals. Pharmaceutical researchers are interested in screening molecules on a huge data set, so that only those molecules that are most likely to work are ultimately tested in vitro, in animals and ultimately in humans. They accomplish that with computer-aided predictive analysis commonly called in silico tests.

In silico tests are designed in the belief that there exists a valid theory to extrapolate quantitatively the activity of a compound based on similar ones, using tools like Quantitative Structure-Activity Relationship (QSAR) analysis. The goal of a QSAR analysis is to find a function capable of predicting the activity of the main molecule of a compound based on the presence and quantity of certain molecular substructures. We are then lead to the quest for a way of representing molecules and to use available data about an activity to create trustable predictions for similar molecules.

The role of data mining and molecules as graphs

For many purposes, a molecule is just a graph with vertices and edges. However, the fact that molecular data are more structured does not mean that the problem is easier. Roughly* deciding whether one graph can be found inside another one is know as the Sub-Graph Isomorphism problem. That problem is NP-Complete, what means that there is not know an efficient algorithm to solve it.

(* By roughly, I ask you to forgive the lack of formality in which I defined sub-isomorphism: graphs are not said to be inside each other. A one-to-one correspondence between one graph or sub-graph and another one is called an isomorphism. However, if I was to tell it at first, I would lose most of the audience.)

More than just finding a sub-graph isomorphism, one is usually interested in find the most frequent ones. One interesting approach to Frequent Sub-Graph (FSG) mining is gSpan, a DFS-based mining algorithm proposed by Yan and Han. It consists of defining a unique lexicographic encoding for each sub-graph so that it can be counted more easily while exploring molecular data. There is also a controversy about the validity of 2D models such as graphs for modeling molecules, specially because some geometric subtleties differ a innocuous molecule from a a carcinogenic one. Thus, it is worth of notice that predictive models are not intended to skip in vitro tests but just point out which molecules are most likely to work out.

How can operations research fit in?

There are a number of O.R. applications for what we have been discussing here.

I would like to mention one interesting application of Linear Programming (LP) to QSAR analysis by Saigo, Kodawaki and Tsuda. Their approach consisted of using LP to build a regression function for prediction in which error is minimized. It is based on positive and negative relations between molecules and activities, that is, a quantitative relation among each molecule and the activity or a constraint imposing that such relation is not relevant. The decision variables of the problem are the coefficients associated with every possible substructure. Since there can be a lot of possible substructures, they start with a small problem and there is a column generation algorithm that tries to find a new variable whose addition to the problem might improve the results.

Final remarks: OR and the invisible good

The fun of many professions is to see the benefit of what you are doing. That is even more present in health care, since you are dealing with people and their lives. On the other hand, OR is the kind of invisible science, which is on its best use when the average people can not sense its presence. Notwithstanding, OR practitioners enjoy numbers and are glad to know that things are a bit better because of their work behind the scenes. Despite that, blogging about OR can help people notice the importance of the area to support others that are commonly visible to everyone.

I would like to thank my mother-in-law for some helpful advises to write about QSAR.