When Brazil excels for real (or floating point): International Olympiads in Informatics

One gold and three bronze medals out of four competitors: that would be routine for some countries, but it meant a lot to Brazil in the 2011 edition of IOI. It was the best result of the country ever, achieved after more than a decade of continuous hard working by many people, including some professors and colleagues of mine from the University of Campinas – Unicamp – and the University of São Paulo – USP – but also from many other places. Like some friends of mine, I got more proud of that result than I would be of a World Cup title.

I had the opportunity to participate on the training for selecting the Brazilian competitors for the 2003 IOI and, despite scoring very bad at that selecting contest, I left it motivated to keep studying and practicing. In the years that followed, I tried my best in the South American and the Southwest European ICPC contests and achieved a humble result of three bronze medals. But the best part of it was that I learned a lot during those five years and so did most of my colleagues that went on the same direction, building a network of professionals that indicate each other for interesting jobs.

I do not think it is very common that IT undergrads follow the path towards an OR specialization, but that happens more often among those that engage in programming contests that valuate algorithm design and implementation skills. Such contests represent a great opportunity to leverage the area in Brazil, since the training required by the new generations can be supported by a number of professors that had their abroad doctoral studies sponsored some decades ago. Despite how far we are from devising strategic plans to excel somehow, good ideas here and there (even if decades ago) and the passionate effort of great individuals are playing an important role to the development of our country.

Revisiting operations research elements – part II: results, research… and failure!

This post resumes the discussion about key concepts of operations research started previously in this blog. However, instead of focusing only on the meaning of terms, this time I’d like to discuss about which results are worth of publishing and the unacknowledged importance of failure.

Are you the “brand new champion”?

“I never knew a soul who ever took a licking.
My friends have all been champions at everything.”
(Fernando Pessoa, Portuguese poet)

For a while, I thought that the idea described by Fernando Pessoa explained what happens in O.R.: an article is only worth of being published if it has noticeable results. However, which results are we talking about? In “Testing Heuristics: We Have It All Wrong”, John Hooker stresses many concerns about how results-driven approaches have been transforming research into plain development. He summarized most of them by the maxim “The tail wags the dog as problems begin to design algorithms”. In fact, such concerns were wide spreading by that time. In the article “Generating Hard Satisfiability Problems”, Bart Selman and others warned that bad benchmarks may lead to wrong conclusions about how hard a problem is and, conversely, about how good an algorithm to such problem is. Moreover, theoretical results such as the No Free Lunch theorems state that there is no best overall algorithm and there are always weaknesses to be acknowledged. Otherwise, it would not make sense to talk about a subject called meta-learning, which aims at selecting the most appropriate algorithm to solve each problem instance.

I remember to hear about the importance of reporting bad results in experimental physics classes, but that was quite lost in my mind after my junior and sophomore years. The big picture suggested by Selman’s group and Hooker was that many approaches being done in the area strive to be claimed the best one to solve a given problem at the cost of biasing the tests towards situations that are better suited to what is being proposed, intentionally or not. Besides, hardware differences, coding abilities and several other issues might influence what is being presented. Minding that metaphor that I used in the previous post, one must not confuse the power of an idea with the performance of a source code when dealing with a given benchmark set at a given machinery architecture. For instance, if someone is proposing an algorithm that explores fewer branches during search, it would be better to measure the number of branches instead of the running time, since it might be the case that a bad choice of data structures might hinder an adequate comparison of different techniques. Hence, one must strive to write an article that does recognize when its solution fails along with when it succeeds, whilst striving to diminish perturbation factors on its results.

After all, operations research is research, right?

What would you answer if asked that? I let it go unanswered last time. Most often, O.R. does involve scientific research: if the problem being solved was quite simple to skip that part, it would be unnecessary to have someone thinking about it. Since I was asked that while working on a long-term project, that was quite true. Notwithstanding, even if there is no scientific research involved, O.R. is still sort of a research: one wants to find the most appropriate way of performing certain operations. Therefore, some of the subtleties and misconceived concepts mentioned previously come from the differences among scientific research and operations research: research involves a problem, solutions and it may fail sometimes. Indeed, the word research in O.R. is also the same used to denote research in terms like R&D in Brazil (pesquisa), Portugal (investigação) and France (recherche). Not surprisingly, the list of ambiguous terms is far from complete. For instance, programming means both planning something or the act of producing computer code, as stressed by Lusting and Puget in the article “Program Does Not Equal Program: Constraint Programming and Its Relationship to Mathematical Programming”, in which they explain why it means both things for mathematical programming but just the later for constraint programming.

Yet another conclusion

I have decided to place the most important conclusions along with the most confusing term of the previous post – solution, but I can repeat them once more. Being aware of the differences between the common O.R. terminology and what researchers adopt anywhere else is important to be more effective when spreading the word about what you did beyond your close peers. In these two posts, I aimed at sharing some thoughts that I would be glad to have heard before. In order to succeed with that, I had the valuable help of my fiancée Sabrina, which helped me by revising this text in order to avoid that I committed the same mistakes that I was pointing. It might still not be completely accurate or clear cut, since I’m just a graduate student striving to figure out my area.

Revisiting operations research elements – part I: problem, model and solution

People often admit that they have just discovered to know nothing about something they thought they knew. I felt that about terms like “problem”, “model” and “solution” during the last year. At a first sight, those words seem quite simple to anyone: a problem is a quest one wants to resolve, a model is an insightful description of such problem and a solution is an attempt of resolving the problem. As I want to show next, big messes are made of such simple things, especially in OR.

What is your problem?

Once, when I practiced a presentation at home saying “My problem is that I have those decision variables Xi, Yi and Zi”, I was interrupted by my mother-in-law with a “No! Your problem is that you have expensive machinery, oil wells to develop and you want to maximize the profit of the company”, as I indeed had! She might not know much about combinatorial optimization problems or other related class of optimization problems, but after two daughters, Ph.D. and Post-Doc in organic chemistry and a published book, she definitely knows better what a problem is.

I used to consider the “problem” as a class of purely abstract formulations or a domain-targeted specialization of such a class. However, I was ignoring the actual problem – something that must be fixed or enhanced –, which was occasionally mentioned as a motivation for the study. Indeed, such mistake can be the first step to detach a promising subject from its context and devalue its importance. As a matter of fact, I should have done the converse path: to consider my problem as a real matter from which disclosure is motivated by the resemblance with other problems of practical importance.

What does your model explain?

For a long time, I’ve been used to denote as a “model” what I write as input to an optimization solver. However, after reading “Good and Bad Futures for Constraint Programming (and Operations Research)” by John N. Hooker, I decided to get a step back about that. In his article, Hooker emphasizes the importance of models to leverage scientific progress according to their original meaning: an elegant description of a natural phenomenon. Under that assumption, a model is as valuable as its ability to explain how a given system works and why it works that way. For sure, that is not achievable by simply stating a bunch of equations.

The input of an optimization solver is always computer source code, even if it is at a very high level of abstraction when compared to generic programming languages. Nevertheless, what you are able to describe with such programming language – its syntax – can be considered as the “model” for some sort of class of abstract formulations, and my code was just a specialization of it. Therefore, it would be more accurate to say that I’ve spent much more time coding upon a model of the problem than modeling the problem. Metaphorically, the model is an ethereal inspiration to a substantial source code (hold that thought for the next post).

What are you solving?

The term “solution” has been so often associated with variable assignments of a given problem formulation that its use in its original assertion sounds like an invitation to confusion. I felt that when reading an article about Case-Based Reasoning (CBR) applications to constraint programming (CP), in which the authors had to manage such nomenclature conflict: in CBR, a solution is understood as a technique for solving a given problem and was forcefully redefined as a “resolution”, given that, in CP – and OR in general –, a solution is the output of a technique that solves some problem. I can’t imagine a better way of handling it than theirs: the ambiguity was acknowledged and possibly taught to some unaware readers, and the article moved forward.

Maybe it is a lost battle trying to associate the term “solution” in the OR field with what the general public would expect it to be. However, if that is not possible, one must emphasize the difference, since the benefit of succeeding on that – as well as in the other cases exhibited – is to present an original work in a more accessible language to researchers of other fields. Therefore, the reward for that is a higher probability of having your effort being recognized and used somewhere else.

Why to write about this?

Operations research possesses a powerful toolset, which can be applied almost everywhere. However, many authors acknowledge that OR is in a constant danger of extinction for being self-centered and detached from the outside world. I believe that one of the reasons for such belief is the misconception of terms. For that reason, I aimed to make my point as clear as possible to someone outside the field, what demanded a great number of revisions by my fiancée Sabrina. Doing that was much harder than writing the post at once and I am very grateful for her patience with my attempt to escape my own ivory tower.