More often then not programs we created don't do what we want them to: runtime errors or invalid results appear, and we have to figure out what is wrong. It's especially intimidating if you have only 10 minutes left to find and fix the bug before the end of the programming contest.
There are several approaches to that. The best one is to check that the general idea is correct and carefully read through the code. Reading and thinking are faster than anything else, plus you can't really get around them anyway.
On the opposite side sits "debugger". Ability to pause at the key moments and carefully inspect the program state sounds awesome. In practice though, it's a very time-consuming process of stepping through the code, adding breakpoints and watch expressions for the data you are interested in. Multiple people advocates using debugger only as a last resort. I found debugger useful from time to time to understand something non-trivial while I have a luxury of time after a contest.
Logging and asserts seem like lightweight and robust tools to get some intel about what's going on in a program. Here I am going to talk about logging.
For the sake of the example let's use "find k-th smallest element" problem: you are given $n$ numbers and want to find $k$-th smallest ($0<k <= n$) in $O(n)$.
#include <bits/stdc++.h> using namespace std; int find_kh(auto a, auto b, int k) { auto pivot = *(a + (b - a) / 2); auto lower = partition(a, b, [pivot](auto const& x) { return x < pivot; }); auto upper = partition(lower, b, [pivot](auto const& x) { return x == pivot; }); if ((upper - a) <= k) return find_kh(upper, b, k - (upper - a)); if ((lower - a) > k) return find_kh(a, lower, k); return pivot; } int main() { int n, k; cin >> n >> k; vector<int> v(n); while (n--) cin >> v[n]; cout << find_kh(v.begin(), v.end(), k - 1) << endl; }
Let's add a detailed log of function internals:
int find_k(auto a, auto b, int k) { auto pivot = *(a + (b - a) / 2); cerr << "pivot " << pivot << endl; auto lower = partition(a, b, [pivot](auto const& x) { return x < pivot; }); auto upper = partition(lower, b, [pivot](auto const& x) { return x == pivot; }); cerr << "partitioned: "; auto i = a; while (i != b) cerr << *(i++) << ", "; cerr << endl; cerr << "size < : " << (lower - a) << endl; if ((lower - a) > k) { cerr << "left" << endl; return find_k(a, lower, k); } cerr << "size <= : " << (upper - a) << endl; if ((upper - a) <= k) { auto new_k = k - (upper - a); cerr << "right, new_k =" << new_k << endl; return find_k(upper, b, new_k); } return pivot; }
yikes! Look how much noise we've added to the function and now it's much harder to comprehend. Moreover, it's not possible to submit such code to judge as it might fail due to timeout as we are doing additional vector traversals and sync on stderr. Of course, there are workarounds for that, like commenting out logging or guarding it with #ifndef ONLINE_JUDGE. That works but requires more typing (aka time).
I have used to log like this. But from time to time stumbled into some cases when logging requires introducing intermediate variables or new sections (see the printing of th updated value of k above) and that wasn't seem to be the most productive way to go.
After one of the Codeforces contests, I have decided to sort this thing out. First of all, I've listed "requirements":
- don't take too much space in solution's template file (mine was already 20 lines long).
- conditional header files are fine. Would be nice if the size is small and it doesn't significantly increase compilation time.
- less typing, these << ' ' << are not fun.
- do something with the logging of return and intermediate expressions, e.g. conditions of if or function arguments.
- if I am going to include header then lets add some common print routines for vectors and tuples!
I have experimented with several non-header approaches but they took too much real estate in the template, plus there is no need to send this code. And the ability to print std containers looks so appealing. Let's create a header!
There is a really good header-only library for printing std containers by Louis Delacroix. It's included through the new header and works great with things like vectors, maps or tuples.
Let's look at the logging function itself:
template<typename T, typename... O> decltype(auto) debug(T&& arg, O&&... args) {...}
it takes one or more arguments, prints them and returns the first one. This allows debugging data "in-flight":
if (debug(a < b, a, b)) { return debug(a, "<", b); }
For a = 3, b = 4 that will print
1 3 4 3 < 4
For a time I have tried to make this function return the value of the last argument so it would make debug(a, b, a < b) behave like (a, b, a < b) (given that additional arguments don't have side effects) and one can simply remove the name of the function to stop logging while preserving semantics. And.. I have not found an easy way to do that. Moreover, what I really wanted is to stop calculating all arguments but the "main one" on judge, as additional expressions can be really expensive.
Now it's time to leave this function alone and write some macro:
#if defined(LOCAL) #include "logging.h" #define L(x...) (debug(x, #x)) #else #define L(x, ...) (x) #endif
What's happening here? First of all, if LOCAL is defined we include the header and L() will call debug with everything passed to macro as is, plus #x. This "hash" is a really neat feature of the preprocessor that puts the a string literal of macro argument. If LOCAL is not defined (as it will be on a judge) then the preprocessor will throw away everything but first macro parameter thus removing any side effects of logging.
A short note about the choice of identifier. For some period I have relied on the absence of ONLINE_JUDGE but later discovered that Yandex.contest does not specify it; so I resorted to the presence of LOCAL identifier instead that is under my control.
It's time to look how this changes the impact of logging on our function:
int find_k(auto a, auto b, int k) { auto pivot = L(*(a + (b - a) / 2), "pivot"); auto lower = partition(a, b, [pivot](auto const& x) { return x < pivot; }); auto upper = partition(lower, b, [pivot](auto const& x) { return x == pivot; }); L("partitioned", vector<int>(a, b)); if (L(lower - a) > k) return find_k(a, lower, L(k, "left")); if (L(upper - a) <= k) return find_k(upper, b, L(k - (upper - a), "right")); return pivot; }
If there is only one expression to log then you can stop printing it out by deleting a single letter L while keeping everything else intact (e.g. logging of lower - a in the condition above).
Note that as log will automatically print expression we are interested in, it's no longer necessary to provide additional string hints. On the other hand, when logging statement is an expression by itself, like the output of a range after partition above, I found it useful to do add some constant as a first argument. It's true that compiler will likely optimize out even very complex expressions if they have no side effects. I do that to be 100% sure. Maybe for you, it will make more sense to introduce another macro for standalone logging that will be completely removed on judge.
While it will definitely require one to spend some time to get used to a new way of printf-ing and learn how to deal with not so obvious cases, the speed and ease of use will massively offset that.
You can find this tiny header source in gist (don't forget to download prettyprint as well).
Maybe you know a better way to log program state? Please share your thoughts in the comments below!
P.S. while this post was in the draft, Jonathan Boccara published very similar idea of "tee" function. I invite you to check his implementation and inspiration for this method.