Theory of Computing Blog Aggregator

Can we avoid accepting what we cannot verify?

Arthur Clarke was a British writer of great breadth and huge impact. He was a science writer, of both fiction and non-fiction. His works are too many to list, but 2001: A Space Odyssey—the novel accompanying the movie—is perhaps his most famous. He received both a British knighthood and the rarer Pride of Sri Lanka award, so that both “Sri” and “Sir” were legally prefixed to his name.

Today Dick and I want to raise questions about modern cryptography, complexity, and distinguishing science from “magic.”

Clarke added the following widely quoted statement to the 1973 revised edition of his book Profiles of the Future: An Inquiry into the Limits of the Possible as the third of three “laws” of prediction:

Any sufficiently advanced technology is indistinguishable from magic.

When I, Ken, first read the quote, I was surprised: it didn’t seem like something a proponent of science would say. Tracing versions of his book’s first chapter in which the first two “laws” appear, it seems Clarke first had nuclear fission in mind, then transistors, semiconductors, clocks incorporating relativity, and a list that strikingly includes “detecting invisible planets.” I agree with critiques expressed here and here: these phenomena are explainable. Perhaps Richard Feynman was right that “nobody really understands quantum mechanics,” but besides rebuttals like this, the point is that a natural explanation can satisfy by reduction to simpler principles without their needing to be fully understood. In all these cases too, experiment gives the power to verify.

My newfound appreciation for Clarke’s quip, and Dick’s all along, comes from realizing the potential limits of verification and explanation in our own field, computational complexity. This applies especially to crypto. Let’s see what the issues are.

Checking Results

As often in computing, issues have “ground” and “meta” levels. Let’s start on the ground.

The big center of gravity in complexity theory has always been the idea of {\mathsf{NP}}. The idea is that although it may take a lot of effort to find a solution to a problem in {\mathsf{NP}}, if and when you find one, it is easy to check. Separate from whether {\mathsf{P = NP}}, there has been much success with programs whose outputs are checkable with markedly greater efficiency than the original computation. This gives not only instant assurance but usually also an explanation of why the solution works.

An even lower example in complexity terms is verifying a product {C = AB} of {n \times n} matrices. Although we may never be able to multiply {AB} in {O(n^2)} time, once we have {C} we can verify in {O(n^2)} time by randomly choosing {n}-vectors {x,y} and checking that {xCy} equals {xA} times {By}. Here the randomized check may give less explanation, but the ability to repeat it quickly confers assurance. We should mention that this idea originated with Rūsiņš Freivalds, who passed away last month—see this tribute by his student Andris Ambainis.
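
For concreteness, here is a minimal sketch of Freivalds’ check in its common one-vector form: pick a random 0/1 vector {x} and compare {Cx} with {A(Bx)}. The two-vector {x,y} version above works the same way; each trial costs {O(n^2)} arithmetic and a wrong product escapes detection with probability at most {1/2} per trial.

```python
import random

# Freivalds' randomized verification of a matrix product C = A*B.
# Each trial does three matrix-vector products, i.e. O(n^2) work,
# instead of recomputing A*B from scratch.
def freivalds(A, B, C, trials=20):
    n = len(A)

    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        if matvec(C, x) != matvec(A, matvec(B, x)):
            return False          # witness found: certainly C != A*B
    return True                   # C = A*B except with prob <= 2^(-trials)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[19, 22], [43, 50]]          # the true product A*B
assert freivalds(A, B, C)
wrong = [[19, 22], [43, 51]]      # one entry off
assert not freivalds(A, B, wrong)
```

Repeating the check drives the error down exponentially, which is exactly the “repeat it quickly” assurance mentioned above.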

As the field has matured, however, we’ve run into more cases where no one has found a verifying predicate. For example, every finite field {F = F_{p^n}} has elements {g} whose powers {g^i} generate all the non-zero elements of {F}. However, no way is known to recognize them in time polynomial in {n\log p,} let alone find one. The best recent news we know is a 2013 paper by Ming-Deh Huang and Anand Narayanan that finds one in time polynomial in {pn} which is fine when {p} is small. In other things too, we seem to be moving away from {\mathsf{NP}} to non-checkable complexity classes.
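
To make the difficulty concrete in the simplest case, a prime field {F_p}: even there, the natural way to verify that {g} generates {F_p^*} needs the prime factorization of {p-1}, which the sketch below can only afford because {p} is tiny. This is purely illustrative and is not the Huang–Narayanan algorithm.

```python
# Check whether g generates the cyclic group F_p^* (of order p-1).
# The test g^((p-1)/q) != 1 for every prime q | p-1 is correct, but it
# presupposes factoring p-1, done here by trial division on a tiny p.
def prime_factors(m):
    fs, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            fs.add(d)
            m //= d
        d += 1
    if m > 1:
        fs.add(m)
    return fs

def is_generator(g, p):
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

# p = 23: p - 1 = 2 * 11; 5 generates F_23^*, while 2 does not (2^11 = 1 mod 23)
assert is_generator(5, 23)
assert not is_generator(2, 23)
```

No analogous certificate is known that avoids the factorization, which is the verification gap the paragraph above is pointing at.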

Learning and Explaining

To see a still broader form of not knowing, we note this remark by Lance Fortnow on the Computational Complexity blog about a ramification of the deep-learning technique used to craft the new powerful Go-playing program by Google:

“If a computer chess program makes a surprising move, good or bad, one can work through the code and figure out why the program made that particular move. If AlphaGo makes a surprising move, we’ll have no clue why.”

To say a little more here: Today’s best chess programs train the coefficients of their position-evaluation functions by playing games against themselves. For example, the Weights array for last year’s official Stockfish 6 release had five pairs:

\displaystyle  \{289, 344\},\;\;\{233, 201\},\;\;\{221, 273\},\;\;\{46, 0\},\;\;\{321, 0\}.

Line 115 of the current Stockfish source file evaluate.cpp has four pairs:

\displaystyle  \{214, 203\},\;\;\{193, 262\},\;\;\{47, 0\},\;\;\{330, 0\}.

The myriad Score numbers following it are likewise all different. They are mysterious but they are concretely present, and one can find the position-features they index in the code to trace why the program preferred certain target positions in its search. The question of why Deep Blue played a critical good move against Garry Kasparov in 1997 caused controversy when IBM refused Kasparov’s request for verifying data, and it may still not be fully resolved. But I’ve posted here a similar example of tracing the Komodo program’s refusal to win a pawn, when analyzing the first-ever recorded chess game, played in the year 1475.

Deep learning, however, produces a convolutional neural network that may not so easily reveal its weights and thresholds, nor how it has learned to “chunk” the gridded input. At least with the classic case of distinguishing pictures of cats from dogs, one can more easily point after-the-fact to which visual features were isolated and made important by the classifier. For Go, however, it seems there has not emerged a way to explain the new program’s decisions beyond the top-level mixing of criteria shown by Figure 5 of the AlphaGo paper. Absent a human-intelligible explanation, the winning moves appear as if by magic.

Crypto and Illusion

James Massey, known among other things for the shift-register deducing algorithm with Elwyn Berlekamp, once recorded a much-viewed lecture titled,

“Cryptography: Science or Magic?”

What Massey meant by “magic” is different from what Clarke meant. In virtually all cases, modern cryptographic primitives rely on the intractability of problems that belong to the realm of {\mathsf{NP}} or at worst {\mathsf{\#P}}. The widespread implementation of public-key cryptography and digital signatures via RSA relies on factoring, which belongs to {\mathsf{NP \cap co}\text{-}\mathsf{NP}}. Even if we knew {\mathsf{NP \neq P}}, we still wouldn’t have a proof of the security needed to establish that the world really confers the capabilities promised by these primitives.

As above, the issue is the lack of ability to verify a property, but this time of an algorithmic protocol rather than a single solution or a chess or Go move. This “meta” level goes to the heart of our field’s aspiration to be a science. Of course we can prove theorems conditioned on hardness assumptions, but apart from the issue we have covered quite a few times before of loopholes between security proofs and application needs, we perceive a more fundamental lack:

The theorems don’t (yet) explain from first principles what it is about the world that yields these properties.

Modern cryptography is filled with papers that really say the following: if certain simple-to-state complexity hardness assumptions are correct, then we can do the following “magical” operations. This is viewed by those who work in cryptography, and by those of us outside the area, as a wonderful success story. That the hardness of factoring implies the ability to encrypt messages is not so hard to believe, but that we can make the encryption asymmetric—a public-key system—was at first a real surprise. That this assumption further allows us to do some very powerful and wonderful operations is even more surprising.

Massey gives as an example the technique developed by Amos Fiat and Adi Shamir to convert any interactive proof of knowledge into one that carries the signature of the person conveying the proof of knowledge, without needing that person’s active participation. It preserves the feature that only the person’s fact of knowing is transmitted, not the known data itself. This is truly wondrous: you can interact offline.
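
A toy sketch of the Fiat–Shamir idea, applied here to Schnorr’s identification protocol: the verifier’s random challenge is replaced by a hash of the prover’s commitment and the message, so the “interaction” happens offline. The group parameters below are tiny demonstration values and utterly insecure, and the hash merely stands in for the random oracle.

```python
import hashlib
import random

# Tiny Schnorr group (demo only, NOT secure): p = 2*q + 1 with p, q prime,
# and g = 4 generating the order-q subgroup of Z_p^*.
p, q, g = 2039, 1019, 4

def H(r, msg):
    # Fiat-Shamir: the hash of (commitment, message) replaces the verifier
    h = hashlib.sha256(f"{r}|{msg}".encode()).hexdigest()
    return int(h, 16) % q

def keygen():
    x = random.randrange(1, q)          # prover's secret
    return x, pow(g, x, p)              # public key y = g^x mod p

def sign(x, msg):
    k = random.randrange(1, q)          # one-time randomness
    r = pow(g, k, p)                    # commitment
    e = H(r, msg)                       # non-interactive challenge
    s = (k + e * x) % q                 # response
    return e, s

def verify(y, msg, sig):
    e, s = sig
    # recompute the commitment: g^s * y^(-e) = g^(k + ex) * g^(-ex) = g^k
    r = (pow(g, s, p) * pow(y, (q - e) % q, p)) % p
    return H(r, msg) == e

x, y = keygen()
sig = sign(x, "the proof of knowledge")
assert verify(y, "the proof of knowledge", sig)
```

Verification recomputes {g^k} from public data alone; a signature on a different message passes only if the hash collides, which happens with probability about {1/q} per attempt in this toy setting.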

We can trace the composition of hardness assumptions, plus the assumption of implementing a suitable random oracle, in a conditional theorem. Since we cannot prove any of the conditions, Massey asks, is this all an illusion? How can we tell it’s not? For us this raises something further:

If a capability seems too wondrous to believe, can that be grounds for doubting one or more of the assumptions that underlie it?

In short, the magical success of crypto could be misleading. Could the assumptions it rests on go beyond magic into fantasy? Could factoring be easy, and therefore modern crypto be wrong? Could the whole edifice collapse? Who knows?

Open Problems

The three issues we have highlighted can be summarized as:

  • Not knowing how to test what we are getting;

  • Getting something that works in the field but not knowing how we got it; and

  • Not knowing whether things deployed in the field are really true.

Can we distinguish the parts of theory that underlie these problems from “magic”? Must we bewail this state of our field, or is it an inevitable harbinger that complexity, learning, and crypto are becoming “sufficiently advanced”?

by RJLipton+KWRegan at February 08, 2016 12:42 AM UTC

Authors: Michele Borassi
Download: PDF
Abstract: In this work, we consider the following problem: given a digraph $G=(V,E)$, for each vertex $v$, we want to compute the number of vertices reachable from $v$. In other words, we want to compute the out-degree of each vertex in the transitive closure of $G$. We show that this problem is not solvable in time $\mathcal{O}\left(|E|^{2-\epsilon}\right)$ for any $\epsilon>0$, unless the Strong Exponential Time Hypothesis is false. This result still holds if $G$ is assumed to be acyclic.
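
For contrast with the lower bound, the naive baseline runs one graph search per vertex, taking O(|V| * |E|) time; a brief illustrative sketch (conventions are an assumption, e.g. whether a vertex counts itself):

```python
from collections import defaultdict

# For each vertex of a digraph, count the vertices reachable from it
# by running one DFS per source. Here the count includes the source itself.
def reachability_counts(n, edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    counts = []
    for s in range(n):
        seen = {s}
        stack = [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        counts.append(len(seen))
    return counts

# DAG: 0 -> 1 -> 2 and 0 -> 2
assert reachability_counts(3, [(0, 1), (1, 2), (0, 2)]) == [3, 2, 1]
```

The paper’s result says that, assuming SETH, no algorithm can beat this kind of bound down to {|E|^{2-\epsilon}}, even on DAGs.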

Donate to arXiv

at February 08, 2016 01:40 AM UTC

Authors: Guy Blelloch, Daniel Ferizovic, Yihan Sun
Download: PDF
Abstract: The ordered set is one of the most important data types in both theoretical algorithm design and analysis and practical programming. In this paper we study the set operations on two ordered sets, including Union, Intersect and Difference, based on four types of balanced Binary Search Trees (BST): AVL trees, red-black trees, weight-balanced trees and treaps. We introduce a single subroutine, Join, that needs to be implemented differently for each balanced BST, and on top of which we can implement generic, simple and efficient parallel functions for ordered sets. We first prove the work-efficiency of these Join-based set functions using a generic proof that works for all four types of balanced BSTs.

We also implemented and tested our algorithms on all four balancing schemes. Interestingly, the implementations on all four data structures and three set functions perform similarly in time and speedup (more than 45x on 64 cores). We also compare the performance of our implementation to other existing libraries and algorithms.
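
As a flavor of the Join-based style, here is a sequential sketch specialized to treaps (one of the four schemes): Union is written on top of a single Join primitive, with Split written directly for brevity, and balance is maintained by the random priorities alone. This is an illustrative reconstruction, not the paper’s parallel implementation.

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.pri = key, random.random()
        self.left = self.right = None

def join(l, m, r):
    """Join trees l < m < r (m a single node); the highest priority roots."""
    if (l is None or l.pri < m.pri) and (r is None or r.pri < m.pri):
        m.left, m.right = l, r
        return m
    if l is not None and (r is None or l.pri > r.pri):
        l.right = join(l.right, m, r)
        return l
    r.left = join(l, m, r.left)
    return r

def split(t, k):
    """Split t by key k into (keys < k, k present?, keys > k)."""
    if t is None:
        return None, False, None
    if k < t.key:
        l, b, r = split(t.left, k)
        t.left = r
        return l, b, t
    if k > t.key:
        l, b, r = split(t.right, k)
        t.right = l
        return t, b, r
    return t.left, True, t.right

def union(a, b):
    # destructive union: split b around a's root, recurse, and Join
    if a is None:
        return b
    if b is None:
        return a
    l, _, r = split(b, a.key)
    return join(union(a.left, l), a, union(a.right, r))

def keys(t):
    return [] if t is None else keys(t.left) + [t.key] + keys(t.right)

def build(xs):
    t = None
    for x in xs:
        t = union(Node(x), t)
    return t

t1, t2 = build([1, 3, 5, 7]), build([2, 3, 4, 8])
assert keys(union(t1, t2)) == [1, 2, 3, 4, 5, 7, 8]
```

Only `join` consults priorities; swapping in a different balancing scheme means rewriting `join` alone, which is the paper’s organizing idea.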


at February 08, 2016 01:40 AM UTC

Authors: Dzenan Zukic, Jan Egger, Miriam H. A. Bauer, Daniela Kuhnt, Barbara Carl, Bernd Freisleben, Andreas Kolb, Christopher Nimsky
Download: PDF
Abstract: The most common sellar lesion is the pituitary adenoma, and sellar tumors are approximately 10-15% of all intracranial neoplasms. Manual slice-by-slice segmentation takes quite some time, which can be reduced by using appropriate algorithms. In this contribution, we present a segmentation method for pituitary adenoma. The method is based on an algorithm that we have applied recently to segmenting glioblastoma multiforme. A modification of this scheme is used for adenoma segmentation, which is much harder to perform due to the lack of contrast-enhanced boundaries. In our experimental evaluation, neurosurgeons performed manual slice-by-slice segmentation of ten magnetic resonance imaging (MRI) cases. The segmentations were compared to the segmentation results of the proposed method using the Dice Similarity Coefficient (DSC). The average DSC for all datasets was 75.92% +/- 7.24%. A manual segmentation took about four minutes and our algorithm required about one second.


at February 08, 2016 01:43 AM UTC

Authors: Nicolas Tremblay, Gilles Puy, Remi Gribonval, Pierre Vandergheynst
Download: PDF
Abstract: Spectral clustering has become a popular technique due to its high performance in many contexts. It comprises three main steps: create a similarity graph between N objects to cluster, compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object, and run k-means on these features to separate objects into k classes. Each of these three steps becomes computationally intensive for large N and/or k. We propose to speed up the last two steps based on recent results in the emerging field of graph signal processing: graph filtering of random signals, and random sampling of bandlimited graph signals. We prove that our method, with a gain in computation time that can reach several orders of magnitude, is in fact an approximation of spectral clustering, for which we are able to control the error. We test the performance of our method on artificial and real-world network data.
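
The three-step baseline pipeline that the paper accelerates can be sketched as follows: standard spectral clustering with the unnormalized Laplacian and a small hand-rolled k-means. This is the baseline only; the paper’s graph-filtering and sampling speedups are not implemented here.

```python
import numpy as np

def spectral_clustering(W, k, iters=20):
    # Step 2: first k eigenvectors of the Laplacian L = D - W as features
    d = W.sum(axis=1)
    L = np.diag(d) - W
    _, vecs = np.linalg.eigh(L)           # eigenvalues ascending
    X = vecs[:, :k]                       # one k-dim feature per object
    # Step 3: k-means with greedy farthest-point initialization
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(dist))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Step 1 (the similarity graph) is given directly: two disconnected
# triangles, which should split into {0,1,2} and {3,4,5}.
W = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[u, v] = W[v, u] = 1.0
lab = spectral_clustering(W, 2)
assert lab[0] == lab[1] == lab[2]
assert lab[3] == lab[4] == lab[5]
assert lab[0] != lab[3]
```

Both the eigendecomposition and k-means become expensive for large N and k, which is exactly where the paper substitutes filtered random signals and sampled bandlimited signals.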


at February 08, 2016 01:40 AM UTC

Authors: David F. Manlove, Iain McBride
Download: PDF
Abstract: The Hospitals / Residents problem with Couples ({\sc hrc}) models the allocation of intending junior doctors to hospitals where couples are allowed to submit joint preference lists over pairs of (typically geographically close) hospitals. It is known that a stable matching need not exist, so we consider {\sc min bp hrc}, the problem of finding a matching that admits the minimum number of blocking pairs (i.e., is "as stable as possible"). We show that this problem is NP-hard and difficult to approximate even if each couple finds only one hospital pair acceptable. However if we further assume that the preference list of each single resident and hospital is of length at most 2, we give a polynomial-time algorithm for this case. We then show how to adapt an earlier Integer Programming (IP) model for {\sc hrc} to yield an IP formulation for {\sc min bp hrc}. Finally, we discuss an empirical evaluation of the IP model applied to randomly-generated instances of {\sc min bp hrc}. Our main finding is that the number of blocking pairs admitted by a solution is very small, i.e., usually at most 1, and never more than 2, for the (28,000) instances considered.


at February 08, 2016 01:43 AM UTC

Authors: H. Furmanczyk, M. Kubale
Download: PDF
Abstract: In the paper we consider the problem of scheduling $n$ identical jobs on 4 uniform machines with speeds $s_1 \geq s_2 \geq s_3 \geq s_4,$ respectively. Our aim is to find a schedule with a minimum possible length. We assume that jobs are subject to some kind of mutual exclusion constraints modeled by a bipartite incompatibility graph of degree $\Delta$, where two incompatible jobs cannot be processed on the same machine. We show that the problem is NP-hard even if $s_1=s_2=s_3$. If, however, $\Delta \leq 4$ and $s_1 \geq 12 s_2$, $s_2=s_3=s_4$, then the problem can be solved to optimality in time $O(n^{1.5})$. The same algorithm returns a solution of value at most 2 times optimal provided that $s_1 \geq 2s_2$. Finally, we study the case $s_1 \geq s_2 \geq s_3=s_4$ and give an $O(n^{1.5})$-time $32/15$-approximation algorithm in all such situations.


at February 08, 2016 01:43 AM UTC

In which we finish the proof of Cheeger’s inequalities.

It remains to prove the following statement.

Lemma 1 Let {{\bf y}\in {\mathbb R}^V_{\geq 0}} be a vector with non-negative entries. Then there is a {0< t \leq \max_v \{ y_v \} } such that

\displaystyle \phi ( \{ v : y_v \geq t \} ) \leq \sqrt {2 R_L({\bf y}) }

We will provide a probabilistic proof. Without loss of generality (multiplication by a scalar does not affect the Rayleigh quotient of a vector) we may assume that {\max_v y_v = 1}. We consider the probabilistic process in which we pick {t>0} in such a way that {t^2} is uniformly distributed in {[0,1]} and then define the non-empty set {S_t := \{ v : y_v \geq t \}}.

We claim that

\displaystyle \frac{\mathop{\mathbb E} E(S_t, V-S_t)}{\mathop{\mathbb E} d |S_t|} \leq \sqrt{2 R_L({\bf y})} \ \ \ \ \ (1)


Notice that Lemma 1 follows from such a claim, because of the following useful fact.

Fact 2 Let {X} and {Y} be random variables such that {\mathop{\mathbb P} [ Y>0] = 1}. Then

\displaystyle \mathop{\mathbb P} \left[ \frac{X}{Y} \leq \frac{\mathop{\mathbb E} X}{\mathop{\mathbb E} Y} \right] > 0

Proof: Call {r:= \frac{\mathop{\mathbb E} X}{\mathop{\mathbb E} Y}}. Then, using linearity of expectation, we have {\mathop{\mathbb E} X - rY = 0}, from which it follows {\mathop{\mathbb P} [ X - rY \leq 0] >0}, but, whenever {Y>0}, which we assumed to happen with probability 1, the conditions {X-rY \leq 0} and {\frac {X}{Y} \leq r} are equivalent. \Box

It remains to prove (1).

To bound the denominator, note that for every {v} we have

\displaystyle \mathop{\mathbb P} [ v \in S_t] = \mathop{\mathbb P} [ y_v \geq t ] = \mathop{\mathbb P} [ y^2_v \geq t^2 ] = y^2_v

because {t^2} is uniformly distributed in {[0,1]} and {0 \leq y_v \leq 1}. Hence

\displaystyle \mathop{\mathbb E} d |S_t| = d\cdot \sum_{v\in V} \mathop{\mathbb P} [ v \in S_t] = d \sum_v y_v^2

To bound the numerator, we say that an edge is cut by {S_t} if one endpoint is in {S_t} and the other is not. The edge {\{u,v\}} is cut precisely when {t} falls between {y_u} and {y_v}, which happens with probability {|y_v^2 - y_u^2|}. We have

\displaystyle \mathop{\mathbb E} E(S_t,V-S_t) = \sum_{\{ u,v \} \in E} \mathop{\mathbb P} [ \{u,v\} \mbox{ is cut} ]

\displaystyle = \sum_{\{u,v\} \in E} | y_v^2 - y_u^2|= \sum_{\{u,v\} \in E} | y_v - y_u|\cdot (y_u + y_v)

Applying Cauchy-Schwarz, we have

\displaystyle \mathop{\mathbb E} E(S_t,V-S_t) \leq \sqrt{ \sum_{\{u,v\} \in E} ( y_v - y_u)^2} \cdot \sqrt{ \sum_{\{u,v\} \in E} ( y_v + y_u)^2 }

and applying Cauchy-Schwarz again (in the form {(a+b)^2 \leq 2a^2 + 2b^2}) we get

\displaystyle  \sum_{\{u,v\} \in E} ( y_v + y_u)^2 \leq \sum_{\{u,v\} \in E} 2y_v^2 + 2y_u^2 = 2d\sum_v y_v^2

Putting everything together gives

\displaystyle \frac{\mathop{\mathbb E} E(S_t, V-S_t)}{\mathop{\mathbb E} d |S_t|} \leq \sqrt{2 \frac { \sum_{\{u,v\} \in E} ( y_v - y_u)^2 } { d\sum_v y_v^2} }

which is (1).
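
As a numerical sanity check (no part of the proof), one can sweep over the thresholds {t} on a small regular graph and confirm the guarantee of Lemma 1 directly:

```python
import math

# On a d-regular graph, verify that some sweep cut S = {v : y_v >= t}
# satisfies phi(S) = E(S, V-S) / (d |S|) <= sqrt(2 * R_L(y)).
def rayleigh(y, edges, d):
    num = sum((y[u] - y[v]) ** 2 for u, v in edges)
    return num / (d * sum(t * t for t in y))

def phi(S, edges, d):
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (d * len(S))

def best_sweep(y, edges, d):
    n = len(y)
    return min(phi({v for v in range(n) if y[v] >= t}, edges, d)
               for t in set(y) if t > 0)

# 6-cycle (2-regular), with y concentrated on one arc
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
y = [1.0, 1.0, 0.5, 0.0, 0.0, 0.0]
assert best_sweep(y, edges, 2) <= math.sqrt(2 * rayleigh(y, edges, 2))
```

On this instance the best sweep cut is {S = \{0,1,2\}} with {\phi(S) = 1/3}, comfortably below the bound {\sqrt{2/3} \approx 0.816}.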

by luca at February 07, 2016 10:12 PM UTC

[Below is the transcript of a talk I gave at the Harvard Lectures that Last event. This was a talk for a general (non-technical) audience, but I still thought some people might find it interesting. For some more technical discussion of the notion of “computational Bayesian probabilities” I hint at toward the end, see the following talk and blog post.]


Today I want to talk about one of humanity’s greatest inventions. This is an invention you encountered already when you were 5 or 6. No, I’m not talking about the wheel or fire or even the iPad, but about the place-value number system.

Because we use the place-value decimal system, we only need six digits to express the number of miles from here to the moon. But if you wanted to write this distance using roman numerals, it would take more than 200 letters. The distance to the sun would fill a small book.

This means that for cultures without place-value systems, these quantities were not merely unknown — they were unspeakable. This is why the place-value system is such a great invention. It did more than simply help us do better calculations: it expanded mankind’s imagination and gave us access to concepts that we simply couldn’t comprehend otherwise.

What does an ancient Babylonian concept have to do with Computer Science? Computers, like people, need to be taught to do even simple things like multiply numbers.
And it matters how they do it. For example, according to my unscientific and non-IRB-approved experiments, a 9-year-old will take about an hour to multiply two 30-digit numbers using the place-value algorithm she learned in elementary school. In contrast, even the fastest supercomputer on earth would take more than a century to do this computation if it followed the naive roman-numeral way of repeatedly adding the numbers to one another. So you see, even with all the technological advances of the last 3000 years, we still need those old Babylonian insights.
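
A toy illustration of the gap (the step counters are for exposition, not a benchmark): repeated addition does about as many steps as the value of the smaller factor, which is exponential in its digit count, while the place-value algorithm does one digit product per pair of digits.

```python
def repeated_addition(a, b):
    """Multiply the 'roman numeral' way: add a to itself b times."""
    total, steps = 0, 0
    for _ in range(b):
        total += a
        steps += 1
    return total, steps

def schoolbook(a, b):
    """Place-value multiplication; steps counts digit-by-digit products."""
    da, db = str(a), str(b)
    total, steps = 0, 0
    for i, x in enumerate(reversed(da)):
        for j, y in enumerate(reversed(db)):
            total += int(x) * int(y) * 10 ** (i + j)
            steps += 1
    return total, steps

# Same product, wildly different step counts: 456 additions vs 9 digit products
assert repeated_addition(123, 456) == (123 * 456, 456)
assert schoolbook(123, 456) == (123 * 456, 9)
```

For two 30-digit numbers the schoolbook count is 900 steps, while repeated addition would need on the order of {10^{30}} steps, which is the century-on-a-supercomputer figure above.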

In fact, Computer Science has about as much to do with computers as English has to do with printing technology. The point of computer science is not about mechanizing some calculation but about finding a novel way to express a concept we couldn’t grasp before. Finding such a way enriches us whether or not it ends up being programmed into a computer, since it lets us access what was heretofore unspeakable.

Some of the phenomena computer scientists investigate arise from modern challenges; whether it is finding a new language to express social interactions among millions of people or ways to speak about biochemical interactions among thousands of genes. But we also study age-old riddles that have been around since the dawn of civilization. For example, it turns out that the basic algorithm we all learn in elementary school is not the best way to multiply two numbers. Computer scientists found new and much better ways in the 20th century and the currently best known algorithm was only discovered in 2007.

Computer science has had amazing advances that expanded our horizons both materially and intellectually. But I am personally even more fascinated by the discovery that some notions are in fact inherently unspeakable. That is, no matter what technological advances or new inventions humanity comes up with, some tasks will be forever out of our reach. These limitations tell us about the fundamental laws of computation, in the same way that the physical laws of nature prohibit a perpetual motion machine or faster-than-light communication. Much of computer science is about dealing with these inherent limitations— how can we talk about what we can not talk about?

Let me give an example loosely related to my own work. Can we really grasp probabilities? We talk about them all the time. You hear in the news that with probability 50% the global sea level will rise by half a meter before 2100, or that Hillary Clinton has a 54% chance of becoming president in 2016. These are important events that have serious consequences. But what do these numbers actually mean?

Economists have long thought they can answer this question. These probabilities are supposed to reflect the predictions of a perfectly rational person that has accumulated all available evidence. This is the same hypothetical rational person that makes free markets work. The only problem is that this person doesn’t exist. And this is not just because we humans are biased, lazy, or stupid. Rather, it turns out that these probabilities are often inherently unspeakable. For example, to truly compute the probability that Hillary Clinton becomes president, you would need to know the probability that she wins Florida, the probability that she wins Ohio, as well as the probability that she wins both Florida and Ohio, and so on and so forth. Unfortunately, there are literally more than a quadrillion such combinations of the 50 states. If you consider all combinations of the 3000+ counties, you get a number so large that it doesn’t even have a name. There is no way for any human or computer to compute all these probabilities. Yet this is what is needed to obtain such perfectly rational predictions.

So, how do we talk about what we can not talk about? This is a topic for a longer lecture, but one approach involves borrowing insights from Voltaire, Ralph Waldo Emerson as well as quantum mechanics. To make these probabilities “speakable” we need to realize that the “perfect is the enemy of the good” and give up on the “hobgoblin of consistency”. In fact we sometimes even need to consider probabilities that are negative numbers. If this sounds mysterious, I hope it entices you to learn some more about computer science.

by Boaz Barak at February 07, 2016 05:41 AM UTC

In my previous post, I linked to seven Closer to Truth videos of me spouting about free will, Gödel’s Theorem, black holes, etc. etc.  I also mentioned that there was a segment of me talking about why the universe exists that for some reason they didn’t put up.  Commenter mjgeddes wrote, “Would have liked to hear your views on the existence of the universe question,” so I answered in another comment.

But then I thought about it some more, and it seemed inappropriate to me that my considered statement about why the universe exists should only be available as part of a comment thread on my blog.  At the very least, I thought, such a thing ought to be a top-level post.

So, without further ado:

My view is that, if we want to make mental peace with the “Why does the universe exist?” question, the key thing we need to do is forget about the universe for a while, and just focus on the meaning of the word “why.”  I.e., when we ask a why-question, what kind of answer are we looking for, what kind of answer would make us happy?

Notice, in particular, that there are hundreds of other why-questions, not nearly as prestigious as the universe one, yet that seem just as vertiginously unanswerable.  E.g., why is 5 a prime number?  Why does “cat” have 3 letters?

Now, the best account of “why”—and of explanation and causality—that I know about is the interventionist account, as developed for example in Judea Pearl’s work.  In that account, to ask “Why is X true?” is simply to ask: “What could we have changed in order to make X false?”  I.e., in the causal network of reality, what are the levers that turn X on or off?

This question can sometimes make sense even in pure math.  For example: “Why is this theorem true?” “It’s true only because we’re working over the complex numbers.  The analogous statement about real numbers is false.”  A perfectly good interventionist answer.

On the other hand, in the case of “Why is 5 prime?,” all the levers you could pull to make 5 composite involve significantly more advanced machinery than is needed to pose the question in the first place.  E.g., “5 is prime because we’re working over the ring of integers.  Over other rings, like Z[√5], it admits nontrivial factorizations.”  Not really an explanation that would satisfy a four-year-old (or me, for that matter).

And then we come to the question of why anything exists.  For an interventionist, this translates into: what causal lever could have been pulled in order to make nothing exist?  Well, whatever lever it was, presumably the lever itself was something—and so you see the problem right there.

Admittedly, suppose there were a giant red button, somewhere within the universe, that when pushed would cause the entire universe (including the button itself) to blink out of existence. In that case, we could say: the reason why the universe continues to exist is that no one has pushed the button yet. But even then, that still wouldn’t explain why the universe had existed.

by Scott at February 06, 2016 04:31 PM UTC

I've been lounging in San Diego at my favorite conference, the Information Theory and Applications workshop. It's so friendly that even the sea lions are invited (the Marine Room is where we had the conference banquet).

Sadly this year I was mired in deadlines and couldn't take advantage of the wonderful talks on tap and the over 720 people who attended. Pro tip, ITA: Could you try to avoid the ICML/KDD/COLT deadlines next time :) ?

ITA always has fun events that our more "serious" conferences should learn from. This time, the one event I attended was a Man vs Machine cookoff, which I thought was particularly apropos since I had just written a well-received article with a cooking metaphor for thinking about algorithms and machine learning.

The premise: Chef Watson (IBM's Watson, acting as a chef) designs recipes for dinner (appetizer/entree/dessert) with an assist from a human chef. Basically, the chef puts in some ingredients and Watson suggests a recipe (not from a list of recipes, but from its database of knowledge of chemicals, what tastes 'go well together', and so on). This was facilitated by Kush Varshney from IBM, who works on this project.

Each course is presented as a blind pairing of Watson and Human recipes, and it's our job to vote for which one we think is which.

It was great fun. We had four special judges, and each of us had a placard with red and blue sides to make our votes. After each course, Kush gave us the answer.

The final score: 3-0. The humans guessed correctly for each course. The theme was "unusualness": the machine-generated recipes had somewhat stranger combinations, and because Watson doesn't (yet) know about texture, the machine-recipes had a different mouthfeel to them.

This was probably the only time I've heard the words 'Bayesian surprise' and 'eigenvector' used in the context of food reviews.

by Suresh Venkatasubramanian ( at February 06, 2016 01:40 PM UTC

The Theory Group in the Department of Computing Science at the University of Alberta
invites applications for a postdoctoral fellowship position to start in Fall 2016 (applications will be considered until the position is filled).

The applicant is expected to work closely with Zachary Friggstad and Mohammad Salavatipour
in the areas of: approximation algorithms, hardness of approximation, combinatorial optimization, and randomized met

Webpage: http://


by theorycsjobs at February 05, 2016 06:38 PM UTC

Suppose that your department were given the chance to host the recipient of a very competitive post-doctoral award. That award would pay 95% of the salary of the post-doctoral researcher, who also leads a project funded in 2016 (worth 185,373.66€) and one funded in 2015 (worth 222,568.50€). I am fairly confident that your department would welcome that award- and grant-winning post-doctoral researcher with open arms.

This is not what has happened to Vincenzo Dimonte, an Italian set theorist who is presently a post-doctoral researcher at the Kurt Gödel Research Center for Mathematical Logic in Vienna. Dimonte was one of the three recipients in the field of mathematics of a prestigious and competitive Rita Levi Montalcini award for 2016. In his application for the award, Vincenzo Dimonte gave a ranked list of five mathematics departments in Italy that were willing to host him, the top one being the Department of Mathematics at the Politecnico di Torino. I presume that he even enclosed a letter from someone at that department saying that they were willing to host him. The choice of Turin as the top location in his list was natural since Turin hosts a group of top-class set theorists: Andretta, Viale, Motto Ros and Camerlo (who is actually at the Politecnico).

However, when Vincenzo Dimonte won the grant, the department twice refused to host him! Of course, he'll go down his own list and I trust that one of the four other destinations he chose will actually welcome him. The fact remains that such decisions are hard to understand when viewed from a purely scientific perspective and may have a negative impact on the future career of someone who has been deemed to be worthy of a top award for young researchers in Italy.

by Luca Aceto ( at February 05, 2016 06:36 PM UTC

We prove that any algorithm for learning parities requires either a memory of quadratic size or an exponential number of samples. This proves a recent conjecture of Steinhardt, Valiant and Wager and shows that for some learning problems a large storage space is crucial. More formally, in the problem of parity learning, an unknown string $x \in \{0,1\}^n$ was chosen uniformly at random. A learner tries to learn $x$ from a stream of samples $(a_1, b_1), (a_2, b_2) \ldots$, where each~$a_t$ is uniformly distributed over $\{0,1\}^n$ and $b_t$ is the inner product of $a_t$ and $x$, modulo~2. We show that any algorithm for parity learning, that uses less than $\frac{n^2}{25}$ bits of memory, requires an exponential number of samples. Previously, there was no non-trivial lower bound on the number of samples needed, for any learning problem, even if the allowed memory size is $O(n)$ (where $n$ is the space needed to store one sample). We also give an application of our result in the field of bounded-storage cryptography. We show an encryption scheme that requires a private key of length $n$, as well as time complexity of $n$ per encryption/decryption of each bit, and is provably and unconditionally secure as long as the attacker uses less than $\frac{n^2}{25}$ memory bits and the scheme is used at most an exponential number of times. Previous works on bounded-storage cryptography assumed that the memory size used by the attacker is at most linear in the time needed for encryption/decryption.

at February 05, 2016 04:30 PM UTC

We study two variants of seeded randomness extractors. The first one, as studied by Goldreich et al. \cite{goldreich2015randomness}, is seeded extractors that can be computed by $AC^0$ circuits. The second one, as introduced by Bogdanov and Guo \cite{bogdanov2013sparse}, is (strong) extractor families that consist of sparse transformations, i.e., functions that have a small number of overall input-output dependencies (called \emph{sparse extractor families}). In this paper we focus on the stronger condition where any function in the family can be computed by local functions. The parameters here are the length of the source $n$, the min-entropy $k=k(n)$, the seed length $d=d(n)$, the output length $m=m(n)$, the error $\epsilon=\epsilon(n)$, and the locality of functions $\ell=\ell(n)$. In the $AC^0$ extractor case, our main results substantially improve the positive results in \cite{goldreich2015randomness}, where for $k \geq n/\poly(\log n)$ a seed length of $O(m)$ is required to extract $m$ bits with error $1/\poly(n)$. We give constructions of strong seeded extractors for $k=\delta n \geq n/\poly(\log n)$, with seed length $d=O(\log n)$, output length $m=k^{\Omega(1)}$, and error any $1/\poly(n)$. We can then boost the output length to $\Omega(\delta k)$ with seed length $d=O(\log n)$, or to $(1-\gamma)k$ for any constant $0<\gamma<1$ with $d=O(\frac{1}{\delta}\log n)$. In the special case where $\delta$ is a constant and $\epsilon=1/\poly(n)$, our parameters are essentially optimal. In addition, we can reduce the error to $2^{-\poly(\log n)}$ at the price of increasing the seed length to $d=\poly(\log n)$. In the case of sparse extractor families, Bogdanov and Guo \cite{bogdanov2013sparse} gave constructions for any min-entropy $k$ with locality at least $O(n/k\log (m/\epsilon)\log (n/m))$, but the family size is quite large, i.e., $2^{nm}$. Equivalently, this means the seed length is at least $nm$. In this paper we significantly reduce the seed length. 
For $k \geq n/\poly(\log n)$ and error $1/\poly(n)$, our $AC^0$ extractor with output $k^{\Omega(1)}$ also has small locality $\ell=\poly(\log n)$, and the seed length is only $O(\log n)$. We then show that for $k \geq n/\poly(\log n)$ and $\epsilon \geq 2^{-k^{\Omega(1)}}$, we can use our error reduction techniques to get a strong seeded extractor with seed length $d =O(\log n + \frac{\log^2(1/\epsilon)}{\log n})$, output length $m = k^{\Omega(1)}$ and locality $ \log^2 (1/\epsilon) \poly(\log n) $. Finally, for min-entropy $k=\Omega(\log^2 n)$ and error $\epsilon \geq 2^{-k^{\Omega(1)}}$, we give a strong seeded extractor with seed length $d = O(k)$, $m = (1-\gamma)k$ and locality $\frac{n}{k}\log^2 (1/\epsilon) (\log n)\poly(\log k)$. As an intermediate tool for this extractor, we construct a condenser that condenses an $(n, k)$-source into a $(10k, \Omega(k))$-source with seed length $d=O(k)$, error $2^{-\Omega(k)}$ and locality $\Theta(\frac{n}{k}\log n)$.

at February 05, 2016 03:36 PM UTC

Given a weighted graph $G = (V,E,w)$, with weight function $w: E \rightarrow \mathbb{Q^+}$, a \textit{matching} $M$ is a set of pairwise non-adjacent edges. In the optimization setting, one seeks to find a matching of \textit{maximum} weight. In the \textit{multi-criteria} (or \textit{multi-budgeted}) setting, we are also given $\ell$ length functions $\alpha_1, \dots, \alpha_{\ell}$ assigning positive numbers to each edge and $\ell$ numbers $\beta_1, \dots, \beta_{\ell} \in \mathbb{Q^+}$, each one associated with the corresponding length function. The goal is to find a maximum weight matching $M$ (under function $w$) such that $\sum_{e \in M} \alpha_i (e) \leq \beta_i$, $\forall i \in [\ell]$ (these are the budgets). In this paper we are interested in the case where each edge $e \in E$ belongs to a unique budget, i.e., it has a unique \textit{color}. In \cite{DBLP:conf/mfcs/Stamoulis14} a $\frac{1}{2}$-approximation algorithm was given based on rounding the natural linear programming relaxation of the problem, and this is optimal modulo the integrality gap of the formulation. The purpose of this paper is to study to what extent linear programming methods help us design algorithms with a better performance guarantee. We prove the following \textit{unconditional} inapproximability result: even a large family of linear programs, generated by a \textit{logarithmic} number of rounds of the Sherali-Adams hierarchy \cite{DBLP:journals/siamdm/SheraliA90} or a \textit{linear} number of rounds of the \textsf{BCC} operator \cite{DBLP:journals/mp/BalasCC93}, does not suffice to change the integrality gap even in the slightest.

at February 05, 2016 11:51 AM UTC

We prove the existence of (one-way) communication tasks with a subconstant versus superconstant asymptotic gap, which we call "doubly infinite," between their quantum information and communication complexities. We do so by studying the exclusion game [C. Perry et al., Phys. Rev. Lett. 115, 030504 (2015)] for which there exist instances where the quantum information complexity tends to zero as the size of the input $n$ increases. By showing that the quantum communication complexity of these games scales at least logarithmically in $n$, we obtain our result. We further show that the established lower bounds and gaps still hold even if we allow a small probability of error. However in this case, the $n$-qubit quantum message of the zero-error strategy can be compressed polynomially.

at February 05, 2016 11:46 AM UTC

Authors: Jesper Nederlof
Download: PDF
Abstract: In the subset sum problem we are given n positive integers along with a target integer t. A solution is a subset of these integers summing to t. In this short note we show that for a given subset sum instance there is a proof of size $O^*(\sqrt{t})$ certifying the number of solutions, which can be constructed in $O^*(t)$ time and can be probabilistically verified in time $O^*(\sqrt{t})$ with at most constant error probability. Here, the $O^*()$ notation omits factors polynomial in the input size $n\log(t)$.
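For contrast with the $O^*(\sqrt{t})$-size proofs in the note, the textbook way to count subset-sum solutions is a dynamic program over all sums up to $t$, taking $O(nt)$ time and $O(t)$ space. A minimal JavaScript sketch of that baseline:

```javascript
// Count the subsets of nums summing exactly to t with the standard
// O(n*t) dynamic program; ways[s] = number of subsets of the items
// processed so far whose sum is s.
function countSubsetSums(nums, t) {
  var ways = new Array(t + 1).fill(0);
  ways[0] = 1; // the empty subset
  nums.forEach(function (v) {
    for (var s = t; s >= v; s--) { // descend so each item is used at most once
      ways[s] += ways[s - v];
    }
  });
  return ways[t];
}
```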

Donate to arXiv

at February 05, 2016 12:00 AM UTC

Authors: Moritz Baum, Thomas Bläsius, Andreas Gemsa, Ignaz Rutter, Franziska Wegner
Download: PDF
Abstract: Isocontours in road networks represent the area that is reachable from a source within a given resource limit. We study the problem of computing accurate isocontours in realistic, large-scale networks. We propose polygons with minimum number of segments that separate reachable and unreachable components of the network. Since the resulting problem is not known to be solvable in polynomial time, we introduce several heuristics that are simple enough to be implemented in practice. A key ingredient is a new practical linear-time algorithm for minimum-link paths in simple polygons. Experiments in a challenging realistic setting show excellent performance of our algorithms in practice, answering queries in a few milliseconds on average even for long ranges.


at February 05, 2016 12:00 AM UTC

Authors: Hugo Gilbert, Olivier Spanjaard
Download: PDF
Abstract: In this paper, we provide a generic anytime lower bounding procedure for minmax regret optimization problems. We show that the lower bound obtained is always at least as accurate as the lower bound recently proposed by Chassein and Goerigk (2015). The validity of the bound is based on game theoretic arguments and its computation is performed via a double oracle algorithm (McMahan et al., 2003) that we specify. The lower bound can be efficiently computed for any minmax regret optimization problem whose standard version is "easy". We describe how to efficiently embed this lower bound in a branch and bound procedure. Finally we apply our approach to the robust shortest path problem. Our numerical results show a significant gain in the computation times.


at February 05, 2016 12:00 AM UTC

Authors: Till Fluschnik, Stefan Kratsch, Rolf Niedermeier, Manuel Sorge
Download: PDF
Abstract: We study the NP-complete Minimum Shared Edges (MSE) problem. Given an undirected graph, a source and a sink vertex, and two integers p and k, the question is whether there are p paths in the graph connecting the source with the sink and sharing at most k edges. Herein, an edge is shared if it appears in at least two paths. We show that MSE is W[1]-hard when parameterized by the treewidth of the input graph and the number k of shared edges combined. We show that MSE is fixed-parameter tractable with respect to p, but does not admit a polynomial-size kernel (unless NP is contained in coNP/poly). In the proof of the fixed-parameter tractability of MSE parameterized by p, we employ the treewidth reduction technique due to Marx, O'Sullivan, and Razgon [ACM TALG 2013].
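The solution predicate itself is easy to check: given p candidate source-sink paths, count the edges that appear in at least two of them. A hypothetical JavaScript checker (not from the paper; edge identifiers are illustrative):

```javascript
// Count edges shared by at least two of the given paths.
// A path is an array of edge identifiers, e.g. "ab" for edge {a, b}.
function countSharedEdges(paths) {
  var pathsPerEdge = {};
  paths.forEach(function (path) {
    var inThisPath = {};
    path.forEach(function (e) { inThisPath[e] = true; }); // dedupe within one path
    Object.keys(inThisPath).forEach(function (e) {
      pathsPerEdge[e] = (pathsPerEdge[e] || 0) + 1;
    });
  });
  return Object.keys(pathsPerEdge).filter(function (e) {
    return pathsPerEdge[e] >= 2;
  }).length;
}
```

An instance is a yes-instance iff some choice of p source-sink paths makes this count at most k.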


at February 05, 2016 12:00 AM UTC

Authors: Jakob Dahlum, Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Renato F. Werneck
Download: PDF
Abstract: Computing high-quality independent sets quickly is an important problem in combinatorial optimization. Several recent algorithms have shown that kernelization techniques can be used to find exact maximum independent sets in medium-sized sparse graphs, as well as high-quality independent sets in huge sparse graphs that are intractable for exact (exponential-time) algorithms. However, a major drawback of these algorithms is that they require significant preprocessing overhead, and therefore cannot be used to find a high-quality independent set quickly.

In this paper, we show that performing simple kernelization techniques in an online fashion significantly boosts the performance of local search, and is much faster than pre-computing a kernel using advanced techniques. In addition, we show that cutting high-degree vertices can boost local search performance even further, especially on huge (sparse) complex networks. Our experiments show that we can drastically speed up the computation of large independent sets compared to other state-of-the-art algorithms, while also producing results that are very close to the best known solutions.


at February 05, 2016 12:00 AM UTC

Authors: Xiaojun Chen, Lu Xu, Yue Yang, Jan Egger
Download: PDF
Abstract: This paper presents a generalized integrated framework of semi-automatic surgical template design. Several algorithms were implemented including the mesh segmentation, offset surface generation, collision detection, ruled surface generation, etc., and a special software named TemDesigner was developed. With a simple user interface, a customized template can be semi- automatically designed according to the preoperative plan. Firstly, mesh segmentation with signed scalar of vertex is utilized to partition the inner surface from the input surface mesh based on the indicated point loop. Then, the offset surface of the inner surface is obtained through contouring the distance field of the inner surface, and segmented to generate the outer surface. Ruled surface is employed to connect inner and outer surfaces. Finally, drilling tubes are generated according to the preoperative plan through collision detection and merging. It has been applied to the template design for various kinds of surgeries, including oral implantology, cervical pedicle screw insertion, iliosacral screw insertion and osteotomy, demonstrating the efficiency, functionality and generality of our method.


at February 05, 2016 12:00 AM UTC

Authors: Aditya Deshmukh, Rahul Vaze
Download: PDF
Abstract: The problem of online packet scheduling to minimize the required conventional grid energy for transmitting a fixed number of packets given a common deadline is considered. The total number of packets arriving within the deadline is known, but the packet arrival times are unknown, and can be arbitrary. The proposed algorithm tries to finish the transmission of each packet assuming all future packets are going to arrive at equal time intervals within the left-over time. The proposed online algorithm is shown to have competitive ratio that is logarithmic in the number of packet arrivals. The hybrid energy paradigm is also considered, where in addition to grid energy, energy is also available via extraction from renewable sources. The objective here is to minimize the grid energy use. A suitably modified version of the previous algorithm is also shown to have competitive ratio that is logarithmic in the number of packet arrivals.


at February 05, 2016 12:00 AM UTC

Authors: Kuan Cheng, Xin Li
Download: PDF
Abstract: We study two variants of seeded randomness extractors. The first one, as studied by Goldreich et al. \cite{goldreich2015randomness}, is seeded extractors that can be computed by AC0 circuits. The second one, as introduced by Bogdanov and Guo \cite{bogdanov2013sparse}, is (strong) extractor families that consist of sparse transformations, i.e., functions that have a small number of overall input-output dependencies (called \emph{sparse extractor families}).

In the AC0 extractor case, our main results substantially improve the positive results in \cite{goldreich2015randomness}. We give constructions of strong seeded extractors for $k=\delta n \geq n/poly(\log n)$, with seed length $d=O(\log n)$, output length $m=k^{\Omega(1)}$, and error any $1/poly(n)$. We can then boost the output length to $\Omega(\delta k)$ with seed length $d=O(\log n)$, or to $(1-\gamma)k$ for any constant $0<\gamma<1$ with $d=O(\frac{1}{\delta}\log n)$.

In the case of sparse extractor families, for min-entropy $k=\Omega(\log^2 n)$ and error $\epsilon \geq 2^{-k^{\Omega(1)}}$, we give a strong seeded extractor with seed length $d = O(k)$, $m = (1-\gamma)k$ and locality $\frac{n}{k}\log^2 (1/\epsilon) (\log n)poly(\log k)$.


at February 05, 2016 02:08 AM UTC

In January 2014, I attended an FQXi conference on Vieques island in Puerto Rico.  While there, Robert Lawrence Kuhn interviewed me for his TV program Closer to Truth, which deals with science and religion and philosophy and you get the idea.  Alas, my interview was at the very end of the conference, and we lost track of the time—so unbeknownst to me, a plane full of theorists was literally sitting on the runway waiting for me to finish philosophizing!  This was the second time Kuhn interviewed me for his show; the first time was on a cruise ship near Norway in 2011.  (Thankless hero that I am, there’s nowhere I won’t travel for the sake of truth.)

Anyway, after a two-year wait, the videos from Puerto Rico are finally available online.  While my vignettes cover what, for most readers of this blog, will be very basic stuff, I’m sort of happy with how they turned out: I still stutter and rock back and forth, but not as much as usual.  For your viewing convenience, here are the new videos:

I had one other vignette, about why the universe exists, but they seem to have cut that one.  Alas, if I knew why the universe existed in January 2014, I can’t remember any more.

One embarrassing goof: I referred to the inventor of Newcomb’s Paradox as “Simon Newcomb.”  Actually it was William Newcomb: a distant relative of Simon Newcomb, the 19th-century astronomer who measured the speed of light.

At their website, you can also see my older 2011 videos, and videos from others who might be known to readers of this blog, like Marvin Minsky, Roger Penrose, Rebecca Newberger Goldstein, David Chalmers, Sean Carroll, Max Tegmark, David Deutsch, Raphael Bousso, Freeman Dyson, Nick Bostrom, Ray Kurzweil, Rodney Brooks, Stephen Wolfram, Greg Chaitin, Garrett Lisi, Seth Lloyd, Lenny Susskind, Lee Smolin, Steven Weinberg, Wojciech Zurek, Fotini Markopoulou, Juan Maldacena, Don Page, and David Albert.  (No, I haven’t yet watched most of these, but now that I linked to them, maybe I will!)

Thanks very much to Robert Lawrence Kuhn and Closer to Truth (and my previous self, I guess?) for providing Shtetl-Optimized content so I don’t have to.

Update: Andrew Critch of CFAR asked me to post the following announcement.

We’re seeking a full time salesperson for the Center for Applied Rationality in Berkeley, California. We’ve streamlined operations to handle large volume in workshop admissions, and now we need that volume to pour in. Your role would be to fill our workshops, events, and alumni community with people. Last year we had 167 total new alumni. This year we want 120 per month. Click here to find out more.

by Scott at February 04, 2016 10:53 PM UTC

Inspired by Diakonikolas and Kane (2016), we reduce the class of problems consisting of testing whether an unknown distribution over $[n]$ equals a fixed distribution to this very problem when the fixed distribution is uniform over $[n]$. Our reduction preserves the parameters of the problem, which are $n$ and the proximity parameter $\epsilon>0$, up to a constant factor. While this reduction yields no new bounds on the sample complexity of either problem, it provides a simple way of obtaining testers for equality to arbitrary fixed distributions from testers for the uniform distribution.

at February 04, 2016 09:26 PM UTC

I was reading a post on G+ about a musician who keeps all her performance reviews on her website and annotates them with a response. Not to "fight back", but to add to the reviews (that are occasionally negative).

I'm very tempted to do the same thing myself with my submissions. I think this will provide more clarity about the nature of the review process, about the often honest and revealing discussions that take place behind the peer-review walls, and about how subtleties in the writing can change the perception of a work. I suspect that as a consequence I'll be more circumspect about submitting something half-baked (which might be a good thing). I'll have to be careful not to get defensive in my responses to the reviews (which is always hard). And I may not be able to get away as easily with "changing the introduction" to get a paper in (which happens shockingly often).

Of course the biggest problem will be getting my co-authors (who are often my students) to agree beforehand. So here's my question:
Would you work with me if you knew I was planning to make all my reviews public? 

by Suresh Venkatasubramanian at February 04, 2016 06:45 PM UTC

The web sites and dish up a smorgasbord of facts about every natural number from 1 to 999,999,999,999,999. Type in your favorite positive integer (provided it’s less than \(10^{15}\)) and you’ll get a list of prime factors, a list of divisors, the number’s representation in various bases, its square root, and lots more.

I first stumbled upon these sites (and several others like them) about a year ago. I revisited them recently while putting together the Carnival of Mathematics. I was looking for something cute to say about the new calendar year and was rewarded with the discovery that 2016 is a triangular number: 2016 bowling pins can be arranged in an equilateral triangle with 63 pins per side.

This incident set me to thinking: What does it take to build a web site like these? Clearly, the sites do not have 999,999,999,999,999 HTML files sitting on a disk drive waiting to be served up when a visitor arrives. Everything must be computed on the fly, in response to a query. And it all has to be done in milliseconds. The question that particularly intrigued me was how the programs recognize that a given number has certain properties or is a member of a certain class—a triangular number, a square, a Fibonacci, a factorial, and so on.

I thought the best way to satisfy my curiosity would be to build a toy number site of my own. Here it is:

Number Factoids

[Interactive calculator from the original post: enter a positive integer to see its prime factors, yes/no checks for prime, square-free, square-root-smooth, square, triangular, factorial, Fibonacci, Catalan, and Somos-4 numbers, and the elapsed computation time.]

This one works a little differently from the number sites I’ve found on the web. The computation is done not on my server but on your computer. When you type a number into the input field above, a JavaScript program running in your web browser computes the prime factors of the number and checks off various other properties. (The source code for this program is available on GitHub, and there’s also a standalone version of the Number Factoids calculator.)

Because the computation is being done by your computer, the performance depends on what hardware and software you bring to the task. Especially important is the JavaScript engine in your browser. As a benchmark, you might try entering the number 999,999,999,999,989, which is the largest prime less than 1015. The elapsed time for the computation will be shown at the bottom of the panel. On my laptop, current versions of Chrome, Firefox, Safari, and Opera give running times in the range of 150 to 200 milliseconds. (But an antique iPad takes almost 8 seconds.)

Most of that time is spent in factoring the integer (or attempting to factor it in the case of a prime). Factoring is reputed to be a hard problem, and so you might suppose it would make this whole project infeasible. But the factoring computation bogs down only with really big numbers—and a quadrillion just isn’t that big anymore. Even a crude trial-division algorithm can do the job. In the worst case we need to try dividing by all the odd numbers less than \(\sqrt{10^{15}}\). That means the inner loop runs about 16 million times—a mere blink of the eye.
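The crude trial-division algorithm just described fits in a few lines of JavaScript (a minimal sketch, not the exact code from the Factoids program):

```javascript
// Factor N by trial division: strip out factors of 2, then try odd
// divisors up to the square root of the remaining cofactor.
// Fast enough for N below 10^15, as discussed above.
function primeFactors(N) {
  var factors = [];
  while (N % 2 === 0) { factors.push(2); N /= 2; }
  for (var d = 3; d * d <= N; d += 2) {
    while (N % d === 0) { factors.push(d); N /= d; }
  }
  if (N > 1) factors.push(N); // whatever remains is prime
  return factors;
}
```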

Once we have the list of prime factors for a number N, other properties come along almost for free. Primality: We can tell whether or not N is prime just by looking at the length of the factor list. Square-freeness: N is square-free if no prime appears more than once in the list. Smoothness: N is said to be square-root smooth if the largest prime factor is no greater than \(\sqrt{N}\). (For example, \(12 = 2 \times 2 \times 3\) is square-root smooth, but \(20 = 2 \times 2 \times 5\) is not.)
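Sketched as JavaScript predicates over the factor list (assuming, as trial division naturally produces, that the prime factors arrive in nondecreasing order):

```javascript
// N is prime iff its prime-factor list has exactly one entry.
function isPrime(factors) { return factors.length === 1; }

// N is square-free iff no prime repeats in the sorted factor list.
function isSquareFree(factors) {
  return factors.every(function (p, i) { return i === 0 || p !== factors[i - 1]; });
}

// N is square-root smooth iff its largest prime factor is at most sqrt(N).
function isSqrtSmooth(N, factors) {
  return factors.length > 0 && factors[factors.length - 1] <= Math.sqrt(N);
}
```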

The factor list could also be used to detect square numbers. N is a perfect square if every prime factor appears in the list an even number of times. But there are lots of other ways to detect squares that don’t require factorization. Indeed, running on a machine that has a built-in square-rooter, the JavaScript code for recognizing perfect squares can be as simple as this:

function isSquare(N) {
  var root = Math.floor(Math.sqrt(N));
  return root * root === N;
}

If you want to test this code in the Number Factoids calculator, you might start with 999,999,961,946,176, which is the largest perfect square less than \(10^{15}\).

Note that the isSquare function is a predicate: The return statement in the last line yields a boolean value, either true or false. The program might well be more useful if it could report not only that 121 is a square but also what it’s the square of. But the Number Factoids program is just a proof of concept, so I have stuck to yes-or-no questions.

Metafactoids: The Factoids calculator tests nine boolean properties. No number can possess all of these properties, but 1 gets seven green checkmarks. Can any other number equal this score? The sequence of numbers that exhibit none of the nine properties begins 20, 44, 52, 68, 76, 88, 92,… Are they all even numbers? (Hover for answer.)

What about detecting triangular numbers? N is triangular if it is the sum of all the integers from 1 through k for some integer k. For example, \(2016 = 1 + 2 + 3 + \dots + 63\). Given k, it’s easy enough to find the kth triangular number, but we want to work in the opposite direction: Given N, we want to find out if there is a corresponding k such that \(1 + 2 + 3 + \cdots + k = N\).

Young Carl Friedrich Gauss knew a shortcut for calculating the sums of consecutive integers: \(1 + 2 + 3 + \cdots + k = N = k\,(k+1)\,/\,2\). We need to invert this formula, solving for the value of k that yields a specified N (if there is one). Rearranging the equation gives \(k^2 + k - 2N = 0\), and then we can crank up the trusty old quadratic formula to get this solution:

\[k = \frac{-1 \pm \sqrt{1 + 8N}}{2}.\]

Thus k is an integer—and N is triangular—if and only if \(8N + 1\) is an odd perfect square. (Let’s ignore the negative root, and note that if \(8N + 1\) is a square at all, it will be an odd one.) Detecting perfect squares is a problem we’ve already solved, so the predicate for detecting triangular numbers takes this simple form:

function isTriangular(N) {
  return isSquare(8 * N + 1);
}

Try testing it with 999,999,997,764,120, the largest triangular number less than \(10^{15}\).

Factorials are the multiplicative analogues of triangular numbers: If N is the kth factorial, then \(N = k! = 1 \times 2 \times 3 \times \cdots \times k\). Is there a multiplicative trick that generates factorials in the same way that Gauss’s shortcut generates triangulars? Well, there’s Stirling’s approximation:

\[k! \approx \sqrt{2 \pi k} \left( \frac{k}{e} \right)^k.\]

We might try to invert this formula to get a function of \(k!\) whose value is \(k\), but I don’t believe this is a promising avenue to explore. The reason is that Stirling’s formula is only an approximation. It predicts, for example, that 5! is equal to 118.02, whereas the true value is 120. Thus taking the output of the inverse function and rounding to the nearest integer would produce wrong answers. We could add correction terms to get a closer approximation—but surely there’s a better way.

One approach is to work with the gamma (\(\Gamma\)) function, which extends the concept of a factorial from the integers to the real and complex numbers; if \(n\) is an integer, then \(\Gamma(n+1) = n!\), but the \(\Gamma\) function also interpolates between the factorial values. A recent paper by Mitsuru Uchiyama gives an explicit, analytic, inverse of the gamma function, but I understand only fragments of the mathematics, and I don’t know how to implement it algorithmically.

Fifteen years ago David W. Cantrell came up with another inverse of the gamma function, although this one is only approximate. Cantrell’s version is much less intimidating, and it is based on one of my favorite mathematical gadgets, the Lambert W function. A Mathematica implementation of Cantrell’s idea works as advertised—when it is given the kth factorial number as input, it returns a real number very close to \(k+1\). However, the approximation is not good enough to distinguish true factorials from nearby numbers. Besides, JavaScript doesn’t come with a built-in Lambert W function, and I am loath to try writing my own.

On the whole, it seems better to retreat from all this higher mathematics and go back to the definition of the factorial as a product of successive integers. Then we can reliably detect factorials with a simple linear search, expressed in the following JavaScript function:

function isFactorial(N) {
  var d = 2, q = N, r = 0;
  while (q > 1 && r === 0) {
    r = q % d;
    q = q / d;
    d += 1;
  }
  return (q === 1 && r === 0);
}

A factorial is built by repeated multiplication, so this algorithm takes it apart by repeated division. Initially, we set \(q = N\) and \(d = 2\). Then we replace \(q\) by \(q / d\), and \(d\) by \(d + 1\), while keeping track of the remainder \(r = q \bmod d\). If we can continue dividing until \(q\) is equal to 1, and the remainder of every division is 0, then N is a factorial. This is not a closed-form solution; it requires a loop. On the other hand, the largest factorial less than \(10^{15}\) is 17! = 355,687,428,096,000, so the program won’t be going around the loop more than 17 times.

The Fibonacci numbers are dear to the hearts of number nuts everywhere (including me). The sequence is defined by the recursion \(F_0 = 0, F_1 = 1, F_k = F_{k-1} + F_{k-2}\). How best to recognize these numbers? There is a remarkable closed-form formula, named for the French mathematician J. P. M. Binet:

\[F_k = \frac{1}{\sqrt{5}} \left[ \left(\frac{1 + \sqrt{5}}{2}\right)^k - \left(\frac{1 - \sqrt{5}}{2}\right)^k\right]\]

I call it remarkable because, unlike Stirling’s approximation for factorials, this is an exact formula; if you give it an integer k and an exact value of \(\sqrt{5}\), it returns the kth Fibonacci number as an integer.

One afternoon last week I engaged in a strenuous wrestling match with Binet’s formula, trying to turn it inside out and thereby create a function of N that returns k if and only if \(N\) is the kth Fibonacci number. With some help from Mathematica I got as far as the following expression, which gives the right answer some of the time:

\[k(N) = \frac{\log \frac{1}{2} \left( \sqrt{5 N^2 + 4} - \sqrt{5} N \right)}{\log \frac{1}{2} \left( \sqrt{5} - 1 \right)}\]

Plugging in a few values of N yields the following table of values for \(k(N)\):

 N   k(N)
 1   2.000000000000001    (green)
 2   3.209573979673092    (red)
 3   4.0000000000000036   (green)
 4   4.578618254581733
 5   5.03325648737724     (red)
 6   5.407157747499656
 7   5.724476891770392
 8   6.000000000000018    (green)
 9   6.243411773788614
10   6.4613916654135615
11   6.658737112471047
12   6.8390081849422675
13   7.00491857188792     (red)
14   7.158583717787527
15   7.3016843734535035
16   7.435577905992959
17   7.561376165404197
18   7.680001269357004
19   7.792226410280063
20   7.8987062604216005
21   7.999999999999939    (green)

In each of the green rows, the function correctly recognizes a Fibonacci number \(F_k\), returning the value of k as an integer. (Or almost an integer; the value would be exact if we could calculate exact square roots and logarithms.) Specifically, 1 is the second Fibonacci number (though also the first), 3 is the fourth, 8 is the sixth, and 21 is the eighth Fibonacci number. So far so good. But there’s something weird going on with the other Fibonacci numbers in the table, namely those with odd-numbered indices (red rows). For N = 2, 5, and 13, the inverse Binet function returns numbers that are close to the correct k values (3, 5, 7), but not quite close enough. What’s that about?

If I had persisted in my wrestling match, would I have ultimately prevailed? I’ll never know, because in this era of Google and MathOverflow and StackExchange, a spoiler lurks around every cybercorner. Before I could make any further progress, I stumbled upon pointers to the work of Ira Gessel of Brandeis, who neatly settled the matter of recognizing Fibonacci numbers more than 40 years ago, when he was an undergraduate at Harvard. Gessel showed that N is a Fibonacci number iff either \(5N^2 + 4\) or \(5N^2 - 4\) is a perfect square. Gessel introduced this short and sweet criterion and proved its correctness in a problem published in The Fibonacci Quarterly (1972, Vol. 10, No. 6, pp. 417–419). Phillip James, in a 2009 paper, presents the proof in a way I find somewhat easier to follow.

It is not a coincidence that the expression \(5N^2 + 4\) appears in both Gessel’s formula and in my attempt to construct an inverse Binet function. Furthermore, substituting Gessel’s \(5N^2 - 4\) into the inverse function (with a few other sign adjustments) yields correct results for the odd-indexed Fibonacci numbers. Implementing the Gessel test in JavaScript is a cinch:

function gessel(N) {
  var s = 5 * N * N;
  return isSquare(s + 4) || isSquare(s - 4);
}

So that takes care of the Fibonacci numbers, right? Alas, no. Although Gessel’s criterion is mathematically unassailable, it fails computationally. The problem arises from the squaring of \(N\). If \(N\) is in the neighborhood of \(10^{15}\), then \(N^2\) is near \(10^{30}\), which is roughly \(2^{100}\). JavaScript does all of its arithmetic with 64-bit double-precision floating-point numbers, which allow 53 bits for representing the mantissa, or significand. With values above \(2^{53}\), not all integers can be represented exactly—there are gaps between them. In this range the mapping between \(N\) and \(N^2\) is no longer a bijection (one-to-one in both directions), and the gessel procedure returns many errors.
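The gap between representable integers is easy to demonstrate directly in a browser console (a two-line illustration, not part of the Factoids code):

```javascript
// Doubles have a 53-bit significand, so above 2^53 consecutive
// integers collide: 2^53 + 1 rounds back down to 2^53.
var limit = Math.pow(2, 53); // 9007199254740992
var collides = (limit + 1 === limit);
console.log(collides); // true
```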

I had one more hope of coming up with a closed-form Fibonacci recognizer. In the Binet formula, the term \(((1 - \sqrt{5})\,/\,2)^k\) becomes very small in magnitude as k grows large. By neglecting that term we get a simpler formula that still yields a good approximation to Fibonacci numbers:

\[F_k \approx \frac{1}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^k.\]

For any integer k, the value returned by that expression is within 0.5 of the Fibonacci number \(F_k\), and so simple rounding is guaranteed to yield the correct answer. But the inverse function is not so well-behaved. Although it has no \(N^2\) term that would overflow the 64-bit format, it relies on square-root and logarithm operations whose limited precision can still introduce errors.
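For what it’s worth, here is what the rounded formula and its inverse look like in JavaScript (a hypothetical sketch, not the Factoids implementation; fib is exact only while the floating-point error stays below 0.5, and fibIndex inherits the square-root and logarithm precision caveats just mentioned):

```javascript
var PHI = (1 + Math.sqrt(5)) / 2;   // the golden ratio

// Rounded Binet formula: F(k) ≈ PHI^k / sqrt(5), rounded to the nearest integer.
function fib(k) {
  return Math.round(Math.pow(PHI, k) / Math.sqrt(5));
}

// Inverse: recover the index k from a Fibonacci number N.
function fibIndex(N) {
  return Math.round(Math.log(N * Math.sqrt(5)) / Math.log(PHI));
}
```

For modest indices the round trip works: fib(10) is 55 and fibIndex(55) is 10.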

So how does the Factoids calculator detect Fibonacci numbers? The old-fashioned way. It starts with 0 and 1 and iterates through the sequence of additions, stopping as soon as N is reached or exceeded:

function isFibo(N) {
  var a = 0, b = 1, tmp;
  while (a < N) {
    tmp = a;
    a = b;
    b = tmp + b;
  }
  return a === N;
}

As with the factorials, this is not a closed-form solution: the running time grows with N instead of being independent of it. There are tricks for computing Fibonacci numbers in a logarithmic number of arithmetic operations; Edsger Dijkstra described one such approach. But optimization hardly seems worth the bother. Because the sequence grows exponentially, for N < \(10^{15}\) the while loop cannot be executed more than 72 times.
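I won’t reproduce Dijkstra’s derivation here, but the now-standard “fast doubling” identities \(F_{2k} = F_k(2F_{k+1} - F_k)\) and \(F_{2k+1} = F_k^2 + F_{k+1}^2\) give the flavor of the logarithmic trick (my own sketch, subject to the same 64-bit limits as everything else here):

```javascript
// Returns the pair [F(k), F(k+1)] using O(log k) arithmetic operations.
function fibPair(k) {
  if (k === 0) return [0, 1];
  var p = fibPair(Math.floor(k / 2));
  var a = p[0], b = p[1];
  var c = a * (2 * b - a);    // F(2m), where m = floor(k/2)
  var d = a * a + b * b;      // F(2m+1)
  return (k % 2 === 0) ? [c, d] : [d, c + d];
}
```

For example, fibPair(10)[0] is 55.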

I’ve included two more sequences in the Factoids calculator, just because I’m especially fond of them. The Catalan numbers (1, 1, 2, 5, 14, 42, 132, 429…) are famously useful for counting all sorts of things—ways of triangulating a polygon, paths through the Manhattan street grid, sequences of properly nested parentheses. The usual definition is in terms of binomial coefficients or factorials:

\[C_k = \frac{1}{k+1} \binom{2k}{k} = \frac{(2k)!}{(k+1)! k!}\]

But there is also a recurrence relation:

\[C_0 = 1,\qquad C_k = \frac{4k-2}{k+1} C_{k-1}\]

The recognizer function in the Factoids Calculator does a bottom-up iteration based on the recurrence relation:

function isCatalan(N) {
  var c = 1, k = 0;
  while (c < N) {
    k += 1;
    c = c * (4 * k - 2) / (k + 1);
  }
  return c === N;
}

The final sequence in the calculator is one of several discovered by Michael Somos around 1980. It is defined by this recurrence:

\[S_0 = S_1 = S_2 = S_3 = 1,\qquad S_k = \frac{S_{k-1} S_{k-3} + S_{k-2}^2}{S_{k-4}}\]

The surprise here is that the elements of the sequence are all integers, beginning 1, 1, 1, 1, 2, 3, 7, 23, 59, 314, 1529, 8209. In writing a recognizer for these numbers I have made no attempt to be clever; I simply generate the sequence from the beginning and check for equality with the given N:

function isSomos(N) {
  var next = 1, S = [1, 1, 1, 1];
  while (next < N) {
    next = (S[3] * S[1] + S[2] * S[2]) / S[0];
    S.shift();       // discard the oldest element
    S.push(next);    // append the new one
  }
  return next === N;
}

But there’s a problem with this code. Do you see it? As N approaches the \(10^{15}\) barrier, the subexpression S[3] * S[1] + S[2] * S[2] will surely break through that barrier. In fact, the version of the procedure shown above fails for 32,606,721,084,786, the largest Somos-4 number below \(10^{15}\). For the version of the program that’s actually running in the Factoids calculator I have repaired this flaw by rearranging the sequence of operations. (For details see the GitHub repository.)

The factorial, Fibonacci, Catalan, and Somos sequences all exhibit exponential growth, which means they are sprinkled very sparsely along the number line. That’s why a simple linear search algorithm—which just keeps going until it reaches or exceeds the target—can be so effective. For the same reason, it would be easy to precompute all of these numbers up to \(10^{15}\) and have the JavaScript program do a table lookup. I have ruled out this strategy for a simple reason: It’s no fun. It’s not sporting. I want to do real computing, not just consult a table.

Other number series, such as the square and triangular numbers, are more densely distributed. There are more than 30 million square and triangular numbers up to \(10^{15}\); downloading a table of that size would take longer than recomputing quite a few squares and triangulars. And then there are the primes—all 29,844,570,422,669 of them.
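Recognizing members of these denser sequences is still cheap, though. For instance, assuming the obvious round-and-compare isSquare helper, a triangular-number test needs only the classical fact that N is triangular exactly when 8N + 1 is a perfect square:

```javascript
function isSquare(n) {
  var r = Math.round(Math.sqrt(n));
  return r * r === n;
}

// N = k(k+1)/2 for some k  ⇔  8N + 1 is a perfect square.
function isTriangular(n) {
  return isSquare(8 * n + 1);
}
```

So even 30 million triangular numbers never need to be stored; each candidate is checked in constant time.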

What would happen if we broke out of the 64-bit sandbox and offered to supply factoids about larger numbers? A next step might be a Megafactoids calculator that doubles the digit count, accepting integers up to \(10^{30}\). Computations in this system would require multiple-precision arithmetic, capable of handling numbers with at least 128 bits. Some programming languages offer built-in support for numbers of arbitrary size, and libraries can add that capability to other languages, including JavaScript. Although there is a substantial speed penalty for extended precision, most of the algorithms running in the Factoids program would still give correct results in acceptable time. In particular, there would be no problem recognizing squares and triangulars, factorials, Fibonaccis, Catalans, and Somos-4 numbers.

The one real problem area in a 30-digit factoid calculator is factoring. Trial division would be useless; instead of milliseconds, the worst-case running time would be months or years. However, much stronger factoring algorithms have been devised in the past 40 years. The algorithm that would be most suitable for this purpose is called the elliptic curve method, invented by Hendrik Lenstra in the 1980s. An implementation of this method built into PARI/GP, which in turn is built into Sage, can factor 30-digit numbers in about 20 milliseconds. A JavaScript implementation of the elliptic curve method seems quite doable. Whether it’s worth doing is another question. The world is not exactly clamoring for more and better factoids.
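To make the comparison concrete, here is the kind of trial-division factorizer that is perfectly adequate at 15 digits and hopeless at 30 (a generic sketch, not the code actually running in the Factoids calculator):

```javascript
// Trial division: worst case about sqrt(N) iterations, which is a few
// times 10^7 divisions for a 15-digit prime but astronomical at 30 digits.
function factor(N) {
  var factors = [], d = 2;
  while (d * d <= N) {
    while (N % d === 0) {
      factors.push(d);
      N /= d;
    }
    d += 1;
  }
  if (N > 1) factors.push(N);   // whatever remains is prime
  return factors;
}
```

For example, factor(60) yields [2, 2, 3, 5].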

Addendum 2016-02-05: I’ve just learned (via Hacker News) that I may need to add a few more recognition predicates: detectors for “artisanal integers” in flavors hand-crafted in the Mission District, Brooklyn, and London.

by Brian Hayes at February 04, 2016 03:50 PM UTC

In 2009 I posted about a surprising new approach that moved computer Go from programs that lose to beginners to programs that can beat good amateurs. That approach, now called Monte Carlo Tree Search (MCTS), involves evaluating a position using random game play and doing a bounded-depth tree search to maximize the evaluation.

Google last week announced AlphaGo, a program that uses ML techniques to optimize MCTS. This program beat the European Go champion five games to none, a huge advance over beating amateurs. In March AlphaGo will play the world Go champion, Lee Sedol, in a five-game match. The world will follow the match closely (on YouTube, naturally). For now we should keep our expectations in check: Deep Blue failed to beat Kasparov in its first attempt.

Google researchers describe AlphaGo in detail in a readable Nature article. To oversimplify, they train deep neural nets to learn two functions: the probability of a win from a given position, and the probability distribution used to choose the next move in the random game play of MCTS. First they train with supervised learning on historical game data between expert players, and then with reinforcement learning, essentially by having the program play itself. AlphaGo uses these functions to guide the Monte Carlo Tree Search.

AlphaGo differs quite a bit from chess algorithms.
  • AlphaGo uses no built-in strategy for Go. The same approach could be used for most other two-player games. I guess this approach would fail miserably for chess, but I would love to see it tried.
  • Machine learning has the nice property that you can train offline slowly and then apply the resulting neural nets quickly during gameplay. We can refine chess algorithms offline too, but all the computation generally happens during the game.
  • If a computer chess program makes a surprising move, good or bad, one can work through the code and figure out why the program made that particular move. If AlphaGo makes a surprising move, we'll have no clue why.
  • I wonder if the same applies to human play. A chess player can explain the reasoning behind a particular move. Can Go players do the same or do great Go players rely more on intuition?
Machine learning applications like AlphaGo seemingly tackle difficult computational problems with virtually no built-in domain knowledge. Except for generating the game data for the supervised learning, humans play little role in how AlphaGo decides what to do. ML uses the same techniques to translate languages, to weed out spam, to set the temperature in my house, and in the near future to drive my car. Will they prove our theorems? Time will tell. 

by Lance Fortnow at February 04, 2016 12:20 PM UTC

I have a new preprint, "Distance-Sensitive Planar Point Location" (arXiv:1602.00767, with Aronov, de Berg, Roeloffzen, and Speckmann) whose title and abstract may already be a bit intimidating. So I thought I would at least try to explain what it's about in simpler terms.

Point location is something you do every time you orient yourself on a map. You know where you are, in the world, and you try to figure out what that means about where you are on the map. Let's model the map mathematically as a unit square, subdivided into polygons. You already know the coordinates of a point in the square, but now you want to figure out which polygon contains that point. To do so, you might make a sequence of comparisons, for instance checking which side of one of the polygon edges you're on. You want to do as few comparisons as possible before getting the answer.

One limit on the number of comparisons you might need is entropy, a fancy word for the number of bits of information you need to send to someone else to tell them where you are. If you can solve point location in a certain number of comparisons, you can also say where you are by communicating the results of those comparisons. So, the entropy is no bigger than the time complexity of point location. We'd like to turn that around, by finding methods for point location that match any method you might come up with for communicating your location. For instance, if there are n different polygons on your map, you could name them by strings of \(\log_2 n\) bits, and communicate your location by saying one of those names. And it turns out that there are methods of point location that take only \(O(\log n)\) comparisons, so we can match this naming scheme by a point location method.

But there are other naming schemes that might do even better than that, and we'd like to also match them. For instance, you might be in some places more frequently than others. Most of the time, I'm in California, and when I'm outside California I'm more likely to be in states where I have relatives or that are the frequent sites of computer science conferences (say, Massachusetts) than others (say, Maine). So it would be more efficient on the average for me to tell people where I am using a naming scheme in which California has a very short name and South Dakota has a longer one. Several researchers, including Iacono, Arya, Malamatos, and Mount, have provided point location schemes that can match the complexity of any such naming scheme. But to achieve this, they require all of the polygons to have simple shapes like triangles, not complicated ones like the boundaries of some US states.

In our paper, we use a different idea. Instead of considering the popularity of different regions, we consider the distance to the nearest boundary. If you're far from the boundary of a region, it should be easy to tell where you are, and if you're close to the boundary you'll have to look more carefully. If your distance to the nearest boundary is d, then you're inside a circle that doesn't cross any boundary and whose area is proportional to \(d^2\). There can be only \(O(1/d^2)\) polygons that contain a circle that big, and you can communicate which one you're in by sending \(O(\log(1/d^2))\) bits of information (a name of one of those big polygons). A simple point location scheme based on a quadtree data structure turns out to match that bound.
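A bare-bones version of such a quadtree descent might look like the following (my own illustrative sketch with a made-up node format, not the data structure from the paper). Each cell either lies entirely inside one region or is split into four quadrants, and the descent goes deeper the closer the query point is to a boundary:

```javascript
// node: {region: label} for a homogeneous cell, or
//       {children: [SW, SE, NW, NE]} for a subdivided one.
function locate(node, x, y, x0, y0, size) {
  if (node.region !== undefined) return node.region;   // homogeneous cell
  var half = size / 2;
  var east = x >= x0 + half, north = y >= y0 + half;
  var child = node.children[(north ? 2 : 0) + (east ? 1 : 0)];
  return locate(child, x, y,
                x0 + (east ? half : 0),
                y0 + (north ? half : 0),
                half);
}

// A toy map: the unit square split down the middle into regions A and B.
var A = {region: "A"}, B = {region: "B"};
var map = {children: [A, B, A, B]};
```

On this toy map, locate(map, 0.25, 0.25, 0, 0, 1) returns "A" after one subdivision step, while points near the dividing line would require deeper trees.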

That's a warmup to the main results of the paper, which combine popularity with distance to the boundary. If you're well within a polygon, at the center of a circle that's inside the polygon and has area proportional to the polygon's area, then the time to find out where you are is proportional only to the length of the name of the polygon (in your favorite naming scheme) regardless of how complicated its shape is. But if you're closer to the boundary than that, then the time to locate yourself will be slower, by a number of steps that depends inverse-logarithmically on how close you are.

It would be nice to have a method that depends only on the lengths of the names of the polygons, without assuming they're all nicely shaped, and without depending on other quantities like the distance to the boundary. But that's not possible, as an example in our introduction details. For, if there are only two polygons on your map (say polygon 0 and polygon 1), then the lengths of both of their names are constant. If we want to match that by the complexity of a point location scheme, then we can only make a constant number of comparisons, and only solve location problems for which the shapes of the polygons are simple.

at February 04, 2016 04:28 AM UTC

Authors: Till Fluschnik, Manuel Sorge
Download: PDF
Abstract: We study the Minimum Shared Edges problem introduced by Omran et al. [Journal of Combinatorial Optimization, 2015] on planar graphs: Planar MSE asks, given a planar graph G = (V,E), two distinct vertices s,t in V, and two integers p, k, whether there are p s-t paths in G that share at most k edges, where an edge is called shared if it appears in at least two of the p s-t paths. We show that Planar MSE is NP-hard by reduction from Vertex Cover. We make use of a grid-like structure, where the alignment (horizontal/vertical) of the edges in the grid corresponds to selection and validation gadgets, respectively.

at February 04, 2016 01:41 AM UTC

Authors: Chenhan D. Yu, William B. March, Bo Xiao, George Biros
Download: PDF
Abstract: We present a parallel algorithm for computing the approximate factorization of an $N$-by-$N$ kernel matrix. Once this factorization has been constructed (with $N \log^2 N $ work), we can solve linear systems with this matrix with $N \log N $ work. Kernel matrices represent pairwise interactions of points in metric spaces. They appear in machine learning, approximation theory, and computational physics. Kernel matrices are typically dense (matrix multiplication scales quadratically with $N$) and ill-conditioned (solves can require 100s of Krylov iterations). Thus, fast algorithms for matrix multiplication and factorization are critical for scalability.

Recently we introduced ASKIT, a new method for approximating a kernel matrix that resembles N-body methods. Here we introduce INV-ASKIT, a factorization scheme based on ASKIT. We describe the new method, derive complexity estimates, and conduct an empirical study of its accuracy and scalability. We report results on real-world datasets including "COVTYPE" ($0.5$M points in 54 dimensions), "SUSY" ($4.5$M points in 8 dimensions) and "MNIST" (2M points in 784 dimensions) using shared and distributed memory parallelism. In our largest run we approximately factorize a dense matrix of size 32M $\times$ 32M (generated from points in 64 dimensions) on 4,096 Sandy-Bridge cores. To our knowledge these results improve the state of the art by several orders of magnitude.

at February 04, 2016 01:43 AM UTC

Authors: Radoslav Fulek
Download: PDF
Abstract: We show that c-planarity is solvable in a polynomial time for flat clustered graphs with three clusters if the combinatorial embedding of the abstract graph is fixed. In other words, given a graph G embedded in the plane whose vertices are partition into three parts our algorithm decides if there exists a plane supergraph G' of G on the same vertex set in which the vertices of each part induce a connected sub-graph contained in the outer-face of its complement in G'. We proceed by a reduction to the problem of testing the existence of a perfect matching in planar bipartite graphs. We formulate our result in a slightly more general setting of cyclic cluster graphs.

at February 04, 2016 01:44 AM UTC

Authors: Petra Berenbrink, Tom Friedetzky, Peter Kling, Frederik Mallmann-Trenn, Chris Wastell
Download: PDF
Abstract: We consider \emph{plurality consensus} in a network of $n$ nodes. Initially, each node has one of $k$ opinions. The nodes execute a (randomized) distributed protocol to agree on the plurality opinion (the opinion initially supported by the most nodes). Nodes in such networks are often quite cheap and simple, and hence one seeks protocols that are not only fast but also simple and space efficient. Typically, protocols depend heavily on the employed communication mechanism, which ranges from sequential (only one pair of nodes communicates at any time) to fully parallel (all nodes communicate with all their neighbors at once) communication and everything in-between.

We propose a framework to design protocols for a multitude of communication mechanisms. We introduce protocols that solve the plurality consensus problem and are with probability 1-o(1) both time and space efficient. Our protocols are based on an interesting relationship between plurality consensus and distributed load balancing. This relationship allows us to design protocols that generalize the state of the art for a large range of problem parameters. In particular, we obtain the same bounds as the recent result of Alistarh et al. (who consider only two opinions on a clique) using a much simpler protocol that generalizes naturally to general graphs and multiple opinions.

at February 04, 2016 01:43 AM UTC

Authors: Andreas Björklund, Petteri Kaski
Download: PDF
Abstract: We study a design framework for robust, independently verifiable, and workload-balanced distributed algorithms working on a common input. An algorithm based on the framework is essentially a distributed encoding procedure for a Reed--Solomon code, which enables (a) robustness against byzantine failures with intrinsic error-correction and identification of failed nodes, and (b) independent randomized verification to check the entire computation for correctness, which takes essentially no more resources than each node individually contributes to the computation. The framework builds on recent Merlin--Arthur proofs of batch evaluation of Williams~[{\em Electron.\ Colloq.\ Comput.\ Complexity}, Report TR16-002, January 2016] with the observation that {\em Merlin's magic is not needed} for batch evaluation---mere Knights can prepare the proof, in parallel, and with intrinsic error-correction.

The contribution of this paper is to show that in many cases the verifiable batch evaluation framework admits algorithms that match in total resource consumption the best known sequential algorithm for solving the problem. As our main result, we show that the $k$-cliques in an $n$-vertex graph can be counted {\em and} verified in per-node $O(n^{(\omega+\epsilon)k/6})$ time and space on $O(n^{(\omega+\epsilon)k/6})$ compute nodes, for any constant $\epsilon>0$ and positive integer $k$ divisible by $6$, where $2\leq\omega<2.3728639$ is the exponent of matrix multiplication. This matches in total running time the best known sequential algorithm, due to Ne{\v{s}}et{\v{r}}il and Poljak [{\em Comment.~Math.~Univ.~Carolin.}~26 (1985) 415--419], and considerably improves its space usage and parallelizability. Further results include novel algorithms for counting triangles in sparse graphs, computing the chromatic polynomial of a graph, and computing the Tutte polynomial of a graph.

at February 04, 2016 12:00 AM UTC

Authors: Carl Barton, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski
Download: PDF
Abstract: The problem of finding factors of a text string which are identical or similar to a given pattern string is a central problem in computer science. A generalised version of this problem consists in implementing an index over the text to support efficient on-line pattern queries. We study this problem in the case where the text is weighted: for every position of the text and every letter of the alphabet a probability of occurrence of this letter at this position is given. Sequences of this type, also called position weight matrices, are commonly used to represent imprecise or uncertain data. A weighted sequence may represent many different strings, each with probability of occurrence equal to the product of probabilities of its letters at subsequent positions. Given a probability threshold $1/z$, we say that a pattern string $P$ matches a weighted text at position $i$ if the product of probabilities of the letters of $P$ at positions $i,\ldots,i+|P|-1$ in the text is at least $1/z$. In this article, we present an $O(nz)$-time construction of an $O(nz)$-sized index that can answer pattern matching queries in a weighted text in optimal time improving upon the state of the art by a factor of $z \log z$. Other applications of this data structure include an $O(nz)$-time construction of the weighted prefix table and an $O(nz)$-time computation of all covers of a weighted sequence, which improve upon the state of the art by the same factor.

at February 04, 2016 01:41 AM UTC

This is the last piece I received in response to my call for opinions on the report on logic activities in Europe that Yuri Gurevich wrote in 1992.

Joost-Pieter Katoen and Wolfgang Thomas discuss the sections of Yuri's report devoted to the European research environment (funding, research centres and other issues) related to logic in computation. You can read their contribution here. Thanks to Joost-Pieter and Wolfgang for taking the time to write this piece and for allowing me to share it on this blog. Enjoy it!

by Luca Aceto at February 03, 2016 11:01 AM UTC

A non-announcement announcement

Crop from Farkas Prize src

Michel Goemans is the chair of this year’s ACM/IEEE Knuth Prize committee. He teaches at MIT and among many wonderful achievements co-won the 2000 MOS/AMS Fulkerson Prize with David Williamson for their great work on approximation for MAX CUT and MAX SAT and other optimization problems.

A few days ago he emailed me to ask if Ken and I would announce this year’s call for nominations.

Of course, as Michel wrote in his email to me, he realizes that we really do not usually make announcements. And indeed he is correct. So Ken and I are confronted with a dilemma. We want to help, but our main purpose at GLL is to present time-independent posts that balance history and technical details. Besides, if we start doing announcements like this, then year-over-year it would become like Groundhog Day. Our problem is:

How do we make the announcement and still follow our usual form?

Prizes I

One idea we had was that we would look at the history of prizes. Prizes in science and math have been around for many many years. There are two types of prizes; well there are many types, but there are two extremes. One is called ex ante prizes and the other is called ex post prizes. An ex ante prize is an attempt to use money to direct research. They are of the form:

If you can solve X, then you will win the following amount of money.

An early example was the Longitude Prize, which was based on the reality that knowledge of latitude is easy to refresh by looking at the sun and stars, but using them for longitude requires accurately knowing the local time.


In 1714, the British government offered a very large amount of money for a method of determining a ship’s longitude. The top prize was 20 thousand British pounds—an immense amount of money at the time. John Harrison solved the problem by creating a clock—amazingly compact, like a watch—that kept accurate time even on a ship at sea, in calm seas or rough, in hot temperatures or cold, and he was finally awarded the top prize in 1773. A great read about the prize, its solution, and the controversy in paying Harrison is Longitude: The True Story of a Lone Genius Who Solved the Greatest Scientific Problem of His Time, by Dava Sobel.

A math example of an ex ante prize was the Wolfskehl Prize, which Paul Wolfskehl endowed for the first person to present a valid proof of Fermat’s Last Theorem. Of course, Andrew Wiles won the prize in 1997.

Some have stated that such prizes are not very useful. They often increase visibility of the problem, but attract amateurs who may or may not really understand the problem. The more recent Clay Prizes are ex ante, and it is unclear if they had any effect at all on the solution of the first of the prize problems to be solved: the Poincaré Problem. Recall the solver, Grigori Perelman, refused the prize money of $1,000,000.

Prizes II

Nobel Prizes are ex post prizes as are our own Turing Awards—okay they are called “awards” but that is a minor point. The Knuth Prize is definitely an ex post prize. It is given for not one achievement, but rather for a lifetime of work in the area of theory of computing. The call states:

The Prize is awarded for major research accomplishments and contributions to the foundations of computer science over an extended period of time.

I did win the prize two years ago, in 2014. I was thrilled, honored, and surprised. There are so many great theorists that I was very honored to receive it. Clearly, now is the time to try and put together a strong case for your favorite colleague. Only one can win, but you cannot win if there is not a serious case put together. So good luck to all. The committee consists of Allan Borodin, Uri Feige, Michel Goemans, Johan Håstad, Satish Rao, and Shang-Hua Teng. Here is the information on making a nomination. Nominations are due on Mar 31, 2016.

Laci Babai won it last year. Of course this was for his past work and its great innovations and deep executions, including a share of the first ever Gödel Prize. But we may have to credit the 2015 committee with prescience given Laci’s achievement with Graph Isomorphism (GI). Likewise, planning for a Schloss Dagstuhl workshop on GI began in 2014 without knowing how felicitous the December 13–18, 2015 dates would be. Perhaps there is a stimulating effect that almost amounts to a third category of prizes—as per a tongue-in-cheek time-inverting post we wrote a year ago.

Open Problems

Who will win this year? Of course the winner has to be nominated first—unless it’s the MVP of the NHL All-Star Game. There are many terrific candidates and I am glad not to be on the committee that has to decide. One prediction is that whoever wins will be extremely elated and honored.

[spellfix in subtitle]

by RJLipton+KWRegan at February 03, 2016 04:00 AM UTC

The main contribution of this work is an explicit construction of extractors for near logarithmic min-entropy. For any $\delta > 0$ we construct an extractor for $O(1/\delta)$ $n$-bit sources with min-entropy $(\log{n})^{1+\delta}$. This is most interesting when $\delta$ is set to a small constant, though the result also yields an extractor for $O(\log\log{n})$ sources with logarithmic min-entropy. Prior to this work, the best explicit extractor in terms of supporting least-possible min-entropy, due to Li (FOCS'15), requires min-entropy $(\log{n})^{2+\delta}$ from its $O(1/\delta)$ sources. Further, all current techniques for constructing multi-source extractors "break" below min-entropy $(\log{n})^2$. In fact, existing techniques do not provide even a disperser for $o(\log{n})$ sources each with min-entropy $(\log{n})^{1.99}$. Apart from being a natural problem, supporting logarithmic min-entropy has applications to combinatorics. A two-source disperser, let alone an extractor, for min-entropy $O(\log{n})$ induces a $(\log{n})^{O(1)}$-Ramsey graph on $n$ vertices. Thus, constructing such dispersers would be a significant step towards constructively matching Erdős' proof for the existence of $(2\log{n})$-Ramsey graphs on $n$ vertices. Our construction does not rely on the sophisticated primitives that were key to the substantial recent progress on multi-source extractors, such as non-malleable extractors, correlation breakers, the lightest-bin condenser, or extractors for non-oblivious bit-fixing sources, although some of these primitives can be combined with our construction so as to improve the output length and the error guarantee. Instead, at the heart of our construction is a new primitive called an independence-preserving merger.

at February 03, 2016 02:12 AM UTC