Theory of Computing Blog Aggregator

Indistinguishability Obfuscation and Multi-linear Maps: A Brave New World (Guest Post by Ran Canetti)

The following post is written by Ran Canetti

A bunch of us hapless cryptographers got the following boilerplate comment from the FOCS’15 PC:

Overall, submissions related to multi-linear maps and indistinguishability obfuscation were held to a somewhat higher standard. The PC expressed some concern with the recent flurry of activities pertaining to multi-linear maps and indistinguishability obfuscation, given how little we understand and can say and *prove* about the underlying hardness assumptions.

This comment was clearly written with the best of intentions, to explain views expressed at the PC deliberations. And I’m thankful for it – mainly since it made the underlying misconceptions so explicit that it mandated a response. So, after discussing and commiserating with colleagues here at Simons, and after amusing ourselves with some analogues of the above statement (e.g., “results on NP completeness are held to a higher standard given how little we understand and can say and *prove* about the hardness of solving SAT in polynomial time”), I decided to try to write an – obviously subjective – account of the recent developments in multilinear maps and indistinguishability obfuscation (IO) and why this exciting research should be embraced and highlighted rather than “held to a somewhat higher standard” — in spite of how little we understand about the underlying assumptions. The account is aimed at the general CS-theorist.

Let me start by giving rough definitions of the concepts involved. An Indistinguishability Obfuscator (IO) is a randomized algorithm O that takes as input a circuit C and outputs a (distribution over) circuits O(C) with the properties that:

• C and O(C) have the same functionality,
• O(C) is only polynomially larger than C,
• for any two same-size, functionally equivalent circuits C and C’ we have that O(C) ~ O(C’) (i.e., the distributions over strings representing O(C) and O(C’) are computationally indistinguishable).
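To make the definition concrete, here is a toy illustration (a minimal sketch, not a real obfuscator): for circuits on a handful of input bits, canonicalizing to the full truth table trivially achieves the indistinguishability property, since any two functionally equivalent circuits map to the identical string. But the output is exponential in the number of inputs, so the polynomial-size requirement fails for general circuits; this is exactly why efficient IO is nontrivial.

```python
from itertools import product

def truth_table(circuit, n_inputs):
    """A trivial 'obfuscator': replace the circuit by its canonical truth table.
    Perfectly indistinguishable (equivalent circuits give identical output),
    but exponentially large in n_inputs, so it fails the size requirement."""
    return tuple(bool(circuit(bits)) for bits in product([0, 1], repeat=n_inputs))

# Two syntactically different implementations of XOR
c1 = lambda b: (b[0] and not b[1]) or (not b[0] and b[1])   # DNF form
c2 = lambda b: (b[0] or b[1]) and not (b[0] and b[1])       # OR-minus-AND form

assert truth_table(c1, 2) == truth_table(c2, 2)  # O(C) identical for equivalent C, C'
```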

IO was proposed as a notion of obfuscation in 2000 (Hada, Barak-Goldreich-Impagliazzo-Sahai-Vadhan-Yang). Indeed, it is arguably a clean and appealing notion – in some sense the natural extension of semantic security of standard encryption to “functionality-preserving encryption of programs”. However, it has been largely viewed as too weak to be of real applicability or interest. (There were also no candidate polytime IO schemes, but this in my eyes is a secondary point, see below.)

Things changed dramatically in 2013 when Sahai and Waters demonstrated how IO schemes can be ingeniously combined with other rather “mundane” cryptographic constructs to do some amazing things. Since then dozens of papers came about that extend the SW techniques and apply them to obtain even more amazing things – that by now have transcended crypto and spilled over to other areas. (e.g.: deniable encryption, succinct delegation, succinct multi-party computation with hardly any interaction, one message succinct witness hiding and witness indistinguishable proofs, hash functions with random-oracle-like properties, hardness results for PPAD, and many more). In fact, think about a result in your area that assumes that some computation is done inside a black box – most probably IO can replace that assumption in one way or another…

Still, my (subjective but distinct) feeling is that we are far from understanding the limits and full power of IO. Furthermore, the study of IO has brought with it a whole new toolbox of techniques that are intriguing in their own right, and teach us about the power and limitations of working with “encrypted computations”.

So far I have not mentioned any candidate constructions of IO – and indeed the above study is arguably valuable as a pure study of this amazing concept, even without any candidate constructions. (Paraphrasing Levin on quantum computers, one can take the viewpoint that the above is the study of impossibility results for IO…)

However, unlike quantum computers, here we also have candidate constructions. This is where multilinear maps come to play.

Multi-linear maps are this cool new technical tool (or set of tools) that was recently put forth. (The general concept was proposed by Boneh and Silverberg around 2000, and the first candidate construction of one of the current variants was presented in 2012 by Garg, Gentry and Halevi.) Essentially, a multilinear map scheme is a fully homomorphic encryption scheme where the public key provides, in addition to the ability to encrypt elements and perform homomorphic operations on ciphertexts, also the ability to partially decrypt ciphertexts under certain restrictions. There are many incomparable variants of this general paradigm, which differ both in the functionality provided and in the security guarantees. Indeed, variants appear to be closely tied to candidate constructions. Furthermore, our understanding of what’s possible here has been evolving considerably, with multiple new constructions, attacks, and fixes reported.

Still, the number and variety of applications of multi-linear maps makes it clear that this “family of primitives” is extremely powerful and well worth studying – both at the level of candidate constructions, at the level of finding the “right” computational abstractions, and at the level of applications. In a sense, we are here back to the 70’s: we are faced with this new set of algebraic and number theoretic tools, and are struggling to find good ways to use them and abstract them.
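As an interface illustration only, here is a minimal sketch of the kind of API a symmetric multilinear map (graded encoding) scheme exposes: encodings live at levels, can be added within a level and multiplied across levels, and can be zero-tested only at the top level. The “encoding” below is just the value itself mod a prime, which hides nothing; real candidates replace it with noisy encodings whose security is exactly the assumption under debate. The modulus and level parameters are arbitrary choices for the demo.

```python
from dataclasses import dataclass

P = 1_000_003    # toy prime modulus (arbitrary choice for the demo)
KAPPA = 3        # multilinearity degree: the top level

@dataclass(frozen=True)
class Enc:
    value: int   # in a real scheme: a noisy, hiding encoding, not the value itself
    level: int

def encode(x, level=1):
    return Enc(x % P, level)

def add(a, b):
    assert a.level == b.level, "addition only within a level"
    return Enc((a.value + b.value) % P, a.level)

def mul(a, b):
    assert a.level + b.level <= KAPPA, "product would exceed the top level"
    return Enc((a.value * b.value) % P, a.level + b.level)

def is_zero(a):
    assert a.level == KAPPA, "zero-testing is only allowed at the top level"
    return a.value == 0

# A degree-3 product can be formed and then zero-tested:
top = mul(mul(encode(5), encode(0)), encode(7))
assert is_zero(top)  # one factor encodes 0, so the product encodes 0
```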

Indeed, some of the most powerful applications of multilinear maps are candidate constructions of IO schemes. The first such candidate construction (by Garg, Gentry, Halevi, Raykova, Sahai and Waters in 2013) came with only heuristic arguments for security; however, more rigorous analyses of this and other constructions, based on well-defined formulations of multi-linear map variants, soon followed suit. Some of these analyses have eventually been “broken” in the sense that we currently don’t have candidate constructions that satisfy the properties they assume. Still, other analyses do remain valid. Indeed, there are no attacks against the actual basic IO scheme of Garg et al.

The fact that the only current candidate constructions of IO need to assume existence of some variant of multi-linear maps at some point or another may make it seem as if the two concepts are somehow tied together. However, there is no reason to believe that this is the case. For all we know, multi-linear maps are just the path first uncovered to IO, and other paths may well be found. Similarly, even if IO turns out to be unobtainable for some reason, the study of multilinear maps and their power will still remain very relevant.

So, to sum up this long-winded account:

• IO is a natural and fascinating computational concept. Studying its consequences (both within and outside cryptography) is a worthwhile endeavor.
• Studying new candidate constructions of IO and/or new analyses of their security is another worthwhile endeavor.
• Multilinear maps are an intriguing and powerful set of techniques and tools. Finding better candidate constructions and abstractions is of central importance to cryptography. Finding new cool uses of these maps is another intriguing challenge.
• The three should be treated as separate (although touching and potentially interleaving) research efforts.

———–
I’d like to thank Guy Rothblum and Vinod Vaikuntanathan for great comments that significantly improved this post.

by Luca Trevisan at July 06, 2015 04:05 PM UTC

ICALP/LICS 2015: Day 1

from Luca Aceto

Despite my jet-lag-induced drowsiness and the fact that the clock on my laptop kept reminding me that I was chairing the first meeting of the EATCS Council at around 4am, I enjoyed the first day of the conference a lot.

According to the organizers, ICALP/LICS 2015 had 400 early registrations and there were 200 people who attended the pre-conference workshops. The plenary invited talks by Ken-ichi Kawarabayashi (NII, Japan) and Luke Ong (University of Oxford, UK) were delivered to a packed room and were both excellent.

Ken-ichi's talk was entitled Digraphs Structures: Minors and Algorithms and presented his joint work with Stephan Kreutzer on extending results on minors from graphs to digraphs. (See this paper in particular.)

Luke's talk was devoted to Higher-Order Model Checking: An Overview. Higher-order model checking is about checking whether trees generated by recursion schemes satisfy formulae that can be expressed in some kind of logic, such as Monadic Second Order logic. In his talk, Luke surveyed the considerable progress on this problem in both theory and practice that has been made over the past fifteen years or so.

There were lots of interesting talks in the conference program, as well as two sessions devoted to the best paper award recipients.

The busy day ended socially on a high note with an excellent welcome reception.

Looking at the future, I am pleased to inform you that the invited speakers for ICALP 2016 will be
I will have more to say about future editions of the conference after the general assembly tomorrow.

by Luca Aceto (noreply@blogger.com) at July 06, 2015 01:59 PM UTC

News for June 2015

This month we have a number of results that are related to query complexity though not directly related to property testing.

Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search by Grey Ballard, Tamara G. Kolda, Ali Pinar and C. Seshadhri (arXiv). For a long time we have been hoping to use our tools and techniques to solve problems that are useful in the real world. One such problem, with real-life applications, is the following: given two sets of vectors, find the t pairs of vectors with the highest dot product. In this paper they use clever sampling techniques to design algorithms that outperform the state of the art. They not only give theoretical guarantees but also validate their results empirically. Bridging the gap between theory and practice is extremely important and, at the same time, very challenging. We hope more work will be done in this direction.
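For reference, the problem statement itself is easy to pin down with a naive quadratic-time baseline (this is only the brute-force comparison point, not the paper's diamond-sampling algorithm):

```python
import numpy as np

def top_t_dot_products(A, B, t):
    """Brute force: compute all |A|*|B| dot products and return the t largest
    as (i, j, score) triples. A is (m, d), B is (n, d)."""
    scores = A @ B.T                                 # (m, n) pairwise dot products
    order = np.argsort(scores, axis=None)[::-1][:t]  # flat indices of the t largest
    n = scores.shape[1]
    return [(int(k) // n, int(k) % n, float(scores.flat[k])) for k in order]

A = np.array([[1.0, 0.0], [0.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 4.0]])
pairs = top_t_dot_products(A, B, 2)
# the best pair is A[1] with B[1] (dot product 8.0), then A[0] with B[0] (3.0)
```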

Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity by Swagato Sanyal (arXiv). Understanding the relationship between the Fourier dimension of a Boolean function and its sparsity is an important problem. In this paper a better bound on the Fourier dimension in terms of sparsity is obtained. The main technique is to use the fact that the Fourier dimension equals the non-adaptive parity decision tree complexity, and then to bound the parity decision tree complexity in terms of sparsity.

Relationship between Deterministic Query Complexity, Randomized Query Complexity and Quantum Query Complexity. In the world of query complexity, understanding the exact relationship between the various models of computation is the main problem. It is known that deterministic query complexity, randomized query complexity and quantum query complexity are all polynomially related, but the exact polynomial relation between them is not known. Last month there was a sudden burst of activity in this area, with three papers addressing this problem coming out in a span of two weeks. In the papers Separations in Query Complexity Based on Pointer Functions by Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha and Juris Smotrovs (arXiv) and Towards Better Separation between Deterministic and Randomized Query Complexity by Sagnik Mukhopadhyay and Swagato Sanyal (arXiv), it is proved that randomized and deterministic query complexity are quadratically separated, and that this bound is tight up to logarithmic factors. Very soon after, in A Super-Grover Separation Between Randomized and Quantum Query Complexities by Shalev Ben-David (arXiv), it was proved that the separation between quantum and randomized query complexity is super-quadratic. With these three results our knowledge of query complexity is slightly clearer.

by Sourav Chakraborty at July 06, 2015 07:25 AM UTC

Historical talks

During the summer program on cryptography, every Monday afternoon there is a talk on the history of a classic paper or series of papers. Last week, Russell Impagliazzo spoke on his work with Steven Rudich on the impossibility of basing one-way permutations or key agreement on the existence of one-way functions. The week before, Umesh Vazirani spoke on quantum computing, and earlier Ron Rivest spoke about the origins of modern cryptography.

All talks are recorded, and the recordings are available here.

Tomorrow afternoon, Daniele Micciancio will speak at 3pm on lattice-based cryptography.

Tomorrow is also the first day of the Workshop on the mathematics of modern cryptography. The program starts at 9:30am Pacific time, and all talks are broadcast live, as usual.

by Luca Trevisan at July 06, 2015 02:05 AM UTC

Does Bob Deserve the lavish acknowledgement: A problem in Logic

Alice and Carol are real mathematicians.
Bob is an English major who does not know any mathematics.

(This story is based on a true incident.)

Alice writes a math paper. Carol reads it and offers corrections of style and grammar and how-to-say-things. She also helps simplify some of the proofs. She does not deserve co-authorship, but Alice does of course write in the acknowledgements

I would like to thank Carol for proofreading and for help with some of the proofs.

Bob points out that this is silly --- if she would like to thank Carol then she should just do so. So Alice changes it to

I thank Carol for proofreading and for help with some of the proofs.

Even though Bob does not understand the math he begins reading the paper. He finds a few grammar mistakes, some points of style, and even a math mistake:

BOB: Alice, this sentence mentions A1 and A2, is A1 the steak sauce?

ALICE: It's A sub 1, and no, it is not the steak sauce.

BOB: But later in the sentence there is a reference to A. Maybe it's implicit what A is and I don't get it since I don't know the math, but it does look funny.

ALICE: Well pierce my ears and call me drafty! You're right! It should be A_1, A_2, and A_1 ∩ A_2.

SO, in the end Bob DID proofread the paper and DID help. Alice wants to include him in the acknowledgements. She modifies the ack to

I thank Bob and Carol for proofreading and help with some of the proofs.

Is that correct? Bob just did proofreading, and Carol did proofreading AND helped with some proofs. In logical terms

If B did X and C did X and Y then

B AND C did X AND Y

does seem correct.

But it also seems misleading. Alice could separate it out:

I thank Carol for proofreading and help with some of the proofs.
I thank Bob and Carol for proofreading.

That's more accurate but also more cumbersome.

But my real question is: is the I THANK BOB AND CAROL... statement correct or incorrect? In logic, correct; in English, perhaps not. We could ask Bob, who is an English major, and maybe get a paper out of it, which Carol can proofread!

by GASARCH (noreply@blogger.com) at July 06, 2015 01:29 AM UTC

Using the Johnson-Lindenstrauss lemma in linear and integer programming

Authors: Ky Vu, Pierre-Louis Poirion, Leo Liberti
Abstract: The Johnson-Lindenstrauss lemma allows dimension reduction on real vectors with low distortion on their pairwise Euclidean distances. This result is often used in algorithms such as $k$-means or $k$ nearest neighbours since they only use Euclidean distances, and has sometimes been used in optimization algorithms involving the minimization of Euclidean distances. In this paper we introduce a first attempt at using this lemma in the context of feasibility problems in linear and integer programming, which cannot be expressed only in function of Euclidean distances.
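The lemma itself is easy to exercise numerically; below is a minimal sketch of the standard Gaussian construction (not the linear/integer-programming application of the paper):

```python
import numpy as np

def jl_project(X, k, rng):
    """Map the rows of X from d dimensions down to k with a random Gaussian
    matrix scaled by 1/sqrt(k); the JL lemma says k = O(log(n)/eps^2) suffices
    to preserve all pairwise distances up to a factor of 1 +- eps."""
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10_000))     # 50 points in dimension 10,000
Y = jl_project(X, 1_000, rng)             # projected down to dimension 1,000

ratio = np.linalg.norm(Y[0] - Y[1]) / np.linalg.norm(X[0] - X[1])
# ratio is close to 1: the pairwise distance survives the projection
```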

Algebraic and model theoretic methods in constraint satisfaction

Authors: Michael Pinsker
Abstract: This text is related to the tutorials I gave at the Banff International Research Station and within a "Doc-course" at Charles University Prague in the fall of 2014. It describes my current research and some of the most important open questions related to it.

Optimal linear Bernoulli factories for small mean problems

Authors: Mark Huber
Abstract: Suppose a coin with unknown probability $p$ of heads can be flipped as often as desired. A Bernoulli factory for a function $f$ is an algorithm that uses flips of the coin together with auxiliary randomness to flip a single coin with probability $f(p)$ of heads. Applications include near perfect sampling from the stationary distribution of regenerative processes. When $f$ is analytic, the problem can be reduced to a Bernoulli factory of the form $f(p) = Cp$ for constant $C$. Presented here is a new algorithm where for small values of $Cp$, requires roughly only $C$ coin flips to generate a $Cp$ coin. From information theory considerations, this is also conjectured to be (to first order) the minimum number of flips needed by any such algorithm.

For $Cp$ large, the new algorithm can also be used to build a new Bernoulli factory that uses only 80\% of the expected coin flips of the older method, and applies to the more general problem of a multivariate Bernoulli factory, where there are $k$ coins, the $k$th coin has unknown probability $p_k$ of heads, and the goal is to simulate a coin flip with probability $C_1 p_1 + \cdots + C_k p_k$ of heads.
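To illustrate the notion (not Huber's algorithm), the easiest nontrivial factory is $f(p) = p^2$: flip the unknown coin twice and report heads iff both flips are heads. A quick simulation sketch, with the coin bias chosen arbitrarily for the demo:

```python
import random

def p_coin(p, rng):
    """A coin of unknown bias p; the factory may only call this, never read p."""
    return rng.random() < p

def squared_factory(flip):
    """Bernoulli factory for f(p) = p^2: heads iff two independent flips are heads."""
    return flip() and flip()

rng = random.Random(0)
p = 0.6                       # hidden from the factory; 0.6 is an arbitrary demo value
trials = 100_000
est = sum(squared_factory(lambda: p_coin(p, rng)) for _ in range(trials)) / trials
# est is close to p^2 = 0.36
```

Linear targets $f(p) = Cp$ with $C > 1$, the case the paper optimizes, are far subtler, since the factory must amplify the bias without ever evaluating $p$.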

Anti-concentration for random polynomials

Authors: Oanh Nguyen, Van Vu
Abstract: We prove anti-concentration results for polynomials of independent Rademacher random variables, with arbitrary degree. Our results extend the classical Littlewood-Offord result for linear polynomials, and improve several earlier estimates. As an application, we address a challenge in complexity theory posed by Razborov and Viola.

Online Self-Indexed Grammar Compression

Authors: Yoshimasa Takabatake, Yasuo Tabei, Hiroshi Sakamoto
Abstract: Although several grammar-based self-indexes have been proposed thus far, their applicability is limited to offline settings where whole input texts are prepared, thus requiring to rebuild index structures for given additional inputs, which is often the case in the big data era. In this paper, we present the first online self-indexed grammar compression named OESP-index that can gradually build the index structure by reading input characters one-by-one. Such a property is another advantage which enables saving a working space for construction, because we do not need to store input texts in memory. We experimentally test OESP-index on the ability to build index structures and search query texts, and we show OESP-index's efficiency, especially space-efficiency for building index structures.

Truthful Online Scheduling with Commitments

Authors: Yossi Azar, Inna Kalp-Shaltiel, Brendan Lucier, Ishai Menache, Joseph Naor, Jonathan Yaniv
Abstract: We study online mechanisms for preemptive scheduling with deadlines, with the goal of maximizing the total value of completed jobs. This problem is fundamental to deadline-aware cloud scheduling, but there are strong lower bounds even for the algorithmic problem without incentive constraints. However, these lower bounds can be circumvented under the natural assumption of deadline slackness, i.e., that there is a guaranteed lower bound $s > 1$ on the ratio between a job's size and the time window in which it can be executed.

In this paper, we construct a truthful scheduling mechanism with a constant competitive ratio, given slackness $s > 1$. Furthermore, we show that if $s$ is large enough then we can construct a mechanism that also satisfies a commitment property: it can be determined whether or not a job will finish, and the requisite payment if so, well in advance of each job's deadline. This is notable because, in practice, users with strict deadlines may find it unacceptable to discover only very close to their deadline that their job has been rejected.

Approximate Deadline-Scheduling with Precedence Constraints

Authors: Hossein Efsandiari, MohammadTaghi Hajiaghyi, Jochen Koenemann, Hamid Mahini, David Malec, Laura Sanita
Abstract: We consider the classic problem of scheduling a set of n jobs non-preemptively on a single machine. Each job j has non-negative processing time, weight, and deadline, and a feasible schedule needs to be consistent with chain-like precedence constraints. The goal is to compute a feasible schedule that minimizes the sum of penalties of late jobs. Lenstra and Rinnoy Kan [Annals of Disc. Math., 1977] in their seminal work introduced this problem and showed that it is strongly NP-hard, even when all processing times and weights are 1. We study the approximability of the problem and our main result is an O(log k)-approximation algorithm for instances with k distinct job deadlines.

Extremal eigenvalues of local Hamiltonians

Authors: Aram W. Harrow, Ashley Montanaro
Abstract: We apply classical algorithms for approximately solving constraint satisfaction problems to find bounds on extremal eigenvalues of local Hamiltonians. We consider qubit Hamiltonians for which we have an upper bound on the number of terms in which each qubit participates, and find asymptotically optimal bounds for the operator norm and ground-state energy of such Hamiltonians under this constraint. In each case the bound is achieved by a product state which can be found efficiently using a classical algorithm.

The Two Formulations of the Szemerédi–Trotter Theorem

The Szemerédi–Trotter theorem is usually presented in one of the two following formulations. Theorem 1. Given a set of $m$ points and a set of $n$ lines, both in $\mathbb{R}^2$, the number of incidences between the two sets is $O(m^{2/3}n^{2/3} + m + n)$. Theorem 2. Given a set of $n$ lines in $\mathbb{R}^2$, the number of points in $\mathbb{R}^2$ that are incident […]

by adamsheffer at July 05, 2015 09:02 PM UTC

On the different stages of learning and teaching (algorithms)

from The Geomblog

Descending a rabbit hole of links prompted by a MeFi discussion (thanks, +David Eppstein) of Steven Pinker's essay on the curse of knowledge (thanks, +Jeff Erickson), I came across an article by Alistair Cockburn on a learning framework inspired by aikido called 'Shu-Ha-Ri'.

In brief,

• In the Shu stage, you're a beginning learner trying to find one way to solve a problem. It doesn't matter that there might be multiple ways. The goal is to learn one path, and learn it well.
• In the Ha stage, you understand one way well enough to realize its limits, and are ready to encounter many different strategies for reaching your goal. You might even begin to understand the pros and cons of these different approaches. In effect, you have detached from commitment to a single approach.
• In the Ri stage, you have "transcended" the individual strategies. You might use one, or another, or mix and match as needed. You'll create new paths as you need them, and move fluidly through the space of possibilities.

Reading through this article while I ponder (yet again) my graduate algorithms class for the fall, I realize that this three-stage development process maps quite well to what we expect from undergraduates, masters students and Ph.D students learning about an area.

The undergraduate is learning a tool for the first time (recurrence analysis, say) and if they can understand the master theorem and apply it, that's pretty good.

At the next level, they realize the limitations of the master theorem, and might learn about the Akra-Bazzi method, or annihilators, or even some probabilistic recurrence methods.

Of course, once you're dealing with some thorny recurrence for the analysis in your next SODA submission, then the standard templates are helpful, but you'll often have to do something creative and nontrivial to wrestle the analysis into a form where it makes sense.

Pick your own topic if you don't like recurrences.
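Sticking with recurrences for a moment, the Shu-stage version of the example is a single pattern match against the master theorem (standard statement assumed):

```latex
% Mergesort-style recurrence: a = 2 subproblems of size n/b = n/2 plus \Theta(n) merge work.
% Since n^{\log_b a} = n^{\log_2 2} = n matches f(n) = \Theta(n), case 2 applies:
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + \Theta(n)
\quad\Longrightarrow\quad
T(n) = \Theta(n \log n).
```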

Which also explains why it's hard to explain how to prove things. Beginning students expect a standard formula (which is why induction and proof by contradiction get taught over and over). But once you go beyond this, there aren't really good templates. In effect, there's no good second level with a set of proof techniques that you can throw at most problems, which explains why students taking a grad algorithms class tend to struggle with exactly this step.

by Suresh Venkatasubramanian (noreply@blogger.com) at July 04, 2015 08:12 PM UTC

The Halting Problem For Linear Machines

from Richard Lipton

A small idea before the fireworks show

Thoralf Skolem was a mathematician who worked in mathematical logic, set theory, and number theory. He was the only known PhD student of Axel Thue, whose Thue systems were an early word-based model of computation. Skolem had only one PhD student, Øystein Ore, who did not work in logic or computation. Ore did, however, have many students including Grace Hopper and Marshall Hall, Jr., and Hall had many more including Don Knuth.

Today Ken and I try to stimulate progress on a special case of Skolem’s problem on linear sequences.

Although Ore worked mainly on ring theory and graph theory the seeds still collected around Skolem’s tree: Hall’s dissertation was titled “An Isomorphism Between Linear Recurring Sequences and Algebraic Rings.” Sequences defined by a finite linear operator are about the simplest computational process we can imagine:

$\displaystyle x_{n} = a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_m x_{n-m}.$

The coefficients ${a_1,\dots,a_m}$ and initial values ${x_{0},\cdots,x_{m-1}}$ can be integers or relaxed to be algebraic numbers. Skolem posed the problem of deciding whether there is ever an ${n}$ such that ${x_n = 0}$.

This is a kind of halting problem. It seems like it should be simple to analyze—it is just linear algebra—but it has remained open for over 80 years. We have discussed it several times before. This 2012 survey by Joel Ouaknine and James Worrell, plus this new one, give background on this and some related problems.
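
The halting-problem flavor is easy to see in code. Here is a brute-force search sketch in Python (an illustration only, not from the post): it unrolls the recurrence and looks for a zero up to a cutoff. Finding a zero certifies a "yes" answer; the open problem is precisely that no general bound is known at which one may stop and safely answer "no."

```python
def skolem_search(coeffs, init, limit):
    # x_n = a_1 x_{n-1} + ... + a_m x_{n-m}; report the first n with x_n = 0,
    # or None if no zero appears below the cutoff (inconclusive).
    xs = list(init)                      # x_0 .. x_{m-1}
    m = len(coeffs)
    for n, x in enumerate(xs):
        if x == 0:
            return n
    for n in range(m, limit):
        nxt = sum(a * x for a, x in zip(coeffs, reversed(xs)))
        if nxt == 0:
            return n
        xs = xs[1:] + [nxt]
    return None

# x_n = x_{n-1} - x_{n-2} with x_0 = x_1 = 1 gives 1, 1, 0, -1, -1, 0, ...
print(skolem_search([1, -1], [1, 1], 100))   # first zero at n = 2
```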

The Sum-of-Powers Case

Let ${f(n)}$ be

$\displaystyle \lambda_{1}^{n} + \cdots + \lambda_{m}^{n},$

where each ${\lambda_{i}}$ is an algebraic integer. Our problem is:

Does there exist a natural number ${n}$ so that ${f(n)=0}$?

This is a special case of the Skolem problem. It arises when the coefficients ${a_1,\dots,a_m}$ are the evaluations of the elementary symmetric polynomials ${s_1,\dots,s_m}$ at ${\lambda_1,\dots,\lambda_m}$ with alternating signs. For example, with ${m = 2}$ we get

$\displaystyle a_1 = \lambda_1 + \lambda_2, \quad a_2 = -\lambda_1 \lambda_2,$

which for ${x_0 = 2}$ and ${x_1 = \lambda_1 + \lambda_2}$ gives

$\displaystyle x_2 = a_1 x_1 + a_2 x_0 = (\lambda_1 + \lambda_2)^2 - 2\lambda_1 \lambda_2 = \lambda_1^2 + \lambda_2^2$

and so on. For ${m = 3}$ we have

$\displaystyle a_1 = \lambda_1 + \lambda_2 + \lambda_3, \quad a_2 = -(\lambda_1 \lambda_2 + \lambda_1 \lambda_3 + \lambda_2 \lambda_3),\quad a_3 = \lambda_1\lambda_2\lambda_3.$

Then ${f(n) = 0}$ means ${\lambda_1^n + \lambda_2^n + \lambda_3^n = 0.}$ If the ${\lambda_i}$ are nonzero integers then for odd ${n \geq 3}$ this is asking whether ${\lambda_1,\lambda_2,-\lambda_3}$ is a solution to Pierre Fermat’s equation, and we can simply answer “no.” Of course checking whether a given triple is a solution can be easier than deciding whether the equation has any solution, but this shows our case contains some of the flavor of Fermat’s Last Theorem.
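
The correspondence between the coefficients ${a_k}$ and the power sums is easy to check numerically. The following Python sketch (an illustration of the identity above, not code from the post) builds ${a_k = (-1)^{k+1} e_k(\lambda_1,\dots,\lambda_m)}$ and verifies that ${f(n) = \lambda_1^n + \cdots + \lambda_m^n}$ satisfies the linear recurrence:

```python
from itertools import combinations
from math import prod

def recurrence_coeffs(lams):
    # a_k = (-1)^(k+1) e_k(lambda_1, ..., lambda_m): the elementary
    # symmetric polynomials evaluated at the lambdas, with alternating signs.
    m = len(lams)
    return [(-1) ** (k + 1) * sum(prod(c) for c in combinations(lams, k))
            for k in range(1, m + 1)]

def power_sum(lams, n):
    return sum(l ** n for l in lams)

lams = [2, 3, 5]
a = recurrence_coeffs(lams)          # [10, -31, 30] for these lambdas
for n in range(3, 12):
    # x_n = a_1 x_{n-1} + a_2 x_{n-2} + a_3 x_{n-3} with x_n = f(n)
    assert power_sum(lams, n) == sum(
        a[i] * power_sum(lams, n - 1 - i) for i in range(len(lams)))
print("f(n) satisfies the recurrence with the stated coefficients")
```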

We can point up some minor progress on this problem. Our methods can handle somewhat more general cases where the sum of ${n}$-th powers is multiplied by ${cn^{d}}$ for some fixed constants ${c}$ and ${d}$, but we will stay with the simpler case. Our larger hope is that this case embodies the core of the difficulty in Skolem’s problem, so that solving it might throw open the road to the full solution.

Proof

Let’s begin the proof for the case when ${n}$ is a prime ${p}$. Suppose that ${f(p)=0}$. Recall

$\displaystyle f(n)=\lambda_{1}^{n} + \cdots + \lambda_{m}^{n}.$

Clearly we can assume that ${f(1) \neq 0}$, since otherwise ${n = 1}$ already gives a zero; note that this is decidable. Put ${X = \mathsf{Norm}(f(1))}$. The key is to look at the quantity

$\displaystyle Y = \mathsf{Norm}( (\lambda_{1} + \cdots + \lambda_{m})^{p} ),$

where ${p}$ is a prime. We employ the following generalization of the binomial theorem:

$\displaystyle (x_1 + x_2 + \cdots + x_m)^n = \sum_{k_1+k_2+\cdots+k_m=n} {n \choose k_1, k_2, \ldots, k_m} \prod_{1\le t\le m}x_{t}^{k_{t}},$

where

$\displaystyle {n \choose k_1, k_2, \ldots, k_m} = \frac{n!}{k_1!\, k_2! \cdots k_m!}.$

The upshot is that every term is divisible by a nontrivial factor of ${n}$, except those from the cases ${k_j = n}$ with all other ${k_i = 0}$; each of these contributes a factor of ${{n \choose n} = 1}$ and leaves the term ${\lambda_j^n}$. When ${n}$ is a prime ${p}$, that nontrivial factor must be ${p}$ itself. Thus we get that ${Y = \mathsf{Norm}(y)}$ for some ${y}$ of the form

$\displaystyle y = \lambda_{1}^{p} + \cdots + \lambda_{m}^{p} + pR,$

where ${R}$ is an algebraic integer. But by the supposition ${f(p) = 0}$ this simplifies to ${y = pR}$, and so ${Y}$ is divisible by ${p}$. Thus

$\displaystyle Y \equiv 0 \bmod p.$

Since ${Y = \mathsf{Norm}(f(1)^p) = \mathsf{Norm}(f(1))^p = X^p}$, ${X}$ too is divisible by ${p}$. But ${X}$ is a fixed nonzero integer, independent of ${p}$. Hence ${|X|}$ bounds any possible prime ${p}$ such that ${f(p) = 0}$. Testing the finitely many primes ${p \leq |X|}$ thus yields a decision procedure for this restricted case of Skolem’s problem.
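
Over the rational integers the norm of ${f(1)}$ is just its value, so the decision procedure is very concrete. A minimal Python sketch of it (my own illustration, with hypothetical function names): any prime ${p}$ with ${f(p) = 0}$ must divide ${X = f(1)}$, so it suffices to test the primes up to ${|X|}$.

```python
def f(lams, n):
    return sum(l ** n for l in lams)

def primes_up_to(N):
    # simple sieve of Eratosthenes
    sieve = [True] * (N + 1)
    out = []
    for p in range(2, N + 1):
        if sieve[p]:
            out.append(p)
            for q in range(p * p, N + 1, p):
                sieve[q] = False
    return out

def prime_zeros(lams):
    # The argument above: any prime p with f(p) = 0 divides X = f(1),
    # hence p <= |X|, leaving only finitely many candidates to test.
    X = f(lams, 1)
    assert X != 0, "the argument assumes f(1) != 0"
    return [p for p in primes_up_to(abs(X)) if f(lams, p) == 0]

print(prime_zeros([6, 3, -2]))   # X = 7, so only p in {2, 3, 5, 7} need checking
```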

Fine-Tuning and Fireworks

Ken chimes in an observation that might be distantly related: The Vandermonde determinant

$\displaystyle \prod_{1 \leq i < j \leq m} (x_j - x_i)$

is the “smallest” alternating polynomial in ${m}$ variables. Together with the symmetric polynomials it generates all alternating polynomials. When the ${x_i}$ are the ${m}$-th roots of unity it gives the determinant of the Fourier matrix ${F_m}$ up to sign. This determinant has absolute value

$\displaystyle D = m^{m/2} = 2^{\frac{1}{2}m\log_2 m}.$

It is also the product of the lengths of the ${{m \choose 2}}$ chords formed by ${m}$ equally-spaced points on the unit circle. The observation is that this 2-to-the-nearly-linear quantity is extraordinarily finely tuned.

To see how, let’s estimate the product of the chords in what is caricatured as the style of physicists: The length of an average chord is ${\sqrt{2}}$. So we can estimate the size of the product as

$\displaystyle (\sqrt{2})^{m \choose 2} \approx 2^{\frac{1}{4}m^2} = 2^{\Theta(m^2)}.$

This is off by an order of magnitude in the exponent—not even close. We can be a little smarter and use the average length of a chord instead, integrating ${\frac{1}{\pi}\sqrt{2 - 2\cos(x)}}$ from ${0}$ to ${\pi}$ to get ${4/\pi}$. This is still a number greater than ${1}$ and plugs in to yield ${2^{\Theta(m^2)}}$ anyway.
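
The gap between the naive estimate and the truth can be watched numerically. A small Python sketch (an illustration, not from the post) computes the chord product for the ${m}$-th roots of unity, which equals ${m^{m/2}}$, and compares it against the ${(\sqrt{2})^{m \choose 2}}$ estimate:

```python
import cmath

def chord_product(m):
    # product of the m-choose-2 chord lengths among m equally spaced
    # points on the unit circle
    pts = [cmath.exp(2j * cmath.pi * k / m) for k in range(m)]
    out = 1.0
    for i in range(m):
        for j in range(i + 1, m):
            out *= abs(pts[j] - pts[i])
    return out

for m in (4, 8, 16):
    exact = m ** (m / 2)              # the fine-tuned true value
    naive = 2 ** (m * (m - 1) / 4)    # (sqrt 2)^{m choose 2} estimate
    print(m, chord_product(m), exact, naive)
```

Already at ${m = 16}$ the true value is about ${4.3 \times 10^9}$ while the estimate is about ${10^{18}}$.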

Such a calculation looks silly but isn’t. If we enlarge the circle by a factor of ${(1 + \delta)}$ then every term in the product is multiplied by that factor and it dominates:

$\displaystyle D_{1+\delta} \approx (1 + \delta)^{m \choose 2} = 2^{\Theta(m^2)}.$

If we shrink the circle by ${(1 - \delta)}$ the opposite happens: we divide by ${2^{\Theta(m^2)}}$ which crushes everything to make the analogous quantity ${D_{1-\delta}}$ virtually zero. Furthermore this “big crush” happens under more-plausible slight perturbations such as forbidding any of the ${m}$ points from occupying the arc between ${0}$ and ${\epsilon}$ radians, which prevents the equal-spacing maximization when ${m > 2\pi/\epsilon}$. We covered this at length in 2011.

The underlying reality is that when you take the logarithm of the product of chords, the terms of all growth orders between ${m^2}$ and ${m^{1 + \delta}}$ magically cancel. There are many more chords of length ${> 1}$ than chords of length ${< 1}$, but the latter can be unboundedly short in a way that perfectly balances the multitudes of longer chords. The actual value of ${D}$ seems tiny amidst these perturbative possibilities.

This gigantic cancellation reminds Dick and me of the present argument over the tiny observed magnitude of the cosmological constant ${\Lambda}$. Estimation via quantum field theory prescribes a value 120 orders of magnitude higher—one that would instantly cause our universe to explode in fireworks—unless vast numbers of terms exactly cancel. Quoting Wikipedia:

This discrepancy has been called “the worst theoretical prediction in the history of physics” … the cosmological constant problem [is] the worst problem of fine-tuning in physics: there is no known natural way to derive the tiny [value] from particle physics.

“Fine-tuning” of constants without explanation is anathema to science, and many scientists have signed onto theories that there is a multiverse with 500 or more orders of magnitude of universes, enough to generate some with the tiny ${\Lambda}$ needed to allow life as we know it. However, any fine-tuning discovered in mathematics cannot be anathema. Perhaps the universe picks up the Fourier fine balancing act in ways we do not yet understand. More prosaically, the fine balance in quantities similar to ${f(n) = \lambda_1^n + \cdots + \lambda_m^n}$ above could be just what makes Skolem’s problem hard.

Open Problems

I believe that the general case of Skolem’s problem can be handled, not just this simple case, and that this method can reach more cases than just primes; the problem of extending it beyond primes, however, seems hard. Ken and I are working on this. Meanwhile, we wish everyone Stateside a happy Fourth of July, whether or not that includes fireworks.

[added link to new survey in intro]

Entropy optimality: Analytic psd rank and John’s theorem

from tcs math

Recall that our goal is to sketch a proof of the following theorem, where the notation is from the last post. I will assume a knowledge of the three posts on polyhedral lifts and non-negative rank, as our argument will proceed by analogy.

Theorem 1 For every ${m \geq 1}$ and ${g : \{0,1\}^m \rightarrow \mathbb R_+}$, there exists a constant ${C(g)}$ such that the following holds. For every ${n \geq 2m}$,

$\displaystyle 1+n^{d/2} \geq \mathrm{rank}_{\mathsf{psd}}(M_n^g) \geq C \left(\frac{n}{\log n}\right)^{(d-1)/2}\,. \ \ \ \ \ (1)$

where ${d = \deg_{\mathsf{sos}}(g).}$

In this post, we will see how John’s theorem can be used to transform a psd factorization into one of a nicer analytic form. Using this, we will be able to construct a convex body that contains an approximation to every non-negative matrix of small psd rank.

1.1. Finite-dimensional operator norms

Let ${H}$ denote a finite-dimensional Euclidean space over ${\mathbb R}$ equipped with inner product ${\langle \cdot,\cdot\rangle}$ and norm ${|\cdot|}$. For a linear operator ${A : H \rightarrow H}$, we define the operator, trace, and Frobenius norms by

$\displaystyle \|A\| = \max_{x \neq 0} \frac{|Ax|}{|x|},\quad \|A\|_* = \mathrm{Tr}(\sqrt{A^T A}),\quad \|A\|_F = \sqrt{\mathrm{Tr}(A^T A)}\,.$

Let ${\mathcal M(H)}$ denote the set of self-adjoint linear operators on ${H}$. Note that for ${A \in \mathcal M(H)}$, the preceding three norms are precisely the ${\ell_{\infty}}$, ${\ell_1}$, and ${\ell_2}$ norms of the eigenvalues of ${A}$. For ${A,B \in \mathcal M(H)}$, we use ${A \succeq 0}$ to denote that ${A}$ is positive semi-definite and ${A \succeq B}$ for ${A-B \succeq 0}$. We use ${\mathcal D(H) \subseteq \mathcal M(H)}$ for the set of density operators: Those ${A \in \mathcal M(H)}$ with ${A \succeq 0}$ and ${\mathrm{Tr}(A)=1}$.

One should recall that ${\mathrm{Tr}(A^T B)}$ is an inner product on the space of linear operators, and we have the operator analogs of the Hölder inequalities: ${\mathrm{Tr}(A^T B) \leq \|A\| \cdot \|B\|_*}$ and ${\mathrm{Tr}(A^T B) \leq \|A\|_F \|B\|_F}$.
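
These identities are easy to sanity-check with NumPy (a sketch for illustration; `numpy` is assumed available): for a random self-adjoint ${A}$ the three norms coincide with the ${\ell_\infty}$, ${\ell_1}$, and ${\ell_2}$ norms of the eigenvalues, and the operator Hölder bounds hold.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2                      # a random self-adjoint operator
M2 = rng.standard_normal((6, 6))
B = (M2 + M2.T) / 2

eig = np.linalg.eigvalsh(A)
assert np.isclose(np.linalg.norm(A, 2),     np.max(np.abs(eig)))        # operator norm = l_inf
assert np.isclose(np.linalg.norm(A, 'nuc'), np.sum(np.abs(eig)))        # trace norm    = l_1
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(eig ** 2)))  # Frobenius     = l_2

# operator Hölder inequalities
inner = np.trace(A.T @ B)
assert inner <= np.linalg.norm(A, 2) * np.linalg.norm(B, 'nuc') + 1e-9
assert inner <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro') + 1e-9
print("norms match eigenvalue l_p norms; Hölder bounds hold")
```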

1.2. Rescaling the psd factorization

As in the case of non-negative rank, consider finite sets ${X}$ and ${Y}$ and a matrix ${M : X \times Y \rightarrow \mathbb R_+}$. For the purposes of proving a lower bound on the psd rank of some matrix, we would like to have a nice analytic description.

To that end, suppose we have a rank-${r}$ psd factorization

$\displaystyle M(x,y) = \mathrm{Tr}(A(x) B(y))$

where ${A : X \rightarrow \mathcal S_+^r}$ and ${B : Y \rightarrow \mathcal S_+^r}$. The following result of Briët, Dadush and Pokutta (2013) gives us a way to “scale” the factorization so that it becomes nicer analytically. (The improved bound stated here is from an article of Fawzi, Gouveia, Parrilo, Robinson, and Thomas, and we follow their proof.)

Lemma 2 Every ${M}$ with ${\mathrm{rank}_{\mathsf{psd}}(M) \leq r}$ admits a factorization ${M(x,y)=\mathrm{Tr}(P(x) Q(y))}$ where ${P : X \rightarrow \mathcal S_+^r}$ and ${Q : Y \rightarrow \mathcal S_+^r}$ and, moreover,

$\displaystyle \max \{ \|P(x)\| \cdot \|Q(y)\| : x \in X, y \in Y \} \leq r \|M\|_{\infty}\,,$

where ${\|M\|_{\infty} = \max_{x \in X, y \in Y} M(x,y)}$.

Proof: Start with a rank-${r}$ psd factorization ${M(x,y) = \mathrm{Tr}(A(x) B(y))}$. Observe that there is a degree of freedom here, because for any invertible operator ${J}$, we get another psd factorization ${M(x,y) = \mathrm{Tr}\left(\left(J A(x) J^T\right) \cdot \left((J^{-1})^T B(y) J^{-1}\right)\right)}$.

Let ${U = \{ u \in \mathbb R^r : \exists x \in X\,\, A(x) \succeq uu^T \}}$ and ${V = \{ v \in \mathbb R^r : \exists y \in Y\,\, B(y) \succeq vv^T \}}$. Set ${\Delta = \|M\|_{\infty}}$. We may assume that ${U}$ and ${V}$ both span ${\mathbb R^r}$ (else we can obtain a lower-rank psd factorization). Both sets are bounded by finiteness of ${X}$ and ${Y}$.

Let ${C=\mathrm{conv}(U)}$ and note that ${C}$ is centrally symmetric and contains the origin. Now John’s theorem tells us there exists a linear operator ${J : \mathbb R^r \rightarrow \mathbb R^r}$ such that

$\displaystyle B_{\ell_2} \subseteq J C \subseteq \sqrt{r} B_{\ell_2}\,, \ \ \ \ \ (2)$

where ${B_{\ell_2}}$ denotes the unit ball in the Euclidean norm. Let us now set ${P(x) = J A(x) J^T}$ and ${Q(y) = (J^{-1})^T B(y) J^{-1}}$.

Eigenvalues of ${P(x)}$: Let ${w}$ be an eigenvector of ${P(x)}$ normalized so the corresponding eigenvalue is ${\|w\|_2^2}$. Then ${P(x) \succeq w w^T}$, implying that ${J^{-1} w \in U}$ (here we use that ${A \succeq 0 \implies S A S^T \succeq 0}$ for any ${S}$). Since ${w = J(J^{-1} w)}$, (2) implies that ${\|w\|_2 \leq \sqrt{r}}$. We conclude that every eigenvalue of ${P(x)}$ is at most ${r}$.

Eigenvalues of ${Q(y)}$: Let ${w}$ be an eigenvector of ${Q(y)}$ normalized so that the corresponding eigenvalue is ${\|w\|_2^2}$. Then as before, we have ${Q(y) \succeq ww^T}$ and this implies ${J^T w \in V}$. Now, on the one hand we have

$\displaystyle \max_{z \in JC}\, \langle z,w\rangle \geq \|w\|_2 \ \ \ \ \ (3)$

since ${JC \supseteq B_{\ell_2}}$.

On the other hand:

$\displaystyle \max_{z \in JC}\, \langle z,w\rangle^2 = \max_{z \in C}\, \langle Jz, w\rangle^2 = \max_{z \in C}\, \langle z, J^T w\rangle^2\,. \ \ \ \ \ (4)$

Finally, observe that for any ${u \in U}$ and ${v \in V}$, we have

$\displaystyle \langle u,v\rangle^2 =\langle uu^T, vv^T\rangle \leq \max_{x \in X, y \in Y} \langle A(x), B(y)\rangle \leq \Delta\,.$

By convexity, this implies that ${\max_{z \in C}\, \langle z,v\rangle^2 \leq \Delta}$ for all ${v \in V}$, bounding the right-hand side of (4) by ${\Delta}$. Combining this with (3) yields ${\|w\|_2^2 \leq \Delta}$. We conclude that all the eigenvalues of ${Q(y)}$ are at most ${\Delta}$. $\Box$
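
The degree of freedom exploited in the proof (conjugating by any invertible ${J}$ preserves the factorization, and preserves positive semi-definiteness) can be checked in a few lines of NumPy; the following is an illustrative sketch, not code from the post.

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_psd(r):
    G = rng.standard_normal((r, r))
    return G @ G.T                     # PSD by construction

r = 5
As = [rand_psd(r) for _ in range(3)]   # the A(x) side of the factorization
Bs = [rand_psd(r) for _ in range(4)]   # the B(y) side

J = rng.standard_normal((r, r))        # a generic J is invertible
Jinv = np.linalg.inv(J)

for Ax in As:
    for By in Bs:
        P = J @ Ax @ J.T               # P(x) = J A(x) J^T
        Q = Jinv.T @ By @ Jinv         # Q(y) = (J^{-1})^T B(y) J^{-1}
        # same matrix entry M(x,y), and P is again PSD
        assert np.isclose(np.trace(P @ Q), np.trace(Ax @ By))
        assert np.all(np.linalg.eigvalsh(P) > -1e-8)
print("the rescaled pair (P, Q) is a psd factorization of the same M")
```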

1.3. Convex proxy for psd rank

Again, in analogy with the non-negative rank setting, we can define an “analytic psd rank” parameter for matrices ${N : X \times Y \rightarrow \mathbb R_+}$:

$\displaystyle \alpha_{\mathsf{psd}}(N) = \min \Big\{ \alpha \;\Big|\; \exists\, k \geq 1,\ A : X \rightarrow \mathcal S_+^k,\ B : Y \rightarrow \mathcal S_+^k \ \textrm{with}\ N(x,y) = \mathrm{Tr}(A(x) B(y)),$

$\displaystyle \qquad \mathop{\mathbb E}_{x \in X}[A(x)]=I, \qquad \|A(x)\| \leq \alpha \quad \forall x \in X,$

$\displaystyle \qquad \|B(y)\| \leq \frac{\alpha}{k}\, \mathop{\mathbb E}_{y \in Y}[\mathrm{Tr}(B(y))] \quad \forall y \in Y \Big\}\,.$

Note that we have implicitly equipped ${X}$ and ${Y}$ with the uniform measure. The main point here is that ${k}$ can be arbitrary. One can verify that ${\alpha_{\mathsf{psd}}}$ is convex.

And there is a corresponding approximation lemma. We use ${\|N\|_{\infty}=\max_{x,y} |N(x,y)|}$ and ${\|N\|_1 = \mathop{\mathbb E}_{x,y} |N(x,y)|}$.

Lemma 3 For every non-negative matrix ${M : X \times Y \rightarrow \mathbb R_+}$ and every ${\eta \in (0,1]}$, there is a matrix ${N}$ such that ${\|M-N\|_{\infty} \leq \eta \|M\|_{\infty}}$ and

$\displaystyle \alpha_{\mathsf{psd}}(N) \leq O(\mathrm{rank}_{\mathsf{psd}}(M)) \frac{1}{\eta} \frac{\|M\|_{\infty}}{\|M\|_1}\,.$

Using Lemma 2 in a straightforward way, it is not particularly difficult to construct the approximator ${N}$. The condition ${\mathop{\mathbb E}_x [A(x)] = I}$ poses a slight difficulty that requires adding a small multiple of the identity to the LHS of the factorization (to avoid a poor condition number), but this has a correspondingly small effect on the approximation quality. Putting “Alice” into “isotropic position” is not essential, but it makes the next part of the approach (quantum entropy optimization) somewhat simpler because one is always measuring relative entropy to the maximally mixed state.

by James at July 04, 2015 08:13 AM UTC

The High Cost of Conferences

At some point, I'm convinced the "conference structure" is going to fall apart.

Case in point -- I haven't bought my tickets yet for SIGCOMM because, unless I'm missing something, a schedule isn't up yet. And because ACM has scheduled a SIG chair meeting overlapping with SIGCOMM (which I also don't understand, but that's perhaps beside the point), I want to see what's going on when at the conference so I can plan my timing.

6+ weeks out, round trip tickets from Boston to London are over $2500 on nonstop economy flights. And those don't seem to be on US carriers; since I have to stop back through New York for this other meeting, I need to find a US carrier (or figure out if this is a case that's an exception to the currently believed NSF policy), so tickets look to be well over $3000. Then there's registration, hotel, etc.

At some point, this becomes unsustainable, I think.

by Michael Mitzenmacher (noreply@blogger.com) at July 03, 2015 03:07 PM UTC

Fast, Provable Algorithms for Isotonic Regression in all $\ell_{p}$-norms

Authors: Rasmus Kyng, Anup Rao, Sushant Sachdeva
Abstract: Given a directed acyclic graph $G,$ and a set of values $y$ on the vertices, the Isotonic Regression of $y$ is a vector $x$ that respects the partial order described by $G,$ and minimizes $||x-y||,$ for a specified norm. This paper gives improved algorithms for computing the Isotonic Regression for all weighted $\ell_{p}$-norms with rigorous performance guarantees. Our algorithms are quite practical, and their variants can be implemented to run fast in practice.

A Characterization of the Complexity of Resilience and Responsibility for Self-join-free Conjunctive Queries

Authors: Cibele Freire, Wolfgang Gatterbauer, Neil Immerman, Alexandra Meliou
Abstract: Several research thrusts in the area of data management have focused on understanding how changes in the data affect the output of a view or standing query. Example applications are explaining query results, propagating updates through views, and anonymizing datasets. These applications usually rely on understanding how interventions in a database impact the output of a query. An important aspect of this analysis is the problem of deleting a minimum number of tuples from the input tables to make a given Boolean query false. We refer to this problem as "the resilience of a query" and show its connections to the well-studied problems of deletion propagation and causal responsibility. In this paper, we study the complexity of resilience for self-join-free conjunctive queries, and also make several contributions to previous known results for the problems of deletion propagation with source side-effects and causal responsibility: (1) We define the notion of resilience and provide a complete dichotomy for the class of self-join-free conjunctive queries with arbitrary functional dependencies; this dichotomy also extends and generalizes previous tractability results on deletion propagation with source side-effects. (2) We formalize the connection between resilience and causal responsibility, and show that resilience has a larger class of tractable queries than responsibility. (3) We identify a mistake in a previous dichotomy for the problem of causal responsibility and offer a revised characterization based on new, simpler, and more intuitive notions. (4) Finally, we extend the dichotomy for causal responsibility in two ways: (a) we treat cases where the input tables contain functional dependencies, and (b) we compute responsibility for a set of tuples specified via wildcards.

On the Approximability of Digraph Ordering

Authors: Sreyash Kenkre, Vinayaka Pandit, Manish Purohit, Rishi Saket
Abstract: Given an n-vertex digraph D = (V, A) the Max-k-Ordering problem is to compute a labeling $\ell : V \to [k]$ maximizing the number of forward edges, i.e. edges (u,v) such that $\ell$(u) < $\ell$(v). For different values of k, this reduces to Maximum Acyclic Subgraph (k=n), and Max-Dicut (k=2). This work studies the approximability of Max-k-Ordering and its generalizations, motivated by their applications to job scheduling with soft precedence constraints. We give an LP rounding based 2-approximation algorithm for Max-k-Ordering for any k={2,..., n}, improving on the known 2k/(k-1)-approximation obtained via random assignment. The tightness of this rounding is shown by proving that for any k={2,..., n} and constant $\varepsilon > 0$, Max-k-Ordering has an LP integrality gap of 2 - $\varepsilon$ for $n^{\Omega\left(1/\log\log k\right)}$ rounds of the Sherali-Adams hierarchy.

A further generalization of Max-k-Ordering is the restricted maximum acyclic subgraph problem or RMAS, where each vertex v has a finite set of allowable labels $S_v \subseteq \mathbb{Z}^+$. We prove an LP rounding based $4\sqrt{2}/(\sqrt{2}+1) \approx 2.344$ approximation for it, improving on the $2\sqrt{2} \approx 2.828$ approximation recently given by Grandoni et al. (Information Processing Letters, Vol. 115(2), Pages 182-185, 2015). In fact, our approximation algorithm also works for a general version where the objective counts the edges which go forward by at least a positive offset specific to each edge.

The minimization formulation of digraph ordering is DAG edge deletion or DED(k), which requires deleting the minimum number of edges from an n-vertex directed acyclic graph (DAG) to remove all paths of length k. We show that both, the LP relaxation and a local ratio approach for DED(k) yield k-approximation for any $k\in [n]$.

Approximation Algorithms for Connected Maximum Cut and Related Problems

Authors: MohammadTaghi Hajiaghayi, Guy Kortsarz, Robert MacDavid, Manish Purohit, Kanthi Sarpatwar
Abstract: An instance of the Connected Maximum Cut problem consists of an undirected graph G = (V, E) and the goal is to find a subset of vertices S $\subseteq$ V that maximizes the number of edges in the cut \delta(S) such that the induced graph G[S] is connected. We present the first non-trivial \Omega(1/log n) approximation algorithm for the connected maximum cut problem in general graphs using novel techniques. We then extend our algorithm to an edge weighted case and obtain a poly-logarithmic approximation algorithm. Interestingly, in stark contrast to the classical max-cut problem, we show that the connected maximum cut problem remains NP-hard even on unweighted, planar graphs. On the positive side, we obtain a polynomial time approximation scheme for the connected maximum cut problem on planar graphs and more generally on graphs with bounded genus.

Compressed Manifold Modes: Fast Calculation and Natural Ordering

Authors: Kevin Houston
Abstract: Compressed manifold modes are locally supported analogues of eigenfunctions of the Laplace-Beltrami operator of a manifold. In this paper we describe an algorithm for the calculation of modes for discrete manifolds that, in experiments, requires on average 47% fewer iterations and 44% less time than the previous algorithm. We show how to naturally order the modes in an analogous way to eigenfunctions, that is we define a compressed eigenvalue. Furthermore, in contrast to the previous algorithm we permit unlumped mass matrices for the operator and we show, unlike the case of eigenfunctions, that modes can, in general, be oriented.

I/O-Efficient Similarity Join

Authors: Rasmus Pagh, Ninh Pham, Francesco Silvestri, Morten Stöckel
Abstract: We present an I/O-efficient algorithm for computing similarity joins based on locality-sensitive hashing (LSH). In contrast to the filtering methods commonly suggested our method has provable sub-quadratic dependency on the data size. Further, in contrast to straightforward implementations of known LSH-based algorithms on external memory, our approach is able to take significant advantage of the available internal memory: Whereas the time complexity of classical algorithms includes a factor of $N^\rho$, where $\rho$ is a parameter of the LSH used, the I/O complexity of our algorithm merely includes a factor $(N/M)^\rho$, where $N$ is the data size and $M$ is the size of internal memory. Our algorithm is randomized and outputs the correct result with high probability. It is a simple, recursive, cache-oblivious procedure, and we believe that it will be useful also in other computational settings such as parallel computation.

Improved Purely Additive Fault-Tolerant Spanners

Authors: Davide Bilò, Fabrizio Grandoni, Luciano Gualà, Stefano Leucci, Guido Proietti
Abstract: Let $G$ be an unweighted $n$-node undirected graph. A \emph{$\beta$-additive spanner} of $G$ is a spanning subgraph $H$ of $G$ such that distances in $H$ are stretched at most by an additive term $\beta$ w.r.t. the corresponding distances in $G$. A natural research goal related with spanners is that of designing \emph{sparse} spanners with \emph{low} stretch.

In this paper, we focus on \emph{fault-tolerant} additive spanners, namely additive spanners which are able to preserve their additive stretch even when one edge fails. We are able to improve all known such spanners, in terms of either sparsity or stretch. In particular, we consider the sparsest known spanners with stretch $6$, $28$, and $38$, and reduce the stretch to $4$, $10$, and $14$, respectively (while keeping the same sparsity).

Our results are based on two different constructions. On one hand, we show how to augment (by adding a \emph{small} number of edges) a fault-tolerant additive \emph{sourcewise spanner} (that approximately preserves distances only from a given set of source nodes) into one such spanner that preserves all pairwise distances. On the other hand, we show how to augment some known fault-tolerant additive spanners, based on clustering techniques. This way we decrease the additive stretch without any asymptotic increase in their size. We also obtain improved fault-tolerant additive spanners for the case of one vertex failure, and for the case of $f$ edge failures.

Lexicographic Matroid Optimization

Authors: Asaf Levin, Shmuel Onn
Abstract: We show that finding lexicographically minimal $n$ bases in a matroid can be done in polynomial time in the oracle model. This follows from a more general result that the shifted problem over a matroid can be solved in polynomial time as well.

Approximate Span Programs

Authors: Tsuyoshi Ito, Stacey Jeffery
Abstract: Span programs are a model of computation that have been used to design quantum algorithms, mainly in the query model. For any decision problem, there exists a span program that leads to an algorithm with optimal quantum query complexity, but finding such an algorithm is generally challenging.

We consider new ways of designing quantum algorithms using span programs. We show how any span program that decides a problem $f$ can also be used to decide "property testing" versions of $f$, or more generally, approximate the span program witness size, a property of the input related to $f$. For example, using our techniques, the span program for OR, which can be used to design an optimal algorithm for the OR function, can also be used to design optimal algorithms for: threshold functions, in which we want to decide if the Hamming weight of a string is above a threshold or far below, given the promise that one of these is true; and approximate counting, in which we want to estimate the Hamming weight of the input. We achieve these results by relaxing the requirement that 1-inputs hit some target exactly in the span program, which could make design of span programs easier.

We also give an exposition of span program structure, which increases the understanding of this important model. One implication is alternative algorithms for estimating the witness size when the phase gap of a certain unitary can be lower bounded. We show how to lower bound this phase gap in some cases.

As applications, we give the first upper bounds in the adjacency query model on the quantum time complexity of estimating the effective resistance between $s$ and $t$, $R_{s,t}(G)$, of $\tilde O(\frac{1}{\epsilon^{3/2}}n\sqrt{R_{s,t}(G)})$, and, when $\mu$ is a lower bound on $\lambda_2(G)$, by our phase gap lower bound, we can obtain $\tilde O(\frac{1}{\epsilon}n\sqrt{R_{s,t}(G)/\mu})$, both using $O(\log n)$ space.

Meet the Julians

from bit-player

JuliaCon, the annual gathering of the community focused on the Julia programming language, convened last week at MIT. I hung out there for a couple of days, learned a lot, and had a great time. I want to report back with a few words about the Julia language and a few more about the Julia community.

It’s remarkable that after 70 years of programming language development, we’re still busily exploring all the nooks and crannies of the design space. The last time I looked at the numbers, there were at least 2,500 programming languages, and maybe as many as 8,500. But it seems there’s always room for one more. The slide at left (from a talk by David Beach of Numerica) sums up the case for Julia: We need something easier than C++, faster than Python, freer than Matlab, and newer than Fortran. Needless to say, the consensus at this meeting was that Julia is the answer.

Where does Julia fit in among all those older languages? The surface syntax of Julia code looks vaguely Pythonic, turning its back on the fussy punctuation of the C family. Other telltale traits suggest a heritage in Fortran and Matlab; for example, arrays are indexed starting with 1 rather than 0, and they are stored in column-major order. And there’s a strong suite of tools for working with matrices and other elements of linear algebra, appealing to numericists. Looking a little deeper, some of the most distinctive features of Julia have no strong connection with any of the languages mentioned in Beach’s slide. In particular, Julia relies heavily on generic functions, which came out of the Lisp world. (Roughly speaking, a generic function is a collection of methods, like the methods of an object in Smalltalk, but without the object.)

Perhaps a snippet of code is a better way to describe the language than all these comparisons. Here’s a fibonacci function:

function fib(n)
    a = b = BigInt(1)
    for i in 1:n
        a, b = b, a + b
    end
    return a
end

Note the syntax of the for loop, which is similar to Python’s for i in range(n):, and very different from C’s for (var i=0; i<n; i++). But Julia dispenses with Python’s colons, instead marking the end of a code block with the keyword end. And indentation is strictly for human readers; it doesn’t determine program meaning, as it does in Python.

For a language that emphasizes matrix operations, maybe this version of the fibonacci function would be considered more idiomatic:

function fibmat(n)
    a = BigInt[1 1; 1 0]
    return (a^n)[1, 2]
end

What’s happening here, in case it’s not obvious, is that we’re taking the nth power of the matrix $\begin{bmatrix}1& 1\\1& 0\end{bmatrix}$ and returning the ${(1,2)}$ entry of the product (the power is symmetric, so the two off-diagonal entries agree), which is equal to the nth fibonacci number. The matrix-power version is 25 times faster than the loop version.

@time fib(10000)
elapsed time: 0.007243102 seconds (4859088 bytes allocated)

@time fibmat(10000)
elapsed time: 0.000265076 seconds (43608 bytes allocated)

Julia’s base language has quite a rich assortment of built-in functions, but there are also 600+ registered packages that extend the language in ways large and small, as well as a package manager to automate their installation. The entire Julia ecosystem is open source and managed through GitHub.
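As a sketch of that workflow, using the built-in Pkg API as it existed in the Julia of this era (the package name is just an example):

```julia
Pkg.add("Calculus")   # fetch a registered package and its dependencies
Pkg.update()          # bring all installed packages up to date
using Calculus        # load the package into the current session
```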

When it comes to programming environments, Julia offers something for everybody. You can use a traditional edit-compile-run cycle; there’s a REPL that runs in a terminal window; and there’s a lightweight IDE called Juno. But my favorite is the IPython/Jupyter notebook interface, which works just as smoothly for Julia as it does for Python. (With a cloud service called JuliaBox, you can run Julia in a browser window without installing anything.)

I’ve been following the Julia movement for a couple of years, but last week’s meeting was my first exposure to the community of Julia developers. Immediate impression: Youth! It’s not just that I was the oldest person in the room; I’m used to that. It’s how much older. Keno Fischer is now an undergrad at Harvard, but he was still in high school when he wrote the Julia REPL. Zachary Yedidia, who demoed an amazing package for physics-based simulations and animations, has not yet finished high school. Several other speakers were grad students. Even the suits in attendance—a couple of hedge fund managers whose firm helped fund the event—were in jeans with shirt tails untucked.

Four of the ringleaders of the Julia movement. From left: Stefan Karpinski, Viral B. Shah, Jeff Bezanson, Keno Fischer.

Second observation: These kids are having fun! They have a project they believe in; they’re zealous and enthusiastic; they’re talented enough to build whatever they want and make it work. And the world is paying attention. Everybody gets to be a superhero.

By now we’re well into the second generation of the free software movement, and although the underlying principles haven’t really changed, the vibe is different. Early on, when GNU was getting started, and then Linux, and projects like OpenOffice, the primary goal was access to source code, so that you could know what a program was doing, fix it if it broke, customize it to meet your needs, and take it with you when you moved to new hardware. Within the open-source community, that much is taken for granted now, but serious hackers want more. The game is not just to control your own copy of a program but to earn influence over the direction of the project as a whole. To put it in GitHub terminology, it’s not enough to be able to clone or fork the repo, and thereby get a private copy; you want the owners of the repo to accept your pull requests, and merge your own work into the main branch of development.

GitHub itself may have a lot to do with the emergence of this mode of collective work. It puts everything out in public—not just the code but also discussions among programmers and a detailed record of who did what. And it provides a simple mechanism for anyone to propose an addition or improvement. Earlier open-source projects tended to put a little more friction into the process of becoming a contributor.

In any case, I am fascinated by the social structure of the communities that form around certain GitHub projects. There’s a delicate balance between collaboration (everybody wants to advance the common cause) and competition (everyone wants to move up the list of contributors, ranked by number of commits to the code base). Maintaining that balance is also a delicate task. The health of the enterprise depends on attracting brilliant and creative people, and persuading them to freely contribute their work. But brilliant creative people bring ideas and agendas of their own.

The kind of exuberance I witnessed at JuliaCon last week can’t last forever. That’s sad, but there’s no helping it. One reason we have those 2,500 (or 8,500) programming languages is that it’s a lot more fun to invent a new one than it is to maintain a mature one. Julia is still six tenths of a version number short of 1.0, with lots of new territory to explore; I plan to enjoy it while I can.

Quick notes on a few of the talks at the conference.

Zenna Tavares described Sigma.jl, a Julia package for probabilistic programming—another hot topic I’m trying to catch up with. Probabilistic programming languages try to automate the process of statistical modeling and inference, which means they need to build things like Markov chain Monte Carlo solvers into the infrastructure of the programming language. Tavares’s language also has a SAT solver built in.

Chiyuan Zhang gave us Mocha.jl, a deep-learning/neural-network package inspired by the C++ framework Caffe. Watching the demo, I had the feeling I might actually be able to set up my own multilayer neural network on my kitchen table, but I haven’t put that feeling to the test yet.

Finally, because of parallel sessions I missed the first half of Pontus Stenetorp’s talk on “Suitably Naming a Child with Multiple Nationalities using Julia.” I got there just in time for the big unveiling. I was sure the chosen name would turn out to be “Julia.” But it turns out the top three names for the offspring of Swedish and Japanese parents are:

Stenetorp wants to extend his algorithm to more language pairs. And he also needs to tell his spouse about the results of this work.

by Brian Hayes at July 02, 2015 09:59 PM UTC

Goodbye SIGACT and CRA

Tuesday I served my last day on two organizations, the ACM SIGACT Executive Committee and the CRA Board of Directors.

I spent ten years on the SIGACT (Special Interest Group on Algorithms and Computation Theory) EC, four years as vice-chair, three years as chair and three years as ex-chair, admittedly not so active those last three years. SIGACT is the main US academic organization for theoretical computer science and organizes STOC as its flagship conference. I tried to do big things, managed a few smaller things (ToCT, a few more accepted papers in STOC, poster sessions, workshops, moving Knuth and Distinguished Service to annual awards, an award for best student presentation, a tiered PC), some of them stuck and some of them didn't. Glad to see a new movement to try big changes to meet the main challenge that no conference, including STOC, really brings the theory community together anymore. As Michael Mitzenmacher becomes chair and Paul Beame takes my place as ex-chair, I wish them and SIGACT well moving forward.

The Computing Research Association's main efforts are promoting computing research to industry and government and increasing diversity in computing research. It's a well-run organization and we can thank them particularly for helping improve the funding situation for computing in difficult financial times. The CRA occasionally puts out best practices memos like a recent one recommending quality over quantity for hiring and promotion. Serving on the board, I most enjoyed interacting with computer scientists from across the entire field, instead of just hanging with theorists at the usual conferences and workshops.

One advantage of leaving these committees: I can now kibitz more freely on the theory community and computing in general. Should be fun.

by Lance Fortnow (noreply@blogger.com) at July 02, 2015 11:56 AM UTC

On the minimum dimension of a Hilbert space needed to generate a quantum correlation

Authors: Jamie Sikora, Antonios Varvitsiotis, Zhaohui Wei
Abstract: Consider a two-party correlation that can be generated by performing local measurements on a bipartite quantum system. A question of fundamental importance is to understand how many resources, which we quantify by the dimension of the underlying quantum system, are needed to reproduce this correlation. In this paper, we identify an easy-to-compute lower bound on the smallest Hilbert space dimension needed to generate an arbitrary two-party quantum correlation. To derive the lower bound, we combine a new geometric characterization for the set of quantum correlations (arXiv:1506.07297) with techniques that were recently used to lower bound the PSD-rank of a nonnegative matrix, an important notion in mathematical optimization and quantum communication theory (arXiv:1407.4308). We show that our bound is tight on the correlations generated by optimal quantum strategies for the CHSH and the Magic Square Game and also reprove that a family of PR-boxes cannot be realized using quantum strategies.

An exponential lower bound for homogeneous depth-5 circuits over finite fields

Authors: Mrinal Kumar, Ramprasad Saptharishi
Abstract: In this paper, we show exponential lower bounds for the class of homogeneous depth-$5$ circuits over all small finite fields. More formally, we show that there is an explicit family $\{P_d : d \in \mathbb{N}\}$ of polynomials in $\mathsf{VNP}$, where $P_d$ is of degree $d$ in $n = d^{O(1)}$ variables, such that over all finite fields $\mathbb{F}_q$, any homogeneous depth-$5$ circuit which computes $P_d$ must have size at least $\exp(\Omega_q(\sqrt{d}))$.

To the best of our knowledge, this is the first super-polynomial lower bound for this class for any field $\mathbb{F}_q \neq \mathbb{F}_2$.

Our proof builds up on the ideas developed on the way to proving lower bounds for homogeneous depth-$4$ circuits [GKKS13, FLMS13, KLSS14, KS14] and for non-homogeneous depth-$3$ circuits over finite fields [GK98, GR00]. Our key insight is to look at the space of shifted partial derivatives of a polynomial as a space of functions from $\mathbb{F}_q^n \rightarrow \mathbb{F}_q$ as opposed to looking at them as a space of formal polynomials and builds over a tighter analysis of the lower bound of Kumar and Saraf [KS14].

Private Approximations of the 2nd-Moment Matrix Using Existing Techniques in Linear Regression

Authors: Or Sheffet
Abstract: We introduce three differentially-private algorithms that approximate the 2nd-moment matrix of the data. These algorithms, which in contrast to existing algorithms output positive-definite matrices, correspond to existing techniques in the linear regression literature. Specifically, we discuss the following three techniques. (i) For Ridge Regression, we propose setting the regularization coefficient so that by approximating the solution using the Johnson-Lindenstrauss transform we preserve privacy. (ii) We show that adding a small batch of random samples to our data preserves differential privacy. (iii) We show that sampling the 2nd-moment matrix from a Bayesian posterior inverse-Wishart distribution is differentially private provided the prior is set correctly. We also evaluate our techniques experimentally and compare them to the existing "Analyze Gauss" algorithm of Dwork et al.

On the Communication Complexity of Distributed Clustering

Authors: Qin Zhang
Abstract: In this paper we give a first set of communication lower bounds for distributed clustering problems, in particular, for k-center, k-median and k-means. When the input is distributed across a large number of machines and the number of clusters k is small, our lower bounds match the current best upper bounds up to a logarithmic factor. We have designed a new composition framework in our proofs for multiparty number-in-hand communication complexity which may be of independent interest.

Polytopix news app (for iOS)

from Shiva Kintali

I am very excited to announce that our Polytopix news app (for iOS) is now available on the app store. Check it out.

by kintali at July 01, 2015 08:02 PM UTC

Popularizing TOC

It is hard to overestimate the impact of Popular Science books such as “A Brief History of Time” and “Chaos: Making a New Science” on Scientific Research. The indirect impact of popularizing Science and Scientific Education often surpasses the direct contribution that most scientists can hope to achieve in their lifetime. For this reason, many of the greatest scientists (including in our field) choose to invest considerable time in this blessed endeavor. I personally believe that the Theory of Computing deserves more popularization than it gets (and I hope to someday contribute my share). Nevertheless, this post is meant as a tribute to our colleagues who have already made such wonderful contributions. I will continuously edit this post with TOC popular books and educational resources (based on my own knowledge and suggestions in the comments).

Popular TOC books:

Scott Aaronson, Quantum Computing since Democritus

Martin Davis, Engines of Logic: Mathematicians and the Origin of the Computer

David Harel, Computers Ltd.: What They Really Can’t Do

David Harel with Yishai Feldman, Algorithmics: The Spirit of Computing

Douglas Hofstadter: Gödel, Escher, Bach: An Eternal Golden Braid

Lance Fortnow, The Golden Ticket: P, NP, and the Search for the Impossible

Cristopher Moore and Stephan Mertens, The Nature of Computation

Dennis Shasha and Cathy Lazere, Out of their Minds: The Lives and Discoveries of 15 Great Computer Scientists

Leslie Valiant, Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World

Leslie Valiant, Circuits of the Mind

Noson S. Yanofsky, The Outer Limits of Reason: What Science, Mathematics, and Logic Cannot Tell Us

Hector Zenil, Randomness Through Computation: Some Answers, More Questions

Fiction

Apostolos Doxiadis and Christos Papadimitriou, Logicomix: An epic search for truth

Christos H. Papadimitriou, Turing (A Novel about Computation)

Other Resources:

CS Unplugged (including a book)

by Omer Reingold at July 01, 2015 05:28 PM UTC

TR15-109 | An exponential lower bound for homogeneous depth-5 circuits over finite fields | Mrinal Kumar, Ramprasad Saptharishi

from ECCC papers

In this paper, we show exponential lower bounds for the class of homogeneous depth-$5$ circuits over all small finite fields. More formally, we show that there is an explicit family $\{P_d : d \in \mathbb{N}\}$ of polynomials in $\mathsf{VNP}$, where $P_d$ is of degree $d$ in $n = d^{O(1)}$ variables, such that over all finite fields $\mathbb{F}_q$, any homogeneous depth-$5$ circuit which computes $P_d$ must have size at least $\exp(\Omega_q(\sqrt{d}))$. To the best of our knowledge, this is the first super-polynomial lower bound for this class for any field $\mathbb{F}_q \neq \mathbb{F}_2$. Our proof builds up on the ideas developed on the way to proving lower bounds for homogeneous depth-$4$ circuits [GKKS14, FLMS14, KLSS14, KS14b] and for non-homogeneous depth-$3$ circuits over finite fields [GK98, GR00]. Our key insight is to look at the space of shifted partial derivatives of a polynomial as a space of functions from $\mathbb{F}_q^n \rightarrow \mathbb{F}_q$ as opposed to looking at them as a space of formal polynomials and builds over a tighter analysis of the lower bound of Kumar and Saraf [KS14b].

Christos on the Greek Vote

I was sitting by the lake in Lausanne, winds and waves lapping at the sands, humanity in slim clothes, wine and meat on our plates, when I leaned over and asked Christos Papadimitriou, "Christos, when are you going to write something about the Greek crisis?", and he said in a way we have come to expect from him unfailingly, "I already did". Enjoy (use Google translate)!

by metoo (noreply@blogger.com) at July 01, 2015 08:19 AM UTC

A note on the large spectrum and generalized Riesz products

from tcs math

I wrote a short note entitled Covering the large spectrum and generalized Riesz products that simplifies and generalizes the approach of the first few posts on Chang’s Lemma and Bloom’s variant.

The approximation statement is made in the context of general probability measures on a finite set (though it should extend at least to the compact case with no issues). The algebraic structure only comes into play when the spectral covering statements are deduced (easily) from the general approximation theorem. The proofs are also done in the general setting of finite abelian groups.

Comments are encouraged, especially about references I may have missed.

by James at July 01, 2015 07:13 AM UTC

Quantum query complexity: the other shoe drops

from Scott Aaronson

Two weeks ago I blogged about a breakthrough in query complexity: namely, the refutation by Ambainis et al. of a whole slew of conjectures that had stood for decades (and that I mostly believed, and that had helped draw me into theoretical computer science as a teenager) about the largest possible gaps between various complexity measures for total Boolean functions. Specifically, Ambainis et al. built on a recent example of Göös, Pitassi, and Watson to construct bizarre Boolean functions f with, among other things, near-quadratic gaps between D(f) and R0(f) (where D is deterministic query complexity and R0 is zero-error randomized query complexity), near-1.5th-power gaps between R0(f) and R(f) (where R is bounded-error randomized query complexity), and near-4th-power gaps between D(f) and Q(f) (where Q is bounded-error quantum query complexity). See my previous post for more about the definitions of these concepts and the significance of the results (and note also that Mukhopadhyay and Sanyal independently obtained weaker results).

Because my mental world was in such upheaval, in that earlier post I took pains to point out one thing that Ambainis et al. hadn’t done: namely, they still hadn’t shown any super-quadratic separation between R(f) and Q(f), for any total Boolean function f. (Recall that a total Boolean function, f:{0,1}^n→{0,1}, is one that’s defined for all 2^n possible input strings x∈{0,1}^n. Meanwhile, a partial Boolean function is one where there’s some promise on x: for example, that x encodes a periodic sequence. When you phrase them in the query complexity model, Shor’s algorithm and other quantum algorithms achieving exponential speedups work only for partial functions, not for total ones. Indeed, a famous result of Beals et al. from 1998 says that D(f)=O(Q(f)^6) for all total functions f.)

So, clinging to a slender reed of sanity, I said it “remains at least a plausible conjecture” that, if you insist on a fair comparison—i.e., bounded-error quantum versus bounded-error randomized—then the biggest speedup quantum algorithms can ever give you over classical ones, for total Boolean functions, is the square-root speedup that Grover’s algorithm easily achieves for the n-bit OR function.

Today, I can proudly report that my PhD student, Shalev Ben-David, has refuted that conjecture as well.  Building on the Göös et al. and Ambainis et al. work, but adding a new twist to it, Shalev has constructed a total Boolean function f such that R(f) grows roughly like Q(f)^2.5 (yes, that’s Q(f) to the 2.5th power). Furthermore, if a conjecture that Ambainis and I made in our recent “Forrelation” paper is correct—namely, that a problem called “k-fold Forrelation” has randomized query complexity roughly Ω(n^{1-1/k})—then one would get nearly a cubic gap between R(f) and Q(f).

The reason I found this question so interesting is that it seemed obvious to me that, to produce a super-quadratic separation between R and Q, one would need a fundamentally new kind of quantum algorithm: one that was unlike Simon’s and Shor’s algorithms in that it worked for total functions, but also unlike Grover’s algorithm in that it didn’t hit some impassable barrier at the square root of the classical running time.

Flummoxing my expectations once again, Shalev produced the super-quadratic separation, but not by designing any new quantum algorithm. Instead, he cleverly engineered a Boolean function for which you can use a combination of Grover’s algorithm and the Forrelation algorithm (or any other quantum algorithm that gives a huge speedup for some partial Boolean function—Forrelation is just the maximal example), to get an overall speedup that’s a little more than quadratic, while still keeping your Boolean function total. I’ll let you read Shalev’s short paper for the details, but briefly, it once again uses the Göös et al. / Ambainis et al. trick of defining a Boolean function that equals 1 if and only if the input string contains some hidden substructure, and the hidden substructure also contains a pointer to a “certificate” that lets you quickly verify that the hidden substructure was indeed there. You can use a super-fast algorithm—let’s say, a quantum algorithm designed for partial functions—to find the hidden substructure assuming it’s there. If you don’t find it, you can simply output 0. But if you do find it (or think you found it), then you can use the certificate, together with Grover’s algorithm, to confirm that you weren’t somehow misled, and that the substructure really was there. This checking step ensures that the function remains total.

Are there further separations to be found this way? Almost certainly! Indeed, Shalev, Robin Kothari, and I have already found some more things (as well as different/simpler proofs of known separations), though nothing quite as exciting as the above.

Update (July 1): Ronald de Wolf points out in the comments that this “trust-but-verify” trick, for designing total Boolean functions with unexpectedly low quantum query complexities, was also used in a recent paper by himself and Ambainis (while Ashley Montanaro points out that a similar trick was used even earlier, in a different context, by Le Gall).  What’s surprising, you might say, is that it took as long as it did for people to realize how many applications this trick has.

Update (July 2): In conversation with Robin Kothari and Cedric Lin, I realized that Shalev’s superquadratic separation between R and Q, combined with a recent result of Lin and Lin, resolves another open problem that had bothered me since 2001 or so. Given a Boolean function f, define the “projective quantum query complexity,” or P(f), to be the minimum number of queries made by a bounded-error quantum algorithm, in which the answer register gets immediately measured after each query. This is a model of quantum algorithms that’s powerful enough to capture (for example) Simon’s and Shor’s algorithms, but not Grover’s algorithm. Indeed, one might wonder whether there’s any total Boolean function for which P(f) is asymptotically smaller than R(f)—that’s the question I wondered about around 2001, and that I discussed with Elham Kashefi. Now, by using an argument based on the “Vaidman bomb,” Lin and Lin recently proved the fascinating result that P(f)=O(Q(f)^2) for all functions f, partial or total. But, combining with Shalev’s result that there exists a total f for which R(f)=Ω(Q(f)^2.5), we get that there’s a total f for which R(f)=Ω(P(f)^1.25). In the other direction, the best I know is that P(f)=Ω(bs(f)) and therefore R(f)=O(P(f)^3).

by Scott at July 01, 2015 03:06 AM UTC