# Theory of Computing Blog Aggregator

from Scott Aaronson

With two looming paper deadlines, two rambunctious kids, an undergrad class, program committee work, faculty recruiting, and an imminent trip to Capitol Hill to answer congressional staffers’ questions about quantum computing (and for good measure, to give talks at UMD and Johns Hopkins), the only sensible thing to do is to spend my time writing a blog post.

So: a bunch of people asked for my reaction to the new Nature Communications paper by Daniela Frauchiger and Renato Renner, provocatively titled “Quantum theory cannot consistently describe the use of itself.”  Here’s the abstract:

Quantum theory provides an extremely accurate description of fundamental processes in physics.  It thus seems likely that the theory is applicable beyond the, mostly microscopic, domain in which it has been tested experimentally.  Here, we propose a Gedankenexperiment to investigate the question whether quantum theory can, in principle, have universal validity.  The idea is that, if the answer was yes, it must be possible to employ quantum theory to model complex systems that include agents who are themselves using quantum theory.  Analysing the experiment under this presumption, we find that one agent, upon observing a particular measurement outcome, must conclude that another agent has predicted the opposite outcome with certainty.  The agents’ conclusions, although all derived within quantum theory, are thus inconsistent.  This indicates that quantum theory cannot be extrapolated to complex systems, at least not in a straightforward manner.

I first encountered Frauchiger and Renner’s argument back in July, when Renner (who I’ve known for years) presented it at a summer school in Boulder, CO where I was also lecturing.  I was sufficiently interested (or annoyed?) that I pulled an all-nighter working through the argument, then discussed it at lunch with Renner as well as John Preskill.  I enjoyed figuring out exactly where I get off Frauchiger and Renner’s train—since I do get off their train.  While I found their paper thought-provoking, I reject the contention that there’s any new problem with QM’s logical consistency: for reasons I’ll explain, I think there’s only the same quantum weirdness that (to put it mildly) we’ve known about for quite some time.

In more detail, the paper makes a big deal about how the new argument rests on just three assumptions (briefly, QM works, measurements have definite outcomes, and the “transitivity of knowledge”); and how if you reject the argument, then you must reject at least one of the three assumptions; and how different interpretations (Copenhagen, Many-Worlds, Bohmian mechanics, etc.) make different choices about what to reject.

But I reject an assumption that Frauchiger and Renner never formalize.  That assumption is, basically: “it makes sense to chain together statements that involve superposed agents measuring each other’s brains in different incompatible bases, as if the statements still referred to a world where these measurements weren’t being done.”  I say: in QM, even statements that look “certain” in isolation might really mean something like “if measurement X is performed, then Y will certainly be the outcome.”  The trouble arises when we have multiple such statements, involving different measurements X1, X2, …, and (let’s say) performing X1 destroys the original situation in which we were talking about performing X2.

But I’m getting ahead of myself.  The first thing to understand about Frauchiger and Renner’s argument is that, as they acknowledge, it’s not entirely new.  As Preskill helped me realize, the argument can be understood as simply the “Wigner’s-friendification” of Hardy’s Paradox.  In other words, the new paradox is exactly what you get if you take Hardy’s paradox from 1992, and promote its entangled qubits to the status of conscious observers who are in superpositions over thinking different thoughts.  Having talked to Renner about it, I don’t think he fully endorses the preceding statement.  But since I fully endorse it, let me explain the two ingredients that I think are getting combined here—starting with Hardy’s paradox, which I confess I didn’t know (despite knowing Lucien Hardy himself!) before the Frauchiger-Renner paper forced me to learn it.

Hardy’s paradox involves the two-qubit entangled state

$$\left|\psi\right\rangle = \frac{\left|00\right\rangle + \left|01\right\rangle + \left|10\right\rangle}{\sqrt{3}}.$$

And it involves two agents, Alice and Bob, who measure the left and right qubits respectively, both in the {|+〉,|-〉} basis.  Using the Born rule, we can straightforwardly calculate the probability that Alice and Bob both see the outcome |-〉 as 1/12.

So what’s the paradox?  Well, let me now “prove” to you that Alice and Bob can never both get |-〉.  Looking at |ψ〉, we see that conditioned on Alice’s qubit being in the state |0〉, Bob’s qubit is in the state |+〉, so Bob can never see |-〉.  And conversely, conditioned on Bob’s qubit being in the state |0〉, Alice’s qubit is in the state |+〉, so Alice can never see |-〉.  OK, but since |ψ〉 has no |11〉 component, at least one of the two qubits must be in the state |0〉, so therefore at least one of Alice and Bob must see |+〉!

When it’s spelled out so plainly, the error is apparent.  Namely, what do we even mean by a phrase like “conditioned on Bob’s qubit being in the state |0〉,” unless Bob actually measured his qubit in the {|0〉,|1〉} basis?  But if Bob measured his qubit in the {|0〉,|1〉} basis, then we’d be talking about a different, counterfactual experiment.  In the actual experiment, Bob measures his qubit only in the {|+〉,|-〉} basis, and Alice does likewise.  As Asher Peres put it, “unperformed measurements have no results.”

Anyway, as I said, if you strip away the words and look only at the actual setup, it seems to me that Frauchiger and Renner’s contribution is basically to combine Hardy’s paradox with the earlier Wigner’s friend paradox.  They thereby create something that doesn’t involve counterfactuals quite as obviously as Hardy’s paradox does, and so requires a new discussion.

But to back up: what is Wigner’s friend?  Well, it’s basically just Schrödinger’s cat, except that now it’s no longer a cat being maintained in coherent superposition but a person, and we’re emphatic in demanding that this person be treated as a quantum-mechanical observer.  Thus, suppose Wigner entangles his friend with a qubit, like so:

$$\left|\psi\right\rangle = \frac{\left|0\right\rangle \left|FriendSeeing0\right\rangle + \left|1\right\rangle \left|FriendSeeing1\right\rangle}{\sqrt{2}}.$$

From the friend’s perspective, the qubit has been measured and has collapsed to either |0〉 or |1〉.  From Wigner’s perspective, no such thing has happened—there’s only been unitary evolution—and in principle, Wigner could even confirm that by measuring |ψ〉 in a basis that included |ψ〉 as one of the basis vectors.  But how can they both be right?

Many-Worlders will yawn at this question, since for them, of course “the collapse of the wavefunction” is just an illusion created by the branching worlds, and with sufficiently advanced technology, one observer might experience the illusion even while a nearby observer doesn’t.  Ironically, the neo-Copenhagenists / Quantum Bayesians / whatever they now call themselves, though they consider themselves diametrically opposed to the Many-Worlders (and vice versa), will also yawn at the question, since their whole philosophy is about how physics is observer-relative and it’s sinful even to think about an objective, God-given “quantum state of the universe.”  If, on the other hand, you believed both that

1. collapse is an objective physical event, and
2. human mental states can be superposed just like anything else in the physical universe,

then Wigner’s thought experiment probably should rock your world.

OK, but how do we Wigner’s-friendify Hardy’s paradox?  Simple: in the state

$$\left|\psi\right\rangle = \frac{\left|00\right\rangle + \left|01\right\rangle + \left|10\right\rangle}{\sqrt{3}},$$

we “promote” Alice’s and Bob’s entangled qubits to two conscious observers, call them Charlie and Diane respectively, who can think two different thoughts that we represent by the states |0〉 and |1〉.  Using far-future technology, Charlie and Diane have been not merely placed into coherent superpositions over mental states but also entangled with each other.

Then, as before, Alice will measure Charlie’s brain in the {|+〉,|-〉} basis, and Bob will measure Diane’s brain in the {|+〉,|-〉} basis.  Since the whole setup is mathematically identical to that of Hardy’s paradox, the probability that Alice and Bob both get the outcome |-〉 is again 1/12.

Ah, but now we can reason as follows:

1. Whenever Alice gets the outcome |-〉, she knows that Diane must be in the |1〉 state (since, if Diane were in the |0〉 state, then Alice would’ve certainly seen |+〉).
2. Whenever Diane is in the |1〉 state, she knows that Charlie must be in the |0〉 state (since there’s no |11〉 component).
3. Whenever Charlie is in the |0〉 state, she knows that Diane is in the |+〉 state, and hence Bob can’t possibly see the outcome |-〉 when he measures Diane’s brain in the {|+〉,|-〉} basis.

So to summarize, Alice knows that Diane knows that Charlie knows that Bob can’t possibly see the outcome |-〉.  By the “transitivity of knowledge,” this implies that Alice herself knows that Bob can’t possibly see |-〉.  And yet, as we pointed out before, quantum mechanics predicts that Bob can see |-〉, even when Alice has also seen |-〉.  And Alice and Bob could even do the experiment, and compare notes, and see that their “certain knowledge” was false.  Ergo, “quantum theory can’t consistently describe its own use”!

You might wonder: compared to Hardy’s original paradox, what have we gained by waving a magic wand over our two entangled qubits, and calling them “conscious observers”?  Frauchiger and Renner’s central claim is that, by this gambit, they’ve gotten rid of the illegal counterfactual reasoning that we needed to reach a contradiction in our analysis of Hardy’s paradox.  After all, they say, none of the steps in their argument involve any measurements that aren’t actually performed!  But clearly, even if no one literally measures Charlie in the {|0〉,|1〉} basis, he’s still there, thinking either the thought corresponding to |0〉 or the thought corresponding to |1〉.  And likewise Diane.  Just as much as Alice and Bob, Charlie and Diane both exist even if no one measures them, and they can reason about what they know and what they know that others know.  So then we’re free to chain together the “certainties” of Alice, Bob, Charlie, and Diane in order to produce our contradiction.

As I already indicated, I reject this line of reasoning.  Specifically, I get off the train at what I called step 3 above.  Why?  Because the inference from Charlie being in the |0〉 state to Bob seeing the outcome |+〉 holds for the original state |ψ〉, but in my view it ceases to hold once we know that Alice is going to measure Charlie in the {|+〉,|-〉} basis, which would involve a drastic unitary transformation (specifically, a “Hadamard”) on the quantum state of Charlie’s brain.  I.e., I don’t accept that we can take knowledge inferences that would hold in a hypothetical world where |ψ〉 remained unmeasured, with a particular “branching structure” (as a Many-Worlder might put it), and extend them to the situation where Alice performs a rather violent measurement on |ψ〉 that changes the branching structure by scrambling Charlie’s brain.

In quantum mechanics, measure or measure not: there is no if you hadn’t measured.

Unrelated Announcement: My awesome former PhD student Michael Forbes, who’s now on the faculty at the University of Illinois Urbana-Champaign, asked me to advertise that the UIUC CS department is hiring this year in all areas, emphatically including quantum computing. And, well, I guess my desire to do Michael a solid outweighed my fear of being tried for treason by my own department’s recruiting committee…

Another Unrelated Announcement: As of Sept. 25, 2018, it is the official editorial stance of Shtetl-Optimized that the Riemann Hypothesis and the abc conjecture both remain open problems.

### Assistant Professor at Salem State University (apply by March 31, 2019)

from CCI: jobs

The computer science department at Salem State University is seeking candidates with interest and experience in teaching all levels of computer science courses for a full-time, tenure-track, faculty position.

by shacharlovett at September 25, 2018 12:40 AM UTC

### TR18-166 | An overview of some semantic and syntactic complexity classes | Tayfun Pay, James Cox

from ECCC papers

We review some semantic and syntactic complexity classes that were introduced to better understand the relationship between complexity classes P and NP. We also define several new complexity classes, some of which are associated with Mersenne numbers, and show their location in the complexity hierarchy.

### Preview of the Atiyah Talk

from Richard Lipton

Why the Riemann hypothesis is hard and some other observations.

 ICM 2018 “Matchmaking” source

Michael Atiyah, as we previously posted, claims to have a proof that the Riemann Hypothesis (RH) is true. In the second half of a 9:00–10:30am session (3:00–4:30am Eastern time), he will unveil his claim.

Today we discuss what the RH is and why it is hard.

Of course we hope that it was hard and now is easy. If Atiyah is correct, and we at GLL hope that he is, then the RH becomes an “easy” problem.

Update 9/24, 8am: The website Aperiodical whose post on RH was mentioned below have a Twitter thread on the talk and another with all the slides. Atiyah has released a short paper with the main technical work contained in a second paper, “The Fine Structure Constant.” This Reddit thread has Python code for a short computation relevant to the latter. A typo ${F(b) = 0}$ in the main slide is corrected to ${F(0) = 0}$ in the paper. The analysis rests on an mapping ${T}$ named for John Todd and used by Friedrich Hirzebruch. It is variously represented as a mapping ${T: \mathbb{C} \to \mathbb{C}}$ and as an operator on power series, and in what is evidently the latter form yields the following simple complex function in composition with zeta:

$\displaystyle F(s) = T\{1 + \zeta(s + b)\} - 1.$

The claim is that ${F(2s) = 2F(s)}$ for ${s}$ in the critical strip and then the analyticity of ${F(s)}$ at ${0}$ yields the contradiction of ${\zeta}$ vanishing everywhere. We will defer further comment at this time but there are opinions in the above.

In its original form, the RH concerns the simple zeta function of one complex variable ${s}$:

$\displaystyle \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \cdots$

Leonhard Euler analyzed this function for ${s}$ a positive integer and proved that ${\zeta(2) = \frac{\pi^2}{6}}$. Further values quickly followed for positive even numbers but the nature of ${\zeta(3)}$ remained open until Roger Apéry proved it irrational in 1978.

As a function the sum converges provided the real part of ${s}$ is greater than ${1}$, but for other complex ${s}$ it is definable by analytic continuation. This yields the surprising—perhaps—values ${\zeta(0) = - \frac{1}{2}}$, ${\zeta(-1) = -\frac{1}{12}}$, and ${\zeta(-2) = \zeta(-4) = \zeta(-6) = \cdots = 0}$. The latter are the trivial zeroes of zeta. In complex analysis analytic continuation is a method that extends a function—when possible—to more values. This extension is one reason we believe that the RH is hard. The RH is the statement:

All other zeroes of ${\zeta(s)}$ have real part ${\frac{1}{2}}$, that is, ${s = \frac{1}{2} + i\tau}$ for some ${\tau}$.

That is to say, there are other zeroes besides the trivial ones. The behavior of ${\zeta(s)}$ near those zeroes is not only complex but universally so. The impact of analyzing that behavior was presaged by Leonhard Euler’s discovery

$\displaystyle \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}.$

For intuition, consider how ${\frac{1}{1 - x} = 1 + x + x^2 + \cdots}$ converges for ${x \in [0,1)}$ and picture the product of all these series over ${x = \frac{1}{p}}$ for each prime ${p}$. Every finite term in the product gives the prime factorization of a unique ${n \geq 1}$ and the exponent ${-s}$ merely carries through.

The way in which ${\zeta}$ encodes the “gene sequence of the primes” endures when ${s}$ is complex. There are many equivalent forms of RH, many having to do with tightness of upper or lower bounds for things—a staple of analysis but also what we like to talk about in computational complexity theory. One of our favorites is described in this post:

${\text{RH} \iff (\forall \epsilon > 0)(\exists C)(\forall x)\sum_{k = 1}^x \mu(k) \le Cx^{1/2 + \epsilon}.}$

Here ${\mu(k)}$ is the Möbius function, named for the same August Möbius as the strip, and defined by

$\displaystyle \mu(k) = \begin{cases} 1 & \text{if } k \text{ is a product of an even number of distinct primes}\\ -1 & \text{if } k \text{ is a product of an odd number of distinct primes}\\ 0 & \text{otherwise, i.e., if } k \text{ is not square-free.}\\ \end{cases}$

In this and other ways, the zeroes of ${\zeta}$ govern regularities of the distribution of the primes. One rogue zero off the line causes enough ripples to disturb the ${x^{1/2 + o(1)}}$ bound. But the ripples are not tsunamis and the tight RH bound is demanding a lot: an ${o(x)}$ bound is equivalent both to the Prime Number Theorem (PNT) and to no zero wandering over as far as the line for real part ${1}$.

Even wider ramifications emerge from this essay by Alain Connes. Connes like Atiyah is a Fields Medalist (1982) and among his later work is an extension of the famous index theorem proved by Atiyah with Isadore Singer. Connes’s essay even goes into the wild topic of structures that behave like “fields of characteristic ${1}$“—but maybe not so wild, since Boolean algebra where ${1 \vee 1 = 1}$ is a simple and familiar example.

## Why Hard?

We thought we would try to give some intuition why the RH is hard. Of course like all open problems, its hard because we have not yet proved it. Every open problem is hard. But lets try and give some intuition why it is hard. Enough about “hard.”

Consider any quantity ${A}$ that is defined by a formula of the summation type:

$\displaystyle A = a_{1} + a_{2} + \dots$

Now showing that ${A}$ is not zero is quite difficult in general. Even if the summation is finite, showing it does not sum to ${0}$ is in general a tough one.

Imagine that ${A}$ is also equal to a product type:

$\displaystyle B = b_{1} \times b_{2} \times \dots$

Now showing that ${B}$ is not zero is not so impossible: If the product was a finite one, then you need only show that each term ${b_{k}}$ is not zero. If the product is an infinite product, then its harder. But there is at least hope.

As above, the Riemann function is indeed given by a summation as its definition. But it also is given by the product type formula. This formula is the key to proving even weaker than RH bounds on where the Riemann zeroes are located, such as for PNT.

Back to ${B}$ as a product. Take logarithms naively and see that we can replace studying ${B}$ by

$\displaystyle L = \log(b_{1}) + \log(b_{2}) + \dots$

This can be made to work. But. One must be quite careful when handling the logarithm over the complex numbers. Put simply, the logarithm function is multi-valued. It is only defined up to a multiple of ${2\pi i}$ and so must be handled very carefully.

This is the same reason that

$\displaystyle 1 = e^{2\pi i} = e^{2\cdot 2\pi i} = e^{3 \cdot 2\pi i} = \dots$

is true. Apparently this phenomenon is one reason that many attempts at the RH have failed. At some point the proof compute two quantities say ${Q}$ and ${R}$ and conclude incorrectly that they are equal. But in reality

$\displaystyle Q = R + 2\pi i,$

for example. Of course there is no way that this is a mistake an Atiyah can make, but many others have run into this issue.

It may help also to discuss a product type formula that is classic.

$\displaystyle \sin(\pi z) = \pi z\prod_{n=1}^{\infty} \left( 1 - \frac{z^{2}}{n^{2}} \right).$

Note that this converges for all values of ${z}$. This helps one see that the zeroes of the sine function are exactly as you expected: multiples of ${\pi}$. If there was a product formula like this for the zeta function the RH would be easy. Of course the known formulas for zeta are only convergent for values that have a real part greater than ${1}$. Too bad.

## Some Observations

On the eve of the lecture, we have not found much more information since the news broke early Thursday. There is a more-visual description of RH by Katie Steckles and Paul Taylor, who are attending the Heidelberg Laureate Forum and will be blogging from there. But we have a couple of meta-observations.

One is to compare with how Andrew Wiles’s June 1993 announcement of a proof of Fermat’s Last Theorem (FLT) was handled. Wiles was given time for three long lectures on three days of a meeting at Cambridge University under the generic title, “Elliptic Curves and Galois Representations.” He kept his cards in his hand during the first two lectures as he built the tools for his proof; the rumors of where it was going ramped up mainly before the third.

His proof had already been gone over in depth by colleagues at Princeton, initially Nicholas Katz and with John Conway helping to keep the process leak-free. Nevertheless, the proof contained a subtle error in an estimate that was not found until Katz posed followup questions three months later. Repairing the gap took a year with assistance by Richard Taylor and required a change in the strategy of the proof. We draw the following observations and contrasts:

• We do not know if any of Atiyah’s colleagues at Oxford or elsewhere have read his manuscript. This doesn’t mean anything yet—no one knew about Wiles until the day-of.

• Wiles used an avenue of attack on FLT that had been opened by Ken Ribet and others in 1986 and that already had opened a fruitful web of connections to other areas of analysis and number theory. He proved a newer conjecture that implies FLT. Atiyah’s proof evidently draws on mathematical physics but we have not seen any harbinger or informed speculation of what channels it may use.

• Atiyah has just one 45-minute time slot. This may not be enough time. Wiles’s example shows how a small detail can have wide impact.

• Atiyah gave the Abel Lecture at the International Congress of Mathematicians in Rio de Janeiro just last month. Its title—“The future of mathematical physics: new ideas in old bottles”—may be read as hinting about “new ideas.” However, the talk is almost entirely historical and elementary. We leave it to our audience to read any tea leaves here.

Our final observation—have we said it already?—is that RH is hard. How hard was impressed on Ken by a story he recollects as having been told by Bernard Dwork during a undergraduate seminar at Princeton in 1980 on Norman Levinson’s proof that over one-third of the nontrivial zeroes lie on the “critical line” of real part ${\frac{1}{2}}$. Different versions may be found on the Net but the one Ken recalls went approximately this way:

After giving up on a lifetime of prayers to prove Riemann, an already-famous mathematician turned to the other side for help early on a Monday. The usual price was no object, but the Devil said that because of the unusual subject he could not offer the usual same-day service. The contract was drawn up for delivery by Saturday midnight. Projecting his gratitude, the mathematician arranged a private feast for that day and exchanged his usual rumpled clothes for a tailored suit to match the Devil’s dapper figure. The sun set as he poured his wine and kept a roast pig and fine fare on the burner, but there was no sign of his companion. The clock struck 9, 10, 11, and the minute hand swept round the dial. Suddenly at the first chime a sulfrous blast through a window revealed that the Devil had missed his landing point by a dozen yards. The mathematician opened his door and through it staggered an unshaven frazzled figure, horns askew and parchments akimbo, pleading: “I just need one more lemma…”

The moral of the story is the same as in other versions: there are no brilliant mathematicians Down There.

## Open Problems

Well we will see soon if the RH is still hard or if it is now easy—or on the road to easy. It is rare to have such excellence in our mathematical endeavors. We hope that the talk is sufficiently clear that we will be able to applaud the brilliance of Atiyah. We wish him well.

[Added update at top and made separate section “About RH”; 9:50am changed general “a” in “as” to “2” since 2 is special in the paper; fixed Euler product formula for sines; expanded observations about the Todd function in the intro including using braces in the equation defining ${F(s)}$]

by RJLipton+KWRegan at September 24, 2018 01:54 AM UTC

### A numerical approach for the filtered generalized Cech complex

Authors: Jesus F. Espinoza, Rosalia Hernandez-Amador, Hector A. Hernandez, Beatriz Ramonetti-Valencia
Abstract: In this paper, we present an algorithm to compute the filtered generalized \v{C}ech complex for a finite collection of disks in the plane, which don't necessarily have the same radius.

The key step behind the algorithm is to calculate the minimum scale factor needed to ensure rescaled disks have a nonempty intersection, through a numerical approach, whose convergence is guaranteed by a generalization of the well-known Vietoris-Rips Lemma, which we also prove in an alternative way, using elementary geometric arguments.

We present two applications of our main results. We give an algorithm for computing the 2-dimensional filtered generalized \v{C}ech complex of a finite collection of $d$-dimensional disks in $\mathbb{R}^d$. In addition, we show how the algorithm yields the minimal enclosing ball for a finite set of points in the plane.

### Data-compression for Parametrized Counting Problems on Sparse graphs

Authors: Eun Jung Kim, Maria Serna, Dimitrios M. Thilikos
Abstract: We study the concept of \emph{compactor}, which may be seen as a counting-analogue of kernelization in counting parameterized complexity. For a function $F:\Sigma^*\to \Bbb{N}$ and a parameterization $\kappa: \Sigma^*\to \Bbb{N}$, a compactor $({\sf P},{\sf M})$ consists of a polynomial-time computable function ${\sf P}$, called \emph{condenser}, and a computable function ${\sf M}$, called \emph{extractor}, such that $F={\sf M}\circ {\sf P}$, and the condensing ${\sf P}(x)$ of $x$ has length at most $s(\kappa(x))$, for any input $x\in \Sigma^*.$ If $s$ is a polynomial function, then the compactor is said to be of polynomial-size. Although the study on counting-analogue of kernelization is not unprecedented, it has received little attention so far. We study a family of vertex-certified counting problems on graphs that are MSOL-expressible; that is, for an MSOL-formula $\phi$ with one free set variable to be interpreted as a vertex subset, we want to count all $A\subseteq V(G)$ where $|A|=k$ and $(G,A)\models \phi.$ In this paper, we prove that every vertex-certified counting problems on graphs that is \emph{MSOL-expressible} and \emph{treewidth modulable}, when parameterized by $k$, admits a polynomial-size compactor on $H$-topological-minor-free graphs with condensing time $O(k^2n^2)$ and decoding time $2^{O(k)}.$ This implies the existence of an {\sf FPT}-algorithm of running time $O(n^2k^2)+2^{O(k)}.$ All aforementioned complexities are under the Uniform Cost Measure (UCM) model where numbers can be stored in constant space and arithmetic operations can be done in constant time.

### Constructing Hard Examples for Graph Isomorphism

Authors: Anuj Dawar, Kashif Khan
Abstract: We describe a method for generating graphs that provide difficult examples for practical Graph Isomorphism testers. We first give the theoretical construction, showing that we can have a family of graphs without any non-trivial automorphisms which also have high Weisfeiler-Leman dimension. The construction is based on properties of random 3XOR-formulas. We describe how to convert such a formula into a graph which has the desired properties with high probability. We validate the method by an experimental implementation. We construct random formulas and validate them with a SAT solver to filter through suitable ones, and then convert them into graphs. Experimental results demonstrate that the resulting graphs do provide hard examples that match the hardest known benchmarks for graph isomorphism.

### Distributed coloring of graphs with an optimal number of colors

Authors: Étienne Bamas, Louis Esperet
Abstract: This paper studies sufficient conditions to obtain efficient distributed algorithms coloring graphs optimally (i.e. with the minimum number of colors) in the LOCAL model of computation. Most of the work on distributed vertex coloring so far has focused on coloring graphs of maximum degree $\Delta$ with at most $\Delta+1$ colors (or $\Delta$ colors when some simple obstructions are forbidden). When $\Delta$ is a sufficiently large and $k\ge \Delta-k_\Delta+1$, for some integer $k_\Delta\approx \sqrt{\Delta}-2$, we give a distributed algorithm that given a $k$-colorable graph $G$ of maximum degree $\Delta$, finds a $k$-coloring of $G$ in $\min\{O(\Delta^\lambda\log n), 2^{O(\log \Delta+\sqrt{\log \log n})}\}$ rounds w.h.p., for any $\lambda>0$. The lower bound $\Delta-k_\Delta+1$ is best possible in the sense that for infinitely many values of $\Delta$, we prove that when $\chi(G)\le \Delta -k_\Delta$, finding an optimal coloring of $G$ requires $\Omega(n)$ rounds. Our proof is a light adaptation of a remarkable result of Molloy and Reed, who proved that for $\Delta$ large enough, for any $k\ge \Delta-k_\Delta$ deciding whether $\chi(G)\le k$ is in P, while Embden-Weinert et al. proved that for $k\le \Delta-k_\Delta-1$, the same problem is NP-complete. Note that the sequential and distributed thresholds differ by one.

Our second result covers a larger range of parameters, but gives a weaker bound on the number of colors: For any sufficiently large $\Delta$, and $\Omega(\log \Delta)\le d \le \Delta/100$, we prove that every graph of maximum degree $\Delta$ and clique number at most $\Delta-d$ can be efficiently colored with at most $\Delta-\epsilon d$ colors, for some absolute constant $\epsilon >0$, with a randomized algorithm running w.h.p. in $\min\{O(\log_\Delta n),2^{O(\log \Delta+\sqrt{\log \log n})}\}$ rounds.

### Polyline Drawings with Topological Constraints

Authors: Emilio Di Giacomo, Peter Eades, Giuseppe Liotta, Henk Meijer, Fabrizio Montecchiani
Abstract: Let $G$ be a simple topological graph and let $\Gamma$ be a polyline drawing of $G$. We say that $\Gamma$ \emph{partially preserves the topology} of $G$ if it has the same external boundary, the same rotation system, and the same set of crossings as $G$. Drawing $\Gamma$ fully preserves the topology of $G$ if the planarization of $G$ and the planarization of $\Gamma$ have the same planar embedding. We show that if the set of crossing-free edges of $G$ forms a connected spanning subgraph, then $G$ admits a polyline drawing that partially preserves its topology and that has curve complexity at most three (i.e., at most three bends per edge). If, however, the set of crossing-free edges of $G$ is not a connected spanning subgraph, the curve complexity may be $\Omega(\sqrt{n})$. Concerning drawings that fully preserve the topology, we show that if $G$ has skewness $k$, it admits one such drawing with curve complexity at most $2k$; for skewness-1 graphs, the curve complexity can be reduced to one, which is a tight bound. We also consider optimal $2$-plane graphs and discuss trade-offs between curve complexity and crossing angle resolution of drawings that fully preserve the topology.

### Compressed Sensing with Adversarial Sparse Noise via L1 Regression

Authors: Sushrut Karmalkar, Eric Price
Abstract: We present a simple and effective algorithm for the problem of \emph{sparse robust linear regression}. In this problem, one would like to estimate a sparse vector $w^* \in \mathbb{R}^n$ from linear measurements corrupted by sparse noise that can arbitrarily change an adversarially chosen $\eta$ fraction of measured responses $y$, as well as introduce bounded norm noise to the responses. For Gaussian measurements, we show that a simple algorithm based on L1 regression can successfully estimate $w^*$ for any $\eta < \eta_0 \approx 0.239$, and that this threshold is tight for the algorithm. The number of measurements required by the algorithm is $O(k \log \frac{n}{k})$ for $k$-sparse estimation, which is within constant factors of the number needed without any sparse noise. Of the three properties we show---the ability to estimate sparse, as well as dense, $w^*$; the tolerance of a large constant fraction of outliers; and tolerance of adversarial rather than distributional (e.g., Gaussian) dense noise---to the best of our knowledge, no previous polynomial time algorithm was known to achieve more than two.

### Total Variation Isoperimetric Profiles

Authors: Daryl DeFord, Hugo Lavenant, Zachary Schutzman, Justin Solomon
Abstract: Applications in political redistricting demand quantitative measures of geometric compactness to distinguish between simple and contorted shapes of legislative voting districts. While the isoperimetric quotient, or ratio of area to perimeter squared, is commonly used in practice, it is sensitive to noisy data and irrelevant geographic features like coastline. These issues are addressed in theory by the isoperimetric profile, which plots the minimum perimeter needed to inscribe shapes of different prescribed areas within the boundary of a shape; algorithms for computing this profile, however, are not known in practice. Hence, in this paper, we propose a convex Eulerian relaxation of the isoperimetric profile using total variation. We prove theoretical properties of our relaxation, showing that it still satisfies an isoperimetric inequality and yields a convex function of the prescribed area. Furthermore, we provide a discretization of the problem, an optimization technique, and experiments demonstrating the value of our relaxation.

### Local Computation Algorithms for the Lov\'{a}sz Local Lemma

Authors: Dimitris Achlioptas, Themis Gouleakis, Fotis Iliopoulos
Abstract: We consider the task of designing Local Computation Algorithms (LCA) for applications of the Lov\'{a}sz Local Lemma (LLL). LCA is a class of sublinear algorithms proposed by Rubinfeld et al. that have received a lot of attention in recent years. The LLL is an existential, sufficient condition for a collection of sets to have non-empty intersection (in applications, often, each set comprises all objects having a certain property). The ground-breaking algorithm of Moser and Tardos made the LLL fully constructive, following earlier works by Beck and Alon giving algorithms under significantly stronger LLL-like conditions. LCAs under those stronger conditions were given in the paper of Rubinfeld et al. and later work by Alon et al., where it was asked if the Moser-Tardos algorithm can be used to design LCAs under the standard LLL condition. The main contribution of this paper is to answer this question affirmatively. In fact, our techniques yields LCAs for settings beyond the standard LLL condition.

### On Folding and Twisting (and whatknot): towards a topological view of syntax

Authors: Diego Gabriel Krivochen
Abstract: Syntactic theory has traditionally adopted a constructivist approach, in which a set of atomic elements are manipulated by combinatory operations to yield derived, complex elements. Syntactic structure is thus seen as the result or discrete recursive combinatorics over lexical items which get assembled into phrases, which are themselves combined to form sentences. This view is common to European and American structuralism (e.g., Benveniste, 1971; Hockett, 1958) and different incarnations of generative grammar, transformational and non-transformational (Chomsky, 1956, 1995; and Kaplan & Bresnan, 1982; Gazdar, 1982). Since at least Uriagereka (2002), there has been some attention paid to the fact that syntactic operations must apply somewhere, particularly when copying and movement operations are considered. Contemporary syntactic theory has thus somewhat acknowledged the importance of formalizing aspects of the spaces in which elements are manipulated, but it is still a vastly underexplored area. In this paper we explore the consequences of conceptualizing syntax as a set of topological operations applying over spaces rather than over discrete elements. We argue that there are empirical advantages in such a view for the treatment of long-distance dependencies and cross-derivational dependencies: constraints on possible configurations emerge from the dynamics of the system.

### Lecturer (Tenured Assistant Professor) at Royal Holloway, University of London (apply by October 17, 2018)

from CCI: jobs

The Department of Computer Science at Royal Holloway, University of London, invites applications for a Lecturer position (a full-time and permanent (tenured) post). The call is open to candidates working in all areas of Computer Science, including Algorithms and Complexity.

Website: https://jobs.royalholloway.ac.uk/vacancy.aspx?ref=0918-375

by shacharlovett at September 23, 2018 11:33 PM UTC

### UIUC is looking for quantum people

from Sariel Har-Peled

UIUC is looking for quantum and not necessarily quantum people to hire. See here for details.

by Sariel at September 23, 2018 08:43 PM UTC

### TR18-165 | Resolution and the binary encoding of combinatorial principles | Stefan Dantchev, Nicola Galesi, Barnaby Martin

from ECCC papers

We investigate the size complexity of proofs in $RES(s)$ -- an extension of Resolution working on $s$-DNFs instead of clauses -- for families of contradictions given in the {\em unusual binary} encoding. A motivation of our work is size lower bounds of refutations in Resolution for families of contradictions in the {\em usual unary} encoding. Our main interest is the $k$-Clique Principle, whose Resolution complexity is still unknown. The approach is justified by the observation that for a large class of combinatorial principles (those expressible as $\Pi_2$ first-order formulae) short $RES(\log n)$ refutations for the binary encoding are reducible to short Resolution refutations of the unary encoding. Our main result is a $n^{\Omega(k)}$ lower bound for the size of refutations of the binary $k$-Clique Principle in $RES(\lfloor \frac{1}{2}\log \log n\rfloor)$. This improves the result of Lauria, Pudl\'ak et al. [24] who proved the lower bound for Resolution, that is $\RES(1)$. A lower bound in $RES(\log n)$ for the binary $k$-Clique Principle would prove a lower bound in Resolution for its unary version. Resolution lower bounds for the (unary) $k$-Clique Principle are known only when refutations are either treelike [10] or read-once [4] (regular Resolution). To contrast the proof complexity between the unary and binary encodings of combinatorial principles, we consider the binary (weak) Pigeonhole principle $BinPHP^m_n$ for $m>n$. Our second lower bound proves that in $RES(s)$ for $s\leq \log^{\frac{1}{2-\epsilon}}(n)$, the shortest proofs of the $BinPHP^m_n$, requires size $2^{n^{1-\delta}}$, for any $\delta>0$. By a result of Buss and Pitassi [15] we know that for the (unary, weak) Pigeonhole principle $PHP^m_n$, exponential lower bounds (in the size of $PHP^m_n$) are not possible in Resolution when $m \geq 2^{\sqrt{n\log n}}$ since there is an upper bound of $2^{O(\sqrt{n\log n})}$. Our lower bound for $BinPHP^m_n$, together with the fact short $RES(1)$ refutations for $PHP^m_n$ can be translated into short $RES(\log n)$ proofs for $\BinPHP^m_n$, shows a form of tightness of the upper bound of [15]. Furthermore we prove that $BinPHP^m_n$ can be refuted in size $2^{\Theta(n)}$ in treelike $RES(1)$, contrasting with the unary case, where $PHP^m_n$ requires treelike $RES(1)$ refutations of size $2^{\Omega(n \log n)}$ [9,16]. In order to compare the complexity of refuting binary encodings in Resolution with respect to their unary version, we study under what conditions the complexity of refutations in Resolution will not increase significantly (more than a polynomial factor) when shifting between the unary encoding and the binary encoding. We show that this is true, from unary to binary, for propositional encodings of principles expressible as a $\Pi_2$-formula and involving total variable comparisons. We then show that this is true, from binary to unary, when one considers the {\em functional unary encoding}. In particular, we derive a polynomial upper bound in $RES(1)$ for the binary version $bRLOP$ of a variant of the Linear Ordering principle, $RLOP$, which exponentially separates read-once Resolution from Resolution (see [2]). Finally we prove that the binary encoding of the general Ordering principle $OP$ -- with no total ordering constraints -- is polynomially provable in Resolution. These last results can be interpreted as addressing the property that shifting to the binary encoding is preserving the proof hardness of the corresponding unary encodings when working in Resolution.

### Open Rank Faculty Position in Quantum Computing at University of Illinois, Urbana-Champaign (UIUC) (apply by January 4, 2019)

from CCI: jobs

The CS Department has several tenure-track positions at all ranks. Quantum computing (broadly construed) is one of the priority areas. The large and vibrant College of Engineering which houses CS, Physics, and Electrical Engineering Departments provides excellent opportunities for interdisciplinary research and collaboration. Applications before November 25, 2018 are strongly encouraged.

Email: chekuri@illinois.edu

by shacharlovett at September 22, 2018 03:51 AM UTC

### Video of my ICM2018 lecture: Noise Stability, Noise Sensitivity, and the Quantum Computer Puzzle

from Gil Kalai

The Video of my ICM2018 lecture is now on the air

Here are the (slightly improved) slides.

I made the mistake of trying to improve my slides in the evening before the lecture and in the morning I discovered that the file disappeared (of course, after the lecture it reappeared) and I had to reconstruct it from an earlier version.

And here is the playlist of all plenary lectures:

by Gil Kalai at September 21, 2018 08:41 AM UTC

### The Specter of Simple(r) Proofs

from Richard Lipton

When has a strikingly simple proof come first?

 Cropped from London Times 2017 source

Michael Atiyah is giving a lecture next Monday morning at the Heidelberg Laureate Forum (HLF). It is titled, simply, “The Riemann Hypothesis.” An unsourced image of his abstract says what his title does not: that he is claiming not only a proof of Riemann but a “simple proof using a radically new approach.”

Today we discuss cases where theorems had radically simpler proofs than were first contemplated.

Sir Michael’s talk is at 9:45am Central European time, which is 3:45am Eastern. It will be live-streamed from the HLF site and will appear later on HLF’s YouTube channel. We have not found any other hard information. The titles of all talks are clickable on the program, but abstracts are not yet posted there.

Preceding him from 9:00 to 9:45am, the opening talk of the week’s proceedings, is John Hopcroft whom we know so well. If you’ve heard of the expression “a hard act to follow,” John has the opposite challenge. He is speaking on “An Introduction to AI and Deep Learning.”

Although all speakers are past winners of major prizes—Abel, Fields, Nevanlinna, Turing, and the ACM Prize in Computing—the forum is oriented forward to inspire young researchers. Our very own graduate Michael Wehar received support to attend in 2016, and one of this year’s young attendees is John Urschel, whom we profiled here.

## Simpler Proofs

Naturally, in a blog named for Kurt Gödel we should lead with him as an example. There is an often-overlooked component to the title of his famous paper on incompleteness theorems in logic:

Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I.

The component does not need to be translated. It is the “I” at the end. It does not mean “System 1”—it means that this is Part One of a longer intended paper. Gödel wrote the following at the very end on page 26—OK, now we will translate:

In this work we’ve essentially limited attention to System ${P}$ and have only hinted at how it applies to other systems. The results in full generality will be enunciated and proved in the next installment. That work will also give a fully detailed presentation of Theorem 11, whose proof was only sketched here.

Gödel’s “Part II” was slated to stretch over 100 pages. It never appeared because the essence of Gödel’s argument was perceived and accepted quickly, details notwithstanding. Once Alan Turing’s theory of computation emerged six years later, it became possible to convey the essence in just one page:

1. There are explicitly-defined sets ${D}$ that are not computably enumerable.

2. For any effective formal system, the set ${E}$ of cases ${x}$ such that it proves “${x \in D}$is computably enumerable.

3. Hence ${E}$ cannot equal ${D}$.

4. If ${E}$ contains a case ${y}$ that is not in ${D}$, then the system proves “${y \in D}$” but that is false.

5. Thus if the system is sound, ${E}$ must be a proper subset of ${D}$.

6. Then any ${z \in D \setminus E}$ yields a statement “${z \in D}$” that is true but not provable in the system.

The condition that the system is sound is technically stronger than needed, but Gödel didn’t optimize his conditions either—Barkley Rosser found a simple trick by which it suffices that the system be consistent. The Turing-based rollout can be optimized similarly. For one, we can define ${D}$ in such a way that the falseness of “${y \in D}$” has a finite demonstration, and can specify “effective” so that exceptional cases ${z}$ are computed. For a fully sharp treatment see pages 34 and 35 of these notes by Arun Debray—scribing Ryan Williams—which I am using this term.

The relevant point of comparison is that Gödel’s essence lay unperceived for decades while the leading lights believed in the full mechanizability of mathematics. Had it slept five years more, the lectures by Max Newman that Turing attended in the spring of 1935 might have ended not with Gödel but with more on the effort to build computing machines—which Newman later joined. See how David Hilbert, more than Gödel, animates Andrew Hodges’s description of Turing’s machines on pages 96–107 of his famous biography, and see what Hodges writes in his second paragraph here. The lack of Gödel might have held Alonzo Church back more.

Then Turing would have burst out with a “simple” refutation of Hilbert’s programme via “a radically new approach.” Equating “simple” with “one page” as above is unfair—Turing computability needs development that includes Gödel’s pioneering idea of encoding programs and logical formulas. But the analogy with Atiyah still works insofar as his use of “simple” evidently bases on the following works:

• A 1954 paper on algebraic varieties by Friedrich Hirzebruch;

• Paul Dirac’s seminal 1928 paper, “The Quantum Theory of the Electron”; and

• A 1936 paper by John von Neumann, “On an algebraic generalization of the quantum mechanical formalism (Part I).” (This “Part I” was actually a “Part II” to an earlier paper by von Neumann with two others.)

We have on several occasions noted the appreciation that methods originating in quantum computing have gained wide application in areas of complexity theory that seem wholly apart from quantum. The quantum flow may run even deeper than anyone has thought.

## Other Examples

${\bullet}$ Abel-Ruffini Theorem. Whether fifth-degree polynomial equations have a closed-form solution had been open far longer than Riemann has been. Paolo Ruffini in 1799 circulated a proof manuscript that ran over 500 pages and yet was incomplete. Niels Abel, whom we just mentioned, in 1824 gave a full sketch in only six pages, which were later expanded to twenty. This was still six years before the nature of the impossibility was enduringly expounded by Évariste Galois.

${\bullet}$ Lasker-Noether Theorem. Emanuel Lasker was not only the world chess champion but also a PhD graduate in 1900 of Max Noether, who was Emmy Noether’s father. In 1905, Lasker published a 97-page paper proving the primary decomposition theorem for polynomial ideals. Sixteen years later, Emmy Noether proved a more powerful theorem in a paper of 43 pages. But as Wikipedia’s list of long proofs states,

Lasker’s [proof] has since been simplified: modern proofs are less than a page long.

We invite readers to suggest further favorite examples. This MathOverflow thread lists many cases of finding shorter proofs, but we want to distinguish cases where the later simpler proof ushered in a powerful new theory.

The flip side is when the simpler proof comes first. These are harder to judge especially in regard to whether longer “elementary” methods could have succeeded. The Prime Number Theorem could be regarded as an example in that the original analytic proofs are elegant and came with the great 19th-century tide of using analysis on problems in number theory.

## Open Problems

Of course the main open problem will be whether Sir Michael’s claims are correct. Even if he walks them back to saying just that he has a new approach, its viability will merit further investigation.

Coincidentally, we note today’s—now yesterday’s—Quanta Magazine article on mounting doubt about the correctness of Shinichi Mochizuki’s claimed 500-page proof of the ABC Conjecture. The challenge comes from Peter Scholze of Bonn and Jacob Stix of Frankfurt. Luboš Motl today has coverage of both stories. Scholze, a newly minted Fields Medalist, will also attend in Heidelberg next week.

Updates 9/21: Stories by IFLScience! (which I neglected to link above) and by the New Scientist. The latter has new quotes by Atiyah. John Cook has more details about both Riemann and ABC. A MathOverflow thread has been started.

by KWRegan at September 21, 2018 08:24 AM UTC

### Small Uncolored and Colored Choice Dictionaries

Authors: Torben Hagerup
Abstract: A choice dictionary can be initialized with a parameter $n\in\mathbb{N}$ and subsequently maintains an initially empty subset $S$ of $\{1,\ldots,n\}$ under insertion, deletion, membership queries and an operation $\textit{choice}$ that returns an arbitrary element of $S$. The choice dictionary is fundamental in space-efficient computing and has numerous applications. The best previous choice dictionary can be initialized with $n$ and $t\in\mathbb{N}$ and subsequently executes all operations in $O(t)$ time and occupies $n+O(n({t/w})^t+\log n)$ bits on a word RAM with a word length of $w=\Omega(\log n)$ bits. We describe a new choice dictionary that executes all operations in constant time and, in addition to the space needed to store the integer $n$, occupies only $n+1$ bits, which is shown to be optimal if $w=o(n)$.

A generalization of the choice dictionary called a colored choice dictionary is initialized with $c\in\mathbb{N}$ in addition to $n$ and subsequently maintains a semipartition $(S_0,\ldots,S_{c-1})$ of $\{1,\ldots,n\}$ under the operations $\textit{setcolor}(j,\ell)$, which moves $\ell$ from its current subset to $S_j$, $\textit{color}(\ell)$, which returns the unique $j\in\{0,\ldots,c-1\}$ with $\ell\in S_j$, and $\textit{choice}(j)$, which returns an arbitrary element of $S_j$. We describe new colored choice dictionaries that, if initialized with constant $c$, execute $\textit{setcolor}$, $\textit{color}$ and $\textit{choice}$ in constant time and occupy $n\log_2\!c+1$ bits plus the space needed to store $n$ if $c$ is a power of 2, and at most $n\log_2\!c+n^\epsilon$ bits in general, for arbitrary fixed $\epsilon>0$. We also study the possibility of iterating over the set $S$ or over $S_j$ for given $j\in\{0,\ldots,c-1\}$ and an application of this to breadth-first search.

### Sparsified SGD with Memory

Authors: Sebastian U. Stich, Jean-Baptiste Cordonnier, Martin Jaggi
Abstract: Huge scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e. algorithms that leverage the compute power of many devices for training. The communication overhead is a key bottleneck that hinders perfect scalability. Various recent works proposed to use quantization or sparsification techniques to reduce the amount of data that needs to be communicated, for instance by only sending the most significant entries of the stochastic gradient (top-k sparsification). Whilst such schemes showed very promising performance in practice, they have eluded theoretical analysis so far.

In this work we analyze Stochastic Gradient Descent (SGD) with k-sparsification or compression (for instance top-k or random-k) and show that this scheme converges at the same rate as vanilla SGD when equipped with error compensation (keeping track of accumulated errors in memory). That is, communication can be reduced by a factor of the dimension of the problem (sometimes even more) whilst still converging at the same rate. We present numerical experiments to illustrate the theoretical findings and the better scalability for distributed applications.

### $L_1$ Shortest Path Queries in Simple Polygons

Authors: Sang Won Bae, Haitao Wang
Abstract: Let $P$ be a simple polygon of $n$ vertices. We consider two-point $L_1$ shortest path queries in $P$. We build a data structure of $O(n)$ size in $O(n)$ time such that given any two query points $s$ and $t$, the length of an $L_1$ shortest path from $s$ to $t$ in $P$ can be computed in $O(\log n)$ time, or in $O(1)$ time if both $s$ and $t$ are vertices of $P$, and an actual shortest path can be output in additional linear time in the number of edges of the path. To achieve the result, we propose a mountain decomposition of simple polygons, which may be interesting in its own right. Most importantly, our approach is much simpler than the previous work on this problem.

### Local Density Estimation in High Dimensions

Authors: Xian Wu, Moses Charikar, Vishnu Natchu
Abstract: An important question that arises in the study of high dimensional vector representations learned from data is: given a set $\mathcal{D}$ of vectors and a query $q$, estimate the number of points within a specified distance threshold of $q$. We develop two estimators, LSH Count and Multi-Probe Count that use locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via importance sampling. A key innovation is the ability to maintain a small number of hash tables via preprocessing data structures and algorithms that sample from multiple buckets in each hash table. We give bounds on the space requirements and sample complexity of our schemes, and demonstrate their effectiveness in experiments on a standard word embedding dataset.

### Sub-Gaussian Mean Estimation in Polynomial Time

Authors: Samuel B. Hopkins
Abstract: We study polynomial time algorithms for estimating the mean of a random vector $X$ in $\mathbb{R}^d$ from $n$ independent samples $X_1,\ldots,X_n$ when $X$ may be heavy-tailed. We assume only that $X$ has finite mean $\mu$ and covariance $\Sigma$. In this setting, the radius of confidence intervals achieved by the empirical mean are large compared to the case that $X$ is Gaussian or sub-Gaussian. In particular, for confidence $\delta > 0$, the empirical mean has confidence intervals with radius of order $\sqrt{\text{Tr} \Sigma / \delta n}$ rather than $\sqrt{\text{Tr} \Sigma /n } + \sqrt{ \lambda_{\max}(\Sigma) \log (1/\delta) / n}$ from the Gaussian case.

We offer the first polynomial time algorithm to estimate the mean with sub-Gaussian confidence intervals under such mild assumptions. Our algorithm is based on a new semidefinite programming relaxation of a high-dimensional median. Previous estimators which assumed only existence of $O(1)$ moments of $X$ either sacrifice sub-Gaussian performance or are only known to be computable via brute-force search procedures requiring $\exp(d)$ time.

Authors: Travis Gagie, Garance Gourdel, Gonzalo Navarro, Jared Simpson
Abstract: The advent of high-throughput sequencing has resulted in massive genomic datasets, some consisting of assembled genomes but others consisting of raw reads. We consider how to reduce the amount of space needed to index a set of reads, in particular how to reduce the number of runs in the Burrows-Wheeler Transform (BWT) that is the basis of FM-indexing. The best current fully-functional index for repetitive collections (Gagie et al., SODA 2018) uses space proportional to this number.

### One or more PhD Stipends in Internet of Things (IoT) connectivity beyond 5G wireless systems (8-18070 at Department of Electronic Systems, Aalborg University (apply by October 15, 2018)

from CCI: jobs

At the Technical Faculty of IT and Design, Department of Electronic Systems, one ore more PhD stipends in Internet of Things (IoT) connectivity beyond 5G wireless systems are available within the general study programme Wireless Communications. The stipends are open for appointment from December 1 2018, or as soon as possible thereafter.

Website: https://www.stillinger.aau.dk/vis-stilling/?vacancy=1005037

by shacharlovett at September 20, 2018 10:19 AM UTC

### The Multi-Round Sequential Selection Problem

Authors: Mathilde Fekom, Argyris Kalogeratos, Nicolas Vayatis
Abstract: In the Sequential Selection Problem (SSP), immediate and irrevocable decisions need to be made while candidates from a finite set are being examined one-by-one. The goal is to assign a limited number of $b$ available jobs to the best possible candidates. Standard SSP variants begin with an empty selection set (cold-starting) and perform the selection process once (single-round), over a single candidate set. In this paper we introduce the Multi-round Sequential Selection Problem (MSSP) which launches a new round of sequential selection each time a new set of candidates becomes available. Each new round has at hand the output of the previous one, i.e. its $b$ selected employees, and tries to update optimally that selection by reassigning each job at most once. Our setting allows changes to take place between two subsequent selection rounds: resignations of previously selected subjects or/and alterations of the quality score across the population. The challenge for a selection strategy is thus to efficiently adapt to such changes. For this novel problem we adopt a cutoff-based approach, where a precise number of candidates should be rejected first before starting to select. We set a rank-based objective of the process over the final job-to-employee assignment and we investigate analytically the optimal cutoff values with respect to the important parameters of the problem. Finally, we present experimental results that compare the efficiency of different selection strategies, as well as their convergence rates towards the optimal solution in the case of stationary score distributions.

### A Physical Perspective on Control Points and Polar Forms: B\'ezier Curves, Angular Momentum and Harmonic Oscillators

Authors: Márton Vaitkus
Abstract: Bernstein polynomials and B\'ezier curves play an important role in computer-aided geometric design and numerical analysis, and their study relates to mathematical fields such as abstract algebra, algebraic geometry and probability theory. We describe a theoretical framework that incorporates the different aspects of the Bernstein-B\'ezier theory, based on concepts from theoretical physics. We relate B\'ezier curves to the theory of angular momentum in both classical and quantum mechanics, and describe physical analogues of various properties of B\'ezier curves -- such as their connection with polar forms -- in the context of quantum spin systems. This previously unexplored relationship between geometric design and theoretical physics is established using the mathematical theory of Hamiltonian mechanics and geometric quantization. An alternative description of spin systems in terms of harmonic oscillators serves as a physical analogue of P\'olya's urn models for B\'ezier curves. We relate harmonic oscillators to Poisson curves and the analytical blossom as well. We present an overview of the relevant mathematical and physical concepts, and discuss opportunities for further research.

### Covering and Packing of Rectilinear Subdivision

Authors: Satyabrata Jana, Supantha Pandit
Abstract: We study a class of geometric covering and packing problems for bounded regions on the plane. We are given a set of axis-parallel line segments that induces a planar subdivision with a set of bounded (rectilinear) faces. We are interested in the following problems. (P1) Stabbing-Subdivision: Stab all bounded faces by selecting a minimum number of points in the plane. (P2) Independent-Subdivision: Select a maximum size collection of pairwise non-intersecting bounded faces. (P3) Dominating-Subdivision: Select a minimum size collection of faces such that any other face has a non-empty intersection (i.e., sharing an edge or a vertex) with some selected faces. We show that these problems are NP-hard. We even prove that these problems are NP-hard when we concentrate only on the rectangular faces of the subdivision. Further, we provide constant factor approximation algorithms for the Stabbing-Subdivision problem.

### Improved Bounds for the Traveling Salesman Problem with Neighborhoods on Uniform Disks

Authors: Ioana O. Bercea
Abstract: Given a set of $n$ disks of radius $R$ in the Euclidean plane, the Traveling Salesman Problem With Neighborhoods (TSPN) on uniform disks asks for the shortest tour that visits all of the disks. The problem is a generalization of the classical Traveling Salesman Problem(TSP) on points and has been widely studied in the literature. For the case of disjoint uniform disks of radius $R$, Dumitrescu and Mitchell[2001] show that the optimal TSP tour on the centers of the disks is a $3.547$-approximation to the TSPN version. The core of their analysis is based on bounding the detour that the optimal TSPN tour has to make in order to visit the centers of each disk and shows that it is at most $2Rn$ in the worst case. H\"{a}me, Hyyti\"{a} and Hakula[2011] asked whether this bound is tight when $R$ is small and conjectured that it is at most $\sqrt{3}Rn$.

We further investigate this question and derive structural properties of the optimal TSPN tour to describe the cases in which the bound is smaller than $2Rn$. Specifically, we show that if the optimal TSPN tour is not a straight line, at least one of the following is guaranteed to be true: the bound is smaller than $1.999Rn$ or the $TSP$ on the centers is a $2$-approximation. The latter bound of $2$ is the best that we can get in general. Our framework is based on using the optimality of the TSPN tour to identify local structures for which the detour is large and then using their geometry to derive better lower bounds on the length of the TSPN tour. This leads to an improved approximation factor of $3.53$ for disjoint uniform disks and $6.728$ for the general case. We further show that the H\"{a}me, Hyyti\"{a} and Hakula conjecture is true for the case of three disks and discuss the method used to obtain it. Along the way, we will uncover one way in which the TSPN problem is structurally different from the classical TSP.

### Twisty Takens: A Geometric Characterization of Good Observations on Dense Trajectories

Authors: Boyan Xu, Christopher J. Tralie, Alice Antia, Michael Lin, Jose A. Perea
Abstract: In nonlinear time series analysis and dynamical systems theory, Takens' embedding theorem states that the sliding window embedding of a generic observation along finite, uniformly sampled trajectories in a state space, recovers a sampling of the region traversed by the dynamics. This can be used, for instance, to show that sliding window embeddings of periodic signals recover topological loops, and that sliding window embeddings of quasiperiodic signals recover high-dimensional torii. However, in spite of these motivating examples, Takens' theorem does not in general prescribe how to choose such an observation function given particular dynamics in a state space. In this work, we state conditions on observation functions defined on compact Riemannian manifolds, that lead to successful reconstructions for particular dynamics. We apply our theory and construct families of time series whose sliding window embeddings trace tori, Klein bottles, spheres, and projective planes. This greatly enriches the set of examples of time series known to concentrate on various shapes via sliding window embeddings. We also present numerical experiments showing that we can recover low dimensional representations of the underlying dynamics on state space, by using the persistent cohomology of sliding window embeddings and Eilenberg-MacLane (i.e., circular and real projective) coordinates.

### Data-Driven Clustering via Parameterized Lloyd's Families

Authors: Maria-Florina Balcan, Travis Dick, Colin White
Abstract: Algorithms for clustering points in metric spaces is a long-studied area of research. Clustering has seen a multitude of work both theoretically, in understanding the approximation guarantees possible for many objective functions such as k-median and k-means clustering, and experimentally, in finding the fastest algorithms and seeding procedures for Lloyd's algorithm. The performance of a given clustering algorithm depends on the specific application at hand, and this may not be known up front. For example, a "typical instance" may vary depending on the application, and different clustering heuristics perform differently depending on the instance.

In this paper, we define an infinite family of algorithms generalizing Lloyd's algorithm, with one parameter controlling the the initialization procedure, and another parameter controlling the local search procedure. This family of algorithms includes the celebrated k-means++ algorithm, as well as the classic farthest-first traversal algorithm. We design efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and learn a near-optimal clustering algorithm from the class. We show the best parameters vary significantly across datasets such as MNIST, CIFAR, and mixtures of Gaussians. Our learned algorithms never perform worse than k-means++, and on some datasets we see significant improvements.

### Sublinear Time Low-Rank Approximation of Distance Matrices

Authors: Ainesh Bakshi, David P. Woodruff
Abstract: Let $\mathbf{P}=\{ p_1, p_2, \ldots p_n \}$ and $\mathbf{Q} = \{ q_1, q_2 \ldots q_m \}$ be two point sets in an arbitrary metric space. Let $\mathbf{A}$ represent the $m\times n$ pairwise distance matrix with $\mathbf{A}_{i,j} = d(p_i, q_j)$. Such distance matrices are commonly computed in software packages and have applications to learning image manifolds, handwriting recognition, and multi-dimensional unfolding, among other things. In an attempt to reduce their description size, we study low rank approximation of such matrices. Our main result is to show that for any underlying distance metric $d$, it is possible to achieve an additive error low-rank approximation in sublinear time. We note that it is provably impossible to achieve such a guarantee in sublinear time for arbitrary matrices $\mathbf{A}$, and consequently our proof exploits special properties of distance matrices. We develop a recursive algorithm based on additive projection-cost preserving sampling. We then show that in general, relative error approximation in sublinear time is impossible for distance matrices, even if one allows for bicriteria solutions. Additionally, we show that if $\mathbf{P} = \mathbf{Q}$ and $d$ is the squared Euclidean distance, which is not a metric but rather the square of a metric, then a relative error bicriteria solution can be found in sublinear time.

### PAIM: Platoon-based Autonomous Intersection Management

Authors: Masoud Bashiri, Hassan Jafarzadeh, Cody Fleming
Abstract: With the emergence of autonomous ground vehicles and the recent advancements in Intelligent Transportation Systems, Autonomous Traffic Management has garnered more and more attention. Autonomous Intersection Management (AIM), also known as Cooperative Intersection Management (CIM) is among the more challenging traffic problems that poses important questions related to safety and optimization in terms of delays, fuel consumption, emissions and reliability. Previously we introduced two stop-sign based policies for autonomous intersection management that were compatible with platoons of autonomous vehicles. These policies outperformed regular stop-sign policy both in terms of average delay per vehicle and variance in delay. This paper introduces a reservation-based policy that utilizes the cost functions from our previous work to derive optimal schedules for platoons of vehicles. The proposed policy guarantees safety by not allowing vehicles with conflicting turning movement to be in the conflict zone at the same time. Moreover, a greedy algorithm is designed to search through all possible schedules to pick the best that minimizes a cost function based on a trade-off between total delay and variance in delay. A simulator software is designed to compare the results of the proposed policy in terms of average delay per vehicle and variance in delay with that of a 4-phase traffic light.

### Testing SensoGraph, a geometric approach for fast sensory evaluation

Authors: David Orden, Encarnación Fernández-Fernández, José M. Rodríguez-Nogales, Josefina Vila-Crespo
Abstract: This paper introduces SensoGraph, a novel approach for fast sensory evaluation using two-dimensional geometric techniques. In the tasting sessions, the assessors follow their own criteria to place samples on a tablecloth, according to the similarity between samples. In order to analyse the data collected, first a geometric clustering is performed to each tablecloth, extracting connections between the samples. Then, these connections are used to construct a global similarity matrix. Finally, a graph drawing algorithm is used to obtain a 2D consensus graphic, which reflects the global opinion of the panel by (1) positioning closer those samples that have been globally perceived as similar and (2) showing the strength of the connections between samples. The proposal is validated by performing four tasting sessions, with three types of panels tasting different wines, and by developing a new software to implement the proposed techniques. The results obtained show that the graphics provide similar positionings of the samples as the consensus maps obtained by multiple factor analysis (MFA), further providing extra information about connections between samples, not present in any previous method. The main conclusion is that the use of geometric techniques provides information complementary to MFA, and of a different type. Finally, the method proposed is computationally able to manage a significantly larger number of assessors than MFA, which can be useful for the comparison of pictures by a huge number of consumers, via the Internet.

### Computing Wasserstein Distance for Persistence Diagrams on a Quantum Computer

Authors: Jesse J. Berwald, Joel M. Gottlieb, Elizabeth Munch
Abstract: Persistence diagrams are a useful tool from topological data analysis which can be used to provide a concise description of a filtered topological space. What makes them even more useful in practice is that they come with a notion of a metric, the Wasserstein distance (closely related to but not the same as the homonymous metric from probability theory). Further, this metric provides a notion of stability; that is, small noise in the input causes at worst small differences in the output. In this paper, we show that the Wasserstein distance for persistence diagrams can be computed through quantum annealing. We provide a formulation of the problem as a Quadratic Unconstrained Binary Optimization problem, or QUBO, and prove correctness. Finally, we test our algorithm, exploring parameter choices and problem size capabilities, using a D-Wave 2000Q quantum annealing computer.

from Scott Aaronson

Merry Yom Kippur!

This is my annual post where I tell you about opportunities available at UT Austin, which has long been a safe space for CS research, and which we hope will rapidly become (or return to its historical role as…) a safe space for quantum computing and information.

If you’re interested in faculty positions in computer science at UT, I have some great news: we plan to do a lot of hiring this year!  Because of the sheer volume of interviews we’ll be doing, we’d like to start our recruiting season already in the fall.  So we’re extending an unusual invitation: if you already have your materials ready, we encourage you to apply for faculty positions right now.  If you’re chosen for an interview, we could schedule it for the next few months.

We’ll be looking for great candidates across all parts of CS, but one particular interest is hiring another quantum computing theorist in CS (i.e., besides me), most likely a junior person.  While not everyone who reads this blog is a plausible candidate, and not every plausible candidate reads this blog, the intersection is surely non-negligible!  So again: we encourage you to apply right now, so we can start scheduling interviews already.

I’m also on the lookout for postdocs, mainly in theoretical quantum computing and information.  (I, and others in the theory group, are also collectively interested in postdocs in classical computational complexity.)  If you’re interested in doing a postdoc with me starting in Fall 2019, the procedure, like in previous years, is this:

• Email me introducing yourself (if I don’t already know you), and include your CV and up to three representative papers.  Do this even if you already emailed me before.
• Arrange for two recommendation letters to be emailed to me.

We’ll set a deadline for this of December 15.

Finally, if you’re interested in pursuing a PhD in CS at UT, please apply here!  The deadline, again, is December 15.  Just like every year, I’m on the lookout for superb, complexity-loving, quantum- or quantum-curious, lower-bound-hungry students of every background, and if you specify that you want to work with me, I’ll be sure to see your application.  Emailing me won’t help: everything is done through the application process.

As we like to say down here in Texas, hook ’em Hadamards!  (Well OK, no, we don’t especially like to say that.  It’s just a slogan that I found amusing a few years ago.)

### Finding k-Dissimilar Paths with Minimum Collective Length

Authors: Theodoros Chondrogiannis, Panagiotis Bouros, Johann Gamper, Ulf Leser, David B. Blumenthal
Abstract: A classical multi-agent fence patrolling problem asks: What is the maximum length $L$ of a line that $k$ agents with maximum speeds $v_1,\ldots,v_k$ can patrol if each point on the line needs to be visited at least once every unit of time. It is easy to see that $L = \alpha \sum_{i=1}^k v_i$ for some efficiency $\alpha \in [\frac{1}{2},1)$. After a series of works giving better and better efficiencies, it was conjectured that the best possible efficiency approaches $\frac{2}{3}$. No upper bounds on the efficiency below $1$ were known. We prove the first such upper bounds and tightly bound the optimal efficiency in terms of the minimum ratio of speeds $s = \frac{v_{\max}}{v_{\min}}$ and the number of agents $k$. Our bounds of $\alpha \leq \frac{1}{1 + \frac{1}{s}}$ and $\alpha \leq 1 - \frac{1}{2\sqrt{k}}$ imply that in order to achieve efficiency $1 - \epsilon$, at least $k \geq \Omega(\epsilon^{-2})$ agents with a speed ratio of $s \geq \Omega(\epsilon^{-1})$ are necessary. Guided by our upper bounds, we construct a scheme whose efficiency approaches $1$, disproving the conjecture of Kawamura and Soejima. Our scheme asymptotically matches our upper bounds in terms of the maximal speed difference and the number of agents used, proving them to be asymptotically tight.