# Theory of Computing Blog Aggregator

I just received the Cornell Math Matters, dedicated to the memory of  Eugene Dynkin who passed away on November 14 at the age of 90. In my freshman year at Cornell, Dynkin recruited me into his undergraduate research seminar building on the success he had with a similar seminar he ran when in Russia. I didn't last long, making the conscious choice not to work on research as an undergrad but to focus on enjoying the college experience. I missed out on a great opportunity but I don't regret that decision.

Reluctantly I wouldn't give that advice to today's undergrads. Getting into a good graduate program has become much more competitive and even a small amount of research experience may make a large difference in your application. I encourage any undergrad who may consider a PhD in their future to talk to some professors and get started in a research program. But don't let it run your life, make sure you enjoy your time at college. You'll have plenty of time to spend every waking moment on research once you start graduate school.

by Lance Fortnow (noreply@blogger.com) at December 22, 2014 10:33 PM UTC

### TR14-182 | Direct Product Testing With Nearly Identical Sets | Dana Moshkovitz

from ECCC papers

In this work we analyze a direct product test in which each of two provers receives a subset of size n of a ground set U, and the two subsets intersect in about (1-\delta)n elements. We show that if each of the provers provides labels to the n elements it received, and the labels of the two provers agree in the intersection between the subsets with non-negligible probability, then the answers of the provers must correspond to a certain global assignment to the elements of U. While previous results only worked for intersection of size at most n/2, in our model the questions and expected answers of the two provers are nearly identical. This is related to a recent construction of a unique games instance (ECCC TR14-142) where this setup arises at the outer verifier'' level. Our main tool is a hypercontractive bound on the Bernoulli-Laplace model (aka a slice of the Boolean hypercube), from which we can deduce a small set expansion''-type lemma. We then use ideas from a recent work of the author about fortification'' to reduce the case of large intersection to the already studied case of smaller intersection.

### TR14-181 | The space &quot;just above&quot; BQP | Adam Bouland, Scott Aaronson, Joseph Fitzsimons, Mitchell Lee

from ECCC papers

We explore the space "just above" BQP by defining a complexity class PDQP (Product Dynamical Quantum Polynomial time) which is larger than BQP but does not contain NP relative to an oracle. The class is defined by imagining that quantum computers can perform measurements that do not collapse the wavefunction. This (non-physical) model of computation can efficiently solve problems such as Graph Isomorphism and Approximate Shortest Vector which are believed to be intractable for quantum computers. Furthermore, it can search an unstructured N-element list in $\tilde O(N^{1/3})$ time, but no faster than $\Omega(N^{1/4})$, and hence cannot solve NP-hard problems in a black box manner. In short, this model of computation is more powerful than standard quantum computation, but only slightly so. Our work is inspired by previous work of Aaronson on the power of sampling the histories of hidden variables. However Aaronson's work contains an error in its proof of the lower bound for search, and hence it is unclear whether or not his model allows for search in logarithmic time. Our work can be viewed as a conceptual simplification of Aaronson's approach, with a provable polynomial lower bound for search.

### TR14-180 | Space-Efficient Approximations for Subset Sum | Anna Gal, Jing-Tang Jang, Nutan Limaye, Meena Mahajan, Karteek Sreenivasaiah

from ECCC papers

SUBSET SUM is a well known NP-complete problem: given $t \in Z^{+}$ and a set $S$ of $m$ positive integers, output YES if and only if there is a subset $S^\prime \subseteq S$ such that the sum of all numbers in $S^\prime$ equals $t$. The problem and its search and optimization versions are known to be solvable in pseudo-polynomial time in general. We develop a 1-pass deterministic streaming algorithm that uses space $O\left(\frac{\log t}{\epsilon}\right)$ and decides if some subset of the input stream adds up to a value in the range $\{(1\pm\epsilon)t\}$. Using this algorithm, we design space efficient Fully Polynomial-Time Approximation Schemes (FPTAS) solving the search and optimization versions of SUBSET SUM. Our algorithms run in $O(\frac{1}{\epsilon}m^2)$ time and $O(\frac{1}{\epsilon})$ space on unit cost RAMs, where $1+\epsilon$ is the approximation factor. This implies constant space quadratic time FPTAS on unit cost RAMs when $\epsilon$ is a constant. Previous FPTAS used space linear in $m$. In addition, we show that on certain inputs, when a solution is located within a short prefix of the input sequence, our algorithms may run in sublinear time. We apply our techniques to the problem of finding balanced separators, and we extend our results to some other variants of the more general knapsack problem. When the input numbers are encoded in unary, the decision version has been known to be in log space. We give streaming space lower and upper bounds for unary SUBSET SUM. If the input length is $N$ when the numbers are encoded in unary, we show that randomized $s$-pass streaming algorithms for exact SUBSET SUM need space $\Omega(\frac{\sqrt{N}}{s})$, and give a simple deterministic two pass streaming algorithm using $O(\sqrt{N} \log N)$ space. Finally, we formulate an encoding under which unary SUBSET SUM is monotone, and show that the exact and approximate versions in this formulation have monotone $O(\log^2 t)$ depth Boolean circuits. We also show that any circuit using $\varepsilon$-approximator gates for Subset Sum under this encoding needs $\Omega(n/\log n)$ gates to compute the Disjointness function.

### Guest post by Sasho Nikolov: Beating Monte Carlo

from Sébastian Bubeck

If you work long enough in any mathematical science, at some point you will need to estimate an integral that does not have a simple closed form. Maybe your function is really complicated. Maybe it’s really high dimensional. Often you cannot even write it down: it could be a quantity associated with a complex system, that you can only “query” at certain points by running an experiment. But you still need your integral, and then you turn to the trustworthy old Monte Carlo method. (Check this article by Nicholas Metropolis for the history of the method and what it has to do with Stanislaw Ulam’s uncle’s gambling habbit.) My goal in this post is to tell you a little bit about how you can do better than Monte Carlo using discrepancy theory.

The Problem and the Monte Carlo Method

Let us fix some notation and look at the simplest possible setting. We have a function , that maps the real interval to the reals, and we want to know

We will estimate this integral with the average , where is a sequence of numbers in . The error of this estimate is

And here is the main problem I will talk about in this post:

How do we choose a sequence of points in so that the average approximates the integral as closely as possible?

Intuitively, for larger the approximation will be better, but what is the best rate we can achieve? Notice that we want a single sequence, so that if we want a more accurate approximation, we just take a few more terms and re-normalize, rather than start everything from scratch. The insight of the Monte Carlo method (surely familiar to every reader of this blog) is to take each to be a fresh random sample from . Then for any , the expectation of is exactly (from now on I will skip the limits of my integrals: they will all be from to ). The standard deviation is of order

and standard concentration inequalities tell us that, with high probability, will not be much larger than the latter quantity.

Quasi-Monte Carlo and Discrepancy

For a fixed function of bounded norm, the Monte Carlo method achieves roughly on the order of . In other words, if we want to approximate our integral to within , we need to take about random samples. It’s clear that in general, even for smooth functions, we cannot hope to average over fewer than points, but is really the best we can do? It turns out that for reasonably nice we can do a lot better using discrepancy. We define the star discrepancy function of a sequence as

Notice that this is really just a special case of where is the indicator function of the interval . A beautiful inequality due to Koksma shows that in a sense these are the only functions we need to care about:

is the total variation of , a measure of smoothness, and for continuously differentiable functions it is equal to . The important part is that we have bounded the integration error by the product of a term that quantifies how nice is, and a term that quantifies how “random” the sequence is. With this, our task is reduced to finding a sequence with small star discrepancy, hopefully smaller than . This is the essence of the quasi-Monte Carlo method.

The van der Corput Sequence

A simple sequence with low star discrepancy was discovered by van der Corput in the beginning of the previous century. Let us write the integer in binary as , i.e. . Then we define to be number we get by flipping the binary digits of around the radix point: , or, in binary, . The van der Corput sequence is .

I have plotted the first eight terms of the van der Corput sequence on the left in the above picture: the index is on the -axis and the value on the -axis. The terms alternate between and ; they also visit each of , , , exactly once before they return, and so on. For example, each shaded rectangle on the right in the above picture contains exactly one point (the rectangles are open on the top). The key property of the van der Corput sequence then is that if and only if . So for any such dyadic interval, the discrepancy is at most : the number of integers less than such that is either or . We can greedily decompose an interval into dyadic intervals plus a leftover interval of length ; therefore the star discrepancy of the van der Corput sequence is . Remember that, together with Koksma’s inequality, this means that we can estimate the integral of any function with bounded variation with error , which, for sufficiently smooth , is almost quadratically better than the Monte Carlo method! And with a deterministic algorithm, to boot. This was not that hard, so maybe we can achieve the ultimate discrepancy bound ? This is the question (asked by van der Corput) which essentially started discrepancy theory. The first proof that is not achievable was given by van Aardenne-Ehrenfest. Klaus Roth simplified her bound and strengthened it to using a brilliant proof based on Haar wavelets. Schmidt later proved that van der Corput’s bound is assymptotically the best possible.

Higher Dimension, Other Measures, and Combinatorial Discrepancy

Quasi-Monte Carlo methods are used in real world applications, for example in quantitative finance, because of the better convergence rates they offer. But there are many complications that arise in practice. One issue is that we usually need to estimate integrals of high-dimensional functions, i.e. functions of a large number of variables. The Koksma inequality generalizes to higher dimensions (the generalization is known as the Koksma-Hlawka inequality), but we need to redefine both discrepancy and total variation for that purpose. The star discrepancy of a sequence of -dimensional points measures the worst-case absolute difference between the -dimensional volume (Lebesgue measure) of any anchored box and the fraction of points in the sequence that fall in the box. The generalization of total variation is the Hardy-Krause total variation. Naturally, the best achievable discrepancy increases with dimension, while the class of functions of bounded total variation becomes more restrictive. However, we do not even know what is the best achievable star discrepancy for or higher dimensional sequences! We know that no -dimensional sequence has discrepancy better than , where is some constant that goes to with . The van der Corput construction generalizes to higher dimensions and gives sequences with discrepancy (the implied constants here and in the lower bounds depend on ). The discrepancy theory community refers to closing this significant gap as “The Great Open Problem”.

Everything so far was about integration with respect to the Lebesgue measure, but in practice we are often interested in a different measure space. We could absorb the measure into the function to be integrated, but this can affect the total variation badly. We could do a change of variables, but, unless we have a nice product measure, this will result in a notion of discrepancy in which the test sets are not boxes anymore. Maybe the most natural solution is to redefine star discrepancy with respect to the measure we care about. But how do we find a low-discrepancy sequence with the new definition? It turns out that combinatorial discrepancy is very helpful here. A classical problem in combinatorial discrepancy, Tusnady’s problem, asks for is the smallest function such that any set of points in can be colored with red and blue so that in any axis-aligned box the absolute difference between the number of red and the number of blue points is at most (see this post for a generalization of this problem). A general transference theorem in discrepancy theory shows that for any probability measure in there exists a sequence with star discrepancy at most . The best bound for is , only slightly worse than the best star discrepancy for Lebesgue measure. This transference result has long been seen as purely existential, because most non-trivial results in combinatorial discrepancy were not constructive, but recently we have seen amazing progress in algorithms for minimizing combinatorial discrepancy. While even with these advances we don’t get sequences that are nearly as explicit as the van der Corput sequence, there certainly is hope we will get there.

Conclusion

I have barely scratched the surface of Quasi Monte Carlo methods and geometric discrepancy. Koksma-Hlawka type inequalities, discrepancy with respect to various test sets and measures, combinatorial discrepancy are each a big topic in itself. The sheer breadth of mathematical tools that bear on discrepancy questions is impressive: diophantine approximation to construct low discrepancy sequences, reproducing kernels in Hilbert spaces to prove Koksma-Hlawka inequalities, harmonic analysis to prove discrepancy lower bounds, convex geometry for upper and lower bounds in combinatorial discrepancy. Luckily, there are some really nice references available. Matousek has a very accessible book on geometric discrepancy. Chazelle focuses on computer science applications. A new collection of surveys edited by Chen, Srivastav, and Travaglini has many of the latest developments.

by Sebastien Bubeck at December 22, 2014 10:20 AM UTC

### Modulating the Permanent

from Richard Lipton

When does it become hard to compute?

Thomas Muir coined the term “permanent” as a noun in his treatise on determinants in 1882. He took it from Augustin Cauchy’s distinction in 1815 between symmetric functions that alternate when rows of a matrix are interchanged versus ones that “stay permanent.” To emphasize that all terms of the permanent have positive sign, he modified the contemporary notation ${\left| A \right|}$ for the determinant of a matrix ${A}$ into

$\displaystyle \overset{+}{|} A \overset{+}{|}$

for the permanent. Perhaps we should be glad that this notation did not become permanent.

Today Ken and I wish to highlight some interesting results on computing the permanent modulo some integer value.

Recall the permanent of a square matrix ${A}$ is the function defined as follows by summing over all permutations of ${\{1,\dots,n\}}$, that is, over all members of the symmetric group ${S_n}$:

$\displaystyle \mathrm{perm}(A)=\sum_{\sigma\in S_n}\prod_{i=1}^n a_{i,\sigma(i)}.$

It looks simpler than the determinant formula,

$\displaystyle \det(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^n a_{i,\sigma_i},$

but soon acquired a reputation as being ‘strangely unfriendly’ compared to the determinant. We owe to Les Valiant the brilliant explanation that computing the permanent exactly, even restricted to matrices with all entries 0 or 1, is likely to be very hard, whereas the determinant is easy to compute.

Muir is known for rigorizing a lemma by Arthur Cayley involving another matrix polynomial that at first looks hard to compute but turns out to be easy. The Pffafian of a ${2n \times 2n}$ matrix ${A}$ is defined by

$\displaystyle \mathrm{pf}(A)= \frac{1}{2^n n!}\sum_{\sigma\in S_{2n}} \mathrm{sgn}(\sigma) \prod_{i=1}^n a_{\sigma(2i-1),\sigma(2i)}.$

This vanishes unless ${A}$ is skew-symmetric, meaning ${A^T = -A}$, whereupon Muir following Cayley proved the relation

$\displaystyle \mathrm{pf}^2(A) = \det(A).$

The Pfaffian, and its use in the FKT algorithm for counting matchings in planar graphs, figures large in Valiant’s 2006 discovery that some generally-hard computations become easy modulo certain primes. All this is background to our query about which matrices might have easy permanent computations modulo which primes.

## Permanent Computations

Herbert Ryser found an inclusion-exclusion method to compute the permanent in ${O(2^{n}n)}$ operations:

$\displaystyle \mathrm{perm} (A) = (-1)^n \sum_{S\subseteq\{1,\dots,n\}} (-1)^{|S|} \prod_{i=1}^n \sum_{j\in S} a_{ij}$

This was found in 1963 and still stands as the best exact method. Note that it is exponential but is still better than the naive method which would sum ${n!}$ terms. David Glynn recently found a different formula giving the same order of performance.

Mark Jerrum, Alistair Sinclair, and Eric Vigoda found an approximation method for non-negative matrices that runs in probabilistic polynomial time. This award-winning result is based on delicate analysis of certain random walks. It fails for a matrix with even one negative term, since they show that such matrices can have permanents that are NP-hard to approximate.

Modulo 2, of course, the determinant and permanent of integer matrices are the same. It seems to be less well known that the permanent is easy modulo any power of 2. Modulo ${2^k}$, the known time is ${O(n^{4k-3})}$, and this too was proved in Valiant’s famous paper. However, subsequent to that paper, computing the permanent modulo any other integer was shown to be NP-hard under randomized reductions.

But wait. There are some special cases modulo ${3}$ that we would like to point out that actually are easy to compute—that take only polynomial time.

## Permanent Modulo 3

Grigory Kogan wrote a paper in FOCS 1996 that addresses this issue. It is titled “Computing Permanents over Fields of Characteristic 3: Where and Why It Becomes Difficult.” His main positive result was the following:

Theorem 1 Let ${F}$ be a field of characteristic ${3}$. Let ${U}$ be a ${n \times n}$ matrix over ${F}$ such that ${UU^{T}=I_{n}}$. Then ${\mathrm{perm}(U)}$ can be computed in time ${O(n^{4})}$.

Further, he gave a slightly worse polynomial time for computing ${\mathrm{perm}(U)}$ when ${UU^{T} - I_{n}}$ is a matrix of rank one. When ${UU^{T} - I_{n}}$ has rank two, however, computing the permanent mod 3 remains randomly hard for ${\mathsf{NP}}$, indeed complete for ${\mathsf{Mod3P}}$.

The details in the full version involve using mod 3 to regard certain matrices as skew-symmetric and hence work with their Pfaffians. The proof also uses extension fields in which ${\sqrt{2} = \sqrt{-1}}$ exists, and the theorem holds over any such field.

We wonder what similar tricks might be available modulo other primes. One advantage of working modulo ${p}$ is that the permanent becomes randomly self-reducible with no loss in numerical accuracy.

## Possible Ideas

Let’s look at using this idea to answer questions about the permanent of the famous Hadamard matrices. As we have remarked before, the original ones of order ${n = 2^k}$ were previously defined by Joseph Sylvester:

$\displaystyle \begin{array}{rcl} H_1 &=& \begin{bmatrix} 1 \end{bmatrix},\\ H_2 &=& \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},\\ H_{2^k} &=& \begin{bmatrix} H_{2^{k-1}} & H_{2^{k-1}}\\ H_{2^{k-1}} & -H_{2^{k-1}}\end{bmatrix}. \end{array}$

Jacques Hadamard gave the functional equation for any matrix bearing his name,

$\displaystyle HH^T = nI,$

which for ${n \geq 4}$ is known to require ${n}$ to be a multiple of ${4}$. Put ${n = 2^k m}$ with ${m}$ odd and ${k \geq 2}$. Whether such matrices exist is known when ${m^2 < 2^k}$, when ${n - 1}$ is a prime power, or when ${k=2}$ and ${2m - 1}$ is a prime power. The case ${n = 668 = 4\cdot 167}$ avoids these since ${667 = 23\cdot 29}$ and ${333 = 9 \cdot 37}$, and remains unknown.

If ${n \equiv 1 \bmod{3}}$ then ${H}$ satisfies Kogan’s theorem. If ${n \equiv 2}$ then it appears we can still leverage the proof by working with ${U = \sqrt{-1} H}$ instead. However—and this is by way of caution—if ${n}$ is a multiple of 3 then every ${H}$ is nilpotent mod 3, and it follows that ${\mathrm{perm}(A) \equiv 0 \bmod{3}}$. Nevertheless, all this means that we can compute the permanent of any Hadamard-type matrix modulo ${3}$ in polynomial time.

A further open question is whether there exists a Hadamard matrix ${H}$ with ${\mathrm{perm}(H) = 0}$. This is open even for Sylvester’s matrices. This is known to need ${n \geq 32}$. Of course, to vanish, it must vanish mod 3. We wonder how far gaining knowledge about behavior modulo 3 and other primes might help with these problems.

Some of these papers treat related questions for other matrices of entries ${\pm 1}$, perhaps with some ${0}$‘s and/or a normalizing constant factor. Greater reasons for interest in questions about permanents has come recently from boson sampling, for which we also recommend these online notes scribed by Ernesto Galvão of lectures given by Scott Aaronson in Rio de Janeiro a year ago. A main issue is whether the randomized equivalence of worst and average case for permanents mod ${p}$ can be carried over to the kinds of real-or-complex matrices that arise.

## Open Problems

Can we do better? Can we compute the permanent for a larger class of orthogonal matrices?

### The space "just above" BQP

Authors: Scott Aaronson, Adam Bouland, Joseph Fitzsimons, Mitchell Lee
Abstract: We explore the space "just above" BQP by defining a complexity class PDQP (Product Dynamical Quantum Polynomial time) which is larger than BQP but does not contain NP relative to an oracle. The class is defined by imagining that quantum computers can perform measurements that do not collapse the wavefunction. This (non-physical) model of computation can efficiently solve problems such as Graph Isomorphism and Approximate Shortest Vector which are believed to be intractable for quantum computers. Furthermore, it can search an unstructured N-element list in $\tilde O$(N^{1/3}) time, but no faster than {\Omega}(N^{1/4}), and hence cannot solve NP-hard problems in a black box manner. In short, this model of computation is more powerful than standard quantum computation, but only slightly so.

Our work is inspired by previous work of Aaronson on the power of sampling the histories of hidden variables. However Aaronson's work contains an error in its proof of the lower bound for search, and hence it is unclear whether or not his model allows for search in logarithmic time. Our work can be viewed as a conceptual simplification of Aaronson's approach, with a provable polynomial lower bound for search.

### Finding 2-edge and 2-vertex strongly connected components in quadratic time

Authors: Monika Henzinger, Sebastian Krinninger, Veronika Loitzenbauer
Abstract: We present faster algorithms for computing the 2-edge and 2-vertex strongly connected components of a directed graph, which are straightforward generalizations of strongly connected components. While in undirected graphs the 2-edge and 2-vertex connected components can be found in linear time, in directed graphs only rather simple $O(m n)$-time algorithms were known. We use a hierarchical sparsification technique to obtain algorithms that run in time $O(n^2)$. For 2-edge strongly connected components our algorithm gives the first running time improvement in 20 years. Additionally we present an $O(m^2 / \log{n})$-time algorithm for 2-edge strongly connected components, and thus improve over the $O(m n)$ running time also when $m = O(n)$. Our approach extends to k-edge and k-vertex strongly connected components for any constant k with a running time of $O(n^2 \log^2 n)$ for edges and $O(n^3)$ for vertices.

### Existential Second-Order Logic Over Graphs: A Complete Complexity-Theoretic Classification

Authors: Till Tantau
Abstract: Descriptive complexity theory aims at inferring a problem's computational complexity from the syntactic complexity of its description. A cornerstone of this theory is Fagin's Theorem, by which a graph property is expressible in existential second-order logic (ESO logic) if, and only if, it is in NP. A natural question, from the theory's point of view, is which syntactic fragments of ESO logic also still characterize NP. Research on this question has culminated in a dichotomy result by Gottlob, Kolatis, and Schwentick: for each possible quantifier prefix of an ESO formula, the resulting prefix class either contains an NP-complete problem or is contained in P. However, the exact complexity of the prefix classes inside P remained elusive. In the present paper, we clear up the picture by showing that for each prefix class of ESO logic, its reduction closure under first-order reductions is either FO, L, NL, or NP. For undirected, self-loop-free graphs two containment results are especially challenging to prove: containment in L for the prefix $\exists R_1 \cdots \exists R_n \forall x \exists y$ and containment in FO for the prefix $\exists M \forall x \exists y$ for monadic $M$. The complex argument by Gottlob, Kolatis, and Schwentick concerning polynomial time needs to be carefully reexamined and either combined with the logspace version of Courcelle's Theorem or directly improved to first-order computations. A different challenge is posed by formulas with the prefix $\exists M \forall x\forall y$: We show that they express special constraint satisfaction problems that lie in L.

### The simplified weighted sum function and its average sensitivity

Authors: Jiyou Li, Chu Luo
Abstract: In this paper we simplify the definition of the weighted sum Boolean function which used to be inconvenient to compute and use. We show that the new function has essentially the same properties as the previous one. In particular, the bound on the average sensitivity of the weighted sum Boolean function remains unchanged after the simplification.

### Inapproximability of Truthful Mechanisms via Generalizations of the VC Dimension

Authors: Amit Daniely, Michael Schapira, Gal Shahaf
Abstract: Algorithmic mechanism design (AMD) studies the delicate interplay between computational efficiency, truthfulness, and optimality. We focus on AMD's paradigmatic problem: combinatorial auctions. We present a new generalization of the VC dimension to multivalued collections of functions, which encompasses the classical VC dimension, Natarajan dimension, and Steele dimension. We present a corresponding generalization of the Sauer-Shelah Lemma and harness this VC machinery to establish inapproximability results for deterministic truthful mechanisms. Our results essentially unify all inapproximability results for deterministic truthful mechanisms for combinatorial auctions to date and establish new separation gaps between truthful and non-truthful algorithms.

### Manycore processing of repeated k-NN queries over massive moving objects observations

Authors: Francesco Lettich, Salvatore Orlando, Claudio Silvestri
Abstract: The ability to timely process significant amounts of continuously updated spatial data is mandatory for an increasing number of applications. In this paper we focus on a specific data-intensive problem concerning the repeated processing of huge amounts of k nearest neighbours (k-NN) queries over massive sets of moving objects, where the spatial extents of queries and the position of objects are continuously modified over time. In particular, we propose a novel hybrid CPU/GPU pipeline that significantly accelerate query processing thanks to a combination of ad-hoc data structures and non-trivial memory access patterns. To the best of our knowledge this is the first work that exploits GPUs to efficiently solve repeated k-NN queries over massive sets of continuously moving objects, even characterized by highly skewed spatial distributions. In comparison with state-of-the-art sequential CPU-based implementations, our method highlights significant speedups in the order of 10x-20x, depending on the datasets, even when considering cheap GPUs.

### Short Paths on the Voronoi Graph and the Closest Vector Problem with Preprocessing

Abstract: Improving on the Voronoi cell based techniques of Micciancio and Voulgaris (SIAM J. Comp. 13), and Sommer, Feder and Shalvi (SIAM J. Disc. Math. 09), we give a Las Vegas $\tilde{O}(2^n)$ expected time and space algorithm for CVPP (the preprocessing version of the Closest Vector Problem, CVP). This improves on the $\tilde{O}(4^n)$ deterministic runtime of the Micciancio Voulgaris algorithm, henceforth MV, for CVPP (which also solves CVP) at the cost of a polynomial amount of randomness (which only affects runtime, not correctness). As in MV, our algorithm proceeds by computing a short path on the Voronoi graph of the lattice, where lattice points are adjacent if their Voronoi cells share a common facet, from the origin to a closest lattice vector.

Our main technical contribution is a randomized procedure that given the Voronoi relevant vectors of a lattice - the lattice vectors inducing facets of the Voronoi cell - as preprocessing and any "close enough" lattice point to the target, computes a path to a closest lattice vector of expected polynomial size. This improves on the $\tilde{O}(4^n)$ path length given by the MV algorithm. Furthermore, as in MV, each edge of the path can be computed using a single iteration over the Voronoi relevant vectors. As a byproduct of our work, we also give an optimal relationship between geometric and path distance on the Voronoi graph, which we believe to be of independent interest.

### Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

Authors: Bruce Hajek, Yihong Wu, Jiaming Xu
Abstract: The binary symmetric stochastic block model deals with a random graph of $n$ vertices partitioned into two equal-sized clusters, such that each pair of vertices is connected independently with probability $p$ within clusters and $q$ across clusters. In the asymptotic regime of $p=a \log n/n$ and $q=b \log n/n$ for fixed $a,b$ and $n \to \infty$, we show that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one, resolving a conjecture of Abbe et al. \cite{Abbe14}. Furthermore, we show that the semidefinite programming relaxation also achieves the optimal recovery threshold in the planted dense subgraph model containing a single cluster of size proportional to $n$.

### Hashing Pursuit for Online Identification of Heavy-Hitters in High-Speed Network Streams

Authors: Michael Kallitsis, Stilian Stoev, George Michailidis
Abstract: Distributed Denial of Service (DDoS) attacks have become more prominent recently, both in frequency of occurrence, as well as magnitude. Such attacks render key Internet resources unavailable and disrupt its normal operation. It is therefore of paramount importance to quickly identify malicious Internet activity. The DDoS threat model includes characteristics such as: (i) heavy-hitters that transmit large volumes of traffic towards "victims", (ii) persistent-hitters that send traffic, not necessarily large, to specific destinations to be used as attack facilitators, (iii) host and port scanning for compiling lists of un-secure servers to be used as attack amplifiers, etc. This conglomeration of problems motivates the development of space/time efficient summaries of data traffic streams that can be used to identify heavy-hitters associated with the above attack vectors. This paper presents a hashing-based framework and fast algorithms that take into account the large-dimensionality of the incoming network stream and can be employed to quickly identify the culprits. The algorithms and data structures proposed provide a synopsis of the network stream that is not taxing to fast-memory, and can be efficiently implemented in hardware due to simple bit-wise operations. The methods are evaluated using real-world Internet data from a large academic network.

### Tenure Track Faculty at University at Buffalo, SUNY (apply by December 31, 2014)

from CCI: jobs

The department has multiple openings in all areas. The department is interested in hiring theoreticians whose work have applications in other applied areas including (but not limited to) big data, machine learning and security.

Email: atri@buffalo.edu

by theorycsjobs at December 21, 2014 03:26 AM UTC

### TR14-179 | Deterministic Randomness Extraction from Generalized and Distributed Santha-Vazirani Sources | Omid Etesami, Salman Beigi, Amin Gohari

from ECCC papers

A Santha-Vazirani (SV) source is a sequence of random bits where the conditional distribution of each bit, given the previous bits, can be partially controlled by an adversary. Santha and Vazirani show that deterministic randomness extraction from these sources is impossible. In this paper, we study the generalization of SV sources for non-binary sequences. We show that unlike the binary case, deterministic randomness extraction in the generalized case is sometimes possible. We present a necessary condition and a sufficient condition for the possibility of deterministic randomness extraction. These two conditions coincide in non-degenerate" cases. Next, we turn to a distributed setting. In this setting the SV source consists of a random sequence of pairs $(a_1, b_1), (a_2, b_2), \ldots$ distributed between two parties, where the first party receives $a_i$'s and the second one receives $b_i$'s. The goal of the two parties is to extract common randomness without communication. Using the notion of \emph{maximal correlation}, we prove a necessary condition and a sufficient condition for the possibility of common randomness extraction from these sources. Based on these two conditions, the problem of common randomness extraction essentially reduces to the problem of randomness extraction from (non-distributed) SV sources. This result generalizes results of G\'acs and K\"orner, and Witsenhausen about common randomness extraction from i.i.d.\ sources to adversarial sources.

### Getting a Research Internship

For graduate students December is probably the best time to start applying for a summer internship. The process of getting a software engineering internship at places like Google, Facebook, Microsoft, Twitter, Quora, Dropbox, etc. is so much streamlined that there is even a movie about it.

Getting an internship in research is a lot more of a unique experience. Graduate students ask me about this a lot and even once invited to give an informal talk on the topic at the Wharton graduate students statistics seminar. Surprisingly, I haven't seen any guide about research internships avaialble online so I decided to write down things I usually say.

Disclaimer: advice below is my personal opinon and is biased towards computer science and the U.S, although sometimes applies more broadly (e.g. to pure math, applied math and statistics). My experience is based on internships at AT&T Research (Shannon Laboratory), IBM Research (Almaden) and Microsoft Research (Silicon Valley and Redmond) and might be somewhat specific to these places, only 50% of which still exist. I also visited Yahoo! and Google Research a lot and can say that these places work similarly but with a few important differences described below. Last but not least, I tried to do my best to avoid making any comparisons (and especially controversial ones) in terms of quality between different labs and academia.

# Why Do It?

Doing a research internship is a great way to meet your future collaborators and friends. I did it 4 times (the most you can do as an F1 student) and spent 50% of my 3-year PhD in research labs. While this is quite unusual, I should say that I enjoyed the experience tremendously and still work and talk regularly with most of my former mentors (Graham Cormode, Howard Karloff, David Woodruff, Alex Andoni and Konstantin Makarychev), who have also been an invaluable source of advice for me over the years. I met many of my best friends from grad school times in the labs and still visit places where I did an internship whenever I happen to be in the area.

# Where to Apply?

This highly depends on your area of expertise. These days there is no single best place like Bell Labs in its glory days but there are multiple options to consider.

### Pure Theory

For pure theoretical comptuter science I would suggest to start with MSR and IBM. While I am certainly biased and there can't possibly be a ranking of research labs I would say that these are the top places:
• Microsoft Research. Posting, multiple locations (main offices are in Redmond, Cambridge and NYC). Redmond office is the oldest and covers almost all areas. NYC and Cambridge offices are smaller and somewhat similar, being particularly strong in machine learning, social sciences, algorithmic game theory and computational complexity among other areas.
• IBM Research. Posting, multiple locations (main offices are in Yorktown Heights, NY and Bay Area). Posting from Ken Clarkson about theory positions at IBM Almaden. Almost all major areas are represetned in either the Yorktown Heights or the Bay Area location.

Another great place is Toyota Technological Institute at Chicago (posting). I put it in a slightly different category than the industrial research labs because of its close ties with the University of Chicago and the overall feel of a more academic rather than industrial environment. While AT&T and Bell are shadows of their past, they still have some amazing people:

• AT&T Labs – Research. Posting, multiple locations (main office is in New Jersey).
• Bell Labs. Posting, multiple locations (main office is in New Jersey).

### More Applied

Here is a list of some more applied research places from the top of my head:

• Google Research. Google jobs website, multiple locations (including NYC and Bay Area). Google usually doesn't make a distinction between research and software engineering positions in their search. Once you pass the standard software engineering screening process, you get into the host matching phase and you can find a mentor interested in research.
• Yahoo! Labs. Posting, multiple locations (including NYC, Bay Area and Barcelona).
• Facebook Research. Posting from Yann LeCun, mulitple locations (including NYC and Bay Area).
• Ebay Labs. Posting, located in Bay Area.
• Technicolor. Posting, located in Bay Area.
• HP Labs. Posting, main location in New Jersey.
• NEC Labs. Posting, main location in New Jersey.
• VMWare. New lab founded by some of the former Microsoft SVC researchers, main location in Bay Area. Added to the list by suggestion from one of the founding members, Udi Wieder.

I am only familiar with the first two, which seem to have a less strong commitment to fundamental research than the labs listed before. However, the total number of job and internship openings in this slightly more applied list is probably almost an order of magnitude larger.

### National Labs

There are also many national labs. As a Russian citizen I can't tell you much about experience at these (especially in crypto, where I am sure my list is highly incomplete), but here are some options. Somewhat surprisingly, even these places sometimes have opportunities for international students, which may not be well advertised. Here is a couple of places to consider:

• Sandia Labs. Postings, main locations include Albuquerque, NM and Livermore, CA
• Berkeley Lawrence. Postings, located at Berkeley, CA.

# Internship Tips

Internships at research labs and talented interns are both scarce and unique resources so there isn't much data to look at and the decision making process is highly random. However, there are a few things you can do to improve your chances.
• List your top choices and potential mentors. It helps a lot if you know your future mentors in person. At the very least make sure that you are familiar with their work. A great lab can easily get hundereds of applications. What matters most for the success of your application is whether there is a mentor who will pick you from the pile. Don't hesitate to contact your top choices either directly or through your advisor, but avoid pestering people. Also, mention your potential mentors' names in different parts of your application (forms, research statement, etc.)
• Apply everywhere. While your chances are significantly reduced when you send a cold application, sometimes there are things you don't know and forces beyond your control. This is especially true for graduate students in their early years. Indicating your interest may be important by itself, e.g. I gave my first talk at an industrial lab which couldn't offer me an internship but invited for a short visit. It felt a lot like Peggy Olson's (Mad Men) first business trip to Richmond but better – thanks to IBM I had no dogs having sex as a view from the hotel room :)
• Recommendation letters. Ask your letter writers as soon as possible (ideally at least a month in advance), picking them based on the list of your top choices and other places where you plan to apply. Ask your letter writers for suggestions about places and feedback on your application materials.

# FAQ

• How much does it pay? Usually about the same or slightly more than a software engineering internship. I would expect $6–8K/mo (fixed, no negotiation, overtime or bonuses) + standard benefits such as relocation, car rental and housing discounts. So moneywise this is certainly better than academia, but can easily be at least two times less than an internship in finance (if you charge overtime, include bonuses, etc.). However, if you are doing what you enjoy most then you might care less about the money. • Can I do an internship during the Fall/Spring semester? Yes. The main advantage is that researchers at the lab are likely to be more available during these semesters. Another advantage is that if you are doing a summer internship in the same area then you can do two internships back to back and reduce the pain of relocation (I spent six months in Bay Area this way). There are certain disadvantages: less interns and corporate events during the semester, some schools require you to register for credits even if you are away (read as you and/or your advisor will have to pay money and do some paperwork). • Can I do an internship after my last year in grad school? Yes if you graduate after the internship. However, it also depends on the place – Google wouldn't allow me to do this but Microsoft Research did. • What if I am on an F1 visa? Then you have to jump through more paperwork hoops and in particular get a CPT. You can accumulate at most 12 months of CPT employment without losing your OPT, which limits the number of typical 3-month internships available to you down to 3 or 4. # Internship in Labs vs. Academia In theoretical computer science there is not too much difference in the style of research between research labs and academia. Also, there have been a lot of discussions online about advantages and disadvantages of each (e.g., here, here, here, here and following the links from there). However, from the intern's perspective these issues are less relevant, e.g. it is highly unlikely that a lab will be shut down during your internship. From intern's point of view, I would say that a few obvious differences are: • More face time. If you enjoy having long brainstorming sessions lasting for several hours every day then an industrial lab might be an ideal place for you. In academia you are unlikely to see your advisor more than twice a week for a couple of hours. This means that at an industrial lab you can make a lot of progress on one project in a very short period of time. In my experience, industrial resesarchers tend to have personalities suitable for thinking long hours on a deep problem together with an environment that lets them do this. In academia professors' busy schedules seem to interfere with research a lot and graduate students often work a lot either by themselves or with other students. Coming from team programming competitions background I really enjoyed these long brainstorming sessions in the labs. • Patents. There are many controversies around patents, but ultimately they play a very important role at research labs (e.g. Nathan Myhrvold, the founder of the controversial Intellectual Ventures, was also the founder of Microsoft Research). While doing an internship, keep in mind that some parts of your research may later be filed as a patent. Depending on the company, you might get some money for this. Also, it is an interesting experience to see a paper converted into a patent by lawyers. • Social aspects. Researchers at labs work and interact with each other a lot more than professors do. This also includes going for lunch as a group and means that you can have lunch with some of the biggest stars in your field every day! You can do lots of other things together too, such as running, cycling, ping pong, etc. I got into triathlons during the group rides at IBM Almaden. # Alternatives If you can't get an internship at your dream lab you still have multiple options. While I haven't tried them, many of my friends did. • Unpaid Internship. Unfortunately, not all great labs are well-funded. If you can't find a paid position but the lab is interested in working with you sometimes your advisor can pay you from their grant. • Visiting a Lab. Sometimes you can get paid for a short or long visit (usually works only for well-funded labs). • Consulting. This is a slightly unusual option for a graduate student, but some of my friends did this. It probably works best if there is a lab next to the place where you live and does involve some paperwork. Getting hired as a consultatnt is also sometimes a way to keep your access to the company's data after an internship if, say, you are still doing experiments for a paper you started while being at the lab. • Fellowship. Many fellowships come together with internship opportunities. These are even better because you are not tied to a specific location/mentor. A great list of fellowships is maintained by CMU. • Visiting Another University. Summer might be a good time to visit another university because professors are not teaching, although they might be traveling. Visiting during the Fall/Spring semester might be also good but for exactly the opposite reasons. Getting a Research Internship was originally published by Grigory Yaroslavtsev at The Big Data Theory on December 20, 2014. by Grigory Yaroslavtsev (grigory@grigory.us) at December 20, 2014 12:00 AM UTC ### Getting a Research Internship For graduate students December is probably the best time to start applying for a summer internship. The process of getting a software engineering internship at places like Google, Facebook, Microsoft, Twitter, Quora, Dropbox, etc. is so much streamlined that there is even a movie about it. Getting an internship in research is a lot more of a unique experience. Graduate students ask me about this a lot and even once invited to give an informal talk on the topic at the Wharton graduate students statistics seminar. Surprisingly, I haven’t seen any guide about research internships available online so I decided to write down things I usually say. Disclaimer: advice below is my personal opinion and is biased towards computer science and the U.S, although sometimes applies more broadly (e.g. to pure math, applied math and statistics). My experience is based on internships at AT&T Research (Shannon Laboratory), IBM Research (Almaden) and Microsoft Research (Silicon Valley and Redmond) and might be somewhat specific to these places, only 50% of which still exist. I also visited Yahoo! and Google Research a lot and can say that these places work similarly but with a few important differences described below. Last but not least, I tried to do my best to avoid making any comparisons (and especially controversial ones) in terms of quality between different labs and academia. # Why Do It? Doing a research internship is a great way to meet your future collaborators and friends. I did it 4 times (the most you can do as an F1 student) and spent 50% of my 3-year PhD in research labs. While this is quite unusual, I should say that I enjoyed the experience tremendously and still work and talk regularly with most of my former mentors (Graham Cormode, Howard Karloff, David Woodruff, Alex Andoni and Konstantin Makarychev), who have also been an invaluable source of advice for me over the years. I met many of my best friends from grad school times in the labs and still visit places where I did an internship whenever I happen to be in the area. # Where to Apply? This highly depends on your area of expertise. These days there is no single best place like Bell Labs in its glory days but there are multiple options to consider. ### Pure Theory For pure theoretical computer science I would suggest to start with MSR and IBM. While I am certainly biased and there can't possibly be a ranking of research labs I would say that these are the top places: • Microsoft Research. Posting, multiple locations (main offices are in Redmond, Cambridge and NYC). Redmond office is the oldest and covers almost all areas. NYC and Cambridge offices are smaller and somewhat similar, being particularly strong in machine learning, social sciences, algorithmic game theory and computational complexity among other areas. • IBM Research. Posting, multiple locations (main offices are in Yorktown Heights, NY and Bay Area). Posting from Ken Clarkson about theory positions at IBM Almaden. Almost all major areas are represented in either the Yorktown Heights or the Bay Area location. Another great place is Toyota Technological Institute at Chicago (posting). I put it in a slightly different category than the industrial research labs because of its close ties with the University of Chicago and the overall feel of a more academic rather than industrial environment. While AT&T and Bell are shadows of their past, they still have some amazing people: • AT&T Labs – Research. Posting, multiple locations (main office is in New Jersey). • Bell Labs. Posting, multiple locations (main office is in New Jersey). ### More Applied Here is a list of some more applied research places off the top of my head: • Google Research. Google jobs website, multiple locations (including NYC and Bay Area). Google usually doesn't make a distinction between research and software engineering positions in their search. Once you pass the standard software engineering screening process, you get into the host matching phase and you can find a mentor interested in research. • Yahoo! Labs. Posting, multiple locations (including NYC, Bay Area and Barcelona). • Facebook Research. Posting from Yann LeCun, multiple locations (including NYC and Bay Area). • Ebay Labs. Posting, located in Bay Area. • Technicolor. Posting, located in Bay Area. • HP Labs. Posting, main location in New Jersey. • NEC Labs. Posting, main location in New Jersey. • VMWare. New lab founded by some of the former Microsoft SVC researchers, main location in Bay Area. Added to the list by suggestion from one of the founding members, Udi Wieder. I am only familiar with the first two, which seem to have a less strong commitment to fundamental research than the labs listed before. However, the total number of job and internship openings in this slightly more applied list is probably almost an order of magnitude larger. ### National Labs There are also many national labs. As a Russian citizen I can't tell you much about experience at these (especially in crypto, where I am sure my list is highly incomplete), but here are some options. Somewhat surprisingly, even these places sometimes have opportunities for international students, which may not be well advertised. Here are a couple of places to consider: • Sandia Labs. Postings, main locations include Albuquerque, NM and Livermore, CA • Berkeley Lawrence. Postings, located at Berkeley, CA. # Internship Tips Internships at research labs and talented interns are both scarce and unique resources so there isn't much data to look at and the decision making process is highly random. However, there are a few things you can do to improve your chances. • List your top choices and potential mentors. It helps a lot if you know your future mentors in person. At the very least make sure that you are familiar with their work. A great lab can easily get hundreds of applications. What matters most for the success of your application is whether there is a mentor who will pick you from the pile. Don't hesitate to contact your top choices either directly or through your advisor, but avoid pestering people. Also, mention your potential mentors' names in different parts of your application (forms, research statement, etc.) • Apply everywhere. While your chances are significantly reduced when you send a cold application, sometimes there are things you don't know and forces beyond your control. This is especially true for graduate students in their early years. Indicating your interest may be important by itself, e.g. I gave my first talk at an industrial lab which couldn't offer me an internship but invited for a short visit. It felt a lot like Peggy Olson's (Mad Men) first business trip to Richmond but better – thanks to IBM I had no dogs having sex as a view from the hotel room :) • Recommendation letters. Ask your letter writers as soon as possible (ideally at least a month in advance), picking them based on the list of your top choices and other places where you plan to apply. Ask your letter writers for suggestions about places and feedback on your application materials. • Research statement. For pure research positions you will need to write a research statement. This is a great opportunity to take time to work on improving your vision. If you are like me then this is a process both difficult and rewarding. Every year when I applied I started by throwing my previous research statement into a trash bin because it looked absolutely terrible. I even remember myself getting very upset once because my vision was such a crap compared to some of the people in the labs where I applied. While your research statement is going to be unlike anyone else's I still recommend looking for inspiration at the research statements of your role models (maybe even potential mentors if they are available). You can ask for feedback on your statement from your advisor, colleagues and friends but I wouldn't expect too much because your statement is truly yours. Make sure you customize some parts of your statement for different places. Finally, for an internship application the research statement often doesn't matter too much, so you don't have to stress too much over it. However, I would still recommend to think of it as a dress rehearsal for your future applications as well as an opportunity to develop your vision. # FAQ • How much does it pay? Usually about the same or slightly more than a software engineering internship. I would expect$6–8K/mo (fixed, no negotiation, overtime or bonuses) + standard benefits such as relocation, car rental and housing discounts. So money wise this is certainly better than academia, but can easily be at least two times less than an internship in finance (if you charge overtime, include bonuses, etc.). However, if you are doing what you enjoy most then you might care less about the money.
• Can I do an internship during the Fall/Spring semester? Yes. The main advantage is that researchers at the lab are likely to be more available during these semesters. Another advantage is that if you are doing a summer internship in the same area then you can do two internships back to back and reduce the pain of relocation (I spent six months in Bay Area this way). There are certain disadvantages: less interns and corporate events during the semester, some schools require you to register for credits even if you are away (read as you and/or your advisor will have to pay money and do some paperwork).
• Can I do an internship after my last year in grad school? Yes if you graduate after the internship. However, it also depends on the place – Google wouldn't allow me to do this but Microsoft Research did.
• What if I am on an F1 visa? Then you have to jump through more paperwork hoops and in particular get a CPT. You can accumulate at most 12 months of CPT employment without losing your OPT, which limits the number of typical 3-month internships available to you down to 3 or 4.

# Internship in Labs vs. Academia

In theoretical computer science there is not too much difference in the style of research between research labs and academia. Also, there have been a lot of discussions online about advantages and disadvantages of each (e.g., here, here, here, here and following the links from there). However, from the intern's perspective these issues are less relevant, e.g. it is highly unlikely that a lab will be shut down during your internship.

From intern's point of view, I would say that a few obvious differences are:

• More face time. If you enjoy having long brainstorming sessions lasting for several hours every day then an industrial lab might be an ideal place for you. In academia you are unlikely to see your advisor more than twice a week for a couple of hours. This means that at an industrial lab you can make a lot of progress on one project in a very short period of time. In my experience, industrial researchers tend to have personalities suitable for thinking long hours on a deep problem together with an environment that lets them do this. In academia professors' busy schedules seem to interfere with research a lot and graduate students often work a lot either by themselves or with other students. Coming from team programming competitions background I really enjoyed these long brainstorming sessions in the labs.
• Patents. There are many controversies around patents, but ultimately they play a very important role at research labs (e.g. Nathan Myhrvold, the founder of the controversial Intellectual Ventures, was also the founder of Microsoft Research). While doing an internship, keep in mind that some parts of your research may later be filed as a patent. Depending on the company, you might get some money for this. Also, it is an interesting experience to see a paper converted into a patent by lawyers.
• Social aspects. Researchers at labs work and interact with each other a lot more than professors do. This also includes going for lunch as a group and means that you can have lunch with some of the biggest stars in your field every day! You can do lots of other things together too, such as running, cycling, ping pong, etc. I got into triathlons during the group rides at IBM Almaden.

# Alternatives

If you can't get an internship at your dream lab you still have multiple options. While I haven't tried them, many of my friends did.

• Unpaid Internship. Unfortunately, not all great labs are well-funded. If you can't find a paid position but the lab is interested in working with you sometimes your advisor can pay you from their grant.
• Visiting a Lab. Sometimes you can get paid for a short or long visit (usually works only for well-funded labs).
• Consulting. This is a slightly unusual option for a graduate student, but some of my friends did this. It probably works best if there is a lab next to the place where you live and does involve some paperwork. Getting hired as a consultant is also sometimes a way to keep your access to the company's data after an internship if, say, you are still doing experiments for a paper you started while being at the lab.
• Fellowship. Many fellowships come together with internship opportunities. These are even better because you are not tied to a specific location/mentor. A great list of fellowships is maintained by CMU.
• Visiting Another University. Summer might be a good time to visit another university because professors are not teaching, although they might be traveling. Visiting during the Fall/Spring semester might be also good but for exactly the opposite reasons.

Getting a Research Internship was originally published by Grigory Yaroslavtsev at The Big Data Theory on December 20, 2014.

by Grigory Yaroslavtsev (grigory@grigory.us) at December 20, 2014 12:00 AM UTC

### When Do a Few Colors Suffice?

from Gil Kalai

When can we properly color the vertices of a graph with a few colors? This is a notoriously difficult problems. Things get a little better if we consider simultaneously a graph together with all its induced subgraphs. Recall that an induced subgraph of a graph G is a subgraph formed by a subset of the vertices of G together with all edges of G spanned  on these vertices.  An induced cycle of length larger than three is called a hole, and an induced subgraph which is a complement of a cycle of length larger than 3 is called an anti-hole. As usual, $\chi (G)$ is the chromatic number of G and $\omega (G)$ is the clique number of G (the maximum number of vertices that form a complete subgraph. Clearly, for every graph G

$\chi(G) \ge \omega (G)$.

## Perfect graphs

Question 1: Describe the class $\cal G$  of graphs closed under induced subgraphs, with the property that $\chi(G)=\omega (G)$ for every $G\in{\cal G}$.

A graph G is called perfect if  $\chi(H)=\omega (H)$ for every induced subgraph H of G. So Question 1 asks for a description of perfect graphs. The study of perfect graphs is among the most important areas of graph theory, and much progress was made along the years.

Interval graphs, chordal graphs, comperability graphs of POSETS  , … are perfect.

Two major theorems about perfect graphs, both conjectured by Claude Berge are:

The perfect graph theorem (Lovasz, 1972): The complement of a perfect graph is perfect

The strong perfect graph theorem (Chudnovsky, Robertson, Seymour and Thomas, 2002): A graph is perfect if and only if it does not contain as an odd hole and an odd anti-hole.

## Mycielski Graphs

There are triangle-free graphs with arbitrary large chromatic number. An important construction by Mycielski goes as follows: Given a triangle graph G with n vertices $v_1,v_2, \dots, v_n$ create a new graph  G’ as follows: add n new vertices $u_1, u_2\dots u_n$ and a vertex w. Now add w to each $u_i$ and for every i and j for which $v_i$ and $v_j$ are adjacent add also an edge between $v_i$ and $u_j$ (and thus also between $u_i$ and $v_j$.)

## Classes of Graphs with bounded chromatic numbers

Question 2: Describe classes of graphs closed under induced subgraphs with bounded chromatic numbers.

Here are three theorems in this direction. The first answers affirmatively a conjecture by Kalai and Meshulam. The second and third prove conjectures by Gyarfas.

## Trinity Graphs

The Trinity graph theorem (Bonamy, Charbit and Thomasse, 2013): Graphs without induced cycles of  length divisible by three have bounded chromatic numbers.

(The paper: Graphs with large chromatic number induce 3k-cycles.)

### Steps toward Gyarfas conjecture

Theorem (Scott and Semour, 2014):  Triangle-free graphs without odd induced cycles have bounded chromatic number.

(The paper:  Colouring graphs with no odd holes.)

A graph is pentagonal of all its odd holes are pentagons.

Theorem (Chudnovsky, Seymour and Scott 2014): Pentagonal graphs can be colored by 82200 colors.

(The paper: Three steps towards Gyarfas’ conjecture.)

It is possible that all pentagonal graphs are 4-colorable.

### Connections with homology: Conjectures by Kalai and Meshulam

Consider the independence complex of a graph, namely the simplicial complex whose faces are independent sets of vertices. For a simplicial complex K let $b_i(K)$ denotes the ith Betti number of K and let b(K) denote the sum of the Betti numbers. Here are a few conjectures by Roy Meshulam and me made about a decade ago.

Conjecture 1: For every C>0 there is K(C)  such that for a graph G if |b(I(H))| ≤ C for every induced subgraph H then G can be colored by K(C) colors.

Conjecture 2:  For a graph G  we have that |b(I(H))| ≤ 1 for every induced subgraph H, if and only if G does not contain and induced cycle of length divisible by 3.

Conjectures 1 and 2 have led us to conjecture the Trinity Theorem now proved by Marthe Bonamy, Pierre Charbit, and Stéphan Thomassé.

Stronger forms of Conjecture 1 are given by:

Conjecture 3:  For every function g, with $g(m)=o(m^2)$  there is K(g)  such that for a graph G if |b(I(H))| ≤ g(n(H)) for every induced subgraph H with n(H) vertices, then G can be colored by K(g) colors.

Conjecture 4: For conjectures 1 and 3 we can even replace the sum of Betti numbers by the absolute value of the Euler characteristics.

### Graphs whose chromatic number is bounded by some function of their clique number

A variant of Question 2 is:

Question 3: Describe the classes $\cal G$  of graphs closed under induced subgraphs, with the property that $\chi(G) \le f(\omega (G))$ for some function f, for every $G \in {\cal G}$.

We will call such a class of graphs a class of accomplished graphs (with respect to the function f). “Accomplished” is offered as a weak form of “perfect” and being accomplished is an asymptotic property. Of course, I am open to any other name for this property.

Gyarfas conjecture (1985): The class graphs without odd holes of length more than $\ell$ is accomplished.

Theorem (Seymour and Scott 2014):  The class of graphs without odd holes is accomplished.

In other words, for every  t there exists n such that every graph with no $K_t$ subgraph and no odd hole is n-colourable.

Conjecture (Kalai and Meshulam 2005): The class of graphs with b(I(G)) (or even $|\chi (I(G))|$ bounded by a fixed polynomial function of the number of vertices is accomplished.

A special case of this conjecture follows from a 2003 theorem by Alon, Kalai, Matousek and Meshulam.

Theorem: The class of graphs with $b_i(I(G))=0$ for i>d  is accomplished.

## The Erdos-Hajnal Conjecture

One theme for this post is that there is much better hope to understand the chromatic number for a graph and all its induced subgraph than for just an individual graph. Here is a conjecture which puts this insight into a formal mathematical statement.

Conjecture (Erdos and Hajnal) : For every H there is beta such that every graph with $n^\beta$  vertices without an induced subgraph isomorphic to H contains a clicque of size n or an empty graph of size n.

Erdos classical bound on Ramsey’s numbers asserts that for general graphs you need at least $2^{n/2}$ vertices and that $2^{2n}$ vertices suffice.

## High Dimensional extensions

My conjectures with Roy and our results came actually from a wider context of higher dimensional complexes. I will not mention this here beside asking

Question 4: What is the “right” extension of perfect graphs for simplicial complexes of dimension greater than one.

by Gil Kalai at December 19, 2014 12:42 PM UTC

### A Fire Fighter's Problem

Authors: Rolf Klein, Elmar Langetepe, Christos Levcopoulos University of Bonn, Germany, Institute of Computer Science I, University of Lund, Sweden, Department of Computer Science)
Abstract: Suppose that a circular fire spreads in the plane at unit speed. A fire fighter can build a barrier at speed $v>1$. How large must $v$ be to ensure that the fire can be contained, and how should the fire fighter proceed? We provide two results. First, we analyze the natural strategy where the fighter keeps building a barrier along the frontier of the expanding fire. We prove that this approach contains the fire if $v>v_c=2.6144 \ldots$ holds. Second, we show that any "spiralling" strategy must have speed $v>1.618$ in order to succeed.

Keywords: Motion Planning, Dynamic Environments, Spiralling strategies, Lower and upper bounds

### Compression of high throughput sequencing data with probabilistic de Bruijn graph

Authors: Gaëtan Benoit, Claire Lemaitre, Dominique Lavenier, Guillaume Rizk

### On families of anticommuting matrices

Authors: Pavel Hrubeš
Abstract: Let $e_{1},\dots, e_{k}$ be complex $n\times n$ matrices such that $e_{i}e_{j}=-e_{j}e_{i}$ whenever $i\not=j$. We conjecture that $\hbox{rk}(e_{1}^{2})+\hbox{rk}(e_{2}^{2})+\cdots+\hbox{rk}(e_{k}^{2})\leq O(n\log n)$, and prove some results in this direction.

### An Algorithm for Online K-Means Clustering

Authors: Edo Liberty, Ram Sriharsha, Maxim Sviridenko
Abstract: This paper shows that one can be competitive with with the k-means objective while operating online. In this model, the algorithm receives vectors v1,...,vn one by one in arbitrary order. For each vector vi the algorithm outputs a cluster identifier before receiving vi+1. Our online algorithm generates ~O(k) clusters whose k-means cost is ~O(W*) where W* is the optimal k-means cost using k clusters and ~O suppresses poly logarithmic factors. We also show that, experimentally, it is not much worse than k-means++ while operating in a strictly more constrained computational model.

### Kinetic $k$-Semi-Yao Graph and its Applications

Authors: Zahed Rahmati, Mohammad Ali Abam, Valerie King, Sue Whitesides
Abstract: This paper introduces a new proximity graph, called the $k$-Semi-Yao graph ($k$-SYG), on a set $P$ of points in $\mathbb{R}^d$, which is a supergraph of the $k$-nearest neighbor graph ($k$-NNG) of $P$. We provide a kinetic data structure (KDS) to maintain the $k$-SYG on moving points, where the trajectory of each point is a polynomial function whose degree is bounded by some constant. Our technique gives the first KDS for the theta graph (\ie, $1$-SYG) in $\mathbb{R}^d$. It generalizes and improves on previous work on maintaining the theta graph in $\mathbb{R}^2$.

As an application, we use the kinetic $k$-SYG to provide the first KDS for maintenance of all the $k$-nearest neighbors in $\mathbb{R}^d$, for any $k\geq 1$. Previous works considered the $k=1$ case only. Our KDS for all the $1$-nearest neighbors is deterministic. The best previous KDS for all the $1$-nearest neighbors in $\mathbb{R}^d$ is randomized. Our structure and analysis are simpler and improve on this work for the $k=1$ case. We also provide a KDS for all the $(1+\epsilon)$-nearest neighbors, which in fact gives better performance than previous KDS's for maintenance of all the exact $1$-nearest neighbors.

As another application, we present the first KDS for answering reverse $k$-nearest neighbor queries on moving points in $\mathbb{R}^d$, for any $k\geq 1$.

### Boolean function monotonicity testing requires (almost) $n^{1/2}$ non-adaptive queries

Authors: Xi Chen, Anindya De, Rocco A. Servedio, Li-Yang Tan
Abstract: We prove a lower bound of $\Omega(n^{1/2 - c})$, for all $c>0$, on the query complexity of (two-sided error) non-adaptive algorithms for testing whether an $n$-variable Boolean function is monotone versus constant-far from monotone. This improves a $\tilde{\Omega}(n^{1/5})$ lower bound for the same problem that was recently given in [CST14] and is very close to $\Omega(n^{1/2})$, which we conjecture is the optimal lower bound for this model.

### New algorithms and lower bounds for monotonicity testing

Authors: Xi Chen, Rocco A. Servedio, Li-Yang Tan
Abstract: We consider the problem of testing whether an unknown Boolean function $f$ is monotone versus $\epsilon$-far from every monotone function. The two main results of this paper are a new lower bound and a new algorithm for this well-studied problem.

Lower bound: We prove an $\tilde{\Omega}(n^{1/5})$ lower bound on the query complexity of any non-adaptive two-sided error algorithm for testing whether an unknown Boolean function $f$ is monotone versus constant-far from monotone. This gives an exponential improvement on the previous lower bound of $\Omega(\log n)$ due to Fischer et al. [FLN+02]. We show that the same lower bound holds for monotonicity testing of Boolean-valued functions over hypergrid domains $\{1,\ldots,m\}^n$ for all $m\ge 2$.

Upper bound: We give an $\tilde{O}(n^{5/6})\text{poly}(1/\epsilon)$-query algorithm that tests whether an unknown Boolean function $f$ is monotone versus $\epsilon$-far from monotone. Our algorithm, which is non-adaptive and makes one-sided error, is a modified version of the algorithm of Chakrabarty and Seshadhri [CS13a], which makes $\tilde{O}(n^{7/8})\text{poly}(1/\epsilon)$ queries.

### On Google Scholar H-Index Manipulation by Merging Articles

Authors: René van Bevern, Christian Komusiewicz, Rolf Niedermeier, Manuel Sorge, Toby Walsh
Abstract: Google Scholar allows merging multiple article versions into one. This merging affects the H-index computed by Google Scholar. We analyze the parameterized complexity of maximizing the H-index using article merges. Herein, multiple possible measures for computing the citation count of a merged article are considered. Among others, for the measure used by Google Scholar, we give an algorithm that maximizes the H-index in linear time if there is only a constant number of versions of the same article. In contrast, if we are allowed to merge arbitrary articles, then already increasing the H-index by one is NP-hard.

### Introduction to Dynamic Unary Encoding

Authors: Ernst D. Berg
Abstract: Dynamic unary encoding takes unary encoding to the next level. Every n-bit binary string is an encoding of dynamic unary and every n-bit binary string is encodable by dynamic unary. By utilizing both forms of unary code and a single bit of parity information dynamic unary encoding partitions 2^n non-negative integers into n sets of disjoint cycles of n-bit elements. These cycles have been employed as virtual data sets, binary transforms and as a mathematical object. Characterization of both the cycles and of the cycle spectrum is given. Examples of encoding and decoding algorithms are given. Examples of other constructs utilizing the principles of dynamic unary encoding are presented. The cycle as a mathematical object is demonstrated.

### Quick comments on the NIPS experiment

[One can tell it’s reviewing and letter-writing season when I escape to blogging more often..]

There’s been some discussion on the NIPS experiment, enough of it that even my neuro-scientist brother sent me a link to Eric Price’s blog post. The gist of it is that the program chairs duplicated the reviewing process for 10% of the papers, to see how many would get inconsistent decisions, and it turned out that 25.9% of them did (one of the program chairs predicted that it would be 20% and the other that it would be 25%, see also herehere and here). Eric argues that the right way to measure disagreement is to look at the fraction of papers that process A accepted that would have been rejected by process B, which comes out to more than 50%.

It’s hard for me to interpret this number. One interpretation is that it’s a failure of the refereeing process that people can’t agree on more than half of the list of accepted papers. Another viewpoint is that since the disagreement is not much larger than predicted beforehand, we shouldn’t be that surprised about it. It’s tempting when having such discussions to model papers as having some inherent quality score, where the goal of the program committee is to find all papers above a certain threshold. The truth is that different papers have different, incomparable qualities, that appeal to different subsets of people. The goal of the program committee is to curate an a diverse and intellectually stimulating program for the conference. This is an inherently subjective task, and it’s not surprising that different committees would arrive at different conclusions. I do not know what’s the “optimal” amount of variance in this process, but I would have been quite worried if it was zero, since it would be a clear sign of groupthink. Lastly, I think this experiment actually points out to an important benefit of the conference system. Unlike journals, where the editorial board tends to stay constant for a long period, in conferences one gets a fresh draw of the committee every 6 months or a year.

by Boaz Barak at December 18, 2014 09:45 PM UTC

### The negative impacts of random conference decisions

The NIPS experiment is making waves.  If you are unaware, for the last NIPS conference, the PC was broken into two independent halves A and B.  A random selection of the submissions were assigned to both committees.  The result: 57% of the papers that were accepted by committee A were rejected by committee B (and […]

The post The negative impacts of random conference decisions appeared first on Glencora Borradaile.

by Glencora Borradaile at December 18, 2014 04:23 PM UTC

### The NIPS Experiment

The NIPS (machine learning) conference ran an interesting experiment this year. They had two separate and disjoint program committees with the submissions split between them. 10% (166) of the submissions were given to both committees. If either committee accepted one of those papers it was accepted to NIPS.

According to an analysis by Eric Price, of those 166, about 16 (about 10%) were accepted by both committees, 43 (26%) by exactly one of the committees and 107 (64%) rejected by both committees. Price notes that of the accepted papers, over half (57%) of them would not have been accepted with a different PC. On the flip side 83% of the rejected papers would still be rejected. More details of the experiment here.

No one who has ever served on a program committee should be surprised by these results. Nor is there anything really wrong or bad going on here. A PC will almost always accept the great papers and almost always reject the mediocre ones, but the middle ground are at a similar quality level and personal tastes come into play. There is no objective perfect ordering of the papers and that's why we task a program committee to make those tough choices. The only completely fair committees would either accept all the papers or reject all the papers.

These results can lead to a false sense of self worth. If your paper is accepted you might think you had a great submission, more likely you had a good submission and got lucky. If your paper was rejected, you might think you had a good submission and was unlucky, more likely you had a mediocre paper that would never get in.

In the few days since NIPS announced these results, I've already seen people try to use them not only to trash program committees but for many other subjective decision making. In the end we have to make choices on who to hire, who to promote and who to give grants. We need to make subjective decisions and those done by our peers aren't always consistent but they work much better than the alternatives. Even the machine learning conference doesn't use machine learning to choose which papers to accept.

by Lance Fortnow (noreply@blogger.com) at December 18, 2014 12:44 PM UTC

### A Self-Tester for Linear Functions over the Integers with an Elementary Proof of Correctness

Abstract: We present simple, self-contained proofs of correctness for algorithms for linearity testing and program checking of linear functions on finite subsets of integers represented as n-bit numbers. In addition we explore two generalizations of self-testing to multiple variables - the case of multilinear functions and homomorphisms on a multidimensional vector space.

We show that our self-testing algorithm for the univariate case can be directly generalized to vector space domains. The number of queries made by our algorithms are independent of domain size. However, linearity testing for multilinear functions requires a different testing algorithm. We give an algorithm for the k-linearity problem with queries independent of the size of the domain.

### Solving Totally Unimodular LPs with the Shadow Vertex Algorithm

Authors: Tobias Brunsch, Anna Großwendt, Heiko Röglin
Abstract: We show that the shadow vertex simplex algorithm can be used to solve linear programs in strongly polynomial time with respect to the number $n$ of variables, the number $m$ of constraints, and $1/\delta$, where $\delta$ is a parameter that measures the flatness of the vertices of the polyhedron. This extends our recent result that the shadow vertex algorithm finds paths of polynomial length (w.r.t. $n$, $m$, and $1/\delta$) between two given vertices of a polyhedron.

Our result also complements a recent result due to Eisenbrand and Vempala who have shown that a certain version of the random edge pivot rule solves linear programs with a running time that is strongly polynomial in the number of variables $n$ and $1/\delta$, but independent of the number $m$ of constraints. Even though the running time of our algorithm depends on $m$, it is significantly faster for the important special case of totally unimodular linear programs, for which $1/\delta\le n$ and which have only $O(n^2)$ constraints.

### Optimal-Depth Sorting Networks

Authors: Daniel Bundala, Michael Codish, Luís Cruz-Filipe, Peter Schneider-Kamp, Jakub Závodný
Abstract: We solve a 40-year-old open problem on the depth optimality of sorting networks. In 1973, Donald E. Knuth detailed, in Volume 3 of "The Art of Computer Programming", sorting networks of the smallest depth known at the time for n =< 16 inputs, quoting optimality for n =< 8. In 1989, Parberry proved the optimality of the networks with 9 =< n =< 10 inputs. In this article, we present a general technique for obtaining such optimality results, and use it to prove the optimality of the remaining open cases of 11 =< n =< 16 inputs. We show how to exploit symmetry to construct a small set of two-layer networks on n inputs such that if there is a sorting network on n inputs of a given depth, then there is one whose first layers are in this set. For each network in the resulting set, we construct a propositional formula whose satisfiability is necessary for the existence of a sorting network of a given depth. Using an off-the-shelf SAT solver we show that the sorting networks listed by Knuth are optimal. For n =< 10 inputs, our algorithm is orders of magnitude faster than the prior ones.

### The switch Markov chain for sampling irregular graphs

Authors: Catherine Greenhill
Abstract: The problem of efficiently sampling from a set of(undirected) graphs with a given degree sequence has many applications. One approach to this problem uses a simple Markov chain, which we call the switch chain, to perform the sampling. The switch chain is known to be rapidly mixing for regular degree sequences. We prove that the switch chain is rapidly mixing for any degree sequence with minimum degree at least 1 and with maximum degree $d_{\max}$ which satisfies $3\leq d_{\max}\leq \frac{1}{4}\, \sqrt{M}$, where $M$ is the sum of the degrees. The mixing time bound obtained is only an order of $n$ larger than that established in the regular case, where $n$ is the number of vertices.

### Shallow Packings in Geometry

Authors: Esther Ezra
Abstract: We refine the bound on the packing number, originally shown by Haussler, for shallow geometric set systems. Specifically, let $\V$ be a finite set system defined over an $n$-point set $X$; we view $\V$ as a set of indicator vectors over the $n$-dimensional unit cube. A $\delta$-separated set of $\V$ is a subcollection $\W$, s.t. the Hamming distance between each pair $\uu, \vv \in \W$ is greater than $\delta$, where $\delta > 0$ is an integer parameter. The $\delta$-packing number is then defined as the cardinality of the largest $\delta$-separated subcollection of $\V$. Haussler showed an asymptotically tight bound of $\Theta((n/\delta)^d)$ on the $\delta$-packing number if $\V$ has VC-dimension (or \emph{primal shatter dimension}) $d$. We refine this bound for the scenario where, for any subset, $X' \subseteq X$ of size $m \le n$ and for any parameter $1 \le k \le m$, the number of vectors of length at most $k$ in the restriction of $\V$ to $X'$ is only $O(m^{d_1} k^{d-d_1})$, for a fixed integer $d > 0$ and a real parameter $1 \le d_1 \le d$ (this generalizes the standard notion of \emph{bounded primal shatter dimension} when $d_1 = d$). In this case when $\V$ is "$k$-shallow" (all vector lengths are at most $k$), we show that its $\delta$-packing number is $O(n^{d_1} k^{d-d_1}/\delta^d)$, matching Haussler's bound for the special cases where $d_1=d$ or $k=n$. As an immediate consequence we conclude that set systems of halfspaces, balls, and parallel slabs defined over $n$ points in $d$-space admit better packing numbers when $k$ is smaller than $n$. Last but not least, we describe applications to (i) spanning trees of low total crossing number, and (ii) geometric discrepancy, based on previous work by the author.

### Spiral Toolpaths for High-Speed Machining of 2D Pockets with or without Islands

Authors: Mikkel Abrahamsen
Abstract: We describe new methods for the construction of spiral toolpaths for high-speed machining. In the simplest case, our method takes a polygon as input and a number $\delta>0$ and returns a spiral starting at a central point in the polygon, going around towards the boundary while morphing to the shape of the polygon. The spiral consists of linear segments and circular arcs, it is $G^1$ continuous, it has no self-intersections, and the distance from each point on the spiral to each of the neighboring revolutions is at most $\delta$. Our method has the advantage over previously described methods that it is easily adjustable to the case where there is an island in the polygon to be avoided by the spiral. In that case, the spiral starts at the island and morphs the island to the outer boundary of the polygon. It is shown how to apply that method to make significantly shorter spirals in polygons with no islands. Finally, we show how to make a spiral in a polygon with multiple islands by connecting the islands into one island.

### Size sensitive packing number for Hamming cube and its consequences

Authors: Kunal Dutta, Arijit Ghosh
Abstract: We prove a size-sensitive version of Haussler's Packing lemma~\cite{Haussler92spherepacking} for set-systems with bounded primal shatter dimension, which have an additional {\em size-sensitive property}. This answers a question asked by Ezra~\cite{Ezra-sizesendisc-soda-14}. We also partially address another point raised by Ezra regarding overcounting of sets in her chaining procedure. As a consequence of these improvements, we get an improvement on the size-sensitive discrepancy bounds for set systems with the above property. Improved bounds on the discrepancy for these special set systems also imply an improvement in the sizes of {\em relative $(\varepsilon, \delta)$-approximations} and $(\nu, \alpha)$-samples.