# Theory of Computing Blog Aggregator

### Approximating Edit Distance

I was excited to see “Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time” by Chakraborty, Das, Goldenberg, Koucký, and Saks in this year’s list of FOCS accepted papers. As you can guess from the title, their main result is a constant-factor approximation for edit distance in truly subquadratic time (i.e. $O(n^{2-\delta})$ for some constant $\delta > 0$). I spent quite a bit of time thinking about this problem over the past couple of years, and wiser people than me have spent much more over a couple of decades. Congratulations!!

Given strings $A$ and $B$ of $n$ characters each, the textbook dynamic programming algorithm finds their edit distance in time $O(n^{2})$ (if you haven’t seen this in your undergrad algorithms class, consider asking your university for a refund on your tuition). Recent complexity breakthroughs [1][2] show that under plausible assumptions like SETH, quadratic time is almost optimal for exact algorithms. This is too bad, because we like to compute the edit distance between very long strings, like the entire genomes of two organisms (see also this article in Quanta). There is a sequence of near-linear time algorithms with improved approximation factors [1][2][3][4], but until now the state of the art was a polylogarithmic factor; in fact, for near-linear time, this is still the state of the art:

Open question 1: Is there a constant-factor approximation to edit distance that runs in near-linear time?
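Backing up to the textbook algorithm mentioned above: the quadratic dynamic program can be written in a few lines. Here is a minimal Python sketch of that classic algorithm (an illustration, not anything from the paper):

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook O(len(a) * len(b)) dynamic program for edit distance."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))  # distances between "" and prefixes of b
    for i in range(1, m + 1):
        curr = [i] + [0] * n   # first column: delete all i characters of a
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1]           # match: no edit needed
            else:
                curr[j] = 1 + min(prev[j - 1],  # substitute
                                  prev[j],      # delete from a
                                  curr[j - 1])  # insert into a
        prev = curr
    return prev[n]
```

Keeping only two rows of the table uses $O(n)$ space, but the running time is still quadratic no matter what.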

## Algorithm sketch

Here is a sketch of an algorithm. It is somewhat different from the algorithm in the paper because I wrote this post before finding the full paper online.

### Step 0: window-compatible matchings

We partition each string into $t$ windows, or consecutive substrings of length $d:=n/t$ each. We then restrict our attention to window-compatible matchings: that is, instead of looking for the globally optimum way to transform $A$ to $B$, we look for a partial matching between the $A$- and $B$-windows, and transform each $A$-window to its matching $B$-window (unmatched $A$-windows are deleted, and unmatched $B$-windows are inserted). It turns out that restricting to window-compatible matchings is almost without loss of generality.

In order to find the optimum window-compatible matching, we can find the distances between every pair of windows, and then use a (weighted) dynamic program of size $t\times t\ll n^{2}$. The reason I call it “Step 0” is that so far we have made zero progress on running time: we still have to compute the edit distance between $t^{2}$ pairs of windows, and each computation takes time $d^{2}$, so $t^{2}d^{2}=n^{2}$ time in total.
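To make Step 0 concrete, here is a sketch of the window-level dynamic program under my reading of the reduction (the `cost` table of pairwise window distances is assumed given; this is not the paper's exact formulation):

```python
def window_matching_cost(cost, d):
    """Min cost of a monotone window-compatible matching.

    cost[i][j] = (approximate) edit distance between A-window i and B-window j;
    an unmatched window is deleted/inserted wholesale at cost d (its length).
    This is the edit-distance DP again, now over t x t windows rather than
    n x n characters.
    """
    t = len(cost)
    D = [[0] * (t + 1) for _ in range(t + 1)]
    for i in range(t + 1):
        D[i][0] = i * d                  # delete the first i A-windows
    for j in range(t + 1):
        D[0][j] = j * d                  # insert the first j B-windows
    for i in range(1, t + 1):
        for j in range(1, t + 1):
            D[i][j] = min(D[i - 1][j] + d,                       # delete A-window i
                          D[i][j - 1] + d,                       # insert B-window j
                          D[i - 1][j - 1] + cost[i - 1][j - 1])  # match the pair
    return D[t][t]
```

The DP itself takes only $O(t^2)$ time; the bottleneck is filling in the `cost` table, which is what the next steps attack.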

Approximating all the pairwise distances reduces to the following problem: given threshold $\Delta$, compute the bipartite graph $G_{\Delta}$ over the windows, where two windows $A_{i}$ and $B_{j}$ share an edge if $ED (A_i, B_j) \leq \Delta$. In fact it suffices to compute an approximate $\widetilde{G}_{\Delta}$, where $A_{i}$ and $B_{j}$ may share an edge even if their edit distance is a little more than $\Delta$.

New Goal: Compute $\widetilde{G}_{\Delta}$ faster than naively computing all $t^{2}$ pairwise edit distances.

### Step 1: sparsifying $G_{\Delta}$

While there are many edges in $G_{\Delta}\setminus\widetilde{G}_{\Delta}$ (say, average degree $\geq t^{3/4}$): draw a random edge $(A_{1},B_{1})$, and let $B_{j}$ and $A_{i}$ be any other neighbors of $A_{1}$ and $B_{1}$, respectively. Applying the triangle inequality twice, we have that $ED(A_{i},B_{j})\leq 3\Delta$, so we can immediately add $(A_{i},B_{j})$ to $\widetilde{G}_{\Delta}$. In expectation, $A_{1}$ and $B_{1}$ have $\geq t^{3/4}$ neighbors each, so we discover a total of $\geq t^{6/4}$ pairs, of which we expect roughly $\geq t^{5/4}$ to correspond to new edges in $\widetilde{G}_{\Delta}$. Repeat at most $t^{3/4}$ times until we have discovered almost all the edges in $G_{\Delta}$. Notice that each iteration takes $td^{2}$ time (computing all the edit distances from $A_{1}$ and $B_{1}$); hence in total only $t^{7/4}d^{2}\ll t^{2}d^{2}=n^{2}$. Thus we have reduced to the sparse case in truly subquadratic time.
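Here is a hedged sketch of one such sparsification round; `ed` stands for an arbitrary edit-distance oracle, and the names and data layout are illustrative rather than the paper's notation:

```python
import random

def densify_step(A_wins, B_wins, ed, Delta, G_tilde):
    """One sparsification round of the triangle-inequality trick.

    G_tilde is a set of (i, j) index pairs already known to be edges.
    Pick a random known edge (A1, B1), compute all 2t edit distances from
    A1 and B1 (time t * d^2), and connect every neighbor B_j of A1 to every
    neighbor A_i of B1.  By the triangle inequality applied twice,
    ED(A_i, B_j) <= ED(A_i, B1) + ED(B1, A1) + ED(A1, B_j) <= 3 * Delta,
    so every such pair is a valid edge of the 3-approximate graph.
    """
    edges = list(G_tilde)
    if not edges:
        return
    i1, j1 = random.choice(edges)
    A1, B1 = A_wins[i1], B_wins[j1]
    nbrs_of_A1 = [j for j, B in enumerate(B_wins) if ed(A1, B) <= Delta]
    nbrs_of_B1 = [i for i, A in enumerate(A_wins) if ed(A, B1) <= Delta]
    for i in nbrs_of_B1:
        for j in nbrs_of_A1:
            G_tilde.add((i, j))  # edge at distance <= 3 * Delta
```

One round costs $t d^2$ oracle time but can discover $t^{3/2}$ pairs at once, which is where the savings over the naive $t^2 d^2$ come from.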

The algorithm up to this point is actually due to a recent paper by Boroujeni et al.; for the case when $G_{\Delta}$ is relatively sparse, they use Grover Search to discover all the remaining edges in quantum subquadratic time. It remains to see how to do it classically…

### Step 2: when $G_{\Delta}$ is relatively sparse

The main observation we need for this part is that if windows $A_{i}$ and $A_{j}$ are close, then in an optimum window-compatible matching they are probably not matched to $B$-windows that are very far apart. And in the rare event that they are matched to far-apart $B$-windows, the cost of inserting so many characters between $A_{i}$ and $A_{j}$ outweighs the cost of completely replacing $A_{j}$ if we had to. So once we have a candidate list of $B$-windows that $A_{i}$ might match to, it is safe to search for good matches for $A_{j}$ only around each of those candidates. But when the graph $G_{\Delta}$ is sparse, we have just such a short list: the neighbors of $A_{i}$!

We have to be a bit careful: for example, it is possible that $A_{i}$ is not matched to any of its neighbors in $G_{\Delta}$. But if we sample enough $A_{i}$’s from some interval around $A_{j}$, then either (i) at least one of them is matched to a neighbor in $G_{\Delta}$; or (ii) $G_{\Delta}$ doesn’t contribute much to reducing the edit distance for this interval, so it’s OK to miss some of those edges.

## Improving the approximation factor

On the back of my virtual envelope, I think the above ideas give a $(3+\varepsilon)$-approximation. But as far as genomes go, up to a $(3+\varepsilon)$-approximation, you’re as closely related to your dog as to a banana. So it would be great to improve the approximation factor:

Open question 2: Is there a $(1+\varepsilon)$-approximation algorithm for edit distance in truly subquadratic (or near-linear) time?

Note that only the sparsification step loses more than $(1+\varepsilon)$ in approximation. Also, none of the existing fine-grained hardness results rule out a $(1+\varepsilon)$-approximation, even in linear time!

by aviad.rubinstein at July 20, 2018 10:38 AM UTC

### Optimal Las Vegas Approximate Near Neighbors in $\ell_p$

Authors: Alexander Wei
Abstract: We show that approximate near neighbor search in high dimensions can be solved in a Las Vegas fashion (i.e., without false negatives) for $\ell_p$ ($1\le p\le 2$) while matching the performance of optimal locality-sensitive hashing. Specifically, we construct a data-independent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, c r)$-approximate near neighbors in $\mathbb{R}^{d}$ under the $\ell_p$ norm, where $\rho = 1/c^p + o(1)$. Furthermore, we give a Las Vegas locality-sensitive filter construction for the unit sphere that can be used with the data-dependent data structure of Andoni et al. (SODA 2017) to achieve optimal space-time tradeoffs in the data-dependent setting. For the symmetric case, this gives us a data-dependent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, c r)$-approximate near neighbors in $\mathbb{R}^{d}$ under the $\ell_p$ norm, where $\rho = 1/(2c^p - 1) + o(1)$.

Our data-independent construction improves on the recent Las Vegas data structure of Ahle (FOCS 2017) for $\ell_p$ when $1 < p\le 2$. Our data-dependent construction does even better for $\ell_p$ for all $p\in [1, 2]$ and is the first Las Vegas approximate near neighbors data structure to make use of data-dependent approaches. We also answer open questions of Indyk (SODA 2000), Pagh (SODA 2016), and Ahle by showing that for approximate near neighbors, Las Vegas data structures can match state-of-the-art Monte Carlo data structures in performance for both the data-independent and data-dependent settings and across space-time tradeoffs.

### Exact Algorithms for Finding Well-Connected 2-Clubs in Real-World Graphs: Theory and Experiments

Authors: Christian Komusiewicz, André Nichterlein, Rolf Niedermeier, Marten Picker
Abstract: Finding (maximum-cardinality) "cliquish" subgraphs is a central topic in graph mining and community detection. A popular clique relaxation is the 2-club: instead of asking for subgraphs of diameter one (these are cliques), one asks for subgraphs of diameter two (these are 2-clubs). A drawback of the 2-club model is that it produces hub-and-spoke structures (typically star-like) as maximum-cardinality solutions. Hence, we study 2-clubs with the additional constraint to be well-connected. More specifically, we investigate the algorithmic complexity for three variants of well-connected 2-clubs, all established in the literature: robust, hereditary, and "connected" 2-clubs. Finding these denser 2-clubs is NP-hard; nevertheless, we develop an exact combinatorial algorithm, extensively using efficient data reduction rules. Besides several theoretical insights we provide a number of empirical results based on an engineered implementation of our exact algorithm. In particular, the algorithm significantly outperforms existing algorithms on almost all (large) real-world graphs we considered.

### Prophet Secretary Through Blind Strategies

Authors: Jose Correa, Raimundo Saona, Bruno Ziliotto
Abstract: In the classic prophet inequality, samples from independent random variables arrive online. A gambler who knows the distributions must decide at each point in time whether to stop and pick the current sample or to continue and lose that sample forever. The goal of the gambler is to maximize the expected value of what she picks, and the performance measure is the worst-case ratio between the expected value the gambler gets and what a prophet, who sees all the realizations in advance, gets. In the late seventies, Krengel and Sucheston, and Garling (1977), established that this worst-case ratio is a universal constant equal to 1/2. In the last decade prophet inequalities have resurged as an important problem due to their connections to posted price mechanisms, frequently used in online sales. A very interesting variant is the Prophet Secretary problem, in which the only difference is that the samples arrive in a uniformly random order. For this variant several algorithms achieve a constant of 1-1/e, and very recently this barrier was slightly improved. This paper analyzes strategies that set a nonincreasing sequence of thresholds to be applied at different times: the gambler stops the first time a sample surpasses the corresponding threshold. Specifically, we consider a class of strategies called blind quantile strategies. They consist of fixing a function which is used to define a sequence of thresholds once the instance is revealed. Our main result shows that they can achieve a constant of 0.665, improving upon the best known result of Azar et al. (2018), and on Beyhaghi et al. (2018) (order selection). Our proof analyzes precisely the underlying stopping time distribution, relying on Schur-convexity theory. We further prove that blind strategies cannot achieve better than 0.675. Finally, we prove that no nonadaptive algorithm for the gambler can achieve better than 0.732.
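As background for threshold strategies, here is a minimal Monte Carlo sketch of the classical single-threshold rule; it illustrates the 1/2-style guarantee for the fixed-order setting, not the paper's blind quantile strategies, and the uniform distributions are an assumption made purely for the demo:

```python
import random

def simulate(n=3, trials=20000, seed=0):
    """Monte Carlo sketch: single fixed threshold tau = E[max] / 2.

    The gambler accepts the first i.i.d. Uniform(0,1) sample above tau.
    This classical rule already guarantees half the prophet's value; the
    paper's blind quantile strategies use a nonincreasing *sequence* of
    thresholds to push the random-order constant past 1 - 1/e.
    """
    rng = random.Random(seed)
    e_max = n / (n + 1)          # E[max of n Uniform(0,1)] = n / (n + 1)
    tau = e_max / 2
    gambler = prophet = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        prophet += max(xs)                                # prophet sees everything
        gambler += next((x for x in xs if x > tau), 0.0)  # first sample above tau
    return gambler / prophet     # empirical competitive ratio
```

On this easy instance the empirical ratio is well above 1/2; the hard instances that force the worst-case constants look quite different.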

### A Tale of Santa Claus, Hypergraphs and Matroids

Authors: Sami Davies, Thomas Rothvoss, Yihao Zhang
Abstract: A well-known problem in scheduling and approximation algorithms is the Santa Claus problem. Suppose that Santa Claus has a set of gifts, and he wants to distribute them among a set of children so that the least happy child is made as happy as possible. Here, the value that a child $i$ has for a present $j$ is of the form $p_{ij} \in \{ 0,p_j\}$. The only known polynomial time algorithm by Annamalai et al. gives a $12.33$-approximation algorithm and is based on a modification of Haxell's hypergraph matching argument. This factor compares to the value of an exponential size \emph{configuration LP}.

In this paper, we introduce a \emph{matroid} version of the Santa Claus problem with unit size presents and design an algorithm which gives a polynomial time $(3+\varepsilon)$-approximation compared to a natural, compact LP. Our algorithm is also based on Haxell's augmentation tree, but despite the greater generality, it is cleaner than previous methods. Our result can then be used as a blackbox to obtain a $(6+\varepsilon)$-approximation for Santa Claus (with arbitrary present sizes). This factor also compares against a natural, compact LP for Santa Claus.

### A $\phi$-Competitive Algorithm for Scheduling Packets with Deadlines

Authors: Pavel Veselý, Marek Chrobak, Łukasz Jeż, Jiří Sgall
Abstract: In the online packet scheduling problem with deadlines (PacketScheduling, for short), the goal is to schedule transmissions of packets that arrive over time in a network switch and need to be sent across a link. Each packet $p$ has a deadline $d_p$, representing its urgency, and a non-negative weight $w_p$, that represents its priority. Only one packet can be transmitted in any time slot, so, if the system is overloaded, some packets will inevitably miss their deadlines and be dropped. In this scenario, the natural objective is to compute a transmission schedule that maximizes the total weight of packets which are successfully transmitted. The problem is inherently online, with the scheduling decisions made without the knowledge of future packet arrivals. The central problem concerning PacketScheduling, that has been a subject of intensive study since 2001, is to determine the optimal competitive ratio of online algorithms, namely the worst-case ratio between the optimum total weight of a schedule (computed by an offline algorithm) and the weight of a schedule computed by a (deterministic) online algorithm.

We solve this open problem by presenting a $\phi$-competitive online algorithm for PacketScheduling (where $\phi\approx 1.618$ is the golden ratio), matching the previously established lower bound.

### Approximation Schemes for Low-Rank Binary Matrix Approximation Problems

Authors: Fedor V. Fomin, Petr A. Golovach, Daniel Lokshtanov, Fahad Panolan, Saket Saurabh
Abstract: We provide a randomized linear time approximation scheme for a generic problem about clustering of binary vectors subject to additional constraints. The new constrained clustering problem encompasses a number of problems and by solving it, we obtain the first linear time-approximation schemes for a number of well-studied fundamental problems concerning clustering of binary vectors and low-rank approximation of binary matrices. Among the problems solvable by our approach are \textsc{Low GF(2)-Rank Approximation}, \textsc{Low Boolean-Rank Approximation}, and various versions of \textsc{Binary Clustering}. For example, for the \textsc{Low GF(2)-Rank Approximation} problem, where for an $m\times n$ binary matrix $A$ and integer $r>0$, we seek a binary matrix $B$ of $GF(2)$ rank at most $r$ such that the $\ell_0$ norm of the matrix $A-B$ is minimized, our algorithm, for any $\epsilon>0$, in time $f(r,\epsilon)\cdot n\cdot m$, where $f$ is some computable function, outputs a $(1+\epsilon)$-approximate solution with probability at least $(1-\frac{1}{e})$. Our approximation algorithms substantially improve the running times and approximation factors of previous works. We also give (deterministic) PTASes for these problems running in time $n^{f(r)\frac{1}{\epsilon^2}\log \frac{1}{\epsilon}}$, where $f$ is some function depending on the problem. Our algorithm for the constrained clustering problem is based on a novel sampling lemma, which is interesting in its own right.

### Fast and Deterministic Approximations for $k$-Cut

Authors: Kent Quanrud
Abstract: In an undirected graph, a $k$-cut is a set of edges whose removal breaks the graph into at least $k$ connected components. The minimum weight $k$-cut can be computed in $O(n^{2(k-1)})$ randomized time [Karger and Stein 1996] and $O(n^{2k + O(1)})$ deterministic time [Thorup 2008], but when $k$ is treated as part of the input, computing the minimum weight $k$-cut is NP-Hard [Goldschmidt and Hochbaum 1994]. For $poly(m,n,k)$-time algorithms, the best possible approximation factor is essentially 2 under the small set expansion hypothesis [Manurangsi 2017]. Saran and Vazirani [1995] showed that a $(2 - 2/k)$-approximately minimum weight $k$-cut can be computed by $O(k)$ minimum cuts, which implies an $\tilde{O}(mk)$ randomized running time and a $\tilde{O}(k m n)$ deterministic running time. These results prompt two basic questions. First, is there a deterministic algorithm for 2-approximate $k$-cuts in $\tilde{O}(mk)$ time? Second, can 2-approximate $k$-cuts be computed as fast as the (exact) minimum cut - in $\tilde{O}(m)$ randomized time or $\tilde{O}(mn)$ deterministic time?

We make progress on both of these questions with a deterministic approximation algorithm that computes $(2 + \epsilon)$-minimum $k$-cuts in $O(m \log^3(n) / \epsilon^2)$ time, via a $(1 + \epsilon)$-approximation for an LP relaxation of $k$-cut.

### Reconstructing Latent Orderings by Spectral Clustering

Authors: Antoine Recanati, Thomas Kerdreux, Alexandre d'Aspremont
Abstract: Spectral clustering uses a graph Laplacian spectral embedding to enhance the cluster structure of some data sets. When the embedding is one dimensional, it can be used to sort the items (spectral ordering). A number of empirical results also suggest that a multidimensional Laplacian embedding enhances the latent ordering of the data, if any. This also extends to circular orderings, a case where unidimensional embeddings fail. We tackle the task of retrieving linear and circular orderings in a unifying framework, and show how a latent ordering on the data translates into a filamentary structure on the Laplacian embedding. We propose a method to recover it, illustrated with numerical experiments on synthetic data and real DNA sequencing data. The code and experiments are available at https://github.com/antrec/mdso.
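The one-dimensional case the abstract mentions, spectral ordering by the Fiedler vector, can be sketched in a few lines (this illustrates the standard technique, not the authors' multidimensional method):

```python
import numpy as np

def spectral_order(adj):
    """Spectral ordering: sort vertices by the Fiedler vector.

    adj is a symmetric adjacency/similarity matrix (list of lists is fine).
    The eigenvector of the second-smallest eigenvalue of the unnormalized
    Laplacian gives a 1-d embedding; sorting by it recovers a latent linear
    order up to reversal.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(lap)         # eigenvalues in ascending order
    return np.argsort(vecs[:, 1])         # order along the Fiedler vector
```

For a path graph with scrambled vertex labels, the Fiedler vector is monotone along the path, so this recovers the hidden order exactly (up to reversal); circular orderings are where this 1-d embedding breaks down.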

### A Fixed-Parameter Linear-Time Algorithm to Compute Principal Typings of Planar Flow Networks

Authors: Assaf Kfoury
Abstract: We present an alternative and simpler method for computing principal typings of flow networks. When limited to planar flow networks, the method can be made to run in fixed-parameter linear-time -- where the parameter not to be exceeded is what is called the edge-outerplanarity of the networks' underlying graphs.

### Learning Sums of Independent Random Variables with Sparse Collective Support

Authors: Anindya De, Philip M. Long, Rocco Servedio
Abstract: We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports. For $\mathcal{A} \subset \mathbf{Z}_{+}$, a sum of independent random variables with collective support $\mathcal{A}$ (called an $\mathcal{A}$-sum in this paper) is a distribution $\mathbf{S} = \mathbf{X}_1 + \cdots + \mathbf{X}_N$ where the $\mathbf{X}_i$'s are mutually independent (but not necessarily identically distributed) integer random variables with $\cup_i \mathsf{supp}(\mathbf{X}_i) \subseteq \mathcal{A}.$ We give two main algorithmic results for learning such distributions:

1. For the case $| \mathcal{A} | = 3$, we give an algorithm for learning $\mathcal{A}$-sums to accuracy $\epsilon$ that uses $\mathsf{poly}(1/\epsilon)$ samples and runs in time $\mathsf{poly}(1/\epsilon)$, independent of $N$ and of the elements of $\mathcal{A}$.

2. For an arbitrary constant $k \geq 4$, if $\mathcal{A} = \{ a_1,...,a_k\}$ with $0 \leq a_1 < ... < a_k$, we give an algorithm that uses $\mathsf{poly}(1/\epsilon) \cdot \log \log a_k$ samples (independent of $N$) and runs in time $\mathsf{poly}(1/\epsilon, \log a_k).$

We prove an essentially matching lower bound: if $|\mathcal{A}| = 4$, then any algorithm must use $\Omega(\log \log a_4)$ samples even for learning to constant accuracy. We also give similar-in-spirit (but quantitatively very different) algorithmic results, and essentially matching lower bounds, for the case in which $\mathcal{A}$ is not known to the learner.

### The parameterised complexity of computing the maximum modularity of a graph

Authors: Kitty Meeks, Fiona Skerman
Abstract: The maximum modularity of a graph is a parameter widely used to describe the level of clustering or community structure in a network. Determining the maximum modularity of a graph is known to be NP-complete in general, and in practice a range of heuristics are used to construct partitions of the vertex-set which give lower bounds on the maximum modularity but without any guarantee on how close these bounds are to the true maximum. In this paper we investigate the parameterised complexity of determining the maximum modularity with respect to various standard structural parameterisations of the input graph G. We show that the problem belongs to FPT when parameterised by the size of a minimum vertex cover for G, and is solvable in polynomial time whenever the treewidth or max leaf number of G is bounded by some fixed constant; we also obtain an FPT algorithm, parameterised by treewidth, to compute an additive approximation to the maximum modularity. On the other hand we show that the problem is W[1]-hard (and hence unlikely to admit an FPT algorithm) when parameterised simultaneously by pathwidth and the size of a minimum feedback vertex set.
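For reference, the modularity of a given partition is easy to compute; what is NP-hard is maximizing it over all partitions. A minimal sketch of the standard formula $Q = \sum_c \left( e_c/m - (d_c/2m)^2 \right)$ (the edge list and community labels below are illustrative, not from the paper):

```python
def modularity(edges, community):
    """Modularity of a partition: Q = sum over communities c of
    e_c / m - (d_c / (2m))^2, where m = #edges, e_c = #edges inside c,
    and d_c = total degree of community c."""
    m = len(edges)
    intra, deg = {}, {}
    for u, v in edges:
        deg[community[u]] = deg.get(community[u], 0) + 1
        deg[community[v]] = deg.get(community[v], 0) + 1
        if community[u] == community[v]:
            intra[community[u]] = intra.get(community[u], 0) + 1
    return sum(intra.get(c, 0) / m - (deg.get(c, 0) / (2 * m)) ** 2
               for c in set(community.values()))
```

The heuristics mentioned in the abstract (Louvain-style methods, for instance) greedily move vertices between communities to increase this quantity, with no guarantee of reaching the maximum.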

### An ETH-Tight Exact Algorithm for Euclidean TSP

Authors: Mark de Berg, Hans L. Bodlaender, Sandor Kisfaludi-Bak, Sudeshna Kolay
Abstract: We study exact algorithms for {\sc Euclidean TSP} in $\mathbb{R}^d$. In the early 1990s algorithms with $n^{O(\sqrt{n})}$ running time were presented for the planar case, and some years later an algorithm with $n^{O(n^{1-1/d})}$ running time was presented for any $d\geq 2$. Despite significant interest in subexponential exact algorithms over the past decade, there has been no progress on {\sc Euclidean TSP}, except for a lower bound stating that the problem admits no $2^{O(n^{1-1/d-\epsilon})}$ algorithm unless ETH fails. Up to constant factors in the exponent, we settle the complexity of {\sc Euclidean TSP} by giving a $2^{O(n^{1-1/d})}$ algorithm and by showing that a $2^{o(n^{1-1/d})}$ algorithm does not exist unless ETH fails.

### Time-Bounded Influence Diffusion with Incentives

Authors: Gennaro Cordasco, Luisa Gargano, Joseph Peters, Adele Anna Rescigno, Ugo Vaccaro
Abstract: A widely studied model of influence diffusion in social networks represents the network as a graph $G=(V,E)$ with an influence threshold $t(v)$ for each node. Initially the members of an initial set $S\subseteq V$ are influenced. During each subsequent round, the set of influenced nodes is augmented by including every node $v$ that has at least $t(v)$ previously influenced neighbours. The general problem is to find a small initial set that influences the whole network. In this paper we extend this model by using \emph{incentives} to reduce the thresholds of some nodes. The goal is to minimize the total of the incentives required to ensure that the process completes within a given number of rounds. The problem is hard to approximate in general networks. We present polynomial-time algorithms for paths, trees, and complete networks.
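The diffusion process itself is straightforward to simulate. Here is a minimal sketch of the threshold model described above (the names and the graph encoding are my own, not the paper's):

```python
def diffusion_rounds(neighbors, threshold, seed_set, max_rounds):
    """Simulate threshold-based influence diffusion for a bounded number of rounds.

    neighbors[v] lists v's neighbours; a node v becomes influenced once at least
    threshold[v] of its neighbours are influenced.  Returns the influenced set
    after at most max_rounds rounds.  Incentives, as in the paper, would simply
    lower some threshold[v] values before running this process.
    """
    influenced = set(seed_set)
    for _ in range(max_rounds):
        newly = {v for v in neighbors
                 if v not in influenced
                 and sum(u in influenced for u in neighbors[v]) >= threshold[v]}
        if not newly:
            break
        influenced |= newly
    return influenced
```

The round bound matters: on a path with all thresholds 1, a single seed influences only one new node per round, which is exactly the kind of instance where incentives buy time.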

### An Information-theoretic Framework for the Lossy Compression of Link Streams

Authors: Robin Lamarche-Perrin
Abstract: Graph compression is a data analysis technique that consists in the replacement of parts of a graph by more general structural patterns in order to reduce its description length. It notably provides interesting exploration tools for the study of real, large-scale, and complex graphs which cannot be grasped at first glance. This article proposes a framework for the compression of temporal graphs, that is for the compression of graphs that evolve with time. This framework first builds on a simple and limited scheme, exploiting structural equivalence for the lossless compression of static graphs, then generalises it to the lossy compression of link streams, a recent formalism for the study of temporal graphs. Such generalisation relies on the natural extension of (bidimensional) relational data by the addition of a third temporal dimension. Moreover, we introduce an information-theoretic measure to quantify and to control the information that is lost during compression, as well as an algebraic characterisation of the space of possible compression patterns to enhance the expressiveness of the initial compression scheme. These contributions lead to the definition of a combinatorial optimisation problem, that is the Lossy Multistream Compression Problem, for which we provide an exact algorithm.

### Smoothed Analysis of the Expected Number of Maximal Points in Two Dimensions

Authors: Josep Diaz, Mordecai Golin
Abstract: The {\em maximal} points in a set $S$ are those that aren't {\em dominated} by any other point in $S$. Such points arise in multiple application settings, in which they are called by a variety of different names, e.g., maxima, Pareto optimums, skylines. Because of their ubiquity, there is a large literature on the {\em expected} number of maxima in a set $S$ of $n$ points chosen IID from some distribution. Most such results assume that the underlying distribution is uniform over some spatial region and strongly use this uniformity in their analysis. This work was initially motivated by the question of how this expected number changes if the input distribution is perturbed by random noise. More specifically, let $\mathrm{Ball}_p$ denote the uniform distribution over the 2-d unit $L_p$ ball, $\delta\,\mathrm{Ball}_q$ the uniform distribution over the 2-d $L_q$ ball of radius $\delta$, and $\mathrm{Ball}_{p,q}$ the convolution of the two distributions, i.e., a point $v\in \mathrm{Ball}_p$ is reported with an error chosen from $\delta\,\mathrm{Ball}_q$. The question is how the expected number of maxima changes as a function of $\delta$. Although the original motivation is for small $\delta$, the problem is well defined for any $\delta$ and our analysis treats the general case. More specifically, we study, as a function of $n,\delta$, the expected number of maximal points when the $n$ points in $S$ are chosen IID from distributions of the type $\mathrm{Ball}_{p,q}$ where $p,q\in\{1,2,\infty\}$ for $\delta>0$, and also of the type $\mathrm{Ball}_{p,\infty\text{-}q}$, where $q\in[1,\infty)$, for $\delta>0$.
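For intuition, computing the maximal points of a fixed 2-d set is a classic exercise; here is a short sketch of the standard sweep (background for the problem, not the paper's probabilistic analysis):

```python
def maximal_points(points):
    """Return the maximal (Pareto-optimal) points of a 2-d point set.

    A point is maximal iff no other point dominates it in both coordinates.
    Sweeping in decreasing x order while keeping a running maximum of y gives
    the classic O(n log n) algorithm.
    """
    maxima, best_y = [], float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:          # nothing to the right is higher: maximal
            maxima.append((x, y))
            best_y = y
    return maxima[::-1]         # restore increasing-x order
```

The quantity the paper studies is the expected length of this output when the input points are random and then perturbed.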

### The Scheduler is Very Powerful in Competitive Analysis of Distributed List Accessing

Authors: Joan Boyar, Faith Ellen, Kim S. Larsen
Abstract: This work is a continuation of efforts to define and understand competitive analysis of algorithms in a distributed shared memory setting, which is surprisingly different from the classical online setting. In fact, in a distributed shared memory setting, we find a counter-example to the theorem concerning classical randomized online algorithms which shows that, if there is a $c$-competitive randomized algorithm against an adaptive offline adversary, then there is a $c$-competitive deterministic algorithm [Ben-David, Borodin, Karp, Tardos, Wigderson, 1994]. In a distributed setting, there is additional lack of knowledge concerning what the other processes have done. There is also additional power for the adversary, having control of the scheduler which decides when each process is allowed to take steps.

We consider the list accessing problem, which is a benchmark problem for sequential online algorithms. In the distributed version of this problem, each process has its own finite sequence of requests to a shared list. The scheduler arises as a major issue in its competitive analysis. We introduce two different adversaries, which differ in how they are allowed to schedule processes, and use them to perform competitive analysis of distributed list accessing. We prove tight upper and lower bounds on combinatorial properties of merges of the request sequences, which we use in the analysis. Our analysis shows that the effects of the adversarial scheduler can be quite significant, dominating the usual quality loss due to lack of information about the future.
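For context, the classic sequential benchmark algorithm for list accessing is move-to-front. A minimal sketch of the sequential version (the distributed variant studied in the paper adds processes and an adversarial scheduler on top of this):

```python
def mtf_cost(requests, initial):
    """Move-to-front for sequential list accessing.

    Serve each request by scanning the list for it (cost = 1-based position),
    then move the requested item to the front.  MTF is the classic
    2-competitive algorithm in the sequential setting.
    """
    lst = list(initial)
    total = 0
    for r in requests:
        i = lst.index(r)
        total += i + 1              # access cost: position of the item
        lst.insert(0, lst.pop(i))   # move the accessed item to the front
    return total, lst
```

In the distributed version, each process runs its own request sequence against the shared list, and the scheduler's interleaving of their accesses is what the paper's adversary controls.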

### Deterministic oblivious distribution (and tight compaction) in linear time

Authors: Enoch Peserico
Abstract: In an array of N elements, M positions and M elements are "marked". We show how to permute the elements in the array so that all marked elements end up in marked positions, in time O(N) (in the standard word-RAM model), deterministically, and obliviously - i.e., with a sequence of memory accesses that depends only on N and not on which elements or positions are marked.

As a corollary, we answer in the affirmative an open question about the existence of a deterministic oblivious algorithm with O(N) running time for tight compaction (move the M marked elements to the first M positions of the array), a building block for several cryptographic constructions. Our O(N) result improves the running-time upper bounds for deterministic tight compaction, for randomized tight compaction, and for the simpler problem of randomized loose compaction (move the M marked elements to the first O(M) positions) - until now respectively O(N lg N), O(N lg lg N), and O(N lg*N).
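To pin down what obliviousness rules out, here is the trivial non-oblivious baseline for tight compaction; its memory-access pattern depends on which elements are marked, which is exactly what the paper's deterministic O(N) algorithm avoids:

```python
def tight_compaction_nonoblivious(arr, marked):
    """Non-oblivious baseline: permute arr so the marked elements come first.

    Trivially O(N), but which cells are read and written when depends on the
    marking - an observer of the access pattern learns which elements are
    marked.  An oblivious algorithm must use an access sequence fixed by N.
    """
    front = [x for x, m in zip(arr, marked) if m]
    back = [x for x, m in zip(arr, marked) if not m]
    return front + back
```

Hiding the access pattern while staying deterministic and linear-time is the whole difficulty; previously this cost at least an extra lg* N factor.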

### Massively Parallel Symmetry Breaking on Sparse Graphs: MIS and Maximal Matching

Abstract: The success of massively parallel computation (MPC) paradigms such as MapReduce has led to a significant interest in better understanding their true computational power. The fundamental question in this regard is how the advantages of this model (e.g. free local computation) can be leveraged to improve the round-complexity of algorithms inherited from traditional parallel and distributed models such as PRAM or LOCAL. Problems such as maximal independent set (MIS) or maximal matching are among the most intensively studied problems in the field, and logarithmic-round algorithms have been known for these problems since the 1980s. However, prior to our work, no sublogarithmic MPC algorithm was known for these problems using a truly sublinear space of $n^{1-\Omega(1)}$ per machine where $n$ denotes the number of vertices.

Our main result is a truly sublinear algorithm that takes $O(\log \alpha + \log^2\log n)$ rounds to compute MIS or maximal matching using an optimal total space of $\tilde{O}(m)$ where $m$ denotes the number of edges and $\alpha$ denotes the arboricity of the input graph. We believe parametrization by arboricity is particularly interesting for this regime of MPC since most families of sparse graphs have a small arboricity. Our algorithms do not assume arboricity is constant and do not need to be given $\alpha$. This is the first substantial improvement over the known PRAM/LOCAL algorithms for these problems on such a wide class of graphs.

Since trees have arboricity one, our algorithm improves and generalizes the recent algorithm of Brandt et al.~[arXiv:1802.06748] that finds MIS on trees in $O(\log^3\log n)$ rounds. Moreover, the $n$-dependency of our algorithm exponentially improves over the corresponding $O(\log \alpha + \sqrt{\log n})$ PRAM/LOCAL bounds by Barenboim et al.~[FOCS'12] and Ghaffari~[SODA'16].

### Supermodular Locality Sensitive Hashes

Authors: Maxim Berman, Matthew B. Blaschko
Abstract: In this work, we show deep connections between Locality Sensitive Hashability and submodular analysis. We show that the LSHability of the most commonly analyzed set similarities is in one-to-one correspondence with the supermodularity of these similarities when taken with respect to the symmetric difference of their arguments. We find that the supermodularity of equivalent LSHable similarities can be dependent on the set encoding. While monotonicity and supermodularity do not imply the metric condition necessary for LSHability, this condition is guaranteed for the more restricted class of supermodular Hamming similarities that we introduce. We show moreover that LSH preserving transformations are also supermodular-preserving, yielding a way to generate families of similarities both LSHable and supermodular. Finally, we show that even the more restricted family of cardinality-based supermodular Hamming similarities presents promising aspects for the study of the link between LSHability and supermodularity. We hope that the several bridges that we introduce between LSHability and supermodularity pave the way to a better understanding both of supermodular analysis and LSHability, notably in the context of large-scale supermodular optimization.

### Derandomizing the Lovasz Local Lemma via log-space statistical tests

Authors: David G. Harris
Abstract: The Lovász Local Lemma (LLL) is a keystone principle in probability theory, guaranteeing the existence of configurations which avoid a collection $\mathcal B$ of "bad" events which are mostly independent and have low probability. In its simplest form, it asserts that whenever a bad-event has probability $p$ and affects at most $d$ other bad-events, and $e p (d+1) < 1$, then a configuration avoiding all $\mathcal B$ exists.

A seminal algorithm of Moser & Tardos (2010) gives randomized algorithms for most constructions based on the LLL. However, deterministic algorithms have lagged behind. Notably, prior deterministic LLL algorithms have required stringent conditions on $\mathcal B$; for example, they have required that events in $\mathcal B$ have low decision-tree complexity or depend on a small number of variables. For this reason, they can only be applied to a small fraction of the numerous LLL applications in practice.

We describe an alternative deterministic parallel (NC) algorithm for the LLL, based on a general derandomization method of Sivakumar (2002) using log-space statistical tests. The only requirement here is that bad-events should be computable via a finite automaton with $\text{poly}(d)$ states. This covers most LLL applications to graph theory and combinatorics. No auxiliary information about the bad-events, including any conditional probability calculations, is required. Additionally, the proof is a straightforward combination of general derandomization results and high-level analysis of the Moser-Tardos algorithm.

We illustrate with applications to defective vertex coloring, domatic partition, and independent transversals.

### The Online $k$-Taxi Problem

Authors: Christian Coester, Elias Koutsoupias
Abstract: We consider the online $k$-taxi problem, a generalization of the $k$-server problem, in which $k$ taxis serve a sequence of requests in a metric space. A request consists of two points $s$ and $t$, representing a passenger that wants to be carried by a taxi from $s$ to $t$. The goal is to serve all requests while minimizing the total distance traveled by all taxis. The problem comes in two flavors, called the easy and the hard $k$-taxi problem: In the easy $k$-taxi problem, the cost is defined as the total distance traveled by the taxis; in the hard $k$-taxi problem, the cost is only the distance traveled by taxis while not carrying a passenger, i.e., the distance from $s$ to $t$ is excluded from the cost.

The hard $k$-taxi problem is substantially more difficult than the easy version, with a deterministic competitive ratio of at least $\Omega(2^k)$, which follows from a reduction from the layered width-$k$ graph traversal problem. In contrast, the easy $k$-taxi problem has exactly the same competitive ratio as the $k$-server problem. For the hard $k$-taxi problem on hierarchically separated trees (HSTs), we present a memoryless randomized algorithm with competitive ratio $2^k-1$ against adaptive online adversaries and provide a matching lower bound. Due to well-known HST embedding techniques, this yields a randomized $O(2^k\log n)$-competitive algorithm for arbitrary $n$-point metric spaces. This is the first competitive algorithm for the hard $k$-taxi problem for general finite metric spaces and general $k$. For the special case of $k=2$, we obtain a precise answer of $9$ for the competitive ratio on general metric spaces.

### Distributed Triangle Detection via Expander Decomposition

Authors: Yi-Jun Chang, Seth Pettie, Hengjie Zhang
Abstract: We present improved distributed algorithms for triangle detection and its variants in the CONGEST model. We show that Triangle Detection, Counting, and Enumeration can be solved in $\tilde{O}(n^{1/2})$ rounds. In contrast, the previous state-of-the-art bounds for Triangle Detection and Enumeration were $\tilde{O}(n^{2/3})$ and $\tilde{O}(n^{3/4})$, respectively, due to Izumi and LeGall (PODC 2017).

The main technical novelty in this work is a distributed graph partitioning algorithm. We show that in $\tilde{O}(n^{1-\delta})$ rounds we can partition the edge set of the network $G=(V,E)$ into three parts $E=E_m\cup E_s\cup E_r$ such that

(a) Each connected component induced by $E_m$ has minimum degree $\Omega(n^\delta)$ and conductance $\Omega(1/\text{poly} \log(n))$. As a consequence the mixing time of a random walk within the component is $O(\text{poly} \log(n))$.

(b) The subgraph induced by $E_s$ has arboricity at most $n^{\delta}$.

(c) $|E_r| \leq |E|/6$.

All of our algorithms are based on the following generic framework, which we believe is of interest beyond this work. Roughly, we deal with the set $E_s$ by an algorithm that is efficient for low-arboricity graphs, and deal with the set $E_r$ using recursive calls. For each connected component induced by $E_m$, we are able to simulate congested clique algorithms with small overhead by applying a routing algorithm due to Ghaffari, Kuhn, and Su (PODC 2017) for high conductance graphs.

### Group Theory Is Tough

from Richard Lipton

Some musings on group theory

Isaacs honorary conference (photo source)

Martin Isaacs is a group theorist emeritus from the University of Wisconsin, Madison. I just picked up a copy of his 2008 book, Finite Group Theory. Simple title; great book.

Today I want to make a few short comments about group theory.

I always hated group theory. Well not really hated it. But I have never been able to get decent intuition about groups.

Ken has a similar issue and puts it this way: We are “spoiled” by first learning about fields and rings with ${\mathbb{Z},\mathbb{Q},\mathbb{R},\mathbb{C}}$ as all-pervading examples, then ${\mathbb{Z}_p}$ and ${\mathbb{Z}_m}$ for prime ${p}$ and non-prime ${m}$. Each includes an additive group and (for the fields) a multiplicative group but is so much more. They combine to form other spaces such as ${\mathbb{R}^n}$ and ${\mathbb{Z}[x_1,\dots, x_n]}$.

Groups alone seem neither to command such sway as individual mathematical objects nor to combine so prolifically. The analogue of a vector space with scalars drawn from a ring rather than a field is called a module, and it feels like a similar "hole" in our intuition.

Ken remembers Martin, "Marty," well, having known him during Marty's visit to Oxford in 1984. Marty was often found in the lobby or the tea room of the old Mathematical Institute, animatedly talking about open problems and ideas for teaching. His "group" included Graham Higman, Peter Neumann, Hilary Priestley, Robin Gandy, and anyone who happened to stop by. Ken remembers geniality, openness, and great ways of putting things.

## Groups and Complexity

Of course group theory is important in many parts of complexity theory. Here are two important examples of group theory open questions:

• How hard is group isomorphism?

• Are solvable groups as powerful as simple groups?

The latter is of course asking if a solvable group can be used like a non-abelian simple group in computing via bounded width programs. We have discussed both of these questions before: check out
this and this.

I definitely would suggest you look at Isaacs's book if you are at all interested in getting some insight into group theory. I have been looking at it and now feel that I have intuition about groups. Not much. But some. The issue is all mine, since the book is so well written.

## A Remark

Isaacs makes a cool remark in his book on group theory. Suppose that ${G}$ is a group with a non-trivial normal subgroup ${N}$. Then often we can conclude that at least one of ${N}$ or ${G/N}$ is solvable. This can be done in the case that the orders of ${N}$ and ${G/N}$ are co-prime. The proof of this is quite amusing: it depends on two theorems:

Theorem 1 If ${x}$ and ${y}$ are co-prime numbers, then at least one of ${x}$ and ${y}$ is odd.

Theorem 2 Every group of odd order is solvable.

One of these is very trivial and the other is very non-trivial. I trust you know which is which.

## Another Remark

Group theory quite often uses second-order reasoning, even for elementary results. This sets it apart from other parts of mathematics. A typical group theory result is proved by the following paradigm:

Let ${H}$ be a subgroup of ${G}$ that is largest with regard to some property ${X}$. Then show that ${H}$ does not exist, or that ${H}$ must also have some other property ${Y}$.

When we are talking about finite groups, then of course there are only finitely many subgroups, and so the notion of "largest" involves nothing subtle. Moreover, one can transfer all this into first-order sentences about their string encodings. But the theory is really naturally second-order. For infinite groups the notion of maximal can be really tricky. For example, the group of all ${2^n}$-th complex roots of unity, taken over all ${n = 1,2,3,4,\dots}$ (the Prüfer 2-group), has no maximal subgroups at all: every proper subgroup is a finite cyclic 2-group, properly contained in a strictly larger one.

## Open Problems

Should we teach groups and modules in a richer fashion? Is it really hard to get intuition in group theory? Or is that just an example of why mathematics is hard?


by rjlipton at July 18, 2018 09:01 PM UTC

### Counting higher order tangencies for plane curves

Authors: Joshua Zahl
Abstract: We prove that $n$ plane algebraic curves determine $O(n^{(k+2)/(k+1)})$ points of $k$-th order tangency. This generalizes an earlier result of Ellenberg, Solymosi, and Zahl on the number of (first order) tangencies determined by $n$ plane algebraic curves.

### Fisher zeros and correlation decay in the Ising model

Authors: Jingcheng Liu, Alistair Sinclair, Piyush Srivastava
Abstract: In this note, we show that the zero field Ising partition function has no complex zeros in the interaction parameter (known as Fisher zeros) in a complex neighborhood of the correlation decay regime.

### Optimization of the n-dimensional sliding window inter-channel correlation algorithm for multi-core architecture

Authors: Alexey Poyda, Mikhail Zhizhin
Abstract: Calculating the correlation in a sliding window is a common method for statistically evaluating the relationship between two data sets. Although computing a single correlation coefficient is neither resource-intensive nor algorithmically complex, sequentially computing it over a large number of windows on large data sets can take quite a long time. In this case each value in the data, falling into different windows, is processed many times, increasing the complexity of the algorithm and the processing time. We took this fact into account and optimized the correlation calculation in the sliding window, reducing the number of operations in the overlapping area of the windows. In addition, we developed a parallel version of the optimized algorithm for the GPU architecture. Experimental studies have shown that for a 7x7 correlation window sliding in one-pixel increments, we were able to accelerate the processing of a 12 MPixel image on the GPU by about 60 times compared to the serial version running on the CPU. The article presents an optimized version of the algorithm, a scheme for its parallelization, as well as the results of experimental studies.
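The overlap optimization the abstract describes can be sketched with summed-area tables (integral images), a standard trick that makes every window sum an O(1) operation regardless of window size. This is a minimal NumPy sketch under that assumption, not the authors' implementation; all function names are illustrative.

```python
import numpy as np

def integral(a):
    # 2-D summed-area table, with a zero row/column prepended
    s = np.cumsum(np.cumsum(a, axis=0), axis=1)
    return np.pad(s, ((1, 0), (1, 0)))

def window_sums(a, w):
    # sum over every w x w window via four lookups in the integral image
    s = integral(a)
    return s[w:, w:] - s[:-w, w:] - s[w:, :-w] + s[:-w, :-w]

def sliding_corr(x, y, w):
    # Pearson correlation of x and y over every w x w window;
    # each value of x and y enters the cumulative sums exactly once,
    # instead of once per overlapping window as in the naive method
    n = w * w
    sx, sy = window_sums(x, w), window_sums(y, w)
    sxx, syy = window_sums(x * x, w), window_sums(y * y, w)
    sxy = window_sums(x * y, w)
    cov = sxy - sx * sy / n
    var_x = sxx - sx * sx / n
    var_y = syy - sy * sy / n
    return cov / np.sqrt(var_x * var_y)
```

The naive approach costs O(w^2) per window; here the per-window cost is constant, which is the same saving the paper exploits in the overlapping regions.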

### Local Rejection Sampling with Soft Filters

Authors: Weiming Feng, Yahui Liu, Yitong Yin
Abstract: We study a new LLL-like framework for rejection sampling, which naturally generalizes the variable-framework for the Lovász local lemma (LLL) and the LLL-based Partial Rejection Sampling of Guo, Jerrum and Liu [STOC'17]. In this new framework, the occurrence of a bad event is biased by the current evaluation of variables, rather than being determined by them as in the standard setting of LLL. Such generalization allows us to model rejection sampling with soft filters. We give a generic algorithm, called Local Rejection Sampling, for sampling from the correct distribution arising from the rejection rules. For rejection sampling with soft filters, this algorithm only updates variables which are local to the violated events in the dependency graph, making it suitable for sampling with local computation or on dynamic input.

As a nontrivial consequence, for any distribution defined by local constraints, if a decay of correlation property known as the strong spatial mixing holds, there is a distributed zero-error Las Vegas algorithm for sampling precisely from this distribution within $O(\log^3 n)$ rounds.

### Tracking the $\ell_2$ Norm with Constant Update Time

Authors: Chi-Ning Chou, Zhixian Lei, Preetum Nakkiran
Abstract: The $\ell_2$ tracking problem is the task of obtaining a streaming algorithm that, given access to a stream of items $a_1,a_2,a_3,\ldots$ from a universe $[n]$, outputs at each time $t$ an estimate of the $\ell_2$ norm of the frequency vector $f^{(t)}\in \mathbb{R}^n$ (where $f^{(t)}_i$ is the number of occurrences of item $i$ in the stream up to time $t$). Previous work [Braverman-Chestnut-Ivkin-Nelson-Wang-Woodruff, FOCS 2017] gave a streaming algorithm with the optimal space of $O(\epsilon^{-2}\log(1/\delta))$ words and $O(\epsilon^{-2}\log(1/\delta))$ update time to obtain an $\epsilon$-accurate estimate with probability at least $1-\delta$.

We give the first algorithm that achieves an update time of $O(\log(1/\delta))$, independent of the accuracy parameter $\epsilon$, together with the optimal space of $O(\epsilon^{-2}\log(1/\delta))$ words. Our algorithm is obtained using the Count Sketch of [Charikar-Chen-Farach-Colton, ICALP 2002].
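As a rough illustration of the Count Sketch primitive the algorithm builds on (a textbook toy, not the paper's constant-update-time construction), here is a minimal $\ell_2$ estimator: each row's squared $\ell_2$ norm is an unbiased estimate of the second frequency moment $F_2$, and a median over rows boosts confidence. The hashing scheme and all parameters are illustrative.

```python
import math
import random

class CountSketchL2:
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.tables = [[0.0] * width for _ in range(depth)]
        # per-row salts for the (illustrative) hash-based bucket/sign functions
        self.salts = [(rng.randrange(1 << 30), rng.randrange(1 << 30))
                      for _ in range(depth)]

    def _bucket_sign(self, row, item):
        a, b = self.salts[row]
        h = hash((a, item, b))
        return (h >> 1) % self.width, 1 if h & 1 else -1

    def update(self, item, delta=1):
        # O(depth) counter updates per stream element
        for r in range(self.depth):
            j, s = self._bucket_sign(r, item)
            self.tables[r][j] += s * delta

    def estimate(self):
        # each row's squared l2 norm is an unbiased estimate of F_2;
        # take the median across rows, then a square root
        sq = sorted(sum(c * c for c in row) for row in self.tables)
        mid = self.depth // 2
        m = sq[mid] if self.depth % 2 else (sq[mid - 1] + sq[mid]) / 2
        return math.sqrt(m)
```

Note the toy's update time is $O(\text{depth})$ per element; the paper's contribution is getting the amortized update cost down to $O(\log(1/\delta))$ independent of $\epsilon$.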

### Quantum Chebyshev's Inequality and Applications

Authors: Yassine Hamoudi, Frédéric Magniez
Abstract: In this paper we provide new quantum algorithms with polynomial speed-up for a range of problems for which no such results were known, or we improve previous algorithms. First, we consider the approximation of the frequency moments $F_k$ of order $k \geq 3$ in the multi-pass streaming model with updates (turnstile model). We design a $P$-pass quantum streaming algorithm with memory $M$ satisfying a tradeoff of $P^2 M = \tilde{O}(n^{1-2/k})$, whereas the best classical algorithm requires $P M = \Theta(n^{1-2/k})$. Then, we study the problem of estimating the number $m$ of edges and the number $t$ of triangles given query access to an $n$-vertex graph. We describe optimal quantum algorithms that perform $\tilde{O}(\sqrt{n}/m^{1/4})$ and $\tilde{O}(\sqrt{n}/t^{1/6} + m^{3/4}/\sqrt{t})$ queries, respectively. This is a quadratic speed-up compared to the classical complexity of these problems.

For this purpose we develop a new quantum paradigm that we call Quantum Chebyshev's inequality. Namely, we demonstrate that one can approximate with relative error the mean of any random variable with a number of quantum samples that is linear in the ratio of the square root of the variance to the mean. Classically the dependency is quadratic. Our result is optimal and subsumes a previous result of Montanaro [Mon15]. This new paradigm is based on a refinement of the Amplitude Estimation algorithm [BHMT02], and of previous quantum algorithms for the mean estimation problem. For our applications, we also adapt the variable-time amplitude amplification technique of Ambainis [Amb10] into a variable-time amplitude estimation algorithm, improving a recent result of Chakraborty, Gilyén and Jeffery [CGJ18].

### Computing the Homology of Semialgebraic Sets I: Lax Formulas

Authors: Peter Bürgisser, Felipe Cucker, Josué Tonelli-Cueto
Abstract: We describe and analyze an algorithm for computing the homology (Betti numbers and torsion coefficients) of closed semialgebraic sets given by Boolean formulas without negations over lax polynomial inequalities. The algorithm works in weak exponential time. This means that outside a subset of data having exponentially small measure, the cost of the algorithm is single exponential in the size of the data.

All previous algorithms solving this problem have doubly exponential complexity (and this is so for almost all input data). Our algorithm thus represents an exponential acceleration over state-of-the-art algorithms for all input data outside a set that vanishes exponentially fast.

### Using statistical encoding to achieve tree succinctness never seen before

Authors: Michał Gańczorz
Abstract: We propose a new succinct representation of labeled trees which represents a tree T using |T|H_k(T) number of bits (plus some smaller order terms), where |T|H_k(T) denotes the k-th order (tree label) entropy, as defined by Ferragina et al. 2005. Our representation employs a new, simple method of partitioning the tree, which preserves both tree shape and node degrees. Previously, the only representation that used |T|H_k(T) bits was based on XBWT, a transformation that linearizes tree labels into a single string, combined with compression boosting. The proposed representation is much simpler than the one based on XBWT, which used additional linear space (bounded by 0.01n) hidden in the "smaller order terms" notion, as an artifact of using a zeroth order entropy coder; our representation uses sublinear additional space (for reasonable values of k and size of the label alphabet {\sigma}). The proposed representation can be naturally extended to a succinct data structure for trees, which uses |T|H_k(T) plus additional O(|T|k log_{\sigma}/ log_{\sigma} |T| + |T| log log_{\sigma} |T|/ log_{\sigma} |T|) bits and supports all the usual navigational queries in constant time. At the cost of increasing the query time to O(log log |T|/ log |T|) we can further reduce the space redundancy to O(|T| log log |T|/ log_{\sigma} |T|) bits, assuming k <= log_{\sigma} |T|. This is a major improvement over the representation based on XBWT: even though the XBWT-based representation uses |T|H_k(T) bits, the space needed for a structure supporting navigational queries is much larger: (...)

### Near-optimal Bootstrapping of Hitting Sets for Algebraic Circuits

Authors: Mrinal Kumar, Ramprasad Saptharishi, Anamay Tengse
Abstract: The classical lemma of Ore-DeMillo-Lipton-Schwartz-Zippel states that any nonzero polynomial $f(x_1,\ldots, x_n)$ of degree at most $s$ will evaluate to a nonzero value at some point on a grid $S^n \subseteq \mathbb{F}^n$ with $|S| > s$. Thus, there is a deterministic polynomial identity test (PIT) for all degree-$s$ size-$s$ algebraic circuits in $n$ variables that runs in time $\mathrm{poly}(s) \cdot (s+1)^n$. In a surprising recent result, Agrawal, Ghosh and Saxena (STOC 2018) showed that any deterministic blackbox PIT algorithm for degree-$s$, size-$s$, $n$-variate circuits with running time as bad as $s^{n^{0.5 - \delta}}\mathrm{Huge}(n)$, where $\delta > 0$ and $\mathrm{Huge}(n)$ is an arbitrary function, can be used to construct blackbox PIT algorithms for degree-$s$ size-$s$ circuits with running time $s^{\exp \circ \exp (O(\log ^\ast s))}$.

The authors asked if a similar conclusion followed if their hypothesis was weakened to having deterministic PIT with running time $s^{o(n)}\cdot \mathrm{Huge}(n)$. In this paper, we answer their question in the affirmative. We show that, given a deterministic blackbox PIT that runs in time $s^{o(n)}\cdot \mathrm{Huge}(n)$ for all degree-$s$ size-$s$ algebraic circuits over $n$ variables, we can obtain a deterministic blackbox PIT that runs in time $s^{\exp \circ \exp(O(\log^{*}s))}$ for all degree-$s$ size-$s$ algebraic circuits over $n$ variables. In other words, any blackbox PIT with just a slightly non-trivial exponent of $s$ compared to the trivial $s^{O(n)}$ test can be used to give a nearly polynomial time blackbox PIT algorithm.

### Generation of planar tensegrity structures through cellular multiplication

Authors: Omar Aloui, David Orden, Landolf Rhode-Barbarigos
Abstract: Tensegrity structures are frameworks in a stable self-equilibrated prestress state that have been applied in various fields in science and engineering. Research into tensegrity structures has resulted in reliable techniques for their form finding and analysis. However, most techniques address topology and form separately. This paper presents a bio-inspired approach for the combined topology identification and form finding of planar tensegrity structures. Tensegrity structures are generated using tensegrity cells (elementary stable self-stressed units that have been proven to compose any tensegrity structure) according to two multiplication mechanisms: cellular adhesion and fusion. Changes in the dimension of the self-stress space of the structure are found to depend on the number of adhesion and fusion steps conducted as well as on the interaction among the cells composing the system. A methodology for defining a basis of the self-stress space is also provided. Through the definition of the equilibrium shape, the number of nodes and members as well as the number of self-stress states, the cellular multiplication method can integrate design considerations, providing great flexibility and control over the tensegrity structure designed and opening the door to the development of a whole new realm of planar tensegrity systems with controllable characteristics.

### Parameterized Query Complexity of Hitting Set using Stability of Sunflowers

Authors: Arijit Bishnu, Arijit Ghosh, Sudeshna Kolay, Gopinath Mishra, Saket Saurabh
Abstract: In this paper, we study the query complexity of parameterized decision and optimization versions of Hitting Set, Vertex Cover, Packing, Matching and Max-Cut. The main focus is the query complexity of Hitting Set. In doing so, we use an oracle known as BIS, introduced by Beame et al. [BeameHRRS18], and its generalizations to hypergraphs. The query models considered are the GPIS and GPISE oracles:

(i) the GPIS oracle takes as input $d$ pairwise disjoint non-empty vertex subsets $A_1, \ldots, A_d$ in a hypergraph $\cal H$ and answers whether there is a hyperedge with vertices in each $A_i$;

(ii) the GPISE oracle takes the same input and returns a hyperedge that has vertices in each $A_i$, or NULL if no such hyperedge exists.

The GPIS and GPISE oracles are used for the decision and optimization versions of the problems, respectively. For $d=2$, we refer to GPIS and GPISE as BIS and BISE, respectively. We use color coding and queries to the oracles to generate subsamples of the hypergraph that retain some structural properties of the original hypergraph. We use the stability of sunflowers in a non-trivial way to do so.
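For concreteness, the two oracle types can be mimicked on an explicitly stored hypergraph, as in this toy reference model (not a sublinear-query implementation; here "has vertices in each $A_i$" is read as intersecting every $A_i$):

```python
def gpis(hyperedges, parts):
    # GPIS oracle: given pairwise-disjoint vertex sets A_1..A_d (parts),
    # answer whether some hyperedge intersects every A_i
    return any(all(e & a for a in parts) for e in hyperedges)

def gpise(hyperedges, parts):
    # GPISE oracle: return such a hyperedge, or None (the abstract's NULL)
    for e in hyperedges:
        if all(e & a for a in parts):
            return e
    return None
```

With $d=2$ and edges given as 2-element sets, `gpis` plays the role of the BIS oracle on a graph.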

### Classical lower bounds from quantum upper bounds

Authors: Shalev Ben-David, Adam Bouland, Ankit Garg, Robin Kothari
Abstract: We prove lower bounds on complexity measures, such as the approximate degree of a Boolean function and the approximate rank of a Boolean matrix, using quantum arguments. We prove these lower bounds using a quantum query algorithm for the combinatorial group testing problem.

We show that for any function f, the approximate degree of computing the OR of n copies of f is Omega(sqrt{n}) times the approximate degree of f, which is optimal. No such general result was known prior to our work, and even the lower bound for the OR of ANDs function was only resolved in 2013.

We then prove an analogous result in communication complexity, showing that the logarithm of the approximate rank (or more precisely, the approximate gamma_2 norm) of F: X x Y -> {0,1} grows by a factor of Omega~(sqrt{n}) when we take the OR of n copies of F, which is also essentially optimal. As a corollary, we give a new proof of Razborov's celebrated Omega(sqrt{n}) lower bound on the quantum communication complexity of the disjointness problem.

Finally, we generalize both these results from composition with the OR function to composition with arbitrary symmetric functions, yielding nearly optimal lower bounds in this setting as well.

### Sparsifying Distributed Algorithms with Ramifications in Massively Parallel Computation and Centralized Local Computation

Authors: Mohsen Ghaffari, Jara Uitto
Abstract: We introduce a method for sparsifying distributed algorithms and exhibit how it leads to improvements that go past known barriers in two algorithmic settings of large-scale graph processing: Massively Parallel Computation (MPC), and Local Computation Algorithms (LCA).

- MPC with Strongly Sublinear Memory: Recently, there has been growing interest in obtaining MPC algorithms that are faster than their classic $O(\log n)$-round parallel counterparts for problems such as MIS, Maximal Matching, 2-Approximation of Minimum Vertex Cover, and $(1+\epsilon)$-Approximation of Maximum Matching. Currently, all such MPC algorithms require $\tilde{\Omega}(n)$ memory per machine. Czumaj et al. [STOC'18] were the first to achieve such a sublogarithmic round complexity, running in $O((\log\log n)^2)$ rounds with $\tilde{O}(n)$ memory per machine. We obtain $\tilde{O}(\sqrt{\log \Delta})$-round MPC algorithms for all these four problems that work even when each machine has memory $n^{\alpha}$ for any constant $\alpha\in (0, 1)$. Here, $\Delta$ denotes the maximum degree. These are the first sublogarithmic-time algorithms for these problems that break the linear memory barrier.

- LCAs with Query Complexity Below the Parnas-Ron Paradigm: Currently, the best known LCA for MIS has query complexity $\Delta^{O(\log \Delta)} poly(\log n)$, by Ghaffari [SODA'16]. As pointed out by Rubinfeld, obtaining a query complexity of $poly(\Delta\log n)$ remains a central open question. Ghaffari's bound almost reaches a $\Delta^{\Omega\left(\frac{\log \Delta}{\log\log \Delta}\right)}$ barrier common to all known MIS LCAs, which simulate distributed algorithms by learning the local topology, à la Parnas-Ron [TCS'07]. This barrier follows from the $\Omega(\frac{\log \Delta}{\log\log \Delta})$ distributed lower bound of Kuhn et al. [JACM'16]. We break this barrier and obtain an MIS LCA with query complexity $\Delta^{O(\log\log \Delta)} poly(\log n)$.

### Faster Algorithms via Waring Decompositions

Authors: Kevin Pratt
Abstract: We show that decompositions of certain polynomials as sums of powers of linear forms yield faster algorithms for some algebraic problems. Such a decomposition is known as a Waring decomposition. Our results are:

1. We give a black-box algorithm for computing the sum of the coefficients of the degree-k multilinear monomials in a polynomial over a field of characteristic zero. Our algorithm runs in time $O^*(n^{k/2})$ and space $poly(n)$. This solves an open problem of Koutis and Williams [1].

2. We give a randomized $O^*((3e/2)^k) \approx O^*(4.08^k)$ time, polynomial space, black-box algorithm for detecting multilinear monomials of degree $k$ over a field of characteristic zero. This improves on the $O^*(4.32^k)$ time, exponential space algorithm given in [2]. We note that there is an $O^*(2^k)$ lower bound on our approach.

### Anaconda: A Non-Adaptive Conditional Sampling Algorithm for Distribution Testing

Authors: Gautam Kamath, Christos Tzamos
Abstract: We investigate distribution testing with access to non-adaptive conditional samples. In the conditional sampling model, the algorithm is given the following access to a distribution: it submits a query set $S$ to an oracle, which returns a sample from the distribution conditioned on being from $S$. In the non-adaptive setting, all query sets must be specified in advance of viewing the outcomes.

Our main result is the first polylogarithmic-query algorithm for equivalence testing, deciding whether two unknown distributions are equal to or far from each other. This is an exponential improvement over the previous best upper bound, and demonstrates that the complexity of the problem in this model is intermediate between its complexity in the standard sampling model and in the adaptive conditional sampling model. We also significantly improve the sample complexity for the easier problems of uniformity and identity testing. For the former, our algorithm requires only $\tilde O(\log n)$ queries, matching the information-theoretic lower bound up to an $O(\log \log n)$ factor.

Our algorithm works by reducing the problem from $\ell_1$-testing to $\ell_\infty$-testing, which enjoys a much cheaper sample complexity. Necessitated by the limited power of the non-adaptive model, our algorithm is very simple to state. However, there are significant challenges in the analysis, due to the complex structure of how two arbitrary distributions may differ.
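The conditional-sampling access described above is easy to model explicitly. This toy oracle (illustrative, not from the paper) draws from a known distribution $p$ conditioned on the query set $S$; in the non-adaptive model, all such $S$ would be fixed before any draws are seen.

```python
import random

def conditional_sample(p, S, rng=random):
    # p: dict mapping outcomes to probabilities; S: the query set.
    # Returns one draw from p conditioned on landing in S.
    support = [x for x in S if p.get(x, 0) > 0]
    if not support:
        raise ValueError("query set S has zero mass under p")
    weights = [p[x] for x in support]
    return rng.choices(support, weights=weights)[0]
```

The testing algorithms in the abstract only get samples from such an oracle; they never see $p$ itself.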

### A PTAS for $\ell_p$-Low Rank Approximation

Authors: Frank Ban, Vijay Bhattiprolu, Karl Bringmann, Pavel Kolev, Euiwoong Lee, David P. Woodruff
Abstract: A number of recent works have studied algorithms for entrywise $\ell_p$-low rank approximation, namely algorithms which given an $n \times d$ matrix $A$ (with $n \geq d$), output a rank-$k$ matrix $B$ minimizing $\|A-B\|_p^p=\sum_{i,j} |A_{i,j} - B_{i,j}|^p$. We show the following:
On the algorithmic side, for $p \in (0,2)$, we give the first $n^{\text{poly}(k/\epsilon)}$ time $(1+\epsilon)$-approximation algorithm. For $p = 0$, there are various problem formulations, a common one being the binary setting for which $A\in\{0,1\}^{n\times d}$ and $B = U \cdot V$, where $U\in\{0,1\}^{n \times k}$ and $V\in\{0,1\}^{k \times d}$. There are also various notions of multiplication $U \cdot V$, such as a matrix product over the reals, over a finite field, or over a Boolean semiring. We give the first PTAS for what we call the Generalized Binary $\ell_0$-Rank-$k$ Approximation problem, for which these variants are special cases. Our algorithm runs in time $(1/\epsilon)^{2^{O(k)}/\epsilon^{2}} \cdot nd \cdot \log^{2^k} d$. For the specific case of finite fields of constant size, we obtain an alternate algorithm with time $n \cdot d^{\text{poly}(k/\epsilon)}$.
On the hardness front, for $p \in (1,2)$, we show under the Small Set Expansion Hypothesis and Exponential Time Hypothesis (ETH), there is no constant factor approximation algorithm running in time $2^{k^{\delta}}$ for a constant $\delta > 0$, showing an exponential dependence on $k$ is necessary. For $p = 0$, we observe that there is no approximation algorithm for the Generalized Binary $\ell_0$-Rank-$k$ Approximation problem running in time $2^{2^{\delta k}}$ for a constant $\delta > 0$. We also show for finite fields of constant size, under the ETH, that any fixed constant factor approximation algorithm requires $2^{k^{\delta}}$ time for a constant $\delta > 0$.