Theory of Computing Blog Aggregator

As in previous years (2014, 2015), I have compiled the list of accepted papers for this year’s edition of COLT, together with links to the arXiv submission whenever I could find one. These 63 papers were selected from about 200 submissions (a healthy 11% increase in submissions over last year).

COLT 2016 accepted papers

by Sebastien Bubeck at May 06, 2016 02:44 AM UTC

Authors: Johannes Bausch, Toby Cubitt, Maris Ozols
Download: PDF
Abstract: We prove that estimating the smallest eigenvalue of a translationally-invariant, nearest-neighbour Hamiltonian on a 1D chain of quantum systems is QMAEXP-complete, even for systems of low dimension. This improves on the work by Gottesman and Irani (2009), where this dimension was several orders of magnitude larger, whereas our dimension is comparable to the best-known non-translationally-invariant result by Hallgren et al. (2013). The main novelty of our work is the introduction of a new approach to encoding quantum computation in the ground state of a local Hamiltonian, and for analyzing its ground state energy. While previous constructions directly encoded one of the standard models, we introduce a new Turing-complete model of quantum computation - a quantum ring machine. We then translate the ring machine of a QMAEXP verifier into another new model - a quantum Thue system. This hybrid classical-quantum model is an extension of string rewriting systems and is specifically tailored to encoding quantum computation in a nearest-neighbour Hamiltonian with translationally-invariant interactions. This allows us to shift the proof burden from simplifying the Hamiltonians used to encode a standard computational model, to proving universality of a very simplified model. While previous constructions could accommodate only serial computation, quantum Thue systems allow for non-determinism and concurrency. This requires more sophisticated techniques for analyzing the spectrum of the resulting Hamiltonian: we relate it to the Laplacian of a graph with unitary edge weights. To capture the non-deterministic branching in the computational path, we allow arbitrary graphs, including ones with cycles, which goes beyond path graphs considered in all previous constructions.

at May 06, 2016 12:40 AM UTC

Authors: Michael B. Cohen, Aleksander Madry, Piotr Sankowski, Adrian Vladu
Download: PDF
Abstract: In this paper, we study a set of combinatorial optimization problems on weighted graphs: the shortest path problem with negative weights, the weighted perfect bipartite matching problem, the unit-capacity minimum-cost maximum flow problem and the weighted perfect bipartite $b$-matching problem under the assumption that $\Vert b\Vert_1=O(m)$. We show that each one of these four problems can be solved in $\tilde{O}(m^{10/7}\log W)$ time, where $W$ is the absolute maximum weight of an edge in the graph, which gives the first polynomial improvement in their sparse-graph time complexity in over 25 years.

At a high level, our algorithms build on the interior-point method-based framework developed by Madry (FOCS 2013) for solving the unit-capacity maximum flow problem. We develop a refined way to analyze this framework, as well as provide new variants of the underlying preconditioning and perturbation techniques. Consequently, we are able to extend the whole interior-point method-based approach to make it applicable in the weighted graph regime.

at May 06, 2016 12:42 AM UTC

Authors: Kasper Green Larsen, Ryan Williams
Download: PDF
Abstract: We consider the \emph{Online Boolean Matrix-Vector Multiplication} (OMV) problem studied by Henzinger et al. [STOC'15]: given an $n \times n$ Boolean matrix $M$, we receive $n$ Boolean vectors $v_1,\ldots,v_n$ one at a time, and are required to output $M v_i$ (over the Boolean semiring) before seeing the vector $v_{i+1}$, for all $i$. Previously known algorithms for this problem are combinatorial, running in $O(n^3/\log^2 n)$ time. Henzinger et al. conjecture there is no $O(n^{3-\varepsilon})$ time algorithm for OMV, for all $\varepsilon > 0$; their OMV conjecture is shown to imply strong hardness results for many basic dynamic problems.

We give a substantially faster method for computing OMV, running in $n^3/2^{\Omega(\sqrt{\log n})}$ randomized time. In fact, after seeing $2^{\omega(\sqrt{\log n})}$ vectors, we already achieve $n^2/2^{\Omega(\sqrt{\log n})}$ amortized time for matrix-vector multiplication. Our approach gives a way to reduce matrix-vector multiplication to solving a version of the Orthogonal Vectors problem, which in turn reduces to "small" algebraic matrix-matrix multiplication. Applications include faster independent set detection, partial match retrieval, and 2-CNF evaluation.

We also show how a modification of our method gives a cell probe data structure for OMV with worst case $O(n^{7/4}/\sqrt{w})$ time per query vector, where $w$ is the word size. This result rules out an unconditional proof of the OMV conjecture using purely information-theoretic arguments.
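As a concrete reference point for the OMV setting defined above, here is the trivial baseline (a sketch I wrote for illustration, not any algorithm from the paper): each incoming vector is answered by a fresh Boolean matrix-vector product over the Boolean semiring, i.e., OR of ANDs, at $O(n^2)$ cost per query.

```python
def boolean_matvec(M, v):
    """One OMV query over the Boolean semiring: (Mv)_j = OR_k (M[j][k] AND v[k])."""
    n = len(M)
    return [any(M[j][k] and v[k] for k in range(n)) for j in range(n)]

M = [[1, 0, 1],
     [0, 0, 0],
     [0, 1, 1]]

# Vectors arrive online; each answer must be output before the next vector is seen.
queries = [[1, 0, 0], [0, 1, 1]]
answers = [boolean_matvec(M, v) for v in queries]
```

The paper's point is that amortizing work across the online sequence beats this $O(n^2)$-per-query bound by a $2^{\Omega(\sqrt{\log n})}$ factor.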

at May 06, 2016 12:00 AM UTC

Authors: Szymon Grabowski, Marcin Raniszewski
Download: PDF
Abstract: Rank and select queries on bitmaps are essential building blocks of many compressed data structures, including text indexes, membership and range-supporting spatial data structures, compressed graphs, and more. First considered theoretically in the 1980s, these primitives have also been the subject of vivid research on their practical incarnations over the last decade. We present a few novel rank/select variants, focusing mostly on speed, obtaining competitive space-time results in the compressed setting. Our findings can be summarized as follows: $(i)$ no single rank/select solution works best on any kind of data (ours are optimized for concatenated bit arrays obtained from wavelet trees for real text datasets), $(ii)$ it pays to efficiently handle blocks consisting of all 0 or all 1 bits, $(iii)$ compressed select does not have to be significantly slower than compressed rank at comparable memory use.
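For readers unfamiliar with the primitives, here is a minimal sketch (my own, purely for definition purposes) of naive $O(n)$ rank/select on a plain bit array; practical structures like those in the paper precompute block summaries to answer both in constant time with small extra space.

```python
def rank1(bits, i):
    """rank(i): number of 1s in bits[0..i) (prefix of length i)."""
    return sum(bits[:i])

def select1(bits, k):
    """select(k): position of the k-th 1 (1-indexed), or -1 if there is none."""
    count = 0
    for pos, b in enumerate(bits):
        count += b
        if count == k:
            return pos
    return -1

bits = [1, 0, 1, 1, 0, 0, 1]
```

Note the inverse relationship the two queries satisfy: `rank1(bits, select1(bits, k) + 1) == k` whenever the k-th 1 exists.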

at May 06, 2016 12:42 AM UTC

Authors: Anatol Slissenko
Download: PDF
Abstract: The paper describes an approach to measuring the convergence of an algorithm to its result in terms of an entropy-like function of partitions of its inputs of a given length. The goal is to look at algorithmic data processing from the viewpoint of information transformation, with the hope of better understanding the work of an algorithm, and perhaps its complexity. Entropy is a measure of uncertainty and does not correspond to our intuitive understanding of information; however, it is what we have in this area. To realize this approach we introduce a measure on the inputs of a given length based on the Principle of Maximal Uncertainty: all results should be equiprobable to the algorithm at the beginning. An algorithm is viewed as a set of events, where each event is an application of a command; the commands are very basic. To measure convergence we introduce a quantity called the entropic weight of the events of the algorithm. The approach is illustrated by two examples.

at May 06, 2016 12:40 AM UTC

Authors: Takaaki Nishimoto, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
Download: PDF
Abstract: A Longest Common Extension (LCE) query on a text $T$ of length $N$ asks for the length of the longest common prefix of suffixes starting at given two positions. We show that the signature encoding $\mathcal{G}$ of size $w = O(\min(z \log N \log^* M, N))$ [Mehlhorn et al., Algorithmica 17(2):183-198, 1997] of $T$, which can be seen as a compressed representation of $T$, has a capability to support LCE queries in $O(\log N + \log \ell \log^* M)$ time, where $\ell$ is the answer to the query, $z$ is the size of the Lempel-Ziv77 (LZ77) factorization of $T$, and $M \geq 3N$ is an integer that can be handled in constant time under word RAM model. In compressed space, this is the fastest deterministic LCE data structure in many cases. Moreover, $\mathcal{G}$ can be enhanced to support efficient update operations: After processing $\mathcal{G}$ in $O(w f_{\mathcal{A}})$ time, we can insert/delete any (sub)string of length $y$ into/from an arbitrary position of $T$ in $O((y+ \log N\log^* M) f_{\mathcal{A}})$ time, where $f_{\mathcal{A}} = O(\min \{ \frac{\log\log M \log\log w}{\log\log\log M}, \sqrt{\frac{\log w}{\log\log w}} \})$. This yields the first fully dynamic LCE data structure. We also present efficient construction algorithms from various types of inputs: We can construct $\mathcal{G}$ in $O(N f_{\mathcal{A}})$ time from uncompressed string $T$; in $O(n \log\log n \log N \log^* M)$ time from grammar-compressed string $T$ represented by a straight-line program of size $n$; and in $O(z f_{\mathcal{A}} \log N \log^* M)$ time from LZ77-compressed string $T$ with $z$ factors. On top of the above contributions, we show several applications of our data structures which improve previous best known results on grammar-compressed string processing.
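To make the query type concrete, here is a naive $O(\ell)$ LCE computation on an uncompressed string (a sketch of the definition only; the paper's contribution is supporting such queries on a compressed, dynamically updatable representation of $T$).

```python
def lce(T, i, j):
    """Length of the longest common prefix of the suffixes T[i:] and T[j:]."""
    ell = 0
    while i + ell < len(T) and j + ell < len(T) and T[i + ell] == T[j + ell]:
        ell += 1
    return ell

T = "abracadabra"
```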

at May 06, 2016 12:43 AM UTC

Our next talk will take place on Wednesday, May 11th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Ankit Garg (Princeton University) will tell us about his work on Operator scaling and applications to non-commutative rational identity testing (abstract below).

Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks,* or to suggest a possible topic or speaker, please see the website.

Sinkhorn in 1964 studied the problem of matrix scaling: given a non-negative matrix A, find diagonal matrices D_1 and D_2 with positive entries (if they exist) such that D_1 A D_2 is doubly stochastic. He proved that a natural iterative algorithm finds this scaling if the entries of A are strictly positive. Matrix scaling has since found applications in various fields such as statistics, numerical analysis, operations research, combinatorial geometry and approximating the permanent. Gurvits in 2004 found a striking generalization of matrix scaling, called operator scaling, where one wants to “scale” completely positive operators to make them “doubly stochastic.” Gurvits proved that a natural iterative algorithm converges, and, in fact, converges in a polynomial number of iterations in some special cases. We analyze Gurvits’ algorithm completely and prove that it always converges in a polynomial number of iterations. Our analysis is simple, providing explicit bounds on the “capacity” measure of completely positive operators introduced by Gurvits. Using this we obtain the first deterministic polynomial-time algorithm for computing the rank of a symbolic matrix in non-commuting variables. Previous algorithms required exponential time (with or without randomness). This problem (of computing the rank of a symbolic matrix) has a rich history and has appeared in various forms in a remarkable number of mathematical and computational areas. Besides non-commutative algebra, it also has connections to computational complexity and de-randomization, commutative invariant theory and quantum information theory.

This is based on joint work with Leonid Gurvits, Rafael Oliveira and Avi Wigderson.
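The classical matrix-scaling case that the talk generalizes is easy to sketch. Below is Sinkhorn's natural iterative algorithm (my own minimal implementation, for a strictly positive matrix): alternately normalize rows and columns; the accumulated scalings correspond to the diagonal matrices D_1 and D_2, and the iterates converge to a doubly stochastic matrix.

```python
def sinkhorn(A, iters=500):
    """Sinkhorn iteration: alternately normalize rows and columns of a
    strictly positive matrix A until it is (approximately) doubly stochastic."""
    n = len(A)
    B = [row[:] for row in A]
    for _ in range(iters):
        # Row step: divide each row by its sum.
        for i in range(n):
            s = sum(B[i])
            B[i] = [x / s for x in B[i]]
        # Column step: divide each column by its sum.
        for j in range(n):
            s = sum(B[i][j] for i in range(n))
            for i in range(n):
                B[i][j] /= s
    return B

A = [[1.0, 2.0], [3.0, 4.0]]
B = sinkhorn(A)
```

Operator scaling replaces the row/column normalizations by analogous normalizations of a completely positive operator; the talk's result is that the corresponding iteration always converges in polynomially many steps.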

* Following this talk, our next speaker (and last of the season!) will be Eric Blais, on May 25th, on “A Polynomial Lower Bound for Testing Monotonicity.”

by plustcs at May 05, 2016 05:19 PM UTC

Through the years I've mentioned on this blog a few of my favorite open problems in computational complexity that have perplexed me. Let me mention some of them again in case they inspire the new generation of complexity theorists.
  1. Does the polynomial-time hierarchy look like the arithmetic hierarchy? I mentioned this in the centenary post for Mostowski. Probably not because it would imply factoring in P (since NP∩co-NP would equal P) but we have no proof of separation and no oracle that makes them look the same.
  2. Does UP = NP imply the polynomial-time hierarchy collapses? What are the consequences if SAT had an NP algorithm with a unique accepting path? Remains open, again even in relativized worlds. 
  3. Do rational functions that agree with Boolean functions on the hypercube have low decision tree complexity? I really expected someone to have come up with a proof or counterexample by now. 
  4. What happens if two queries to NP can be simulated by a single query? Does S2p = ZPP^NP? Both questions asked in a post on S2p.
  5. Separate NP from Logarithmic space. I gave four approaches in a pre-blog 2001 survey on diagonalization (Section 3) though none have panned out. Should be much easier than separating P from NP.

by Lance Fortnow at May 05, 2016 02:26 PM UTC

April has been kind to us, providing us with a trio of new papers in sublinear algorithms. Also, in case you missed them, be sure to check out Oded Goldreich’s guest post and the associated open problem.

Sparse Fourier Transform in Any Constant Dimension with Nearly-Optimal Sample Complexity in Sublinear Time, by Michael Kapralov (ArXiv). There is a rich line of work focused on going beyond the celebrated Fast Fourier Transform algorithm by exploiting sparsity in the signal’s Fourier representation. While the one-dimensional case is very well understood, results in higher dimensions have either been lacking with respect to time complexity, sample complexity, or the approximation guarantee. This work is the first to give an algorithm which runs in sublinear time, has near-optimal sample complexity, and provides an arbitrarily accurate approximation guarantee for the multivariate case.

Sublinear Time Estimation of Degree Distribution Moments: The Arboricity Connection, by Talya Eden, Dana Ron, C. Seshadhri (ArXiv). This work revisits the problem of approximating the moments of the degree distribution of a graph. Prior work has shown that the \(s\)-th moment of the degree distribution can be computed with \(\tilde O(n^{1 – 1/(s+1)})\) queries, which is optimal up to \(\mathrm{poly}(\log n)\) factors. This work provides a new algorithm which requires only \(\tilde O(n^{1 – 1/s})\) queries on graphs of bounded arboricity, while still matching previous near-optimal bounds in the worst case. Impressively, the algorithm is incredibly simple, and can be stated in just a few lines of pseudocode.
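For concreteness, here is the quantity the Eden–Ron–Seshadhri paper approximates, computed exactly by brute force (a sketch I added, assuming the usual definition of the \(s\)-th degree moment as \(\sum_v d_v^s\)); the sublinear algorithms estimate this while examining only a small fraction of the graph.

```python
def degree_moment(adj, s):
    """Exact s-th moment of the degree distribution: sum over v of deg(v)^s.
    adj: dict mapping each vertex to its set of neighbours."""
    return sum(len(neigh) ** s for neigh in adj.values())

# A 4-cycle with one chord (0-2): degrees are 3, 2, 3, 2.
adj = {
    0: {1, 3, 2},
    1: {0, 2},
    2: {1, 3, 0},
    3: {2, 0},
}
```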

A Local Algorithm for Constructing Spanners in Minor-Free Graphs, by Reut Levi, Dana Ron, Ronitt Rubinfeld (ArXiv). This paper addresses the problem of locally constructing a sparse spanning graph. For an edge \(e\) in a graph \(G\), one must determine whether or not \(e\) is in a sparse spanning graph \(G’\) in a manner that is consistent with previous answers. In general graphs, the complexity of this problem is \(\Omega(\sqrt{n})\), leading to the study of restricted families of graphs. In minor-free graphs (which include, for example, planar graphs), existing algorithms require \((d/\varepsilon)^{\mathrm{poly}(h)\log(1/\varepsilon)}\) queries and induce a stretch of \(\mathrm{poly}(d, h, 1/\varepsilon)\), where \(d\) is the maximum degree, \(h\) is the size of the minor, and \(G’\) is of size at most \((1 + \varepsilon)n\). The algorithm in this paper shows the complexity of this problem to be polynomial, requiring only \(\mathrm{poly}(d, h, 1/\varepsilon)\) queries and inducing a stretch of \(\tilde O(h \log (d)/\varepsilon)\).

by Gautam "G" Kamath at May 05, 2016 04:24 AM UTC

Authors: Meghyn Bienvenu, Stanislav Kikot, Roman Kontchakov, Vladimir Podolskii, Michael Zakharyaschev
Download: PDF
Abstract: We give solutions to two fundamental computational problems in ontology-based data access with the W3C standard ontology language OWL 2 QL: the succinctness problem for first-order rewritings of ontology-mediated queries (OMQs), and the complexity problem for OMQ answering. We classify OMQs according to the shape of their conjunctive queries (treewidth, the number of leaves) and the existential depth of their ontologies. For each of these classes, we determine the combined complexity of OMQ answering, and whether all OMQs in the class have polynomial-size first-order, positive existential, and nonrecursive datalog rewritings. We obtain the succinctness results using hypergraph programs, a new computational model for Boolean functions, which makes it possible to connect the size of OMQ rewritings and circuit complexity.

at May 05, 2016 12:40 AM UTC

Authors: Anurag Anshu, Aleksandrs Belovs, Shalev Ben-David, Mika Göös, Rahul Jain, Robin Kothari, Troy Lee, Miklos Santha
Download: PDF
Abstract: While exponential separations are known between quantum and randomized communication complexity for partial functions (Raz, STOC 1999), the best known separation between these measures for a total function is quadratic, witnessed by the disjointness function. We give the first super-quadratic separation between quantum and randomized communication complexity for a total function, giving an example exhibiting a power 2.5 gap. We also present a nearly optimal quadratic separation between randomized communication complexity and the logarithm of the partition number, improving upon the previous best power 1.5 separation due to G\"o\"os, Jayram, Pitassi, and Watson.

Our results are the communication analogues of separations in query complexity proved using the recent cheat sheet framework of Aaronson, Ben-David, and Kothari (STOC 2016). Our main technical results are randomized communication and information complexity lower bounds for a family of functions, called lookup functions, that generalize and port the cheat sheet framework to communication complexity.

at May 05, 2016 12:42 AM UTC

Authors: Greg Bodwin
Download: PDF
Abstract: A pairwise distance preserver of a graph $G$ is a sparse subgraph that exactly preserves all pairwise distances within a given set of node pairs $P$. A famous and indispensable upper bound in this area is the Shortest Path Tree Lemma: there always exists a preserver of $G, P:= \{s\} \times V$ on $O(n)$ edges (or equivalently, $O(|P|)$ edges since here $|P| = n$). The Shortest Path Tree Lemma is extremely powerful and well applied, but its main drawback is the rigid structure it requires for $P$. In this paper, we generalize the Shortest Path Tree Lemma by asking whether this is really necessary: are there situations in which we can always find a preserver with density linear in $n$ or $|P|$, {\em without} assuming that $P$ has any particular structure?

Surprisingly, the answer is yes! We establish two extremely general situations in which linear-sized distance preservers always exist: (1) every $G, P$ has a distance preserver on $O(n)$ edges whenever $|P| = O(n^{1/3})$, even if $G$ is directed and/or weighted, and (2) every $G, P$ has a distance preserver on $O(|P|)$ edges whenever $|P| = \Omega\left(\frac{n^2}{rs(n)}\right)$ and $G$ is undirected and unweighted.

In the latter result, $rs(n)$ is the Ruzsa-Szemeredi function from combinatorics. It is a major open question to determine the value of $rs(n)$, but it is conceivable that $rs(n)$ is large enough to dominate any polylog factor. Thus, we obtain surprising $O(|P|)$-sized preservers even when $|P|$ is smaller than $n^2$ by a super-polylog factor in possibly all graphs; any exception graphs have been overlooked by researchers in diverse fields for the last 70 years and thus will likely be very hard to find.

Towards algorithmic applications, we further show that both of these distance preservers can be computed within essentially the best time bounds that can be expected.

at May 05, 2016 12:45 AM UTC

Authors: Tiago Tresoldi
Download: PDF
Abstract: This paper presents a new algorithm, the Modified Moving Contracting Window Pattern Algorithm (CMCWPM), for the calculation of field similarity. It strongly relies on previous work by Yang et al. (2001), correcting previous work in which characters marked as inaccessible for further pattern matching were not treated as boundaries between subfields, occasionally leading to higher than expected scores of field similarity. A reference Python implementation is provided.

at May 05, 2016 12:45 AM UTC

Authors: Gopinath Mishra, Subhra Mazumdar, Arindam Pal
Download: PDF
Abstract: Emergency evacuation is the process of moving people away from the threat or actual occurrence of hazards such as natural disasters, terrorist attacks, fires and bombs. In this paper, we focus on evacuation from a building, but the ideas can be applied to city and region evacuation. We define the problem and show how it can be modeled using graphs. The resulting optimization problem can be formulated as an integer linear program. Though this can be solved exactly, the approach does not scale to graphs with thousands of nodes and several hundred thousand edges, making it impractical for large instances.

We study a special case of this problem, where there is only a single source and a single sink. For this case, we give an improved algorithm, the \emph{Single Source Single Sink Evacuation Route Planner (SSEP)}, whose evacuation time is always at most that of the well-known \emph{Capacity Constrained Route Planner (CCRP)} and whose running time is strictly less than that of CCRP. We prove this mathematically and give supporting results from extensive experiments. We also study a randomized behavior model of people and give some interesting results.

at May 05, 2016 12:45 AM UTC

While exponential separations are known between quantum and randomized communication complexity for partial functions, e.g. Raz [1999], the best known separation between these measures for a total function is quadratic, witnessed by the disjointness function. We give the first super-quadratic separation between quantum and randomized communication complexity for a total function, giving an example exhibiting a power $2.5$ gap. We also present a nearly optimal quadratic separation between randomized communication complexity and the logarithm of the partition number, improving upon the previous best power $1.5$ separation due to Göös, Jayram, Pitassi, and Watson [2015]. Our results are the communication analogues of separations in query complexity proved using the recent cheat sheet framework of Aaronson, Ben-David, and Kothari [2016]. Our main technical results are randomized communication and information complexity lower bounds for a family of functions, called lookup functions, that generalize and port the cheat sheet framework to communication complexity.

at May 04, 2016 06:14 PM UTC


Here is the abstract of a recent paper by Andrew Suk. (I heard about it from a Facebook post by Yufei Zhao. I added a link to the original Erdős–Szekeres paper.)

Let ES(n) be the smallest integer such that any set of ES(n) points in the plane in general position contains n points in convex position. In their seminal 1935 paper, Erdős and Szekeres showed that

ES(n)\le {{2n-4}\choose{n-2}}+1=4^{n-o(n)}

In 1960, they showed that

ES(n) \ge 2^{n-2}+1,

and conjectured this to be optimal. Despite the efforts of many researchers, no improvement in the order of magnitude has ever been made on the upper bound over the last 81 years. In this paper, we nearly settle the Erdős-Szekeres conjecture by showing that

ES(n)=2^{n+o(n)}.
This is amazing! The proof uses a 2002 “positive-fraction” version of the Erdős-Szekeres theorem by Pór and Valtr.
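To get a feel for the gap being closed, here is a tiny numeric comparison (my own, using only the two classical bounds quoted above): the 1935 upper bound grows like $4^n$ while the 1960 lower bound, which matches the known exact values ES(4)=5, ES(5)=9, ES(6)=17, grows like $2^n$; Suk's result brings the upper bound down to $2^{n+o(n)}$.

```python
from math import comb

def es_upper_1935(n):
    """Erdős–Szekeres 1935 upper bound: C(2n-4, n-2) + 1, roughly 4^n."""
    return comb(2 * n - 4, n - 2) + 1

def es_lower_1960(n):
    """Erdős–Szekeres 1960 lower bound (conjectured exact): 2^(n-2) + 1."""
    return 2 ** (n - 2) + 1
```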

Among the many beautiful results extending, applying, or inspired by the Erdős–Szekeres theorem, let me mention an impressive recent body of work on the number of points in \mathbb R ^d which guarantee n points in cyclic position. A good place to read about it is the paper by Bárány, Matoušek and Pór, Curves in \mathbb R^d  intersecting every hyperplane at most d+1 times, where references to earlier papers by Conlon, Eliàš, Fox, Matoušek, Pach, Roldán-Pensado, Safernová, Sudakov, Suk, and others can be found.



by Gil Kalai at May 04, 2016 07:58 AM UTC

Authors: Jisu Kim, Alessandro Rinaldo, Larry Wasserman
Download: PDF
Abstract: Many algorithms in machine learning and computational geometry require, as input, the intrinsic dimension of the manifold that supports the probability distribution of the data. This parameter is rarely known and therefore has to be estimated. We characterize the statistical difficulty of this problem by deriving upper and lower bounds on the minimax rate for estimating the dimension. First, we consider the problem of testing the hypothesis that the support of the data-generating probability distribution is a well-behaved manifold of intrinsic dimension $d_1$ versus the alternative that it is of dimension $d_2$, with $d_{1}<d_{2}$. With an i.i.d. sample of size $n$, we provide an upper bound on the probability of choosing the wrong dimension of $O\left( n^{-\left(d_{2}/d_{1}-1-\epsilon\right)n} \right)$, where $\epsilon$ is an arbitrarily small positive number. The proof is based on bounding the length of the traveling salesman path through the data points. We also demonstrate a lower bound of $\Omega \left( n^{-(2d_{2}-2d_{1}+\epsilon)n} \right)$, by applying Le Cam's lemma with a specific set of $d_{1}$-dimensional probability distributions. We then extend these results to get minimax rates for estimating the dimension of well-behaved manifolds. We obtain an upper bound of order $O \left( n^{-(\frac{1}{m-1}-\epsilon)n} \right)$ and a lower bound of order $\Omega \left( n^{-(2+\epsilon)n} \right)$, where $m$ is the embedding dimension.

at May 04, 2016 12:45 AM UTC

Authors: Olivier Guye
Download: PDF
Abstract: The described works were carried out in the framework of a mid-term study initiated by the Centre Electronique de l'Armement, then of an advanced study launched by the Direction de la Recherche et des Etudes Technologiques in France, with the aim of developing new techniques for multidimensional hierarchical modeling and of porting them to parallel-architecture computers to satisfy future needs in processing huge numerical databases. Following the first tome, which describes the modeling principles, the second tome details how the modeling software was developed and ported to different computers, especially parallel-architecture ones. In addition, it covers new algorithms developed after those presented in the former tome, described in pseudo-code in the annex of the present document: operators for constructive geometry (building simple shapes, Boolean operators, slice handling); integral transformations (epigraph, hypograph, convex hull); homotopic transformations (boundary, erosion, dilation, opening, closing); median transformations (median filtering, thinning, median set, intrinsic dimension); and transformations of (hyper-)surface manifolds (median filtering, extension, polynomial fitting of a simple function). The present publication ends with the porting of the software to two distributed-memory parallel computers: a fine-grained synchronous computer and a coarse-grained asynchronous computer.

at May 04, 2016 12:45 AM UTC

Authors: Martin Romauch, Richard F. Hartl
Download: PDF
Abstract: This paper proposes a new model for cluster tools with two load locks. Cluster tools are widely used to automate single-wafer processing in the semiconductor industry. The load locks are the entry points into the vacuum of the cluster tool's mainframe, and usually two of them are available. Each lot being processed is dedicated to a single load lock, so at most two different lots (with possibly different processing times and qualifications) can be processed simultaneously. This restriction is one of the major potential bottlenecks.

Capacity planning is one of the possible applications for the proposed model and the paper demonstrates the integration into a more general framework that considers different tool types and different operational modes.

The paper also generalizes an earlier model that is limited to three processing chambers. The proposed modeling approach is based on makespan reductions by parallel processing. It turns out that the performance of the new approach is similar to that of the generalized model for three chambers, but the new approach outperforms the generalized model for four and more chambers.

at May 04, 2016 12:44 AM UTC

Authors: Vijay Bhattiprolu, Venkatesan Guruswami, Euiwoong Lee
Download: PDF
Abstract: Given a random $n$-variate degree-$d$ homogeneous polynomial $f$, we study the following problem: \[ \max_{\| x \| = 1} f(x). \] Besides being a fundamental problem in its own right, this random model has received attention recently due to its connection to tensor PCA. We study how the degree-$q$ Sum of Squares (SoS) hierarchy performs on this problem. Our results include:

- When $f$ is a degree-$q$ polynomial with independent random coefficients, we prove that there is a constant $c$ such that with high probability, degree-$q$ SoS satisfies \[ \left(\frac{n}{q^{c+o(1)}}\right)^{q/4} \leq \text{SoS-Value} \leq \left(\frac{n}{q^{c-o(1)}}\right)^{q/4}. \] Our upper bound improves a result of Montanari and Richard~\cite{MR14} when $q$ is large. (It is known that $\max_{\|x\|=1} f(x) \lesssim \sqrt{n\,q\,\log q}$ w.h.p.)

- When $f$ is a random degree-$3$ or $4$ polynomial, we prove that with high probability, the degree-$q$ SoS-Value is at most $\tilde{O}(\frac{n^{3/4}}{q^{1/4}})$, $\tilde{O}(\frac{n}{\sqrt{q}})$ respectively. This improves on a result of Hopkins, Shi, and Steurer~\cite{HSS15} when $q$ is growing.

- Lower bounds on the degree-$q$ SoS-Value of the $2$-norm of the $q$-parity-tensor of a random graph.

In providing these results, we develop tools that we believe to be useful in analysing the SoS hierarchy on other optimization problems over the sphere.

at May 04, 2016 12:40 AM UTC

Authors: Aristide Grange, Imed Kacem, Sébastien Martin
Download: PDF
Abstract: We introduce the strongly NP-complete PAGINATION problem, an extension of BIN PACKING in which packing two items together may make them occupy less volume than the sum of their individual sizes. To achieve this property, an item is defined as a finite set of symbols from a given alphabet: while, in BIN PACKING, any two such sets would be disjoint, in PAGINATION they can share zero, one or more symbols. After formulating the problem as an integer linear program, we try to approximate its solutions with several families of algorithms: from straightforward adaptations of classical BIN PACKING heuristics, to dedicated algorithms (greedy and non-greedy), to standard and grouping genetic algorithms. All of them are studied first theoretically, then experimentally on an extensive random test set. Based upon these data, we propose a predictive measure of the difficulty of a given instance, and finally recommend which algorithm should be used in which case, depending on either time constraints or quality requirements.
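To illustrate how shared symbols change the packing rules, here is my own sketch of a first-fit heuristic adapted to this setting (the paper studies adaptations of classical BIN PACKING heuristics; this particular adaptation is mine, for illustration): a page's load is the size of the *union* of its items, so an item containing symbols the page already holds costs less than its own size.

```python
def first_fit_pagination(items, capacity):
    """First-fit adapted to set-valued items: an item fits on a page iff the
    union of the page's symbols and the item's symbols stays within capacity."""
    pages = []  # each page is the set of symbols it currently holds
    for item in items:
        for page in pages:
            if len(page | item) <= capacity:
                page |= item  # shared symbols are stored only once
                break
        else:
            pages.append(set(item))
    return pages

items = [{"a", "b"}, {"b", "c"}, {"d", "e", "f"}, {"a", "c"}]
pages = first_fit_pagination(items, capacity=4)
```

Note that the four items have total size 9, yet they fit on two pages of capacity 4 because of symbol sharing, something impossible in ordinary BIN PACKING.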

at May 04, 2016 12:44 AM UTC

I’ve supervised a lot of great student projects in my nine years at MIT, but my inner nerdy teenager has never been as personally delighted by a project as it is right now.  Today, I’m proud to announce that Adam Yedidia, a PhD student at MIT (but an MEng student when he did most of this work), has explicitly constructed a one-tape, two-symbol Turing machine with 7,918 states, whose behavior (when run on a blank tape) can never be proven from the usual axioms of set theory, under reasonable consistency hypotheses.  Adam has also constructed a 4,888-state Turing machine that halts iff there’s a counterexample to Goldbach’s Conjecture, and a 5,372-state machine that halts iff there’s a counterexample to the Riemann Hypothesis.  In all three cases, this is the first time we’ve had a reasonable explicit upper bound on how many states you need in a Turing machine before you can see the behavior in question.

Here’s our research paper, on which Adam generously included me as a coauthor, even though he did the heavy lifting.  Also, here’s a github repository where you can download all the code Adam used to generate these Turing machines, and even use it to build your own small Turing machines that encode interesting mathematical statements.  Finally, here’s a YouTube video where Adam walks you through how to use his tools.

A more precise statement of our main result is this: we give a 7,918-state Turing machine, called Z (and actually explicitly listed in our paper!), such that:

  1. Z runs forever, assuming the consistency of a large-cardinal theory called SRP (Stationary Ramsey Property), but
  2. Z can’t be proved to run forever in ZFC (Zermelo-Fraenkel set theory with the Axiom of Choice, the usual foundation for mathematics), assuming that ZFC is consistent.

A bit of background: it follows, as an immediate consequence of Gödel’s Incompleteness Theorem, that there’s some computer program, of some length, that eludes the power of ordinary mathematics to prove what it does, when it’s run with an unlimited amount of memory.  So for example, such a program could simply enumerate all the possible consequences of the ZFC axioms, one after another, and halt if it ever found a contradiction (e.g., a proof of 1+1=3).  Assuming ZFC is consistent, this program must run forever.  But again assuming ZFC is consistent, ZFC can’t prove that the program runs forever, since if it did, then it would prove its own consistency, thereby violating the Second Incompleteness Theorem!

Alas, this argument still leaves us in the dark about where, in the space of computer programs, the “Gödelian gremlin” rears its undecidable head.  A program that searches for an inconsistency in ZFC is a fairly complicated animal: it needs to encode not only the ZFC axiom schema, but also the language and inference rules of first-order logic.  Such a program might be thousands of lines long if written in a standard programming language like C, or millions of instructions if compiled down to bare-bones machine code.  You’d certainly never run across such a program by chance—not even if you had a computer the size of the observable universe, trying one random program after another for billions of years in a “primordial soup”!

So the question stands—a question that strikes me as obviously important, even though as far as I know, only one or two people ever asked the question before us; see here for example.  Namely: do the axioms of set theory suffice to analyze the behavior of every computer program that’s at most, let’s say, 50 machine instructions long?  Or are there super-short programs that already exhibit “Gödelian behavior”?

Theoretical computer scientists might object that this is “merely a question of constants.”  Well yes, OK, but the origin of life in our universe—a not entirely unrelated puzzle—is also “merely a question of constants”!  In more detail, we know that it’s possible with our laws of physics to build a self-replicating machine: say, DNA or RNA and their associated paraphernalia.  We also know that tiny molecules like H2O and CO2 are not self-replicating.  But we don’t know how small the smallest self-replicating molecule can be—and that’s an issue that influences whether we should expect to find ourselves alone in the universe or find it teeming with life.

Some people might also object that what we’re asking about has already been studied, in the half-century quest to design the smallest universal Turing machine (the subject of Stephen Wolfram’s $25,000 prize in 2007, to which I responded with my own $25.00 prize).  But I see that as fundamentally different, for the following reason.  A universal Turing machine—that is, a machine that simulates any other machine that’s described to it on its input tape—has the privilege of offloading almost all of its complexity onto the description format for the input machine.  So indeed, that’s exactly what all known tiny universal machines do!  But a program that checks (say) Goldbach’s Conjecture, or the Riemann Hypothesis, or the consistency of set theory, on an initially blank tape, has no such liberty.  For such machines, the number of states really does seem like an intrinsic measure of complexity, because the complexity can’t be shoehorned anywhere else.

One can also phrase what we’re asking in terms of the infamous Busy Beaver function.  Recall that BB(n), or the nth Busy Beaver number, is defined to be the maximum number of steps that any n-state Turing machine takes when run on an initially blank tape, assuming that the machine eventually halts. The Busy Beaver function was the centerpiece of my 1998 essay Who Can Name the Bigger Number?, which might still attract more readers than anything else I’ve written since. As I stressed there, if you’re in a biggest-number-naming contest, and you write “BB(10000),” you’ll destroy any opponent—however otherwise mathematically literate they are—who’s innocent of computability theory.  For BB(n) grows faster than any computable sequence of integers: indeed, if it didn’t, then one could use that fact to solve the halting problem, contradicting Turing’s theorem.

But the BB function has a second amazing property: namely, it’s a perfectly well-defined integer function, and yet once you fix the axioms of mathematics, only finitely many values of the function can ever be proved, even in principle.  To see why, consider again a Turing machine M that halts if and only if there’s a contradiction in ZF set theory.  Clearly such a machine could be built, with some finite number of states k.  But then ZF set theory can’t possibly determine the value of BB(k) (or BB(k+1), BB(k+2), etc.), unless ZF is inconsistent!  For to do so, ZF would need to prove that M ran forever, and therefore prove its own consistency, and therefore be inconsistent by Gödel’s Theorem.

OK, but we can now ask a quantitative question: how many values of the BB function is it possible for us to know?  Where exactly is the precipice at which this function “departs the realm of mortals and enters the realm of God”: is it closer to n=10 or to n=10,000,000?  In practice, four values of BB have been determined so far:

  • BB(1)=1
  • BB(2)=6
  • BB(3)=21 (Lin and Rado 1965)
  • BB(4)=107 (Brady 1983)
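
For such tiny n, these values can be checked by exhaustively simulating every machine. A hedged Python sketch (one-tape, two-symbol machines in the usual Busy Beaver convention, where the transition into the halt state also writes, moves, and counts as a step; the step cap is safe here only because the true values are already known to be far below it):

```python
from itertools import product

def run(machine, cap):
    """Simulate a machine on a blank tape; return its halting step count,
    or None if it exceeds cap steps (treated as non-halting)."""
    tape = {}
    state, pos, steps = 0, 0, 0
    while steps < cap:
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        steps += 1
        if nxt == -1:          # transition into the halt state
            return steps
        state = nxt
    return None

def busy_beaver(n, cap=100):
    """Max steps over all halting n-state, 2-symbol machines, by brute force.
    Each table entry picks a symbol to write, a direction, and a successor
    state (-1 = halt)."""
    entry_choices = list(product([0, 1], [-1, 1], list(range(n)) + [-1]))
    keys = [(s, b) for s in range(n) for b in (0, 1)]
    best = 0
    for combo in product(entry_choices, repeat=len(keys)):
        steps = run(dict(zip(keys, combo)), cap)
        if steps is not None and steps > best:
            best = steps
    return best
```

`busy_beaver(1)` returns 1 and `busy_beaver(2)` returns 6, matching the table above; `busy_beaver(3)` would already mean simulating 16^6, or roughly 17 million, machines, which is why the known values stop so early.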

We also know some lower bounds: BB(5) ≥ 47,176,870 (Marxen and Buntrock) and BB(6) > 7.4 × 10^36,534 (Kropitz).

See Heiner Marxen’s page or the Googology Wiki (which somehow I only learned about today) for more information.

Some Busy Beaver enthusiasts have opined that even BB(6) will never be known exactly.  On the other hand, the abstract argument from before tells us only that, if we confine ourselves to (say) ZF set theory, then there’s some k—possibly in the tens of millions or higher—such that the values of BB(k), BB(k+1), BB(k+2), and so on can never be proven.  So again: is the number of knowable values of the BB function more like 10, or more like a million?

This is the question that Adam and I (but mostly Adam) have finally addressed.

It’s hopeless to design a Turing machine by hand for all but the simplest tasks, so as a first step, Adam created a new programming language, called Laconic, specifically for writing programs that compile down to small Turing machines.  Laconic programs actually compile to an intermediary language called TMD (Turing Machine Descriptor), and from there to Turing machines.

Even then, we estimate that a direct attempt to write a Laconic program that searched for a contradiction in ZFC would lead to a Turing machine with millions of states.  There were three ideas needed to get the state count down to something reasonable.

The first was to take advantage of the work of Harvey Friedman, who’s one of the one or two people I mentioned earlier who’s written about these problems before.  In particular, Friedman has been laboring since the 1960s to find “natural” arithmetical statements that are provably independent of ZFC or other strong set theories.  (See this AMS Notices piece by Martin Davis for a discussion of Friedman’s progress as of 2006.)  Not only does Friedman’s quest continue, but some of his most important progress has come only within the last year.  His statements—typically involving objects called “order-invariant graphs”—strike me as alien, and as far removed from anything I’d personally have independent reasons to think about (but is that just a sign of my limited perspective?).  Be that as it may, Friedman’s statements still seem a lot easier to encode as short computer programs than the full apparatus of first-order logic and set theory!  So that’s what we started with; our work wouldn’t have been possible without Friedman (who we consulted by email throughout the project).

The second idea was something we called “on-tape processing.”  Basically, instead of compiling directly from Laconic down to a Turing machine, Adam wrote an interpreter directly in Turing-machine states (which took about 4000 states—a single, fixed cost), and then had the final Turing machine first write a higher-level program onto its tape and then interpret that program.  Instead of the compilation process producing a huge multiplicative overhead in the number of Turing machine states (and a repetitive machine), this approach gives us only an additive overhead.  We found that this one idea decreased the number of states by roughly an order of magnitude.

The third idea was first suggested in 2002 by Ben-Amram and Petersen (and refined for us by Luke Schaeffer); we call it “introspective encoding.”  When we write the program to be interpreted onto the Turing machine tape, the naïve approach would use one Turing machine state per bit.  But that’s clearly wasteful, since in an n-state Turing machine, every state contains ~log(n) bits of information (because of the other states it needs to point to).  A better approach tries to exploit as many of those bits as it can; doing that gave us up to a factor-of-5 additional savings in the number of states.
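
The ~log(n) figure comes from simple counting: an n-state, two-symbol machine has 2n transition-table entries, each choosing one of 2 symbols to write, 2 directions, and n+1 successor states (counting halt), so the whole table carries about 2n·log2(4(n+1)) bits. A quick sketch of that arithmetic (my own back-of-the-envelope, not a figure from the paper):

```python
import math

def bits_per_state(n):
    """Description length per state of an n-state, 2-symbol Turing machine:
    2n table entries, each with 2 * 2 * (n + 1) possible values."""
    total_bits = 2 * n * math.log2(4 * (n + 1))
    return total_bits / n

# Each state carries ~2*log2(n) bits, so a naive one-state-per-bit tape
# encoding wastes almost all of that capacity -- the slack that
# introspective encoding tries to claw back.
```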

For Goldbach’s Conjecture and the Riemann Hypothesis, we paid the same 4000-state overhead for the interpreter, but then the program to be interpreted was simpler, giving a smaller overall machine.  Incidentally, it’s not intuitively obvious that the Riemann Hypothesis is equivalent to the statement that some particular computer program runs forever, but it is—that follows, for example, from work by Lagarias (which we used).
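
For the record, the Lagarias criterion reduces RH to an elementary statement: RH holds iff σ(n) ≤ H_n + e^{H_n}·ln(H_n) for every n ≥ 1, where σ(n) is the sum of the divisors of n and H_n is the n-th harmonic number. A hedged Python sketch of the search the Riemann machine encodes (mine, not the actual compiled program):

```python
import math

def sigma(n):
    """Sum of the divisors of n (naive)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def riemann_counterexample(limit):
    """Return the first n <= limit violating Lagarias's inequality
    sigma(n) <= H_n + exp(H_n) * ln(H_n), or None if none is found.
    Run with no limit, this halts iff the Riemann Hypothesis is false."""
    h = 0.0
    for n in range(1, limit + 1):
        h += 1.0 / n                       # H_n, the n-th harmonic number
        bound = h + math.exp(h) * math.log(h)
        if sigma(n) > bound + 1e-9:        # small tolerance for float error
            return n
    return None
```

The inequality is surprisingly tight for highly composite n (σ(12) = 28 against a bound of about 28.3), which is part of what makes it a genuine restatement of RH rather than a loose corollary.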

To preempt the inevitable question in the comments section: yes, we did run these Turing machines for a while, and no, none of them had halted after a day or so.  But before you interpret that as evidence in favor of Goldbach, Riemann, and the consistency of ZFC, you should probably know that a Turing machine to test whether all perfect squares are less than 5, produced using Laconic, needed to run for more than an hour before it found the first counterexample (namely, 3^2=9) and halted.  Laconic Turing machines are optimized only for the number of states, not for speed, to put it mildly.

Of course, three orders of magnitude still remain between the largest value of n (namely, 4) for which BB(n) is known to be knowable in ZFC-based mathematics, and the smallest value of n (namely, 7,918) for which BB(n) is known to be unknowable.  I’m optimistic that further improvements are possible to the machine Z—whether that means simplifications to Friedman’s statement, a redesigned interpreter (possibly using lambda calculus?), or a “multi-stage rocket model” where a bare-bones interpreter would be used to unpack a second, richer interpreter which would be used to unpack a third, etc., until you got to the actual program you cared about.  But I’d be shocked if anyone in my lifetime determined the value of BB(10), for example, or proved the value independent of set theory.  Even after the Singularity happens, I imagine that our robot overlords would find the determination of BB(10) quite a challenge.

In an early Shtetl-Optimized post, I described theoretical computer science as “quantitative epistemology.”  Constructing small Turing machines whose behavior eludes set theory is not conventional theoretical computer science by any stretch of the imagination: it’s closer in practice to programming languages or computer architecture, or even the recreational practice known as code-golfing.  On the other hand, I’ve never been involved with any other project that was so clearly, explicitly about pinning down the quantitative boundary between the knowable and the unknowable.

Comments on our paper are welcome.

Addendum: Some people might wonder “why Turing machines,” as opposed to a more reasonable programming language like C or Python.  Well, first of all, we needed a language that could address an unlimited amount of memory.  Also, the BB function is traditionally defined in terms of Turing machines.  But the most important issue is that we wanted there to be no suspicion whatsoever that our choice of programming language was artificially helping to make our machine small.  And hopefully everyone can agree that one-tape, two-symbol Turing machines aren’t designed for anyone’s convenience!

by Scott at May 03, 2016 10:04 PM UTC

Andrew Suk just placed on arXiv a paper with an almost tight bound for the happy ending problem. This is exciting news for discrete geometers! I’ll use the opportunity to describe the problem and its non-technical history. Andrew Suk and the Anonymous statue. In the early 1930’s in Budapest, several students used to regularly meet […]

by Adam Sheffer at May 03, 2016 04:13 AM UTC

[I sneeze several times and then the following conversation happens]

J.Z.: In China, we say that if you sneeze once, it means that someone is thinking of you. If you sneeze twice, it means someone is cursing you.

Me: and what does it mean when I sneeze three times or more?

J.Z.: it means you have a cold.

by luca at May 03, 2016 02:51 AM UTC

(Posting this here as a backup, since the main TCC website is temporarily down. The submission server is up and running though.) The Fourteenth Theory of Cryptography Conference will be held in Beijing, China, sponsored by the International Association for Cryptologic Research (IACR). Papers presenting original research on foundational and theoretical aspects of cryptography are […]

by adamdsmith at May 02, 2016 02:02 PM UTC

We are delighted to announce that the Handbook of Computational Social Choice has now been published with Cambridge University Press.

Description: The rapidly growing field of computational social choice, at the intersection of computer science and economics, deals with the computational aspects of collective decision making. This handbook, written by thirty-six prominent members of the computational social choice community, covers the field comprehensively. Chapters devoted to each of the field’s major themes offer detailed introductions. Topics include voting theory (such as the computational complexity of winner determination and manipulation in elections), fair allocation (such as algorithms for dividing divisible and indivisible goods), coalition formation (such as matching and hedonic games), and many more. Graduate students, researchers, and professionals in computer science, economics, mathematics, political science, and philosophy will benefit from this accessible and self-contained book.

A PDF of the book is freely available on the Cambridge University Press website. Click on the Resources tab, then on “Resources” under “General Resources”, and you will find a link called “Online Version”. The password is cam1CSC.

Alternatively, the book can be purchased through Cambridge University Press, Amazon, and other retailers.

We hope that the book will become a valuable resource for the computational social choice community, and the CS-econ community at large.

Best regards,
Felix Brandt, Vince Conitzer, Ulle Endriss, Jerome Lang, and Ariel Procaccia (the editors)

by felixbrandt at May 02, 2016 01:39 PM UTC

Authors: Cornelius Brand, Michael Sagraloff
Download: PDF
Abstract: Given a zero-dimensional polynomial system consisting of n integer polynomials in n variables, we propose a certified and complete method to compute all complex solutions of the system as well as a corresponding separating linear form l with coefficients of small bit size. For computing l, we need to project the solutions into one dimension along O(n) distinct directions but no further algebraic manipulations. The solutions are then directly reconstructed from the considered projections. The first step is deterministic, whereas the second step uses randomization, thus being Las-Vegas.

The theoretical analysis of our approach shows that the overall cost for the two problems considered above is dominated by the cost of carrying out the projections. We also give bounds on the bit complexity of our algorithms that are exclusively stated in terms of the number of variables, the total degree and the bitsize of the input polynomials.

at May 02, 2016 12:41 AM UTC

Authors: Cynthia Kop, Jakob Grue Simonsen
Download: PDF
Abstract: Constructor rewriting systems are said to be cons-free if, roughly, constructor terms in the right-hand sides of rules are subterms of constructor terms in the left-hand side; the computational intuition is that rules cannot build new data structures. It is well-known that cons-free programming languages can be used to characterize computational complexity classes, and that cons-free first-order term rewriting can be used to characterize the set of polynomial-time decidable sets.

We investigate cons-free higher-order term rewriting systems, the complexity classes they characterize, and how these depend on the order of the types used in the systems. We prove that, for every k $\geq$ 1, left-linear cons-free systems with type order k characterize E$^k$TIME if arbitrary evaluation is used (i.e., the system does not have a fixed reduction strategy).

The main difference with prior work in implicit complexity is that (i) our results hold for non-orthogonal term rewriting systems with possible rule overlaps with no assumptions about reduction strategy, (ii) results for such term rewriting systems have previously only been obtained for k = 1, and with additional syntactic restrictions on top of cons-freeness and left-linearity.

Our results are apparently among the first implicit characterizations of the hierarchy E = E$^1$TIME $\subseteq$ E$^2$TIME $\subseteq$ .... Our work confirms prior results that having full non-determinism (via overlaps of rules) does not directly allow characterization of non-deterministic complexity classes like NE. We also show that non-determinism makes the classes characterized highly sensitive to minor syntactic changes such as admitting product types or non-left-linear rules.

at May 02, 2016 12:40 AM UTC

Authors: Gilles Didier, Laurent Tichit
Download: PDF
Abstract: Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed to search $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed which can be achieved among a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with several pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts, and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.

at May 02, 2016 12:00 AM UTC

Authors: Emilio Di Giacomo, Walter Didimo, William S. Evans, Giuseppe Liotta, Henk Meijer, Fabrizio Montecchiani, Stephen K. Wismath
Download: PDF
Abstract: An ortho-polygon visibility representation of an $n$-vertex embedded graph $G$ (OPVR of $G$) is an embedding preserving drawing of $G$ that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of $G$ is the minimum $k$ such that every polygon has at most $k$ reflex corners. We present polynomial time algorithms that test whether $G$ has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of $G$ are related to its number of crossings per edge and to its connectivity. Namely, we prove that if $G$ has at most one crossing per edge (i.e. $G$ is a $1$-plane graph) an OPVR of $G$ always exists while this may not be the case if two crossings per edge are allowed. Also, if $G$ is a $3$-connected $1$-plane graph, we can compute in $O(n)$ time an OPVR of $G$ whose vertex complexity is bounded by a constant. However, if $G$ is a $2$-connected $1$-plane graph, the vertex complexity of any OPVR of $G$ may be $\Omega(n)$. In contrast, we describe a family of $2$-connected $1$-plane graphs for which, in $O(n)$ time, an embedding that guarantees constant vertex complexity can be computed. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of $1$-plane graphs.

at May 02, 2016 12:00 AM UTC

Authors: Daniel Lokshtanov, M. S. Ramanujan, Saket Saurabh
Download: PDF
Abstract: The optimization version of the Unique Label Cover problem is at the heart of the Unique Games Conjecture which has played an important role in the proof of several tight inapproximability results. In recent years, this problem has been also studied extensively from the point of view of parameterized complexity. Cygan et al. [FOCS 2012] proved that this problem is fixed-parameter tractable (FPT) and Wahlstr\"om [SODA 2014] gave an FPT algorithm with an improved parameter dependence. Subsequently, Iwata, Wahlstr\"om and Yoshida [2014] proved that the edge version of Unique Label Cover can be solved in linear FPT-time. That is, there is an FPT algorithm whose dependence on the input-size is linear. However, such an algorithm for the node version of the problem was left as an open problem. In this paper, we resolve this question by presenting the first linear-time FPT algorithm for Node Unique Label Cover.

at May 02, 2016 12:00 AM UTC

Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos
Download: PDF
Abstract: The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word $w$ of length $k>2$ is a $\rho$-avoided word in $x$ if $std(w) \leq \rho$, for a given threshold $\rho < 0$. Notice that such a word may be completely absent from $x$. Hence computing all such words na\"{\i}vely can be a very time-consuming procedure, in particular for large $k$. In this article, we propose an $O(n)$-time and $O(n)$-space algorithm to compute all $\rho$-avoided words of length $k$ in a given sequence $x$ of length $n$ over a fixed-sized alphabet. We also present a time-optimal $O(\sigma n)$-time and $O(\sigma n)$-space algorithm to compute all $\rho$-avoided words (of any length) in a sequence of length $n$ over an alphabet of size $\sigma$. Furthermore, we provide a tight asymptotic upper bound for the number of $\rho$-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.

at May 02, 2016 12:00 AM UTC

Authors: Michael Hamann, Manuel Penschuck
Download: PDF
Abstract: The LFR benchmark is a popular benchmark graph model used to evaluate community detection algorithms. We present the first external memory algorithm that is able to generate massive complex networks following the LFR model. Its most expensive component is the generation of random graphs with prescribed degree sequences, which can be divided into two steps: the graphs are first materialized as deterministic graphs using the Havel-Hakimi algorithm and then randomized. Our main contributions are HP-HH and ES-TFP, two I/O-efficient external memory algorithms for these two steps. In an experimental evaluation we demonstrate their performance: our implementation is able to generate graphs with more than 10 billion edges on a single machine, is competitive with a parallel massively distributed algorithm, and on smaller instances is faster than a state-of-the-art internal memory implementation.
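
The Havel-Hakimi step the abstract mentions is the classical greedy materialization of a prescribed degree sequence. The paper's contribution is doing this I/O-efficiently on massive instances; the underlying in-memory logic looks roughly like this (a toy Python sketch, ignoring all external-memory concerns):

```python
def havel_hakimi(degrees):
    """Materialize a graph with the given degree sequence using the
    classical Havel-Hakimi greedy rule; return an edge list, or None
    if the sequence is not graphical."""
    rem = [(d, v) for v, d in enumerate(degrees)]
    edges = []
    while True:
        rem.sort(reverse=True)            # highest residual degree first
        d, v = rem[0]
        if d == 0:
            return edges                  # every demand satisfied
        if d > len(rem) - 1:
            return None                   # not graphical
        rem[0] = (0, v)
        for i in range(1, d + 1):         # connect v to the d next-highest
            di, u = rem[i]
            if di == 0:
                return None               # not graphical
            rem[i] = (di - 1, u)
            edges.append((v, u))
```

For example, `havel_hakimi([2, 2, 2])` produces the three edges of a triangle, while `havel_hakimi([3, 3, 1, 1])` correctly reports that no simple graph has that degree sequence.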

at May 02, 2016 12:00 AM UTC

Authors: P. Wiederhold, H. Reyes
Download: PDF
Abstract: A new algorithm is proposed for the determination of the relative convex hull in the plane of a simple polygon A with respect to another simple polygon B which contains A. The relative convex hull is also known as the geodesic convex hull, and the problem of its determination in the plane is equivalent to finding the shortest curve among all Jordan curves lying in the difference set of B and A and encircling A. Algorithms solving this problem known from Computational Geometry are based on the triangulation or a similar decomposition of that difference set. The algorithm presented here does not use such a decomposition, but it supposes that A and B are given as ordered sequences of vertices. The algorithm is based on convex hull calculations of A and B and of smaller polygons and polylines, and it produces the output list of vertices of the relative convex hull from the sequence of vertices of the convex hull of A.

at May 02, 2016 12:00 AM UTC

Authors: Yi Li, David P. Woodruff
Download: PDF
Abstract: For any real number $p > 0$, we nearly completely characterize the space complexity of estimating $\|A\|_p^p = \sum_{i=1}^n \sigma_i^p$ for $n \times n$ matrices $A$ in which each row and each column has $O(1)$ non-zero entries and whose entries are presented one at a time in a data stream model. Here the $\sigma_i$ are the singular values of $A$, and when $p \geq 1$, $\|A\|_p^p$ is the $p$-th power of the Schatten $p$-norm. We show that when $p$ is not an even integer, to obtain a $(1+\epsilon)$-approximation to $\|A\|_p^p$ with constant probability, any $1$-pass algorithm requires $n^{1-g(\epsilon)}$ bits of space, where $g(\epsilon) \rightarrow 0$ as $\epsilon \rightarrow 0$ and $\epsilon > 0$ is a constant independent of $n$. However, when $p$ is an even integer, we give an upper bound of $n^{1-2/p} \textrm{poly}(\epsilon^{-1}\log n)$ bits of space, which holds even in the turnstile data stream model. The latter is optimal up to $\textrm{poly}(\epsilon^{-1} \log n)$ factors.

Our results considerably strengthen lower bounds in previous work for arbitrary (not necessarily sparse) matrices $A$: the previous best lower bound was $\Omega(\log n)$ for $p\in (0,1)$, $\Omega(n^{1/p-1/2}/\log n)$ for $p\in [1,2)$ and $\Omega(n^{1-2/p})$ for $p\in (2,\infty)$. We note for $p \in (2, \infty)$, while our lower bound for even integers is the same, for other $p$ in this range our lower bound is $n^{1-g(\epsilon)}$, which is considerably stronger than the previous $n^{1-2/p}$ for small enough constant $\epsilon > 0$. We obtain similar near-linear lower bounds for Ky-Fan norms, SVD entropy, eigenvalue shrinkers, and M-estimators, many of which could have been solvable in logarithmic space prior to our work.

at May 02, 2016 12:00 AM UTC

Authors: Andrew Suk
Download: PDF
Abstract: Let $ES(n)$ be the smallest integer such that any set of $ES(n)$ points in the plane in general position contains $n$ points in convex position. In their seminal 1935 paper, Erdos and Szekeres showed that $ES(n) \leq {2n - 4\choose n-2} + 1 = 4^{n -o(n)}$. In 1960, they showed that $ES(n) \geq 2^{n-2} + 1$ and conjectured this to be optimal. Despite the efforts of many researchers, no improvement in the order of magnitude has ever been made on the upper bound over the last 81 years. In this paper, we nearly settle the Erdos-Szekeres conjecture by showing that $ES(n) =2^{n +o(n)}$.
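
To appreciate what "nearly settle" means here, it helps to compare the two classical bounds numerically; a quick sketch (`math.comb` requires Python 3.8+):

```python
from math import comb

def es_upper(n):
    """Erdos-Szekeres 1935 upper bound: C(2n-4, n-2) + 1, about 4^n."""
    return comb(2 * n - 4, n - 2) + 1

def es_lower(n):
    """Erdos-Szekeres 1960 lower bound (conjectured exact): 2^(n-2) + 1."""
    return 2 ** (n - 2) + 1

# Already at n = 6 the bounds are 71 vs 17 (and ES(6) = 17 is known);
# the 4^n-vs-2^n gap is what stood for 81 years until Suk's 2^(n+o(n)).
```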

at May 02, 2016 12:00 AM UTC

Authors: Priyam Das
Download: PDF
Abstract: In this paper, we develop a novel derivative-free deterministic greedy algorithm for global optimization of any objective function of parameters belonging to a unit-simplex. The main principle of the proposed algorithm is making jumps of varying step-sizes within the simplex parameter space and searching for the best direction to move in a greedy manner. Unlike most other existing methods of constrained optimization, here the objective function is evaluated at independent directions within an iteration. Thus incorporation of parallel computing makes it even faster. The requirement for parallelization grows only in the order of the dimension of the parameter space, which makes it more convenient for solving high-dimensional optimization problems in simplex parameter space using parallel computing. A comparative study of the performance of this algorithm and other existing algorithms has been shown for some moderate and high-dimensional optimization problems, along with some transformed benchmark test-functions on the simplex. An improvement of around 20- to 300-fold in computation time has been achieved using the proposed algorithm over the Genetic algorithm, with more accurate solutions.

at May 02, 2016 12:00 AM UTC

Authors: Laurent Bulteau, Guillaume Fertin, Anthony Labarre, Romeo Rizzi, Irena Rusu
Download: PDF
Abstract: Let $S=\{K_{1,3},K_3,P_4\}$ be the set of connected graphs of size 3. We study the problem of partitioning the edge set of a graph $G$ into graphs taken from any non-empty $S'\subseteq S$. The problem is known to be NP-complete for any possible choice of $S'$ in general graphs. In this paper, we assume that the input graph is cubic, and study the computational complexity of the problem of partitioning its edge set for any choice of $S'$. We identify all polynomial and NP-complete problems in that setting, and give graph-theoretic characterisations of $S'$-decomposable cubic graphs in some cases.

at May 02, 2016 12:00 AM UTC

I posted about the Gathering for Gardner conference and about some of the talks I saw here. Today I continue with a few more talks.

Playing Penney's game with Roulette by Robert Vallen. Penney's game is the following: let k be fixed. Alice and Bob pick different elements of {H,T}^k. They flip a coin until one of their sequences shows up, and that person wins. Which sequences have the best probability of winning?
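
Part of what makes Penney's game fun is its non-transitivity: whatever sequence Alice picks, Bob has a reply that beats it. A quick Monte Carlo sketch of the classic k=3 example, where THH beats HHT about three games in four (my own illustration, not from the talk):

```python
import random

def first_to_appear(seq_a, seq_b, rng):
    """Flip a fair coin until seq_a or seq_b first appears; return the winner."""
    k = max(len(seq_a), len(seq_b))
    history = ""
    while True:
        history = (history + rng.choice("HT"))[-k:]  # keep only the last k flips
        if history.endswith(seq_a):
            return seq_a
        if history.endswith(seq_b):
            return seq_b

def win_probability(seq_a, seq_b, trials=20000, seed=0):
    """Estimate the probability that seq_a appears before seq_b."""
    rng = random.Random(seed)
    wins = sum(first_to_appear(seq_a, seq_b, rng) == seq_a
               for _ in range(trials))
    return wins / trials
```

`win_probability("THH", "HHT")` comes out near 0.75: once any tail has been flipped, the first subsequent HH is necessarily preceded by a T, so HHT can only win if the game opens with two heads.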

New Polyhedral dice by Robert Fathauer, Henry Segerman, Robert Bosch. This is a good example of how my mentality (and possibly yours) differs from others. When I hear ``60-sided dice'' I think ``p1,...,p60 which are all between 0 and 1 and add up to 1.'' I also thought that only the platonic solids could be used to form fair dice (so only 4-sided, 6-sided, 8-sided, 12-sided, and 20-sided dice can be made). NOT so. These authors actually MAKE real dice and they do not have to be platonic solids. Here is their website.

Numerically balanced dice by Robert Bosch (paper is here). Why do dice have opposite sides that sum to the same thing? Read the paper to find out!

Secret messages in juggling and card shuffling by Erik Demaine. Erik Demaine was one of about 4 theoretical computer scientists I met at the conference, though Erik is so well rounded that calling him a theoretical computer scientist doesn't seem quite right. I had never met him before which surprised me. In this talk he showed us some new fonts- one using juggling. See here for an example of juggling fonts, co-authored with his father Martin.

Fibonacci Lemonade by Andrea Johanna Hawksley. Put in the lemon and sugar in Fibonacci number increments. Here is their website. In my first post I said the talks were on a variety of topics and then presented mostly math talks. This talk is an example of that variety. There were other talks involving the Fibonacci numbers. I was surprised by this since they aren't that special (see here).

Penis Covers and Puzzles: Brain Injuries and Brain Health by Gini Wingard-Phillips. She recounted having various brain injuries and how working on mathematical puzzles, of the type Martin Gardner popularized, HELPED HER RECOVER! As for the title- people with brain injuries sometimes have a hard time finding the words for things, so they use other words. In this case she wanted her husband to buy some condoms but couldn't think of the word, so she said Penis Covers instead.

Loop- Pool on an Ellipse by Alex Bellos. Similar in my mind to the Polyhedral dice talk (you'll see why). We all know that if you built an elliptical pool table with a hole at one of the foci, then a ball placed at the other focus and hit hard enough WILL go into the hole. But Alex Bellos actually MAKES these pool tables (see here if you want to buy one for $20,000). He told us the history- someone else tried to make one in 1962 but nobody bought them (I wonder if anyone is going to buy his), and Alex had problems with friction, as you may recall that it only works on a frictionless surface. So his game does require some skill. The similarity to dice is that I (and you?) are used to thinking about dice and ellipses abstractly, not as objects people actually build.

This post is getting long so I'll stop here and report more in a later post. Why so many posts? Six-minute talks that I can actually understand and am delighted to tell you about!

by GASARCH ( at May 01, 2016 09:56 PM UTC