Theory of Computing Blog Aggregator

A recent work of Chen, Kabanets, Kolokolova, Shaltiel and Zuckerman (CCC 2014, Computational Complexity 2015) introduced the Compression problem for a class $\mathcal{C}$ of circuits, defined as follows. Given as input the truth table of a Boolean function $f:\{0,1\}^n \rightarrow \{0,1\}$ that has a small (say size $s$) circuit from $\mathcal{C}$, find in time $2^{O(n)}$ \emph{any} Boolean circuit for $f$ of size less than trivial, i.e. much smaller than $2^n/n$. The work of Chen et al. gave compression algorithms for many classes of circuits including $AC^0$ (the class of constant-depth unbounded fan-in circuits made up of AND, OR, and NOT gates) and Boolean formulas of size nearly $n^2$. They asked if similar results can be obtained for the circuit class $AC^0[\oplus]$, the class of circuits obtained by augmenting $AC^0$ with unbounded fan-in parity gates. We answer the question positively here, using techniques from work of Kopparty and the author (FSTTCS 2012).

at August 28, 2015 06:00 PM UTC

How to avoid the pain of estimating tough sums

Cricketing source

Andrew Granville is a number theorist, who has written—besides his own terrific research—some beautiful expository papers, especially on analytic number theory.

Today Ken and I wish to talk about his survey paper earlier this year on the size of gaps between consecutive primes.

The paper in question is here. It is a discussion of the brilliant breakthrough work of Yitang Zhang, which almost solves the famous twin-prime conjecture. You probably know that Zhang proved that gaps between consecutive primes are infinitely often bounded by an absolute constant. His constant initially was huge, but using his and additional insights it is now at most {600} and plans are known that might cut it to {6}.

As Andrew says in his paper’s introduction:

To Yitang Zhang, for showing that one can, no matter what

We believe we need the same attitude to make progress on some of our “untouchable” problems. Perhaps there is some budding complexity theorist, who is making ends meet at a Subway™ sandwich shop, and while solving packing problems between rolls in real time is finding the insights that can unlock the door to some of our open problems. Could one of these be ready to fall?

  • Is NP closed under complement?

  • What is the power of {\bmod \ 6} polynomials?

  • Can we at least prove SAT is not computable in linear time?
  • And so on …

Who knows.

Monotone Counting

Throughout mathematics—especially number theory—computing the number of objects is of great importance. Sometimes we can count the exact number of objects. For example, it is long known that there are exactly {n^{n-2}} labeled trees, thanks to Arthur Cayley.

The number of labeled planar graphs is another story—no exact formula is known. The current best estimate is an asymptotic one for the number of such graphs on {n} vertices:

\displaystyle  g\cdot n^{-7/2}\cdot \gamma^n\cdot n!,

where {\gamma\approx 27.22687} and {g\approx 0.43\times 10^{-5}}.

Thanks to a beautiful result of the late Larry Stockmeyer we know that estimating the number of objects from a large set may be hard, but is not too hard, in the sense of complexity theory. He showed that

Theorem 1 For any {\epsilon > 0} one can design an {n^{O(1)}}-time randomized algorithm that takes an NP-oracle {H}, a predicate {R(x,y)} decidable in {n^{O(1)}}-time (where {n = |x|}) and an input {x} and outputs a number {A_x} such that with probability at least {1 - 2^{-n}}, the true number of {y} such that {R(x,y)} holds is between {(1-\epsilon)A_x} and {(1+\epsilon)A_x.}

The predicate {R(x,y)} is treated as a black box, though it needs to be evaluated in polynomial time in order for the algorithm to run in polynomial time. The algorithm has a non-random version that runs within the third level of the polynomial hierarchy.

We won’t state the following formally but Larry’s method can be extended to compute a sum

\displaystyle  \sum_{k=1}^{N} a_{k}

to within a multiplicative error without an NP-oracle, provided the map {k \rightarrow a_{k}} is computable in polynomial time, each {a_{k} \ge 0}, and the summands are not sparse and sufficiently regular. Simply think of {N = 2^n}, identify numbers with strings so that {k} runs over {\{0,1\}^n}, and define {R(k,y)} to hold if {1 \leq y \leq a_k}.


Larry’s method fails when a sum has both negative and positive terms; that is, when there is potential cancellation. Consider a sum

\displaystyle  S = \sum_{k=1}^{N} a_{k}.

Even if the terms {a_{k}} are restricted to be {+1} and {-1}, his method does not work. Rewrite the sum as

\displaystyle  P=\sum_{a_{k}>0} a_{k} \text{ and } Q =\sum_{a_{k}<0}a_{k}.

Then knowing {P} and {Q} approximately does not yield a good approximation to the whole sum {S}. We could have {P = A + C} and {Q = B - C} where {C} is a large term that cancels so that the sum is {A+C +B -C = A+B}. So the cancellation could make the lower order sums {A} and {B} dominate.

This happens throughout mathematics, especially in number theory. It also happens in complexity theory, for example in the simulation of quantum computations. For every language {L} in BQP there are Stockmeyer-style predicates {R(x,y)} and {S(x,y)} such that {h = |y|} equals the number of Hadamard gates in poly-size quantum circuits recognizing {L} whose acceptance amplitude is given by

\displaystyle  \frac{|\{y: R(x,y)\}| - |\{y: S(x,y)\}|}{\sqrt{2^h}}.

Although the individual sums can be as high as {2^h} their difference “miraculously” stays within {\sqrt{2^h}}, and the former is never less than the latter—no absolute value bars are needed on the difference. See this or the last main chapter of our quantum algorithms book for details. Larry’s algorithm can get you a {(1 + \epsilon)} approximation of both terms, but precisely because the difference stays so small, it does not help you approximate the probability. This failing is also why the algorithm doesn’t by itself place BQP within the hierarchy.

When Only Positivity is Needed

What struck Ken and me first is that the basic technique used by Yitang Zhang does not need to estimate a sum, only to prove that it is positive. This in turn is needed only to conclude that some term is positive. Thus the initial task is lazier than doing an estimate, though estimates come in later.

The neat and basic idea—by Daniel Goldston, János Pintz, and Cem Yíldírím in a 2009 paper that was surveyed in 2007—uses indicator terms of the form

\displaystyle  t_k = \ell(k+b_1) + \ell(k+b_2) + \cdots \ell(k+b_r) - \ln(3N),

where {\ell(p)} is defined to be {\ln(p)} if {p} is prime and is {0} otherwise. Here {k} runs from {N} to {2N} and {0 < b_1 < b_2 < \cdots < b_r < N}. If the term {t_k} is positive then at least two of the elements {\ell(k+b_i)} and {\ell(k+b_j)} must be prime, which means that the gap between them is at most the fixed value {g = b_r - b_1}. Doing this for infinitely many {N} yields Zhang’s conclusion that infinitely many pairs of primes have gap at most {g}. Can it really be this simple?

The strategy is to minimize {g}, but it needs to provide a way to get a handle on the {\ell(\cdot)} function. This needs forming the sum

\displaystyle  S = \sum_{k=N}^{2N} w_k t_k,

where the {w_k} are freely choosable non-negative real weights. The analysis needs the {b_i} to be chosen so that for every prime {p \leq r} there exists {k < p} such that {p} does not divide any of {k+b_1,\dots,k+b_r}. This defines the ordered tuple {(b_1,\dots,b_r)} as admissible and enters the territory of the famous conjecture by Godfrey Hardy and John Littlewood that there exist infinitely many {k} such that {k+b_1,\dots,k+b_r} are all prime. This has not been proven for any admissible tuple—of course the tuple {(0,2)} gives the Twin Prime conjecture—but thanks to Zhang and successors we know it holds for {(0,b)} for some {b \leq 600}.

The analysis also needs to decouple {N} from the {b_i}—originally there was a dependence which only enabled Goldston, Pintz, and Yíldírím to prove cases where the gaps grow relatively slowly. Andrew’s survey goes far into details of how the sum {S = \sum w_k t_k} and its use of the {\ell(\cdot)} function must be expanded into further sums using arithmetic progressions and the Möbius function in order for techniques of analysis to attack it. The need for estimation heightens the cancellation problem, which he emphasizes throughout his article. The issue is, as he states, of central importance in number theory:

The two sums … are not easy to evaluate: The use of the Möbius function leads to many terms being positive, and many negative, so that there is a lot of cancellation. There are several techniques in analytic number theory that allow one to get accurate estimates for such sums, two more analytic and the other more combinatorial. We will discuss them all.

As usual see his lovely article for further details. Our hope is that they could one day be used to help us with our cancellation problems.


I thought that we might look at two other strategies that can be used in complexity theory, and that might have some more general applications.

{\bullet } Volume trick

Ben Cousins and Santosh Vempala recently wrote an intricate paper involving integrals that compute volumes. At its heart however is a simple idea. This is represent a volume {V} in the form

\displaystyle  V = \frac{A_{1}}{A_{2}} \cdot \frac{A_{2}}{A_{3}} \cdots \frac{A_{m-1}}{A_{m}},

where {A_m} is the hard quantity that we really want to compute. The trick is to select the other {A_{i}}‘s in a clever way so that each ratio is not hard to approximate. Yet the cancellation yields that

\displaystyle  V = A_{1} / A_{m}.

If {A_{1}} is simple, and if we have another way to estimate {V}, then we get an approximation for the hard to compute {A_{m}}.

{\bullet } Sampling trick

Let {S = \sum_{k} a_{k}} and {R = \sum_{k} a_{k}r(k) } where {r(k)} is a random function with expectation {p>0}. Then we know that {S} is equal to {E[R]/p}. Note that this does not need the values {r(k)} to be independent.

Can we use this idea to actually compute a tough sum? The sums we have in mind are inner products

\displaystyle  \langle X,Y \rangle = \sum_k \bar{x}_k y_k,

where the vectors {X} and {Y} are exponentially long but have useful properties. They might be succinct or might carry a promised condition such as we discussed in connection with an identity proved by Joseph Lagrange.

Both cases arise in problems of simulating quantum computations. Our hope is that the regularities in these cases enable one to restrict the ways in which cancellations can occur. A clever choice of dependence in the {r(k)} values then might interact with these restrictions. Well, at this point it is just an idea. Surveying things that boil down to this kind of idea could be another post, but for now we can invite you the readers to comment on your favorite examples.

Open Problems

Can we delineate methods for dealing with cancellations, also when we don’t need to estimate a sum closely but only prove it is positive?

[fixed omission of NP-oracle for Stockmeyer randomized counting]

by RJLipton+KWRegan at August 28, 2015 04:30 AM UTC

Authors: Thiago Braga Marcilon, Rudini Menezes Sampaio
Download: PDF
Abstract: In 2-neighborhood bootstrap percolation on a graph G, an infection spreads according to the following deterministic rule: infected vertices of G remain infected forever and in consecutive rounds healthy vertices with at least 2 already infected neighbors become infected. Percolation occurs if eventually every vertex is infected. The maximum time t(G) is the maximum number of rounds needed to eventually infect the entire vertex set. In 2013, it was proved \cite{eurocomb13} that deciding whether $t(G)\geq k$ is polynomial time solvable for k=2, but is NP-Complete for k=4 and, if the problem is restricted to bipartite graphs, it is NP-Complete for k=7. In this paper, we solve the open questions. We obtain an $O(mn^5)$-time algorithm to decide whether $t(G)\geq 3$. For bipartite graphs, we obtain an $O(mn^3)$-time algorithm to decide whether $t(G)\geq 3$, an $O(m^2n^9)$-time algorithm to decide whether $t(G)\geq 4$ and we prove that $t(G)\geq 5$ is NP-Complete.

at August 28, 2015 12:42 AM UTC

Authors: Thiago Braga Marcilon, Rudini Menezes Sampaio
Download: PDF
Abstract: In 2-neighborhood bootstrap percolation on a graph $G$, an infection spreads according to the following deterministic rule: infected vertices of $G$ remain infected forever and in consecutive rounds healthy vertices with at least two already infected neighbors become infected. Percolation occurs if eventually every vertex is infected. The maximum time $t(G)$ is the maximum number of rounds needed to eventually infect the entire vertex set. In 2013, it was proved by Benevides et al \cite{eurocomb13} that $t(G)$ is NP-hard for planar graphs and that deciding whether $t(G)\geq k$ is polynomial time solvable for $k\leq 2$, but is NP-complete for $k\geq 4$. They left two open problems about the complexity for $k=3$ and for planar bipartite graphs. In 2014, we solved the first problem\cite{wg2014}. In this paper, we solve the second one by proving that $t(G)$ is NP-complete even in grid graphs with maximum degree 3. We also prove that $t(G)$ is polynomial time solvable for solid grid graphs with maximum degree 3. Moreover, we prove that the percolation time problem is W[1]-hard on the treewidth of the graph, but it is fixed parameter tractable with parameters treewidth$+k$ and maxdegree$+k$.

at August 28, 2015 12:42 AM UTC

Authors: Gregory Gutin, Magnus Wahlstrom
Download: PDF
Abstract: The Workflow Satisfiability Problem (WSP) asks whether there exists an assignment of authorized users to the steps in a workflow specification, subject to certain constraints on the assignment. The problem is NP-hard even when restricted to just not equals constraints. Since the number of steps $k$ is relatively small in practice, Wang and Li (2010) introduced a parametrisation of WSP by $k$. Wang and Li (2010) showed that, in general, the WSP is W[1]-hard, i.e., it is unlikely that there exists a fixed-parameter tractable (FPT) algorithm for solving the WSP. Crampton et al. (2013) and Cohen et al. (2014) designed FPT algorithms of running time $O^*(2^{k})$ and $O^*(2^{k\log_2 k})$ for the WSP with so-called regular and user-independent constraints, respectively. In this note, we show that there are no algorithms of running time $O^*(2^{ck})$ and $O^*(2^{ck\log_2 k})$ for the two restrictions of WSP, respectively, with any $c<1$, unless the Strong Exponential Time Hypothesis fails.

at August 28, 2015 12:00 AM UTC

Authors: Carola Doerr, Johannes Lengler
Download: PDF
Abstract: Black-box complexity theory provides lower bounds for the runtime of black-box optimizers like evolutionary algorithms and serves as an inspiration for the design of new genetic algorithms. Several black-box models covering different classes of algorithms exist, each highlighting a different aspect of the algorithms under considerations. In this work we add to the existing black-box notions a new \emph{elitist black-box model}, in which algorithms are required to base all decisions solely on (a fixed number of) the best search points sampled so far. Our model combines features of the ranking-based and the memory-restricted black-box models with elitist selection.

We provide several examples for which the elitist black-box complexity is exponentially larger than that the respective complexities in all previous black-box models, thus showing that the elitist black-box complexity can be much closer to the runtime of typical evolutionary algorithms.

We also introduce the concept of $p$-Monte Carlo black-box complexity, which measures the time it takes to optimize a problem with failure probability at most $p$. Even for small~$p$, the $p$-Monte Carlo black-box complexity of a function class $\mathcal F$ can be smaller by an exponential factor than its typically regarded Las Vegas complexity (which measures the \emph{expected} time it takes to optimize $\mathcal F$).

at August 28, 2015 12:43 AM UTC

Authors: Jens Maßberg
Download: PDF
Abstract: In the rectilinear Steiner arborescence problem the task is to build a shortest rectilinear Steiner tree connecting a given root and a set of terminals which are placed in the plane such that all root-terminal-paths are shortest paths. This problem is known to be NP-hard.

In this paper we consider a more restricted version of this problem. In our case we have a depth restrictions $d(t)\in\mathbb{N}$ for every terminal $t$. We are looking for a shortest binary rectilinear Steiner arborescence such that each terminal $t$ is at depth $d(t)$, that is, there are exactly $d(t)$ Steiner points on the unique root-$t$-path is exactly $d(t)$. We prove that even this restricted version is NP-hard.

at August 28, 2015 12:42 AM UTC

Authors: Till Bruckdorfer, Michael Kaufmann, Stephen Kobourov, Sergey Pupyrev
Download: PDF
Abstract: Set membership of points in the plane can be visualized by connecting corresponding points via graphical features, like paths, trees, polygons, ellipses. In this paper we study the \emph{bus embeddability problem} (BEP): given a set of colored points we ask whether there exists a planar realization with one horizontal straight-line segment per color, called bus, such that all points with the same color are connected with vertical line segments to their bus. We present an ILP and an FPT algorithm for the general problem. For restricted versions of this problem, such as when the relative order of buses is predefined, or when a bus must be placed above all its points, we provide efficient algorithms. We show that another restricted version of the problem can be solved using 2-stack pushall sorting. On the negative side we prove the NP-completeness of a special case of BEP.

at August 28, 2015 12:43 AM UTC

I serve on the conference committee of the ACM publications board and we've had extensive discussions on the question of the role of journals in publication venues. A number of CS conferences, though notably not in TCS, are moving to a hybrid publication model where their conference presentations make their way into refereed journal papers. One of our proposals is the creating of a specific venue for these activities, a new Proceedings of the ACM. In the September CACM,  Joseph Konstan and Jack Davidson lay out this proposal, with pros and cons by Kathryn McKinley and David Rosenblum respectively. The community (that means you) is being asked to give their input.

The theory model has not significantly changed since I was a student. Papers submitted to a conference get reviewed but not refereed, the proofs read over usually just enough to feel confident that the theorem is likely correct. Once authors started submitting electronically they could submit entire proofs, though often in an appendix the program committee is not required to read.

The papers appear in a proceedings and to quote from the STOC proceedings preface
The submissions were not refereed, and many of these papers represent reports of continuing research. It is expected that most of them will appear in a more polished and complete form in scientific journals.
A small select number of papers from a conference are invited to a special issue of a journal where they do go through the full referee process. Some, but not most, of the other papers get submitted to journals directly. We don't have the proper incentives for authors to produce a journal version with full and complete proofs.

Should theory conferences move towards a more hybrid or PACM type of model? I'd had several debates with my fellow theorists many of whom feel the advantages of requiring journal-level papers get outweighed by the extra effort and time required by the authors and the reviewers.

by Lance Fortnow ( at August 27, 2015 12:01 PM UTC

The proceedings of CONCUR'15 have now been published as volume 42 of LIPIcs and are available here.

This is the first edition of CONCUR with open-access proceedings: every reader will be able to access each paper free of charge from now on.

by Luca Aceto ( at August 27, 2015 11:55 AM UTC

Authors: Aleksander Cisłak
Download: PDF
Abstract: In this work, we present a literature review for full-text and keyword indexes as well as our contributions (which are mostly practice-oriented).

The first contribution is the FM-bloated index, which is a modification of the well-known FM-index (a compressed, full-text index) that trades space for speed. In our approach, the count table and the occurrence lists store information about selected $q$-grams in addition to the individual characters. Two variants are described, namely one using $O(n \log^2 n)$ bits of space with $O(m + \log m \log \log n)$ average query time, and one with linear space and $O(m \log \log n)$ average query time, where $n$ is the input text length and $m$ is the pattern length. We experimentally show that a significant speedup can be achieved by operating on $q$-grams (albeit at the cost of very high space requirements, hence the name "bloated").

In the category of keyword indexes we present the so-called split index, which can efficiently solve the $k$-mismatches problem, especially for 1 error. Our implementation in the C++ language is focused mostly on data compaction, which is beneficial for the search speed (by being cache friendly). We compare our solution with other algorithms and we show that it is faster when the Hamming distance is used. Query times in the order of 1 microsecond were reported for one mismatch for a few-megabyte natural language dictionary on a medium-end PC.

A minor contribution includes string sketches which aim to speed up approximate string comparison at the cost of additional space ($O(1)$ per string). They can be used in the context of keyword indexes in order to deduce that two strings differ by at least $k$ mismatches with the use of fast bitwise operations rather than an explicit verification.

at August 27, 2015 12:44 AM UTC

Authors: N. R. Aravind, Pushkar S. Joglekar
Download: PDF
Abstract: We introduce and study the notion of read-$k$ projections of the determinant: a polynomial $f \in \mathbb{F}[x_1, \ldots, x_n]$ is called a {\it read-$k$ projection of determinant} if $f=det(M)$, where entries of matrix $M$ are either field elements or variables such that each variable appears at most $k$ times in $M$. A monomial set $S$ is said to be expressible as read-$k$ projection of determinant if there is a read-$k$ projection of determinant $f$ such that the monomial set of $f$ is equal to $S$. We obtain basic results relating read-$k$ determinantal projections to the well-studied notion of determinantal complexity. We show that for sufficiently large $n$, the $n \times n$ permanent polynomial $Perm_n$ and the elementary symmetric polynomials of degree $d$ on $n$ variables $S_n^d$ for $2 \leq d \leq n-2$ are not expressible as read-once projection of determinant, whereas $mon(Perm_n)$ and $mon(S_n^d)$ are expressible as read-once projections of determinant. We also give examples of monomial sets which are not expressible as read-once projections of determinant.

at August 27, 2015 12:40 AM UTC

Authors: Kokouvi Hounkanli, Andrzej Pelc
Download: PDF
Abstract: Broadcasting and gossiping are fundamental communication tasks in networks. In broadcasting,one node of a network has a message that must be learned by all other nodes. In gossiping, every node has a (possibly different) message, and all messages must be learned by all nodes. We study these well-researched tasks in a very weak communication model, called the {\em beeping model}. Communication proceeds in synchronous rounds. In each round, a node can either listen, i.e., stay silent, or beep, i.e., emit a signal. A node hears a beep in a round, if it listens in this round and if one or more adjacent nodes beep in this round. All nodes have different labels from the set $\{0,\dots , L-1\}$.

Our aim is to provide fast deterministic algorithms for broadcasting and gossiping in the beeping model. Let $N$ be an upper bound on the size of the network and $D$ its diameter. Let $m$ be the size of the message in broadcasting, and $M$ an upper bound on the size of all input messages in gossiping. For the task of broadcasting we give an algorithm working in time $O(D+m)$ for arbitrary networks, which is optimal. For the task of gossiping we give an algorithm working in time $O(N(M+D\log L))$ for arbitrary networks.

at August 27, 2015 12:43 AM UTC

Authors: Péter Biró, Walter Kern, Daniël Paulusma, Péter Wojuteczky
Download: PDF
Abstract: We generalize two well-known game-theoretic models by introducing multiple partners matching games, defined by a graph $G=(N,E)$, with an integer vertex capacity function $b$ and an edge weighting $w$. The set $N$ consists of a number of players that are to form a set $M\subseteq E$ of 2-player coalitions $ij$ with value $w(ij)$, such that each player $i$ is in at most $b(i)$ coalitions. A payoff is a mapping $p: N \times N \rightarrow {\mathbb R}$ with $p(i,j)+p(j,i)=w(ij)$ if $ij\in M$ and $p(i,j)=p(j,i)=0$ if $ij\notin M$. The pair $(M,p)$ is called a solution. A pair of players $i,j$ with $ij\in E\setminus M$ blocks a solution $(M,p)$ if $i, j$ can form, possibly only after withdrawing from one of their existing 2-player coalitions, a new 2-player coalition in which they are mutually better off. A solution is stable if it has no blocking pairs. We give a polynomial-time algorithm that either finds that no stable solution exists, or obtains a stable solution. Previously this result was only known for multiple partners assignment games, which correspond to the case where $G$ is bipartite (Sotomayor, 1992) and for the case where $b\equiv 1$ (Bir\'o et al., 2012). We also characterize the set of stable solutions of a multiple partners matching game in two different ways and perform a study on the core of the corresponding cooperative game, where coalitions of any size may be formed. In particular we show that the standard relation between the existence of a stable solution and the non-emptiness of the core, which holds in the other models with payments, is no longer valid for our (most general) model.

at August 27, 2015 12:00 AM UTC

Authors: Mohammad Bavarian, Dmitry Gavinsky, Tsuyoshi Ito
Download: PDF
Abstract: Two parties wish to carry out certain distributed computational tasks, and they are given access to a source of correlated random bits. It allows the parties to act in a correlated manner, which can be quite useful. But what happens if the shared randomness is not perfect? In this work, we initiate the study of the power of different sources of shared randomness in communication complexity. This is done in the setting of simultaneous message passing (SMP) model of communication complexity, which is one of the most suitable models for studying the resource of shared randomness. Toward characterising the power of various sources of shared randomness, we introduce a measure for the quality of a source - we call it collision complexity. Our results show that the collision complexity tightly characterises the power of a (shared) randomness resource in the SMP model.

Of independent interest is our demonstration that even the weakest sources of shared randomness can in some cases increase the power of SMP substantially: the equality function can be solved very efficiently with virtually any nontrivial shared randomness.

at August 27, 2015 12:42 AM UTC

Authors: Itai Arad, Miklos Santha, Aarthi Sundaram, Shengyu Zhang
Download: PDF
Abstract: A canonical result about satisfiability theory is that the 2-SAT problem can be solved in linear time, despite the NP-hardness of the 3-SAT problem. In the quantum 2-SAT problem, we are given a family of 2-qubit projectors $\Pi_{ij}$ on a system of $n$ qubits, and the task is to decide whether the Hamiltonian $H=\sum \Pi_{ij}$ has a 0-eigenvalue, or it is larger than $1/n^\alpha$ for some $\alpha=O(1)$. The problem is not only a natural extension of the classical 2-SAT problem to the quantum case, but is also equivalent to the problem of finding the ground state of 2-local frustration-free Hamiltonians of spin $\frac{1}{2}$, a well-studied model believed to capture certain key properties in modern condensed matter physics. While Bravyi has shown that the quantum 2-SAT problem has a classical polynomial-time algorithm, the running time of his algorithm is $O(n^4)$. In this paper we give a classical algorithm with linear running time in the number of local projectors, therefore achieving the best possible complexity.

at August 27, 2015 12:41 AM UTC

Authors: Agnieszka Lupinska
Download: PDF
Abstract: We present a simple parallel algorithm to test chordality of graphs which is based on the parallel Lexicographical Breadth-First Search algorithm. In total, the algorithm takes time O(N ) on N-threads machine and it performs work O(N 2 ) , where N is the number of vertices in a graph. Our implementation of the algorithm uses a GPU environment Nvidia CUDA C. The algorithm is implemented in CUDA 4.2 and it has been tested on Nvidia GeForce GTX 560 Ti of compute capability 2.1. At the end of the thesis we present the results achieved by our implementation and compare them with the results achieved by the sequential algorithm

at August 27, 2015 12:43 AM UTC

A bunch of people have asked me to comment on D-Wave’s release of its 1000-qubit processor, and a paper by a group including Cathy McGeoch saying that the machine is 1 or 2 orders of faster (in annealing time, not wall-clock time) than simulated annealing running on a single-core classical computer.  It’s even been suggested that the “Scott-signal” has been shining brightly for a week above Quantham City, but that Scott-man has been too lazy and out-of-shape even to change into his tights.

Scientifically, it’s not clear if much has changed.  D-Wave now has a chip with twice as many qubits as the last one.  That chip continues to be pretty effective at finding its own low-energy states: indeed, depending on various details of definition, the machine can even find its own low-energy states “faster” than some implementation of simulated annealing running on a single-core chip.  Of course, it’s entirely possible that Matthias Troyer or Sergei Isakov or Troels Ronnow or someone like that will be able to find a better implementation of simulated annealing that closes even the modest gap—as happened the last time—but I’ll give the authors the benefit of the doubt that they put good-faith effort into optimizing the classical code.

More importantly, I’d say it remains unclear whether any of the machine’s performance on the instances tested here can be attributed to quantum tunneling effects.  In fact, the paper explicitly states (see page 3) that it’s not going to consider such questions, and I think the authors would agree that you could very well see results like theirs, even if what was going on was fundamentally classical annealing.  Also, of course, it’s still true that, if you wanted to solve a practical optimization problem, you’d first need to encode it into the Chimera graph, and that reduction entails a loss that could hand a decisive advantage to simulated annealing, even without the need to go to multiple cores.  (This is what I’ve described elsewhere as essentially all of these performance comparisons taking place on “the D-Wave machine’s home turf”: that is, on binary constraint satisfaction problems that have precisely the topology of D-Wave’s Chimera graph.)

But, I dunno, I’m just not feeling the urge to analyze this in more detail.  Part of the reason is that I think the press might be getting less hyper-excitable these days, thereby reducing the need for a Chief D-Wave Skeptic.  By this point, there may have been enough D-Wave announcements that papers realize they no longer need to cover each one like an extraterrestrial landing.  And there are more hats in the ring now, with John Martinis at Google seeking to build superconducting quantum annealing machines but with ~10,000x longer coherence times than D-Wave’s, and with IBM Research and some others also trying to scale superconducting QC.  The realization has set in, I think, that both D-Wave and the others are in this for the long haul, with D-Wave currently having lots of qubits, but with very short coherence times and unclear prospects for any quantum speedup, and Martinis and some others having qubits of far higher quality, but not yet able to couple enough of them.

The other issue is that, on my flight from Seoul back to Newark, I watched two recent kids’ movies that were almost defiant in their simple, unironic, 1950s-style messages of hope and optimism.  One was Disney’s new live-action Cinderella; the other was Brad Bird’s Tomorrowland.  And seeing these back-to-back filled me with such positivity and good will that, at least for these few hours, it’s hard to summon my usual crusty self.  I say, let’s invent the future together, and build flying cars and jetpacks in our garages!  Let a thousand creative ideas bloom for how to tackle climate change and the other crises facing civilization!  (Admittedly, mass-market flying cars and jetpacks are probably not a step forward on climate change … but, see, there’s that negativity coming back.)  And let another thousand ideas bloom for how to build scalable quantum computers—sure, including D-Wave’s!  Have courage and be kind!

So yeah, if readers would like to discuss the recent D-Wave paper further (especially those who know something about it), they’re more than welcome to do so in the comments section.  But I’ve been away from Dana and Lily for two weeks, and will endeavor to spend time with them rather than obsessively reloading the comments (let’s see if I succeed).

As a small token of my goodwill, I enclose two photos from my last visit to a D-Wave machine, which occurred when I met with some grad students in Waterloo this past spring.  As you can see, I even personally certified that the machine was operating as expected.  But more than that: surpassing all reasonable expectations for quantum AI, this model could actually converse intelligently, through a protruding head resembling that of IQC grad student Sarah Kaiser.


by Scott at August 26, 2015 09:45 PM UTC

We introduce and study the notion of read-$k$ projections of the determinant: a polynomial $f \in \mathbb{F}[x_1, \ldots, x_n]$ is called a {\it read-$k$ projection of determinant} if $f=det(M)$, where entries of matrix $M$ are either field elements or variables such that each variable appears at most $k$ times in $M$. A monomial set $S$ is said to be expressible as read-$k$ projection of determinant if there is a read-$k$ projection of determinant $f$ such that the monomial set of $f$ is equal to $S$. We obtain basic results relating read-$k$ determinantal projections to the well-studied notion of determinantal complexity. We show that for sufficiently large $n$, the $n \times n$ permanent polynomial $Perm_n$ and the elementary symmetric polynomials of degree $d$ on $n$ variables $S_n^d$ for $2 \leq d \leq n-2$ are not expressible as read-once projection of determinant, whereas $mon(Perm_n)$ and $mon(S_n^d)$ are expressible as read-once projections of determinant. We also give examples of monomial sets which are not expressible as read-once projections of determinant.

at August 26, 2015 09:02 PM UTC

Chemical reaction networks (CRNs) model the behavior of molecules in a well-mixed system. The emerging field of molecular programming uses CRNs not only as a descriptive tool, but as a programming language for chemical computation. Recently, Chen, Doty and Soloveichik introduced a new model of chemical kinetics, rate-independent continuous CRNs (CCRNs), to study the chemical computation of continuous functions. A fundamental question of a CRN is whether a state of the system is reachable through a sequence of reactions in the network. This is known as the reachability problem. In this paper, we investigate CCRN-REACH, the reachability problem for this model of chemical reaction networks. We show that, for continuous CRNs, constructing a path to a state of the network is computable in polynomial time. We also prove that a related problem, Sub-CCRN-REACH, is NP-complete.

at August 26, 2015 07:37 PM UTC

Authors: Reuven Cohen, Liran Katzir, Aviv Yehezkel
Download: PDF
Abstract: Cardinality estimation algorithms receive a stream of elements whose order might be arbitrary, with possible repetitions, and return the number of distinct elements. Such algorithms usually seek to minimize the required storage and processing at the price of inaccuracy in their output. Real-world applications of these algorithms are required to process large volumes of monitored data, making it impractical to collect and analyze the entire input stream. In such cases, it is common practice to sample and process only a small part of the stream elements. This paper presents and analyzes a generic algorithm for combining every cardinality estimation algorithm with a sampling process. We show that the proposed sampling algorithm does not affect the estimator's asymptotic unbiasedness, and we analyze the sampling effect on the estimator's variance.

at August 26, 2015 12:41 AM UTC

Authors: Gili Rosenberg, Poya Haghnegahdar, Phil Goddard, Peter Carr, Kesheng Wu, Marcos López de Prado
Download: PDF
Abstract: We solve a multi-period portfolio optimization problem using D-Wave Systems' quantum annealer. We derive a formulation of the problem, discuss several possible integer encoding schemes, and present numerical examples that show high success rates. The formulation incorporates transaction costs (including permanent and temporary market impact), and, significantly, the solution does not require the inversion of a covariance matrix. The discrete multi-period portfolio optimization problem we solve is significantly harder than the continuous variable problem. We present insight into how results may be improved using suitable software enhancements, and why current quantum annealing technology limits the size of problem that can be successfully solved today. The formulation presented is specifically designed to be scalable, with the expectation that as quantum annealing technology improves, larger problems will be solvable using the same techniques.

at August 26, 2015 12:41 AM UTC

Authors: Changsoo Je, Min Tang, Youngeun Lee, Minkyoung Lee, Young J. Kim
Download: PDF
Abstract: We present a real-time algorithm that finds the Penetration Depth (PD) between general polygonal models based on iterative and local optimization techniques. Given an in-collision configuration of an object in configuration space, we find an initial collision-free configuration using several methods such as centroid difference, maximally clear configuration, motion coherence, random configuration, and sampling-based search. We project this configuration on to a local contact space using a variant of continuous collision detection algorithm and construct a linear convex cone around the projected configuration. We then formulate a new projection of the in-collision configuration onto the convex cone as a Linear Complementarity Problem (LCP), which we solve using a type of Gauss-Seidel iterative algorithm. We repeat this procedure until a locally optimal PD is obtained. Our algorithm can process complicated models consisting of tens of thousands triangles at interactive rates.

at August 26, 2015 12:42 AM UTC

Authors: Per Austrin, Mikko Koivisto, Petteri Kaski, Jesper Nederlof
Download: PDF
Abstract: The Subset Sum problem asks whether a given set of $n$ positive integers contains a subset of elements that sum up to a given target $t$. It is an outstanding open question whether the $O^*(2^{n/2})$-time algorithm for Subset Sum by Horowitz and Sahni [J. ACM 1974] can be beaten in the worst-case setting by a "truly faster", $O^*(2^{(0.5-\delta)n})$-time algorithm, with some constant $\delta > 0$. Continuing an earlier work [STACS 2015], we study Subset Sum parameterized by the maximum bin size $\beta$, defined as the largest number of subsets of the $n$ input integers that yield the same sum. For every $\epsilon > 0$ we give a truly faster algorithm for instances with $\beta \leq 2^{(0.5-\epsilon)n}$, as well as instances with $\beta \geq 2^{0.661n}$. Consequently, we also obtain a characterization in terms of the popular density parameter $n/\log_2 t$: if all instances of density at least $1.003$ admit a truly faster algorithm, then so does every instance. This goes against the current intuition that instances of density 1 are the hardest, and therefore is a step toward answering the open question in the affirmative. Our results stem from novel combinations of earlier algorithms for Subset Sum and a study of an extremal question in additive combinatorics connected to the problem of Uniquely Decodable Code Pairs in information theory.

at August 26, 2015 12:00 AM UTC

Authors: Kaarthik Sundar, Saravanan Venkatachalam, Sivakumar Rathinam
Download: PDF
Abstract: We consider a multiple depot, multiple vehicle routing problem with fuel constraints. We are given a set of targets, a set of depots and a set of homogeneous vehicles, one for each depot. The depots are also allowed to act as refueling stations. The vehicles are allowed to refuel at any depot, and our objective is to determine a route for each vehicle with a minimum total cost such that each target is visited at least once by some vehicle, and the vehicles never run out fuel as it traverses its route. We refer this problem as Multiple Depot, Fuel-Constrained, Multiple Vehicle Routing Problem (FCMVRP). This paper presents four new mixed integer linear programming formulations to compute an optimal solution for the problem. Extensive computational results for a large set of instances are also presented.

at August 26, 2015 12:41 AM UTC

Authors: Gang Mei
Download: PDF
Abstract: This paper presents a fast implementation of the Graham scan on the GPU. The proposed algorithm is composed of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of finding the convex hull on the CPU. We first discard the interior points that locate inside a quadrilateral formed by four extreme points, sort the remaining points according to the angles, and then divide them into the left and the right regions. For each region, we perform a second round of filtering using the proposed preprocessing approach to discard the interior points in further. We finally obtain the expected convex hull by calculating the convex hull of the remaining points on the CPU. We directly employ the parallel sorting, reduction, and partitioning provided by the library Thrust for better efficiency and simplicity. Experimental results show that our implementation achieves 6x ~ 7x speedups over the Qhull implementation for 20M points.

at August 26, 2015 12:43 AM UTC

We give a self contained proof of a logarithmic lower bound on the communication complexity of any non redundant function, given that there is no access to shared randomness. This bound was first stated in Yao's seminal paper [STOC 1979], but no full proof appears in the literature. Our proof uses the method of Babai and Kimmel [Computational Complexity 1997], introduced there in the context of the simultaneous messages model, applying it to the more general and standard communication model of Yao.

at August 25, 2015 09:59 PM UTC

There's a fun pair of viewpoints in the September CACM by Jeffrey Ullman and myself on experiments in computer science research, with him addressing systems conferences(/people) being far too focused on experiments as the research validation methodology, and me addressing theory conferences(/people) being almost strangely averse to experimental results.  (This link may bring you a digital version of his viewpoint, and this link to mine.)  I hope they might be interesting reading or food for thought.  As someone who works in both camps, I find this separation -- which we both seem to think is growing -- worrisome for the future of the CS research community.   

We actually wrote these up about a year ago (or maybe longer).  Jeff wrote something on the topic on Google+, and I responded.  I think he got drafted into writing something for CACM, and then I got drafted in later.  There was a pretty thorough reviewing process, with a couple of back and forth rounds;  then there was a non-trivial wait for publication.  This seems OK to me -- I'm glad CACM has a non-trivial queue of items for publication.  Overall it was a thorough and reasonably pleasant publication experience, and it's appealing that CACM offers a platform for these types of editorial comments.

by Michael Mitzenmacher ( at August 25, 2015 09:09 PM UTC

Another spin-off of the Noga-poster-formula-competition is a MathOverflow question:  Important formulas in combinatorics.

The question collects important formulas representing major progress in combinatorics.

So far there are 31 formulas and quite a few were new to me. There are several areas of combinatorics that are not yet represented. As is natural, many formulas come from enumerative combinatorics. Don’t hesitate to contribute (best – on MathOverflow) more formulas!

Here are a few:combform

by Gil Kalai at August 25, 2015 03:39 PM UTC

This paper initiates the study of $k$-catalytic branching programs, a nonuniform model of computation that is the counterpart to the uniform catalytic space model of Buhrman, Cleve, Koucky, Loff and Speelman (STOC 2014). A $k$-catalytic branching program computes $k$ boolean functions simultaneously on the same $n$-bit input. Each function has its own entry, accept and reject nodes in the branching program but internal nodes can otherwise be shared. The question of interest is to determine the conditions under which some form of a direct sum property for deterministic space can be shown to hold, or shown to fail. We exhibit both cases. There are functions that are correlated in such a way that their direct sum fails: a significant amount of space can be saved by sharing internal computation among the $k$ functions. By contrast, direct sum is shown to hold in some special cases, such as for the family of functions $(l_1\circ (l_2\circ (\cdots (l_{n-1}\circ l_n)\cdots)$ where each $l_i$ is a literal on variable $x_i$, $1\leq i\leq n$ and each $\circ$ stands individually for either $\wedge$ or $\vee$.

at August 25, 2015 02:36 PM UTC

Authors: Tomoyuki Yamakami
Download: PDF
Abstract: This paper continues a systematic and comprehensive study on the structural properties of CFL functions, which are in general multi-valued partial functions computed by one-way one-head nondeterministic pushdown automata equipped with write-only output tapes (or pushdown transducers), where CFL refers to a relevance to context-free languages. The CFL functions tend to behave quite differently from their corresponding context-free languages. We extensively discuss containments, separations, and refinements among various classes of functions obtained from the CFL functions by applying Boolean operations, functional composition, many-one relativization, and Turing relativization. In particular, Turing relativization helps construct a hierarchy over the class of CFL functions. We also analyze the computational complexity of optimization functions, which are to find optimal values of CFL functions, and discuss their relationships to the associated languages.

at August 25, 2015 12:40 AM UTC

Authors: Rafal Kapelko, Evangelos Kranakis
Download: PDF
Abstract: Consider $n$ sensors placed randomly and independently with the uniform distribution in a $d-$dimensional unit cube ($d\ge 2$). The sensors have identical sensing range equal to $r$, for some $r >0$. We are interested in moving the sensors from their initial positions to new positions so as to ensure that the $d-$dimensional unit cube is completely covered, i.e., every point in the $d-$dimensional cube is within the range of a sensor. If the $i$-th sensor is displaced a distance $d_i$, what is a displacement of minimum cost? As cost measure for the displacement of the team of sensors we consider the $a$-total movement defined as the sum $M_a:= \sum_{i=1}^n d_i^a$, for some constant $a>0$. We assume that $r$ and $n$ are chosen so as to allow full coverage of the $d-$dimensional unit cube and $a > 0$.

The main contribution of the paper is to show the existence of a tradeoff between the $d-$dimensional cube, sensing radius and $a$-total movement. The main results can be summarized as follows for the case of the $d-$dimensional cube.

If the $d-$dimensional cube sensing radius is $\frac{1}{2n^{1/d}}$ and $n=m^d$, for some $m\in N$, then we present an algorithm that uses $O\left(n^{1-\frac{a}{2d}}\right)$ total expected movement (see Algorithm 2 and Theorem 5).

If the $d-$dimensional cube sensing radius is greater than $\frac{3^{3/d}}{(3^{1/d}-1)(3^{1/d}-1)}\frac{1}{2n^{1/d}}$ and $n$ is a natural number then the total expected movement is $O\left(n^{1-\frac{a}{2d}}\left(\frac{\ln n}{n}\right)^{\frac{a}{2d}}\right)$ (see Algorithm 3 and Theorem 7).

In addition, we simulate Algorithm 2 and discuss the results of our simulations.

at August 25, 2015 12:00 AM UTC

Authors: Patrizio Angelini, Till Bruckdorfer, Michael Kaufmann, Tamara Mchedlidze
Download: PDF
Abstract: A point set $S \subseteq \mathbb{R}^2$ is universal for a class $\cal G$ if every graph of ${\cal G}$ has a planar straight-line embedding on $S$. It is well-known that the integer grid is a quadratic-size universal point set for planar graphs, while the existence of a sub-quadratic universal point set for them is one of the most fascinating open problems in Graph Drawing. Motivated by the fact that outerplanarity is a key property for the existence of small universal point sets, we study 2-outerplanar graphs and provide for them a universal point set of size $O(n \log n)$.

at August 25, 2015 12:00 AM UTC

Authors: Alexandr Kazda
Download: PDF
Abstract: We show that if $\mathbb A$ is a core relational structure such that $CSP(\mathbb{A})$ can be solved by a linear Datalog program, and $\mathbb A$ is $n$-permutable for some $n$, then $CSP(\mathbb A)$ can be solved by a symmetric Datalog program (and thus $CSP(\mathbb{A})$ lies in deterministic logspace). At the moment, it is not known for which structures $\mathbb A$ will $CSP(\mathbb{A})$ be solvable by a linear Datalog program. However, once somebody obtains a characterization of linear Datalog, our result immediately gives a characterization of symmetric Datalog.

at August 25, 2015 12:41 AM UTC

Authors: Michael Codish, Luís Cruz-Filipe, Michael Frank, Peter Schneider-Kamp
Download: PDF
Abstract: We apply the pigeonhole principle to show that there must exist Boolean functions on 7 inputs with a multiplicative complexity of at least 7, i.e., that cannot be computed with only 6 multiplications in the Galois field with two elements.

at August 25, 2015 12:41 AM UTC

Authors: Oliver Knill
Download: PDF
Abstract: The zero locus of a function f on a graph G is defined as the graph with vertex set consisting of all complete subgraphs of G, on which f changes sign and where x,y are connected if one is contained in the other. For d-graphs, finite simple graphs for which every unit sphere is a d-sphere, the zero locus of (f-c) is a (d-1)-graph for all c different from the range of f. If this Sard lemma is inductively applied to an ordered list functions f_1,...,f_k in which the functions are extended on the level surfaces, the set of critical values (c_1,...,c_k) for which F-c=0 is not a (d-k)-graph is a finite set. This discrete Sard result allows to construct explicit graphs triangulating a given algebraic set. We also look at a second setup: for a function F from the vertex set to R^k, we give conditions for which the simultaneous discrete algebraic set { F=c } defined as the set of simplices of dimension in {k, k+1,...,n} on which all f_i change sign, is a (d-k)-graph in the barycentric refinement of G. This maximal rank condition is adapted from the continuum and the graph { F=c } is a (n-k)-graph. While now, the critical values can have positive measure, we are closer to calculus: for k=2 for example, extrema of functions f under a constraint {g=c} happen at points, where the gradients of f and g are parallel D f = L D g, the Lagrange equations on the discrete network. As for an application, we illustrate eigenfunctions of geometric graphs and especially the second eigenvector of 3-spheres, which by Courant-Fiedler has exactly two nodal regions. The separating nodal surface of the second eigenfunction f_2 belonging to the smallest nonzero eigenvalue always appears to be a 2-sphere in experiments if G is a 3-sphere.

at August 25, 2015 12:00 AM UTC

Authors: Richard C. Brewster, Sean McGuinness, Benjamin Moore, Jonathan A. Noel
Download: PDF
Abstract: The "reconfiguration problem" for circular colourings asks, given two $(p,q)$-colourings $f$ and $g$ of a graph $G$, is it possible to transform $f$ into $g$ by changing the colour of one vertex at a time such that every intermediate mapping is a $(p,q)$-colouring? We show that this problem can be solved in polynomial time for $2\leq p/q <4$ and is PSPACE-complete for $p/q\geq 4$. This generalizes a known dichotomy theorem for reconfiguring classical graph colourings.

at August 25, 2015 12:40 AM UTC

Authors: Daxin Zhu, Lei Wang, Yingjie Wu, Xiaodong Wang
Download: PDF
Abstract: In this paper, we revisit the much studied LCS problem for two given sequences. Based on the algorithm of Iliopoulos and Rahman for solving the LCS problem, we have suggested 3 new improved algorithms. We first reformulate the problem in a very succinct form. The problem LCS is abstracted to an abstract data type DS on an ordered positive integer set with a special operation Update(S,x). For the two input sequences X and Y of equal length n, the first improved algorithm uses a van Emde Boas tree for DS and its time and space complexities are O(R\log\log n+n) and O(R), where R is the number of matched pairs of the two input sequences. The second algorithm uses a balanced binary search tree for DS and its time and space complexities are O(R\log L+n) and O(R), where L is the length of the longest common subsequence of X and Y. The third algorithm uses an ordered vector for DS and its time and space complexities are O(nL) and O(R).

at August 25, 2015 01:06 AM UTC

Authors: Ilias Diakonikolas, Daniel M. Kane, Vladimir Nikishkin
Download: PDF
Abstract: We give a general unified method that can be used for $L_1$ {\em closeness testing} of a wide range of univariate structured distribution families. More specifically, we design a sample optimal and computationally efficient algorithm for testing the equivalence of two unknown (potentially arbitrary) univariate distributions under the $\mathcal{A}_k$-distance metric: Given sample access to distributions with density functions $p, q: I \to \mathbb{R}$, we want to distinguish between the cases that $p=q$ and $\|p-q\|_{\mathcal{A}_k} \ge \epsilon$ with probability at least $2/3$. We show that for any $k \ge 2, \epsilon>0$, the {\em optimal} sample complexity of the $\mathcal{A}_k$-closeness testing problem is $\Theta(\max\{ k^{4/5}/\epsilon^{6/5}, k^{1/2}/\epsilon^2 \})$. This is the first $o(k)$ sample algorithm for this problem, and yields new, simple $L_1$ closeness testers, in most cases with optimal sample complexity, for broad classes of structured distributions.

at August 25, 2015 12:00 AM UTC

Authors: Gang Mei
Download: PDF
Abstract: This paper presents a practical GPU-accelerated convex hull algorithm and a novel Sorting-based Preprocessing Approach (SPA) for planar point sets. The proposed algorithm consists of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of calculating the expected convex hull on the CPU. We first discard the interior points that locate inside a quadrilateral formed by four extreme points, and then distribute the remaining points into several (typically four) sub regions. For each subset of points, we first sort them in parallel, then perform the second round of discarding using SPA, and finally form a simple chain for the current remaining points. A simple polygon can be easily generated by directly connecting all the chains in sub regions. We at last obtain the expected convex hull of the input points by calculating the convex hull of the simple polygon. We use the library Thrust to realize the parallel sorting, reduction, and partitioning for better efficiency and simplicity. Experimental results show that our algorithm achieves 5x ~ 6x speedups over the Qhull implementation for 20M points. Thus, this algorithm is competitive in practical applications for its simplicity and satisfied efficiency.

at August 25, 2015 12:00 AM UTC