Theory of Computing Blog Aggregator

Authors: Tetsuya Araki, Koji M. Kobayashi
Download: PDF
Abstract: Kierstead and Trotter (Congressus Numerantium 33, 1981) proved that their algorithm is an optimal online algorithm for the online interval coloring problem. In this paper, for online unit interval coloring, we show that the number of colors used by the Kierstead-Trotter algorithm is at most $3 \omega(G) - 3$, where $\omega(G)$ is the size of the maximum clique in a given graph $G$, and that this bound is best possible.
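
For readers unfamiliar with the setting, the sketch below shows the simpler First-Fit heuristic for online interval coloring; it is only meant to illustrate what an online interval coloring algorithm does, and is not the Kierstead-Trotter algorithm analyzed in the paper. The function name and the example intervals are made up for illustration.

    # First-Fit online interval coloring (a simple baseline, NOT the Kierstead-Trotter
    # algorithm): each arriving interval gets the smallest color not used by any
    # previously colored interval that overlaps it.
    def first_fit_online(intervals):
        colored = []                      # list of ((l, r), color) seen so far
        colors = []
        for (l, r) in intervals:          # intervals arrive online in this order
            used = {c for (a, b), c in colored if a < r and l < b}
            c = 0
            while c in used:
                c += 1
            colored.append(((l, r), c))
            colors.append(c)
        return colors

    print(first_fit_online([(0, 2), (1, 3), (2, 4), (0.5, 1.5)]))   # [0, 1, 0, 2]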

at September 29, 2016 01:00 AM UTC

Authors: Till Schäfer, Petra Mutzel
Download: PDF
Abstract: We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the clustering process, and helps to interpret the clustering results. The projection-based nature of the clustering approach allows us to bypass dimensionality and feature extraction problems that arise in the context of graph datasets reduced to pairwise distances or feature vectors. While achieving high quality and (human) interpretable clusterings, the runtime of the algorithm only grows linearly with the number of graphs. Furthermore, the approach is easy to parallelize and therefore suitable for very large datasets. Our extensive experimental evaluation on synthetic and real-world datasets demonstrates the superiority of our approach over existing structural and subspace clustering algorithms, both from a runtime and a quality point of view.

at September 29, 2016 01:02 AM UTC

Authors: Ioannis Avramopoulos
Download: PDF
Abstract: We show that, by using multiplicative weights in a game-theoretic thought experiment (and an important convexity result on the composition of multiplicative weights with the relative entropy function), a symmetric bimatrix game (that is, a bimatrix game wherein the payoff matrix of each player is the transpose of the payoff matrix of the other) either has an interior symmetric equilibrium or there is a pure strategy that is weakly dominated by some mixed strategy. Weakly dominated pure strategies can be detected and eliminated in polynomial time by solving a linear program. Furthermore, interior symmetric equilibria are a special case of a more general notion, namely, that of an "equalizer," which can also be computed efficiently in polynomial time by solving a linear program. An elegant "symmetrization method" of bimatrix games [Jurg et al., 1992] and the well-known PPAD-completeness results on equilibrium computation in bimatrix games [Daskalakis et al., 2009, Chen et al., 2009] then imply the compelling conclusion that P = PPAD.

at September 29, 2016 01:00 AM UTC

Authors: George B. Mertzios, André Nichterlein, Rolf Niedermeier
Download: PDF
Abstract: Finding maximum-cardinality matchings in undirected graphs is arguably one of the most central graph problems. For $m$-edge and $n$-vertex graphs, it is well-known to be solvable in $O(m\sqrt{n})$ time; however, for several applications this running time is still too slow. Improving this worst-case upper bound has resisted decades of research. In this paper we mainly focus on parameterizations of the input with respect to several kinds of distance to triviality, i.e., how far the input is from certain linear-time solvable cases. Our contribution is twofold. First we focus on linear-time fixed-parameter algorithms (with low polynomial parameter dependence). To this end we develop the first linear-time algorithm for maximum matching on cocomparability graphs; this algorithm is based on the recently discovered Lexicographic Depth First Search (LDFS) and is of independent interest. Using this algorithm we derive an $O(k(n+m))$-time algorithm for general graphs, where $k$ is the vertex deletion distance to cocomparability graphs. Second we focus on linear-time kernelization. We start a deeper and systematic study of various "distance to triviality" parameters for the maximum matching problem. We design linear (and almost linear) time computable kernels of size $O(k)$, $O(k^2)$, $O(k^3)$, and $2^{O(k)}$, respectively, where $k$ is the considered parameter in each case. Investigating linear-time kernelization of a polynomial-time solvable problem, such as maximum matching, leads to a wealth of new and combinatorially interesting challenges. Based on our results, we postulate that maximum matching has the clear potential to become the "drosophila" of "FPT in P" studies, analogously to the path-breaking role that vertex cover played in classical FPT studies of NP-hard problems.
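
For context, the general-graph baseline that such parameterized algorithms are compared against is a blossom-type matching computation. A minimal sketch of querying it through a standard library (assuming networkx is available; this does not implement the paper's cocomparability or kernelization algorithms):

    # Maximum-cardinality matching on a general graph via networkx's blossom-based
    # routine (baseline only; the paper's parameterized algorithms are not shown here).
    import networkx as nx

    G = nx.petersen_graph()
    M = nx.max_weight_matching(G, maxcardinality=True)   # set of matched edge pairs
    print(len(M))   # 5: the Petersen graph has a perfect matching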

at September 29, 2016 01:04 AM UTC

Authors: Guillaume Bosc, Chedy Raïssy, Jean-François Boulicaut, Mehdi Kaytoue
Download: PDF
Abstract: Discovering descriptions that strongly distinguish one class label from another is still a challenging task. Such patterns enable the building of intelligible classifiers and suggest hypotheses that may explain the presence of a label. Subgroup Discovery (SD), a framework that formally defines this pattern mining task, still faces two major issues: (i) defining appropriate quality measures characterizing the singularity of a pattern; (ii) choosing an accurate heuristic exploration of the search space when a complete enumeration is infeasible. To date, the most efficient SD algorithms are based on a beam search. However, the resulting pattern collection lacks diversity due to the greedy nature of the search. We propose to use a recent exploration technique, Monte Carlo Tree Search (MCTS). To the best of our knowledge, this is the first attempt to apply MCTS to pattern mining. The exploration/exploitation trade-off and the power of random search lead to any-time mining (a solution is available at any time and improves over time) that generally outperforms beam search. Our empirical study on various benchmark and real-world datasets shows the strength of our approach with several quality measures.

at September 29, 2016 01:08 AM UTC

Authors: Yair Bartal, Arnold Filtser, Ofer Neiman
Download: PDF
Abstract: Minimum Spanning Trees of weighted graphs are fundamental objects in numerous applications. In particular in distributed networks, the minimum spanning tree of the network is often used to route messages between network nodes. Unfortunately, while being most efficient in the total cost of connecting all nodes, minimum spanning trees fail miserably in the desired property of approximately preserving distances between pairs. While known lower bounds exclude the possibility of the worst case distortion of a tree being small, it was shown in [ABN15] that there exists a spanning tree with constant average distortion. Yet, the weight of such a tree may be significantly larger than that of the MST. In this paper, we show that any weighted undirected graph admits a spanning tree whose weight is at most (1+\rho) times that of the MST, providing constant average distortion O(1/\rho).

The constant average distortion bound is implied by a stronger property of scaling distortion, i.e., improved distortion for smaller fractions of the pairs. The result is achieved by first showing the existence of a low weight spanner with small prioritized distortion, a property allowing to prioritize the nodes whose associated distortions will be improved. We show that prioritized distortion is essentially equivalent to coarse scaling distortion via a general transformation, which has further implications and may be of independent interest. In particular, we obtain an embedding for arbitrary metrics into Euclidean space with optimal prioritized distortion.

at September 29, 2016 01:03 AM UTC

Authors: Shahram Ghandeharizadeh, Sandy Irani, Jenny Lam
Download: PDF
Abstract: We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. This problem is motivated by the design of caching systems composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number $n$ of data objects (items) is very large and the number $d$ of memory banks (bins) is a small constant (on the order of $3$ or $4$). Therefore, the goal is to determine an optimal assignment in time that minimizes dependence on $n$. The integral version of this problem is NP-hard since it is a generalization of the knapsack problem. We focus on an efficient solution to the LP relaxation as the number of fractionally assigned items will be at most $d$. If the data objects are small with respect to the size of the memory banks, the effect of excluding the fractionally assigned data items from the cache will be small. We give an algorithm that solves the LP relaxation and runs in time $O({3^d \choose d+1} \text{poly}(d) n \log(n) \log(nC) \log(Z))$, where $Z$ is the maximum item size and $C$ the maximum storage cost.
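
To make the LP relaxation concrete, here is a minimal sketch that writes the fractional subset assignment problem as an explicit linear program on a made-up toy instance and hands it to a generic solver (assuming scipy is available). This is only an illustration of the LP being relaxed, not the paper's algorithm, which is designed to avoid enumerating all $n \cdot 2^d$ variables explicitly.

    # LP relaxation of subset assignment on a tiny hypothetical instance.
    # Variables x[i, S] = fraction of item i assigned to the subset S of bins.
    from itertools import chain, combinations
    import numpy as np
    from scipy.optimize import linprog

    d = 3                                  # number of bins (memory banks)
    cap = [10.0, 8.0, 5.0]                 # bin capacities
    sizes = [4.0, 3.0, 2.0, 6.0]           # item sizes
    n = len(sizes)

    subsets = list(chain.from_iterable(combinations(range(d), r) for r in range(d + 1)))
    rng = np.random.default_rng(0)
    # cost[i, S]: assumed cost of placing item i on subset S; the empty subset
    # (not caching the item) is deliberately made the most expensive option.
    cost = {(i, S): (5.0 * sizes[i] if not S else rng.uniform(0.5, 2.0) * len(S))
            for i in range(n) for S in subsets}

    var = {key: k for k, key in enumerate(cost)}     # column index of each (i, S)
    c = np.zeros(len(var))
    for key, k in var.items():
        c[k] = cost[key]

    # Each item is (fractionally) assigned to exactly one subset.
    A_eq = np.zeros((n, len(var))); b_eq = np.ones(n)
    for (i, S), k in var.items():
        A_eq[i, k] = 1.0

    # Capacity of every bin j: total size of fractions replicated onto j <= cap[j].
    A_ub = np.zeros((d, len(var))); b_ub = np.array(cap)
    for (i, S), k in var.items():
        for j in S:
            A_ub[j, k] += sizes[i]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    print("optimal fractional cost:", res.fun)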

at September 29, 2016 01:01 AM UTC

Authors: Sariel Har-Peled, Piotr Indyk, Sepideh Mahabadi
Download: PDF
Abstract: In the Sparse Linear Regression (SLR) problem, given a $d \times n$ matrix $M$ and a $d$-dimensional vector $q$, we want to compute a $k$-sparse vector $\tau$ such that the error $||M \tau-q||$ is minimized. In this paper, we present algorithms and conditional lower bounds for several variants of this problem. In particular, we consider (i) the Affine SLR where we add the constraint that $\sum_i \tau_i=1$ and (ii) the Convex SLR where we further add the constraint that $\tau \geq 0$. Furthermore, we consider (i) the batched (offline) setting, where the matrix $M$ and the vector $q$ are given as inputs in advance, and (ii) the query (online) setting, where an algorithm preprocesses the matrix $M$ to quickly answer such queries. All of the aforementioned variants have been well-studied and have many applications in statistics, machine learning and sparse recovery.

We consider the approximate variants of these problems in the "low sparsity regime" where the value of the sparsity bound $k$ is low. In particular, we show that the online variant of all three problems can be solved with query time $\tilde O(n^{k-1})$. This provides non-trivial improvements over the naive algorithm that exhaustively searches all ${ n \choose k}$ subsets $B$. We also show that solving the offline variant of all three problems would require an exponential dependence of the form $\tilde \Omega(n^{k/2}/e^{k})$, under a natural complexity-theoretic conjecture. Improving this lower bound for the case of $k=4$ would imply a nontrivial lower bound for Hopcroft's famous problem. Moreover, solving the offline variant of affine SLR in $o(n^{k-1})$ would imply an upper bound of $o(n^d)$ for the problem of testing whether a given set of $n$ points in a $d$-dimensional space is degenerate. However, this is conjectured to require $\Omega(n^d)$ time.

at September 29, 2016 12:00 AM UTC

Authors: Takuya Akiba, Yoichi Iwata, Yosuke Sameshima, Naoto Mizuno, Yosuke Yano
Download: PDF
Abstract: The construction of cut trees (also known as Gomory-Hu trees) for a given graph enables the minimum-cut size of the original graph to be obtained for any pair of vertices. Cut trees are a powerful back-end for graph management and mining, as they support various procedures related to the minimum cut, maximum flow, and connectivity. However, the crucial drawback with cut trees is the computational cost of their construction. In theory, a cut tree is built by applying a maximum flow algorithm $n$ times, where $n$ is the number of vertices. Therefore, naive implementations of this approach result in cubic time complexity, which is obviously too slow for today's large-scale graphs. To address this issue, in the present study, we propose a new cut-tree construction algorithm tailored to real-world networks. Using a series of experiments, we demonstrate that the proposed algorithm is several orders of magnitude faster than previous algorithms and that it can construct cut trees for billion-scale graphs.
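
As a point of reference, the classical cut-tree construction via repeated maximum flows is available in standard libraries. A minimal usage sketch on a toy graph (assuming networkx is available; this is the slow baseline the paper speeds up, not the paper's algorithm):

    # Baseline Gomory-Hu (cut) tree on a small graph, plus a min-cut query through it.
    import networkx as nx

    G = nx.karate_club_graph()
    for u, v in G.edges:
        G[u][v]["capacity"] = 1          # unit capacities: cuts count edges

    T = nx.gomory_hu_tree(G)             # runs n-1 max-flow computations internally

    def min_cut_value(T, s, t):
        # The s-t min cut equals the smallest edge weight on the s-t path in the cut tree.
        path = nx.shortest_path(T, s, t)
        return min(T[a][b]["weight"] for a, b in zip(path, path[1:]))

    print(min_cut_value(T, 0, 33))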

at September 29, 2016 01:10 AM UTC

Authors: Sören Pirk, Vojtech Krs, Kaimo Hu, Suren Deepak Rajasekaran, Hao Kang, Bedrich Benes, Yusuke Yoshiyasu, Leonidas J. Guibas
Download: PDF
Abstract: Interactions play a key role in understanding objects and scenes, for both virtual and real world agents. We introduce a new general representation for proximal interactions among physical objects that is agnostic to the type of objects or interaction involved. The representation is based on tracking particles on one of the participating objects and then observing them with sensors appropriately placed in the interaction volume or on the interaction surfaces. We show how to factorize these interaction descriptors and project them into a particular participating object so as to obtain a new functional descriptor for that object, its interaction landscape, capturing its observed use in a spatio-temporal framework. Interaction landscapes are independent of the particular interaction and capture subtle dynamic effects in how objects move and behave when in functional use. Our method relates objects based on their function, establishes correspondences between shapes based on functional key points and regions, and retrieves peer and partner objects with respect to an interaction.

at September 29, 2016 12:00 AM UTC

The power symmetric polynomial on $n$ variables of degree $d$ is defined as $p_d(x_1,\ldots, x_n) = x_{1}^{d}+\dots + x_{n}^{d}$. We study polynomials that are expressible as a sum of powers of homogeneous linear projections of power symmetric polynomials. These form a subclass of polynomials computed by depth five circuits with summation and powering gates (i.e., $\sum\bigwedge\sum\bigwedge\sum$ circuits). We show $2^{\Omega(n)}$ size lower bounds for $x_1\cdots x_n$ against the following models:

  • Depth five $\sum\bigwedge\sum^{\le n}\bigwedge^{\ge 21}\sum$ arithmetic circuits where the bottom $\sum$ gate is homogeneous;
  • Depth four $\sum\bigwedge\sum^{\le n}\bigwedge$ arithmetic circuits.

Together with the ideas in [Forbes, FOCS 2015], our lower bounds imply deterministic $n^{\mathrm{poly}(\log n)}$ black-box identity testing algorithms for the above classes of arithmetic circuits. Our technique uses a measure that involves projecting the partial derivative space of the given polynomial to its multilinear subspace and then setting a subset of variables to $0$.

at September 28, 2016 03:25 PM UTC

Authors: Patrick Loiseau and Xiaohu Wu
Download: PDF
Abstract: We study simple yet efficient algorithms for scheduling $n$ independent monotonic moldable tasks on $m$ identical processors; the objective is to (1) minimize the makespan, or (2) maximize the sum of values of tasks completed by a deadline. The workload of a monotonic task is non-decreasing with the number of assigned processors.

In this paper, we propose a scheduling algorithm that achieves a processor utilization of r when the number of processors assigned to a task j is the minimal number needed to complete j by a time d. Here, r equals (1-k/m)/2 in the general setting, where k is the maximum number of processors allocated to a task (in large computing clusters, m>>k and k/m approaches 0). More importantly, in many real applications, when a parallel task is executed on a small set of f processors, the speedup is linearly proportional to f. This property proves powerful for designing algorithms more efficient than existing ones, as illustrated in a typical case where f=5: we propose an algorithm that achieves a utilization of r=3(1-(k+3)/m)/4, and we also discuss the extension of this algorithm to the case of an arbitrary f. Based on the above schedule, we propose an r(1+\epsilon)-approximation algorithm with a complexity of O(n log(n/\epsilon)) for the first scheduling objective. We also propose a generic greedy algorithm for the second scheduling objective and, by analyzing it, give an r-approximation algorithm with a complexity of O(n). For the first scheduling objective, the algorithm proposed in the typical setting is simpler and performs better than most existing algorithms when m>>k; the second objective is considered here for the first time.

at September 28, 2016 01:02 AM UTC

Authors: Yi-Jun Chang, Tsvi Kopelowitz, Seth Pettie, Ruosong Wang, Wei Zhan
Download: PDF
Abstract: In many networks of wireless devices the scarcest resource is energy, and the lion's share of energy is often spent on sending and receiving packets. In this paper we present a comprehensive study of the energy complexity of fundamental problems in wireless networks with four different levels of collision detection: Strong-CD (in which transmitters and listeners detect collisions), Sender-CD (in which transmitters detect collisions, indirectly), Receiver-CD (in which listeners detect collisions), and No-CD (in which no one detects collisions).

We show that the randomized energy complexity of Approximate Counting and Leader Election is $\Omega(\log^* n)$ in Sender-CD and No-CD but $\Omega(\log(\log^* n))$ in Strong-CD and Receiver-CD, and also provide matching upper bounds. This establishes an exponential separation between the Sender-CD and Receiver-CD models, and also confirms that the recent $O(\log(\log^* n))$ Contention Resolution protocol of Bender et al. (STOC 2016) is optimal in Strong-CD.

In the deterministic setting, all $n$ devices have unique IDs in the range $[N]$. We establish another exponential separation between the deterministic Sender-CD and Receiver-CD models in the opposite direction. We show that Leader Election can be solved with $O(\log \log N)$ energy in the deterministic Sender-CD model, and give a matching $\Omega(\log \log N)$ energy lower bound in the Strong-CD model. However, in Receiver-CD and No-CD the energy complexity of these problems jumps to $\Theta(\log N)$.

For the special case where $n = \Theta(N)$, we prove that Leader Election can be solved with only $O(\alpha(N))$ energy in No-CD. To the best of our knowledge, this is the first time the inverse-Ackermann function appears in the field of distributed computing.

at September 28, 2016 12:00 AM UTC

Authors: Gregor Jossé, Ying Lu, Tobias Emrich, Matthias Renz, Cyrus Shahabi, Ugur Demiryurek, Matthias Schubert
Download: PDF
Abstract: This paper extends the Arc Orienteering Problem (AOP) to large road networks with time-dependent travel times and time-dependent value gain, termed Twofold Time-Dependent AOP or 2TD-AOP for short. In its original definition, the NP-hard Orienteering Problem (OP) asks to find a path from a source to a destination maximizing the accumulated value while not exceeding a cost budget. Variations of the OP and AOP have many practical applications such as mobile crowdsourcing tasks (e.g., repairing and maintenance or dispatching field workers), diverse logistics problems (e.g., crowd control or controlling wildfires) as well as several tourist guidance problems (e.g., generating trip recommendations or navigating through theme parks). In the proposed 2TD-AOP, travel times and value functions are assumed to be time-dependent. The dynamic values model, for instance, varying rewards in crowdsourcing tasks or varying urgency levels in damage control tasks. We discuss this novel problem, demonstrate the benefit of time-dependence empirically, and present an efficient approximate solution optimized for fast-response systems. Our approach is the first time-dependent variant of the AOP to be evaluated on a large-scale, fine-grained, real-world road network. We show that computing optimal solutions is infeasible and that solutions to the static problem are often invalid. We propose an approximate dynamic programming solution which produces valid paths and is orders of magnitude faster than any optimal solution.

at September 28, 2016 01:03 AM UTC

Authors: Jacob Evald, Søren Dahlgaard
Download: PDF
Abstract: Finding important nodes in a graph and measuring their importance is a fundamental problem in the analysis of social networks, transportation networks, biological systems, etc. Among popular such metrics are graph centrality, betweenness centrality (BC), and reach centrality (RC). These measures are also closely related to classic notions like diameter and radius. Roditty and Vassilevska Williams [STOC'13] showed that no algorithm can compute a (3/2-\delta)-approximation of the diameter in sparse and unweighted graphs faster than n^{2-o(1)} time unless the widely believed strong exponential time hypothesis (SETH) is false. Abboud et al. [SODA'15] and [SODA'16] further analyzed these problems under the recent line of research on hardness in P. They showed that in sparse and unweighted graphs (weighted for BC) none of these problems can be solved faster than n^{2-o(1)} unless some popular conjecture is false. Furthermore, they ruled out a (2-\delta)-approximation for RC, a (3/2-\delta)-approximation for Radius and a (5/3-\delta)-approximation for computing all eccentricities of a graph for any \delta > 0. We extend these results to the case of unweighted graphs with constant maximum degree. Through new graph constructions we are able to obtain the same approximation and time bounds as for sparse graphs even in unweighted bounded-degree graphs. We show that no (3/2-\delta)-approximation of Radius or Diameter, (2-\delta)-approximation of RC, (5/3-\delta)-approximation of all eccentricities or exact algorithm for BC exists in time n^{2-o(1)} for such graphs and any \delta > 0. This strengthens the result for BC of Abboud et al. [SODA'16] by showing a hardness result for unweighted graphs, and follows in the footsteps of Abboud et al. [SODA'16] and Abboud and Dahlgaard [FOCS'16] in showing conditional lower bounds for restricted but realistic graph classes.

at September 28, 2016 12:00 AM UTC

Authors: Jesse Read, Luca Martino, Jaakko Hollmén
Download: PDF
Abstract: The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If label indices are considered as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation investigating performance on real-world sequential-prediction tasks: electricity demand, and route prediction. As well as showing that several popular multi-label algorithms are in fact easily applicable to sequencing tasks, our novel approaches, which benefit from a unified view of these areas, prove very competitive against established methods.

at September 28, 2016 01:02 AM UTC

Authors: François Le Gall, David J. Rosenbaum
Download: PDF
Abstract: In this paper, we prove results on the relationship between the complexity of the group and color isomorphism problems. The difficulty of color isomorphism problems is known to be closely linked to the composition factors of the permutation group involved. Previous works are primarily concerned with applying color isomorphism to bounded degree graph isomorphism, and have therefore focused on the alternating composition factors, since those are the bottleneck in the case of graph isomorphism.

We consider the color isomorphism problem with composition factors restricted to those other than the alternating group, show that group isomorphism reduces in n^(O(log log n)) time to this problem, and, conversely, that a special case of this color isomorphism problem reduces to a slight generalization of group isomorphism. We then sharpen our results by identifying the projective special linear group as the main obstacle to faster algorithms for group isomorphism and prove that the aforementioned reduction from group isomorphism to color isomorphism in fact produces only cyclic and projective special linear factors. Our results demonstrate that, just as the alternating group was a barrier to faster algorithms for graph isomorphism for three decades, the projective special linear group is an obstacle to faster algorithms for group isomorphism.

at September 28, 2016 12:00 AM UTC

My Ph.D. is in pure mathematics, and I admit I don't know much (i.e. anything) about theoretical CS. However, I have started exploring non-academic options for my career and in introducing myself to machine learning, stumbled across statements such as "No one understands why neural networks work well," which I found interesting.

My question, essentially, is what kinds of answers do researchers want? Here's what I've found in my brief search on the topic:

  • The algorithms implementing simple neural networks are pretty straightforward.
  • The process of SGD is well-understood mathematically, as is the statistical theory.
  • The universal approximation theorem is powerful and proven.
  • There's a nice recent paper https://arxiv.org/abs/1608.08225 which essentially gives the answer that universal approximation is much more than we actually need in practice because we can make strong simplifying assumptions about the functions we are trying to model with the neural network.

In the aforementioned paper, they state (paraphrasing) "GOFAI algorithms are fully understood analytically, but many ANN algorithms are only heuristically understood." Convergence theorems for the implemented algorithms are an example of analytic understanding that it seems we DO have about neural networks, so a statement at this level of generality doesn't tell me much about what's known vs. unknown or what would be considered "an answer."

The authors do suggest in the conclusion that questions such as effective bounds on the size of the neural network needed to approximate a given polynomial are open and interesting. What are other examples of mathematically specific analytical questions that would need to be answered to say that we "understand" neural networks? Are there questions that may be answered in more pure mathematical language?

(I am specifically thinking of methods in representation theory due to the use of physics in this paper --- and, selfishly, because it is my field of study. However, I can also imagine areas such as combinatorics/graph theory, algebraic geometry, and topology providing viable tools.)

by Neuling at September 27, 2016 05:48 PM UTC

The playwright  Edward Albee passed away earlier this month at the age of 88. I had never actually seen any of his plays so I took the opportunity to watch the 1966 movie Who's Afraid of Virginia Woolf? based on the play.

The play has four characters: George, a history professor in his forties, and his wife Martha, the daughter of the university's president. Joining them for a late get-together are a young biology professor and his wife. The movie had an incredible cast: Richard Burton, Elizabeth Taylor, George Segal and Sandy Dennis, all nominated for Academy Awards. The women won.

The movie covers many themes, mostly revolving around the disintegrating relationship between George and Martha. George did not live up to his potential, as a drunk Martha did not hesitate to point out in front of them all.
I actually fell for him. And the match seemed practical too. For a while Daddy thought George had the stuff to take over when he was ready to retire. Anyway, I married the S.O.B. I had it all planned out. First he'd take over the History Department, then the whole college. That's how it was supposed to be! That was how it was supposed to be. All very simple. Daddy thought it was a good idea too. For a while!
Until he started watching for a couple of years and started thinking it wasn't such a good idea. That maybe Georgie-boy didn't have the stuff! That he didn't have it in him! George didn't have much push. He wasn't aggressive. In fact, he was sort of a flop! A great big, fat flop! 
So here I am, stuck with this flop, this bog in the History Department. Who's married to the president's daughter who's expected to be somebody. Not just a nobody! A bookworm who's so goddamn complacent he can't make anything out of himself.
The movie reflects an earlier time in academics, when all the professors were white males and success was measured by taking charge of a department.

Nevertheless, Albee captures a fear many academics have: that one day we may wake up and realize we've become that bog. Stuck in our jobs because we can't afford to give up tenure, but just going through the motions. It's a fear that motivates us to continually try new things and achieve new heights. But it's also a reminder that an academic career is a long one, and it could bog down before we even notice.

Edward Albee writes a play incredibly uncomfortable to watch yet impossible not to. So rare to see that in mainstream plays or movies today.

by Lance Fortnow (noreply@blogger.com) at September 27, 2016 11:36 AM UTC

Contemplating the recently announced 1-local expanders of Viola and Wigderson (ECCC, TR16-129, 2016), one may observe that weaker constructs are well known. For example, one may easily obtain a 4-regular $N$-vertex graph with spectral gap that is $\Omega(1/\log^2 N)$, and similarly an $O(1)$-regular $N$-vertex graph with spectral gap $1/\tilde{O}(\log N)$. Starting from a generic candidate for a 1-local expander, we formulate a natural problem regarding coordinated random walks (CRW) on the corresponding relocation graph (which has size that is logarithmic in the size of the candidate 1-local graph), and observe that any solution to the CRW problem yields 1-local expanders, and any constant-size expanding set of generators for the symmetric group yields a solution to the CRW problem. This yields an alternative construction and a different analysis from the one used by Viola and Wigderson. Furthermore, we show that solving the CRW problem is equivalent to constructing 1-local expanders.

at September 27, 2016 09:06 AM UTC

This post is a continuation of the previous one, where we explored how to detect geometry in a random geometric graph. We now consider the other side of the coin: when is it impossible to detect geometry and what are techniques for proving this? We begin by discussing the G(n,p,d) model introduced in the previous post and then turn to a more general setup, proving a robustness result on the threshold dimension for detection. The proof of the latter result also gives us the opportunity to learn about the fascinating world of entropic central limit theorems.

Barrier to detecting geometry: when Wishart becomes GOE

Recall from the previous post that G(n,p,d) is a random geometric graph where the underlying metric space is the d-dimensional unit sphere \mathbb{S}^{d-1} = \left\{ x \in \mathbb{R}^d : \left\| x \right\|_2 = 1 \right\}, and where the latent labels of the nodes are i.i.d. uniform random vectors in \mathbb{S}^{d-1}. Our goal now is to show the impossibility result of Bubeck, Ding, Eldan, and Racz: if d \gg n^{3}, then it is impossible to distinguish between G(n,p,d) and the Erdos-Renyi random graph G(n,p). More precisely, we have that

(1)   \begin{equation*} \mathrm{TV} \left( G(n,p), G(n,p,d) \right) \to 0 \end{equation*}

when d \gg n^{3} and n \to \infty, where \mathrm{TV} denotes total variation distance.

There are essentially three main ways to bound the total variation of two distributions from above: (i) if the distributions have nice formulas associated with them, then exact computation is possible; (ii) through coupling the distributions; or (iii) by using inequalities between probability metrics to switch the problem to bounding a different notion of distance between the distributions. Here, while the distribution of G(n,p,d) does not have a nice formula associated with it, the main idea is to view this random geometric graph as a function of an n \times n Wishart matrix with d degrees of freedom—i.e., a matrix of inner products of n d-dimensional Gaussian vectors—denoted by W(n,d). It turns out that one can view G(n,p) as (essentially) the same function of an n \times n GOE random matrix—i.e., a symmetric matrix with i.i.d. Gaussian entries on and above the diagonal—denoted by M(n). The upside of this is that both of these random matrix ensembles have explicit densities that allow for explicit computation. We explain this connection here in the special case of p = 1/2 for simplicity; see Bubeck et al. for the case of general p.

Recall that if Y_{1} is a standard normal random variable in \mathbb{R}^d, then Y_1 / \left\| Y_1 \right\| is uniformly distributed on the sphere \mathbb{S}^{d-1}. Consequently we can view G\left( n, 1/2, d \right) as a function of an appropriate Wishart matrix, as follows. Let Y be an n \times d matrix where the entries are i.i.d. standard normal random variables, and let W \equiv W (n,d) = YY^T be the corresponding n \times n Wishart matrix. Note that W_{ii} = \left\langle Y_i, Y_i \right\rangle = \left\| Y_i \right\|^2 and so \left\langle Y_i / \left\| Y_i \right\|, Y_j / \left\| Y_j \right\| \right\rangle = W_{ij} / \sqrt{W_{ii} W_{jj}}. Thus the n \times n matrix A defined as

    \[ A_{i,j} = \begin{cases} 1 & \text{if } W_{ij} \geq 0 \text{ and } i \neq j\\ 0 & \text{otherwise} \end{cases} \]

has the same law as the adjacency matrix of G\left(n,1/2,d\right). Denote the map that takes W to A by H, i.e., A = H \left( W \right).

In a similar way we can view G \left( n, 1/2 \right) as a function of an n \times n matrix drawn from the Gaussian Orthogonal Ensemble (GOE). Let M\left( n \right) be a symmetric n \times n random matrix where the diagonal entries are i.i.d. normal random variables with mean zero and variance 2, and the entries above the diagonal are i.i.d. standard normal random variables, with the entries on and above the diagonal all independent. Then B = H \left( M(n) \right) has the same law as the adjacency matrix of G(n,1/2). Note that B only depends on the sign of the off-diagonal elements of M\left(n \right), so in the definition of B we can replace M\left( n \right) with M \left( n, d \right) := \sqrt{d} M \left( n \right) + d I_n, where I_n is the n \times n identity matrix.
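
As a quick illustration of the two constructions just described, here is a minimal numpy sketch (the values of n and d below are only illustrative): the sign-thresholding map H applied to a Wishart matrix produces G(n, 1/2, d), and applied to the shifted GOE matrix it produces G(n, 1/2).

    import numpy as np

    def H(W):
        # A_ij = 1 if W_ij >= 0 and i != j, else 0
        A = (W >= 0).astype(int)
        np.fill_diagonal(A, 0)
        return A

    rng = np.random.default_rng(1)
    n, d = 50, 200

    Y = rng.standard_normal((n, d))              # rows are i.i.d. d-dimensional Gaussians
    A_geom = H(Y @ Y.T)                          # same law as the adjacency matrix of G(n, 1/2, d)

    M = rng.standard_normal((n, n))
    M = (M + M.T) / np.sqrt(2)                   # GOE: N(0,1) off the diagonal, N(0,2) on it
    A_er = H(np.sqrt(d) * M + d * np.eye(n))     # same law as the adjacency matrix of G(n, 1/2)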

We can thus conclude that

    \begin{align*} \mathrm{TV} \left( G(n,1/2,d), G(n,1/2) \right) &= \mathrm{TV} \left( H \left( W(n,d) \right), H \left( M(n,d) \right) \right) \\ &\leq \mathrm{TV} \left( W(n,d), M(n,d) \right). \end{align*}

The densities of these two random matrix ensembles are explicit and well known (although we do not state them here), which allow for explicit calculations. The outcome of these calculations is the following result, proven independently and simultaneously by Bubeck et al. and Jiang and Li.

Theorem [Bubeck, Ding, Eldan, and Racz; Jiang and Li]

Define the random matrix ensembles W\left( n, d \right) and M \left( n, d \right) as above. If d / n^3 \to \infty, then

    \begin{equation*}\label{eq:Wishart_GOE} \mathrm{TV} \left( W \left( n, d \right), M \left( n, d \right) \right) \to 0. \end{equation*}

We conclude that it is impossible to detect underlying geometry whenever d \gg n^{3}.

The universality of the threshold dimension

How robust is the result presented above? We have seen that the detection threshold is intimately connected to the threshold of when a Wishart matrix becomes GOE. Understanding the robustness of this result on random matrices is interesting in its own right, and this is what we will pursue in the remainder of this post, which is based on a recent paper by Bubeck and Ganguly.

Let \mathbb{X} be an n \times d random matrix with i.i.d. entries from a distribution \mu that has mean zero and variance 1. The n \times n matrix \mathbb{X} \mathbb{X}^{T} is known as the Wishart matrix with d degrees of freedom. As we have seen above, this arises naturally in geometry, where \mathbb{X} \mathbb{X}^{T} is known as the Gram matrix of inner products of n points in \mathbb{R}^{d}. The Wishart matrix also appears naturally in statistics as the sample covariance matrix, where d is the number of samples and n is the number of parameters. (Note that in statistics the number of samples is usually denoted by n, and the number of parameters is usually denoted by p; here our notation is taken with the geometric perspective in mind.)

We consider the Wishart matrix with the diagonal removed, and scaled appropriately:

    \[ \mathcal{W}_{n,d} = \frac{1}{\sqrt{d}} \left( \mathbb{X} \mathbb{X}^{T} - \mathrm{diag} \left( \mathbb{X} \mathbb{X}^{T} \right) \right). \]

In many applications—such as the random graph application above—the diagonal of the matrix is not relevant, so removing it does not lose information. Our goal is to understand how large the dimension d has to be so that \mathcal{W}_{n,d} is approximately like \mathcal{G}_{n}, which is defined as the n \times n Wigner matrix with zeros on the diagonal and i.i.d. standard Gaussians above the diagonal. In other words, \mathcal{G}_{n} is drawn from the Gaussian Orthogonal Ensemble (GOE) with the diagonal replaced by zeros.

A simple application of the multivariate central limit theorem gives that if n is fixed and d \to \infty, then \mathcal{W}_{n,d} converges to \mathcal{G}_{n} in distribution. The main result of Bubeck and Ganguly establishes that this holds as long as d \, \widetilde{\gg} \, n^{3} under rather general conditions on the distribution \mu.

Theorem [Bubeck and Ganguly]

If the distribution \mu is log-concave, i.e., if it has density f(\cdot) = e^{-\varphi(\cdot)} for some convex function \varphi, and if \frac{d}{n^{3} \log^{2} \left( d \right)} \to \infty, then

(2)   \begin{equation*} \mathrm{TV} \left( \mathcal{W}_{n,d}, \mathcal{G}_{n} \right) \to 0. \end{equation*}

On the other hand, if \mu has a finite fourth moment and \frac{d}{n^{3}} \to 0, then

(3)   \begin{equation*} \mathrm{TV} \left( \mathcal{W}_{n,d}, \mathcal{G}_{n} \right) \to 1. \end{equation*}

This result extends Theorem 1 from the previous post and Theorem 1 from above, and establishes n^{3} as the universal critical dimension (up to logarithmic factors) for sufficiently smooth measures \mu: \mathcal{W}_{n,d} is approximately Gaussian if and only if d is much larger than n^{3}. For random graphs, as seen above, this is the dimension barrier to extracting geometric information from a network: if the dimension is much greater than the cube of the number of vertices, then all geometry is lost. In the setting of statistics this means that the Gaussian approximation of a Wishart matrix is valid as long as the sample size is much greater than the cube of the number of parameters. Note that for some statistics of a Wishart matrix the Gaussian approximation is valid for much smaller sample sizes (e.g., the largest eigenvalue behaves as it does in the Gaussian limit even when the number of parameters is of the same order as the sample size (Johnstone, 2001)).

To distinguish the random matrix ensembles, we have seen in the previous post that signed triangles work up until the threshold dimension in the case when \mu is standard normal. It turns out that the same statistic works in this more general setting; when the entries of the matrices are centered, this statistic can be written as A \mapsto \mathrm{Tr} \left( A^{3} \right). We leave the details as an exercise for the reader.
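
As an illustration, the following minimal sketch builds the scaled off-diagonal Wishart ensemble \mathcal{W}_{n,d} and the zero-diagonal Wigner ensemble \mathcal{G}_{n} and compares the statistic \mathrm{Tr}(A^3) on them. The uniform law below is just one convenient centered, variance-1, log-concave choice of \mu, and the values of n and d are illustrative only.

    # Signed-triangle statistic Tr(A^3): when d is well below n^3, the off-diagonal
    # Wishart values show a clear positive bias relative to the Wigner values.
    import numpy as np

    rng = np.random.default_rng(2)

    def wishart_offdiag(n, d):
        # entries of X i.i.d. uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1, log-concave
        X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, d))
        W = X @ X.T
        np.fill_diagonal(W, 0)
        return W / np.sqrt(d)

    def wigner_zero_diag(n):
        G = rng.standard_normal((n, n))
        G = (G + G.T) / np.sqrt(2)
        np.fill_diagonal(G, 0)
        return G

    n = 30
    for d in (n, n**2, n**3):
        t_wishart = np.trace(np.linalg.matrix_power(wishart_offdiag(n, d), 3))
        t_wigner = np.trace(np.linalg.matrix_power(wigner_zero_diag(n), 3))
        print(d, round(t_wishart, 1), round(t_wigner, 1))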

We note that for (2) to hold it is necessary to have some smoothness assumption on the distribution \mu. For instance, if \mu is purely atomic, then so is the distribution of \mathcal{W}_{n,d}, and thus its total variation distance to \mathcal{G}_{n} is 1. The log-concave assumption gives this necessary smoothness, and it is an interesting open problem to understand how far this can be relaxed.

We will see the proof (and in particular the connection to entropic CLT!) in the next post.

by Sebastien Bubeck at September 27, 2016 07:54 AM UTC

Authors: Therese Biedl, Martin Derka
Download: PDF
Abstract: This paper considers 1-string representations of planar graphs that are order-preserving in the sense that the order of crossings along the curve representing vertex $v$ is the same as the order of edges in the clockwise order around $v$ in the planar embedding. We show that such representations do not exist for all planar graphs (not even for all planar 3-trees), but we show existence for some subclasses of planar partial 3-trees. In particular, for outer-planar graphs the representation can be order-preserving and outer-string, in the sense that all ends of strings are on the outside of the representation.

at September 27, 2016 12:00 AM UTC

Authors: Marin Bougeret, Ignasi Sau
Download: PDF
Abstract: In the last years, kernelization with structural parameters has been an active area of research within the field of parameterized complexity. As a relevant example, Gajarský et al. [ESA 2013] proved that every graph problem satisfying a property called finite integer index admits a linear kernel on graphs of bounded expansion and an almost linear kernel on nowhere dense graphs, parameterized by the size of a $c$-treedepth modulator, which is a vertex set whose removal results in a graph of treedepth at most $c$, where $c \geq 1$ is a fixed integer. The authors left as further research to investigate this parameter on general graphs, and in particular to find problems that, while admitting polynomial kernels on sparse graphs, behave differently on general graphs.

In this article we answer this question by finding two very natural such problems: we prove that Vertex Cover admits a polynomial kernel on general graphs for any integer $c \geq 1$, and that Dominating Set does not for any integer $c \geq 2$ even on degenerate graphs, unless $\text{NP} \subseteq \text{coNP}/\text{poly}$. For the positive result, we build on the techniques of Jansen and Bodlaender [STACS 2011], and for the negative result we use a polynomial parameter transformation for $c\geq 3$ and an OR-cross-composition for $c = 2$. As existing results imply that Dominating Set admits a polynomial kernel on degenerate graphs for $c = 1$, our result provides a dichotomy about the existence of polynomial kernels for Dominating Set on degenerate graphs with this parameter.

at September 27, 2016 12:00 AM UTC

Authors: Olivier Bournez, Daniel S. Graça, Amaury Pouly
Download: PDF
Abstract: The outcomes of this paper are twofold. Implicit complexity.

We provide an implicit characterization of polynomial time computation in terms of ordinary differential equations: we characterize the class PTIME of languages computable in polynomial time in terms of differential equations with polynomial right-hand side. This result gives a purely continuous, elegant, and simple characterization of PTIME. We believe this is the first time complexity classes have been characterized using only ordinary differential equations. Our characterization extends to functions computable in polynomial time over the reals in the sense of computable analysis. Our results may provide a new perspective on classical complexity, by giving a way to define complexity classes, like PTIME, in a very simple way, without any reference to a notion of (discrete) machine. This may also provide ways to state classical questions about computational complexity via ordinary differential equations.

Continuous-Time Models of Computation.

Our results can also be interpreted in terms of analog computers or analog models of computation: As a side effect, we get that the 1941 General Purpose Analog Computer (GPAC) of Claude Shannon is provably equivalent to Turing machines both in terms of computability and complexity, a fact that has never been established before. This result provides arguments in favour of a generalised form of the Church-Turing Hypothesis, which states that any physically realistic (macroscopic) computer is equivalent to Turing machines both in terms of computability and complexity.

at September 27, 2016 01:01 AM UTC

Authors: Takuya Akiba, Kenko Nakamura, Taro Takaguchi
Download: PDF
Abstract: Analysis and modeling of networked objects are fundamental pieces of modern data mining. Most real-world networks, from biological to social ones, are known to have common structural properties. These properties allow us to model the growth processes of networks and to develop useful algorithms. One remarkable example is the fractality of networks, which suggests the self-similar organization of global network structure. To determine the fractality of a network, we need to solve the so-called box-covering problem, for which preceding algorithms are not feasible on large-scale networks. The lack of an efficient algorithm prevents us from investigating the fractal nature of large-scale networks. To overcome this issue, we propose a new box-covering algorithm based on recently emerging sketching techniques. We theoretically show that it works in near-linear time with a guarantee of solution accuracy. In experiments, we have confirmed that the algorithm enables us to study the fractality of million-scale networks for the first time. We have observed that its outputs are sufficiently accurate and that its time and space requirements are orders of magnitude smaller than those of previous algorithms.

at September 27, 2016 01:04 AM UTC

Authors: Maryam Fanaeepour, Benjamin I. P. Rubinstein
Download: PDF
Abstract: Mining of spatial data is an enabling technology for mobile services, Internet-connected cars, and the Internet of Things. But the very distinctiveness of spatial data that drives utility, comes at the cost of user privacy. In this work, we continue the tradition of privacy-preserving spatial analytics, focusing not on point or path data, but on planar spatial regions. Such data represents the area of a user's most frequent visitation - such as "around home and nearby shops". Specifically we consider the differentially-private release of data structures that support range queries for counting users' spatial regions. Counting planar regions leads to unique challenges not faced in existing work. A user's spatial region that straddles multiple data structure cells can lead to duplicate counting at query time. We provably avoid this pitfall by leveraging the Euler characteristic. To address the increased sensitivity of range queries to spatial region data, we calibrate privacy-preserving noise using bounded user region size and a constrained inference that uses robust least absolute deviations. Our novel constrained inference reduces noise and introduces covertness by (privately) imposing consistency. We provide a full end-to-end theoretical analysis of both differential privacy and high-probability utility for our approach using concentration bounds. A comprehensive experimental study on several real-world datasets establishes practical validity.

at September 27, 2016 01:02 AM UTC

Authors: Thomas Seiller
Download: PDF
Abstract: This paper exhibits a series of semantic characterisations of sublinear nondeterministic complexity classes. These results fall into the general domain of logic-based approaches to complexity theory and so-called implicit computational complexity (ICC), i.e. descriptions of complexity classes without reference to specific machine models. In particular, it relates strongly to ICC results based on linear logic since the semantic framework considered stems from work on the latter. Moreover, the obtained characterisations are of a geometric nature: each class is characterised by a specific action of a group by measure-preserving maps.

at September 27, 2016 01:01 AM UTC

Authors: Titouan Carette, Mathieu Laurière, Frédéric Magniez
Download: PDF
Abstract: We present new quantum algorithms for Triangle Finding improving the best previously known quantum query complexities for both dense and sparse instances. For dense graphs on $n$ vertices, we get a query complexity of $O(n^{5/4})$ without any of the extra logarithmic factors present in the previous algorithm of Le Gall [FOCS'14]. For sparse graphs with $m\geq n^{5/4}$ edges, we get a query complexity of $O(n^{11/12}m^{1/6}\sqrt{\log n})$, which is better than the one obtained by Le Gall and Nakajima [ISAAC'15] when $m \geq n^{3/2}$. We also obtain an algorithm with query complexity ${O}(n^{5/6}(m\log n)^{1/6}+d_2\sqrt{n})$ where $d_2$ is the variance of the degree distribution.

Our algorithms are designed and analyzed in a new model of learning graphs that we call extended learning graphs. In addition, we present a framework in order to easily combine and analyze them. As a consequence we get much simpler algorithms and analyses than previous algorithms of Le Gall et al. based on the MNRS quantum walk framework [SICOMP'11].

at September 27, 2016 01:01 AM UTC

Authors: Archontia C. Giannopoulou, Michał Pilipczuk, Dimitrios M. Thilikos, Jean-Florent Raymond, Marcin Wrochna
Download: PDF
Abstract: Suppose $\mathcal{F}$ is a finite family of graphs. We consider the following meta-problem, called $\mathcal{F}$-Immersion Deletion: given a graph $G$ and integer $k$, decide whether the deletion of at most $k$ edges of $G$ can result in a graph that does not contain any graph from $\mathcal{F}$ as an immersion. This problem is a close relative of the $\mathcal{F}$-Minor Deletion problem studied by Fomin et al. [FOCS 2012], where one deletes vertices in order to remove all minor models of graphs from $\mathcal{F}$.

We prove that whenever all graphs from $\mathcal{F}$ are connected and at least one graph of $\mathcal{F}$ is planar and subcubic, then the $\mathcal{F}$-Immersion Deletion problem admits: a constant-factor approximation algorithm running in time $O(m^3 \cdot n^3 \cdot \log m)$; a linear kernel that can be computed in time $O(m^4 \cdot n^3 \cdot \log m)$; and a $O(2^{O(k)} + m^4 \cdot n^3 \cdot \log m)$-time fixed-parameter algorithm, where $n,m$ count the vertices and edges of the input graph.

These results mirror the findings of Fomin et al. [FOCS 2012], who obtained a similar set of algorithmic results for $\mathcal{F}$-Minor Deletion, under the assumption that at least one graph from $\mathcal{F}$ is planar. An important difference is that we are able to obtain a linear kernel for $\mathcal{F}$-Immersion Deletion, while the exponent of the kernel of Fomin et al. for $\mathcal{F}$-Minor Deletion depends heavily on the family $\mathcal{F}$. In fact, this dependence is unavoidable under plausible complexity assumptions, as proven by Giannopoulou et al. [ICALP 2015]. This reveals that the kernelization complexity of $\mathcal{F}$-Immersion Deletion is quite different than that of $\mathcal{F}$-Minor Deletion.

at September 27, 2016 01:05 AM UTC

Authors: Shimin Li, Haitao Wang
Download: PDF
Abstract: Given $n$ intervals on a line $\ell$, we consider the problem of moving these intervals on $\ell$ such that no two intervals overlap and the maximum moving distance of the intervals is minimized. The difficulty in solving the problem lies in determining the order of the intervals in an optimal solution. Through interesting observations, we show that it is sufficient to consider at most $n$ "candidate" lists of ordered intervals. Further, although explicitly maintaining these lists takes $\Omega(n^2)$ time and space, by more observations and a pruning technique, we present an algorithm that computes an optimal solution in $O(n\log n)$ time and $O(n)$ space. We also prove an $\Omega(n\log n)$ time lower bound for solving the problem, which implies the optimality of our algorithm.

at September 27, 2016 12:00 AM UTC

Authors: Alon Shtern, Matan Sela, Ron Kimmel
Download: PDF
Abstract: Automatic estimation of skinning transformations is a popular way to deform a single reference shape into a new pose by providing a small number of control parameters. We generalize this approach by efficiently enabling the use of multiple exemplar shapes. Using a small set of representative natural poses, we propose to express an unseen appearance by a low-dimensional linear subspace, specified by a redundant dictionary of weighted vertex positions. Minimizing a nonlinear functional that regulates the example manifold, the suggested approach supports local-rigid deformations of articulated objects, as well as nearly isometric embeddings of smooth shapes. A real-time non-rigid deformation system is demonstrated, and a shape completion and partial registration framework is introduced. These applications can recover a target pose and implicit inverse kinematics from a small number of examples and just a few vertex positions. The resulting reconstruction is more accurate than those of state-of-the-art reduced deformable models.

at September 27, 2016 12:00 AM UTC

Authors: João Pedro Pedroso, João Nuno Tavares, Jorge Leite
Download: PDF
Abstract: In this paper we describe a method for packing tubes and boxes in containers. Each container is divided into parts (holders) which are allocated to subsets of objects. The method consists of a recursive procedure which, based on a predefined order for dealing with tubes and boxes, determines the dimensions and position of each holder. Characteristics of the objects to pack and rules limiting their placement make this problem unique. The method devised provides timely and practical solutions.

at September 27, 2016 01:02 AM UTC

Authors: P. A. M. Oliveira, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake
Download: PDF
Abstract: Linear transformations are highly relevant for data decorrelation applications such as image and video compression. In that context, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data, and fast algorithms can be developed for real-time applications. However, the fast DTT algorithms presented in the literature have high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards show good performance of the proposed transform. Regarding hardware resource consumption, an FPGA implementation shows a 43.1% reduction in configurable logic blocks, and an ASIC place-and-route realization shows a 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.

at September 27, 2016 01:04 AM UTC

Authors: Enrico Borriello, Sara Imari Walker
Download: PDF
Abstract: Using elementary cellular automata as an example, a novel, information-based classification of complex systems is proposed that circumvents the problems associated with isolating the complexity generated as a product of an initial state from that which is intrinsic to a dynamical rule. Transfer entropy variations processed by the system for different initial states split the 256 elementary rules into three information classes. These classes form a hierarchy such that coarse-graining transitions permitted among automata rules predominately occur within each information-based class, or much more rarely down the hierarchy.

at September 27, 2016 01:01 AM UTC

Authors: Tamal K. Dey, Dayu Shi, Yusu Wang
Download: PDF
Abstract: In topological data analysis, point cloud data $P$ extracted from a metric space is often analyzed by computing the persistence diagram or barcodes of a sequence of Rips complexes built on $P$ indexed by a scale parameter. Unfortunately, even for input of moderate size, the size of the Rips complex may become prohibitively large as the scale parameter increases. Starting with the Sparse Rips filtration introduced by Sheehy, some existing methods aim to reduce the size of the complex so as to improve the time efficiency as well. However, as we demonstrate, existing approaches still fall short of scaling well, especially for high dimensional data. In this paper, we investigate the advantages and limitations of existing approaches. Based on insights gained from the experiments, we propose an efficient new algorithm, called SimBa, for approximating the persistent homology of Rips filtrations with quality guarantees. Our new algorithm leverages a batch collapse strategy as well as a new sparse Rips-like filtration. We experiment on a variety of low and high dimensional data sets. We show that our strategy yields a significant size reduction, and our algorithm for approximating Rips filtration persistence is orders of magnitude faster than existing methods in practice.
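
For readers new to the pipeline being accelerated, the baseline computation (Rips-filtration persistence from a point cloud) can be run with standard tools. A minimal sketch on a synthetic noisy circle, assuming the ripser package is available (this is the standard baseline, not SimBa):

    # Baseline Rips persistence on a small synthetic point cloud.
    import numpy as np
    from ripser import ripser          # assumed installed: pip install ripser

    rng = np.random.default_rng(3)
    theta = rng.uniform(0, 2 * np.pi, 200)
    X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.standard_normal((200, 2))

    dgms = ripser(X, maxdim=1)["dgms"]   # persistence diagrams in dimensions 0 and 1
    h1 = dgms[1]
    # A noisy circle should exhibit one long-lived 1-dimensional feature (the loop).
    print(h1[np.argmax(h1[:, 1] - h1[:, 0])])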

at September 27, 2016 12:00 AM UTC

Sorelle Friedler, Carlos Scheidegger and I just posted a new paper on the arxiv where we try to lay out a framework for talking about fairness in a more precise way. 

The blog post I link to says more about the paper itself. But I wanted to comment here on the tortuous process that led to this paper. In some form or another, we've been thinking about this problem for two years: how do we "mathematize" discussions of fairness and bias? 

It's been a long and difficult slog, more than any other topic in "CS meets X" that I've ever bumped into. The problem here is that the words are extremely slippery, and refract completely different meanings in different contexts. So coming up with notions that aren't overly complicated, and yet appear to capture the different ways in which people think about fairness was excruciatingly difficult. 

We had a basic framework in mind a while ago, but then we started "testing" it in the wild, on papers, and on people. The more conversations we had, the more we refined and adapted our ideas, to the point where the paper we have today owes deep debts to many people that we spoke with and who provided nontrivial conceptual challenges that we had to address. 

I still have no idea whether what we've written is any good. Time will tell. But it feels good to finally put out something, however half-baked. Because now hopefully the community can engage with it. 

by Suresh Venkatasubramanian (noreply@blogger.com) at September 26, 2016 09:49 PM UTC

We prove an essentially sharp $\tilde\Omega(n/k)$ lower bound on the $k$-round distributional complexity of the $k$-step pointer chasing problem under the uniform distribution, when Bob speaks first. This is an improvement over Nisan and Wigderson's $\tilde \Omega(n/k^2)$ lower bound. A key part of the proof is using triangular discrimination instead of total variation distance; this idea may be useful elsewhere.

at September 26, 2016 01:30 PM UTC


Local rules can achieve global behavior

[Photo of Sarah Cannon]

Sarah Cannon is a current PhD student in our Algorithms, Combinatorics, and Optimization program working with Dana Randall. Sarah has a newly-updated paper with Dana and Joshua Daymude and Andrea Richa entitled, “A Markov Chain Algorithm for Compression in Self-Organizing Particle Systems.” An earlier version was presented at PODC 2016.

Today Ken and I would like to discuss the paper, and relate it to some recent results on soft robots.

For starters let’s call the paper (CDRR), after the authors’ last names. Being lazy, let me also start by quoting part of their abstract:

We consider programmable matter as a collection of simple computational elements (or particles) with limited (constant-size) memory that self-organize to solve system-wide problems of movement, configuration, and coordination. Here, we focus on the compression problem, in which the particle system gathers as tightly together as possible, as in a sphere or its equivalent in the presence of some underlying geometry. More specifically, we seek fully distributed, local, and asynchronous algorithms that lead the system to converge to a configuration with small perimeter. We present a Markov chain based algorithm that solves the compression problem under the geometric amoebot model, for particle systems that begin in a connected configuration with no holes.

What does this mean? They imagine simple devices that lie on a 2-dimensional lattice. Each device operates under the same rules: it can decide what to do next based only on its local environment; however, the devices have access to randomness. The goal is that, over time, the devices should tend, with high probability, to form a tightly grouped system. This is what they call the compression problem. The surprise is that even with only local interactions, {n} such devices can form a configuration that is close to as tight as possible. Roughly, they will collapse into a region of perimeter at most {O(\sqrt{n})}.
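As a quick sanity check on that target (our own back-of-the-envelope, not a computation from the paper): on the triangular lattice a filled hexagon of side {s} contains {3s^2 + 3s + 1} particles and has boundary perimeter {6s}, so for {n} particles the best achievable perimeter behaves roughly like

{p_{\min}(n) \approx 6\sqrt{n/3} = 2\sqrt{3n} = \Theta(\sqrt{n})}.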

Soft Robots

As I started to write this post, I discovered that there are some neat results on soft robots. As usual, theorists like CDRR think about {n} objects. They study the behavior of many simple devices and prove theorems that often hold only for large enough {n}. Their main one, as stated above, is that there are devices that can solve the compression problem and get within a region of perimeter {O(\sqrt{n})}.

On the other hand, practical researchers often start by studying the case of {n=1}. I think both ends of the spectrum are important, and they complement each other. Since I am not a device physicist, I will just point out the highlights of the recent work on soft robots.


[Photo: light-powered micro-robot]


The above photo is of a “light-powered micro-robot capable of mimicking the slow, steady crawl of an inchworm or small caterpillar.” See this for more details.

This is research done by a team in Poland led by Piotr Wasylczyk, who writes:

Designing soft robots calls for a completely new paradigm in their mechanics, power supply and control. We are only beginning to learn from nature and shift our design approaches towards these that emerged in natural evolution.

Their soft robot uses no motors or pneumatic actuators to make it move. Instead it uses a clever liquid crystal elastomer technology: when exposed to light, the device moves like a caterpillar. Further, the light can be adjusted to make the device move in different ways.

I have no idea whether this work can be extended to {n>1}, nor whether it could be used to implement even small numbers of the devices that CDRR require. I thought you might enjoy hearing about such creepy devices. Let’s turn to the mathematical work on the compression problem.

The Model

CDRR assume that they have a fixed structure, which is an infinite undirected graph. Their devices or particles sit on the vertices of this structure. This, of course, forces the particles to be a discrete system. As you might guess, the usual structures are fixed lattices. These lattices have a periodic structure that makes the resulting systems at least possible to understand. Even with this regular structure, the global behavior of their particles can be quite subtle.

Models of this type have various names; the one they use is called the amoebot model, as proposed in this 2014 paper. I like to think of the particles as soft robots creeping along the lattice.

[Cartoon of a worm]

Okay the above is a cartoon. Here are some more-illustrative figures from the 2014 paper:

[Figures from the 2014 amoebot paper]

Their “particles” reside in a triangular lattice and either sit at one vertex or occupy two adjacent vertices. Figuratively, the worm is either pulled together or is stretched out. They can creep to an adjacent vertex by stretching out and later contracting. They cannot hop as in Chinese checkers.
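To fix ideas, here is one purely illustrative way (ours, not from the amoebot papers; all names are hypothetical) to encode a worm’s two states in Python, with axial coordinates standing in for the triangular lattice:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    Node = Tuple[int, int]  # axial coordinates (q, r) on the triangular lattice

    @dataclass
    class Worm:
        # A contracted worm sits on a single node (head only); an expanded worm
        # spans two adjacent nodes (head and tail) while it is mid-creep.
        head: Node
        tail: Optional[Node] = None  # None means contracted

        def is_expanded(self) -> bool:
            return self.tail is not None

        def expand_to(self, target: Node) -> None:
            # Stretch out: the old node becomes the tail, the target becomes the head.
            assert self.tail is None, "worm is already expanded"
            self.tail, self.head = self.head, target

        def contract_forward(self) -> None:
            # Pull the tail in, completing the creep onto the head node.
            self.tail = None

A creep from one node to an adjacent one is then expand_to followed by contract_forward; contracting back onto the tail instead would cancel the move, and nothing here lets a worm hop.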

The Worms Cannot Play Go

We don’t know if the game of Go can be played sensibly on a Chinese checkers board, but anyway these worms cannot play Go. A necessity in Go is forming interior holes called eyes. Although the movement rules crafted by CDRR are entirely local and randomized, the configuration of worms can never make an eye.

Let {a} be a node occupied by a contracted worm {P} and {b} an adjacent empty node. Then let {c} and {d} be the two nodes adjacent to both {a} and {b}. Define the neighborhood {N_{a,b}} to include {c} and {d}, the other three nodes adjacent to {a}, and the other three nodes adjacent to {b}. Here is an alternate version of the rules in CDRR for it to be legal for {P} to expand into {b} and then contract into {b}:

  1. If both {c,d} are occupied, then all occupied nodes in {N_{a,b}} must be connected within {N_{a,b}} to either {c} or {d} but not both.

  2. If one of {c,d} is occupied, then all occupied nodes in {N_{a,b}} must be connected within {N_{a,b}} to it.

  3. If neither is occupied, then the occupied neighbors of {a} must form one connected component within {N_{a,b}}, and so must the neighbors of {b}.

There are further rules saying that, initially, no node in {N_{a,b}} may be part of an expanded worm, and covering possible adjacent expanded worms after {P} has expanded. However, their effect is to enable treating the nodes concerned as unoccupied, so that Markov chains {M} built on these rules need only use states with all worms contracted. The rules are enforceable by giving each worm a fixed finite local memory that its neighbors can share.

The “not both” in the first rule subsumes their rule that the worm cannot have five occupied neighbors, which would cause an eye upon its moving to {b}. The second and third rules preserve connectedness of the whole and ensure that the move does not connect islands at {b}. The third rule also ensures that a path of open nodes through {c,b,d} can now go through {c,a,d}. The rules’ symmetry makes a subsequent move back from {b} to {a} also legal, so that {M} is reversible.

Their chain {M} always executes the move if the number {t_a} of triangles the worm was part of on node {a} is no greater than the number {t_b} of triangles it joins on {b}; otherwise the move happens with probability {\lambda^{t_b - t_a}}, where {\lambda > 1} is a fixed constant. Given the rules, it strikes us that using the difference in neighbor counts as the exponent is equivalent. The whole system is concurrent and asynchronous, but {M} is described by first choosing one worm uniformly at random for a possible move at each step, and then choosing an unoccupied neighbor at random.
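To make the dynamics concrete, here is a rough Python sketch (our own illustration, not CDRR’s code; the legality test is only our paraphrase of the three rules above, and we use the neighbor-count form of the bias) of a single step of such a chain, with the configuration stored as a set of occupied lattice nodes:

    import random

    # Six neighbor directions on the triangular lattice, in axial coordinates.
    DIRS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]

    def neighbors(v):
        return [(v[0] + d[0], v[1] + d[1]) for d in DIRS]

    def components_within(candidates, occupied):
        # Connected components of the occupied nodes among `candidates`, using
        # only adjacencies whose endpoints both lie in `candidates`.
        nodes = {v for v in candidates if v in occupied}
        seen, comps = set(), []
        for s in nodes:
            if s in seen:
                continue
            comp, stack = set(), [s]
            while stack:
                v = stack.pop()
                if v in comp:
                    continue
                comp.add(v)
                stack.extend(u for u in neighbors(v) if u in nodes)
            seen |= comp
            comps.append(comp)
        return comps

    def move_is_legal(a, b, occupied):
        # Our paraphrase of the three local rules for a contracted worm at `a`
        # moving to the empty neighbor `b`.
        c, d = [v for v in neighbors(a) if v in neighbors(b)]
        N = (set(neighbors(a)) | set(neighbors(b))) - {a, b}
        comps = components_within(N, occupied)
        cd_occupied = [v for v in (c, d) if v in occupied]
        if len(cd_occupied) == 2:
            # Rule 1: every occupied piece of N touches exactly one of c and d.
            return all((c in comp) != (d in comp) for comp in comps)
        if len(cd_occupied) == 1:
            # Rule 2: everything occupied in N is connected, within N, to that node.
            return len(comps) == 1
        # Rule 3: the occupied neighbors of a lie in a single component of N,
        # and so do the occupied neighbors of b.
        def in_one_component(v):
            occ_nbrs = [u for u in neighbors(v) if u in occupied and u not in (a, b)]
            hit = {i for i, comp in enumerate(comps) for u in occ_nbrs if u in comp}
            return len(hit) <= 1
        return in_one_component(a) and in_one_component(b)

    def metropolis_step(occupied, lam):
        # One step of the chain: pick a worm and one of its six neighbors uniformly
        # at random; if the target is empty and the move is legal, accept it with
        # probability min(1, lam ** (neighbors gained)).  Taking lam > 1 biases the
        # chain toward compression.
        a = random.choice(tuple(occupied))
        b = random.choice(neighbors(a))
        if b in occupied or not move_is_legal(a, b, occupied):
            return occupied
        others = occupied - {a}
        t_a = sum(u in others for u in neighbors(a))
        t_b = sum(u in others for u in neighbors(b))
        if t_b >= t_a or random.random() < lam ** (t_b - t_a):
            return others | {b}
        return occupied

One would run this step repeatedly from a connected, hole-free starting configuration; with {\lambda > 1} the exponent rewards moves that gain occupied neighbors, which is what drives compression.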

Example and Main Theorems

Here is a configuration from their paper in which the only legal moves involve the third rule:


[Configuration figure from CDRR]


The edges denote adjacent contracted worms, not expanded worms. Suppose the chosen node {a} is the one midway up the second column at left. It forms a triangle with its two neighbors to the left, so {t_a = 1}. The node {b} above {a} is vacant, but moving there would close off a big region below it. The move is prevented by the second rule because {N_{a,b}} would comprise four nodes that are not all connected within {N_{a,b}}. Similarly, the node below {a} is forbidden. Either node {b'} to the right of {a} is permitted by rule 3, however. Since {t_{b'} = 2}, the move will certainly happen if {b'} is randomly chosen.

This suggests there could be examples where the only legal moves go back and forth, creating cycles as can happen in John Conway’s game of Life. However, CDRR show that every connected hole-free {n}-worm configuration is reachable from any other. This makes the Markov chain {M} ergodic. Their main theorem is:

Theorem 1 For all {\alpha > 1} and {\lambda > (2 + \sqrt{2})^{\alpha/(\alpha - 1)}}, and sufficiently large {n}, when {M} is run from any connected, hole-free {n}-worm configuration, with all but exponentially vanishing probability it reaches and stays among configurations with total perimeter at most {\alpha} times the minimum possible perimeter for {n} nodes.
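To get a feel for the constants (our arithmetic from the theorem statement, not a claim in the paper): taking {\alpha = 2} requires {\lambda > (2+\sqrt{2})^2 = 6 + 4\sqrt{2} \approx 11.66}, while letting {\alpha \to \infty} relaxes the requirement toward {\lambda > 2 + \sqrt{2} \approx 3.41}, at the cost of a weaker compression guarantee.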

The second main theorem shows a threshold of {\lambda < 2.172033\dots} for the opposite behavior: the perimeter stays at size {\Omega(n)}.

The Proof

The researchers CDRR are experts at the analysis of Markov chains. So they view their particles as such a system. Then they need to prove that the resulting Markov system behaves the way they claim: that as time increases they tend to form a tight unit that solves the compression problem.

Luckily there are many analytical tools at their disposal. But regarding the ergodicity alone, they say:

We emphasize the details of this proof are far from trivial, and occupy the next ten pages.

Their particles are pretty simple, but to prove that the system operates as claimed requires quite careful analysis. Take a look at their paper for the details.

I (Dick) will make one last detailed comment. They want their system to operate completely locally. This means that there can be no global clock: each particle operates asynchronously. This requires some clever ideas to make it work: they want each particle to activate in a random manner. They use the trick that random sequences of actions can be approximated using Poisson clocks with mean {1}. The key is:

After each action, a particle then computes another random time drawn from the same distribution and executes again after that amount of time has elapsed. The exponential distribution is unique in that, if particle {P} has just activated, it is equally likely that any particle will be the next particle to activate, including particle {P}. Moreover, the particles update without requiring knowledge of any of the other particles’ clocks. Similar Poisson clocks are commonly used to describe physical systems that perform updates in parallel in continuous time.
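A minimal sketch (ours, not from CDRR) of that trick in Python: each particle keeps its own exponential alarm clock, and by memorylessness the next particle to fire is uniform over all particles, which is exactly why a sequential chain that activates one uniformly random worm per step faithfully simulates the asynchronous system.

    import random

    def activation_order(num_particles, horizon):
        # Each particle's next activation time; successive waiting times are Exp(1).
        next_time = [random.expovariate(1.0) for _ in range(num_particles)]
        order = []
        while True:
            i = min(range(num_particles), key=next_time.__getitem__)
            if next_time[i] > horizon:
                break
            order.append((next_time[i], i))
            # The particle that just fired draws a fresh Exp(1) waiting time,
            # with no knowledge of any other particle's clock.
            next_time[i] += random.expovariate(1.0)
        return order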

Open Problems

After looking at their paper in some depth, we find it intriguing that local, independent particles can actually work together to solve a global problem. Yes, there are many such results, but they usually have global clocks and other assumptions. The fact that compression is achievable by a weaker model is very neat.


by RJLipton+KWRegan at September 26, 2016 12:24 AM UTC

Authors: Matthias Christandl, Péter Vrana, Jeroen Zuiddam
Download: PDF
Abstract: We present an upper bound on the exponent of the asymptotic behaviour of the tensor rank of a family of tensors defined by the complete graph on $k$ vertices. For $k\geq4$, we show that the exponent per edge is at most 0.77, outperforming the best known upper bound on the exponent per edge for matrix multiplication ($k=3$), which is approximately 0.79. We raise the question whether for some $k$ the exponent per edge can be below $2/3$, i.e. can outperform matrix multiplication even if the matrix multiplication exponent equals 2. In order to obtain our results, we generalise to higher order tensors a result by Strassen on the asymptotic subrank of tight tensors and a result by Coppersmith and Winograd on the asymptotic rank of matrix multiplication. Our results have applications in entanglement theory and communication complexity.

at September 26, 2016 01:02 AM UTC

Authors: Ioannis Caragiannis, Xenophon Chatzigeorgiou, George A. Krimpas, Alexandros A. Voudouris
Download: PDF
Abstract: Nowadays, several crowdsourcing projects exploit social choice methods for computing an aggregate ranking of alternatives given individual rankings provided by workers. Motivated by such systems, we consider a setting where each worker is asked to rank a fixed (small) number of alternatives and, then, a positional scoring rule is used to compute the aggregate ranking. Among the apparently infinite such rules, what is the best one to use? To answer this question, we assume that we have partial access to an underlying true ranking. Then, the important optimization problem to be solved is to compute the positional scoring rule whose outcome, when applied to the profile of individual rankings, is as close as possible to the part of the underlying true ranking we know. We study this fundamental problem from a theoretical viewpoint and present positive and negative complexity results and, furthermore, complement our theoretical findings with experiments on real-world and synthetic data.

at September 26, 2016 01:00 AM UTC