Theory of Computing Blog Aggregator

If you're used to writing mathematics, but haven't paid much attention to model theory, you probably think a fully-quantified mathematical sentence is generally either true or false. Fermat's last theorem, for instance, can be written as the following sentence: For all positive integers a, b, c, and n, if n > 2, then a^n + b^n ≠ c^n. This sentence is fully quantified: the four variables a, b, c, and n are all covered by the quantifier "for all positive integers". It's one of the true ones, if difficult to prove.

But when we're working with the logic of graphs, a (fully-quantified) sentence is itself just another mathematical object, and its truth is relative: it might be true for some graphs and false for others. Consider, for instance, the following sentence about an undirected graph: "There exist vertices v and w such that v and w are adjacent, and for all vertices u, if u and v are adjacent, then u equals w." It can be satisfied only when v is a vertex whose degree is exactly one, and w is its unique neighbor. We can write this more concisely using a notation in which adjacency is written using a tilde:

∃v ∃w ((v ~ w) ∧ ∀u ((u ~ v) → u = w))

Let's give this sentence a name, D1. Then D1 is true for a graph that has a degree-one vertex, such as the complete bipartite graph K1,4. But it's false for a graph that doesn't have such a vertex, such as the complete graph K4. If a sentence is true for a graph, we say that the graph "models" the sentence, and we can also write that in mathematical notation:

K1,4 ⊨ D1,   K4 ⊭ D1

This kind of logic, in which the only things we can talk about are vertices and their adjacencies, is called the first order logic of graphs, and it's kind of weak. Each of its sentences is equivalent to an algorithm that can contain nested loops over vertices, if-then-else logic involving adjacency tests and equality, and the ability to return Boolean values, but nothing else. For instance:

def d1(G):
    for v in G:
        for w in G:
            if G.adjacent(v,w):
                for u in G:
                    if G.adjacent(u,v):
                        if u != w:
                            break
                else:
                    return True
    return False


This is good enough to recognize some families of graphs (such as the ones with a finite set of forbidden induced subgraphs) but not many others. For instance, I don't know how to describe the distance-hereditary graphs in this way. They can be described by forbidden induced subgraphs, but infinitely many of them, and we're not allowed to write infinitely-long sentences.

On the other hand, the weakness of first-order logic means that we can prove interesting facts about it. For instance, every first-order sentence defines a family of graphs that can be recognized in polynomial time. Also, we have the 0-1 law: if S is any sentence in first-order logic, then the probability that a graph chosen uniformly at random among all n-vertex graphs models S is either zero or one in the limit as n goes to infinity. Using the 0-1 law, even though we can't describe the distance-hereditary graphs precisely in first-order logic, we can get an approximate description that's good enough to prove that almost all random graphs are not distance-hereditary. A distance-hereditary graph either has a degree-one vertex (it models D1) or it has two twin vertices, vertices whose sets of neighbors (not counting the two twins themselves) are identical. The existence of twins can also be described by a first-order sentence T:

∃v ∃w ((v ≠ w) ∧ ∀u ((u ≠ v ∧ u ≠ w) → ((u ~ v) ↔ (u ~ w))))

But for a uniformly random graph, both the expected number of degree-one vertices and the expected number of twin pairs can be calculated directly from these formulas, and both are exponentially small in the number n of vertices. So almost all graphs do not model D1, do not model T, and therefore cannot be distance-hereditary.
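Like D1, the sentence T translates directly into nested loops over vertices. Here is a sketch in the same style as the d1 function above, assuming the same hypothetical graph interface (iteration over vertices and a G.adjacent method):

def t(G):
    for v in G:
        for w in G:
            if v != w:
                # check that every third vertex u sees v and w the same way
                for u in G:
                    if u != v and u != w:
                        if G.adjacent(u,v) != G.adjacent(u,w):
                            break
                else:
                    return True
    return False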

The name "first order" should be a hint that there's another kind of logic, "second order logic", and there is. In second order logic, the variables can represent complicated structures built out of k-ary relations (for instance, entire graphs), the quantifiers quantify over these structures, and we need more primitives to be able to look at what's inside these structures. The idea of using second order logic seems to be somewhat controversial in mathematics, in part because there's not a uniquely-defined way of assigning meanings to statements in this logic, but there's a restricted subset of the second order logic of graphs, called monadic second order logic, where these problems do not arise. Or actually there are two such subsets: MSO1 and MSO2.

MSO1 is what you get from the first order logic described above when you add another type of variable for sets of vertices (usually written with capital letters) and you allow quantification over sets of vertices. The only other feature beyond first order logic that's necessary to define this logic is the ability to test whether a vertex belongs to a set. It's convenient to write formulas using more complicated tests such as whether one set is a subset of another, but those can be broken down into membership tests. We can also get the effect of using these sets as variable types that can be quantified over, by instead quantifying over all vertices but then testing whether the results of the quantification belong to the given set. For instance we can write sentences D1[X] and T[X] that have the same form as D1 and T, but restrict all their variables to be in X. The effect of this restriction would be to test whether the subgraph induced by X has a degree-one vertex or has twins. A distance-hereditary graph is a graph in which every induced subgraph of two or more vertices does have a degree-one vertex or twins, and this logic allows us to express this definition in a sentence DH:

∀X ((∃v ∃w (v ∈ X ∧ w ∈ X ∧ v ≠ w)) → (D1[X] ∨ T[X]))

A graph G models DH if and only if G is distance-hereditary. MSO2 is similar, but allows four types of variables: vertices, edges, and sets of vertices and edges. The ability to represent sets of edges allows it to express some properties (such as the property of having a Hamiltonian cycle) that cannot be expressed in MSO1.

Unlike first-order logic, we don't necessarily get efficient algorithms out of MSO expressions. Simulating the formula directly would involve an exponential-time search over all possible subsets of vertices or edges in a given graph. But that's not the only way to turn one of these formulas into an algorithm. In particular, we can apply Courcelle's theorem, which says that every MSO2 formula can be translated into a fixed-parameter tractable algorithm on graphs parameterized by their treewidth, and that every MSO1 formula can be translated into an FPT algorithm on graphs parameterized by their clique-width. In the example of the distance-hereditary graphs, we also know that all such graphs have bounded clique-width. So applying Courcelle and plugging in the fixed bound on the clique-width of these graphs immediately tells us that there's a polynomial time algorithm for recognizing distance-hereditary graphs.
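To make the exponential blow-up of the direct simulation concrete, here is a brute-force sketch (my own, not from any of the papers mentioned) that checks DH by explicitly quantifying over every vertex subset; the graph is assumed to be given as a dictionary mapping each vertex to its set of neighbors:

from itertools import combinations

def models_dh_bruteforce(G):
    # Direct simulation of DH: for every subset X with |X| >= 2, test D1[X] or
    # T[X] on the induced subgraph.  G maps each vertex to its set of neighbors.
    # This takes exponential time, unlike what Courcelle's theorem provides.
    vertices = list(G)
    for r in range(2, len(vertices) + 1):
        for X in map(set, combinations(vertices, r)):
            d1_X = any(len(G[v] & X) == 1 for v in X)         # D1[X]
            t_X = any((G[v] & X) - {w} == (G[w] & X) - {v}    # T[X]
                      for v, w in combinations(X, 2))
            if not (d1_X or t_X):
                return False
    return True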

All of this is, I think, completely constructive: it's not just that an algorithm exists, but in principle we could automatically translate the formula into the algorithm. It's also completely useless in practice because the dependence on the parameter is ridiculously high (some kind of tower of powers). When an algorithm is found in this way, additional research is needed to find a more direct algorithm that reduces this dependence to something more reasonable like single-exponential with a small base, or even removes it to get a polynomial time algorithm. In the case of the distance-hereditary graphs, there's an easy polynomial algorithm: look for degree one vertices or twins, and whenever one of these patterns is found use it to reduce the size of the graph by one vertex. With a little more care one can even achieve linear time for distance-hereditary graph recognition.
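Here is a rough sketch of that reduction algorithm (my own, the simple polynomial-time version rather than the linear-time one), assuming a connected input graph given as a dictionary of neighbor sets, which the function consumes:

from itertools import combinations

def is_distance_hereditary(G):
    # Repeatedly delete a degree-one vertex or one member of a twin pair; the
    # (connected) graph is distance-hereditary iff this reduces it to a single
    # vertex.  G maps vertices to neighbor sets and is modified in place.
    while len(G) > 1:
        victim = next((v for v in G if len(G[v]) == 1), None)
        if victim is None:
            victim = next((v for v, w in combinations(G, 2)
                           if G[v] - {w} == G[w] - {v}), None)
        if victim is None:
            return False
        for u in G[victim]:
            G[u].remove(victim)
        del G[victim]
    return True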

My latest preprint, "Crossing minimization for 1-page and 2-page drawings of graphs with bounded treewidth" (arXiv:1408.6321, with Michael Bannister, to appear at Graph Drawing), uses this same logical approach to attack some problems related to book embedding. We had a paper with Joe Simons in GD 2013 that showed that, for graphs formed from trees by adding a very small number of edges, we can find 1-page and 2-page book drawings with a minimum number of crossings in FPT time. In the new paper, we characterize these drawings using second order logic and apply Courcelle's theorem, allowing us to generalize these algorithms to the graphs of low treewidth, a much larger class of inputs. But because we use Courcelle's theorem, our algorithms are completely impractical. More research is needed to find a way to minimize crossings in practice.

at August 28, 2014 12:43 AM UTC

Authors: Elizaveta Frenkel, Andrey Nikolaev, Alexander Ushakov
Download: PDF
Abstract: The classic knapsack and related problems have natural generalizations to arbitrary (non-commutative) groups, collectively called knapsack-type problems in groups. We study the effect of free and direct products on their time complexity. We show that free products in a certain sense preserve the time complexity of knapsack-type problems, while direct products may amplify it.

at August 28, 2014 12:40 AM UTC

Authors: Ciaran McCreesh, Patrick Prosser
Download: PDF
Abstract: A clique in a graph is a set of vertices, each of which is adjacent to every other vertex in this set. A k-clique relaxes this requirement, requiring vertices to be within a distance k of each other, rather than directly adjacent. In theory, a maximum clique algorithm can easily be adapted to solve the maximum k-clique problem. We use a state of the art maximum clique algorithm to show that this is feasible in practice, and introduce a lazy global domination rule which sometimes vastly reduces the search space. We include experimental results for a range of real-world and benchmark graphs, and a detailed look at random graphs.

at August 28, 2014 12:41 AM UTC

Authors: Valentin Garnero, Ignasi Sau, Dimitrios M. Thilikos
Download: PDF
Abstract: In the Red-Blue Dominating Set problem, we are given a bipartite graph $G = (V_B \cup V_R,E)$ and an integer $k$, and asked whether $G$ has a subset $D \subseteq V_B$ of at most $k$ "blue" vertices such that each "red" vertex from $V_R$ is adjacent to a vertex in $D$. We provide the first explicit linear kernel for this problem on planar graphs, of size at most $46k$.

at August 28, 2014 12:41 AM UTC

Authors: Meier Florian, Peter Ueli
Download: PDF
Abstract: We consider the classical push broadcast process on a large class of sparse random multigraphs that includes random power law graphs and multigraphs. Our analysis shows that for every $\varepsilon>0$, whp $O(\log n)$ rounds are sufficient to inform all but an $\varepsilon$-fraction of the vertices.

It is not hard to see that, e.g. for random power law graphs, the push process needs whp $n^{\Omega(1)}$ rounds to inform all vertices. Fountoulakis, Panagiotou and Sauerwald proved that for random graphs that have power law degree sequences with $\beta>3$, the push-pull protocol needs $\Omega(\log n)$ to inform all but $\varepsilon n$ vertices whp. Our result demonstrates that, for such random graphs, the pull mechanism does not (asymptotically) improve the running time. This is surprising as it is known that, on random power law graphs with $2<\beta<3$, push-pull is exponentially faster than pull.

at August 28, 2014 12:41 AM UTC

Authors: Sunny Daniels
Download: PDF
Abstract: As far as I know, at the time that I originally devised this result (1998), this was the first constructive proof that, for any integer $k$, there is a language in $\Sigma_2^P$ that cannot be simulated by a family of logic circuits of size $n^k$. However, this result had previously been proved non-constructively: see Cai and Watanabe [CW08] for more information on the history of this problem.

This constructive proof is based upon constructing a language $\Gamma$ derived from the satisfiability problem, and a language $\Lambda_k$ defined by an alternating Turing machine. We show that the union of $\Gamma$ and $\Lambda_k$ cannot be simulated by circuits of size $n^k$.

at August 28, 2014 12:40 AM UTC

Authors: Michael J. Bannister, David Eppstein
Download: PDF
Abstract: We investigate crossing minimization for 1-page and 2-page book drawings. We show that computing the 1-page crossing number is fixed-parameter tractable with respect to the number of crossings, that testing 2-page planarity is fixed-parameter tractable with respect to treewidth, and that computing the 2-page crossing number is fixed-parameter tractable with respect to the sum of the number of crossings and the treewidth of the input graph. We prove these results via Courcelle's theorem on the fixed-parameter tractability of properties expressible in monadic second order logic for graphs of bounded treewidth.

at August 28, 2014 12:41 AM UTC

Authors: Rahul Mehta
Download: PDF
Abstract: We prove that a variant of 2048, a popular online puzzle game, is PSPACE-Complete. Our hardness result holds for a version of the problem where the player has oracle access to the computer player's moves. Specifically, we show that for an $n \times n$ game board $\mathcal{G}$, computing a sequence of moves to reach a particular configuration $\mathbb{C}$ from an initial configuration $\mathbb{C}_0$ is PSPACE-Complete. Our reduction is from Nondeterministic Constraint Logic (NCL). We also show that determining whether or not there exists a fixed sequence of moves $\mathcal{S} \in \{\Uparrow, \Downarrow, \Leftarrow, \Rightarrow\}^k$ of length $k$ that results in a winning configuration for an $n \times n$ game board is fixed-parameter tractable (FPT). We describe an algorithm to solve this problem in $O(4^k n^2)$ time.

at August 28, 2014 12:40 AM UTC

Authors: Manoj M. Prabhakaran, Vinod M. Prabhakaran
Download: PDF
Abstract: The main contribution of this work is to relate information complexity to "tension" [Prabhakaran and Prabhakaran, 2014] - an information-theoretic quantity defined with no reference to protocols - and to illustrate that it allows deriving strong lower-bounds on information complexity. In particular, we use a very special case of this connection to give a quantitatively tighter connection between information complexity and discrepancy than the one in the work of Braverman and Weinstein (2012) (albeit, restricted to independent inputs). Further, as tension is in fact a multi-dimensional notion, it enables us to bound the 2-dimensional region that represents the trade-off between the amounts of communication in the two directions in a 2-party protocol.

This work is also intended to highlight tension as a fundamental measure of correlation between a pair of random variables, with rich connections to a variety of questions in computer science and information theory.

at August 28, 2014 12:41 AM UTC

Authors: Edith Cohen, Daniel Delling, Thomas Pajor, Renato F. Werneck
Download: PDF
Abstract: Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces.

Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the most influential seed set of a given size). Answering each influence query involves many edge traversals, and does not scale when there are many queries on very large graphs. The gold standard for Influence Maximization is the greedy algorithm, which iteratively adds to the seed set a node maximizing the marginal gain in influence. Greedy has a guaranteed approximation ratio of at least (1-1/e) and actually produces a sequence of nodes, with each prefix having approximation guarantee with respect to the same-size optimum. Since Greedy does not scale well beyond a few million edges, for larger inputs one must currently use either heuristics or alternative algorithms designed for a pre-specified small seed set size.

We develop a novel sketch-based design for influence computation. Our greedy Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with billions of edges, with one to two orders of magnitude speedup over the best greedy methods. It still has a guaranteed approximation ratio, and in practice its quality nearly matches that of exact greedy. We also present influence oracles, which use linear-time preprocessing to generate a small sketch for each node, allowing the influence of any seed set to be quickly answered from the sketches of its nodes.

at August 28, 2014 12:41 AM UTC

Authors: Vladimir P. Burichenko
Download: PDF
Abstract: We consider the famous Strassen algorithm for fast multiplication of matrices. We show that this algorithm has a nontrivial finite group of automorphisms of order 36 (namely the direct product of two copies of the symmetric group on 3 symbols), or even 72, if we consider "extended" Strassen algorithm. This is an indirect evidence that the (unknown at present) optimal algorithm for multiplication of two size 3 by 3 matrices also may have a large automorphism group, and this may be a fruitful idea for a search of such an algorithm. In the beginning we give a brief introduction to the subject, to make the text accessible for specialists in the representation theory of finite groups.

at August 28, 2014 12:40 AM UTC

Authors: Zhengjun Cao, Lihua Liu
Download: PDF
Abstract: An efficient quantum modular exponentiation method is indispensable for Shor's factoring algorithm. But Shor's original description specifies only the process $(a, 1)\rightarrow(a, x^a\,\mbox{mod}\, n)$, where $a, x, n$ are integers. Nielsen and Chuang specify that $$|a\rangle|y\rangle \rightarrow |a\rangle U^{a_{t-1}2^{t-1}}\cdots U^{a_02^{0}}|y \rangle = |a\rangle|x^{a_{t-1}2^{t-1}} \times \cdots \times x^{a_0 2^0}y (\mbox{mod}\, n)\rangle = |a\rangle|x^{a}y (\mbox{mod}\, n)\rangle $$ where $a$'s binary representation is $a_{t-1}a_{t-2}\cdots a_0$, $U$ is the unitary operation such that $U|y\rangle\equiv |xy (\mbox{mod}\, n)\rangle$, $y\in\{0, 1\}^{\ell}$, $\ell$ is the bit length of $n$. We find this method requires $a$ unitary operations. Hence, the process $$ \frac 1{\sqrt q}\sum_{a=0}^{q-1}|a\rangle|0\rangle \rightarrow \frac 1{\sqrt q}\sum_{a=0}^{q-1}|a\rangle|x^a(\mbox{mod}\, n)\rangle $$ where $n^2 \leq q < 2n^2$, $n$ is the large number to be factored, requires $O(q^2)$ unitary operations which cannot be implemented in polynomial time. In view of this flaw, we do not think that Shor's factoring algorithm is completely understandable and universally acceptable.

at August 28, 2014 12:41 AM UTC

Authors: Alexei Myasnikov, Andrey Nikolaev, Alexander Ushakov
Download: PDF
Abstract: We generalize the classical knapsack and subset sum problems to arbitrary groups and study the computational complexity of these new problems. We show that these problems, as well as the bounded submonoid membership problem, are P-time decidable in hyperbolic groups and give various examples of finitely presented groups where the subset sum problem is NP-complete.

at August 28, 2014 12:40 AM UTC

I am happy to have an update on the SOCG/STOC colocation issue that arose last week.  Or, better said, Jeff Erickson has an update, the short summary of which is that it looks like there has now been some very useful clarification.  The concerns of the ACM apparently are limited to direct financial support, and the conferences can (formally) co-locate.  I encourage you to read the note on Jeff's blog from Paul, and let me echo Jeff's statement here: 

"Needless to say, this is fantastic news!  I want to publicly thank Paul and the ACM leadership for quickly clearing up this unfortunate misunderstanding."

So say we all.

by Michael Mitzenmacher (noreply@blogger.com) at August 27, 2014 05:44 PM UTC

A few years ago, Blais, Brody, and Matulef (2012) presented a methodology for proving lower bounds for property testing problems by reducing them from problems in communication complexity. Recently, Bhrushundi, Chakraborty, and Kulkarni (2014) showed that some reductions of this type can be deconstructed to two separate reductions, from communication complexity to parity decision trees and from the latter to property testing. This work follows up on these ideas. We present a method for deconstructing reductions from communication complexity to property testing, using an intermediary model that is a generalization of parity decision trees. This method yields a new interpretation for several well-known reductions, since we present the reductions as a composition of two steps with fundamentally different functionalities. Furthermore, we show a technique for proving lower bounds directly in the intermediary model, apply this technique to prove several lower bounds for natural problems in the model, and derive corresponding lower bounds in property testing. In particular, we provide an alternative proof for a known $\Omega(k)$ lower bound on testing $k$-sparse linear functions over $\mathbb{F}_2$, relying on a theorem by Linial and Samorodnitsky (2002). We then extend this result to a new lower bound of $\Omega(s)$ for testing $s$-sparse degree-$d$ polynomials over $\mathbb{F}_2$, for any $d\in\mathbb{N}$. In addition we provide a simple proof for the hardness of testing some families of linear subcodes. We present an unrelated result in an appendix. In property testing, testers that always accept inputs that are in the property (i.e., testers with one-sided error) are natural and common. We show that the dual notion, testers that always reject inputs that are far from the property, seems to be a notion of limited scope.

at August 27, 2014 02:40 PM UTC

We provide an alternative proof for a known result stating that $\Omega(k)$ queries are needed to test $k$-sparse linear Boolean functions. Similar to the approach of Blais and Kane (2012), we reduce the proof to the analysis of Hamming weights of vectors in affine subspaces of the Boolean hypercube. However, we derive our proof from a general result by Linial and Samorodnitsky (2002) that upper bounds the number of vectors with the same Hamming weight in every large affine subspace of the Boolean hypercube. Our line of argument is reminiscent of a technique that is common in communication complexity, and it allows us to derive the lower bound from Linial and Samorodnitsky's result quite easily. We publish this proof as a self-contained excerpt from a broader work (2014), since it might be of independent interest. In the other work we also extend the result to an $\Omega(s)$ lower bound for testing $s$-sparse polynomials of degree $d$, for any $d\in\mathbb{N}$.

at August 27, 2014 02:31 PM UTC

The 7th International Symposium on Algorithmic Game Theory (SAGT) will take place in Patras, Greece, from September 30 to October 2. Notice the change in location (from the original Haifa, Israel).


by Noam Nisan at August 27, 2014 07:34 AM UTC

Today, after a lecture in the spectral graph theory boot camp at the Simons institute, I was asked what the expander mixing lemma is like in graphs that are not regular.

I don’t know if I will have time to return to this tomorrow, so here is a quick answer.

First, for context, the expander mixing lemma in regular graphs. Say that {G=(V,E)} is a {d}-regular undirected graph, and {A} is its adjacency matrix. Then let the eigenvalues of the normalized matrix {\frac 1d A} be

\displaystyle  1 = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n

We are interested in graphs for which all the eigenvalues are small in absolute value, except {\lambda_1}, that is, if we define

\displaystyle  \sigma_2 := \max\{ |\lambda_2|, |\lambda_3 | , \ldots , | \lambda_n | \}

we are interested in graphs for which {\sigma_2} is small. The expander mixing lemma is the fact that for every two disjoint sets {S} and {T} of vertices we have

\displaystyle  \left | E(S,T) - \frac d n \cdot |S| \cdot |T| \right | \leq \sigma_2 d \sqrt{ |S| \cdot |T|} \ \ \ \ \ (1)

The inequality (1) says that, if {\sigma_2} is small, then the number of edges between every two large sets of vertices is almost determined just by the size of the sets, and it is equal to the expected number of edges between the two sets in a random {d}-regular graph, up to an error term that depends on {\sigma_2}.

For the proof, we observe that, if we call {J} the matrix that has ones everywhere, then

\displaystyle  \sigma_2 = \max_{x \perp (1,\ldots , 1) } \frac { |x^T A x|}{d\cdot x^Tx} = \max_{x} \frac { \left|x^T \left( A - \frac dn J \right) x \right|}{d\cdot x^T x} = \max_{x,y} \frac { \left|x^T \left( A - \frac dn J \right) y \right|}{d\cdot ||x|| \cdot ||y||}

and then we substitute {x := {\mathbf 1}_S} and {y:= {\mathbf 1}_T} in the above expression and do the calculations.

In the case of an irregular undirected graph {G=(V,E)}, we are going to consider the normalized adjacency matrix {M:= D^{-\frac 12} A D^{-\frac 12 }}, where {A} is the adjacency matrix of {G} and {D} is the diagonal matrix such that {D_{v,v} = d_v}, where {d_v} is the degree of {v}. As in the regular case, the eigenvalues of the normalized adjacency matrix satisfy

\displaystyle  1 = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq -1

Let us define

\displaystyle  \sigma_2:= \max \{ |\lambda_2| , |\lambda_3| , \ldots, |\lambda_n| \}

the second largest eigenvalue in absolute value of {M}.

We will need two more definitions: for a set of vertices {S}, its volume is defined as

\displaystyle  vol(S):= \sum_{v\in S} d_v

the sum of the degrees and {\bar d = \frac 1n \sum_v d_v} the average degree, so that {vol(V) = \bar d n }. Now we have

Lemma 1 (Expander Mixing Lemma) For every two disjoint sets of vertices {S}, {T}, we have

\displaystyle   \left | E(S,T) - \frac {vol(S) \cdot vol(T) }{vol(V)} \right | \leq \sigma_2 \sqrt{vol(S) \cdot vol(T) } \ \ \ \ \ (2)

So, once again, we have that the number of edges between {S} and {T} is what one would expect in a random graph in which the edge {(u,v)} exists with probability {d_ud_v/vol(V)}, up to an error that depends on {\sigma_2}.
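Before turning to the proof, here is a quick numerical sanity check of (2), a sketch using numpy on a small random graph that is assumed connected with all degrees positive; E(S,T) counts the edges with one endpoint in S and one in T.

import numpy as np

# Numerical sanity check of inequality (2) on a small irregular random graph.
rng = np.random.default_rng(0)
n = 40
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = A + A.T                                    # symmetric 0/1 adjacency matrix
d = A.sum(axis=1).astype(float)                # degree sequence (assumed positive)
M = A / np.sqrt(np.outer(d, d))                # normalized adjacency D^{-1/2} A D^{-1/2}
eigs = np.linalg.eigvalsh(M)                   # ascending; eigs[-1] = lambda_1 = 1
sigma2 = max(abs(eigs[0]), abs(eigs[-2]))      # largest |eigenvalue| other than lambda_1

vol = d.sum()
perm = rng.permutation(n)
S, T = perm[:10], perm[10:20]                  # two disjoint vertex sets
e_ST = A[np.ix_(S, T)].sum()                   # number of edges between S and T
lhs = abs(e_ST - d[S].sum() * d[T].sum() / vol)
rhs = sigma2 * np.sqrt(d[S].sum() * d[T].sum())
print(lhs, "<=", rhs)                          # the inequality should hold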

To prove the lemma, we prove the following claim:

\displaystyle  \sigma_2 = \max_{x,y} \ \ \frac { \left| x^T \left( A - R \right) y\right| } {\sqrt{\sum_v d_v x_v^2} \cdot \sqrt{\sum_v d_v y_v^2 } }  \ \ \ \ \ (3)

where {R} is the matrix such that {R_{u,v} = d_u\cdot d_v / vol(V)}, and then the lemma will follow by just substituting {x:= {\mathbf 1}_S} and {y:= {\mathbf 1}_T} in the above expression.

The proof would be a bit cleaner if we had defined the normalized adjacency matrix to be an operator over a vector space with a different type of inner product from the standard Euclidean one (so that {\sqrt{ \sum_v d_v x_v^2}} would become the norm of {x}), but this requires some set-up and explanations, so we will just carry on with the above definitions.

Write the eigenvalue decomposition of the normalized adjacency matrix {M := D^{-1/2} A D^{-1/2}}: it is going to be

\displaystyle  z_1z_1^T + \lambda_2 z_2z_2^T + \cdots + \lambda_n z_n z_n^T

where the {z_i} are an orthonormal basis of eigenvectors.

To calculate the eigenvector {z_1} of {\lambda_1} we see that

\displaystyle  \lambda_1 = \max_z \ \frac { z^T D^{-1/2} A D^{-1/2} z}{z^T z} = \max_x \ \frac { x^T A x}{x^T D x}

where we use the change of variable {z = D^{1/2} x} and the maximum is attained for {x = (1,\ldots,1)^T} so that {z_1} is the unit vector parallel to {D^{1/2} (1,\ldots,1)^T}, that is {z_1 = \sqrt{1/vol(V)} \cdot D^{1/2} (1,\ldots,1)^T } and {z_1z_1^T = \frac {1}{vol(V)} D^{1/2} J D^{1/2}}.

Now, to compute {\sigma_2} we just have to compute the spectral norm of {M - z_1 z_1^T}, so

\displaystyle  \sigma_2 = \max_{z,w} \frac{ z^T ( M - z_1z_1^T) w}{||z|| \cdot ||w||} = \max_{x,y} \frac{ x^T D^{1/2} ( M - z_1z_1^T) D^{1/2}y}{ \sqrt{ \sum_v d_v x_v^2} \cdot \sqrt{ \sum_v d_v y_v^2}}

where the last step is the change of variable {z = D^{1/2} x} and {w = D^{1/2} y}. It remains to observe that {D^{1/2} M D^{1/2} = A} and that

\displaystyle  D^{1/2} z_1 z_1^T D^{1/2} = \frac 1 {vol(V) } D J D = R

which proves Equation (3) and thus the lemma.


by luca at August 27, 2014 05:56 AM UTC

Another of my Graph Drawing papers is online today: "Planar Induced Subgraphs of Sparse Graphs", arXiv:1408.5939, with Cora Borradaile and her student Pingan Zhu. It's about finding large planar subgraphs in arbitrary graphs; in the version of the problem we study, we want the planar subgraph to be an induced subgraph, so the goal is to find as large a subset of vertices as possible with the property that all edges connecting them can be drawn planarly. Equivalently, we want to delete as few vertices as possible to make the remaining graph planar. It's NP-complete, of course, so our goal is to prove good bounds on how many vertices you need to delete rather than computing this number exactly.

We were inspired to look at this sort of problem by a 2001 paper of Alon, Mubayi, and Thomas, who proved among other things that in triangle-free graphs with m edges you can delete m/4 of the vertices and eliminate all the cycles in the graph (so the remaining graph is a forest). We knew also (too easy for a publication) that you can delete m/3 vertices and get a linear forest, delete m/2 vertices and get a matching, or delete m/1 vertices and get an independent set. So we were hoping that this would be part of a hierarchy of graph classes, sort of like the hierarchy coming from the Colin de Verdière graph invariant: delete m/5 vertices and get an outerplanar graph, delete m/6 vertices and get a planar graph, delete m/7 vertices and get...I don't know, some interesting class of nonplanar graphs. It didn't quite work out that way.

We did get one result sort of in this vein: you can delete m/5 vertices and get a partial 2-tree. And this is in some sense optimal, because there are graphs (such as K5) for which that's the biggest partial 2-tree you can get. But then fractional divisors started turning up: you can delete m/4.5 vertices and get a pseudoforest (again optimal). And the best we could get for finding planar subgraphs was even messier (and used linear programming to work out the precise bounds): you can delete 23m/120 (around m/5.22) vertices and get a planar graph. Probably not optimal.

Ok, so the idea of getting a hierarchy with integer divisors is dead, but maybe there's still a hierarchy, just with messier numbers. Maybe, but if you want minor-closed graph families (as all the above ones are) then the divisors can't get arbitrarily big: if you choose a divisor k that's bigger than six, then no matter how you try to delete m/k vertices, the smallest minor-closed family you can get will be the class of all graphs. The proof involves the existence of 3-regular cages, sparse graphs without any short cycles: if you start with a cage, and don't delete enough vertices, you'll get a graph that still has high girth and high cyclomatic number. The high girth can be used to contract a lot of edges without introducing any self-loops or multiple adjacencies, giving a much denser minor of the remaining graph, dense enough that it necessarily has large clique minors.

Maybe there's some way of rescuing the idea of a hierarchy of subgraph classes by using classes of graphs that aren't closed under minors, but have weaker sparsity properties, such as the 1-planar graphs? I don't know; that's as far as we were able to prove. But it might be an interesting subject for future research.

at August 27, 2014 05:35 AM UTC

Authors: Louis Cuel, Jacques-Olivier Lachaud, Quentin Mérigot, Boris Thibert
Download: PDF
Abstract: The Voronoi Covariance Measure of a compact set K of R^d is a tensor-valued measure that encodes geometric information on K and which is known to be resilient to Hausdorff noise but sensitive to outliers. In this article, we generalize this notion to any distance-like function delta and define the delta-VCM. We show that the delta-VCM is resilient to Hausdorff noise and to outliers, thus providing a tool to estimate robustly normals from a point cloud approximation. We present experiments showing the robustness of our approach for normal and curvature estimation and sharp feature detection.

at August 27, 2014 12:42 AM UTC

Authors: Gregory Kucherov, Laurent Noé, Mikhail Roytberg
Download: PDF
Abstract: We study the pattern matching automaton introduced in (A unifying framework for seed sensitivity and its application to subset seeds) for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained by applying the Aho-Corasick construction. We study properties of this automaton and present an efficient implementation of the automaton construction. We also present some experimental results and show that this automaton can be successfully applied to more general situations.

at August 27, 2014 12:41 AM UTC

Authors: Mingyu Xiao, Hiroshi Nagamochi
Download: PDF
Abstract: A dominating induced matching, also called an efficient edge domination, of a graph $G=(V,E)$ with $n=|V|$ vertices and $m=|E|$ edges is a subset $F \subseteq E$ of edges in the graph such that no two edges in $F$ share a common endpoint and each edge in $E\setminus F$ is incident with exactly one edge in $F$. It is NP-hard to decide whether a graph admits a dominating induced matching or not. In this paper, we design a $1.1467^nn^{O(1)}$-time exact algorithm for this problem, improving all previous results. This problem can be redefined as a partition problem that is to partition the vertex set of a graph into two parts $I$ and $F$, where $I$ induces an independent set (a 0-regular graph) and $F$ induces a perfect matching (a 1-regular graph). After giving several structural properties of the problem, we show that the problem always contains some "good vertices", branching on which by including them to either $I$ or $F$ we can effectively reduce the graph. This leads to a fast exact algorithm to this problem.

at August 27, 2014 12:41 AM UTC

Authors: Maxim Babenko, Paweł Gawrychowski, Tomasz Kociumaka, Tatiana Starikovskaya
Download: PDF
Abstract: We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings.

Given a string of length n over an alphabet of size $\sigma\leq n$, our method builds the wavelet tree in $O(n \log \sigma/ \sqrt{\log{n}})$ time, improving upon the state-of-the-art algorithm by a factor of $\sqrt{\log n}$. As a consequence, given an array of n integers we can construct in $O(n \sqrt{\log n})$ time a data structure consisting of $O(n)$ machine words and capable of answering rank/select queries for the subranges of the array in $O(\log n / \log \log n)$ time. This is a $\log \log n$-factor improvement in query time compared to Chan and P\u{a}tra\c{s}cu and a $\sqrt{\log n}$-factor improvement in construction time compared to Brodal et al.

Next, we switch to stringological context and propose a novel notion of wavelet suffix trees. For a string w of length n, this data structure occupies $O(n)$ words, takes $O(n \sqrt{\log n})$ time to construct, and captures the combinatorial structure of substrings of w simultaneously enabling efficient top-down traversal and binary search. In particular, with a wavelet suffix tree we are able to answer in $O(\log |x|)$ time the following two natural analogues of rank/select queries for suffixes of substrings: for substrings x and y of w count the number of suffixes of x that are lexicographically smaller than y, for a substring x of w and an integer k, find the k-th lexicographically smallest suffix of x.

We further show that wavelet suffix trees allow to compute a run-length-encoded Burrows-Wheeler transform of a substring x of w (again, given by its endpoints) in $O(s \log |x|)$ time, where s denotes the length of the resulting run-length encoding. This answers a question by Cormode and Muthukrishnan, who considered an analogous problem for Lempel-Ziv compression.

at August 27, 2014 12:41 AM UTC

Authors: Marek Cygan, Tomasz Kociumaka
Download: PDF
Abstract: In the Upper Degree-Constrained Partial Orientation problem we are given an undirected graph $G=(V,E)$, together with two degree constraint functions $d^-,d^+ : V \to \mathbb{N}$. The goal is to orient as many edges as possible, in such a way that for each vertex $v \in V$ the number of arcs entering $v$ is at most $d^-(v)$, whereas the number of arcs leaving $v$ is at most $d^+(v)$. This problem was introduced by Gabow [SODA'06] and proved to be MAXSNP-hard. In the same paper Gabow presented an LP-based iterative rounding 4/3-approximation algorithm.

Since the problem in question is a special case of the classic 3-Dimensional Matching, which in turn is a special case of the k-Set Packing problem, it is reasonable to ask whether recent improvements in approximation algorithms for the latter two problems [Cygan, FOCS'13; Sviridenko & Ward, ICALP'13] allow for an improved approximation for Upper Degree Constrained Partial Orientation. We follow this line of reasoning and present a polynomial-time local search algorithm with approximation ratio $5/4+\varepsilon$. Our algorithm uses a combination of two types of rules, that is improving sets of bounded pathwidth from the recent $4/3+\varepsilon$-approximation algorithm for 3-Set Packing [Cygan, FOCS'13], together with a simple rule tailor-made for the setting of partial orientations. In particular we exploit the fact that one can check in polynomial time whether it is possible to orient all the edges of a given graph [Gy\'arf\'as & Frank, Combinatorics'76].

at August 27, 2014 12:41 AM UTC

Authors: Jan Christoph Athenstädt, Tanja Hartmann, Martin Nöllenburg
Download: PDF
Abstract: We study the simultaneous embeddability of a pair of partitions of the same underlying set into disjoint blocks. Each element of the set is mapped to a point in the plane and each block of either of the two partitions is mapped to a region that contains exactly those points that belong to the elements in the block and that is bounded by a simple closed curve. We establish three main classes of simultaneous embeddability (weak, strong, and full embeddability) that differ by increasingly strict well-formedness conditions on how different block regions are allowed to intersect. We show that these simultaneous embeddability classes are closely related to different planarity concepts of hypergraphs. For each embeddability class we give a full characterization. We show that (i) every pair of partitions has a weak simultaneous embedding, (ii) it is NP-complete to decide the existence of a strong simultaneous embedding, and (iii) the existence of a full simultaneous embedding can be tested in linear time.

at August 27, 2014 12:42 AM UTC

Authors: Minming Li, Frances F. Yao, Hao Yuan
Download: PDF
Abstract: Dynamic Voltage Scaling techniques allow the processor to set its speed dynamically in order to reduce energy consumption. In the continuous model, the processor can run at any speed, while in the discrete model, the processor can only run at finite number of speeds given as input. The current best algorithm for computing the optimal schedules for the continuous model runs at $O(n^2\log n)$ time for scheduling $n$ jobs. In this paper, we improve the running time to $O(n^2)$ by speeding up the calculation of s-schedules using a more refined data structure. For the discrete model, we improve the computation of the optimal schedule from the current best $O(dn\log n)$ to $O(n\log \max\{d,n\})$ where $d$ is the number of allowed speeds.

at August 27, 2014 12:41 AM UTC

Authors: Ilaria De Crescenzo, Salvatore La Torre, Yaron Velner
Download: PDF
Abstract: Games on recursive game graphs can be used to reason about the control flow of sequential programs with recursion. In games over recursive game graphs, the most natural notion of strategy is the modular strategy, i.e., a strategy that is local to a module and is oblivious to previous module invocations, and thus does not depend on the context of invocation. In this work, we study for the first time modular strategies with respect to winning conditions that can be expressed by a pushdown automaton.

We show that such games are undecidable in general, and become decidable for visibly pushdown automata specifications.

Our solution relies on a reduction to modular games with finite-state automata winning conditions, which are known in the literature.

We carefully characterize the computational complexity of the considered decision problem. In particular, we show that modular games with a universal Büchi or co-Büchi visibly pushdown winning condition are EXPTIME-complete, and when the winning condition is given by a CARET or NWTL temporal logic formula the problem is 2EXPTIME-complete, and it remains 2EXPTIME-hard even for simple fragments of these logics.

As a further contribution, we present a different solution for modular games with finite-state automata winning condition that runs faster than known solutions for large specifications and many exits.

at August 27, 2014 12:40 AM UTC

Authors: Constantin Enea, Peter Habermehl, Omar Inverso, Gennaro Parlato
Download: PDF
Abstract: We consider the feasibility problem of integer linear programming (ILP). We show that solutions of any ILP instance can be naturally represented by an FO-definable class of graphs. For each solution there may be many graphs representing it. However, one of these graphs is of path-width at most 2n, where n is the number of variables in the instance. Since FO is decidable on graphs of bounded path-width, we obtain an alternative decidability result for ILP. The technique we use underlines a common principle to prove decidability which has previously been employed for automata with auxiliary storage. We also show how this new result links to automata theory and program verification.

at August 27, 2014 12:40 AM UTC

Authors: Amir M. Ben-Amram
Download: PDF
Abstract: Finding whether a linear-constraint loop has a linear ranking function is an important key to understanding the loop behavior, proving its termination and establishing iteration bounds. If no preconditions are provided, the decision problem is known to be in coNP when variables range over the integers and in PTIME for the rational numbers, or real numbers. Here we show that deciding whether a linear-constraint loop with a precondition, specifically with partially-specified input, has a linear ranking function is EXPSPACE-hard over the integers, and PSPACE-hard over the rationals. The precise complexity of these decision problems is yet unknown. The EXPSPACE lower bound is derived from the reachability problem for Petri nets (equivalently, Vector Addition Systems), and possibly indicates an even stronger lower bound (subject to open problems in VAS theory). The lower bound for the rationals follows from a novel simulation of Boolean programs. Lower bounds are also given for the problem of deciding if a linear ranking-function supported by a particular form of inductive invariant exists. For loops over integers, the problem is PSPACE-hard for convex polyhedral invariants and EXPSPACE-hard for downward-closed sets of natural numbers as invariants.

at August 27, 2014 12:40 AM UTC

Authors: Glencora Borradaile, David Eppstein, Pingan Zhu
Download: PDF
Abstract: We show that every graph has an induced pseudoforest of at least $n-m/4.5$ vertices, an induced partial 2-tree of at least $n-m/5$ vertices, and an induced planar subgraph of at least $n-m/5.2174$ vertices. These results are constructive, implying linear-time algorithms to find the respective induced subgraphs. We also show that the size of the largest $K_h$-minor-free graph in a given graph can sometimes be at most $n-m/6+o(m)$.

at August 27, 2014 12:00 AM UTC

Authors: Michael A. Bekos, Martin Gronemann, Michael Kaufmann, Robert Krug
Download: PDF
Abstract: In octilinear drawings of planar graphs, every edge is drawn as an alternating sequence of horizontal, vertical and diagonal ($45^\circ$) line-segments. In this paper, we study octilinear drawings of low edge complexity, i.e., with few bends per edge. A $k$-planar graph is a planar graph in which each vertex has degree less or equal to $k$. In particular, we prove that every 4-planar graph admits a planar octilinear drawing with at most one bend per edge on an integer grid of size $O(n^2) \times O(n)$. For 5-planar graphs, we prove that one bend per edge still suffices in order to construct planar octilinear drawings, but in super-polynomial area. However, for 6-planar graphs we give a class of graphs whose planar octilinear drawings require at least two bends per edge.

at August 27, 2014 12:41 AM UTC


Some algorithmic tricks were first invented in complexity theory


Andrey Kolmogorov, Fred Hennie, Richard Stearns, and Walter Savitch are all famous separately; but they have something in common. Read on, and see.

Today I wish to discuss some algorithmic tricks and show that they were initially used by complexity theorists, years before they were used by algorithm designers.

To steal a phrase: it’s computational complexity all the way down. Well not exactly. The situation is slightly more complex—a bad pun. The complexity theorists often invented a concept and used it in a narrow way, while later it was rediscovered and made a general notion. This is another example of the principle: The last to discover X often gets the credit for X. I note that the dictionary gets this wrong:


It's not the first but the last. For after the last gets the credit, it's no longer a "discovery." Let's look at three examples of this phenomenon.

Pointers

Who invented it? Andrey Kolmogorov in 1953.

Who really invented it? Harold Lawson in 1964.

Details: Kolmogorov did so much that one should not be surprised that he invented “pointers.” In 1953 he wrote a short paper, really an abstract, “To the Definition of an Algorithm.” This later became a 26-page paper joint with the still-active Vladimir Uspensky. In that and the preceding decades, several researchers had advanced formal notions of an algorithm. Of course we now know they all are the same, in the sense they define the same class of functions. Whether one uses Turing machines, recursive functions, or lambda calculus—to name just a few—you get the same functions. This is an important point. In mathematics, confidence that a definition is right is often best seen by showing that there are many equivalent ways to define the concept.

Kolmogorov’s notion was similar to a Turing machine, but he allowed the “tape” to be an almost arbitrary graph. During the computation, in his model, the machine could add and change edges. How this and other models constitute a “Pointer Machine” is discussed in a survey by Amir ben-Amram. A beautiful explanation of Kolmogorov’s ideas is in the survey: “Algorithms: A Quest for Absolute Definitions,” by Andreas Blass and Yuri Gurevich. I quote from their paper:

The vertices of the graph correspond to Turing’s squares; each vertex has a color chosen from a fixed finite palette of vertex colors; one of the vertices is the current computation center. Each edge has a color chosen from a fixed finite palette of edge colors; distinct edges from the same node have different colors. The program has this form: replace the vicinity U of a fixed radius around the central node by a new vicinity W that depends on the isomorphism type of the digraph U with the colors and the distinguished central vertex. Contrary to Turing’s tape whose topology is fixed, Kolmogorov’s “tape” is reconfigurable.

Harold Lawson invented pointers as we know them today in 1964. He received the IEEE Computer Pioneer Award in 2000:

for inventing the pointer variable and introducing this concept into PL/I, thus providing for the first time, the capability to flexibly treat linked lists in a general-purpose high level language.

The idea that a pointer can be a variable was Kolmogorov's, and, as Ben-Amram notes, it is distinct from the fixed pointers in Lisp implementations; but making it a syntactic variable in a programming language is what made the difference.

Amortized Complexity

Who invented it? Fred Hennie and Richard Stearns in 1966.

Who really invented it? Robert Tarjan in 1985.

Details: Hennie and Stearns proved one of the basic results in complexity theory. Initially Turing machines had one tape that was used for input and for temporary storage. It was quickly realized that one could easily generalize this to have multiple tapes—some for input, others for temporary storage. This does not change the class of computable functions. That is stable under such changes. But it does change the time that such a machine takes to compute some task. What they proved is that the number of tapes was relatively unimportant provided there were at least two. More precisely they proved that a machine with {k} tapes that ran in time {T(n)} could be simulated in time {O(T(n)\log T(n))} by a machine with two tapes.

This result is very important and fundamental to the understanding of time complexity. The simulation is also very clever. Even for simulating three tapes by two, the issue is that the obvious way to do the simulation seems to require that the time increase from {T(n)} to order {T^{2}(n)}. But they show that the obvious method of making three (or more) tracks on one tape can be fixed to work so that the “average” simulation step takes order {\log T(n)}, using the second tape to move data. Note some simulation steps take a constant number of steps, and others take a huge number of steps. But the average is logarithmic and this proves the theorem. I have discussed this before—see here.

Bob Tarjan years later used the same fundamental idea in what is called amortized complexity. Bob's ideas, explained in his 1985 paper Amortized Computational Complexity, made a useful formal distinction within the concept of an operation being good on average. The distinction is whether the operation can possibly be bad all the time, in highly "unlucky" cases. If you are finding the successor of a node in a binary search tree, you might unluckily have to go all the way up the tree, and all the way back down on the next step. But the step after that will be just one edge, and overall, the entire inorder traversal of {n} nodes takes {2n-2} edge steps total, giving an average under {2}. It doesn't matter how unbalanced the tree is. The idea is used throughout algorithm design.
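Here is a small sketch of that successor example (my own illustration, with a hypothetical Node class): each call is charged for the tree edges it walks, and a full traversal from the minimum stays within the {2n-2} bound no matter how unbalanced the tree is.

class Node:
    def __init__(self, key):
        self.key, self.parent, self.left, self.right = key, None, None, None

def successor(x, steps):
    if x.right is not None:                   # go right once, then all the way left
        x = x.right; steps[0] += 1
        while x.left is not None:
            x = x.left; steps[0] += 1
        return x
    while x.parent is not None and x is x.parent.right:
        x = x.parent; steps[0] += 1           # climb while we are a right child
    if x.parent is not None:
        steps[0] += 1                         # final step up to the successor
    return x.parent

nodes = [Node(i) for i in range(10)]          # maximally unbalanced left path
for child, parent in zip(nodes, nodes[1:]):
    parent.left, child.parent = child, parent
steps, x = [0], nodes[0]                      # nodes[0] is the minimum
while x is not None:
    x = successor(x, steps)
print(steps[0], "edge steps for", len(nodes), "nodes")   # well under 2n - 2 = 18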

Meet in the Middle

Who invented it? Walter Savitch in 1965.

Who really invented it? Whitfield Diffie and Martin Hellman in 1977.

Details: The most famous example in computational complexity is certainly Savitch’s brilliant result—still the best known—showing that nondeterministic space {\log n} is contained in {O(\log^2 n)} space. His idea is to change a search problem for a path from {s} to {t} into two search problems. One searches from {s} and also searches from {t}. If there is some point {m} in common, then there is of course a path from {s} to {t}. If we insist on halving the allowance for the search each time we recurse, then we will succeed when—and only when—{m} is exactly midway between {s} and {t}. This insistence guarantees {O(\log n)} recursion height plus {O(\log n)} local storage for the neighborhood of the current node, giving {O(\log^2 n)} space overall. Ken likes to cite a “Modified Chinese Proverb” in his lectures:

A journey of a thousand miles has a step that is 500 miles from the beginning and 500 miles from the end.
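In code, the recursion looks roughly like the following reachability test (a sketch only; Savitch's actual construction works at the level of machine configurations). Each stack frame stores just the triple (s, t, k), which is what gives the {O(\log^2 n)} space bound when {k} starts at about {\log_2 n}.

def reach(adj, s, t, k):
    # Is there a path of length at most 2**k from s to t?  adj maps each
    # vertex to its set of neighbors.  Calling reach(adj, s, t, k) with
    # 2**k >= number of vertices answers s-t reachability.
    if s == t:
        return True
    if k == 0:
        return t in adj[s]                    # a single edge
    return any(reach(adj, s, m, k - 1) and reach(adj, m, t, k - 1)
               for m in adj)                  # try every possible midpoint m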

Diffie and Hellman in 1977 created the idea of using this method for an attack on a block cipher. Their attack largely deflates the idea that a composition of encryptions

\displaystyle  f_n\circ f_{n-1} \circ \cdots \circ f_2 \circ f_1(x),

where each {f_j} uses a different key {k_j}, can be substantially more secure than a single application of some {f_j}. The idea, explained in full by our friends at Wikipedia, is to guess an intermediate stage {m}, let {E_1,E_2} be the encodings up to and after stage {m}, and {D_1,D_2} the corresponding decodings. Plaintexts {P} and ciphertexts {C} are related by

\displaystyle  C = E_2(E_1(P)),\qquad P = D_1(D_2(C)).

If we had {n = 2}, we could try all single encoding keys {k_1} to get {X = E_1(P)}, and store results in a lookup table. Then we can try all decryption keys {k_2} to see if {D_2(C) = X}, and preserve pairs {(k_1,k_2)} that match. Then for each pair try other {(P',C')} and discard it if they don’t match, until one pair {(k_1,k_2)} survives. Already we have time bounded by the sum not the product of the numbers of keys, multiplied instead by the number of {(P,C)} trials. But the main point is that the time savings (at the expense of space for the table) are largely preserved in a recursion that guesses a breakpoint {m} and recurses on the supposition that {m} is at or near the midpoint, by an analysis similar to Savitch’s.
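As a toy illustration of the {n=2} case (my own sketch, with a deliberately weak invertible 16-bit "cipher" standing in for a real one), the following recovers the key pair with about {2\cdot 2^{16}} cipher evaluations plus a lookup table, rather than the {2^{32}} evaluations of brute force over key pairs.

MASK = 0xFFFF
INV = pow(40503, -1, 65536)        # 40503 is odd, hence invertible mod 2**16

def enc(k, b):                     # toy invertible 16-bit "cipher", not a real one
    return ((b ^ k) * 40503 + k) & MASK

def dec(k, c):
    return ((((c - k) & MASK) * INV) & MASK) ^ k

def meet_in_the_middle(pairs):
    (p0, c0), *rest = pairs
    table = {}
    for k1 in range(2**16):        # forward table of all X = E_1(P) values
        table.setdefault(enc(k1, p0), []).append(k1)
    survivors = []
    for k2 in range(2**16):        # meet in the middle: look up X = D_2(C)
        for k1 in table.get(dec(k2, c0), []):
            if all(enc(k2, enc(k1, p)) == c for p, c in rest):
                survivors.append((k1, k2))    # weed out accidental matches
    return survivors

pairs = [(p, enc(0xBEEF, enc(0xDEAD, p))) for p in (0x1234, 0x5678, 0x9ABC)]
print(meet_in_the_middle(pairs))   # contains the true key pair (0xDEAD, 0xBEEF)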

Open Problems

Okay, not all algorithmic ideas started in complexity theory. But a number of very important ones did. All three that we’ve covered in fact came from analyzing basic models of computation. Can you name some others?


by rjlipton at August 26, 2014 10:42 PM UTC

We show an exponential gap between communication complexity and information complexity for boolean functions, by giving an explicit example of a partial function with information complexity $\leq O(k)$, and distributional communication complexity $\geq 2^k$. This shows that a communication protocol for a partial boolean function cannot always be compressed to its internal information. By a result of Braverman, our gap is the largest possible. By a result of Braverman and Rao, our example shows a gap between communication complexity and amortized communication complexity, implying that a tight direct sum result for distributional communication complexity of boolean functions cannot hold, answering a long standing open problem. Our techniques build on [GKR14], that proved a similar result for relations with very long outputs (double exponentially long in $k$). In addition to the stronger result, the current work gives a simpler proof, benefiting from the short output length of boolean functions. Another (conceptual) contribution of our work is the relative discrepancy method, a new rectangle-based method for proving communication complexity lower bounds for boolean functions, powerful enough to separate information complexity and communication complexity.

at August 26, 2014 10:17 PM UTC

We recently put on arxiv a new draft on "Theoretical Foundations of Equitability and the Maximal Information Coefficient".  This is some follow-on work to a paper that appeared in Science a couple of years ago, where we introduced the idea of equitability.  Essentially, in that Science paper (link to page where you can access the paper), we wanted a statistic that would give back, for samples from a noisy functional relationship, a score corresponding to the amount of noise (or, in that case, to the R^2 of the noisy data relative to the relevant noiseless function), regardless of the relationship type.  The idea was that this would be useful in data exploration settings, where we might have a large number of possible relationship pairs and in particular a number of non-trivially correlated relationships, and we'd want to score them, in some fair way across the possible types of relationships (linear, parabolic, sinusoidal, etc.), so that we could choose the most promising to look at.  We also wanted the statistic to do reasonable things for non-functional relationships.  And, finally, we wanted a pony.  (But we couldn't find a way to put that in the paper.)  The maximal information coefficient (MIC), which we built on top of mutual information, was our proposed statistic.

The paper has gotten some interest.  One thing that we heard was that people wanted a richer theoretical framework for these ideas.  So now we're finally delivering one.  It took a while, because the students involved -- Yakir Reshef and David Reshef -- were off doing crazy, wacky young-people things like going to medical school, making it hard to get cycles for the project.  On the other hand, the time did some good, allowing us to explore to determine the formulation we wanted. The result is, I hope, an interesting mix of ideas from statistics and computer science.  We're eager for feedback as we hope to formally submit somewhere soon. 

In a couple of weeks we should have another paper out on the same topic that is more empirical.  Naturally, when working through the theory, we came up with better algorithms for computing MIC, and it made sense to separate those results (and some others) into another paper.

by Michael Mitzenmacher (noreply@blogger.com) at August 26, 2014 10:14 PM UTC

Authors: Hazem Torfah, Martin Zimmermann
Download: PDF
Abstract: We determine the complexity of counting models of bounded size of specifications expressed in Linear-time Temporal Logic. Counting word models is #P-complete, if the bound is given in unary, and as hard as counting accepting runs of nondeterministic polynomial space Turing machines, if the bound is given in binary. Counting tree models is as hard as counting accepting runs of nondeterministic exponential time Turing machines, if the bound is given in unary. For a binary encoding of the bound, the problem is at least as hard as counting accepting runs of nondeterministic exponential space Turing machines. On the other hand, it is not harder than counting accepting runs of nondeterministic doubly-exponential time Turing machines.

at August 26, 2014 12:40 AM UTC

Authors: Giacomo Parigi, Marco Piastra
Download: PDF
Abstract: In their recent article (2010), Levy and Liu introduced a generalization of Centroidal Voronoi Tessellation (CVT) - namely the Lp-CVT - that allows the computation of an anisotropic CVT over a sound mathematical framework. In this article a new objective function is defined, and both this function and its gradient are derived in closed-form for surfaces and volumes. This method opens a wide range of possibilities, also described in the paper, such as quad-dominant surface remeshing, hex-dominant volume meshing or fully-automated capturing of sharp features. However, in the same paper, the derivations of the gradient and of the new objective function are only partially expanded in the appendices, and some relevant requisites on the anisotropy field are left implicit. In order to better harness the possibilities described there, in this work the entire derivation process is made explicit. In the authors' opinion, this also helps understanding the working conditions of the method and its possible applications.

at August 26, 2014 12:41 AM UTC

Authors: Dan He, Zhanyong Wang, Laxmi Parida, Eleazar Eskin
Download: PDF
Abstract: Reconstruction of family trees, or pedigree reconstruction, for a group of individuals is a fundamental problem in genetics. The problem is known to be NP-hard even for datasets known to only contain siblings. Some recent methods have been developed to accurately and efficiently reconstruct pedigrees. These methods, however, still consider relatively simple pedigrees, for example, they are not able to handle half-sibling situations where a pair of individuals only share one parent. In this work, we propose an efficient method, IPED2, based on our previous work, which specifically targets reconstruction of complicated pedigrees that include half-siblings. We note that the presence of half-siblings makes the reconstruction problem significantly more challenging which is why previous methods exclude the possibility of half-siblings. We proposed a novel model as well as an efficient graph algorithm and experiments show that our algorithm achieves relatively accurate reconstruction. To our knowledge, this is the first method that is able to handle pedigree reconstruction based on genotype data only when half-sibling exists in any generation of the pedigree.

at August 26, 2014 12:41 AM UTC

Authors: Djamal Belazzougui
Download: PDF
Abstract: Assuming we are in a Word-RAM model with word size $w$, we show that we can construct in $o(w)$ time an error correcting code with a constant relative positive distance that maps numbers of $w$ bits into $\Theta(w)$-bit numbers, and such that the application of the error-correcting code on any given number $x\in[0,2^w-1]$ takes constant time. Our result improves on a previously proposed error-correcting code with the same properties whose construction time was exponential in $w$.

at August 26, 2014 12:41 AM UTC

Authors: Cristopher Moore, Alexander Russell
Download: PDF
Abstract: We establish a precise relationship between spherical harmonics and Fourier basis functions over a hypercube randomly embedded in the sphere. In particular, we give a bound on the expected Boolean noise sensitivity of a randomly rotated function in terms of its "spherical sensitivity," which we define according to its evolution under the spherical heat equation. As an application, we prove an average case of the Gotsman-Linial conjecture, bounding the sensitivity of polynomial threshold functions subjected to a random rotation.

at August 26, 2014 12:40 AM UTC

Authors: Igor Stassiy
Download: PDF
Abstract: In this master thesis we analyze the complexity of sorting a set of strings. It was shown that the complexity of sorting strings can be naturally expressed in terms of the prefix trie induced by the set of strings. The model of computation takes into account symbol comparisons and not just comparisons between the strings. The analysis of upper and lower bounds for some classical algorithms such as Quicksort and Mergesort in terms of such a model was shown. Here we extend the analysis to another classical algorithm - Heapsort. We also give analysis for the version of the algorithm that uses Binomial heaps as a heap implementation.

at August 26, 2014 12:00 AM UTC