Theory of Computing Blog Aggregator

The nymwars are the struggle to maintain online safe havens for pseudonymous free speech, for people who don't feel safe linking their opinions with their real names (for fear of religious persecution / sexual predation / current or future job prospects / whatever else) in the face of attempts by Facebook, Google, and others to force everyone to cross-link all their personal information. Soon after its launch in 2011, Google+ took a strong stand that only people willing to post under their real names would be welcome on the site, and (as one very minor consequence) I stopped posting there. Now, finally, Google+ has relented and will allow arbitrary user-chosen identities. They could have been more apologetic about it, but it's enough for me to return there.

I don't intend to change my Livejournal posting habits much, as I've been using my Google+ account for a somewhat different purpose: posting brief links to web sites that catch my attention and that I think might be of interest to my (mostly technical/academic) circles of contacts there. Here's a roundup of a dozen or so links I've posted so far (skipping the ones where I linked back to my LJ postings).

at August 01, 2014 04:02 AM UTC

And whose theorem is it anyway?

Screen Shot 2014-07-31 at 10.18.18 PM

Georg Cantor, Felix Bernstein, and Ernst Schröder are each famous for many things. But together they are famous for stating, trying to prove, and proving, a basic theorem about the cardinality of sets. Actually the first person to prove it was none of them. Cantor had stated it in 1887 in a sentence beginning, “One has the theorem that…,” without fanfare or proof. Richard Dedekind later that year wrote a proof—importantly one avoiding appeal to the axiom of choice (AC)—but neither published it nor told Cantor, and it wasn’t revealed until 1908. Then in 1895 Cantor deduced it from a statement he couldn’t prove that turned out to be equivalent to AC. The next year Schröder wrote a proof but it was wrong. Schröder found a correct proof in 1897, but Bernstein, then a 19 year old student in Cantor’s seminar, independently found a proof, and perhaps most important, Cantor himself communicated it to the International Congress of Mathematicians that year.

Today I want to go over proofs of this theorem that were written in the 1990’s not the 1890’s.

Often Cantor or Schröder gets left out when naming it, but never Bernstein, and Dedekind never gets included. Steve Homer and Alan Selman say “Cantor-Bernstein Theorem” in their textbook, so let’s abbreviate it CBT. The theorem states:

Theorem: If there is an injective map from a set {A} to {B} and also an injective map from {B} to {A}, then the sets {A} and {B} have the same cardinality.

Recall that a map {h: A \longrightarrow B} is injective provided {f(x)=f(y)} implies that {x=y}, surjective if its range is all of {B}, and bijective if it is both injective and surjective. We can thank the great nonexistent French mathematician Nicolas Bourbaki for standardizing these terms, noting that sur is French for “on.” The definition of {A} and {B} having the same cardinality is that there exists a map {h: A \longrightarrow B} that is bijective. All of this applies for sets of any cardinality, finite or infinite.

CBT and Graph Theory

The key insight for me is to think about CBT as a theorem about directed bipartite graphs. This insight is due to Gyula König. He was a set theorist, but his son became a famous graph theorist. So perhaps this explains the insight. The ideas which follow are related to proofs in a 1994 paper by John Conway and Peter Doyle, which is also summarized here and relates to this 2000 note by Berthold Schweizer. We mentioned Doyle recently here.

A directed bipartite graph has two sets of disjoint vertices the left side and the right side. All edges go only from a left vertex to a right or from a right to a left.

In a directed graph every vertex has an out-degree and an in-degree: the former is the number of edges leaving the vertex and the latter is number of edges that enter the vertex.

In order to study CBT via graph theory we need to restrict our attention to a special type of directed bipartite graph. Say {G} is an injective bipartite graph provided it is a directed bipartite graph with the following restrictions:

  1. The out-degree of all vertices is one;
  2. The in-degree of all vertices is at most one.

The claim is:

Theorem 2 Any injective directed bipartite graph has the same number of left and right vertices.

This theorem proves CBT. Let {f,g} be the injective maps from {A} and {B} as above, where we assume that {A} and {B} are disjoint. Then define {G} as a directed bipartite graph with {A} as the left vertices and {B} as the right. There is an edge from {x \in A} to {f(x) \in B} and an edge from {y \in B} to {g(y) \in A}. The graph is injective: property (1) follows since both {f} and {g} are functions, and property (2) follows since both {f} and {g} are injective.

Basic Properties

From now on, let {G} be an injective directed bipartite graph with left vertices {A} and right vertices {B}. Note that every path in {G} must alternate left and right vertices, because {G} is bipartite.

The key concept is that of a maximal path. A maximal path in {G} is a path that cannot be extended on either end. One simple example of a maximal path is a cycle:

\displaystyle  x_{1} \rightarrow \cdots \rightarrow x_{m} \rightarrow x_{1}.

However, in an infinite graph there can be other maximal paths. One is a two-way infinite path

\displaystyle  \cdots \rightarrow x_{-m} \rightarrow \cdots \rightarrow x_{-1} \rightarrow x_{0} \rightarrow x_{1} \rightarrow \cdots

And the other is a one-way infinite path

\displaystyle  x_{0} \rightarrow x_{1} \rightarrow \cdots

Here there is no edge into {x_{0}}. A simple but important observation is that two distinct maximal paths have no vertices in common.

A final basic observation is the following: Suppose that {A} is partitioned into sets

\displaystyle  A_{1} \cup A_{2} \cup \dots

and {B} is also partitioned into

\displaystyle  B_{1} \cup B_{2} \cup \dots

If for each index {i}, there is a bijection from {A_{i}} to {B_{i}}, then {A} and {B} have the same cardinality. Let’s call this the partition trick.

I am trying to include even the simplest idea of the proof. Is this helpful, or am I being too detailed? You can skip the easy parts, but my experience is that people sometimes get hung up on the most basic steps of a proof. This is why I am including all the details.

Finite Graphs

Let’s prove the theorem for the case when the graph is finite.

Theorem 3 An injective directed finite bipartite graph has the same number of left and right vertices.

We had better be able to do this. We claim the following decomposition fact: The graph {G} is the union of disjoint cycles. This follows since the only maximal paths in {G} are cycles—this uses that {G} is finite.

This proves the theorem, since each cycle has the same number of left vertices as right, and therefore each cycle has a bijection from left to right. By the partition trick the theorem is proved.

So far, pretty easy.

Infinite Graphs

We now will prove the theorem in the case that {G} is infinite. By the partition trick we need only show that there is a bijection on each maximal path. This is clear for cycles—it is the same as above even for a two-way infinite cycle: since it alternates left and right, we can just take {h} to be the map that defines the bijection. The case of a one-way infinite path is just barely harder. Let

\displaystyle  x_{0} \rightarrow x_{1} \rightarrow \cdots

be such a path. If {x_{0}} is a left node, then we define {h(x_0) = f(x_0) = x_1} and {h(x_1) = g^{-1}(x_1) = x_0} to make the bijection go back and forth over the first edge in the path, then {h(x_2) = f(x_2) = x_3} and {h(x_3) = g^{-1}(x_3) = x_2} similarly for the third edge, and so on. If {x_{0}} is a right vertex, we instead get {h(x_j) = g(x_j)} for even {j} and {h(x_j) = f^{-1}(x_j)} for odd {j}. Pretty easy too.

Some Puzzles About The Proof

Even thought the proof is about any sets of any cardinality—as large as you like—the proof employs paths that are either finite or countable in length. This seems a bit strange—no? I have wondered whether this ability to work only with such paths can be exploited in some manner. I do not see how to use this fact. Oh well.

The countable case {A,B \subset \mathbb{N}} matters most in computability and complexity theory. John Myhill proved that if {f} and {g} are computable, then so is some bijection {h}. This at first seems strange—how can you recognize you are in an infinite cycle?—but it works this way in finite stages {1,2,3,\dots}: At any odd stage, let {x} be the least number for which {h(x)} has not been defined. Let {y_1 = f(x)}. If {y_1} has not been listed as a value of {h} at a previous stage, put {h(x) = y_1}. Else, we have already handled some {x_1} such that {h(x_1) = y_1}. Then {x_1 \neq x} since we hadn’t handled {x}, and {f(x_1) \neq f(x)} since {f} is injective. So put {y_2 = f(x_1)}. If {y_2} has not already been listed as a value of {h}, then put {h(x) = y_2}. Else, we have already handled {x_2} such that {h(x_2) = y_2}, and we repeat with {y_3 = f(x_2)}

Eventually we must exhaust the finitely many strings previously placed into the range of {h}, and the first new string {y_i} becomes {h(x)}. Since {y_i} is a value of {f}, inductively we are also preserving the property {x \in A \iff h(x) \in B}. Next, however, we need to do an even stage. Then we call {z} the least number not already placed into the range of {h}, and consider {x_1 = g(z)}. Then {x_1 \in A \iff z \in B}. If {x_1} is not already in the domain of {h}, we define {h(x_1) = z}. Else, we have already handled some {z_1} such that {h(x_1) = z_1}, and inductively {z_1 \in B \iff x_1 \in A \iff z \in B}, so we repeat with {x_2 = g(z_1)}. Eventually we obtain {x_i} not already in the domain of {h}, and define {h(x_i) = z}. This entire process computably defines both {h} and {h^{-1}} in “zipper” fashion for any {x} or {z}, yielding a computable bijection of {\mathbb{N}} that induces an isomorphism between {A} and {B}.

In complexity theory, we want {h} to be polynomial-time computable if {f} and {g} are. This is true provided {f} and {g} are length increasing as well as polynomial-time invertible, but not by the same algorithm—given an {x} we can’t go back and do all previous stages in polynomial time. Instead, the algorithm found by Leonard Berman and Juris Hartmanis alternates trying

\displaystyle  g^{-1}(x),f^{-1}(g^{-1}(x)),g^{-1}(f^{-1}(g^{-1}(x))),\dots

as far as possible, which must stop within length-of-{x} steps because the length is always decreasing. If {g^{-1}} fails first then {h(x) = f(x)}, else {h(x) = g^{-1}(x)}.

This is the second puzzle: why do the ideas have to be so different? Is there a common formulation that might be used with other levels and kinds of complexity? The Berman-Hartmanis proof does resemble the one-way-infinite path case more than it does Myhill’s proof.

Open Problems

Does this help in understanding the proof? There are many proofs of CBT on the web, perhaps this is a better version. Take a look.

[fixed "domain"->"range" in one place in Myhill proof; worked around WordPress bug with length-of-x for |x|.]

by rjlipton at August 01, 2014 03:11 AM UTC

Shafi Goldwasser sends a reminder that the deadline to submit to the next innovations in theoretical computer science conference is next Friday. The conference will take place in January 2015 at the Weizmann Institute, with contingency plans to hold it in Boston in case the need for a relocation arises.

by luca at August 01, 2014 01:31 AM UTC

Authors: Sayan Bandyapadhyay, Aritra Banik, Sandip Das, Hirak Sarkar
Download: PDF
Abstract: \textit{Voronoi game} is a geometric model of competitive facility location problem played between two players. Users are generally modeled as points uniformly distributed on a given underlying space. Each player chooses a set of points in the underlying space to place their facilities. Each user avails service from its nearest facility. Service zone of a facility consists of the set of users which are closer to it than any other facility. Payoff of each player is defined by the quantity of users served by all of its facilities. The objective of each player is to maximize their respective payoff. In this paper we consider the two players {\it Voronoi game} where the underlying space is a road network modeled by a graph. In this framework we consider the problem of finding $k$ optimal facility locations of Player 2 given any placement of $m$ facilities by Player 1. Our main result is a dynamic programming based polynomial time algorithm for this problem on tree network. On the other hand, we show that the problem is strongly $\mathcal{NP}$-complete for graphs. This proves that finding a winning strategy of P2 is $\mathcal{NP}$-complete. Consequently, we design an $1-\frac{1}{e}$ factor approximation algorithm, where $e \approx 2.718$.

at August 01, 2014 12:40 AM UTC

Authors: Mathias Weller
Download: PDF
Abstract: Distance labeling is a preprocessing technique introduced by Peleg [Journal of Graph Theory, 33(3)] to speed up distance queries in large networks. Herein, each vertex receives a (short) label and, the distance between two vertices can be inferred from their two labels. One such preprocessing problem occurs in the hub labeling algorithm [Abraham et al., SODA'10]: the label of a vertex v is a set of vertices x (the "hubs") with their distance d(x,v) to v and the distance between any two vertices u and v is the sum of their distances to a common hub. The problem of assigning as few such hubs as possible was conjectured to be NP-hard, but no proof was known to date. We give a reduction from the well-known Vertex Cover problem on graphs to prove that finding an optimal hub labeling is indeed NP-hard.

at August 01, 2014 12:00 AM UTC

Authors: Sung-Hsien Hsieh, Chun-Shien Lu, Soo-Chang Pei
Download: PDF
Abstract: Fast Fourier Transform (FFT) is one of the most important tools in digital signal processing. FFT costs O(N log N) for transforming a signal of length N. Recently, researchers at MIT have proposed Sparse Fast Fourier Transform (sFFT) [1][2] as a breakthrough with computational complexity O(K log N) and O(K log N log N/K) for exactly K-sparse signal (with only K non-zero frequency grids) and generally K-sparse signal (with K significant frequency grids), respectively, to outperform classic FFT.

In this paper, a new sparse Fast Fourier Transform by downsampling in the time domain (sFFT-DT) is proposed, based on the assumption that the distribution of the non-zero frequency grids is uniform. The idea behind sFFT-DT is to downsample the original input signal at the beginning; then, subsequent processing operates under downsampled signals, where signal lengths are proportional to O(K). Downsampling, however, possibly leads to "aliasing". By the shift property of DFT, we recast the aliasing problem as a "moment-preserving problem (MPP)," which is solvable. We prove two theorems related to initializing the downsampling factors under different conditions to have computational complexity, O(K log K) and O(K^(5/4) log K). Moreover, for generally K-sparse signals, solutions to the MPP are inaccurate due to interference from non-significant frequency grids. We observe that significant frequency grids in aliasing are "sparse". This property is exploited, and a new sparse signal recovery algorithm in the context of compressive sensing is presented to refine the solution to MPP. The computational complexity still costs O(K log K) but requires a larger Big-O constant compared to the case of exactly K-sparse signals. We conduct theoretical complexity analyses and simulations to demonstrate that our method (sFFT-DT) outperforms classic FFT and MIT's sFFT.

at August 01, 2014 12:41 AM UTC

Authors: Moritz Kobitzsch, Samitha Samaranayake, Dennis Schieferdecker
Download: PDF
Abstract: Computing shortest paths is one of the most researched topics in algorithm engineering. Currently available algorithms compute shortest paths in mere fractions of a second on continental sized road networks. In the presence of unreliability, however, current algorithms fail to achieve results as impressive as for the static setting. In contrast to speed-up techniques for static route planning, current implementations for the stochastic on-time arrival problem require the computationally expensive step of solving convolution products. Running times can reach hours when considering large scale networks. We present a novel approach to reduce this immense computational effort of stochastic routing based on existing techniques for alternative routes. In an extensive experimental study, we show that the process of stochastic route planning can be speed-up immensely, without sacrificing much in terms of accuracy.

at August 01, 2014 12:41 AM UTC

Authors: Akitoshi Kawamura, Yusuke Kobayashi
Download: PDF
Abstract: Suppose we want to patrol a fence (line segment) using k mobile agents with given speeds v_1, ..., v_k so that every point on the fence is visited by an agent at least once in every unit time period. Czyzowicz et al. conjectured that the maximum length of the fence that can be patrolled is (v_1 + ... + v_k)/2, which is achieved by the simple strategy where each agent i moves back and forth in a segment of length v_i/2. We disprove this conjecture by a counterexample involving k = 6 agents. We also show that the conjecture is true for k = 2, 3.

at August 01, 2014 12:41 AM UTC

(True Story, but names are changed for no good reason.)

Alice, Bob, and Carol are betting on who will be the Republican Nominee in 2016.

  1. Alice bets on Jeb Bush. Alice's reasoning is that in the past the Republicans always end up with a moderate who they think can win. NOTE: From Alice's prediction you can tell NOTHING about her politics.
  2. Bob bets on Paul Ryan. Bob's reasons  that in the past the Republicans always end up with someone who they are familiar with, quite often someone who has run before. Normally this might be someone who ran in the primaries four years earlier; however, none of Herman Cain, Michelle Bachman, Rick Santorum, Newt Gingrich, seem likely. Rick Perry is possible but seems unlikely. That leaves Paul Ryan (Curiosity- people who lose, even if its close, do not run again. Not sure why. So it won't be Romney.) NOTE: From Bob's prediction you can tell NOTHING about his politics.
  3. Carol bets on Rand Paul. Carol's reasoning is that the American Public is tired of Big Government and is ready for the Rand Paul Revolution! NOTE: From Carol's prediction you can tell EVERYTHING about her politics.
Carol is letting her opinion drive her betting. Can someone take advantage of this? YES- if someone is betting from sentiment or emotion or conviction rather than from facts, that gives you an advantage in a bet. This does not always work, but its a good bet. Now watch, I'll lose my bet to Carol (NOTE- I am Bob).
One full dollar!

I think I read that you are better off betting AGAINST a triple crown if a horse won the first two legs for the same reason- there will be emotion driving the bet FOR, and that increases the payoff for those betting AGAINST. But of course nothing is guaranteed.  And there are some cases where its just ridiculous to vote against (Secretariat  in 1973); however, to use my system you can't just skip one since you ``just know'' that there will be a triple crown. Its a long term strategy. And one could be wrong.  That's why its called gambling!

Are there other cases where long term betting with facts and against sentiment can give you an edge?

by GASARCH ( at July 31, 2014 09:46 PM UTC

We initiate a study of super-perfect zero-knowledge proof systems. Loosely speaking, these are proof systems for which the interaction can be perfectly simulated in strict probabilistic polynomial-time. In contrast, the standard definition of perfect zero-knowledge only requires that the interaction can be perfectly simulated by a strict probabilistic polynomial-time that is allowed to fail with probability at most one half. We show that two types of perfect zero-knowledge proof systems can be transformed into super-perfect ones. The first type includes the perfect zero-knowledge interactive proof system for Graph Isomorphism and other systems of the same form, including perfect zero-knowledge arguments for NP. The second type refers to perfect non-interactive zero-knowledge proof systems. We also present a super-perfect non-interactive zero-knowledge proof system for the set of Blum integers.

at July 31, 2014 07:03 AM UTC

Authors: Fidaa Abed, Ioannis Caragiannis, Alexandros A. Voudouris
Download: PDF
Abstract: We study the asymmetric binary matrix partition problem that was recently introduced by Alon et al. (WINE 2013) to model the impact of asymmetric information on the revenue of the seller in take-it-or-leave-it sales. Instances of the problem consist of an $n \times m$ binary matrix $A$ and a probability distribution over its columns. A partition scheme $B=(B_1,...,B_n)$ consists of a partition $B_i$ for each row $i$ of $A$. The partition $B_i$ acts as a smoothing operator on row $i$ that distributes the expected value of each partition subset proportionally to all its entries. Given a scheme $B$ that induces a smooth matrix $A^B$, the partition value is the expected maximum column entry of $A^B$. The objective is to find a partition scheme such that the resulting partition value is maximized. We present a $9/10$-approximation algorithm for the case where the probability distribution is uniform and a $(1-1/e)$-approximation algorithm for non-uniform distributions, significantly improving results of Alon et al. Although our first algorithm is combinatorial (and very simple), the analysis is based on linear programming and duality arguments. In our second result we exploit a nice relation of the problem to submodular welfare maximization.

at July 31, 2014 12:40 AM UTC

Authors: Julian Shun
Download: PDF
Abstract: Wavelet trees have received significant attention due to their applications in compressed data structures. We present several work-efficient parallel algorithms for wavelet tree construction that have polylogarithmic span, improving upon the linear span of the recent parallel algorithms by Fuentes-Sepulveda et al. We experimentally show that on 40 cores our algorithms outperform the existing parallel algorithms by 2--12x and achieve up to 27x speedup over the sequential algorithm on a variety of real-world and artificial inputs. Our algorithms show good scalability with both increasing input size and increasing alphabet size. We also discuss how to extend our algorithms to variants of the standard wavelet tree.

at July 31, 2014 12:41 AM UTC

Authors: Alejandro Edera, Yanela Strappa, Facundo Bromberg
Download: PDF
Abstract: Markov networks are models for compactly representing complex probability distributions. They are composed by a structure and a set of numerical weights. The structure qualitatively describes independences in the distribution, which can be exploited to factorize the distribution into a set of compact functions. A key application for learning structures from data is to automatically discover knowledge. In practice, structure learning algorithms focused on "knowledge discovery" present a limitation: they use a coarse-grained representation of the structure. As a result, this representation cannot describe context-specific independences. Very recently, an algorithm called CSPC was designed to overcome this limitation, but it has a high computational complexity. This work tries to mitigate this downside presenting CSGS, an algorithm that uses the Grow-Shrink strategy for reducing unnecessary computations. On an empirical evaluation, the structures learned by CSGS achieve competitive accuracies and lower computational complexity with respect to those obtained by CSPC.

at July 31, 2014 12:41 AM UTC

Authors: Lin Chen, Nicole Megow, Benjamin Müller, Kevin Schewior
Download: PDF
Abstract: We consider the online resource minimization problem in which jobs with hard deadlines arrive online over time at their release date. The task is to determine a feasible schedule on a minimum number of machines.

Our main result is a general O(log n)-competitive algorithm for the preemptive online problem. This is the first improvement on a O(log(p_max/p_min))-competitive algorithm that Phillips et al. (STOC 1997) gave for a semi-online variant in which the optimal number of machines is known in advance. Our algorithm maintains a dynamic partition of the job set and schedules each (temporal) subset individually on separate sets of machines. The key is a characterization of how the decrease in the relative laxity of jobs influences the optimum number of machines. To achieve this we derive a compact expression of the optimum value, which might be of independent interest. We complement our main algorithmic result by showing lower bounds that rule out that other known algorithms may yield a similar performance guarantee.

We also present a rigorous study of interesting restricted problem variants and show small constant competitive ratios. Here, we include the non-preemptive problem for which in general strong lower bounds show that no online algorithm admits a competitive ratio sublinear in n. We investigate instances in which the feasible time intervals for jobs form a proper interval graph and instances in which either the processing times or the deadlines of all jobs are equal. Furthermore, we quantify a tradeoff between the minimum relative laxity of jobs and an achievable competitive ratio.

at July 31, 2014 12:40 AM UTC

Authors: Roi Blanco, Paolo Boldi, Andrea Marino
Download: PDF
Abstract: Entity-linking is a natural-language-processing task that consists in identifying the entities mentioned in a piece of text, linking each to an appropriate item in some knowledge base; when the knowledge base is Wikipedia, the problem comes to be known as wikification (in this case, items are wikipedia articles). One instance of entity-linking can be formalized as an optimization problem on the underlying concept graph, where the quantity to be optimized is the average distance between chosen items. Inspired by this application, we define a new graph problem which is a natural variant of the Maximum Capacity Representative Set. We prove that our problem is NP-hard for general graphs; nonetheless, under some restrictive assumptions, it turns out to be solvable in linear time. For the general case, we propose two heuristics: one tries to enforce the above assumptions and another one is based on the notion of hitting distance; we show experimentally how these approaches perform with respect to some baselines on a real-world dataset.

at July 31, 2014 12:41 AM UTC

Authors: Luis Barba, Pat Morin
Download: PDF
Abstract: We describe todolists (top-down skiplists), a variant of skiplists (Pugh 1990) that can execute searches using at most $\log_{2-\varepsilon} n + O(1)$ binary comparisons per search and that have amortized update time $O(\varepsilon^{-1}\log n)$. A variant of todolists, called working-todolists, can execute a search for any element $x$ using $\log_{2-\varepsilon} w(x) + o(\log w(x))$ binary comparisons and have amortized search time $O(\varepsilon^{-1}\log w(w))$. Here, $w(x)$ is the "working-set number" of $x$. No previous data structure is known to achieve a bound better than $4\log_2 w(x)$ comparisons. We show through experiments that, if implemented carefully, todolists are comparable to other common dictionary implementations in terms of insertion times and outperform them in terms of search times.

at July 31, 2014 12:41 AM UTC

Authors: Cameron T. Chalk, Dominic A. Fernandez, Alejandro Huerta, Mario A. Maldonado, Robert T. Schweller, Leslie Sweet
Download: PDF
Abstract: In this paper we consider the problem of the strict self-assembly of infinite fractals within tile self-assembly. In particular, we provide tile assembly algorithms for the assembly of the discrete Sierpinski triangle and the discrete Sierpinski carpet within a class of models we term the \emph{$h$-handed assembly model} ($h$-HAM), which generalizes the 2-HAM to allow up to $h$ assemblies to combine in a single assembly step. Despite substantial consideration, no purely growth self-assembly model has yet been shown to strictly assemble an infinite fractal without significant modification to the fractal shape. In this paper we not only achieve this, but in the case of the Sierpinski carpet are able to achieve it within the 2-HAM, one of the most well studied tile assembly models in the literature. Our specific results are as follows. We provide a $6$-HAM construction for the Sierpinski triangle that works at scale factor 1, 30 tile types, and assembles the fractal in a \emph{near perfect} way in which all intermediate assemblies are finite-sized iterations of the recursive fractal. We further assemble the Sierpinski triangle within the $3$-HAM at scale factor 3 and 990 tile types. For the Sierpinski carpet, we present a $2$-HAM result that works at scale factor 3 and uses 1,216 tile types. We further include analysis showing that the aTAM is incapable of strictly assembling the non-tree Sierpinski triangle considered in this paper, and that multiple hands are needed for the near-perfect assembly of the Sierpinski triangle and the Sierpinski carpet.

at July 31, 2014 12:41 AM UTC

Authors: Grigory Yaroslavtsev
Download: PDF
Abstract: We give new sublinear and parallel algorithms for the extensively studied problem of approximating n-variable r-CSPs (constraint satisfaction problems with constraints of arity r up to an additive error. The running time of our algorithms is O(n/\epsilon^2) + 2^O(1/\epsilon^2) for Boolean r-CSPs and O(k^4 n / \epsilon^2) + 2^O(log k / \epsilon^2) for r-CSPs with constraints on variables over an alphabet of size k. For any constant k this gives optimal dependence on n in the running time unconditionally, while the exponent in the dependence on 1/\epsilon is polynomially close to the lower bound under the exponential-time hypothesis, which is 2^\Omega(\epsilon^(-1/2)).

For Max-Cut this gives an exponential improvement in dependence on 1/\epsilon compared to the sublinear algorithms of Goldreich, Goldwasser and Ron (JACM'98) and a linear speedup in n compared to the algorithms of Mathieu and Schudy (SODA'08). For the maximization version of k-Correlation Clustering problem our running time is O(k^4 n / \epsilon^2) + k^O(1/\epsilon^2), improving the previously best n k^{O(1/\epsilon^3 log k/\epsilon) by Guruswami and Giotis (SODA'06).

at July 31, 2014 12:40 AM UTC

Authors: Guy Even, Moti Medina, Dana Ron
Download: PDF
Abstract: We present deterministic distributed algorithms for computing approximate maximum cardinality matchings and approximate maximum weight matchings. Our algorithm for the unweighted case computes a matching whose size is at least $(1-\eps)$ times the optimal in $\Delta^{O(1/\eps)} + O\left(\frac{1}{\eps^2}\right) \cdot\log^*(n)$ rounds where $n$ is the number of vertices in the graph and $\Delta$ is the maximum degree. Our algorithm for the edge-weighted case computes a matching whose weight is at least $(1-\eps)$ times the optimal in $\log(\min\{1/\wmin,n/\eps\})^{O(1/\eps)}\cdot(\Delta^{O(1/\eps)}+\log^*(n))$ rounds for edge-weights in $[\wmin,1]$.

The best previous algorithms for both the unweighted case and the weighted case are by Lotker, Patt-Shamir, and Pettie~(SPAA 2008). For the unweighted case they give a randomized $(1-\eps)$-approximation algorithm that runs in $O((\log(n)) /\eps^3)$ rounds. For the weighted case they give a randomized $(1/2-\eps)$-approximation algorithm that runs in $O(\log(\eps^{-1}) \cdot \log(n))$ rounds. Hence, our results improve on the previous ones when the parameters $\Delta$, $\eps$ and $\wmin$ are constants (where we reduce the number of runs from $O(\log(n))$ to $O(\log^*(n))$), and more generally when $\Delta$, $1/\eps$ and $1/\wmin$ are sufficiently slowly increasing functions of $n$. Moreover, our algorithms are deterministic rather than randomized.

at July 31, 2014 12:41 AM UTC

Authors: Martin Dyer, Leslie Ann Goldberg, David Richerby
Download: PDF
Abstract: Given a symmetric matrix $M\in \{0,1,*\}^{D\times D}$, an $M$-partition of a graph $G$ is a function from $V(G)$ to $D$ such that no edge of $G$ is mapped to a $0$ of $M$ and no non-edge to a $1$. We give a computer-assisted proof that, when $|D|=4$, the problem of counting the $M$-partitions of an input graph is either in FP or is #P-complete. Tractability is proved by reduction to the related problem of counting list $M$-partitions; intractability is shown using a gadget construction and interpolation. We use a computer program to determine which of the two cases holds for all but a small number of matrices, which we resolve manually to establish the dichotomy. We conjecture that that the dichotomy also holds for $|D|>4$. More specifically, we conjecture that, for any symmetric matrix $M\in\{0,1,*\}^{D\times D}$, the complexity of counting $M$-partitions is the same as the related problem of counting list $M$-partitions.

at July 30, 2014 12:40 AM UTC

Authors: Ryan O'Donnell
Download: PDF
Abstract: We describe a web of connections between the following topics: the mathematical theory of voting and social choice; the computational complexity of the Maximum Cut problem; the Gaussian Isoperimetric Inequality and Borell's generalization thereof; the Hypercontractive Inequality of Bonami; and, the analysis of Boolean functions. A major theme is the technique of reducing inequalities about Gaussian functions to inequalities about Boolean functions f : {-1,1}^n -> {-1,1}, and then using induction on n to further reduce to inequalities about functions f : {-1,1} -> {-1,1}. We especially highlight De, Mossel, and Neeman's recent use of this technique to prove the Majority Is Stablest Theorem and Borell's Isoperimetric Inequality simultaneously.

at July 30, 2014 12:40 AM UTC

Authors: Amey Bhangale, Swastik Kopparty, Sushant Sachdeva
Download: PDF
Abstract: Given $k$ collections of 2SAT clauses on the same set of variables $V$, can we find one assignment that satisfies a large fraction of clauses from each collection? We consider such simultaneous constraint satisfaction problems, and design the first nontrivial approximation algorithms in this context.

Our main result is that for every CSP $F$, for $k < \tilde{O}(\log^{1/4} n)$, there is a polynomial time constant factor Pareto approximation algorithm for $k$ simultaneous Max-$F$-CSP instances. Our methods are quite general, and we also use them to give an improved approximation factor for simultaneous Max-w-SAT (for $k <\tilde{O}(\log^{1/3} n)$). In contrast, for $k = \omega(\log n)$, no nonzero approximation factor for $k$ simultaneous Max-$F$-CSP instances can be achieved in polynomial time (assuming the Exponential Time Hypothesis).

These problems are a natural meeting point for the theory of constraint satisfaction problems and multiobjective optimization. We also suggest a number of interesting directions for future research.

at July 30, 2014 12:41 AM UTC

Authors: Rachel Cummings, Michael Kearns, Aaron Roth, Zhiwei Steven Wu
Download: PDF
Abstract: We study a very general class of games --- multi-dimensional aggregative games --- which in particular generalize both anonymous games and weighted congestion games. For any such game that is also large (meaning that the influence that any single player's action has on the utility of others is diminishing with the number of players in the game), we solve the equilibrium selection problem in a strong sense. In particular, we give an efficient weak mediator: an algorithm or mechanism which has only the power to listen to reported types and provide non-binding suggested actions, such that (a) it is an asymptotic Nash equilibrium for every player to truthfully report their type to the mediator, and then follow its suggested action; and (b) that when players do so, they end up coordinating on a particular asymptotic pure strategy Nash equilibrium of the induced complete information game. In fact, truthful reporting is an ex-post Nash equilibrium of the mediated game, so our solution applies even in settings of incomplete information, and even when player types are arbitrary or worst-case (i.e. not drawn from a common prior). We achieve this by giving an efficient differentially private algorithm for computing a Nash equilibrium in such games. We also give similar results for a related class of one-dimensional games with weaker conditions on the aggregation function, and apply our main results to a multi-dimensional market game.

Our results can be viewed as giving, for a rich class of games, a more robust version of the Revelation Principle, in that we work with weaker informational assumptions (no common prior), yet provide a stronger solution concept (Nash versus Bayes Nash equilibrium). Previously, similar results were only known for the special case of unweighted congestion games. In the process, we derive several algorithmic results that are of independent interest.

at July 30, 2014 12:41 AM UTC

Authors: Dragan Bošnački, Stefan Edelkamp, Alberto Lluch Lafuente, Anton Wijs
Download: PDF
Abstract: These are the proceedings of the Third Workshop on GRAPH Inspection and Traversal Engineering (GRAPHITE 2014), which took place on April 5, 2014 in Grenoble, France, as a satellite event of the 17th European Joint Conferences on Theory and Practice of Software (ETAPS 2014).

The aim of GRAPHITE is to foster the convergence on research interests from several communities dealing with graph analysis in all its forms in computer science, with a particular attention to software development and analysis. Graphs are used to represent data and processes in many application areas, and they are subjected to various computational algorithms in order to analyze them. Just restricting the attention to the analysis of software, graph analysis algorithms are used, for instance, to verify properties using model checking techniques that explore the system's state space graph or static analysis techniques based on control flow graphs. Further application domains include games, planning, and network analysis. Very often, graph problems and their algorithmic solutions have common characteristics, independent of their application domain. The goal of this event is to gather scientists from different communities, who do research on graph analysis algorithms, such that awareness of each others' work is increased. More information can be found at this http URL

at July 30, 2014 12:40 AM UTC

Authors: Evripidis Bampis, Dimitrios Letsios, Giorgio Lucarelli
Download: PDF
Abstract: We revisit the non-preemptive speed-scaling problem, in which a set of jobs have to be executed on a single or a set of parallel speed-scalable processor(s) between their release dates and deadlines so that the energy consumption to be minimized. We adopt the speed-scaling mechanism first introduced in [Yao et al., FOCS 1995] according to which the power dissipated is a convex function of the processor's speed. Intuitively, the higher is the speed of a processor, the higher is the energy consumption. For the single-processor case, we improve the best known approximation algorithm by providing a $(1+\epsilon)^{\alpha}\tilde{B}_{\alpha}$-approximation algorithm, where $\tilde{B}_{\alpha}$ is a generalization of the Bell number. For the multiprocessor case, we present an approximation algorithm of ratio $\tilde{B}_{\alpha}((1+\epsilon)(1+\frac{w_{\max}}{w_{\min}}))^{\alpha}$ improving the best known result by a factor of $(\frac{5}{2})^{\alpha-1}(\frac{w_{\max}}{w_{\min}})^{\alpha}$. Notice that our result holds for the fully heterogeneous environment while the previous known result holds only in the more restricted case of parallel processors with identical power functions.

at July 30, 2014 12:41 AM UTC

Authors: Maciej Zielenkiewicz, Aleksy Schubert, Jacek Chrząszcz
Download: PDF
Abstract: In this work we analyze the multiply-exponential complexity classes for write-once Turing machines, i.e. machines that can write to a given tape cell at most once. We show that $k$-DExpWOSpace = $k$-DExpWOTime = $k$-ExpTime and the nondeterministic counterpart. For alternating machines we show that $k$-AExpWOTime = $k$-AExpTime = $k-1$-ExpSpace.

at July 30, 2014 12:40 AM UTC

Given a system of linear equations $Ax=b$ over the binary field $\mathbb{F}_2$ and an integer $t\ge 1$, we study the following three algorithmic problems: 1. Does $Ax=b$ have a solution of weight at most $t$? 2. Does $Ax=b$ have a solution of weight exactly $t$? 3. Does $Ax=b$ have a solution of weight at least $t$? We investigate the parameterized complexity of these problems with $t$ as parameter. A special aspect of our study is to show how the maximum multiplicity $k$ of variable occurrences in $Ax=b$ influences the complexity of the problem. We show a sharp dichotomy: for each $k\ge 3$ the first two problems are W[1]-hard (which strengthens and simplifies a result of Downey et al. [SIAM J. Comput. 29, 1999]). For $k=2$, the problems turn out to be intimately connected to well-studied matching problems and can be efficiently solved using matching algorithms.

at July 29, 2014 12:30 PM UTC

Authors: Robert F. Erbacher, Trent Jaeger, Nirupama Talele, Jason Teutsch
Download: PDF
Abstract: Motivated by an application in network security, we investigate the following "linear" case of Directed Mutlicut. Let $G$ be a directed graph which includes some distinguished vertices $t_1, \ldots, t_k$. What is the size of the smallest edge cut which eliminates all paths from $t_i$ to $t_j$ for all $i < j$? We show that this problem is fixed-parameter tractable when parametrized in the cutset size $p$ via an algorithm running in $O(4^p p n^4)$ time.

at July 29, 2014 12:41 AM UTC

Authors: Vasileios Iliopoulos
Download: PDF
Abstract: We analyse a generalisation of the Quicksort algorithm, where k uniformly at random chosen pivots are used for partitioning an array of n distinct keys. Specifically, the expected cost of this scheme is obtained, under the assumption of linearity of the cost needed for the partition process. The integration constants of the expected cost are computed using Vandermonde matrices.

at July 29, 2014 12:41 AM UTC

Authors: Michał Karpiński
Download: PDF
Abstract: In this paper we study a problem of vertex two-coloring of undirected graph such that there is no monochromatic cycle of given length. We show that this problem is hard to solve. We give a proof by presenting a reduction from variation of satisfiability (SAT) problem. We show nice properties of coloring cliques with two colors which plays pivotal role in the reduction construction.

at July 29, 2014 12:00 AM UTC

Authors: Shenggen Zheng, Daowen Qiu
Download: PDF
Abstract: State complexity of quantum finite automata is one of the interesting topics in studying the power of quantum finite automata. It is therefore of importance to develop general methods how to show state succinctness results for quantum finite automata. One such method is presented and demonstrated in this paper. In particular, we show that state succinctness results can be derived out of query complexity results.

at July 29, 2014 12:40 AM UTC

Authors: Kareem Amin, Rachel Cummings, Lili Dworkin, Michael Kearns, Aaron Roth
Download: PDF
Abstract: We consider the problem of learning from revealed preferences in an online setting. In our framework, each period a consumer buys an optimal bundle of goods from a merchant according to her (linear) utility function and current prices, subject to a budget constraint. The merchant observes only the purchased goods, and seeks to adapt prices to optimize his profits. We give an efficient algorithm for the merchant's problem that consists of a learning phase in which the consumer's utility function is (perhaps partially) inferred, followed by a price optimization step. We also consider an alternative online learning algorithm for the setting where prices are set exogenously, but the merchant would still like to predict the bundle that will be bought by the consumer for purposes of inventory or supply chain management. In contrast with most prior work on the revealed preferences problem, we demonstrate that by making stronger assumptions on the form of utility functions, efficient algorithms for both learning and profit maximization are possible, even in adaptive, online settings.

at July 29, 2014 12:41 AM UTC

Authors: Eric Aaron, Danny Krizanc, Elliot Meyerson
Download: PDF
Abstract: We consider the Dynamic Map Visitation Problem (DMVP), in which a team of agents must visit a collection of critical locations as quickly as possible, in an environment that may change rapidly and unpredictably during the agents' navigation. We apply recent formulations of time-varying graphs (TVGs) to DMVP, shedding new light on the computational hierarchy $\mathcal{R} \supset \mathcal{B} \supset \mathcal{P}$ of TVG classes by analyzing them in the context of graph navigation. We provide hardness results for all three classes, and for several restricted topologies, we show a separation between the classes by showing severe inapproximability in $\mathcal{R}$, limited approximability in $\mathcal{B}$, and tractability in $\mathcal{P}$. We also give topologies in which DMVP in $\mathcal{R}$ is fixed parameter tractable, which may serve as a first step toward fully characterizing the features that make DMVP difficult.

at July 29, 2014 12:00 AM UTC

Authors: Keren Cohavi, Shahar Dobzinski
Download: PDF
Abstract: We present fast algorithms for sketching valuation functions. Let $N$ ($|N|=n$) be some ground set and $v:2^N\rightarrow \mathbb R$ be a function. We say that $\tilde v:2^N\rightarrow \mathbb R$ is an $\alpha$-sketch of $v$ if for every set $S$ we have that $\frac {v(S)} {\alpha} \leq \tilde v(S) \leq v(S)$ and $\tilde v$ can be described in $poly(n)$ bits.

Goemans et al. [SODA'09] showed that if $v$ is submodular then there exists an $\tilde O(\sqrt n)$-sketch that can be constructed using polynomially many value queries (this is the best possible, as Balcan and Harvey [STOC'11] show that no submodular function admit an $n^{\frac 1 3 - \epsilon}$-sketch). Based on their work, Balcan et al. [COLT'12] and Badanidiyuru et al. [SODA'12] show that if $v$ is subadditive then there exists an $\tilde O(\sqrt n)$-sketch that can be constructed using polynomially many demand queries. All previous sketches are based on complicated geometric constructions. The first step in their constructions is proving the existence of a good sketch by finding an ellipsoid that ``approximates'' $v$ well (this is done by applying John's theorem to ensure the existence of an ellipsoid that is ``close'' to the polymatroid that is associated with $v$). The second step is showing this ellipsoid can be found efficiently, and this is done by repeatedly solving a certain convex program to obtain better approximations of John's ellipsoid.

In this paper, we give a much simpler, non-geometric proof for the existence of good sketches, and utilize the proof to obtain much faster algorithms that match the previously obtained approximation bounds. Specifically, we provide an algorithm that finds $\tilde O(\sqrt n)$-sketch of a submodular function with only $\tilde O(n^\frac{3}{2})$ value queries, and an algorithm that finds $\tilde O(\sqrt n)$-sketch of a subadditive function with $O(n)$ demand and value queries.

at July 29, 2014 12:41 AM UTC

Authors: Jaroslaw Byrka, Krzysztof Sornat
Download: PDF
Abstract: We consider Approval Voting systems where each voter decides on a subset to candidates he/she approves. We focus on the optimization problem of finding the committee of fixed size k minimizing the maximal Hamming distance from a vote. In this paper we give a PTAS for this problem and hence resolve the open question raised by Carragianis et al. [AAAI'10]. The result is obtained by adapting the techniques developed by Li et al. [JACM'02] originally used for the less constrained Closest String problem. The technique relies on extracting information and structural properties of constant size subsets of votes.

at July 29, 2014 12:00 AM UTC

Authors: Arkadiusz Socala
Download: PDF
Abstract: We study the complexity of the Channel Assignment problem. A major open problem asks whether Channel Assignment admits an $O(c^n)$-time algorithm, for a constant $c$ independent of the weights on the edges. We answer this question in the negative i.e. we show that there is no $2^{o(n\log n)}$-time algorithm solving Channel Assignment unless the Exponential Time Hypothesis fails. Note that the currently best known algorithm works in time $O^*(n!) = 2^{O(n\log n)}$ so our lower bound is tight.

at July 29, 2014 12:41 AM UTC

Authors: N. R. Aravind, R. B. Sandeep, Naveen Sivadasan
Download: PDF
Abstract: For a set of graphs $\mathcal{H}$, the \textsc{$\mathcal{H}$-free Edge Deletion} problem asks to find whether there exist at most $k$ edges in the input graph whose deletion results in a graph without any induced copy of $H\in\mathcal{H}$. In \cite{cai1996fixed}, it is shown that the problem is fixed-parameter tractable if $\mathcal{H}$ is of finite cardinality. However, it is proved in \cite{cai2013incompressibility} that if $\mathcal{H}$ is a singleton set containing $H$, for a large class of $H$, there exists no polynomial kernel unless $coNP\subseteq NP/poly$. In this paper, we present a polynomial kernel for this problem for any fixed finite set $\mathcal{H}$ of connected graphs and when the input graphs are of bounded degree. We note that there are \textsc{$\mathcal{H}$-free Edge Deletion} problems which remain NP-complete even for the bounded degree input graphs, for example \textsc{Triangle-free Edge Deletion}\cite{brugmann2009generating} and \textsc{Custer Edge Deletion($P_3$-free Edge Deletion)}\cite{komusiewicz2011alternative}. When $\mathcal{H}$ contains $K_{1,s}$, we obtain a stronger result - a polynomial kernel for $K_t$-free input graphs (for any fixed $t> 2$). We note that for $s>9$, there is an incompressibility result for \textsc{$K_{1,s}$-free Edge Deletion} for general graphs \cite{cai2012polynomial}. Our result provides first polynomial kernels for \textsc{Claw-free Edge Deletion} and \textsc{Line Edge Deletion} for $K_t$-free input graphs which are NP-complete even for $K_4$-free graphs\cite{yannakakis1981edge} and were raised as open problems in \cite{cai2013incompressibility,open2013worker}.

at July 29, 2014 12:41 AM UTC

Authors: Asish Mukhopadhyay, Puspal Bhabak
Download: PDF
Abstract: Phylogenetic trees are leaf-labelled trees, where the leaves correspond to extant species (taxa), and the internal vertices represent ancestral species. The evolutionary history of a set of species can be explained by more than one phylogenetic tree, giving rise to the problem of comparing phylogenetic trees for similarity. Various distance metrics, like the subtree prune-and-regraft (SPR), tree bisection reconnection (TBR) and nearest neighbour interchange (NNI) have been proposed to capture this similarity. The distance between two phylogenetic trees can also be measured by the size of a Maximum Agreement Forest (MAF) on these trees, as it has been shown that the rooted subtree prune-and-regraft distance is 1 less than the size of a MAF. Since computing a MAF of minimum size is an NP-hard problem, approximation algorithms are of interest. Recently, it has been shown that the MAF on k(>=2) trees can be approximated to within a factor of 8. In this paper, we improve this ratio to 3. For certain species, however, the evolutionary history is not completely tree-like. Due to reticulate evolution two gene trees, though related, appear different, making a phylogenetic network a more appropriate representation of reticulate evolution. A phylogenetic network contains hybrid nodes for the species evolved from two parents. The number of such nodes is its hybridization number. It has been shown that this number is 1 less than the size of a Maximum Acyclic Agreement Forest (MAAF). We show that the MAAF for k(>= 2) phylogenetic trees can be approximated to within a factor of 3.

at July 29, 2014 12:41 AM UTC

Authors: Ciaran McCreesh, Patrick Prosser
Download: PDF
Abstract: The maximum labelled clique problem is a variant of the maximum clique problem where edges in the graph are given labels, and we are not allowed to use more than a certain number of distinct labels in a solution. We introduce a new branch-and-bound algorithm for the problem, and explain how it may be parallelised. We evaluate an implementation on a set of benchmark instances, and show that it is consistently faster than previously published results, sometimes by four or five orders of magnitude.

at July 29, 2014 12:41 AM UTC

In a prior post (here) I discussed the change problem: given a dollar, how many ways can you make change using pennies, nickels, dimes, and quarters (and generalize to n cents). Scouring the web I found either programs for it, the answer   (242), and answers like `is the coefficient of x^100 in ... . What I didn't find is a  formula for n cents. So I derived one using recurrences and wrote this up with some other things about the change problem, and I posted it on arXiv. (I have updated the paper many times after comments I got. It is still where it was, but updated, here.)

I then got some very helpful emails pointing me to the vast math literature on this problem. I was not surprised there was such a literature, though I was surprised I hadn't found any of it searching the web.
The literature didn't have my formula, though it had several ways to derive it (some faster than mine).

Why isn't the formula out there? Why couldn't I find the literature?

  1. The formula falls into a space right between recreational and serious math. I use a recurrence  but not a generating function. Recreational math just wants to know the answer for 100, so a program suffices (or some clever by-hand recurrences, which are also in my paper). Serious math people are happy to show that a formula CAN be derived and to refine that in terms of how quickly the coefficients can be found.
  2. There is a gap between recreational and serious math. In particular- the serious people aren't talking to (or posting on websites to) the rec people.
  3. Different terminologies. I was looking for ``change problems'' and ``coin problems'' and things like that when I should have been looking for ``Sylvester Denumerant'' or ''Frobenius problem''
For some areas Google (and other tools) are still not as good as finding someone who knows about your area. My posting on the blog got some people who know stuff's attention, and my posting on arXiv got one more person (who knew A LOT!).  I'm glad they emailed me comments and references so I improved the paper and cited the literature properly. But this seems so haphazard! Google didn't work,  mathstack exchange and other similar sites didn't work (that is where I saw the Rec people post but not get a formula). Is it the case that no matter how good Google gets we'll still need to find people who know stuff? Thats fine if you can, but sometimes you can't.

by GASARCH ( at July 28, 2014 01:50 AM UTC