Theory of Computing Blog Aggregator

Sounds like the name of a pub in Britain? Well, I came home after a long day and looked at the theory world in blogs, and saw the FOCS09 accepted list had been ripped apart into geometry, game theory and I am sure many others (eg learning, complexity, approximation) in peoples' minds. But there is a hanging chad and I will take it. People ask me whether streaming research is active. I am happy to point to (at least) three streaming-like papers in the list, so yes, it is active.
  • Exact And Approximate Pattern Matching In The Streaming Model. Ely Porat and Benny Porat.
  • The Data Stream Space Complexity of Cascaded Norms. T.S. Jayram and David Woodruff.
  • Efficient sketches for Earth-Mover Distance, with applications. Alexandr Andoni, Khanh Do Ba, Piotr Indyk and David Woodruff.

by metoo (noreply@blogger.com) at July 02, 2009 11:50 PM UTC

Not Elsevier, this time. The rumor is that SAGE Publications, the corporate publisher of the journal Political Theory, have bypassed the journal's editorial board and unilaterally imposed a new editor. As one commenter (6/17 6:44 on the first thread below) states, "The idea that the editorship of the journal is to be determined directly by them, apparently with no formal consultation with members of the existing editorial community, is like the idea of a faculty search being run by a couple of corporate honchos from a University's Board of Trustees, without consultation with current members of the faculty of the relevant department."

See here, here, and here for discussion, but so far (despite a signed statement by one of the editorial board members) there's a lot more heat than light.

at July 02, 2009 10:21 PM UTC

As many others have noticed, the list of accepted papers for FOCS 2009 have been posted; a second list with abstracts is also available. No word yet on when we'll see all those marvelous 2-page summaries everyone was asked to submit. (Yes, when, not whether!)

Anyway here's a partial list of geometry/topology papers. Enjoy!

  • Kevin Buchin and Wolfgang Mulzer. Delaunay Triangulations in O(sort(n)) and Other Transdichotomous and Hereditary Algorithms in Computational Geometry.
    We present several results about Delaunay triangulations (DTs) and convex hulls in transdichotomous and hereditary settings: (i) the DT of a planar point set can be omputed in time O(sort(n)) on a word RAM, where sort(n) is the time to sort n numbers; (ii) if we know the ordering of a planar point set in x- and in y-direction, its DT can be found by a randomized algebraic computation tree of expected linear depth; (iii) given a universe U of points in the plane, we construct a data structure D for Delaunay queries: for any subset P of U, D can find the DT of P in time O(|P| log log |U|); (iv) given a universe U of three-dimensional points in general convex position, there is a data structure D for _convex hull queries: for any subset P of U, D can find the convex hull of P in time O(|P| (log log |U|)^2); (v) given a convex polytope in 3-space with n vertices which are colored with k > 2 colors, we can split it into the convex hulls of the individual color classes in time O(n (log log n)^2). The results (i)--(iii) generalize to higher dimensions. We need a wide range of techniques. Most prominently, we describe a reduction from nearest-neighbor graphs to DTs that relies on a new variant of randomized incremental constructions which allow dependent sampling.

  • Saugata Basu and Thierry Zell. Polynomial hierarchy, Betti numbers and a real analogue of Toda's Theorem.
    Toda proved in 1989 that the (discrete) polynomial time hierarchy, is contained in the class of languages that can be decided by a Turing machine in polynomial time given access to an oracle with the power to compute a function in the counting complexity class #P. This result which illustrates the power of counting is considered to be a seminal result in computational complexity theory. An analogous result in the complexity theory over the reals (in the sense of Blum-Shub-Smale real Turing machines) has been missing so far. In this paper we formulate and prove a real analogue of Toda's theorem. Unlike Toda's proof in the discrete case, which relied on sophisticated combinatorial arguments, our proof is topological in nature. As a consequence of our techniques we are also able to relate the computational hardness of two extremely well-studied problems in algorithmic semi-algebraic geometry -- namely the problem of deciding sentences in the first order theory of the reals with a constant number of quantifier alternations, and that of computing the Betti numbers of semi-algebraic sets. We obtain a polynomial time reduction of the compact version of the first problem to the second. This latter result might be of independent interest to researchers in algorithmic semi-algebraic geometry.

  • Adam Kalai, Shang-Hua Teng and Alex Samorodnitsky. Learning Decision Trees From Random Examples: a Smoothed Analysis.
    We consider a new model of learning from random examples, motivated by smoothed analysis. Like previous work, we assume that the distribution over unlabeled examples is a product distribution -- the individual bits are chosen independently. However, we consider "typical" product distributions. Formally, we assume that the parameters are chosen with some (constant) amount of randomness, for example from the cube [0.49,0.51]^n. In this model we show how to agnostically learn polynomially-sized decision trees from random examples alone.

    We also consider a second model, which is a specific distribution over unlabeled data from which we efficiently learn all O(log n)-depth decision trees. In particular, we consider a distribution where the data exhibits even a small amount of diversity, e.g., symmetrically-distributed data where the fraction of 1's in the attributes span a constant range such as [.49,.51]. In recent work, Arpe and Mossel (2008) efficiently learn Juntas of size O(log(n) log log(n)) in a related model.

  • Peyman Afshani, Lars Arge, and Kasper Dalgaard Larsen. Orthogonal Range Reporting in Three and Higher Dimensions.
    In orthogonal range reporting we are to preprocess N points in d dimensions such that the points inside an axis-aligned query box can be found efficiently. This is a fundamental problem in various fields, including spatial databases and computational geometry. In this paper we provide a number of improvements for three and higher dimensional orthogonal range reporting: In the pointer machine model, we improve all the best previous results, some of which have not seen any improvements in almost two decades. In the I/O-model, we give the first polylogarithmic query bounds that exactly match the bounds in the pointer machine model but with logarithms taken to base B; no such results were known in more than three dimensions. We also prove a space lower bound that shows our new structures are space-optimal in the I/O-model. Our main pointer machine data structure uses optimal O(N (log N / loglog N)^{d-1}) space and answers queries in O(log^{d-1}N / (loglog N)^{d-2} + K) time; here K is the size of the output. Our main I/O-efficient structure answers queries in O(Log^{d-1}N / (LogLog N)^{d-2} + K/B) I/Os and uses optimal O(N (Log N / LogLog N)^{d-1}) space where Log N is logarithm taken to base B.

  • David Arthur, Bodo Manthey, and Heiko Roeglin. k-Means has Polynomial Smoothed Complexity.
    The k-means method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. In this paper, we settle the smoothed running time of the k-means method. We show that the smoothed number of iterations is bounded by a polynomial in the number of data points and the reciprocal of the standard deviation of the Gaussian perturbations. This means that if an arbitrary input data set is randomly perturbed, then the k-means method will run in expected polynomial time on that input set.

  • Peyman Afshani, Jeremy Barbay, and Timothy M. Chan. Instance-Optimal Geometric Algorithms.
    We prove the existence of an algorithm A for computing 2-d or 3-d convex hulls that is optimal for every point set in the following sense: for every set S of n points and for every algorithm A' in a certain class C, the maximum running time of A on input s_1,...,s_n is at most a constant factor times the maximum running time of A' on s_1,...,s_n, where the maximum is taken over all permutations s_1,...,s_n of S. In fact, we can establish a stronger property: for every S and A', the maximum running time of A is at most a constant factor times the average running time of A' over all permutations of S. We call algorithms satisfying these properties instance-optimal in the order-oblivious and random-order setting. Such instance-optimal algorithms simultaneously subsume output-sensitive algorithms and distribution-dependent average-case algorithms, and all algorithms that do not take advantage of the order of the input or that assume the input is given in a random order.

    The class C under consideration consists of all algorithms in a decision tree model where the tests involve only multilinear functions with a constant number of arguments. To establish an instance-specific lower bound, we deviate from traditional Ben-Or-style proofs and adopt an interesting adversary argument. For 2-d convex hulls, we prove that a version of the well known algorithm by Kirkpatrick and Seidel (1986) or Chan, Snoeyink, and Yap (1995) already attains this lower bound. For 3-d convex hulls, we propose a new algorithm.

    To demonstrate the potential of the concept, we further obtain instance-optimal results for a few other standard problems in computational geometry, such as maxima in 2-d and 3-d, orthogonal line segment intersection in 2-d, finding bichromatic L_\infty-close pairs in 2-d, off-line orthogonal range searching in 2-d, off-line dominance reporting in 2-d and 3-d, off-line halfspace range reporting in 2-d and 3-d, and off-line point location in 2-d.

  • Matt Gibson and Kasturi Varadarajan. Decomposing Coverings and the Planar Sensor Cover Problem.
    We show that a k-fold covering using translates of an arbitrary convex polygon can be decomposed into Omega(k) covers (using an efficient algorithm). We generalize this result to obtain a constant factor approximation to the sensor cover problem where the ranges of the sensors are translates of a given convex polygon. The crucial ingredient in this generalization is a constant factor approximation for a one-dimensional version of the sensor cover problem, called the Restricted Strip Cover (RSC) problem, where sensors are intervals of possibly different lengths. Our algorithm for RSC improves on the previous O(log log log n) approximation.

  • Jonathan Kelner, James Lee, Gregory Price and Shanghua Teng. Higher eigenvalues of graphs.
    We present a general method for proving upper bounds on the eigenvalues of the graph Laplacian. In particular, we show that for any positive integer k, the kth smallest eigenvalue of the Laplacian on a bounded-degree planar graph is O(k/n). This bound is asymptotically tight for every k, as it is easily seen to be achieved for planar grids. We also extend this spectral result to graphs with bounded genus, graphs which forbid fixed minors, and other natural families. Previously, such spectral upper bounds were only known for k=2, i.e. for the Fiedler value of these graphs. In addition, our result yields a new, combinatorial proof of the celebrated result of Korevaar in differential geometry.

by Jeff Erickson at July 02, 2009 10:15 PM UTC

Guest Post by Dan Spielman.

The list of papers accepted to FOCS 2009 is now available at: http://www.cs.yale.edu/focs09/papers.html

As Mike did after the STOC 2009 PC Meeting, I'd like to provide a short summary of what happened at the FOCS 2009 Meeting.

We arrived at the meeting having at least three reviews for each paper. To save time for discussions of papers during the meeting, we voted to reject many papers in the two weeks before the meeting. We also voted to accept a small number that had overwhelmingly strong reviews. I would like to have heard more about these papers, but it would not have been a productive use of time.

After our preliminary decisions we had 123 papers to discuss over 2 days. This gave us only a few minutes to discuss each paper. In spite of the short timeframes, the discussions of papers were remarkably insightful, intelligent, and witty. Things were said that I will remember always. Unfortunately, confidentiality prevents me from sharing them with you. As chair, my job was to prematurely halt these discussions so that we could vote and move on to the next paper.

As was done in many PCs before, I asked committee members with conflicts of interest to leave the room. Committee members were deemed to have a conflict of interest with authors who were their advisor or advisee, who were at the same institution, and with whom they were close friends. My test for friendship was "would you be proud if this paper was accepted?" As many observed, these are not perfect tests for COIs, but they are a first-order approximation. This policy did have some drawbacks, as it often meant that the person most expert in the area of a paper was excluded from the discussion. To ameliorate this problem, I solicited extra reviews of papers for which this was the case. I am thankful that so many in our community responded to my urgent requests for reviews.

I am VERY happy that we took this approach to handling COIs. I know that I would have had a difficult time being unbiased when discussing papers with which I had a COI. One advantage of this approach that I did not see discussed in the comments on Mike's blog is that it preserves the committee members' reputations for integrity. We accepted many papers with which I had a COI, and I am glad that I will not be suspected of tilting the process in favor of my friends.

Deciding which papers to accept is a very difficult process. With insufficient time and consideration, we were forced to make decisions that are very important to many people. We did the best we could, but I am sure we made mistakes. At least there is no paper on which a majority of the committee believed we made a mistake.

I hope you all will join us for the 50th IEEE FOCS.

by Michael Mitzenmacher (noreply@blogger.com) at July 02, 2009 08:12 PM UTC


raigor

 Andrei Raigorodskii

(This post follows an email by Aicke Hinrichs.)

In a previous post we discussed the following problem:

Problem: Let A be a measurable subset of the d-dimensional sphere S^d = \{x\in {\bf R}^{d+1}:\|x\|=1\}. Suppose that A does not contain two orthogonal vectors. How large can the d-dimensional volume of A be?

Setting the volume of the sphere to be 1, the Frankl-Wilson gives a lower bound (for large d) of  1.203^{-d},
2) The double cap conjecture would give (for large d) of 1.414^{-d}.

A result of A. M. Raigorodskii from 1999 gives a better bound of 1.225^{-d}. (This have led to an improvement concerning the dimensions where a counter example for Borsuk’s conjecture exist; we will come back to that.) Raigorodskii’s method supports the hope that by considering clever confiurations of points instead of just \pm 1-vectors and applying the polynomial method (the method of proof we described for Frankl-Wilson’s theorem) we may get closer and perhaps even prove the double-cup conjecture.

What Raigorodskii did was to prove a Frankl-Wilson type result for vectors with 0,\pm1 coordinates with a prescribed number of zeros. Here is the paper.

Now, how can we bit the 1.225^{-d} record???

 

 

by Gil Kalai at July 02, 2009 08:05 PM UTC


FOCS 2009 accepted papers have been posted.  The following seem to be in the general area of algorithmic game theory:

  1. The Complexity of Rationalizing Network Formation by Shankar Kalyanaraman and Christopher Umans.
  2. Convergence of Local Dynamics to Balanced Outcomes in Exchange Networks by Yossi Azar, Benjamin Birnbaum, L. Elisa Celis, Nikhil R. Devanur and Yuval Peres.
  3. On the Power of Randomization in Algorithmic Mechanism Design by Shahar Dobzinski and Shaddin Dughmi.  (A link to the paper and some discussion in this blog post.)
  4. On Allocating Goods to Maximize Fairness by Deeparnab Chakrabarty, Julia Chuzhoy and Sanjeev Khanna.
  5. Reducibility Among Fractional Stability Problems by Shiva Kintali, Laura Poplawski, Rajmohan Rajaraman, Ravi Sundaram and Shang-Hua Teng. (A link to the paper and some discussion in this blog post.)
  6. Online Stochastic Matching: Beating 1-1/e by Jon Feldman, Aranyak Mehta, Vahab Mirrokni and S. Muthukrishnan.  (A link to the paper hides in this related blog post.)
  7. Settling the Complexity of Arrow-Debreu Equilibria in Markets with Additively Separable Utilities by Xi Chen, Decheng Dai, Ye Du and Shang-Hua Teng. (A link to the paper and some discussion in this blog post.)
  8. Convergence to Equilibrium in Local Interaction Games byAndrea Montanari and Amin Saberi.
  9. Dynamic and Non-Uniform Pricing Strategies for Revenue Maximization byTanmoy Chakraborty, Zhiyi Huang and Sanjeev Khanna.
  10. Approximability of Combinatorial Problems with Multi-agent Submodular Cost Functions by Gagan Goel, Chinmay Karande, Pushkar Tripathi and Lei Wang.

by algorithmicgametheory at July 02, 2009 04:32 PM UTC

Prahladh Harsha asked me to post the following announcement on the blog. I don't usually post announcements but this seems like an excellent opportunity to learn about approximation and PCPs from the masters. You can find many more links and short announcements on my Twitter Feed where you would also learn the FOCS accepted papers list is out. My thoughts on the FOCS papers tomorrow.
  


DIMACS Tutorial on Limits of Approximation Algorithms: PCPs and Unique Games

When: July 20 - 21, 2009 
Where: DIMACS Center, CoRE Building, Rutgers University

I would like to advertise an upcoming tutorial on "Limits of Approximation Algorithms: PCPs and Unique Games", organized under the auspices of the DIMACS Special Focus on Hardness of Approximation. This tutorial is geared towards graduate students, postdocs and others who are theoretically oriented, but not necessarily familiar with the material.  The aim of the tutorial is to give participants a general overview of approximability, introduce them to important results in inapproximability, including some of the recent developments in the world of probabilistically checkable proofs (PCPs) and the unique games conjecture. 

The list of speakers includes: Matthew Andrews (Alcatel-Lucent Bell Laboratories), Sanjeev Arora (Princeton University), Moses Charikar (Princeton University), Prahladh Harsha (University of Texas, Austin), Subhash Khot (New York University) and Lisa Zhang (Alcatel-Lucent Bell Laboratories).

Registration is free and limited travel support is available for non-local participants (with preference to students and postdocs). More info on the workshop web site.

by Lance (lance@fortnow.com) at July 02, 2009 03:58 PM UTC


Rakesh Vohra and Eran Shmaya have started a new blog, The leisure of the theory class. The cute name doesn’t completely specify the intended contents, but the authors’ research interests as well as the first few posts seem to indicate game-theoretic tendencies. Rakesh has been long serving as a bridge between the CS community and the economics community, so his writings may well be interesting to the algorithmic game theory community.

by algorithmicgametheory at July 02, 2009 07:33 AM UTC

Jon Feldman and I will be giving a tutorial at EC 2009 on Sponsored Search. But we promise not to rehash auctions. In fact, tutorial will focus on an advertiser point of view, so more optimization than auctions. We are also running an ad campaign to collect stats:
EC 2009 Tutorial
Feldman, Muthukrishnan: Information
Exchange in Sponsored Search.
www.sigecom.org/ec09/..

There are at least two forthcoming talks of great interest, but you have to choose one: Susan Athey will speak at EC09 on Wed, July 8, at Palo Alto, CA. On Thurs, July 9th, Noam Nisan will speak at ICALP09 at Rhodes, Greece.

Finally, truthfulness. Can we design a truthful mechanism for budget-constrained bidders in a series of ad auctions? If you wanted to maximize the number of clicks, in some cases one can design such a mechanism [FMNP SAGT08]. If you want to maximize profit, such mechanisms can not produce Pareto-optimality [DLN FOCS08], but in a forthcoming paper at Ad Auctions Workshop, Ravi, Hafalir and Sayedi show a very nice, simple semi-truthful mechanism that is also Pareto optimal.

by metoo (noreply@blogger.com) at July 02, 2009 07:25 AM UTC

Authors: Hunter Monroe
Download: PDF
Abstract: Informally, a language L has speedup if, for any Turing machine for L, there exists one that is better. Blum showed that there are computable languages that have almost-everywhere speedup. These languages were unnatural in that they were constructed for the sole purpose of having such speedup. We identify a condition apparently only slightly stronger than P!=NP which implies that accepting any coNP-complete language has an infinitely-often (i.o.) superpolynomial speedup and NP!=coNP. We also exhibit a natural problem which unconditionally has a weaker type of i.o. speedup based upon whether the full input is read. Neither speedup pertains to the worst case.

at July 01, 2009 12:00 AM UTC

Authors: Venkatesan Chakaravarthy, Sambuddha Roy
Download: PDF
Abstract: We study some problems solvable in deterministic polynomial time given oracle access to the (promise version of) the Arthur-Merlin class. Our main results are the following: (i) $BPP^{NP}_{||} subseteq P^{prAM}_{||}$; (ii) $S_2^p subseteq P^{prAM}$. In addition to providing new upperbounds for the classes $S_2^p$ and $BPP^{NP}_{||}$, these results are interesting from a derandomization perspective. In conjunction with the hitting set generator construction of Miltersan and Vinodchandran [Computational Complexity, 2005], we get that $S_2^p = P^{NP}$ and $BPP^{NP}_{||} = P^{NP}_{||}$, under the hardness hypothesis associated with derandomizing the class $AM$. This gives an alternative proof of the same result obtained by Shaltiel and Umans [Computational Complexity, 2007]. We also show that if $NP$ has polynomial size circuits then the polynomial time hierarchy ($PH$) collapses as $PH = P^{prMA}$. Under the same hypothesis, we also derive an $FP^{prMA}$ algorithm for learning circuits for SAT; this improves the $ZPP^{NP}$ algorithm for the same problem by Bshouty et al.[JCSS, 1996]. Finally, we design a $FP^{prAM}$ algorithm for the problem of finding near-optimal strategies for succinctly presented zero-sum games. For the same problem, Fortnow et al. [Computational Complexity, 2008] described a $ZPP^{NP}$ algorithm. One advantage of our $FP^{prAM}$ algorithm is that it can be derandomized using the construction of Miltersen and Vinodchandran yielding a $FP^{NP}$ algorithm, under a hardness hypothesis used for derandomizing $AM$.

at July 01, 2009 12:00 AM UTC

Authors: Emanuele Viola, Emanuele Viola
Download: PDF
Abstract: We prove that to store n bits x so that each prefix-sum query Sum(i) := sum_{k < i} x_k can be answered by non-adaptively probing q cells of log n bits, one needs memory > n + n/log^{O(q)} n. Our bound matches a recent upper bound of n + n/log^{Omega(q)} n by Patrascu (FOCS 2008), also non-adaptive. We also obtain a n + n/log^{2^{O(q)}} n lower bound for storing a string of balanced brackets so that each Match(i) query can be answered by non-adaptively probing q cells. To obtain these bounds we show that a too efficient data structure allows us to break the correlations between query answers.

at July 01, 2009 12:00 AM UTC

Authors: Johannes Koebler, Sebastian Kuhnert
Download: PDF
Abstract: We show that k-tree isomorphism can be decided in logarithmic space by giving a logspace canonical labeling algorithm. This improves over the previous StUL upper bound and matches the lower bound. As a consequence, the isomorphism, the automorphism, as well as the canonization problem for k-trees are all complete for deterministic logspace. We also show that even simple structural properties of k-trees are complete for logspace.

at July 01, 2009 12:00 AM UTC

Authors: Eric Allender, Michal Koucký, Detlef Ronneburger, Sambuddha Roy
Download: PDF
Abstract: We continue an investigation into resource-bounded Kolmogorov complexity cite{abkmr}, which highlights the close connections between circuit complexity and Levin's time-bounded Kolmogorov complexity measure Kt (and other measures with a similar flavor), and also exploits derandomization techniques to provide new insights regarding Kolmogorov complexity. The Kolmogorov measures that have been introduced have many advantages over other approaches to defining resource-bounded Kolmogorov complexity (such as much greater independence from the underlying choice of universal machine that is used to define the measure) [ABKMR]. Here, we study the properties of other measures that arise naturally in this framework. The motivation for introducing yet more notions of resource-bounded Kolmogorov complexity are two-fold: * to demonstrate that other complexity measures such as branching-program size and formula size can also be discussed in terms of Kolmogorov complexity, and * to demonstrate that notions such as nondeterministic Kolmogorov complexity and distinguishing complexity [BFL] also fit well into this framework. The main theorems that we provide using this new approach to resource-bounded Kolmogorov complexity are: * A complete set ($RKNt$) for NEXP/poly defined in terms of strings of high Kolmogorov complexity. * A lower bound, showing that $RKNt$ is not in NP intersect coNP. * New conditions equivalent to the conditions ``NEXP is contained in nonuniform NC1'' and ``NEXP is contained in L/poly''. * Theorems showing that ``distinguishing complexity'' is closely connected to both FewEXP and to EXP. * Hardness results for the problems of approximating formula size and branching program size.

at July 01, 2009 12:00 AM UTC

Authors: Jan Kyncl, Tomas Vyskocil
Download: PDF
Abstract: Directed reachability (or briefly reachability) is the following decision problem: given a directed graph G and two of its vertices s,t, determine whether there is a directed path from s to t in G. Directed reachability is a standard complete problem for the complexity class NL. Planar reachability is an important restricted version of the reachability problem, where the input graph is planar. Planar reachability is hard for L and is contained in NL but is not known to be NL-complete or contained in L. Allender et al. showed that reachability for graphs embedded on the torus is logspace-reducible to the planar case. We generalize this result to graphs embedded on a fixed surface of arbitrary genus.

at July 01, 2009 12:00 AM UTC

Authors: N. Rampersad, J. Shallit, Z. Xu
Download: PDF
Abstract: In this paper we consider the computational complexity of the following problems: given a DFA or NFA representing a regular language L over a finite alphabet Sigma is the set of all prefixes (resp., suffixes, factors, subwords) of all words of L equal to Sigma*? In the case of testing universality for factors of languages represented by DFA's, we find an interesting connection to Cerny's conjecture on synchronizing words.

at July 02, 2009 03:30 AM UTC

Authors: Farzad Didehvar, Ali D. Mehrabi, Fatemeh Raee B
Download: PDF
Abstract: An independent set in a graph G is a set of vertices no two of which are joined by an edge. A vertex-weighted graph associates a weight with every vertex in the graph. A vertex-weighted graph G is called a unique independence vertex-weighted graph if it has a unique independent set with maximum sum of weights. Although, in this paper we observe that the problem of recognizing unique independence vertex-weighted graphs is NP-hard in general and therefore no efficient characterization can be expected in general; we give, however, some combinatorial characterizations of unique independence vertex-weighted graphs. This paper introduces a motivating application of this problem in the area of combinatorial auctions, as well.

at July 02, 2009 03:30 AM UTC

May 28 should be Pi-day instead of March 14 since pi should be what we now call 2*pi (6.28...) since 2*pi comes up in more formulas than pi. (When I blogged about this here one of the commenter's suggested 2*pi*i.)

So what-should-be-pi-day was last Sunday. To honor this day I asked people what day or concept they would want to make a holiday. Here is what I got.
  1. Multicultural day. OR give every ethnic group a holiday. This may lead to the 4-day work week.
  2. Mid-autumn day to give us a break.
  3. Pi-Day (March 14)
  4. Mole Day (Oct 23)
  5. Talk like a pirate day (Sept 19.) Did pirates really talk that way in the past? now?
  6. Election day. That way more people will vote.
  7. The day Louis Brandeis got appointed to the supreme court. He was the first Jewish member of the court. This could be a way to celebrate the decline of anti-semitism in America.
  8. Peace Day. Could celebrate the end of some war. But there is always the next war. Oh well.
  9. The day women got the vote. (Aug 26, 1920). To quote Hail to the Chiefs, with regard to the election of Warren G. Harding as president in 1920: {\it It was the first election women voted in. They needed more practice}.
  10. The day the civil rights act passing. Actually there were many civil rights bills passed, so you'd have to pick which one to celebrate. Or celebrate all of them and get a 4-day work week.
  11. Groundhog day. (Feb 2). The movie movie Groundhog day higher Google rank than the day does.
  12. Chocolate day.
  13. Moon day. They have Earth Day so why not Moon Day?
  14. Children's day. We have Mothers and Fathers day, so why not Children's day?
  15. St. Patrick's day. Or the day after to get rid of your hangover.
  16. Teacher's day. Would teachers get this day off?
  17. Weird Al's birthday (10/23). Since its the same as Mole Day we can combine them to get Weird Mole Day.
Lets say they made your choice a holiday. And then there was a movement to make it, instead of the actual day, the 2nd Monday of that month it was in. (or something like that). Would you mind? On the one hand, your holiday is getting made into just a day off. On the other hand, you get a 3-day weekend!

Veterans day would have been the 2nd Monday in November except that some veterans protested. Or did they? Maybe it was groups that claim to represent them. I wonder if veterans prefer 3-day weekends or prefer having their holiday be on the day WW I ended. What is more important: Efficiency or meaning?

Some days would be hard to move. July 4, Cinco do Mayo and Pi-day are rather tied to the day they are celebrated. Even so, the government (and others) gives July 3 off if July 4 is on a Saturday.

by GASARCH (noreply@blogger.com) at July 01, 2009 04:03 PM UTC


A “textbook system” based on social choice theory would have a centralized mechanism interacting with multiple software agents, each of them representing  a user.  The centralized mechanism would be designed to optimize some global goal (such as revenue or social welfare) and each software agent would elicit the preferences of its user and then optimize according to user preferences.

Among other irritating findings, behavioral economics also casts doubts on this pretty picture, questioning the very notion that users have preferences; that is preferences that are independent of the elicitation method.  In the world of computation, we have a common example of this “framing” difficulty: the default.  Users rarely change it, but we can’t say that they actually prefer the default to the other alternative since if we change the default then they stick with the new one.  Judicious choice of defaults can obviously be used for the purposes of the centralized mechanism (default browser = Internet explorer); but what should we do if we really just want to make the user happy?  What does this even mean?

The following gripping talk by Dan Ariely demonstrates such issues.

by algorithmicgametheory at July 01, 2009 12:13 PM UTC



The isolation lemma and recent applications

Picture 3

Ketan Mulmuley, Umesh Vazirani, and Vijay Vazirani are three great theorists: Ketan is most famous these days for his attempts to use algebraic geometry to prove lower bounds, Umesh for his seminal work on quantum computing, and Vijay for his many contributions to the theory of approximation algorithms–his book on this is already a classic.

Today I would like to talk about an old result of theirs: the famous Isolation Lemma. This beautiful result has had a huge impact on many parts of complexity theory. The reason for this post is that there have been recent results either using the lemma or using related lemmas. Also their famous result is just another example of great results that have not been awarded the accolades they deserve. Oh well.

Umesh and Vijay were one of the early pairs of brothers in the theory of computing. I think the first pair was probably the Fischer brothers, Michael Fischer and Patrick Fischer. Now we have many more pairs of siblings, who both work in the theory of computing. I am currently working with one half of a pair of identical twins. Is that right?

The Isolation Lemma

The Isolation Lemma (IL) is, in my opinion, one of those results that seems magical. There are countless examples of the power of randomization, but there is something very surprising about this result.

Let {X} be a set of {n} points, and let {\cal F} be a family of subsets of {X}. Assign a weight {w(x)} to each point, and define the weight of a set {E} as

\displaystyle  w(E) = \sum_{x \in E} w(x).

If the set with the minimum weight is unique, then call the weight function {w} isolating.

Lemma: Let {\cal F} be a family of {n} element subsets of the set {X}. Then,

\displaystyle  \text{Prob}_{w}[ w \text{ is isolating for } {\cal F}] \ge 1-\frac{n}{N}.

Here {w} the random function from {X} to {1,\dots,N}.

That there is no dependence on the structure or lack of structure of the family {\cal F} is surprising. There are many nice proofs of IL–see this for one.

Applications

There are many classic applications of the IL. Of course before the IL there was the beautiful paper of Leslie Valiant and Vijay on unique SAT, which was used in Seinosuke Toda’s famous theorem reducing the polynomial hierarchy to {\mathsf{PP}}.

In the last few years there have been a number of quite different applications. I would like to highlight a few of these.

{\bullet} One interesting application of the IL technology is to Watermarking. This is the problem of hiding information into either hardware or software, so that intellectual property is protected. Rupak Majumdar and Jennifer Wong use the unique SAT result in particular to help create a “watermark.” I like this paper, since it shows the wide range and power of the IL.

{\bullet} Another direction is to use the IL to get identity testing results, which is what Vikraman Arvind, Partha Mukhopadhyay, and Srikanth Srinivasan do in their wonderful paper. They give a new identity test based on IL for non-commutative polynomials:

Theorem: Let {f} be a polynomial over {n} non-commuting variables given by an arithmetic circuit of size {m}. Let {d} be the upper bound on the degree of {f}. Then there is a randomized algorithm which runs in time {\text{poly}(n,m,d)} and can test whether {f \equiv 0}.

{\bullet} Last, I would like to mention the work of Ryan Williams that does not use the IL directly. However, his work uses a lemma that is similar in spirit to the IL. His main result is:

Theorem: There is a randomized algorithm that determines if a given graph has a simple path of length at least {k} in

\displaystyle O(2^{k}\cdot \text{poly}(n,k))

time.

See his pretty paper for the details. The main lemma–actually in the paper it’s Theorem 3.1–is an kind of isolation result. He encodes the graph problem as the problem of determining whether or not a polynomial {P(x_{1},\dots,x_{k})} has a multilinear term. That is a term of the form {xyz} and not {xy^{6}wt}. Since the polynomial is defined by an arithmetic circuit, it is impossible to determine this by brute force–there will be an exponential number of terms in general.

What Ryan shows is this: evaluate the polynomial at a random element from a certain group algebra. Then, all non-multilinear terms evaluate to zero, and some multilinear terms survive. Finally, he makes use of randomness again to detect this. I think that this is in the spirit of the IL, randomness is used to isolate the multilinear terms from the non-multilinear terms. Take a look at the paper and see if you agree.

Open Problems

One of the on-going questions is de-randomizing the Isolation Lemma in specific cases. Since there are {2^{2^{n}}} set systems {{\cal F}} general de-randomization appears to be impossible.

Another obvious related question is: what structure on the family {{\cal F}} would imply a sharper bound, or allow de-randomization?

by rjlipton at July 01, 2009 11:35 AM UTC

Authors: Yong-Hyuk Kim, Yourim Yoon
Download: PDF
Abstract: Within personalized marketing, a recommendation issue known as multicampaign assignment is to overcome a critical problem, known as the multiple recommendation problem which occurs when running several personalized campaigns simultaneously. This paper mainly deals with the hardness of multicampaign assignment, which is treated as a very challenging problem in marketing. The objective in this problem is to find a customer-campaign matrix which maximizes the effectiveness of multiple campaigns under some constraints. We present a realistic response suppression function, which is designed to be more practical, and explain how this can be learned from historical data. Moreover, we provide a proof that this more realistic version of the problem is NP-hard, thus justifying to use of heuristics presented in previous work.

at July 01, 2009 03:30 AM UTC

Let me make an attempt at predicting the winner(s) of the LICS Test-of-Time Award for 2009.

According to the rules of the award, the prize will go to a paper that was presented at LICS 1989 in lovely Asilomar. I was there as a second-year Ph.D. student and I presented a paper myself, but it was not award material.

The conference programme was really good and the award committee must have had a difficult choice. I do not know who will receive the award in mid-August, but let make a personal prediction: the award will be shared by the following two papers:
What papers do you think will receive the award? It will be interesting to look back at the predictions and see who was right.


by Luca Aceto (noreply@blogger.com) at June 30, 2009 09:32 PM UTC


The announcement for next year’s Computer Science Study Group (CSSG) is out. The deadline for proposals is August 25. As stated in the call:

DARPA seeks junior faculty with research interest in computer science, to serve as Principal Investigators to explore novel ideas that lead to fundamental technological advances that benefit the US Department of Defense.

Although one might not expect it (I didn’t), the CSSG is fairly “theory friendly”. I was selected for the CSSG this past year, and Rocco Servedio was chosen two years ago. Moreover, explicitly listed as “technologies of interest” are complexity theory(!), approximation algorithms, machine learning, and network modeling. (They also accept proposals that are not in the areas of interest.) Perhaps this should not be too surprising, given that mathematician Ben Mann is behind the program. Given Lipton’s recent post about NSF grants tending to go to “low risk” projects, let me add that I think DARPA is very open to “high risk” ideas. (Unfortunately for Lipton, the CSSG is only for junior faculty.)

Since I was part of the CSSG this year, let me say a few words about what a great opportunity it is. The purpose of the program is to introduce computer scientists to the US Department of Defense (DoD) and its research challenges. The program begins with a 1 year overview of the DoD, during which time CSSG members spend 4 weeks throughout the year touring DoD agencies and facilities. (See below.) Following this, members are supposed to propose a project of specific relevance to the DoD that will be funded for up to 2 years. Finally, through contacts established as part of the program, members are expected to obtain independent funding from a DoD agency, which will be matched by DARPA. Yes, the ability to obtain SECRET clearance is required.

We have had two of the four weeks so far. Highlights have included a visit to the control room of the Pentagon, a tour of a Navy submarine, jumping out of a 34-ft tower (that is how paratroopers learn), and flying on a military aircraft. Of particular interest to cryptographers, we visited and spoke with people at IDA, DIA, and DISA (other intelligence agencies will be in the Fall); saw the Naval Network Warfare Command; and (on the next trip) will see STRATCOM, where the new US Cyber Command will report.

Anyone interested in applying please feel free to contact me for further details.

by jonkatz at June 30, 2009 03:41 PM UTC

A group (BellKor's Pragmatic Chaos) has announced that they've beaten the barrier needed to win the $1 million Netflix Prize. This sets off a 30-day clock -- this team will win unless some other team does better in this period.

While there's been various opinions as to what extent contests like this really drive forward science, it's certainly been memorable -- I remember lots of news articles when the contest was first announced, and a lot of excitement over it. I'm a little surprised as how muted the news is now that the contest has been "won"; a Google news query for "Netflix Prize" shows just over a hundred articles, reasonable for a tech story but certainly not high by any means. At the very least, it was an interesting challenge that non-scientists could relate to and demonstrated the importance of good algorithms.

Are there other similar prizes out there? (Google seems to have run a number of contests, such as this one.) Maybe we should create more of them, or encourage industry to do so? For many people science is inspiring for its own sake, but for many others, a little flair (and money) is more likely to attract their attention.

by Michael Mitzenmacher (noreply@blogger.com) at June 30, 2009 02:50 PM UTC

The NSF has been getting very strict about having PIs (primary investigators) fill out an annual report listing publications and activities related to their grants. The report is due 90 days before the anniversary of the start of the grant, so if you have a grant that started in the fall you are probably getting notices reminding you that your report is now overdue. 

There are few academic tasks more painful than filling out annual grant reports but best to just grit your teeth and do it or the NSF could cut off your money. Would you rather the alternative, not having to do a report because you don't have a grant?

Annual reports are used more for gathering information than for evaluation so you don't need to worry about the style as much as you might for a grant proposal. 

There is one confusing part of the reporting system on NSF Fastlane: There are separate sections for conferences and publications. Most of our conferences don't show up under conferences so best to just list your other conference papers as proceedings in the publications section, probably under "Books or Other One Time Publications".

Fastlane reporting is a one-size fits all system so it is not well designed for the way computer science handles conferences. But we computer scientists always know how to implement the Kludge.

by Lance (lance@fortnow.com) at June 30, 2009 01:39 PM UTC


I just got the following email:

Hi Noam,

I just came across your blog, Algorithmic Game Theory and found it to be pretty insightful. I’m trying to get some freelance practice, and was wondering if you would be willing to consider a guest piece from me. I could write on something specific for you, or something that would fit in with your blog. Let me know if you are willing and if there is a process or anything. If you choose to accept my article, I would only ask for a single link back to my site, http://www.onlinedegreeshub.com/. I have included 2 of my previous published posts:

http://www.onlinedegreeshub.com/blog/2009/12-celebs-who-give-geeks-hope-for-fame-and-fortune/

http://www.onlinedegreeshub.com/blog/2009/100-ivy-league-computer-science-courses-you-can-take-for-free-online/

Please get back with me if you would like me to send something over for your consideration.

Thanks for your time,
Tara Miller

I have to admit that I was intrigued by this unusual mode of marketing: the sole desired reward is attention — a link to the website. Of course, the spam filter on this blog routinely removes many automatically-generated comments that attempt “stealing attention”, but this offer does not seem to be neither spam nor automatically generated and the offer for the “guest post” seems personalized enough so that real work was offered. What is being promoted seems to be a rather new web site and blog that intend to be a portal for online education/degrees — probably not useful for most of the audience of this blog — but still seems to contain some useful links to online material. I was impressed enough with the basic premise of this offer (as well as somewhat flattered by the suggestion that the attention that my blog can bring is worth something) that my response was to use the letter, with the links, directly as a blog post. So, Tara, if you’re actually reading this, a comment on your experience with this marketing effort would be appreciated.


by algorithmicgametheory at June 30, 2009 10:16 AM UTC

See this video, pointed out to me by my recently graduated M.Sc. student Arnar Birgisson. It is well worth spending three minutes looking at.

I guess that many readers of this blog will like the message in this position video :-)


by Luca Aceto (noreply@blogger.com) at June 30, 2009 07:50 AM UTC

Authors: Javaid Aslam
Download: PDF
Abstract: We present a resolution to the refutation provided by Ferraro et. al (arXiv:0904.3912), for the proof of $NP=P$ in [arXiv:0812.1385]. We also provide a correct solution to the counter example and additional results that explain why some issues raised in the cited refutation are not quite valid.

at June 30, 2009 03:30 AM UTC

Authors: Mukul S. Bansal, Jianrong Dong, David Fernández-Baca
Download: PDF
Abstract: We define, analyze, and give efficient algorithms for two kinds of distance measures for rooted and unrooted phylogenies. For rooted trees, our measures are based on the topologies the input trees induce on triplets; that is, on three-element subsets of the set of species. For unrooted trees, the measures are based on quartets (four-element subsets). Triplet and quartet-based distances provide a robust and fine-grained measure of the similarities between trees. The distinguishing feature of our distance measures relative to traditional quartet and triplet distances is their ability to deal cleanly with the presence of unresolved nodes, also called polytomies. For rooted trees, these are nodes with more than two children; for unrooted trees, they are nodes of degree greater than three.

Our first class of measures are parametric distances, where there is a parameter that weighs the difference between an unresolved triplet/quartet topology and a resolved one. Our second class of measures are based on Hausdorff distance. Each tree is viewed as a set of all possible ways in which the tree could be refined to eliminate unresolved nodes. The distance between the original (unresolved) trees is then taken to be the Hausdorff distance between the associated sets of fully resolved trees, where the distance between trees in the sets is the triplet or quartet distance, as appropriate.

at June 30, 2009 03:30 AM UTC

Authors: R. Thamilselvan, Dr. P. Balasubramanie
Download: PDF
Abstract: This paper presents a new algorithm based on integrating Genetic Algorithms and Tabu Search methods to solve the Job Shop Scheduling problem. The idea of the proposed algorithm is derived from Genetic Algorithms. Most of the scheduling problems require either exponential time or space to generate an optimal answer. Job Shop scheduling (JSS) is the general scheduling problem and it is a NP-complete problem, but it is difficult to find the optimal solution. This paper applies Genetic Algorithms and Tabu Search for Job Shop Scheduling problem and compares the results obtained by each. With the implementation of our approach the JSS problems reaches optimal solution and minimize the makespan.

at June 30, 2009 03:30 AM UTC

Authors: Leah Epstein, Asaf Levin
Download: PDF
Abstract: Following the work of Anily et al., we consider a variant of bin packing, called "bin packing with general cost structures" (GCBP) and design an asymptotic fully polynomial time approximation scheme (AFPTAS) for this problem. In the classic bin packing problem, a set of one-dimensional items is to be assigned to subsets of total size at most 1, that is, to be packed into unit sized bins. However, in GCBP, the cost of a bin is not 1 as in classic bin packing, but it is a non-decreasing and concave function of the number of items packed in it, where the cost of an empty bin is zero. The construction of the AFPTAS requires novel techniques for dealing with small items, which are developed in this work. In addition, we develop a fast approximation algorithm which acts identically for all non-decreasing and concave functions, and has an asymptotic approximation ratio of 1.5 for all functions simultaneously.

at June 30, 2009 03:30 AM UTC

Authors: Leah Epstein, Asaf Levin
Download: PDF
Abstract: We consider two well-known natural variants of bin packing, and show that these packing problems admit asymptotic fully polynomial time approximation schemes (AFPTAS). In bin packing problems, a set of one-dimensional items of size at most 1 is to be assigned (packed) to subsets of sum at most 1 (bins). It has been known for a while that the most basic problem admits an AFPTAS. In this paper, we develop methods that allow to extend this result to other variants of bin packing. Specifically, the problems which we study in this paper, for which we design asymptotic fully polynomial time approximation schemes, are the following. The first problem is "Bin packing with cardinality constraints", where a parameter k is given, such that a bin may contain up to k items. The goal is to minimize the number of bins used. The second problem is "Bin packing with rejection", where every item has a rejection penalty associated with it. An item needs to be either packed to a bin or rejected, and the goal is to minimize the number of used bins plus the total rejection penalty of unpacked items. This resolves the complexity of two important variants of the bin packing problem. Our approximation schemes use a novel method for packing the small items. This new method is the core of the improved running times of our schemes over the running times of the previous results, which are only asymptotic polynomial time approximation schemes (APTAS).

at June 30, 2009 03:30 AM UTC

Authors: C. Seshadhri
Download: PDF
Abstract: We deal with the problem of designing one-sided error property testers for cycle-freeness in bounded degree graphs. Such a property tester always accepts forests. Furthermore, when it rejects an input, it provides a short cycle as a certificate. The problem of testing cycle-freeness in this model was first considered by Goldreich and Ron \cite{GR97}. They give a constant time tester with two-sided error (it does not provide certificates for rejection) and prove a $\Omega(\sqrt{n})$ lower bound for testers with one-sided error. We design a property tester with one-sided error whose running time matches this lower bound (upto polylogarithmic factors). Interestingly, this has connections to a recent conjecture of Benjamini, Schramm, and Shapira \cite{BSS08}. The property of cycle-freeness is closed under the operation of taking minors. This is the first example of such a property that has an almost optimal $\otilde(\sqrt{n})$-time one-sided error tester, but has a constant time two-sided error tester. It was conjectured in \cite{BSS08} that this happens for a vast class of minor-closed properties, and this result can seen as the first indication towards that.

at June 30, 2009 03:30 AM UTC

I just returned from Montpellier, France, where I was invited to speak at WG 2009, the 35th International Workshop on Graph-Theoretic Concepts in Computer Science.

This was my first trip to France, and Montpellier was a very charming place to visit. It is in the south, and though not quite on the Mediterranean itself it has a very Mediterranean feeling. The conference organizers put together an interesting and educational excursion for us in which we learned some of the local history by visiting three local landmarks (the opera house, an old pharmacy, and the top of the gateway arch) and drinking a different wine in each. We learned that Montpellier is a new city for France, being only a little over 1000 years old, but it is home to one of the oldest medical schools in Europe. It's on the old pilgrim trail from Rome to Compostela, markers for which run through the town, and although the language is no longer spoken in the area there are several inscriptions for tourists written in Occitan. The nearby mountains were a refuge for the Cathars when they were persecuted by the Catholics, and there are many open squares in the city due to the monasteries and other buildings that were torn down in the French revolution. My hotel was on one side of the old quarter, now mostly a shopping district, and the conference was in an old movie theater on the other side of the quarter, so I had a very pleasant walk every day to the conference (only getting lost twice), and my bad high-school French was enough to get by with the people who didn't or wouldn't speak English (rather more of them than I've encountered in other parts of Europe).

The conference itself is about graph algorithms, both for problems on arbitrary graphs and (a larger fraction of the papers) for important special classes of graphs. There were too many interesting talks to describe them all in detail, so let me just mention a few.

Daniel Král started the conference with the other invited talk, on graphs of bounded expansion. This is a notion of sparsity in graphs, based on shallow minors, graph minors (subgraphs of contractions) in which two vertices may only be contracted together if they started out within some constant distance of each other in the original graph. There is a trichotomy theorem for the density of shallow minors: among the densest shallow minors of a given family of graphs, the numbers of edges must grow roughly as an integer power of the number of vertices, the exponent of which can only be 0 (for graphs with bounded numbers of edges), 1 (sparse graphs, with comparable numbers of edges and vertices), or 2 (dense graphs). Additionally, in the case where the exponent is 2, all graphs can be formed as shallow minors. Graphs of bounded expansion form the case of exponent 1, in which the shallow minors are all sparse; they generalize both minor-closed graph families (in which all minors are sparse) and bounded-degree graphs (since a shallow minor of a bounded-degree graph also has bounded degree). As Král described, graph properties describable as first-order logic formulae may be tested efficiently on these graphs, but there seems to be plenty of opportunity for more algorithmic work on them. The two main papers on this subject are in the European Journal of Combinatorics, by Nešetřil and Ossona de Mendez: one on structure theory and a second on algorithms. There also seems to be some connection with Král's work with Dvořák and Thomas at SODA 2009 on efficient 3-coloring of triangle-free surface-embedded graphs (via a data structure for finding paths of bounded length) but I'm not sure of the details of this part.

My favorite contributed talk of the first day was by Pim van 't Hof, a Ph.D. student of Broersma at Durham, presenting his work with Marcin Kamiński
and Daniël Paulusma
on finding induced paths of given parity in claw-free graphs. Since line graphs are claw-free, this generalizes the problem of finding arbitrary paths of given parity in arbitrary graphs. Non-induced paths of given parity are easy enough to find: the shortest path has one parity, and a path of the other parity exists if and only if at least one of the biconnected components on a path between a given pair of terminals is non-bipartite (one direction is easy and the other can be proved by induction using open ear decomposition). However, the induced claw-free generalization is not as easy. I don't know that it's a very important problem, but it was a well-presented method that used some deep results of Chudnovksy and Seymour on the three-in-a-tree problem to reduce the problem to one in which all vertices belong to induced paths between the given terminals, and then split the problem into a case analysis using the known structure of perfect claw-free graphs (in the case that the graph is perfect) or the existence of an odd hole (in the case that it is not).

The talks on the morning of the second day were all about parametrized complexity and exponential time algorithmics. The most interesting to me (and the most closely related to my own recent research) was the last of the morning's talks, by Fedor Fomin, Daniel Lokshtanov, and Saket Saurabh, on algorithms for finding an embedding of a metric space into a line minimizing the total distortion. The metric spaces they consider are the ones generated by distances in an unweighted graph (hence the relevance to WG); the difficult part of the problem is determining the ordering of the vertices on the line, so the problem could be solved in factorial time, but they reduce this time bound to 5n + o(n). The method involves partitioning the line into blocks whose length is the eventual distortion bound, so that each vertex only has a constant number of choices for the block it belongs to, and then performing some more complicated placement algorithms within the blocks. It's one of only a very few algorithms that I know of in which the optimal distortion of an embedding can be calculated exactly (another being my upcoming WADS paper); it would be of interest to extend the result from unweighted graphs to arbitrary metrics.

My own talk was in the afternoon; I surveyed problems (such as the partition of an orthogonal polygon into a minimum number of rectangles) the solution to which involves constructing a graph from the input (the bipartite intersection graph of line segments bounded by two reflex vertices of the polygon) and uses the special properties of this graph to find an efficient algorithmic solution (maximum independent sets in bipartite graphs via König's theorem). See here for talk slides.

I almost missed the first talk of the third day, due to my mistaken belief that I could find an alternative route from my hotel to the conference site without referring to a map, but I was very glad that I recovered from my disorientation in time for its start. Torben Hagerup spoke, on the problem of determining whether a given tree is the minimum spanning tree of a weighted graph, or (almost equivalently) of finding the maximum-weight edge on each of a collection of paths in a given tree, a key step in the randomized linear time minimum spanning tree algorithm of Karger, Klein, and Tarjan. This spanning tree verification problem had been previously solved in linear time in two papers, by Tarjan, Rauch, and Dixon and again by King, but Hagerup's solution was even simpler (though still not without some complication). By using least common ancestors and a technique of hierarchical clustering into subtrees from the previous papers based on Borůvka's algorithm, the problem becomes one of finding maximum-weight edges on ancestor-descendant paths in a tree in which all paths have logarithmic length, so sets of edges in these paths (for instance, the set of suffix maxima according to the weight function) can be represented in single machine words. Hagerup then derives some clever bit manipulation tricks for solving the problem in a linear number of word operations, based around the algebraic properties of an interesting operation down(x,y) that returns the nonzero bits of y for which the next larger bit of x is smaller than the next larger bit of y. Unlike our new SIGACT chair, I'm a big fan of this sort of bit-parallel computation — I think the effort of algorithm design in this area is well-spent because these techniques really do speed up actual programs and yet can be quite nontrivial to discover. Or, if you prefer an argument by authority, Knuth likes them too: on my way to France, I had coincentally watched Knuth discuss similar manipulations in his "Platologic Computation" talk (available here for Windows or here for iTunes).

All in all, a good conference, and one I would attend much more regularly if only it weren't so far away.

at June 29, 2009 11:40 PM UTC

Avi Wigderson has asked me to announce that Princeton’s recently-founded and delightfully-named Center for Computational Intractability will be holding a week-long workshop on Barriers in Computational Complexity, this August 25th to 29th.  Apparently I’m even co-organizing one of the sessions.  So register now!  Lowerbounderati, provers of meta-impossibility theorems, and other congenital pessimists are particularly discouraged from not attending.

by Scott at June 29, 2009 08:32 PM UTC


This is not as clear cut a question as the earlier ones, and if you do not know an answer then it will be difficult to figure one out just based on intuition. (But perhaps possible).

If you are intrigued by the question and would like to explore what an answer could be, I would be interested to know how you tried to find an answer.  Asked a colleague? Looked at a book (which?)? Looked online (where?)? 

Here is the question:

A differentiable complex function automatically has derivatives of every order. (In contrast to differentiable real functions that need not have even second derivatives at any point.) 

Can you describe this “miracle” as part of a more general phenomenon?

by Gil Kalai at June 29, 2009 08:02 PM UTC

In my post of June 24 I requested some help on a sum (among other things). In particular, we had a summation result that we were using in a paper and we wanted to know if it was new or not. We got three comments relevant to it
  1. Method of differences due to Pascal
  2. discrete calculus analog of the fact that the nth degree Taylor expansion of a polynomial is the polynomials.
  3. The first lemma is a direct consequence of the method of finite differences; it says that the nth forward difference of a polynomial of degree n-1 is identically zero. The inner summation in the second identity is basically computing the coefficients of the Newton Form of the polynomial, although you seem to be using a slightly different basis. Nevertheless, these results are quite standard.
This raises a request and some question.

REQUEST:
Authors of those comments (and others who know stuff)--- please email me more exact references. I found some things on Google but it would be good to know what to read and what is a good source to reference. Preferable online.
QUESTION: In a paper if you are using a result that is known how much detail should you put in?
  1. Clearly put in that it is known and provide a references. I would not want to frustrate my readers with This is easily derived from the method of differences. without providing a reference. In this day and age an online reference if you can mange it.
  2. If the result is not quite written down anywhere but your readers could easily derive it using known techniques, then you do not need to supply a proof. But this is not as clear a statement as it sounds- it depends on how you define readers, easily, known, and technique.
  3. If the result is known but you have a cute proof of it which seems new (hard to tell) then what do you do?
  4. If the proof is short then I am more inclined to include it. If its an e-journal than length matters less. (This topic has been raised in a different form before---if Conferences proceedings are CD's then why have a page limit?)
  5. If there are no online references then I am more inclined to include a proof.
  6. My only real point here is that its a QUESTION- what is the cutoff for what is worth including? There is no one answer.

by GASARCH (noreply@blogger.com) at June 29, 2009 04:25 PM UTC

I leave in about a week for Greece, where I'll be presenting a paper at AlgoSensors, a workshop associated with ICALP. The paper is entitled "Compressing Kinetic Data From Sensor Networks" and is joint work with my advisor, David Mount. You can see the camera ready version here. Slides will be posted on my website once they exist.

While the main result of the paper is a compression algorithm, the point of the paper is broader than that. When reasoning about kinetic data (the data generated by a set of moving points), the current accepted model is the kinetic data structures model that assumes that points are given flight plans in the form of algebraic expressions. This works well when reasoning theoretically (assume the points follow predetermined algebraic curves...), when modeling physical properties, or possibly when dealing with actual planes that really do need to give us their flight plans in advance. But what about if we want to do statistical analyses on collected data to determine what motion trends the objects follow? The cell phone users, migratory birds, enemy tanks, whatever you like to think about, aren't going to tell us how and when and where they're going to move. Many of these observations are collected using a sensor network, so it makes sense to design algorithms that work over these networks using the possible minimal amount of information available - the number of objects each sensor sees at each time instant. The framework we introduce assumes nothing about the motion of the objects or our advance knowledge of their plans. Instead, we analyze our results with respect to the entropy of this information. The ultimate goal, then, is that algorithms designed over this framework would perform better on "neat" motion, say all the birds following a straight line from Canada to Mexico, while algorithms on "messy" data would take longer. This paper takes the first step in that direction, giving a compression algorithm that compresses the sensor observations to within a constant factor of the optimal joint entropy bound.

For those interested in more precision, the abstract is included below:

We introduce a framework for storing and processing kinetic data observed by sensor networks. These sensor networks generate vast quantities of data, which motivates a significant need for data compression. We are given a set of sensors, each of which continuously monitors some region of space. We are interested in the kinetic data generated by a finite set of objects moving through space, as observed by these sensors. Our model relies purely on sensor observations; it allows points to move freely and requires no advance notification of motion plans. Sensor outputs are represented as random processes, where nearby sensors may be statistically dependent. We model the local nature of sensor networks by assuming that two sensor outputs are statistically dependent only if the two sensors are among the k nearest neighbors of each other. We present an algorithm for the lossless compression of the data produced by the network. We show that, under the statistical dependence and locality assumptions of our framework, asymptotically this compression algorithm encodes the data to within a constant factor of the information-theoretic lower bound optimum dictated by the joint entropy of the system.


Also, a reminder - the SODA abstract deadline is today at 4:59pm EDT.

by sorelle (noreply@blogger.com) at June 29, 2009 03:59 PM UTC



Solving Diophantine equations with mixed rationals and reals

images

Derrick Lehmer was an early computational number theorist and, during his long career, both proved wonderful theorems and made amazing special purpose computers. He used his machines, for one, to factor the largest numbers possible at the time.

Today I planned to talk about an approach to integer factorization–a problem that Lehmer worked on his whole life. The method is a refinement of an idea I discussed in an earlier post. However, there is an issue that came up while writing that post, that I want to talk about first; I will post shortly on the whole factoring idea. The question concerns solving mixed Diophantine equations. More on these later.

I was honored to meet Lehmer, in 1979, when I was on the faculty at Berkeley. One day Manny Blum and I went up to Lehmer’s office, where we had a wonderful chat on a wide range of topics.

One topic we discussed at length was the production of random numbers. Lehmer had thought quite deeply about the issues involved in the creation of “truly” random numbers. The bottom line of the conversation was the claim–well argued by Lehmer–that it is extremely hard to generate truly random numbers, even with physical devices.

It was a long discussion, but here is one of his insights, which I still explain in my lectures to students, whenever we discuss randomized algorithms. Suppose we try to use the randomness that appears to be inherent in radioactive decay. We use a Geiger counter to count the number of beta and gamma particles emitted by a low-grade radioactive source. Every now and then we stop the counter and read off the low order bit. The claim, Manny and I believed, was that this bit would be random.

Derrick smiled and explained that our idea was flawed–in many ways. He pointed out that it is easy to make a counter, by design or by error, that when you stop it, tends to stop with an even count rather than an odd count. This is true, especially, if the counter is “spinning” at a high rate. Obviously, such a counter would give very biased bits. Our answer was to have the random process flip the counter at a much slower rate and, we stated, this would mean that when we stopped the counter it would likely be unchanging. However, Derrick said that then the rate of getting random bits would be very slow. And so on.

Before we talk about mixed Diophantine equations, I thought you might like to know about Lehmer’s machines. One of his most interesting special purpose machines was made of bicycle parts, an electric motor, a light, and a photodetector; all to solve Diophantine equations.
3372221744_22db9e827b

The machine had several wheels that rotated at different speeds and, holes had been cut into the wheels, allowing light to potentially shine through at special positions. As the wheels rotated, if all the holes were aligned, then the light got through to the photodetector–and this stopped the machine. The wheels would be turned back slowly, by hand, so that the exact point where they were all aligned was found.

Lehmer used his machine to solve certain sieve problems, which yielded solutions to Diophantine equations. The advantage of the machine was that the electric motor could turn the wheels at a high speed, which allowed a large number of combinations to be tried per second.

The device was relatively cheap, and Lehmer could run it day and night: after all it was his own machine. With this device he could vastly out perform computers of the day, both in speed and definitely in price-performance. There was a time when it was the state of the art in factoring.

Mixed Diophantine Equations

I have an approach to factoring based on finding certain special matrices. The properties that these matrices need to satisfy can be stated as finding the solution to Diophantine equations. I will discuss today, a relaxation that may make the required equations easier to solve. Also the questions that arise seem to have independent interest.

In general a Diophantine equation is

\displaystyle  f(x_{1},\dots,x_{n}) = 0

where {x_{i}} are restricted to be integers (or sometimes rationals) and {f} is a polynomial over the integers. Finding a solution over the integers is well known to be undecidable, and even special cases can be extremely difficult to resolve. Finding a solution over the rationals is a long-standing open problem, by the way.

We are interested in the following simpler question: Suppose that

\displaystyle  f(x_{1},\dots,x_{n}) = 0

has a solution over the complex numbers. When does that imply that there is another solution {x'_{1},\dots,x'_{n}} so that {x'_{1}\dots x'_{k}} are rationals? Of course, the interesting case is when {k<n}.

The question is: does such a solution always exist? If it does, then that has potential to greatly simplify the approach I have to factoring.

There is a simple counterexample: Consider the polynomial {f(x,y)=x^{2}-2}. Then, it clearly has solutions, but no solution is possible with {x} rational.

One Equation Case

The problem with the simple counterexample, is that the polynomial {f} does not depend on {y}. A natural conjecture is the following: Suppose {f(x,y)} depends on {y}. Then if it has a solution it has a solution with {x} a rational.

Let’s examine this, by writing the equation in the following form:

\displaystyle  f(x,y) = a(x)y^{2} + b(x)y + c(x) = 0

for the case when {y} has degree {2}. Let’s assume that {a(x)} is not identically {0}. For that, let’s simply select {x} as an integer so that {a(x)} is not zero. Then, there is a {y \in \mathbb{C}} so that {f(x,y)=0}.

Theorem: Suppose that {f(x_{1},\dots,x_{n},y)} depends on the variable {y}. Then, there are integers {x_{1},\dots,x_{n}} and a {y \in \mathbb{C}} so that

\displaystyle  f(x_{1},\dots,x_{n},y) = 0.

The proof is the same as the example we just went over. Note, this result is essentially just the Fundamental Theorem of Algebra.

One Equation and One Inequation Case

A much more interesting problem arises with multiple equations.

As usual {|a|} is the absolute value of {a}. Also for a vector {x}, let {||x||} denote the value

\displaystyle  |x_{1}| + |x_{2}| + \dots + |x_{n}|.

Note, {||(a,b)-(c,d)|| = ||a-c|| + ||b-d||}.

The goal of this section is to prove the following theorem: (This is a joint result with Subrahmanyam Kalyanasundaram and Farbod Shokrieh.)

Theorem: Suppose that {f(x_{1},\dots,x_{n},y)} and {g(x_{1},\dots,x_{n},y)} are polynomials, and that {f(x_{1},\dots,x_{n},y)} depends on the variable {y}. Also assume that there are reals {w_{1},\dots,w_{n},z} so that {f(w,z)=0} and {g(w,z) \neq 0}. Then, there are rationals {x_{1},\dots,x_{n}} and a {y \in \mathbb{C}} so that

\displaystyle  f(x,y) = 0

and

\displaystyle  g(x,y) \neq 0.

We first need a classic lemma:

Lemma: Let the polynomial {f(y)} equal

\displaystyle  y^{k} + a_{k-1}y^{k-1} + \dots + a_{0}

where {k \ge 1}, and the polynomial {g(y)} equal

\displaystyle  y^{k} + b_{k-1}y^{k-1} + \dots + b_{0}.

Suppose that {f(z)=0}. Then, for any {\delta>0}, there is an {\epsilon>0} so that if { ||a-b|| < \epsilon}, then there exists a {y \in \mathbb{C}} so that {|z-y| < \delta} and {g(y)=0}.

Lemma: Suppose that the polynomial {f(x,y)} is equal to

\displaystyle  a_{k}(x)y^{k} + a_{k-1}(x)y^{k-1} + \dots + a_{0}(x)

where {k \ge 1} and each {a_{i}(x)} is a polynomial, and {x=(x_{1},\dots,x_{n})}. Then, for any {\delta>0}, there is an {\epsilon>0} so that if {a_{k}(w) \neq 0} and {f(w,z)=0} and {||w-x||<\epsilon}, then there exists a {y \in \mathbb{C}} so that {|z-y| < \delta} and {f(x,y)=0}.

Proof: Since {a_{k}(w) \neq 0}, this follows from the previous lemma, by dividing by the lead term. \Box

Now we will prove the main theorem:

Proof: First, the function {g(x,y)} is such that for any {\delta>0} there is an {\epsilon>0} so that

\displaystyle    ||(w,z)-(x,y)|| < \epsilon \rightarrow |g(w,z)-g(x,y)| < \delta. \ \ \ \ \ (1)

This is, of course, just uniform continuity of the polynomial function.

Second, since {f} is a polynomial that depends on {y}, by the lemma, for any {\delta>0}, there is an {\epsilon>0} so that

\displaystyle    ||w-x|| < \epsilon \rightarrow \exists y, |z-y| < \delta, f(x,y)=0. \ \ \ \ \ (2)

Let { |g(w,z)| = b >0}. Select {\epsilon_{1}>0} small enough so that (1) holds for {\delta=b/2}. Now select {\epsilon_{2}>0} so that it is smaller than {\epsilon_{1}}, and (2) holds for {\delta=\epsilon_{1}/2}.

Since the rationals are dense, for {\epsilon_{2}>0} there is an {x} all rationals so that {||w-x|| < \epsilon_{2}/2}. By (2) there is a {y} so that {|z-y| < \epsilon_{1}/2} and also {f(x,y)=0}. Note, that {||(w,z)-(x,y)|| < \epsilon_{2}/2 + \epsilon_{1}/2 < \epsilon_{1}}. Thus, by (1)

\displaystyle  |g(w,z)-g(x,y)| < b/2.

This implies that {g(x,y) \neq 0}. \Box

General Case

What we really would like is a generalization that could handle several equations and one inequation. Note, one inequation is the same as many. The condition

\displaystyle  g_{1}(z) \neq 0 \text{ and } \dots \text{ and } g_{k}(z) \neq 0

is the same as

\displaystyle  g_{1}(z)g_{2}(z) \cdots g_{k}(z) \neq 0.

The theorem we need to help solve the factoring problem would look like our theorem, but would allow several equations and one inequation.

Suppose that {x =(x_{1},\dots,x_{n})} and {y=(y_{1},\dots,y_{m})}. Let {f_{i}(x,y)} for {i=1,\dots,k} and {g(x,y)} be polynomials. Also assume that for some real {w,z},

\displaystyle  f_{1}(w,z)=0,\dots,f_{k}(w,z)=0 \text{ and } g(w,z) \neq 0.

Then, there are rationals {x} and a {y \in \mathbb{C}^{m}} so that

\displaystyle  f_{1}(x,y) = 0,\dots,f_{k}(x,y)=0

and

\displaystyle  g(x,y) \neq 0

provided {\Gamma} is true.

The key question is what is the property {\Gamma}? In the case of {k=1}, {\Gamma} was that the polynomial depended on the single variable {y}. I would imagine that now the right property is a stronger notion of dependence? What is it?

Open Problems

The main open problem is to find the right statement of the general theorem and prove it. That what should {\Gamma} be?

Are there some interesting applications of this ability to “almost” solve Diophantine in rationals? What about a similar result that holds over the complex numbers?

One potential application that comes to mind is the area of solutions to economic or game theoretic problems. In order to apply to these problems the value of the {y} variable must be restricted to be real also. This does not seem possible with the technology that we have so far. Thus, an important additional direction is to try to prove a theorem, even with one equation and one inequation, that allows {y} to stay real. I believe that such a theorem could be quite interesting.

by rjlipton at June 29, 2009 02:02 PM UTC