Data Structures And Algorithms | 2019-07-03



Approximability of Discriminators Implies Diversity in GANs (1806.10586v4)

Yu Bai, Tengyu Ma, Andrej Risteski

2018-06-27

While Generative Adversarial Networks (GANs) have empirically produced impressive results on learning complex real-world distributions, recent works have shown that they suffer from lack of diversity or mode collapse. The theoretical work of Arora et al. suggests a dilemma about GANs' statistical properties: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. By contrast, we show in this paper that GANs can in principle learn distributions in Wasserstein distance (or KL-divergence in many cases) with polynomial sample complexity, if the discriminator class has strong distinguishing power against the particular generator class (instead of against all possible generators). For various generator classes such as mixture of Gaussians, exponential families, and invertible and injective neural networks generators, we design corresponding discriminators (which are often neural nets of specific architectures) such that the Integral Probability Metric (IPM) induced by the discriminators can provably approximate the Wasserstein distance and/or KL-divergence. This implies that if the training is successful, then the learned distribution is close to the true distribution in Wasserstein distance or KL divergence, and thus cannot drop modes. Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with KL divergence or the Wasserstein distance, indicating that the lack of diversity in GANs may be caused by the sub-optimality in optimization instead of statistical inefficiency.
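
To make the IPM idea concrete, here is a minimal numpy sketch (our own illustration, not the paper's constructions): when the discriminator class is restricted to unit-norm linear functions, the induced IPM reduces to the distance between sample means, so it detects a mean shift but is blind to mode collapse that preserves the mean. The function name and data are ours.

```python
import numpy as np

def linear_ipm(X, Y):
    """IPM induced by F = {x -> <w, x> : ||w||_2 <= 1}:
    sup_{f in F} E_P[f] - E_Q[f] = ||mean(X) - mean(Y)||_2.
    Illustrates how the discriminator class decides which
    differences between distributions the metric can see."""
    return np.linalg.norm(X.mean(axis=0) - Y.mean(axis=0))

rng = np.random.default_rng(0)
P = rng.normal(0.0, 1.0, size=(5000, 2))
Q = rng.normal(0.5, 1.0, size=(5000, 2))     # shifted mean: detected
R = np.vstack([P[:2500] - 3, P[2500:] + 3])  # same mean, two modes: invisible
print(linear_ipm(P, Q), linear_ipm(P, R))    # ~0.71 vs ~0.0
```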

Generating Normal Networks via Leaf Insertion and Nearest Neighbor Interchange (1906.12053v2)

Louxin Zhang

2019-06-28

Galled trees are studied as a recombination model in theoretical population genetics. This class of phylogenetic networks has been generalized to tree-child networks, normal networks and tree-based networks by relaxing a structural condition. Although these networks are simple, their topological structures have yet to be fully understood. It is well-known that all phylogenetic trees on n taxa can be generated by the insertion of the n-th taxon into each edge of all the phylogenetic trees on n-1 taxa. We prove that all tree-child networks with k reticulate nodes on n taxa can be uniquely generated via three operations from all the tree-child networks with k-1 or k reticulate nodes on n-1 taxa. An application of this result is found in counting tree-child networks and normal networks. In particular, a simple formula is given for the number of rooted phylogenetic networks with one reticulate node.
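
The classical fact the abstract builds on is easy to run: every rooted binary phylogenetic tree on n taxa arises by inserting taxon n into one edge of a tree on n-1 taxa. Below is a small sketch (representation and names are ours; the paper's three network operations are not reproduced here):

```python
def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to each edge of `tree`
    (including the edge above the root). Trees are nested tuples."""
    yield (tree, leaf)  # attach on the edge above the current root
    if isinstance(tree, tuple):
        left, right = tree
        for t in insert_leaf(left, leaf):
            yield (t, right)
        for t in insert_leaf(right, leaf):
            yield (left, t)

def rooted_trees(n):
    """All rooted binary phylogenetic trees on taxa 1..n."""
    trees = [1]
    for taxon in range(2, n + 1):
        trees = [t for old in trees for t in insert_leaf(old, taxon)]
    return trees

# counts follow the double factorial (2n-3)!!: 1, 1, 3, 15, 105
for n in range(1, 6):
    print(n, len(rooted_trees(n)))
```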

Chasing Nested Convex Bodies Nearly Optimally (1811.00999v3)

Sébastien Bubeck, Bo'az Klartag, Yin Tat Lee, Yuanzhi Li, Mark Sellke

2018-11-02

The convex body chasing problem, introduced by Friedman and Linial, is a competitive analysis problem on any normed vector space. In convex body chasing, for each timestep t, a convex body K_t is given as a request, and the player picks a point x_t ∈ K_t. The player aims to ensure that the total distance Σ_t ||x_{t-1} - x_t|| is within a bounded ratio of the smallest possible offline solution. In this work, we consider the nested version of the problem, in which the sequence (K_t) must be decreasing. For Euclidean spaces, we consider a memoryless algorithm which moves to the so-called Steiner point, and show that in a certain sense it is exactly optimal among memoryless algorithms. For general finite-dimensional normed spaces, we combine the Steiner point and our recent previous algorithm to obtain a new algorithm which is nearly optimal for all ℓ^p spaces with p ≥ 1, closing a polynomial gap.
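
The Steiner point that drives the memoryless algorithm can be estimated by Monte Carlo, using the identity s(K) = E_u[argmax_{x∈K} ⟨u, x⟩] for u uniform on the unit sphere (for a polytope the argmax is almost surely unique). A rough numpy sketch with names of our choosing:

```python
import numpy as np

def steiner_point(vertices, n_samples=20000, seed=0):
    """Monte Carlo estimate of the Steiner point of conv(vertices):
    s(K) = E_u[argmax_{x in K} <u, x>], u uniform on the unit sphere."""
    rng = np.random.default_rng(seed)
    V = np.asarray(vertices, dtype=float)
    u = rng.normal(size=(n_samples, V.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    best = V[np.argmax(u @ V.T, axis=1)]  # maximizer of <u, x> over vertices
    return best.mean(axis=0)

# the Steiner point of a square is its center
print(steiner_point([[0, 0], [1, 0], [0, 1], [1, 1]]))  # ~[0.5, 0.5]
```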

On learning with shift-invariant structures (1812.01115v2)

Cristian Rusu

2018-12-03

We describe new results and algorithms for two different, but related, problems which deal with circulant matrices: learning shift-invariant components from training data and calculating the shift (or alignment) between two given signals. In the first instance, we deal with the shift-invariant dictionary learning problem while the latter bears the name of (compressive) shift retrieval. We formulate these problems using circulant and convolutional matrices (including unions of such matrices), define optimization problems that describe our goals and propose efficient ways to solve them. Based on these findings, we also show how to learn a wavelet-like dictionary from training data. We connect our work with various previous results from the literature and we show the effectiveness of our proposed algorithms using synthetic data, ECG signals, and images.
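
For the alignment problem, the circulant structure is exactly what makes FFT-based shift retrieval work: circular cross-correlation becomes a pointwise product in the Fourier domain. A minimal sketch (the paper's compressive variants are more refined):

```python
import numpy as np

def shift_retrieval(x, y):
    """Recover the circular shift s with y = roll(x, s): circular
    cross-correlation is a pointwise product in the Fourier domain,
    because circular shifts are multiplications by circulant matrices."""
    corr = np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(x)))
    return int(np.argmax(corr.real))

x = np.random.default_rng(1).normal(size=64)
print(shift_retrieval(x, np.roll(x, 5)))  # 5
```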

Detecting Feedback Vertex Sets of Size k in O*(2.7^k) Time (1906.12298v1)

Jason Li, Jesper Nederlof

2019-06-28

In the Feedback Vertex Set problem, one is given an undirected graph G and an integer k, and one needs to determine whether there exists a set of at most k vertices that intersects all cycles of G (a so-called feedback vertex set). Feedback Vertex Set is one of the most central problems in parameterized complexity: It served as an excellent test bed for many important algorithmic techniques in the field such as Iterative Compression [Guo et al. (JCSS'06)], Randomized Branching [Becker et al. (J. Artif. Intell. Res'00)] and Cut&Count [Cygan et al. (FOCS'11)]. In particular, there has been a long race for the smallest dependence f(k) in run times of the type O*(f(k)), where the O* notation omits factors polynomial in n. This race seemed to be run in 2011, when a randomized O*(3^k) time algorithm based on Cut&Count was introduced. In this work, we show the contrary and give an O*(2.7^k) time randomized algorithm. Our algorithm combines all mentioned techniques with substantial new ideas: First, we show that, given a feedback vertex set of size k of bounded average degree, a tree decomposition of width (1-Ω(1))k can be found in polynomial time. Second, we give a randomized branching strategy inspired by the one from [Becker et al. (J. Artif. Intell. Res'00)] to reduce to the aforementioned bounded average degree setting. Third, we obtain significant run time improvements by employing fast matrix multiplication.
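
For intuition about the randomized-branching ingredient, here is a toy Monte Carlo FVS solver in the spirit of Becker et al.: strip vertices of degree at most 1 (they lie on no cycle), then guess that a degree-weighted random vertex belongs to the solution. This is a simplified sketch with our own names, not the paper's O*(2.7^k) algorithm; it has one-sided error, so any set it returns is a genuine feedback vertex set of size at most k.

```python
import random

def find_fvs(edges, k, trials=3000, seed=0):
    """Monte Carlo feedback vertex set (simplified Becker et al. style)."""
    rng = random.Random(seed)
    for _ in range(trials):
        adj = {}
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        chosen, budget = [], k
        while True:
            # Reduction: vertices of degree <= 1 lie on no cycle.
            low = [u for u, nb in adj.items() if len(nb) <= 1]
            while low:
                u = low.pop()
                for w in adj.pop(u, ()):  # may already be removed
                    adj[w].discard(u)
                    if len(adj[w]) <= 1:
                        low.append(w)
            if not adj:       # remaining graph is acyclic: success
                return chosen
            if budget == 0:
                break         # this trial failed; retry
            # Guess: a random endpoint of a random edge joins the solution
            # (i.e., sample a vertex with probability proportional to degree).
            u = rng.choice([x for x, nb in adj.items() for _ in nb])
            for w in adj.pop(u):
                adj[w].discard(u)
            chosen.append(u)
            budget -= 1
    return None

# two disjoint triangles: needs k = 2
print(find_fvs([(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)], 2))
```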

A φ-Competitive Algorithm for Scheduling Packets with Deadlines (1807.07177v4)

Pavel Veselý, Marek Chrobak, Łukasz Jeż, Jiří Sgall

2018-07-18

In the online packet scheduling problem with deadlines (PacketScheduling, for short), the goal is to schedule transmissions of packets that arrive over time in a network switch and need to be sent across a link. Each packet has a deadline, representing its urgency, and a non-negative weight that represents its priority. Only one packet can be transmitted in any time slot, so, if the system is overloaded, some packets will inevitably miss their deadlines and be dropped. In this scenario, the natural objective is to compute a transmission schedule that maximizes the total weight of packets which are successfully transmitted. The problem is inherently online, with the scheduling decisions made without the knowledge of future packet arrivals. The central problem concerning PacketScheduling, which has been a subject of intensive study since 2001, is to determine the optimal competitive ratio of online algorithms, namely the worst-case ratio between the optimum total weight of a schedule (computed by an offline algorithm) and the weight of a schedule computed by a (deterministic) online algorithm. We solve this open problem by presenting a φ-competitive online algorithm for PacketScheduling (where φ ≈ 1.618 is the golden ratio), matching the previously established lower bound.
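
The setting is easy to simulate. The sketch below implements the simple greedy rule (send the heaviest non-expired pending packet each slot), which is known to be 2-competitive; the paper's φ-competitive algorithm is considerably more delicate. Names and the packet encoding are ours.

```python
import heapq

def greedy_schedule(packets):
    """Each packet is (arrival, deadline, weight); one packet is sent per
    integer time slot. Greedy sends the heaviest pending, non-expired
    packet and is known to be 2-competitive."""
    by_arrival = sorted(packets)
    heap, sent, total, i = [], [], 0.0, 0  # max-heap via negated weights
    for t in range(max(d for _, d, _ in packets) + 1):
        while i < len(by_arrival) and by_arrival[i][0] <= t:
            a, d, w = by_arrival[i]
            heapq.heappush(heap, (-w, d, i))
            i += 1
        while heap and heap[0][1] < t:  # top packet already expired: drop it
            heapq.heappop(heap)
        if heap:
            w, d, pid = heapq.heappop(heap)
            total -= w
            sent.append(pid)
    return total, sent

print(greedy_schedule([(0, 0, 1.0), (0, 1, 0.9), (1, 1, 0.5)]))  # (1.9, [0, 1])
```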

Improving and benchmarking of algorithms for decision making with lower previsions (1906.12215v1)

Nawapon Nakharutai, Matthias C. M. Troffaes, Camila C. S. Caiado

2019-06-28

Maximality, interval dominance, and E-admissibility are three well-known criteria for decision making under severe uncertainty using lower previsions. We present a new fast algorithm for finding maximal gambles. We compare its performance to existing algorithms, one proposed by Troffaes and Hable (2014), and one by Jansen, Augustin, and Schollmeyer (2017). To do so, we develop a new method for generating random decision problems with pre-specified ratios of maximal and interval dominant gambles. Based on earlier work, we present efficient ways to find common feasible starting points in these algorithms. We then exploit these feasible starting points to develop early stopping criteria for the primal-dual interior point method, further improving efficiency. We find that the primal-dual interior point method works best. We also investigate the use of interval dominance to eliminate non-maximal gambles. This can make the problem smaller, and we observe that this benefits Jansen et al.'s algorithm, but perhaps surprisingly, not the other two algorithms. We find that our algorithm, without using interval dominance, outperforms all other algorithms in all scenarios in our benchmarking.
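
For a finitely generated credal set given by its extreme points, both criteria reduce to small linear-algebra checks, which the following numpy sketch illustrates (our own toy formulation; the paper's algorithms solve the underlying linear programs far more efficiently):

```python
import numpy as np

def maximal_gambles(G, P):
    """G: (gambles x outcomes) payoffs; P: rows are extreme points of the
    credal set. Gamble f is maximal iff no g has min_p E_p[g - f] > 0."""
    E = P @ G.T  # E[i, j] = expectation of gamble j under extreme point i
    lower_diff = (E[:, :, None] - E[:, None, :]).min(axis=0)  # min_p E_p[g_j - g_k]
    return ~(lower_diff > 1e-12).any(axis=0)

def interval_dominant(G, P):
    """f survives interval dominance iff max_p E_p[f] >= max_g min_p E_p[g]."""
    E = P @ G.T
    return E.max(axis=0) >= E.min(axis=0).max() - 1e-12

G = np.array([[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]])  # three gambles, two outcomes
P = np.array([[0.3, 0.7], [0.6, 0.4]])              # credal set vertices
print(maximal_gambles(G, P), interval_dominant(G, P))
```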

PUFFINN: Parameterless and Universally Fast FInding of Nearest Neighbors (1906.12211v1)

Martin Aumüller, Tobias Christiani, Rasmus Pagh, Michael Vesterli

2019-06-28

We present PUFFINN, a parameterless LSH-based index for solving the k-nearest neighbor problem with probabilistic guarantees. By parameterless we mean that the user is only required to specify the amount of memory the index is supposed to use and the result quality that should be achieved. The index combines several heuristic ideas known in the literature. By small adaptations to the query algorithm, we make these heuristics rigorous. We perform experiments on real-world and synthetic inputs to evaluate implementation choices and show that the implementation satisfies the quality guarantees while being competitive with other state-of-the-art approaches to nearest neighbor search. We describe a novel synthetic data set that is difficult to solve for almost all existing nearest neighbor search approaches, and on which PUFFINN significantly outperforms previous methods.
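
To see the kind of LSH machinery such an index builds on, here is a toy random-hyperplane (SimHash) index for angular nearest-neighbor search; PUFFINN itself differs in making the heuristics rigorous and auto-tuning to a memory budget. All names below are ours.

```python
import numpy as np

def build_index(X, n_planes=8, seed=0):
    """Hash each point by the sign pattern of random hyperplanes; angularly
    similar points tend to share a bucket."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(n_planes, X.shape[1]))
    keys = (X @ planes.T > 0) @ (1 << np.arange(n_planes))  # bucket ids
    buckets = {}
    for i, key in enumerate(keys):
        buckets.setdefault(int(key), []).append(i)
    return planes, buckets

def query(q, X, planes, buckets, k=1):
    key = int((q @ planes.T > 0) @ (1 << np.arange(len(planes))))
    cand = list(buckets.get(key, range(len(X))))  # brute force if bucket empty
    sims = X[cand] @ q / (np.linalg.norm(X[cand], axis=1) * np.linalg.norm(q))
    return [cand[j] for j in np.argsort(-sims)[:k]]

X = np.random.default_rng(1).normal(size=(1000, 32))
q = X[42] + 0.05 * np.random.default_rng(2).normal(size=32)
print(query(q, X, *build_index(X)))  # very likely [42]
```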

The Moser-Tardos Framework with Partial Resampling (1406.5943v5)

David G. Harris, Aravind Srinivasan

2014-06-23

The resampling algorithm of Moser & Tardos is a powerful approach to develop constructive versions of the Lovász Local Lemma (LLL). We generalize this to partial resampling: when a bad event holds, we resample an appropriately-random subset of the variables that define this event, rather than the entire set as in Moser & Tardos. This is particularly useful when the bad events are determined by sums of random variables. This leads to several improved algorithmic applications in scheduling, graph transversals, packet routing, etc. For instance, we settle a conjecture of Szabó & Tardos (2006) on graph transversals asymptotically, and obtain improved approximation ratios for a packet routing problem of Leighton, Maggs, & Rao (1994).
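
The base algorithm is short enough to state in code. The sketch below runs Moser-Tardos resampling on a small SAT instance, with a partial=True switch that resamples a uniformly random nonempty subset of a violated clause's variables, a crude stand-in for the paper's carefully chosen partial-resampling distributions:

```python
import random

def moser_tardos(clauses, n_vars, partial=False, seed=0):
    """Resample violated clauses until none remain. A literal is a pair
    (var, want); a clause is violated when every literal is unsatisfied.
    With partial=True only a random nonempty subset of the clause's
    variables is resampled (toy partial resampling)."""
    rng = random.Random(seed)
    x = [rng.random() < 0.5 for _ in range(n_vars)]
    while True:
        bad = [c for c in clauses if all(x[v] != want for v, want in c)]
        if not bad:
            return x
        resample = [v for v, _ in rng.choice(bad)]
        if partial:
            resample = rng.sample(resample, rng.randint(1, len(resample)))
        for v in resample:
            x[v] = rng.random() < 0.5

# (x0 or not x1) and (x1 or x2) and (not x0 or not x2)
clauses = [[(0, True), (1, False)], [(1, True), (2, True)], [(0, False), (2, False)]]
print(moser_tardos(clauses, 3, partial=True))
```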

Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond (1906.11985v1)

Oliver Hinder, Aaron Sidford, Nimit Sharad Sohoni

2019-06-27

In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are strictly unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant γ ∈ (0, 1], where γ = 1 encompasses the classes of smooth convex and star-convex functions, and smaller values of γ indicate that the function can be "more nonconvex." We develop a variant of accelerated gradient descent that computes an ε-approximate minimizer of a smooth γ-quasar-convex function with at most O(γ^{-1} ε^{-1/2} log(γ^{-1} ε^{-1})) total function and gradient evaluations. We also derive a lower bound of Ω(γ^{-1} ε^{-1/2}) on the number of gradient evaluations required by any deterministic first-order method in the worst case, showing that, up to a logarithmic factor, no deterministic first-order algorithm can improve upon ours.
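
The definition is easy to probe numerically: f is γ-quasar-convex with respect to a minimizer x* if f(x*) ≥ f(x) + (1/γ)⟨∇f(x), x* − x⟩ for all x. The sketch below estimates γ for a toy nonconvex-but-unimodal function of our choosing and checks that plain gradient descent converges on it (the paper's contribution is the accelerated rate, which this sketch does not implement):

```python
import numpy as np

# A toy function that is nonconvex yet quasar-convex about x* = 0
# (our illustrative choice, not an example from the paper):
f  = lambda x: x**2 + 3 * np.sin(x)**2
df = lambda x: 2 * x + 3 * np.sin(2 * x)

# gamma <= f'(x) * (x - x*) / (f(x) - f(x*)) must hold for all x != x*;
# estimate the best gamma on a grid (here x* = 0, f(x*) = 0).
xs = np.linspace(-10, 10, 100001)
xs = xs[np.abs(xs) > 1e-3]
gamma = min(1.0, float(np.min(df(xs) * xs / f(xs))))
print(f"estimated gamma ~ {gamma:.2f}")

# Plain gradient descent converges on this function; the paper's method
# additionally achieves a near-optimal accelerated rate.
x = 9.0
for _ in range(300):
    x -= 0.1 * df(x)
print(f"gd endpoint: {x:.4f}, f(x) = {f(x):.2e}")
```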


