Data Structures And Algorithms | 2019-06-23

Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards (1809.02016v2)

Alessandro Arlotto, Xinchang Xie

2018-09-06

We study a dynamic and stochastic knapsack problem in which a decision maker is sequentially presented with items arriving according to a Bernoulli process over n discrete time periods. Items have equal rewards and independent weights drawn from a known non-negative continuous distribution. The decision maker seeks to maximize the expected total reward of the items she includes in the knapsack while satisfying a capacity constraint, and while making terminal decisions as soon as each item weight is revealed. Under mild regularity conditions on the weight distribution, we prove that the regret, the expected difference between the performance of the best sequential algorithm and that of a prophet who sees all of the weights before making any decision, is at most logarithmic in n. Our proof is constructive: we devise a re-optimized heuristic that achieves this regret bound.
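To illustrate the flavor of such a policy, here is a minimal Python sketch of a budget-ratio threshold heuristic that re-optimizes its acceptance threshold each period. The Uniform(0, 1) weight distribution, the arrival probability p, and the specific threshold rule are assumptions made for this sketch; the paper's re-optimized heuristic is defined for a general weight distribution and its exact form is not reproduced here.

```python
import math
import random

def reoptimized_knapsack(n, capacity, p=0.5, seed=0):
    """Greedy threshold policy for a dynamic stochastic knapsack.

    Each period the threshold is re-solved so that the expected weight
    accepted over the remaining horizon matches the remaining capacity.
    With Uniform(0, 1) weights, E[W * 1{W <= t}] = t^2 / 2, so the budget
    balance equation r * p * t^2 / 2 = c gives t = sqrt(2 c / (r p)).
    """
    rng = random.Random(seed)
    c, reward = capacity, 0
    for r in range(n, 0, -1):           # r = periods remaining (incl. current)
        if rng.random() > p:            # Bernoulli arrival
            continue
        w = rng.random()                # item weight ~ Uniform(0, 1)
        t = min(1.0, math.sqrt(2.0 * c / (r * p)))   # re-optimized threshold
        if w <= t and w <= c:           # accept only if it fits
            c -= w
            reward += 1                 # equal rewards: count accepted items
    return reward

# Example: horizon 10_000 periods, capacity 50, arrival probability 0.5.
print(reoptimized_knapsack(10_000, 50.0))
```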

Lower bounds for testing graphical models: colorings and antiferromagnetic Ising models (1901.07361v2)

Ivona Bezakova, Antonio Blanca, Zongchen Chen, Daniel Štefankovič, Eric Vigoda

2019-01-22

We study the identity testing problem in the context of spin systems or undirected graphical models, where it takes the following form: given the parameter specification of a model and a sampling oracle for the distribution of an unknown model, can we efficiently determine whether the two models are the same? We consider identity testing for both soft-constraint and hard-constraint systems. In particular, we prove hardness results in two prototypical cases, the Ising model and proper colorings, and explore whether identity testing is any easier than structure learning. For the ferromagnetic (attractive) Ising model, Daskalakis et al. (2018) presented a polynomial-time algorithm for identity testing. We prove hardness results in the antiferromagnetic (repulsive) setting in the same regime of parameters where structure learning is known to require a super-polynomial number of samples. In particular, for n-vertex graphs of maximum degree d, we prove that if |β|d = ω(log n), where β is the inverse temperature parameter, then there is no polynomial running time identity testing algorithm unless RP = NP. We also establish computational lower bounds for a broader set of parameters under the (randomized) exponential time hypothesis. Our proofs utilize insights into the design of gadgets using random graphs in recent works concerning the hardness of approximate counting by Sly (2010). In the hard-constraint setting, we present hardness results for identity testing for proper colorings. Our results are based on the presumed hardness of #BIS, the problem of (approximately) counting independent sets in bipartite graphs. In particular, we prove that identity testing is hard in the same range of parameters where structure learning is known to be hard.
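To make the problem statement concrete, the sketch below implements a naive moment-matching identity tester for the Ising model: it draws approximate samples from the specified model with Glauber dynamics and compares average edge correlations against samples from the oracle. The graph, sample counts, and tolerance are illustrative assumptions; the paper's lower bounds say precisely that testers of this kind cannot succeed with polynomial resources in the hard regime.

```python
import math
import random

def glauber_sample(adj, beta, steps=2_000, rng=random):
    """Draw one approximate sample from an Ising model via Glauber dynamics.

    adj: dict mapping each vertex to a list of its neighbours.
    beta: inverse temperature (beta < 0 gives the antiferromagnetic case).
    """
    spins = {v: rng.choice((-1, 1)) for v in adj}
    verts = list(adj)
    for _ in range(steps):
        v = rng.choice(verts)
        field = beta * sum(spins[u] for u in adj[v])
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))  # P(spin_v = +1 | neighbours)
        spins[v] = 1 if rng.random() < p_plus else -1
    return spins

def naive_identity_test(adj, beta, oracle, num_samples=100, tol=0.1):
    """Compare average edge correlations E[s_u * s_v] of the specified model
    (simulated here with Glauber dynamics) against samples from the oracle."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]

    def mean_edge_corr(sampler):
        total = 0.0
        for _ in range(num_samples):
            s = sampler()
            total += sum(s[u] * s[v] for u, v in edges) / len(edges)
        return total / num_samples

    ours = mean_edge_corr(lambda: glauber_sample(adj, beta))
    theirs = mean_edge_corr(oracle)
    return abs(ours - theirs) <= tol   # True means "accept: same model"

# Example: an oracle that actually runs the specified model on an 8-cycle.
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(naive_identity_test(cycle, beta=-0.4, oracle=lambda: glauber_sample(cycle, -0.4)))
```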

Adaptive Sequence Submodularity (1902.05981v2)

Marko Mitrovic, Ehsan Kazemi, Moran Feldman, Andreas Krause, Amin Karbasi

2019-02-15

In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in a certain order (e.g., guiding an agent through a series of states). Not only do sequences already pose a dauntingly large search space, but we must also take into account past observations, as well as the uncertainty of future outcomes. Without further structure, finding an optimal sequence is notoriously challenging, if not completely intractable. In this paper, we view the problem of adaptive and sequential decision making through the lens of submodularity and propose an adaptive greedy policy with strong theoretical guarantees. Additionally, to demonstrate the practical utility of our results, we run experiments on Amazon product recommendation and Wikipedia link prediction tasks.
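A minimal, generic sketch of such an adaptive greedy policy is shown below. The expected_gain and observe callbacks, and the toy recommendation scoring, are placeholders introduced for illustration; the paper's policy is defined with respect to an adaptive sequence-submodular objective, which is not modeled here.

```python
def adaptive_greedy(items, expected_gain, observe, horizon):
    """Adaptive greedy policy: at every step, play the item with the largest
    estimated expected marginal gain given the sequence chosen so far and all
    feedback observed so far, then record the new observation.

    expected_gain(item, chosen, observations) -> float   (model-based estimate)
    observe(item) -> feedback revealed only after the item is played
    """
    chosen, observations = [], {}
    for _ in range(horizon):
        candidates = [i for i in items if i not in chosen]
        if not candidates:
            break
        best = max(candidates,
                   key=lambda i: expected_gain(i, chosen, observations))
        chosen.append(best)
        observations[best] = observe(best)   # adaptivity: feedback shapes later picks
    return chosen, observations

# Toy usage: recommend items with diminishing returns within the same category,
# boosted when previously chosen items of that category were liked.
catalog = {"m1": "action", "m2": "action", "m3": "drama", "m4": "comedy"}
def gain(item, chosen, obs):
    same = sum(1 for c in chosen if catalog[c] == catalog[item])
    liked_bonus = sum(0.2 for c, liked in obs.items()
                      if liked and catalog[c] == catalog[item])
    return 1.0 / (1 + same) + liked_bonus
print(adaptive_greedy(list(catalog), gain, observe=lambda item: True, horizon=3))
```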

Text Indexing and Searching in Sublinear Time (1712.07431v2)

J. Ian Munro, Gonzalo Navarro, Yakov Nekrich

2017-12-20

We introduce the first index that can be built in sublinear time for a text of length n and also queried in time sublinear in the length m of the pattern. On a constant-size alphabet, for example, our index is built in deterministic sublinear time and finds the pattern occurrences in sublinear time; the exact space, construction, and search bounds involve an arbitrarily small constant ε, and we compare them with those of the most recent classical text index. We build on a novel text sampling based on difference covers, which enjoys properties that allow us to compute longest common prefixes in constant time. We extend our results to the secondary memory model as well, where we give the first construction achieving such time bounds for a data structure with suffix array functionality, which can search for patterns in almost optimal time, with an additive penalty that depends on the size M of the available main memory and the disk block size B.
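The difference-cover idea behind the text sampling can be illustrated directly: if D is a difference cover modulo v, then any two text positions have sampled positions at a common small offset, which is what lets longest common prefixes be reduced to precomputed information on sampled suffixes. The sketch below, with the illustrative cover {1, 2, 4} modulo 7 and hypothetical helper names, only demonstrates this combinatorial property, not the paper's index.

```python
def is_difference_cover(D, v):
    """Check that every residue mod v is a difference of two elements of D."""
    diffs = {(a - b) % v for a in D for b in D}
    return len(diffs) == v

def make_delta(D, v):
    """For a difference cover D mod v, return delta(i, j): a shift 0 <= d < v
    with (i + d) % v in D and (j + d) % v in D.  Such a shift always exists,
    so every pair of text positions has sampled positions at a common offset."""
    Dset = {x % v for x in D}
    # For every residue d, find b in D with (b + d) % v also in D.
    anchor = {d: next(b for b in Dset if (b + d) % v in Dset) for d in range(v)}
    def delta(i, j):
        d = (j - i) % v
        return (anchor[d] - i) % v
    return delta

# Example: D = {1, 2, 4} is a difference cover modulo 7.
D, v = {1, 2, 4}, 7
assert is_difference_cover(D, v)
delta = make_delta(D, v)
i, j = 10, 25
s = delta(i, j)
assert (i + s) % v in D and (j + s) % v in D
print(s)
```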

Stochastic One-Sided Full-Information Bandit (1906.08656v1)

Haoyu Zhao, Wei Chen

2019-06-20

In this paper, we study the stochastic version of the one-sided full-information bandit problem, in which there are K arms and playing arm i gains a reward drawn from an unknown distribution for arm i while also revealing the reward feedback of all arms j ≥ i. The one-sided full-information bandit can model online repeated second-price auctions, where the auctioneer selects the reserve price in each round and the bidders reveal their bids only when those bids are higher than the reserve price. We present an elimination-based algorithm to solve the problem. It achieves a distribution-independent regret upper bound as well as a distribution-dependent bound, where the latter is a formula, specified in detail in the paper, that depends on the time horizon T and on the vector of gaps between the mean reward of each arm and the mean reward of the best arm. Our algorithm has the best theoretical regret upper bound so far. We also validate it empirically against other possible alternatives.
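The sketch below shows the shape of an elimination-style algorithm in this setting, assuming (as in the auction interpretation above) that playing arm i reveals the rewards of all arms j ≥ i and that rewards lie in [0, 1]. The Hoeffding confidence radii and the Bernoulli toy oracle are assumptions for the sketch, not the paper's exact algorithm or bounds.

```python
import math
import random

def one_sided_elimination(sample_from, K, T, delta=0.01):
    """Elimination sketch for a stochastic one-sided full-information bandit.

    Assumed feedback: playing arm i reveals the reward of every arm j >= i in
    that round, so playing the lowest-indexed active arm gives feedback on all
    active arms.  Arms whose upper confidence bound drops below the best lower
    confidence bound are eliminated.  Rewards are assumed to lie in [0, 1].
    """
    active = list(range(K))
    sums, counts = [0.0] * K, [0] * K
    total_reward = 0.0
    for _ in range(T):
        play = min(active)                  # covers every remaining arm
        feedback = sample_from(play)        # dict: arm j >= play -> reward
        total_reward += feedback[play]
        for j, r in feedback.items():
            sums[j] += r
            counts[j] += 1
        rad = lambda j: math.sqrt(math.log(2 * K * T / delta) / (2 * counts[j]))
        best_lcb = max(sums[j] / counts[j] - rad(j) for j in active)
        active = [j for j in active if sums[j] / counts[j] + rad(j) >= best_lcb]
    return total_reward

# Toy oracle: K Bernoulli arms with increasing means (the last arm is best).
means = [0.2, 0.4, 0.6, 0.8]
rng = random.Random(0)
oracle = lambda i: {j: float(rng.random() < means[j]) for j in range(i, len(means))}
print(one_sided_elimination(oracle, K=len(means), T=2000))
```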

Quantum versus Classical Online Streaming Algorithms with Logarithmic Size of Memory (1710.09595v3)

Kamil Khadiev, Aliya Khadieva, Dmitry Kravchenko, Alexander Rivosh, Ramis Yamilov, Ilnaz Mannapov

2017-10-26

We consider online algorithms with respect to the competitive ratio. Specifically, we investigate quantum and classical one-way automata with non-constant memory size (streaming algorithms) as a model for online algorithms. We construct problems that quantum online streaming algorithms can solve better than classical ones when the memory is of logarithmic or sublogarithmic size.

Quantum Algorithm for Dynamic Programming Approach for DAGs. Applications for Zhegalkin Polynomial Evaluation and Some Problems on DAGs (1804.09950v2)

Kamil Khadiev, Liliya Safina

2018-04-26

In this paper, we present a quantum algorithm for the dynamic programming approach to problems on directed acyclic graphs (DAGs). The running time of the algorithm is O(√(n̂m) log n̂), whereas the best known deterministic algorithm runs in O(n + m) time, where n is the number of vertices, n̂ is the number of vertices with at least one outgoing edge, and m is the number of edges. We show that we can solve problems whose main transition steps use the OR, AND, NAND, MAX and MIN functions. The approach is useful for several problems. One of them is evaluating a Boolean formula represented by a Zhegalkin polynomial, i.e., a Boolean circuit with shared inputs and non-constant depth. Another two are the single-source longest path search for weighted DAGs and the diameter search problem for unweighted DAGs.
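For reference, the classical O(n + m) dynamic program that the quantum algorithm accelerates processes vertices in topological order and combines edge values with MAX (or MIN, OR, AND). Below is a sketch of that classical baseline for single-source longest paths on a weighted DAG; the Grover-style search the quantum algorithm would use to evaluate each MAX is not modeled here.

```python
from collections import defaultdict

def longest_paths_dag(n, edges, source):
    """Single-source longest paths in a weighted DAG by dynamic programming
    over a topological order: dist[v] = max over edges (u, v, w) of dist[u] + w.
    Runs in O(n + m) time; this is the classical baseline that a quantum
    algorithm can speed up by computing each MAX with amplitude amplification."""
    adj = defaultdict(list)
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    # Kahn's algorithm for a topological order.
    order, stack = [], [v for v in range(n) if indeg[v] == 0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    NEG = float("-inf")
    dist = [NEG] * n
    dist[source] = 0
    for u in order:                      # topological order => dist[u] is final
        if dist[u] == NEG:
            continue
        for v, w in adj[u]:
            dist[v] = max(dist[v], dist[u] + w)
    return dist

# Example: paths 0 -> 1 -> 3 and 0 -> 2 -> 3 with weights; longest to 3 is 6.
print(longest_paths_dag(4, [(0, 1, 1.0), (1, 3, 5.0), (0, 2, 4.0), (2, 3, 1.0)], 0))
```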

Coresets for Clustering with Fairness Constraints (1906.08484v1)

Lingxiao Huang, Shaofeng H. -C. Jiang, Nisheeth K. Vishnoi

2019-06-20

In a recent work, Chierichetti et al. studied the following "fair" variants of classical clustering problems such as k-means and k-median: given a set of data points in R^d and a binary type associated with each data point, the goal is to cluster the points while ensuring that the proportion of each type in each cluster is roughly the same as its underlying proportion. Subsequent work has focused either on extending this setting to the case where each data point has multiple, non-disjoint sensitive types such as race and gender, or on addressing the problem that the clustering algorithms in the above work do not scale well. The main contribution of this paper is an approach to clustering with fairness constraints that involves multiple, non-disjoint types and that is also scalable. Our approach is based on novel constructions of coresets: for the k-median objective, we construct an ε-coreset whose size is independent of the size of the dataset and depends on Γ, the number of distinct collections of groups that a point may belong to, and for the k-means objective, we show how to construct an ε-coreset with a similar size guarantee. The former result is the first known coreset construction for the fair clustering problem with the k-median objective, and the latter removes the dependence on the size of the full dataset present in Schmidt et al. and generalizes it to multiple, non-disjoint types. Plugging our coresets into existing algorithms for fair clustering, such as that of Backurs et al., results in the fastest algorithms for several cases. Empirically, we assess our approach on the Adult and Bank datasets and show that the coreset sizes are much smaller than the full dataset; applying coresets indeed accelerates the computation of the fair clustering objective while keeping the resulting objective difference small.
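To illustrate how a coreset is used once constructed, the sketch below evaluates a weighted k-median objective on a small reweighted sample instead of the full dataset. The uniform sampling here is only a stand-in assumption; the paper's constructions are not uniform sampling and come with size and approximation guarantees that uniform sampling does not provide.

```python
import random

def uniform_coreset(points, m, seed=0):
    """Illustrative coreset by uniform sampling: m sampled points, each
    reweighted by n/m so that weighted costs roughly track costs on the full
    data.  (Not the paper's construction; it only shows how a weighted
    coreset plugs into a clustering objective.)"""
    rng = random.Random(seed)
    n = len(points)
    return [(p, n / m) for p in rng.sample(points, m)]

def weighted_kmedian_cost(weighted_points, centers):
    """Weighted k-median objective: sum of weight * distance to nearest center."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    return sum(w * min(dist(p, c) for c in centers) for p, w in weighted_points)

# Usage: evaluate candidate centers on the coreset instead of the full dataset.
pts = [(random.random(), random.random()) for _ in range(10_000)]
core = uniform_coreset(pts, m=200)
centers = [(0.25, 0.25), (0.75, 0.75)]
print(weighted_kmedian_cost(core, centers))
print(weighted_kmedian_cost([(p, 1.0) for p in pts], centers))  # full-data cost, for comparison
```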

Extensions of Self-Improving Sorters (1906.08448v1)

Siu-Wing Cheng, Kai Jin, Lie Yan

2019-06-20

Ailon et al. (SICOMP 2011) proposed a self-improving sorter that tunes its performance to an unknown input distribution in a training phase. The input numbers x_1, ..., x_n come from a product distribution, that is, each x_i is drawn independently from an arbitrary distribution D_i. We study two relaxations of this requirement. The first extension models hidden classes in the input: we consider the case in which numbers in the same class are governed by linear functions of the same hidden random parameter. The second extension considers a hidden mixture of product distributions.
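A minimal sketch of the basic self-improving sorter that this work extends: a training phase learns global bucket boundaries from samples of the product distribution, and the operation phase routes each input element to its bucket and sorts the small buckets. The Gaussian distributions, the number of buckets, and the plain binary search are assumptions for this sketch; the actual sorter of Ailon et al. additionally builds per-coordinate search structures to achieve optimal expected time.

```python
import bisect
import random

def train_vlist(training_inputs, num_buckets):
    """Training phase: pool the training samples and keep evenly spaced order
    statistics as global bucket boundaries (a simplified 'V-list').  With
    fresh inputs from the learned product distribution, each bucket then
    receives O(1) elements in expectation."""
    pooled = sorted(v for inst in training_inputs for v in inst)
    step = max(1, len(pooled) // num_buckets)
    return pooled[step::step][: num_buckets - 1]

def self_improving_sort(x, boundaries):
    """Operation phase: route each element to its bucket, sort the (expected
    small) buckets, and concatenate.  Plain binary search stands in for the
    per-position search structures of the real sorter."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for value in x:
        buckets[bisect.bisect_right(boundaries, value)].append(value)
    out = []
    for b in buckets:
        out.extend(sorted(b))
    return out

# Train on instances from a product distribution D_1 x ... x D_n, then sort a fresh instance.
rng = random.Random(1)
def draw_instance(n=50):
    return [rng.gauss(i, 1.0) for i in range(n)]   # D_i = Normal(i, 1)
train = [draw_instance() for _ in range(500)]
bnd = train_vlist(train, num_buckets=len(train[0]))
fresh = draw_instance()
assert self_improving_sort(fresh, bnd) == sorted(fresh)
print(self_improving_sort(fresh, bnd)[:5])
```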

A Fast MCMC for the Uniform Sampling of Binary Matrices with Fixed Margins (1904.03836v2)

Guanyang Wang

2019-04-08

Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology, and other fields. The well-known swap algorithm becomes inefficient when the matrix is large or too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm that samples binary matrices with fixed margins uniformly. Theoretically, the Rectangle Loop algorithm is better than the swap algorithm in Peskun's order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
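For context, here is a sketch of the classical swap (checkerboard) chain that the Rectangle Loop algorithm is compared against; the Rectangle Loop moves themselves are not reproduced here.

```python
import random

def swap_chain_step(M, rng=random):
    """One step of the classical swap chain on 0/1 matrices with fixed row and
    column sums: pick two rows and two columns at random and, if the induced
    2x2 submatrix is a checkerboard, flip it to the other checkerboard.  The
    move preserves all margins, and the chain is symmetric, so its stationary
    distribution is uniform over matrices with the given margins."""
    n_rows, n_cols = len(M), len(M[0])
    r1, r2 = rng.sample(range(n_rows), 2)
    c1, c2 = rng.sample(range(n_cols), 2)
    a, b, c, d = M[r1][c1], M[r1][c2], M[r2][c1], M[r2][c2]
    if a == d and b == c and a != b:          # checkerboard pattern
        M[r1][c1], M[r1][c2] = b, a
        M[r2][c1], M[r2][c2] = d, c
    return M

# Usage: start from any matrix with the desired margins and run many steps.
M = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
for _ in range(10_000):
    swap_chain_step(M)
print(M, [sum(row) for row in M], [sum(col) for col in zip(*M)])
```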


