Technical Reports
Permanent URI for this collection
Browse
Browsing Technical Reports by Title
Now showing 1 - 20 of 418
Results Per Page
Sort Options
- ItemAbsolute Loss Bounds for Prediction using Linear Functions(1996-07-01T00:00:00Z) Philip M. LongWe prove new absolute loss bounds for learning linear functions in the standard on-line prediction model. These bounds are on the difference between the sum of absolute prediction errors made by the learning algorithm, and the best sum of absolute prediction errors that can be obtained by fixing a linear function in some class. Known results imply that our bounds on this difference cannot be improved by more than a constant factor.
- ItemAbsolute versus probabilistic classification in a logical setting(2006-03-22T01:23:33Z) JAIN, Sanjay; MARTIN, Eric; STEPHAN, FrankGiven a set W of logical structures, or possible worlds, a set of logical formulas called possible data and a logical formula phi, we consider the classification problem of determining in the limit and almost always correctly whether a possible world M in W satisfies phi, from a complete enumeration of the possible data that are true in M. One interpretation of almost always correctly is that the classification might be wrong on a set of possible worlds of measure 0, with respect to some natural probability distribution over the set of possible worlds. Another interpretation is that the classifier is only required to classify a set W' of possible worlds of measure 1, without having to produce any claim in the limit on the truth of phi in the members of W which are not in W'. We compare these notions with absolute classification of W with respect to a formula that is almost always equivalent to phi in W, hence investigate whether the set of possible worlds on which the classification is correct is definable. We mainly work with the probability distribution that corresponds to the standard measure on the Cantor space, but we also consider an alternative probability distribution proposed by Solomonoff and contrast it with the former. Finally, in the spirit of the kind of computations considered in Logic programming, we address the issue of computing almost correctly in the limit witnesses to leading existentially quantified variables in existential formulas.
- ItemAC-5*:An Improved AC-5 and its Specializations(1996-04-01T00:00:00Z) Bing LIUMany general and specific arc consistency algorithms have been produced in the past for solving Constraint Satisfaction Problems (CSP). The important general algorithms are AC-3, AC-4, AC-5 and AC-6. AC-5 is also a generic algorithm. It can be reduced to AC-3, AC-4 and AC-6. Specific algorithms are efficient specializations of the general ones for specific constraints. Functional, anti-functional and monotonic constraints are three important classes of specific constraints. AC-5 has been specialized to produce an O(ed) algorithm (in time) for these classes of constraints. However, this specialization does not reduce the space requirement. In practical applications, both time and space requirements are important criteria in choosing an algorithm. This paper makes two contributions. First, we propose an improved generic arc consistency algorithm, called AC-5*. It can be specialized to reduce both time and space complexities. Second, we present a more efficient technique for handling an important subclass of functional constraints, namely increasing functional constraints. This technique is significant because in practice almost all functional constraints are actually increasing functional constraints.
- ItemAccelerating Point-Based POMDP Algorithms through Successive Approximations of the Optimal Reachable Space(2007-04-29) HSU, David; LEE, Wee Sun; RONG, NanPoint-based approximation algorithms have drastically im-proved the speed of POMDP planning. This paper presents a new point-based POMDP algorithm called SARSOP. Like earlier point-based algorithms, SARSOP performs value iter-ation at a set of sampled belief points; however, it focuses on sampling near the space reachable from an initial belief point under the optimal policy. Since neither the optimal policynor the optimal reachable space is known in advance, SARSOP builds successive approximations to it through sampling and pruning. In our experiments, the new algorithm solved dif-.cult POMDP problems with more than 10,000 states. Its running time is competitive with the fastest existing point-based algorithm on most problems andfasterby manytimes on some. Our approach is complementary to existing point-based algorithms and can be integrated with them to improve their performance.
- ItemAccurate and True Reconstruction of 3D Models with Closed-Form Solutions(1998-12-01T00:00:00Z) Wee Kheng Leow; Zhiyong Huang3D models of objects or scenes can be reconstructed from image sequences or multiple views. Reconstruction algorithms that adopt calibrated cameras generally produce more accurate results than those that adopt uncalibrated cameras, whereas the latter are more convenient to use. Depending on an application's need, one may trade off between reconstruction accuracy and convenience. The objective of this research is to devise an accurate method for reconstructing the true and complete 3D models of small objects for use in applications such as virtual reality and model-based object recognition. This technical report focuses on the task of accurately recovering 3D coordinates of world points given a set of tracked 2D image points. To ensure reconstruction accuracy, the method uses a calibrated camera and a computer-controlled turntable. With this equipment setup, close-form solutions to the reconstruction problem are derived for various camera models including the perspective and the projective models. The existence of close-form solutions facilitate the construction of an efficient, accurate, and reliable optimization algorithm that performs true and complete reconstruction. Comprehensive experiments have been conducted to test the method's performance. Experimental results show that the method is more accurate and robust against noise than existing methods.
- ItemAchieving Sublinear Complexity under Constant 𝑇 in 𝑇 -interval Dynamic Networks(2022-05-26) Hou, Ruomu; Jahja, Irvan; Sun, Yucheng; Wu, Jiyan; Yu, HaifengThis paper considers standard {\em $T$-interval dynamic networks}, where the $N$ nodes in the network proceed in lock-step {\em rounds}, and where the topology of the network can change arbitrarily from round to round, as determined by an {\em adversary}. The adversary promises that in every $T$ consecutive rounds, the $T$ (potentially different) topologies in those $T$ rounds contain a common connected subgraph that spans all nodes. Within such a context, we propose novel algorithms for solving some fundamental distributed computing problems such as Count/Consensus/Max. Our algorithms are the first algorithms whose complexities do not contain an $\Omega(N)$ term, under constant $T$ values. Previous sublinear algorithms require significantly larger $T$ values.
- ItemActive Learning for Causal Bayesian Network Structure with Non-symmetrical Entropy(2008-06-26) LI, Guoliang; LEONG, Tze-YunCausal knowledge is crucial for facilitating comprehension, diagnosis, prediction, and control in automated reasoning. Active learning in Bayesian networks involves interventions by manipulating specific variables or their interactions, and observing the patterns of change over the other variables to derive causal relationships for knowledge discovery. In this paper, we propose a new active learning approach that supports interventions with node selection. Our method admits a node selection criterion based on non-symmetrical information entropy and a stop criterion based on minimizing structure entropy of the resulting networks. We examine the technical challenges and practical issues in developing effective node selection and stopping criteria in our method. Experimental results on a set of benchmark Bayesian networks are promising. The proposed method is applicable in many real-life applications where multiple instances are simultaneously sampled as a data set in each active learning step.
- Item
- ItemAdaptive Hybrid Sampling for Probabilistic Roadmap Planning(2004-05-01T00:00:00Z) David HSU; Zheng SUNSeveral sophisticated sampling strategies have been proposed recently to address the narrow passage problem for probabilistic roadmap (PRM)planning. They all have unique strengths and weaknesses in different environments, but none seems sufficient on its own in general. In this paper, we propose a systematic approach for adaptively combining multiple sampling strategies for PRM planning. Using this approach, we describe three adaptive hybrid sampling strategies. Two are motivated by theoretical results from the computational learning theory. Another one is simple and performs well in practice. We tested them on robots with two to eight degrees of freedom in planar workspaces. In these preliminary tests, the adaptive hybrid sampling strategies showed consistently good performance, compared with fixed-weight hybrid sampling strategies.
- ItemAdaptive Pre-computed Partition Top Method for Range Top-k Queries in OLAP Data Cubes(2001-07-01T00:00:00Z) Zheng Xuan LOH; Tok Wang LING; Chuan Heng ANG; Sin Yeung LEE; Hua-Gang LIA range top-k query finds the top-k maximum values over all selected cells of an On-Line Analytical Processing (OLAP) data cube where the selection is specified by providing ranges of contiguous values for each dimension. The naive method answers a range query by accessing every individual cell in the data cube. Therefore, the naive method is very costly and is exponentially proportional to the number of dimensions of the data cube. Here, we propose a new method for handling the range top-k queries efficiently, termed the Adaptive Pre-computed Partition Top method (APPT). This method partitions the given data cube and stores r pre-computed maximum values with the corresponding locations over partitioned sub-blocks. Furthermore, an area termed overflow array stores additional maximum values for sub-block, which requires more than r maximum values for some range queries. By pre-storing the additional maximum values in the overflow array, the response time for the subsequent queries with the similar ranges will be significantly reduced. The experiment results exhibit vast improvement in terms of the response time of our APPT method as compared to the naive method. Our method promises an average update cost of (1+t) total disk accesses, where t is the average number of IO blocks in the overflow array for each sub-block. This method, which is equipped with intelligence and self- learning capability, is able to maintain the value of t below 0.5. Moreover, the APPT method can be efficiently applied on uniform and non-uniform data distributions in both dense and sparse OLAP data cubes.
- ItemAdvances in Digital Image Compression Techniques(1992-02-01T00:00:00Z) Lu GuojunAbstract not available.
- ItemAffine-Based Size-Change Termination(2003-09-01T00:00:00Z) Anderson Hugh; Siau Cheng KHOOThe size-change principle devised by Lee, Jones and Ben-Amram, provides an effective method of determining program termination for recursive functions over well-founded types. Termination analysis using the principle involves the classication of functions either into size-change terminating ones, or ones which are not size-change terminating. Size-change graphs are constructed to represent the functions, and decreasing parameter sizes in those graphs that are idempotent are identifed. In this paper, we propose a translation of the sizechange graphs to affine-based graphs, in which affine relations among parameters are expressed by Presburger formulae. We show the correctness of our translation by defining the size-change graph composition in terms of affine relation manipulation, and identifying the idempotent size-change graphs with transitive closures in affine relations. We then propose an affine-based termination analysis, in which more refined termination size-change information is admissible by affine relations. Specifically, our affine-related analysis improves the effectiveness of the termination analysis by capturing constant changes in parameter sizes, affine relationships of the sizes of the source parameters, and contextual information pertaining to function calls. We state and reason about the corresponding soundness and termination of this affine-related analysis. Our approach widens the set of size-change terminating functions.
- ItemAggregation of Association Rules(1999-07-01T00:00:00Z) Shichao ZHANG; Xindong WUDealing with very large databases is one of the defining challenges in data mining research and development. Some databases are simply too large (e.g., with terabytes of data) to be processed at one time, for efficiency and space reasons, so splitting them into subsets for processing is a necessary step. Also, some organizations have different data sources (e.g., different branches of a large company), and while putting all data from different sources might amass a huge database for centralized processing, mining rules at different data sources and forwarding the rules (rather than the original raw data) to the centralized company headquarter provides a feasible way to deal with very large database problems. This paper presents a model of aggregating association rules from different data sources. Each data source could also be a subset of a very large database, and so the aggregation model is applicable to both dealing with very large databases by splitting them into subsets, and processing data from different data sources.
- ItemThe Algebra and Analysis of Adaptive-Binning(2002-08-01T00:00:00Z) Wee Kheng LEOWHistograms are commonly used in content-based image retrieval systems to represent the distributions of colors in images. It is a common understanding that histograms that adapt to images can represent their color distributions more efficiently than do histograms with fixed binnings. However, existing systems almost exclusively adopt fixed-binning histograms because among existing well-known dissimilarity measures, only the computationally expensive Earth Mover's Distance (EMD) can compare histograms with different binnings. Another major concern is that fixed-binning histograms have been regarded as vectors in a linear vector space so that powerful algorithms such as clustering, PCA, and SVD can be applied to process and analyze the histograms. Unfortunately, this approach is not satisfactory because the natural distance measure in linear vector space, the Euclidean distance, is less reliable than other non-Euclidean dissimilarities are in measuring histogram dissimilarity, thus compromising the effectiveness and reliability of the approach. This technical report addresses the above issues in a mathematically sound manner. In this article, adaptive histograms are formally defined and are provided with a well-defined set of operations. Properties of the operations on adaptive histograms are analyzed, leading to mathematically sound definitions of similarity, dissimilarity, and mean of histograms. Extensive test results in \cite{Leow-cvpr2001} have shown that the use of adaptive histograms produces the best overall performance, in terms of good accuracy, small number of bins, no empty bin, and efficient computation, compared to existing methods in histogram retrieval and classification tasks.
- ItemAlgebra and the Formal Semantics of GLASS(2005-09-13) NI, Wei; LING, Tok WangIn database world, it is common to translate a query language into an algebra for the purpose of precisely defining the formal semantics of a query language and doing query optimization later. In this paper, we examine the scenario of graphical XML query languages, focus on their expressive power and present the underlying algebra of our graphical XML query language. Compared with various previous works on XML algebra, our algebra supports not only traditional select, project and join operators but also swap and SQL-like group operators. To achieve the exactness in query representation, we use ORA-SS (Object-Relationship-Attribute model for Semi-Structured data), a semantic rich data model for XML including the information such as Key constraints, Functional dependencies and Relationship types which are lacked in DTD. With examples, we show how our graphical language solves the difficult points in representation and how it is translated into our algebra. Based on the translation, we use the algebra to define the formal semantics of GLASS.
- ItemAnalyzing Temporal Keyword Queries for Interactive Search over Temporal DatabasesGAO, Qiao; LEE, Mong Li; LING, Tok Wang; DOBBIE, Gillian; ZENG, ZhongQuerying temporal relational databases is a challenge for non-expert database users, since it requires users to understand the semantics of the database and apply temporal joins as well as temporal conditions correctly in SQL statements. Traditional keyword search approaches are not directly applicable to temporal relational databases since they treat time-related keywords as tuple values and do not consider the temporal joins between relations, which leads to missing answers, incorrect answers and missing query interpretations. In this work, we extend keyword queries to allow the temporal predicates, and design a schema graph approach based on the Object-Relationship-Attribute (ORA) semantics. This approach enables us to identify temporal attributes of objects/relationships and infer the target temporal data of temporal predicates, thus improving the completeness and correctness of temporal keyword search and capturing the various possible interpretations of temporal keyword queries.
- ItemAnswering Keyword Queries involving Aggregates and Group-Bys in Relational Databases(2015-07-08) Zhong, ZENG; Mong Li, Lee; Tok Wang, LingKeyword search over relational databases has gained popularity as it provides a user-friendly way to explore structured data. Current research has focused on the computation of minimal units that contain all the query keywords, and largely ignores queries to retrieve statistical information from the database. The latter involves aggregate functions and group-bys, and are called aggregate queries. In this work, we propose a semantic approach to answer keyword queries containing aggregates and group-bys. Our approach utilizes the ORM schema graph to capture the semantics of objects and relationships in the database, and determines the various interpretations of a query. Based on each interpretation, we generate an SQL statement to apply aggregates and group-bys. Further, we detect duplications of objects and relationships arising from denormalized relations so that the aggregate functions will not compute the statistics for the same information repeatedly. Experimental results demonstrate that our approach is able to return correct answers to aggregate queries.
- ItemAn Application of Constraint Programming to Mobile Data(2005-10-19) BRESSAN, Stephane; TOK, Wee Hyong; PRESTWICH, StevenGiven the current ubiquity of wireless networks and mobile devices, data must be disseminated to a large number of these devices in an effective and efficient manner. A promising technique for such applications is data broadcasting. We introduce a new optimization problem derived from mobile data broadcasting applications. A single broadcaster transmits pages of data in sequence in response to requests from multiple clients, each asking for different subsets of a set of available pages. A transmitted page can be read by any number of clients, and we aim to minimise the total time taken to transmit all pages. Users impose additional constraints on how long they are willing to wait between pages, because of limited power supply in mobile devices. This problem is closely related to the multiple sequence alignment problem from bioinformatics. We present a first constraint model for this problem, designed for use with randomized local search algorithms.
- ItemApplications of Kolmogorov Complexity to Computable Model Theory(2006-09-07T06:46:20Z) KHOUSSAINOV, Bakhadyr; SEMUKHIN, Pavel; STEPHAN, FrankIn this paper we answer the following well-known open question in computable model theory. Does there exist a computable not countably categorical saturated structure with a unique computable isomorphism type? Our answer is affirmative and uses a construction based on Kolmogorov complexity. With a variation of this construction, we also provide an example of a uncountably categorical but not countably categorical saturated Sigma-0-1-structure with a unique computable isomorphism type. In addition, using the construction we give an example of an uncountably categorical but not countably categorical theory whose only non-computable model is the prime one.
- ItemApplying Cognitive Apprenticeship to the Teaching of Smalltalk in a Computer-Based Learning Environment(1993-03-01T00:00:00Z) Chee Y S; Tan J T; Chan TaizanAbstract not available.