
Browsing by Author "Sanjay JAIN"

Now showing 1 - 4 of 4
  • Identifying Clusters from Positive Data
    (2004-07-01) John CASE; Sanjay JAIN; Eric MARTIN; Arun SHARMA; Frank STEPHAN
    The present work studies clustering from an abstract point of view and investigates its properties in the framework of inductive inference. Any class S considered is given by a numbering A_0, A_1, ... of nonempty subsets of ℕ or ℝ^k which is also used as a hypothesis space. A clustering task is a finite and nonempty set I of indices of pairwise disjoint sets. The class S is said to be clusterable if there is an algorithm which, for every clustering task I, converges in the limit on any text for ⋃_{i∈I} A_i to a finite set J of indices of pairwise disjoint clusters such that ⋃_{j∈J} A_j = ⋃_{i∈I} A_i. A class is called semiclusterable if there is such an algorithm which finds a J with the last condition relaxed to ⋃_{j∈J} A_j ⊆ ⋃_{i∈I} A_i. The relationship between natural topological properties and clusterability is investigated. Topological properties can provide sufficient or necessary conditions for clusterability, but they cannot characterize it. On the one hand, many interesting conditions make use of both the topological structure of the class and a well-chosen numbering. On the other hand, the clusterability of a class does not depend on which numbering of the class is used as a hypothesis space for the clusterer. These ideas are demonstrated in the context of geometrically defined classes. Clustering many of these classes requires, besides the text for the clustering task, some additional information: the class of convex hulls of finitely many points in a rational vector space can be clustered with the number of clusters as additional information. Interestingly, the class of polygons (together with their interiors) is clusterable if the number of clusters and the overall number of vertices of these clusters are given to the clusterer as additional information. Intriguingly, this additional information is not sufficient for classes including figures with holes.
    While some classes are unclusterable due to their topological structure, others are only computationally intractable. An oracle might permit clustering all computationally intractable tasks but fail on some classes which are topologically difficult. It is shown that an oracle E permits clustering all computationally difficult classes iff E ≥_T K and E′ ≥_T K″. Furthermore, no 1-generic oracle below K and no 2-generic oracle permits clustering any class which is not clusterable without an oracle.
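    A compact restatement of the two learning criteria defined in the abstract above (notation as in the abstract: a clustering task I is a finite, nonempty set of indices of pairwise disjoint sets A_i, and J is the finite index set the learner converges to):

    ```latex
    % Clusterable: on any text for the union of the task's sets, the
    % learner converges in the limit to a finite set J of indices of
    % pairwise disjoint clusters covering exactly the same points:
    \bigcup_{j \in J} A_j = \bigcup_{i \in I} A_i
    % Semiclusterable: the equality above is relaxed to containment:
    \bigcup_{j \in J} A_j \subseteq \bigcup_{i \in I} A_i
    ```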
  • Learning Language from Positive Data and Finite Number of Queries
    (2004-04-01) Sanjay JAIN; Efim KINBER
    A computational model for learning languages in the limit from full positive data and a bounded number of queries to the teacher (oracle) is introduced and explored. Equivalence, superset, and subset queries are considered (for the latter we also consider a variant in which the learner tests every conjecture, but the number of negative answers is uniformly bounded). If the answer is negative, the teacher may provide a counterexample. We consider several types of counterexamples: arbitrary, least counterexamples, the ones whose size is bounded by the size of the positive data seen so far, and no counterexamples. A number of hierarchies based on the number of queries (answers) and types of answers/counterexamples are established. Capabilities of learning with different types of queries are compared. In most cases, one or two queries of one type can do more than any bounded number of queries of another type. Still, surprisingly, a finite number of subset queries is sufficient to simulate the same number of (stronger) equivalence queries when behaviourally correct learners do not receive counterexamples and may have an unbounded number of errors in almost all conjectures.
  • Learning Languages from Positive Data and Negative Counterexamples
    (2004-04-01) Sanjay JAIN; Efim KINBER
    In this paper we introduce a paradigm for learning in the limit of potentially infinite languages from all positive data and negative counterexamples provided in response to the conjectures made by the learner. Several variants of this paradigm are considered that reflect different conditions/constraints on the type and size of negative counterexamples and on the time for obtaining them. In particular, we consider the models where (1) a learner gets the least negative counterexample; (2) the size of a negative counterexample must be bounded by the size of the positive data seen so far; (3) a counterexample can be delayed indefinitely. The learning power and limitations of these models, relationships between them, as well as their relationships with classical paradigms for learning languages in the limit (without negative counterexamples) are explored. Several surprising results are obtained. In particular, for Gold's model of learning, which requires a learner to syntactically stabilize on correct conjectures, learners getting negative counterexamples immediately turn out to be as powerful as the ones that do not get them for indefinitely long time (or are only told that their latest conjecture is not a subset, without any specific negative counterexample). Another result shows that for behaviourally correct learning (where semantic convergence is required from a learner) with negative counterexamples, a learner making just one error in almost all its correct conjectures has the 'ultimate power': it can learn the class of all recursively enumerable languages. Yet another result demonstrates that sometimes positive data and negative counterexamples provided by a teacher are not enough to compensate for full positive and negative data.
  • On Intrinsic Complexity of Learning Geometrical Concepts from Texts
    (1999-06-01) Sanjay JAIN; Efim KINBER
    The goal of this paper is to quantify the complexity of algorithmic learning of geometrical concepts from growing finite segments of texts. The geometrical concepts we consider are variants of open-hulls. We use intrinsic complexity as our complexity measure. The scale we use is based on a hierarchy of degrees of intrinsic complexity composed of simple natural ground degrees such as INIT and COINIT.

DSpace software copyright © 2002-2025 LYRASIS
