A POTENTIAL THEORY APPROACH TO AN ALGORITHM OF CONCEPTUAL SPACE PARTITIONING

This paper proposes a new classification algorithm for the partitioning of a conceptual space. Most of the algorithms used until now have been based on the theory of Voronoi diagrams. This paper proposes an approach based on potential theory, in which the criterion for measuring similarity between objects in the conceptual space is based on the Newtonian potential function. The notion of a fuzzy prototype, which generalizes the previous definition of a prototype, is introduced. Furthermore, the necessary conditions that a natural concept must meet are discussed. Instead of convexity, as proposed by Gärdenfors, the notion of geodesically convex sets is used. Thus, if a concept corresponds to a set which is geodesically convex, it is a natural concept. This definition applies, for example, if the conceptual space is a Euclidean space. As a by-product of the construction of the algorithm, an extension of the conceptual space to d-dimensional Riemannian manifolds is obtained.


Introduction
This paper considers the problem of partitioning a conceptual space (the definition of which appears later). In a broad sense, the problem of partitioning a given space is closely related to the problem of data classification (Gordon, 1999). Recently, the theory and application of data classification have attracted the interest of many scientists from different areas, who are working on this topic and related problems. The reason for this trend is that companies and institutions, in industry and beyond, as well as scientists working in such areas as statistics, economics, sociology, psychology and linguistics (especially corpus linguistics), collect enormous sets of data in order to identify patterns and structure in them. Usually, the first step is to divide a set of data into smaller classes in such a way that elements belonging to the same class are, in some sense, similar to each other, whereas elements from different classes are essentially different. A measure of similarity (or dissimilarity) is very often constructed ad hoc, depending on the concrete practical problem considered. The classical monograph mentioned above (Gordon, 1999) contains a survey of possible similarity measures. This survey cannot cover all possible measures, since they depend on the problem in question or on the hypothesis which is to be tested. Hence, there are in effect infinitely many similarity measures.
A set which is divided into several classes has a simple structure, and is therefore potentially easier to deal with.
The paradigm of dividing objects into classes is one of the most natural cognitive processes, one which is performed by human beings upon the objects of their surrounding reality. The next process is the naming of the encountered objects. An object which does not have a name does not exist.
As mentioned earlier, the problem under current consideration comes from the area of cognitivism. Cognitivism is an interdisciplinary science whose main aim is to analyze and model the brain activity of human beings, as well as their senses. Cognitivism also constitutes a basis for other sciences. One example of this is cognitive linguistics, which is relevant to the problems discussed in this paper, see e.g. Gärdenfors (2011).
For many years, the dominant paradigm in linguistics was structuralism, which started with the groundbreaking ideas of Ferdinand de Saussure, published posthumously in 1916 by his students in the famous book Cours de linguistique générale (Saussure, 1916). Around 1980, the ideas of structuralism seemed to fall out of favor and linguistics shifted to the new paradigm of cognitivism. It became widely accepted that natural language should be studied together with its relation to the perception of reality. The term language and world view was coined, although the idea itself is much older. One of the pioneers of cognitive linguistics is Ronald Langacker, who in 1986 published a paper which laid out the basics of cognitive grammar (Langacker, 1986, see also Langacker, 2008).
This paper is concerned with the notion of conceptual space, an idea introduced to cognitive studies by Peter Gärdenfors (1988, 1996). At least to start with, the paper focuses on conceptual space as the usual $d$-dimensional Euclidean space $\mathbb{R}^d = \{(x_1, \ldots, x_d) : x_i \in \mathbb{R}\}$, which is simply the set of $d$-dimensional vectors with real entries. In other words, the focus is on objects which can be characterized by $d$ real parameters. The second important example of a conceptual space is a subset of $\mathbb{R}^d$ (for example, if $x_1$ represents the mass of the object $x$, then $x_1$ can only be a nonnegative number; thus, instead of the whole real line there is only a half-line on the first axis). A concept in the conceptual space is represented by a subset. A natural concept [1] is a concept which is given by a convex subset [2]. The subsets are constructed on the basis of prototypes, i.e. the objects which are "the best" representatives. Usually, all methods of partitioning the conceptual space are based on some version of the Voronoi diagram, see Okabe, Boots, Sugihara, & Chiu (2000). As a result, all sets corresponding to a given concept are polygons (they may be unbounded). Gärdenfors, in his monograph (Gärdenfors, 2000), claims that:
"A Voronoi tessellation based on a set of prototypes is a simple way of classifying a continuous space of stimuli. The partitioning results in a discretization [a] of the space. The prime cognitive effect is that the discretization speeds up learning. The reason for this is that remembering the finite prototypes, which is sufficient to compute the tessellation once the metric is given, puts considerably less burden on memory than remembering the categorization of each single point in the space. In other words, a Voronoi tessellation is a cognitively economical way of representing information about concepts. Furthermore, having a space partitioned into a finite number of classes means that it is possible to give names to the classes" (Gärdenfors, 2000, p. 89).
[1] The definition of this notion given in Gärdenfors (2000, p. 67) is not quite clear. According to Gärdenfors, natural concepts are those "that are natural for the purposes of problem-solving, planning, memorizing, communicating, and so forth" (emphasis by Gärdenfors).
[2] A subset $A \subseteq \mathbb{R}^d$ is said to be convex if for all $x, y \in A$ and for every real number $0 \le \lambda \le 1$, the element $\lambda x + (1 - \lambda) y$ belongs to $A$. In other words, together with any two (different) elements of $A$, the whole interval which connects these elements is also contained in $A$.
[a] Emphasis by Gärdenfors.
The idea that partitioning of space takes place in the human mind may seem quite plausible. However, it is not so easy to agree with the second part of Gärdenfors's statement, in which he claims that the human brain, for reasons of economy, uses Voronoi diagrams. It seems unlikely that the human brain is limited to operating only on linear notions. Similar doubts have been articulated by Douven, Decock, Dietz and Égré. Nevertheless, these researchers also used Voronoi diagrams, saying that: "[...] we adopt as our working hypothesis that, to a first approximation at least, the conceptual spaces approach, coupled with the ideas of prototypes and Voronoi diagrams as a way of generating categorizations, captures an important part of the truth about human cognition" (Douven, Decock, Dietz, & Égré, 2013, p. 142). Hence, these authors also, in some sense, suppose the linearity of thinking processes.
This paper presents a different, "nonlinear" partitioning algorithm. One justification for this more general approach may be the similarity of the model to some models which can be found in physics. Thus, this paper's solution to a cognitive problem has its roots in physics. Prototypes in the model can be interpreted as masses placed in $\mathbb{R}^d$. The similarity of a given object to a prototype is measured by the force of attraction between the prototype and the given object. The second innovation in the model is the definition of a more general notion of a prototype (a fuzzy prototype), which has some analogy with fuzzy sets, as defined by Zadeh (1965). Another innovation is the replacement of Gärdenfors' assumption about natural concepts (convexity) by a more general notion, that of geodesic convexity for domains or sets (considered as Riemannian manifolds). Geodesic convexity is a generalization of convexity to spaces with non-zero curvature, and it agrees with convexity on manifolds with zero curvature. Euclidean space has zero curvature and its geodesics are straight lines (or intervals, if connecting only two given points). Therefore, geodesically convex domains or sets in $\mathbb{R}^d$ are convex.
The idea of Voronoi diagrams is to find domains $v(p_1), \ldots, v(p_n) \subseteq X$, so that an element $x$ belongs to $v(p_i)$ if the distance $\rho(x, p_i)$ is smaller than the distances $\rho(x, p_j)$ for $j \neq i$. In other words, the element $x$ is closest to the prototype $p_i$. Hence, $v(p_i) = \{x \in X : \rho(x, p_i) \le \rho(x, p_j) \text{ for every } j \neq i\}$.
The most commonly used space is $X = \mathbb{R}^d$, or its subsets, with the natural $l_2$-metric defined by
$$\rho(x, y) = \left( \sum_{i=1}^{d} (x_i - y_i)^2 \right)^{1/2}.$$
Clearly, the choice of an appropriate metric depends on what the points in the conceptual space represent. Thus, the choice of a specific metric depends on the problem under study.
In the picture above we see an example of a Voronoi diagram for the metric space $X = [0, 5] \times [0, 5]$ with the Euclidean distance function $l_2$ defined above, and three prototypes chosen randomly (three black points).
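The nearest-prototype rule behind a Voronoi diagram can be sketched in a few lines. The following is a minimal illustration, not code from the paper: the function name `voronoi_label` and the three prototype coordinates are assumptions chosen for the example (they are not the randomly chosen prototypes of the picture), and ties on cell boundaries are broken arbitrarily towards the lowest index.

```python
import math

def voronoi_label(x, prototypes):
    """Index of the prototype closest to x in the Euclidean (l2) metric.

    Points on a cell boundary are equidistant from several prototypes;
    this sketch simply returns the lowest such index.
    """
    distances = [math.dist(x, p) for p in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)

# Hypothetical prototypes inside X = [0, 5] x [0, 5].
prototypes = [(1.0, 1.0), (3.0, 2.0), (4.0, 4.0)]
queries = [(0.5, 0.5), (3.0, 2.5), (5.0, 5.0)]
labels = [voronoi_label(q, prototypes) for q in queries]
print(labels)  # prints [0, 1, 2]
```

Only the prototypes and the metric need to be stored; the cell of any point is then recomputed on demand.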

Conceptual space and fuzzy prototypes
In linguistics, there exists the notion of a semantic field (in some cases semantic fields can be considered as equivalent to concepts or natural concepts). The aim of this paper is to show that models which are based on Voronoi diagrams, and which have been widely applied up to now, are inadequate as a means of describing (a part of) reality. Let us start with semantic fields which are difficult to characterize by real parameters. As an example, we can use the semantic field related to modes of transport. It may contain elements such as {cab, horse, bicycle, scooter, skateboard, train, airplane, car}. Examples of this kind require a different approach, the first step of which is related to the concept of fuzziness.
Not all of the modes of transport in question are of equivalent significance nowadays. If one conducted a survey asking randomly selected people to give one example of a mode of transport, the majority would almost certainly answer {car}. An object which appears in surveys with the highest frequency, and which native speakers associate most strongly with a given semantic field, is called a prototype. Therefore, in the example above {car} is the prototype of the semantic field of modes of transport.
Let us now consider an example where the object can be described by $d = 3$ real values. The simplest example of this kind of notion is color. Colors can be characterized by wavelength, saturation, and hue. Not all people see colors in the same way. For this reason, the prototype of a given color for one person will not necessarily be the prototype for another person. Moreover, everybody has a tendency to view the prototype of the color red not as one object described by the three parameters above, but rather as a spectrum of red colors. For most people, the prototype of the color red is simply a subset of the conceptual space. This example can be very easily generalized using a Voronoi diagram. Gärdenfors (2000, p. 139) considers the 2-dimensional space $\mathbb{R}^2$ with prototypes which are discs and gives appropriate generalizations. The picture below illustrates this situation.
Voronoi diagrams with prototypes which are general sets are considered in Okabe et al. (2000, pp. 186-189).
However, even this generalized model is still inadequate. Although someone may see the color red as a spectrum, they are also able to show that some red colors from the prototype set are 'redder' than the others. Taking this into account, we propose a completely different method which solves the case described above.
Firstly, it is necessary to introduce weight functions on prototype domains. A prototype for us is a set $U$ together with a weight function defined on $U$, i.e., $\varphi : U \to [0, 1]$, which meets the condition that there is an element $x \in U$ such that $\varphi(x) = 1$. This last condition can be interpreted as meaning that the element $x$ is the best prototype among all the prototypes from $U$ [4]. A prototype satisfying the above definition will be called a fuzzy prototype.
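As a rough illustration of this definition, a fuzzy prototype can be modelled as a membership test for the set U plus a weight function, and the two conditions (values in [0, 1], value 1 attained somewhere) can be checked on a finite sample of points. Everything named here (`FuzzyPrototype`, `is_fuzzy_prototype`, the disc and its cone-shaped weight) is a hypothetical choice made for the sketch, and a check on a sample is of course weaker than the definition itself.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, ...]

@dataclass
class FuzzyPrototype:
    """A set U (given by a membership test) with a weight phi: U -> [0, 1]."""
    contains: Callable[[Point], bool]
    weight: Callable[[Point], float]

def is_fuzzy_prototype(proto, samples, tol=1e-9):
    """Check the definition on a finite sample of points.

    Every sampled point of U must have weight in [0, 1], and at least one
    sampled point must reach weight 1 (the 'best' prototype).
    """
    weights = [proto.weight(x) for x in samples if proto.contains(x)]
    in_range = all(-tol <= w <= 1.0 + tol for w in weights)
    attains_one = any(abs(w - 1.0) <= tol for w in weights)
    return in_range and attains_one

# Hypothetical example: the unit disc around (1, 1) with a cone-shaped weight
# that equals 1 at the centre and falls to 0 on the boundary.
center = (1.0, 1.0)
disc = FuzzyPrototype(
    contains=lambda x: math.dist(x, center) <= 1.0,
    weight=lambda x: 1.0 - math.dist(x, center),
)
grid = [(0.25 * i, 0.25 * j) for i in range(9) for j in range(9)]
print(is_fuzzy_prototype(disc, grid))
```

A weight that never reaches 1, for instance the same cone scaled by one half, fails the check; a classical single-point prototype corresponds to a one-element set U with weight 1.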
As was mentioned in the introduction, we do not completely agree with the statement that cognitive processes operate only on simple linear objects, and therefore we propose a new algorithm for partitioning a conceptual space which generates concepts which are not convex.Instead of polygons, we get curved domains with smooth boundaries.

Newton's potential and the partitioning of conceptual space
Let $P = \{p_1, p_2, \ldots, p_n\}$ consist of $n$ elements. Every element is a vector in the $d$-dimensional space $\mathbb{R}^d$. Let $U_i$ be a subset of $\mathbb{R}^d$ containing $p_i$. A fuzzy prototype is the set $U_i$ together with a function $\varphi_i : U_i \to [0, 1]$ satisfying $\varphi_i(p_i) = 1$. (Clearly, such a point need not be unique, see also footnote 5.) The function $\varphi_i$ is called the weight or density of the prototype $U_i$.
Partitioning of the conceptual space is performed as follows. For every $x$ which does not belong to the union of the sets $U_i$, $i = 1, \ldots, n$, we compute the Newtonian potentials [5] [...] and [...]. The quantity $\Psi_i(x)$ indicates the energy of the potential field at the point $x$, which is generated by the mass concentrated in the domain $U_i$ with density $\varphi_i$ [6]. This means that for a given $i$, the greater the value of $\Psi_i(x)$, the more strongly the element (object) $x$ is attracted by the set (domain) $U_i$. Our algorithm for partitioning conceptual space is based on this simple observation. Specifically, for every $x \notin \bigcup_i U_i$ we compute $\Psi_i(x)$ for $i = 1, \ldots, n$, and if for some $i_0$ we have $\Psi_{i_0}(x) = \max_i \Psi_i(x)$, then we classify the element $x$ to the domain of attraction of $U_{i_0}$. If $x$ is on the border lines of our partitioning, then several functions $\Psi_i(x)$ (with different indices) take the same value and $x$ cannot be classified. The same problem appears in Voronoi diagrams.
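The classification rule (assign x to the prototype with the largest potential, leave ties unclassified) can be sketched as follows. This is an illustration under stated assumptions rather than the formula used in the paper: each prototype is approximated by a finite list of point masses (location, weight), and the kernel 1/r, the classical Newtonian kernel in three dimensions, is assumed as a default and can be swapped. The names `potential` and `classify` are hypothetical.

```python
import math

def potential(x, masses, kernel):
    """Approximate Psi(x) as a weighted sum of kernel values over point masses.

    `masses` is a list of (location, weight) pairs standing in for a domain U
    with density phi. x must not coincide with a mass location.
    """
    return sum(w * kernel(math.dist(x, y)) for y, w in masses)

def classify(x, prototypes, kernel=lambda r: 1.0 / r):
    """Index i0 maximizing Psi_i(x), or None when the maximum is tied."""
    values = [potential(x, masses, kernel) for masses in prototypes]
    best = max(values)
    winners = [i for i, v in enumerate(values) if math.isclose(v, best)]
    return winners[0] if len(winners) == 1 else None

# Two hypothetical one-point prototypes: a heavy one at (0, 0), a light one at (4, 0).
prototypes = [
    [((0.0, 0.0), 1.0)],
    [((4.0, 0.0), 0.25)],
]
print(classify((2.0, 0.0), prototypes))  # geometric midpoint goes to the heavier prototype
print(classify((3.5, 0.0), prototypes))  # close enough to the light prototype
print(classify((3.2, 0.0), prototypes))  # potentials coincide: unclassified
```

In a Voronoi diagram the midpoint (2, 0) would lie on the boundary between the two cells; here the boundary is pushed towards the lighter prototype, which is how the weights bend the partition.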
Below are presented two examples of the partitioning of a conceptual space using fuzzy prototypes.We have limited ourselves to only the simplest fuzzy prototypes, where the differences between Voronoi diagrams and our algorithm are visible.

Example 1
As the conceptual space $X$ we take the direct product $X = [0, 5] \times [0, 5]$, which is a subset of the plane $\mathbb{R}^2$. We denote a generic element (vector) of the plane by $(x, y)$. Consider three domains $U_i$ with density functions $\varphi_i$ defined as follows: [...] Note that the sets $U_1$, $U_2$, $U_3$ are discs with their centers at $(1, 1)$, $(3, 2)$, $(4, 4)$, respectively. The graphs of the functions $\varphi_1$, $\varphi_2$ are the upper halves of the spheres of radius 1 over the discs $U_1$ and $U_2$. The function $\varphi_3$ is a constant function equal to 1 on $U_3$.
The domains $U_i$, together with $\varphi_i$, will play the role of fuzzy prototypes in this example. The functions $\varphi_i$ describe the density of the prototypes and give information about how prototypical their points are. If for a given $(x, y)$ in the domain $U_i$ we have $\varphi_i(x, y) = 1$, then we can conclude that the object $(x, y)$ is the most prototypical [7] and the point $(x, y)$ gives the highest contribution to the force of attraction of the set $U_i$. According to the formula of Newtonian potential (see section 4), for every $i = 1, 2, 3$, we write the energy of the potential $\Psi_i(x, y)$: [...]
Here the space is $X = [0, 5] \times [0, 5]$. The values of the potential function were computed for a domain [8] $\{(x, y) \in \mathbb{R}^2 : (x, y) \in [\ldots]\}$ at lattice points of the form $(0.2i, 0.2j)$, where $i, j = 0, \ldots, 25$. The lattice points $(x, y)$ at which $\max_i \Psi_i(x, y) = \Psi_1(x, y)$ are drawn in blue, and those at which $\max_i \Psi_i(x, y) = \Psi_2(x, y)$ are shown in red. Finally, the black points are those for which the function $\Psi_3$ has the highest value. This is shown in the picture below [9].
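The setup of Example 1 can be written down as a short sketch. The hemisphere densities are read off from the verbal description (upper half of a sphere of radius 1 over each disc); the radius of $U_3$ is not stated in the text and is assumed to be 1 here, and evaluating only at lattice points outside the discs follows the rule stated for the algorithm. Function and constant names are hypothetical.

```python
import math

STEP = 0.2
# Disc centres in lattice units (multiples of STEP): (1, 1), (3, 2), (4, 4).
CENTERS_IDX = [(5, 5), (15, 10), (20, 20)]
RADIUS_IDX = 5  # radius 1 equals 5 lattice steps; for U_3 this is an assumption

def hemisphere(x, y, cx, cy):
    """Height of the upper unit hemisphere over the disc centred at (cx, cy)."""
    s = 1.0 - (x - cx) ** 2 - (y - cy) ** 2
    return math.sqrt(s) if s > 0.0 else 0.0

def phi_1(x, y):
    return hemisphere(x, y, 1.0, 1.0)

def phi_2(x, y):
    return hemisphere(x, y, 3.0, 2.0)

def phi_3(x, y):
    # Constant density 1 on the disc U_3 (assumed radius 1), 0 elsewhere.
    return 1.0 if (x - 4.0) ** 2 + (y - 4.0) ** 2 <= 1.0 else 0.0

def in_some_disc(i, j):
    """Exact integer test: is lattice point (STEP*i, STEP*j) inside a disc?"""
    return any((i - ci) ** 2 + (j - cj) ** 2 <= RADIUS_IDX ** 2
               for ci, cj in CENTERS_IDX)

# All lattice points (0.2 i, 0.2 j), i, j = 0, ..., 25, and those outside the discs.
lattice_idx = [(i, j) for i in range(26) for j in range(26)]
outside = [(STEP * i, STEP * j) for i, j in lattice_idx if not in_some_disc(i, j)]
print(len(lattice_idx), len(outside))
```

The membership test uses integer lattice indices so that points lying exactly on a disc boundary are not misjudged because of floating-point rounding of multiples of 0.2.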

Remarks
The partitioning obtained by the fuzzy prototype algorithm consists of disjoint sets which are typically non-convex. This is most visible in Example 1. However, this is not an obstacle to applying this partitioning, because in fact the natural concept does not need to be convex in the usual Euclidean sense. In our method, the notion of geodesic convexity, which is the analogue of convexity, appears quite naturally. This will be examined in more detail in the next section.

The natural concept and geodesic convexity
Due to the limitations of the paper's length, differential geometry and the theory of Riemannian manifolds [10] will not be explained here in detail. Instead, what follows is a brief outline of the application of the theory of general relativity to the constructed partitioning of a conceptual space by means of fuzzy prototypes.
In Gärdenfors (2011), we encounter one more notion: natural property [11]. The difference between a natural concept and a natural property can be checked by comparing the definitions given in Gärdenfors (2000, 2011). According to Gärdenfors, if a concept (or property) is a natural concept (or a natural property), it is represented in a conceptual space by a convex set. Henceforth we will focus only on concepts. Gärdenfors argues for the necessity of natural concept convexity when he states: "[...] if an object $o_1$ is described as having color term $C$ in a given language, and another object $o_2$ is also said to have color $C$, then an object $o_3$ with a color which lies between the color of $o_1$ and the color of $o_2$ will also be described by the color term $C$." This implies that the interval joining the objects $o_1$ and $o_2$ has to be contained in the corresponding domain $C$. In the Euclidean space used by Gärdenfors, the above explanation is meaningful and convexity is an obvious assumption. However, Gärdenfors does not say that if one considers, in the Euclidean space, all the curves joining the two points $o_1$ and $o_2$, then the shortest curve is an interval (a geodesic). This is a consequence of the fact that the Euclidean space has curvature equal to 0. This observation, together with our algorithm, which generates a partitioning consisting of sets which are not necessarily convex, leads us to a different condition which a natural concept must satisfy.
Clearly, there is no obstacle to considering more general conceptual spaces than the Euclidean space (there is only one condition in the classical definition: a conceptual space must be a metric space). Therefore, instead of the metric space $\mathbb{R}^d$ we can consider a Riemannian space $M^d$. For simplicity, we assume that our Riemannian manifolds are smooth. A smooth Riemannian manifold is a smooth differential manifold (locally it looks like $\mathbb{R}^d$, and its coordinate system is differentiable an arbitrary number of times) endowed with a smooth Riemannian metric: for every point $p \in M^d$ there is defined the tangent space $T_p M \cong \mathbb{R}^d$ with an inner product $g_p(X_p, Y_p)$, which is smooth as a function of $p \in M^d$. The inner product allows us to define the Riemannian metric on $M^d$. The Euclidean space is a trivial example of a smooth Riemannian manifold: it is $\mathbb{R}^d$ with the inner product $\langle x, y \rangle = \sum_{i=1}^{d} x_i y_i$ (here the inner product does not depend on the point $p \in \mathbb{R}^d$) and the metric function $\rho(x, y) = \|x - y\| = \left( \sum_{i=1}^{d} (x_i - y_i)^2 \right)^{1/2}$. At this point, the notion of a geodesic becomes necessary. One can say that a geodesic (a certain curve in $M^d$) locally minimizes distance. If the Riemannian manifold has some curvature, then the geodesics are curved, whereas they are straight lines in $\mathbb{R}^d$, a space which has zero curvature.
In a Riemannian space, it is very easy to generalize the notion of convexity. We say that a subset $A \subseteq M^d$ is geodesically convex if for all $x, y \in A$ there is a geodesic contained in $A$ which joins the points $x$, $y$ and minimizes the distance between these points, see e.g. Lee (2009, p. 634).
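A simple case in which this definition can be checked by hand is the unit circle with arc-length distance, a one-dimensional Riemannian manifold chosen here purely for illustration (it is not an example from the paper). For an arc $[0, L]$ with $L < 2\pi$, the only path inside the arc between two of its points has length $|a - b|$, and it is distance-minimizing exactly when it is no longer than the way around the circle. The sketch below tests this on sampled pairs of points; the function names are hypothetical.

```python
import math

def minimizing_path_inside(a, b):
    """Angles a, b lie in an arc [0, L] of the unit circle, with L < 2*pi.

    The path between them that stays inside the arc has length |a - b|.
    It is a minimizing geodesic iff it is no longer than the path around
    the rest of the circle, which has length 2*pi - |a - b|.
    """
    direct = abs(a - b)
    return direct <= 2.0 * math.pi - direct

def is_geodesically_convex_arc(arc_length, samples=60):
    """Sampled check of geodesic convexity for the arc [0, arc_length]."""
    angles = [arc_length * k / samples for k in range(samples + 1)]
    return all(minimizing_path_inside(a, b) for a in angles for b in angles)

print(is_geodesically_convex_arc(math.pi / 2))      # quarter circle
print(is_geodesically_convex_arc(3 * math.pi / 2))  # three quarters of the circle
```

The quarter circle passes the check, while the three-quarter arc fails: the shortest path between its endpoints leaves the arc. Both arcs are connected, so geodesic convexity is a genuinely stronger requirement than connectedness.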