The Ground Bayes Network (GBN) in ProbReM is the smallest subset of data that is required to answer a specific query. While the PRM uses a first-order representation of the world, the inference process needs a propositional representation of the data. The network.groundBN module implements an efficient data structure for that purpose.
A GBNGraph is a dictionary that contains a set of vertices of type GBNvertex.
The GBNGraph instance itself is a dictionary which is used to store vertices {key=vertex_id : value= GBNvertex}. In addition, it keeps several dictionaries over the GBNvertex objects that allow fast retrieval of sets of GBN vertices.
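To illustrate the layout described above, here is a minimal sketch of a dictionary subclass that maintains secondary indexes next to the main {vertex_id : vertex} mapping. It is not the ProbReM implementation: the names `SketchVertex`, `SketchGraph`, `add_vertex`, `by_attribute` and `sampling` are hypothetical, and attribute classes are represented by plain strings.

```python
from collections import defaultdict


class SketchVertex:
    """Hypothetical stand-in for a GBN vertex."""

    def __init__(self, vertex_id, attr, fixed=False):
        self.ID = vertex_id
        self.attr = attr      # attribute class, here just a string
        self.fixed = fixed    # True if the vertex is part of the evidence


class SketchGraph(dict):
    """Dictionary {vertex_id: vertex} with extra lookup indexes."""

    def __init__(self):
        super().__init__()
        self.by_attribute = defaultdict(list)  # all vertices, grouped by attribute
        self.sampling = {}                     # non-evidence vertices, by ID

    def add_vertex(self, vertex):
        # Keep the main mapping and every index in sync on insertion.
        self[vertex.ID] = vertex
        self.by_attribute[vertex.attr].append(vertex)
        if not vertex.fixed:
            self.sampling[vertex.ID] = vertex


graph = SketchGraph()
graph.add_vertex(SketchVertex("student.success.1", "student.success"))
graph.add_vertex(SketchVertex("student.success.2", "student.success", fixed=True))
graph.add_vertex(SketchVertex("prof.fame.1", "prof.fame"))

print(len(graph))                                  # 3
print(len(graph.by_attribute["student.success"]))  # 2
print(sorted(graph.sampling))                      # ['prof.fame.1', 'student.success.1']
```

Updating all indexes in one place keeps lookups by attribute or by role a single dictionary access, at the cost of some extra bookkeeping on insertion.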
Instantiates a new evidence GBNvertex and updates the corresponding GBN data structures.
Adds a ReferenceVertex to the ground Bayesian network.
For now, the reference attributes (all exist attributes) are assumed to be sampling nodes (i.e. neither in the evidence nor in the event variables).
Instantiates a new sampling GBNvertex and updates the corresponding GBN data structures.
A dictionary that groups all GBNvertex instances according to their attribute class, e.g. {key=Attribute : value=[ list of GBNvertex ]}
A dictionary that groups all event GBNvertex instances according to their attribute class, e.g. {key=vertex_id : value= GBNvertex}
A dictionary that groups all sampling GBNvertex instances (event & latent vertices), e.g. {key=vertex_id : value= GBNvertex}
A dictionary that groups all sampling GBNvertex instances (event & latent vertices) according to their attribute class, e.g. {key=Attribute : value=[ list of GBNvertex ]}
A queue that keeps track of vertices that need to be processed when constructing the Ground Bayesian network.
This class is also a dictionary as the information is stored in groups that correspond to sets of vertices that share the same local distribution (= the same attribute).
{ key=Attribute : value=[ list of GBNvertex ] }
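A queue of this shape can be sketched as a small dictionary subclass. The following is an illustration only, with the hypothetical names `SketchQueue`, `push` and `pop_group`; attributes and vertices are represented by strings.

```python
class SketchQueue(dict):
    """Hypothetical work queue: {attribute: [pending vertices]}."""

    def push(self, attr, vertex):
        # Vertices sharing an attribute end up in the same group.
        self.setdefault(attr, []).append(vertex)

    def pop_group(self):
        """Remove and return one (attribute, vertices) group."""
        return self.popitem()


queue = SketchQueue()
queue.push("student.success", "student.success.1")
queue.push("student.success", "student.success.2")
queue.push("prof.fame", "prof.fame.1")

attr, group = queue.pop_group()  # dict.popitem() returns the most recently added key
print(attr, group)               # prof.fame ['prof.fame.1']
print(queue)                     # {'student.success': ['student.success.1', 'student.success.2']}
```

Handing out whole groups means that all vertices sharing a local distribution can be processed together.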
The ground Bayesian network implemented in network.groundBN consists of different kinds of vertices, which are implemented in network.vertices.
A GBNvertex represents a vertex in the Ground Bayes net. It is a variable in the GBN representing an attribute object whose CPD is distributed according to the CPD of the attribute class. That is, all attribute objects of the same attribute class share the same CPD. A GBNvertex instance can take on a value from the domain of the associated attribute.
A node is associated with a specific attribute object. We use sets to identify an obj, where the set contains a value for each primary key in self.pk of attr.erClass.
Adds the parentVertex to the list of parent vertices associated with the parent vertex attribute. It also adds the corresponding information to the children dictionary of the parent node.
Parameters: | parentVertex – GBNvertex |
---|
The associated attribute class
The dictionary of children attribute objects {key=`child.attribute` : value= { key=`id` : value = GBNvertex}}. child.attribute is of type Attribute and the gbnVertices of type GBNvertex
Returns the conditional probability distribution of the gbnV given its parent values.
Parameters: | gbnV – GBN instance |
---|---|
Returns: | A 1 x |attr.domain| numpy.array probability distribution |
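As a rough illustration of such a lookup, the sketch below stores a CPD as a table indexed by the parent assignment and returns the matching row, one probability per domain value. The names `cpd` and `conditional_distribution` and the numbers are hypothetical, and a plain list is used here instead of a numpy.array.

```python
# Hypothetical CPD of an attribute with domain ["low", "high"] and one parent.
# Each row is a distribution over the domain, indexed by the parent assignment.
cpd = {
    ("low",): [0.8, 0.2],
    ("high",): [0.3, 0.7],
}


def conditional_distribution(cpd, parent_assignment):
    """Return the distribution over the domain given the parent values."""
    row = cpd[tuple(parent_assignment)]
    if abs(sum(row) - 1.0) > 1e-9:
        raise ValueError("CPD row does not sum to 1")
    return row


print(conditional_distribution(cpd, ["high"]))  # [0.3, 0.7]
```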
Boolean. If True, we are interested in the posterior distribution of the vertex.
Boolean. If the value is fixed, the vertex is part of the evidence.
Returns True if the number of parents for the attribute paAttr is not zero. If paAttr is not the parent of self.attr, a key exception will be raised.
Parameters: | paAttr – Attribute |
---|
Returns the number of parents for attr. If attr==None the total number of parents is returned.
Parameters: | attr – Attribute |
---|
List identifier for the vertex
Returns the number of parents for attr. If attr==None the total number of parents is returned.
Parameters: | attr – Attribute |
---|
The parent assignment of the parents of this node. The order of the parent values is the same as the self.attr.parents list. It can be updated using parentAssignments()
Computes the values of the parents of this GBN vertex (using aggregation if necessary). Note that since there is a GBNVertex instance for every node in the GBN, the parent assignments are stored in the instance variable self.parentAss. This is not the case for the local distribution instance of an attribute, as the distribution is shared among many attribute objects.
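The idea of computing one value per parent attribute, with aggregation when an attribute has several parent vertices, can be sketched as follows. This is an illustration under assumptions, not the ProbReM code: the function name `parent_assignment`, the use of `max` as aggregator and the example attributes are all hypothetical.

```python
def parent_assignment(parent_order, parents, aggregate=max):
    """Return one value per parent attribute, in the order of parent_order.

    parents maps attribute -> {vertex_id: current value}. The values of all
    parent vertices of one attribute are combined with aggregate.
    """
    return [aggregate(parents[attr].values()) for attr in parent_order]


parents = {
    "prof.fame": {"prof.fame.1": 2},
    "course.difficulty": {"course.difficulty.1": 1, "course.difficulty.2": 3},
}

# The order of the result follows the given list of parent attributes.
print(parent_assignment(["prof.fame", "course.difficulty"], parents))  # [2, 3]
print(parent_assignment(["course.difficulty", "prof.fame"], parents))  # [3, 2]
```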
The dictionary of parents attribute objects {key=`parent.attribute` : value= { key=`id` : value = GBNvertex}}. parent.attribute is of type Attribute and the gbnVertices of type GBNvertex
Samples a new value for this GBN vertex. Warning: self.value will be overwritten even if self.fixed=True. We opt for performance and trust our implementation.
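Drawing a value from a discrete distribution over the attribute domain can be sketched with the standard library. The helper name `sample_value` and the example domain are hypothetical; like the method described above, the sketch performs no check on whether the value is fixed.

```python
import random


def sample_value(domain, distribution, rng=random):
    """Draw one value from domain, weighted by distribution."""
    return rng.choices(domain, weights=distribution, k=1)[0]


rng = random.Random(0)  # seeded generator for reproducible runs
domain = ["low", "high"]

value = sample_value(domain, [0.3, 0.7], rng=rng)
print(value in domain)                              # True

# A degenerate distribution always yields the same value.
print(sample_value(domain, [0.0, 1.0], rng=rng))    # high
```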
Current value, must be in the domain of attr
The class ReferenceVertex is a compact representation of the probabilistic variables required to represent reference uncertainty for one connection. A relationship r connects two entities e1, e2 with a certain type of connection: either an n:1 or an m:n connection. In the case of an n:1 connection, as in the student/professor example, each object in e1 is associated with exactly one object in e2, whereas an object in e2 can be associated with multiple objects in e1.
For example, when inferring the success of a student s1.s, there will be one ReferenceVertex instance that contains a data structure representing the shaded nodes in the network displayed below.
Note
For now this works, but there is a problem.
If there are multiple dependencies leading through the uncertain dependency dep, they must all use the same mapping (i.e. the same exist attributes).
Student/Prof example: suppose student.success depends on Professor.fame and student.phd also depends on Professor.fame. If we do inference for student1 on student.success and student.phd, then all exist attributes involving student1 should be identical. This means that if we sample the exist attributes, the edges for both student.success and student.phd should be changed!
At this point, a GBN reference vertex is associated with only 1 GBN vertex (e.g. student1.success) of the n-Entity. In reality it should be associated with 1 object (e.g. student1) of the n-Entity.
This is not hard to do:
`self.refGBNvertex` should be a dictionary holding all attribute objects (e.g. student.success.1 student.iq.1 and student) of a certain object (e.g. student.1)
`self.dependency = dep` should be a dictionary holding all uncertain dependencies
Adds one reference in self.references and updates the parent/children information of the involved vertices.
Parameters: | gbnV_new – GBNvertex to be added. |
---|
The uncertain Dependency instance
{ key = k_entity_ID (e.g. Professor.2) : value = { key = parent.attr (e.g. prof.funding) : value = { key = parent.ID (e.g. ‘prof.funding.2’) : value = parent.Vertex (e.g. prof.funding.2.vertex)} } }
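Written out as a Python literal, a structure of this shape might look as follows. The sketch only illustrates the nesting: the variable name `parents_by_k_entity`, the helper `parent_vertices` and the second attribute `prof.fame` are hypothetical, and vertices are represented by placeholder strings.

```python
# {k_entity_ID: {parent attribute: {parent ID: parent vertex}}}
parents_by_k_entity = {
    "Professor.2": {
        "prof.funding": {"prof.funding.2": "<vertex prof.funding.2>"},
        "prof.fame": {"prof.fame.2": "<vertex prof.fame.2>"},
    },
}


def parent_vertices(structure, k_entity_id):
    """Collect all parent vertices stored for one object of the k-entity."""
    return [
        vertex
        for by_id in structure[k_entity_id].values()
        for vertex in by_id.values()
    ]


print(parent_vertices(parents_by_k_entity, "Professor.2"))
# ['<vertex prof.funding.2>', '<vertex prof.fame.2>']
```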
The relationship is assumed to be of type n:k, where k serves as a fixed parameter to limit the size of the state space of the Markov chain for inference. Assume that a relationship R of type n:k connects the entities (E1, E2). Then every object in E1 is connected with at most k objects in E2. By definition, E1 and E2 refer to the first and second entry in the relationship.pk list, respectively.
Note that this method overrides the one in GBNvertex. As a reference vertex is a representation of multiple (i.e. self.k) exist attributes, there are also multiple parent assignments. The method takes a k_gbnV_erID of the k-entity as argument and returns the parent assignments list of the exist attribute object associated with k_gbnV_erID. This is probably neither fast nor pretty; another way would be to also override GBNVertex.parentAss to use a dictionary for all entries in ReferenceVertex.references. As is, a new list is returned at each execution.
Parameters: | k_gbnV_erID – erID of GBNVertex instance |
---|---|
Returns: | List of parents assignments |
The theoretical exist attributes don’t have to be stored explicitly. The deterministic constraint (k) limits the number of non-zero exist variables to k. ReferenceVertex.references is a compact representation of all exist attributes for one n-side attribute object (e.g. a student’s success). The dictionary of length ReferenceVertex.k stores all links that exist (i.e. the exist attribute is 1) in the format {key = k_entity_ID : value = gbnV_E2 }.
The methods addReference(), removeReference() and replaceReference() can be used to manipulate this data structure.
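The following minimal sketch shows how a dictionary of existing links can be kept within the limit k. It is an illustration only: the class `SketchReferences` and its snake_case methods are hypothetical stand-ins and do not reproduce the signatures or the parent/children bookkeeping of the ProbReM methods.

```python
class SketchReferences:
    """Hypothetical container for at most k existing links."""

    def __init__(self, k):
        self.k = k
        self.references = {}  # {k_entity_ID: vertex on the k-side}

    def add_reference(self, k_entity_id, vertex):
        if len(self.references) >= self.k:
            raise ValueError("at most k references may exist")
        self.references[k_entity_id] = vertex

    def remove_reference(self, k_entity_id):
        return self.references.pop(k_entity_id)

    def replace_reference(self, old_id, new_id, vertex):
        # Swapping a link keeps the number of existing links unchanged.
        self.remove_reference(old_id)
        self.add_reference(new_id, vertex)


refs = SketchReferences(k=2)
refs.add_reference("Professor.1", "<vertex prof.fame.1>")
refs.add_reference("Professor.2", "<vertex prof.fame.2>")

try:
    refs.add_reference("Professor.3", "<vertex prof.fame.3>")
except ValueError as error:
    print(error)                  # at most k references may exist

refs.replace_reference("Professor.1", "Professor.3", "<vertex prof.fame.3>")
print(sorted(refs.references))    # ['Professor.2', 'Professor.3']
```

Only the links whose exist attribute is 1 appear as keys; every other exist attribute is implicitly 0.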
A simple helper function that computes a unique ID from an object (e.g. a student). It identifies an object (e.g. student.1), rather than an attribute object (e.g. student.success.1) as computed by computeID().
Parameters: | er – Instance of ERClass |
---|---|
Returns: | A unique string ID for the object |
A simple helper function that computes a unique ID from an attr and obj, the primary key of the attribute object which is part of the GBN.
Parameters: |
|
---|---|
Returns: | A unique string ID for the attribute object |
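A helper of this kind could be sketched as below. The ID format is an assumption inferred from the examples in this text (e.g. student.success.1); the function name `sketch_compute_id`, its arguments and the `takes.grade` attribute are hypothetical, and the actual format produced by computeID() may differ.

```python
def sketch_compute_id(attr_name, obj):
    """Join the attribute name and the primary key values of obj with dots."""
    return ".".join([attr_name] + [str(pk) for pk in obj])


print(sketch_compute_id("student.success", [1]))  # student.success.1
print(sketch_compute_id("takes.grade", [1, 7]))   # takes.grade.1.7
```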