The state of Python libraries for performing Bayesian graph inference is a bit frustrating. libpgm is one of the few libraries that seems to exist, but it is quite limited in its abilities. OpenPNL from Intel is a great C++ implementation of the Matlab Bayes Net Toolbox, but neither the C++ nor the Matlab interface is particularly convenient. So we set about properly wrapping OpenPNL with SWIG and exposing it to Python, where it can be used for rapid work. Additionally, some of the build infrastructure for OpenPNL was a bit dated and needed some cleanup to work on modern Linux systems.
Our updated OpenPNL sources and the Python bindings can both be found at: https://github.com/PyOpenPNL
Not all of OpenPNL has been wrapped yet and some of the Python interface is still a little rough, but it does work. Here we’ll work through the canonical Bayes net example from Russell and Norvig, which is also used in the Matlab BNT docs. The Bayes network of interest is illustrated below.
We have a simple graph with four discrete nodes and we would like to instantiate the model, provide evidence and infer marginal probabilities given this evidence.
We focus on the example included in the repo, simple_bnet.py, which can be viewed there in full.
The syntax for defining the DAG’s adjacency matrix and the node types for the Bayes net reads as follows:
    import numpy as np
    import openpnl

    nnodes = 4

    # set up the graph
    # Dag must be square, with zero diag!
    dag = np.zeros([nnodes, nnodes], dtype=np.int32)
    dag[0, [1, 2]] = 1   # node 0 -> nodes 1 and 2
    dag[2, 3] = 1        # node 2 -> node 3
    dag[1, 3] = 1        # node 1 -> node 3
    pGraph = openpnl.CGraph.CreateNP(dag)

    # set up the node types: every node is discrete with 2 states
    types = openpnl.pnlNodeTypeVector()
    types.resize(nnodes)
    isDiscrete = 1
    types[0].SetType(isDiscrete, 2)
    types[1].SetType(isDiscrete, 2)
    types[2].SetType(isDiscrete, 2)
    types[3].SetType(isDiscrete, 2)

    # node associations
    nodeAssoc = openpnl.toConstIntVector([0] * nnodes)

    # make the bayes net ...
    pBNet = openpnl.CBNet.Create(nnodes, types, nodeAssoc, pGraph)
We can verify the DAG structure by plotting it with python-networkx (shown below) and checking that it matches our goal.
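Besides plotting, the structure can also be checked programmatically. The following is a minimal pure-Python sketch, not part of PyOpenPNL: the helper names `edges_from_adjacency` and `topological_order` are hypothetical, and the adjacency matrix is written as nested lists rather than a NumPy array so the snippet has no dependencies. It lists the edges encoded by the matrix and confirms that the graph is acyclic.

```python
def edges_from_adjacency(adj):
    """Return the (parent, child) pairs encoded by a square 0/1 adjacency matrix."""
    n = len(adj)
    if any(len(row) != n for row in adj):
        raise ValueError("adjacency matrix must be square")
    if any(adj[i][i] for i in range(n)):
        raise ValueError("adjacency matrix must have a zero diagonal")
    return [(i, j) for i in range(n) for j in range(n) if adj[i][j]]


def topological_order(adj):
    """Kahn's algorithm: return a topological order, or raise if there is a cycle."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for child in range(n):
            if adj[node][child]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    if len(order) != n:
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


# Same structure as the dag array above, written as nested lists.
dag = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
edges = edges_from_adjacency(dag)
order = topological_order(dag)
print(edges)
print(order)
```

A matrix with a non-zero diagonal or a cycle raises `ValueError`, which catches the most common mistakes before the graph is handed to the library.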
In this case we have allocated a Bayes-net with 4 nodes, each with 2 discrete states (T|F). Next we assign CPDFs to each of the nodes.
    pBNet.AllocFactors()
    for (node, cpdvals) in [
            (0, [0.5, 0.5]),
            (1, [0.8, 0.2, 0.2, 0.8]),
            (2, [0.5, 0.9, 0.5, 0.1]),
            (3, [1, 0.1, 0.1, 0.01, 0, 0.9, 0.9, 0.99]),
    ]:
        parents = pGraph.GetParents(node)
        print("node: %s parents: %s" % (node, parents))
        # the CPD domain is the node's parents followed by the node itself
        domain = list(parents) + [node]
        cCPD = openpnl.CTabularCPD.Create(pBNet.GetModelDomain(), openpnl.toConstIntVector(domain))
        cCPD.AllocMatrix(cpdvals, openpnl.matTable)
        cCPD.NormalizeCPD()
        pBNet.AttachFactor(cCPD)
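To make the flat value lists easier to read, here is a small sketch of how they might map onto table rows. It rests on two assumptions that the text above does not state: that the values are laid out row-major over the domain (parents first, the node's own state varying fastest), and that `NormalizeCPD()` rescales each row so it sums to one. The helper name `cpd_rows` is hypothetical and does not call into OpenPNL.

```python
def cpd_rows(flat, n_states=2):
    """Split a flat value list into rows of n_states and normalize each row.

    Assumed layout: one row per parent configuration, with the node's own
    state varying fastest. This is an assumption, not documented behaviour.
    """
    if len(flat) % n_states != 0:
        raise ValueError("value count must be a multiple of n_states")
    rows = []
    for start in range(0, len(flat), n_states):
        row = flat[start:start + n_states]
        total = float(sum(row))
        if total == 0.0:
            raise ValueError("cannot normalize a row that sums to zero")
        rows.append([value / total for value in row])
    return rows


# The same values passed to AllocMatrix above.
tables = {
    0: cpd_rows([0.5, 0.5]),
    1: cpd_rows([0.8, 0.2, 0.2, 0.8]),
    2: cpd_rows([0.5, 0.9, 0.5, 0.1]),
    3: cpd_rows([1, 0.1, 0.1, 0.01, 0, 0.9, 0.9, 0.99]),
}
for node in sorted(tables):
    print(node, tables[node])
```

Printing the rows this way is a quick sanity check that each conditional distribution is the one intended before it is attached to the network.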
Having assigned these known distributions, we have now defined the DAG and the corresponding CPDFs. We can allocate an inference engine and begin posing problems to it.
We start by entering the evidence that Cloudy=False and then compute the resulting marginal of WetGrass.
    # Set up the inference engine
    infEngine = openpnl.CPearlInfEngine.Create(pBNet)

    # Problem 1: P(W|C=0)
    evidence = openpnl.mkEvidence(pBNet, [0], [0])   # observe node 0 (Cloudy) in state 0 (False)
    infEngine.EnterEvidence(evidence)
    infEngine.pyMarginalNodes([3], 0)                # query node 3 (WetGrass)
    infEngine.GetQueryJPD().Dump()
This yields the output [0.788497 0.211503], which is plotted below.
Changing the evidence to Cloudy=True, we can compute this marginal and plot it in comparison.
    # Problem 2: P(W|C=1)
    evidence = openpnl.mkEvidence(pBNet, [0], [1])   # observe node 0 (Cloudy) in state 1 (True)
    infEngine.EnterEvidence(evidence)
    infEngine.pyMarginalNodes([3], 0)                # query node 3 (WetGrass)
    infEngine.GetQueryJPD().Dump()
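For a network this small, the engine's answers can be cross-checked by brute-force enumeration over all 16 joint states. The sketch below is pure Python and does not use OpenPNL. It assumes that each flat value list is read row-major with the node's own state varying fastest, that every row is renormalized as `NormalizeCPD()` is assumed to do, and that the parents of node 3 are ordered [1, 2]. The names `normalized_rows`, `cpd_value` and `marginal` are hypothetical. Under these assumptions the first query returns approximately [0.788497, 0.211503], matching the output reported above.

```python
from itertools import product


def normalized_rows(flat, n_states=2):
    """Split a flat list into rows of n_states and rescale each row to sum to 1."""
    rows = []
    for start in range(0, len(flat), n_states):
        row = flat[start:start + n_states]
        total = float(sum(row))
        rows.append([value / total for value in row])
    return rows


# Parents of each node, in the order assumed for the CPD domains.
PARENTS = {0: [], 1: [0], 2: [0], 3: [1, 2]}
# The same values passed to AllocMatrix in the example.
FLAT = {
    0: [0.5, 0.5],
    1: [0.8, 0.2, 0.2, 0.8],
    2: [0.5, 0.9, 0.5, 0.1],
    3: [1, 0.1, 0.1, 0.01, 0, 0.9, 0.9, 0.99],
}
ROWS = {node: normalized_rows(values) for node, values in FLAT.items()}


def cpd_value(node, states):
    """Look up P(node's state | parents' states) under the assumed layout."""
    row = 0
    for parent in PARENTS[node]:
        row = row * 2 + states[parent]
    return ROWS[node][row][states[node]]


def marginal(query, evidence):
    """Marginal of one binary node given evidence, by summing the full joint."""
    weights = [0.0, 0.0]
    for states in product([0, 1], repeat=4):
        if any(states[n] != value for n, value in evidence.items()):
            continue
        joint = 1.0
        for node in range(4):
            joint *= cpd_value(node, states)
        weights[states[query]] += joint
    total = sum(weights)
    return [w / total for w in weights]


print(marginal(3, {0: 0}))  # WetGrass given Cloudy=False
print(marginal(3, {0: 1}))  # WetGrass given Cloudy=True
```

Enumeration scales exponentially with the number of nodes, so it is only useful as a check on toy models like this one.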
This is an extremely simple BN, but it illustrates how straightforward it now is to set up and work with such problems using the PyOpenPNL interface. Hopefully this project will be of use to a number of people! From here on we’ll largely be adding and testing Python API support on an as-needed basis.
Installing OpenPNL and PyOpenPNL is now relatively straightforward. I’ve gone through and put together some relatively sane build systems to make these usable; they can be built roughly as follows:
    git clone https://github.com/PyOpenPNL/OpenPNL.git
    cd OpenPNL && ./configure && make && sudo make install
    cd ..
    git clone https://github.com/PyOpenPNL/PyOpenPNL.git
    cd PyOpenPNL && sudo python setup.py build install
This is of course only scratching the surface of BN/PGM-style inference. OpenPNL supports DBNs, GMM-based continuous distributions, and numerous CPD and structure learning algorithms as well as inference engines under the hood. Much more to come soon.