Building a plugin#
The spey package has been designed to be expandable. It only needs to know certain aspects of the
data structure that is presented and a prescription to form a likelihood function.
What a plugin provides#
A quick introduction to the terminology of spey plugins used in this section:
A plugin is an external Python package that provides additional statistical model prescriptions to spey.
Each plugin may provide one or more statistical model prescriptions that are accessible directly through Spey.
Depending on the scope of the plugin, you may wish to provide additional (custom) operations and differentiability through automatic differentiation packages such as autograd or jax. As long as they are implemented through predefined function names, Spey can automatically detect and use them within the interface.
Creating your Statistical Model Prescription#
The first step in creating your Spey plugin is to create your statistical model interface.
This is as simple as importing the abstract base class BackendBase from spey and
inheriting from it. The most basic implementation of a statistical model is shown below:
>>> import spey

>>> class MyStatisticalModel(spey.BackendBase):
>>>     name = "my_stat_model"
>>>     version = "1.0.0"
>>>     author = "John Smith <john.smith@smith.com>"
>>>     spey_requires = ">=0.1.0,<0.2.0"

>>>     def __init__(self, ...):
>>>         ...

>>>     @property
>>>     def is_alive(self):
>>>         ...

>>>     def config(
...         self, allow_negative_signal: bool = True, poi_upper_bound: float = 10.0
...     ):
>>>         ...

>>>     def get_logpdf_func(
...         self, expected=spey.ExpectationType.observed, data=None
...     ):
>>>         ...

>>>     def expected_data(self, pars):
>>>         ...
BackendBase requires certain functionality from the statistical model to be
implemented, but let us first go through the above class structure. Spey looks for specific
metadata to track the implementation's name, version and author. Additionally,
it checks the plugin's spey_requires specification against the installed Spey version to ensure that the plugin works as intended.
Note
The list of metadata that Spey is looking for:
name (str): Name of the plugin.
version (str): Version of the plugin.
author (str): Author of the plugin.
spey_requires (str): The spey version that the plugin needs, given as a version specifier, e.g. spey_requires="0.0.1" or spey_requires=">=0.3.3".
doi (List[str]): Citable DOI numbers for the plugin.
arXiv (List[str]): arXiv numbers for the plugin.
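To make the meaning of a spey_requires string concrete, here is a minimal sketch of how a comma-separated specifier such as ">=0.1.0,<0.2.0" can be checked against a version number. This is not Spey's own compatibility check; the helper names parse_version and satisfies are hypothetical, only plain numeric versions are handled, and treating a bare version such as "0.0.1" as an exact pin is an assumption.

```python
import operator

# Comparison operators supported by this simplified sketch.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '0.1.3' into (0, 1, 3); pre-release tags are not handled."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, requirement):
    """Check a plain version string against a comma-separated requirement."""
    current = parse_version(version)
    for clause in requirement.split(","):
        clause = clause.strip()
        # Two-character operators are tried before one-character ones.
        for symbol in (">=", "<=", "==", "!=", ">", "<"):
            if clause.startswith(symbol):
                target = parse_version(clause[len(symbol):])
                if not _OPS[symbol](current, target):
                    return False
                break
        else:
            # Assumption: a bare version is treated as an exact pin.
            if current != parse_version(clause):
                return False
    return True


print(satisfies("0.1.3", ">=0.1.0,<0.2.0"))
print(satisfies("0.2.0", ">=0.1.0,<0.2.0"))
```

With the specifier used in the class above, any 0.1.x release passes while 0.2.0 is rejected. Versions are compared as integer tuples, so both sides should be written with the same number of components.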
The MyStatisticalModel class has four main functionalities, namely is_alive(), config(), get_logpdf_func() and expected_data(). (Detailed descriptions of these functions can be found in the BackendBase documentation.)
is_alive(): This property returns a boolean indicating whether the statistical model has at least one signal bin with a non-zero yield.
config(): This function returns a ModelConfig class, which includes certain information about the model structure, such as the index of the parameter of interest within the parameter list (poi_index), the minimum value that the parameter of interest can take (minimum_poi), suggested initialisation parameters for the optimiser (suggested_init) and suggested bounds for the parameters (suggested_bounds). If allow_negative_signal=True, the lower bound of the POI is expected to be minimum_poi; if False, it is expected to be zero. poi_upper_bound is used to enforce an upper bound on the POI.
Note
Suggested bounds and initialisation values should be lists whose length equals the total number of nuisance parameters and parameters of interest. Initialisation values should be of type List[float, ...] and bounds should be of type List[Tuple[float, float], ...].
get_logpdf_func(): This function returns a function that takes a NumPy array of fit parameters (nuisance, \(\theta\), and POI, \(\mu\)) as input and returns the value of the natural logarithm of the likelihood function, \(\log\mathcal{L}(\mu, \theta)\). The input expected defines which data is used in the absence of the data input, i.e. if expected=spey.ExpectationType.observed, the yields of the observed data should be used to compute the likelihood, but if expected=spey.ExpectationType.apriori, the background yields should be used. This distinguishes prefit from postfit likelihoods. If data is provided, it replaces the data selected through expected; this is for the case where Asimov data is in use.
expected_data() (optional): This function is crucial for asymptotic hypothesis testing. It is used to generate the expected value of the data for the given fit parameters, i.e. \(\theta\) and \(\mu\). If this function does not exist, exclusion limits can still be computed using the chi_square calculator; see exclusion_confidence_level().
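The four functionalities described above can be sketched for a single-bin Poisson counting experiment with no nuisance parameters. This is only an illustration under several assumptions: the class SingleBinPoisson is hypothetical and deliberately does not inherit from spey.BackendBase so that it runs without spey installed, a plain dictionary stands in for ModelConfig, the strings "observed" and "apriori" stand in for the spey.ExpectationType members, and the lower POI bound is taken to be minimum_poi when negative signal is allowed and zero otherwise.

```python
import math


class SingleBinPoisson:
    """Stand-in for a spey.BackendBase subclass (hypothetical example)."""

    name = "my_stat_model"
    version = "1.0.0"
    author = "John Smith <john.smith@smith.com>"
    spey_requires = ">=0.1.0,<0.2.0"

    def __init__(self, signal, background, observed):
        self.signal = float(signal)
        self.background = float(background)
        self.observed = float(observed)

    @property
    def is_alive(self):
        # True when the (single) signal bin has a non-zero yield.
        return self.signal > 0.0

    @property
    def minimum_poi(self):
        # Smallest mu that keeps the expected yield mu * s + b non-negative.
        return -self.background / self.signal if self.signal > 0.0 else -math.inf

    def config(self, allow_negative_signal=True, poi_upper_bound=10.0):
        # A real plugin returns a ModelConfig; a dict stands in for it here.
        lower = self.minimum_poi if allow_negative_signal else 0.0
        return {
            "poi_index": 0,
            "minimum_poi": self.minimum_poi,
            "suggested_init": [1.0],
            "suggested_bounds": [(lower, poi_upper_bound)],
        }

    def expected_data(self, pars):
        # Expected yield for the given fit parameters (here only mu).
        return [pars[0] * self.signal + self.background]

    def get_logpdf_func(self, expected="observed", data=None):
        # Explicit data (e.g. Asimov data) takes precedence over `expected`.
        if data is None:
            data = [self.background if expected == "apriori" else self.observed]
        n = data[0]

        def logpdf(pars):
            lam = self.expected_data(pars)[0]
            # Poisson log-probability; lgamma(n + 1) generalises log(n!).
            return n * math.log(lam) - lam - math.lgamma(n + 1.0)

        return logpdf


model = SingleBinPoisson(signal=5.0, background=10.0, observed=12.0)
logpdf = model.get_logpdf_func()
print(model.config()["suggested_bounds"], logpdf([0.4]) > logpdf([2.0]))
```

With 12 observed events on a background of 10 and a signal of 5, the log-likelihood is largest near \(\mu = 0.4\), which is why the comparison in the last line holds. The sketch does not guard against a vanishing expected yield, which a real implementation would need to handle at the lower POI bound.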
Other available functions that can be implemented are shown in the table below.
Functions and Properties | Explanation |
---|---|
get_objective_function() | Returns the objective function and/or its gradient. |
get_hessian_logpdf_func() | Returns the Hessian of the log-probability. |
get_sampler() | Returns a function to sample from the likelihood distribution. |
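The first table entry, an objective function with an optional gradient, can be sketched for a single-bin Poisson model as follows. The sketch assumes that the objective is the negative log-likelihood; the function name make_objective and the do_grad switch are hypothetical and chosen for illustration only.

```python
import math


def make_objective(signal, background, observed, do_grad=False):
    """Negative Poisson log-likelihood in mu, optionally with its gradient."""

    def negative_loglike(pars):
        lam = pars[0] * signal + background
        return lam - observed * math.log(lam) + math.lgamma(observed + 1.0)

    def gradient(pars):
        lam = pars[0] * signal + background
        # d(-log L)/d(mu) = s * (1 - n / lambda)
        return [signal * (1.0 - observed / lam)]

    if do_grad:
        # Return value and gradient together, as many optimisers expect.
        return lambda pars: (negative_loglike(pars), gradient(pars))
    return negative_loglike


value, grad = make_objective(5.0, 10.0, 12.0, do_grad=True)([0.4])
print(value, grad)
```

The gradient vanishes where the expected yield equals the observed count, which for these numbers is at \(\mu = 0.4\).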
Attention
A simple example implementation can be found in the example-plugin repository.
Identifying and installing your statistical model#
To add your new statistical model to the spey interface, all you need to do is create a setup.py file,
which will create an entry point for the statistical model class. Let us assume that you have the following folder structure:
my_folder
├── my_subfolder
│ ├── __init__.py
│ └── mystat_model.py # this includes class MyStatisticalModel
└── setup.py
The setup.py file should include the following:
>>> from setuptools import setup
>>> stat_model_list = ["my_stat_model = my_subfolder.mystat_model:MyStatisticalModel"]
>>> setup(entry_points={"spey.backend.plugins": stat_model_list})
where
stat_model_list is a list of statistical models you would like to register.
my_stat_model is the short name for a statistical model. This should be the same as the name attribute of the class. Spey will identify the backend with this name.
my_subfolder.mystat_model is the path to the module that contains your statistical model class, MyStatisticalModel.
Note that stat_model_list can include as many implementations as desired. After this step is complete, all one needs to do
is run pip install -e . and the AvailableBackends() function will include my_stat_model as well.
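To check that the installation actually registered the entry point, the group named in setup.py can be inspected with the standard library. This is a minimal sketch and not necessarily how Spey discovers plugins internally; the helper name registered_plugins is hypothetical.

```python
from importlib.metadata import entry_points

# Entry-point group used in the setup.py above.
GROUP = "spey.backend.plugins"


def registered_plugins(group=GROUP):
    """Map entry-point names to their 'module:object' targets."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dictionary keyed by group.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


for name, target in sorted(registered_plugins().items()):
    print(f"{name} -> {target}")
```

After pip install -e . in the folder above, the output should contain a line for my_stat_model pointing at my_subfolder.mystat_model:MyStatisticalModel. If nothing from this group is installed, the loop simply prints nothing.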
Citing Plug-ins#
Since other users can build plug-ins, a metadata accessor is provided to extract the information needed
to cite them. The get_backend_metadata() function allows the user to extract the name, author, version, DOI and
arXiv number to be used in academic publications. This information can be accessed as follows:
>>> import spey
>>> spey.get_backend_metadata("my_stat_model")
>>> # {'name': 'my_stat_model',
... # 'author': 'John Smith <john.smith@smith.com>',
... # 'version': '1.0.0',
... # 'spey_requires': '>=0.1.0,<0.2.0',
... # 'doi': [],
... # 'arXiv': []}
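A dictionary of this shape can be turned into an acknowledgement line with a few lines of code. The helper format_citation below is hypothetical and not part of spey; the metadata dictionary is copied by hand from the output above so that the sketch runs without spey installed.

```python
def format_citation(metadata):
    """Build a one-line acknowledgement from a plugin metadata dictionary."""
    parts = [f"{metadata['name']} v{metadata['version']} by {metadata['author']}"]
    # DOI and arXiv entries are optional and may be empty lists.
    if metadata.get("doi"):
        parts.append("DOI: " + ", ".join(metadata["doi"]))
    if metadata.get("arXiv"):
        parts.append("arXiv: " + ", ".join(metadata["arXiv"]))
    return "; ".join(parts)


metadata = {
    "name": "my_stat_model",
    "author": "John Smith <john.smith@smith.com>",
    "version": "1.0.0",
    "spey_requires": ">=0.1.0,<0.2.0",
    "doi": [],
    "arXiv": [],
}
print(format_citation(metadata))
# prints: my_stat_model v1.0.0 by John Smith <john.smith@smith.com>
```

When the doi or arXiv lists are filled in, the corresponding identifiers are appended after the author, separated by semicolons.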