msmbuilder.msm.BayesianMarkovStateModel(lag_time=1, n_samples=100, n_steps=0, n_chains=None, n_timescales=None, reversible=True, ergodic_cutoff='on', prior_counts=0, sliding_window=True, random_state=None, sampler='metzner', verbose=False)
Bayesian reversible Markov state model.
Variant of MarkovStateModel which estimates a distribution over transition matrices, instead of a single transition matrix, using Metropolis Markov chain Monte Carlo. This distribution gives information about the statistical uncertainty in the transition matrix (and functions of the transition matrix), and is stored in all_transmats_.
Parameters: |
|
---|
n_states_ (int) – The number of states in the model.
mapping_ (dict) – Mapping between “input” labels and internal state indices used by the counts and transition matrix for this Markov state model. Input states need not necessarily be integers in (0, ..., n_states_ - 1), for example. The semantics of mapping_[i] = j is that state i from the “input space” is represented by the index j in this MSM.
countsmat_ (array_like, shape = (n_states_, n_states_)) – Number of transition counts between states. countsmat_[i, j] is counted during fit(). The indices i and j are the “internal” indices described above. No correction for reversibility is made to this matrix.
transmats_ (array_like, shape = (n_samples, n_states_, n_states_)) – Samples from the posterior ensemble of transition matrices.
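Because the posterior samples are stacked along the first axis, the uncertainty of any matrix element can be summarized by taking statistics across that axis. The following is a minimal sketch in plain Python; the `samples` list is a made-up stand-in for a fitted model's ensemble, not output from msmbuilder.

```python
from statistics import mean, stdev

# Hypothetical stand-in for the posterior samples: three 2x2 row-stochastic
# matrices, laid out as (n_samples, n_states_, n_states_).
samples = [
    [[0.90, 0.10], [0.20, 0.80]],
    [[0.88, 0.12], [0.25, 0.75]],
    [[0.92, 0.08], [0.15, 0.85]],
]

n_states = len(samples[0])

# Element-wise mean and sample standard deviation across the ensemble.
mean_T = [[mean([s[i][j] for s in samples]) for j in range(n_states)]
          for i in range(n_states)]
std_T = [[stdev([s[i][j] for s in samples]) for j in range(n_states)]
         for i in range(n_states)]

for i in range(n_states):
    for j in range(n_states):
        print(f"T[{i},{j}] = {mean_T[i][j]:.3f} +/- {std_T[i][j]:.3f}")
```

The same pattern applies to any function of the transition matrix: evaluate it once per sample, then summarize the resulting values.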
Notes
Markov chain Monte Carlo can be computationally expensive. To get good (converged) results and acceptable performance, you’ll likely need to tune the n_samples, n_steps and n_chains parameters.
n_samples gives the total number of transition matrices sampled from the posterior. These samples are generated from n_chains independent MCMC chains, at an interval of n_steps. The total number of iterations of MCMC performed during fit() is n_samples * n_steps.
Increasing n_chains therefore does not alter the total number of iterations. Instead, it controls whether those iterations occur as part of one long chain or multiple shorter chains (which are run in parallel for sampler=='metzner').
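The arithmetic in these notes can be sketched as follows. `mcmc_budget` is a hypothetical helper written for illustration, and the even split of samples between chains is an assumption; how the library handles remainders is not stated here.

```python
def mcmc_budget(n_samples, n_steps, n_chains):
    """Hypothetical helper: split a fixed sampling budget across chains."""
    # Total MCMC iterations, as described in the notes.
    total_iterations = n_samples * n_steps
    # Assumption: samples are divided as evenly as possible between chains.
    base, extra = divmod(n_samples, n_chains)
    samples_per_chain = [base + (1 if c < extra else 0) for c in range(n_chains)]
    iterations_per_chain = [s * n_steps for s in samples_per_chain]
    return total_iterations, samples_per_chain, iterations_per_chain


one_chain = mcmc_budget(n_samples=100, n_steps=50, n_chains=1)
four_chains = mcmc_budget(n_samples=100, n_steps=50, n_chains=4)

# Same total work, but each of the four chains is a quarter as long.
print(one_chain[0], one_chain[2])
print(four_chains[0], four_chains[2])
```

Shorter chains spend a larger fraction of their iterations near their starting point, so more chains is not automatically better for convergence.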
References
[1] | P. Metzner, F. Noé and C. Schütte, “Estimating the sampling error: Distribution of transition matrices and functions of transition matrices for given trajectory data.” Phys. Rev. E 80, 021106 (2009) |
__init__(lag_time=1, n_samples=100, n_steps=0, n_chains=None, n_timescales=None, reversible=True, ergodic_cutoff='on', prior_counts=0, sliding_window=True, random_state=None, sampler='metzner', verbose=False)
Methods
__init__([lag_time, n_samples, n_steps, ...]) | |
fit(sequences[, y]) | |
fit_transform(X[, y]) | Fit to data, then transform it. |
get_params([deep]) | Get parameters for this estimator. |
inverse_transform(sequences) | Transform a list of sequences from internal indexing into labels |
partial_transform(sequence[, mode]) | Transform a sequence to internal indexing |
set_params(**params) | Set the parameters of this estimator. |
summarize() | |
transform(sequences[, mode]) | Transform a list of sequences to internal indexing |
Attributes
all_eigenvalues_ | Eigenvalues of the transition matrices. |
all_left_eigenvectors_ | Left eigenvectors, \(\Phi\), of each transition matrix in the ensemble |
all_populations_ | |
all_right_eigenvectors_ | Right eigenvectors, \(\Psi\), of each transition matrix in the ensemble |
all_timescales_ | Implied relaxation timescales of each sample in the ensemble |
all_eigenvalues_
Eigenvalues of the transition matrices.
Returns: | eigs – The eigenvalues of each transition matrix in the ensemble |
---|---|
Return type: | array-like, shape = (n_samples, n_timescales+1) |
all_left_eigenvectors_
Left eigenvectors, \(\Phi\), of each transition matrix in the ensemble.
Each transition matrix’s left eigenvectors are normalized such that:
- lv[:, 0] is the equilibrium populations and is normalized such that sum(lv[:, 0]) == 1
- The eigenvectors satisfy sum(lv[:, i] * lv[:, i] / model.populations_) == 1. In math notation, this is \(\langle \phi_i, \phi_i \rangle_{\mu^{-1}} = 1\)
Returns: | lv – The columns of lv, lv[:, i], are the left eigenvectors of transmat_. |
---|---|
Return type: | array-like, shape=(n_samples, n_states, n_timescales+1) |
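The normalization above can be checked by hand on a single transition matrix. The sketch below uses an illustrative 2-state matrix (not taken from a fitted model) and relies on two facts specific to 2-state chains: the second eigenvalue equals the trace minus 1, and its left eigenvector is proportional to (1, -1).

```python
import math

# Illustrative 2-state transition matrix (rows sum to 1).
T = [[0.9, 0.1],
     [0.2, 0.8]]

# Stationary distribution of a 2-state chain: proportional to (T[1][0], T[0][1]).
z = T[1][0] + T[0][1]
pi = [T[1][0] / z, T[0][1] / z]

# Second eigenvalue and its unnormalized left eigenvector.
lam1 = T[0][0] + T[1][1] - 1.0
raw = [1.0, -1.0]

# Rescale so that sum(phi * phi / pi) == 1.
norm = math.sqrt(sum(r * r / p for r, p in zip(raw, pi)))
phi1 = [r / norm for r in raw]

# Columns of lv are the left eigenvectors: lv[:, 0] is pi, lv[:, 1] is phi1.
lv = [[pi[0], phi1[0]],
      [pi[1], phi1[1]]]


def left_apply(v, M):
    """Multiply a row vector by a matrix: (v M)[j] = sum_k v[k] * M[k][j]."""
    return [sum(v[k] * M[k][j] for k in range(len(v))) for j in range(len(M[0]))]


for i in range(2):
    column = [lv[0][i], lv[1][i]]
    print(i, sum(c * c / p for c, p in zip(column, pi)))
```

Both printed norms should equal 1 up to floating-point error; for the first column this reduces to sum(pi) == 1.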
all_right_eigenvectors_
Right eigenvectors, \(\Psi\), of each transition matrix in the ensemble.
Each transition matrix’s right eigenvectors are normalized such that:
Weighted by the stationary distribution, the right eigenvectors are normalized to 1. That is, sum(rv[:, i] * rv[:, i] * self.populations_) == 1, or \(\langle \psi_i, \psi_i \rangle_{\mu} = 1\)
Returns: | rv – The columns of rv, rv[:, i], are the right eigenvectors of transmat_. |
---|---|
Return type: | array-like, shape=(n_samples, n_states, n_timescales+1) |
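As a worked check of the stationary-weighted normalization, the sketch below normalizes the right eigenvectors of an illustrative 2-state matrix. The matrix, its stationary distribution, and the helper `weighted_inner` are chosen for illustration and are not part of msmbuilder.

```python
import math

# Illustrative 2-state transition matrix and its stationary distribution.
T = [[0.9, 0.1],
     [0.2, 0.8]]
pi = [2.0 / 3.0, 1.0 / 3.0]

# Unnormalized right eigenvectors: the constant vector (eigenvalue 1) and
# (T[0][1], -T[1][0]) (eigenvalue trace(T) - 1 for a 2-state chain).
raw_vectors = [[1.0, 1.0], [T[0][1], -T[1][0]]]


def weighted_inner(u, v, weights):
    """Inner product of u and v weighted by the stationary distribution."""
    return sum(a * b * w for a, b, w in zip(u, v, weights))


# Rescale each eigenvector so that its weighted norm is 1.
psi = [[x / math.sqrt(weighted_inner(v, v, pi)) for x in v] for v in raw_vectors]

print(weighted_inner(psi[0], psi[0], pi))  # norm of psi_0
print(weighted_inner(psi[1], psi[1], pi))  # norm of psi_1
print(weighted_inner(psi[0], psi[1], pi))  # overlap between psi_0 and psi_1
```

For this reversible example the two eigenvectors are also orthogonal under the same weighting, so the last value is zero up to floating-point error.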
all_timescales_
Implied relaxation timescales of each sample in the ensemble.
Returns: | timescales – The longest implied relaxation timescales of each sample in the ensemble of transition matrices, expressed in units of the time step between indices in the source data supplied to fit(). |
---|---|
Return type: | array-like, shape = (n_samples, n_timescales,) |
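Implied timescales follow from the eigenvalues through the standard relation t_i = -lag_time / ln(lambda_i), applied to every eigenvalue except the first (which is 1 and corresponds to the stationary process). The sketch below applies it per sample; `implied_timescales` is a hypothetical helper and the eigenvalues are made up for illustration.

```python
import math


def implied_timescales(eigenvalues, lag_time=1):
    """Hypothetical helper: timescales from eigenvalues sorted in descending order.

    The leading eigenvalue (1.0) is skipped because its timescale is infinite.
    """
    return [-lag_time / math.log(lam) for lam in eigenvalues[1:]]


# Made-up eigenvalues for three posterior samples, shape (n_samples, n_timescales + 1).
ensemble_eigenvalues = [
    [1.0, 0.70],
    [1.0, 0.63],
    [1.0, 0.77],
]
lag_time = 2

# One row of timescales per sample, shape (n_samples, n_timescales).
ensemble_timescales = [implied_timescales(e, lag_time) for e in ensemble_eigenvalues]

for row in ensemble_timescales:
    print([round(t, 3) for t in row])
```

The spread of each timescale across the rows gives a direct estimate of its statistical uncertainty.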
References
[1] | Prinz, Jan-Hendrik, et al. “Markov models of molecular kinetics: Generation and validation.” J. Chem. Phys. 134.17 (2011): 174105. |
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters: |
|
---|---|
Returns: | X_new – Transformed array. |
Return type: | numpy array of shape [n_samples, n_features_new] |
get_params(deep=True)
Get parameters for this estimator.
Parameters: | deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators. |
---|---|
Returns: | params – Parameter names mapped to their values. |
Return type: | mapping of string to any |
inverse_transform(sequences)
Transform a list of sequences from internal indexing into labels.
Parameters: | sequences (list) – List of sequences, each of which is a one-dimensional array of integers in 0, ..., n_states_ - 1. |
---|---|
Returns: | sequences – List of sequences, each of which is a one-dimensional array of labels. |
Return type: | list |
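Conceptually, this is a lookup through the inverse of mapping_. The sketch below illustrates the idea with plain lists and a made-up mapping; `inverse_transform_sketch` is a hypothetical function, not the library's implementation.

```python
# Hypothetical mapping from input labels to internal indices.
mapping = {"A": 0, "C": 1, "D": 2}

# Invert it: internal index -> input label.
inverse = {index: label for label, index in mapping.items()}


def inverse_transform_sketch(sequences, inverse):
    """Map each internal index back to its original label."""
    return [[inverse[i] for i in seq] for seq in sequences]


internal = [[0, 1, 1, 2], [2, 0]]
labels = inverse_transform_sketch(internal, inverse)
print(labels)
```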
partial_transform(sequence, mode='clip')
Transform a sequence to internal indexing.
Recall that sequence can be arbitrary labels, whereas transmat_ and countsmat_ are indexed with integers between 0 and n_states_ - 1. This method maps a sequence from the labels onto this internal indexing.
Parameters: |
|
---|---|
Returns: | mapped_sequence – If mode is “fill”, return an ndarray in internal indexing. If mode is “clip”, return a list of ndarrays each in internal indexing. |
Return type: | list or ndarray |
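The parameter descriptions are missing from this page, so the handling of unmapped labels in the sketch below is an assumption: "clip" is taken to drop labels that have no internal index and split the sequence at those points, and "fill" is taken to replace them with NaN. The function `partial_transform_sketch` is hypothetical and uses plain lists in place of ndarrays.

```python
import math


def partial_transform_sketch(sequence, mapping, mode="clip"):
    """Hypothetical illustration of the two modes on a single sequence."""
    if mode == "fill":
        # Assumption: unmapped labels become NaN, so the length is preserved.
        return [float(mapping[x]) if x in mapping else math.nan for x in sequence]
    if mode == "clip":
        # Assumption: unmapped labels are dropped and the sequence is split there.
        segments, current = [], []
        for x in sequence:
            if x in mapping:
                current.append(mapping[x])
            elif current:
                segments.append(current)
                current = []
        if current:
            segments.append(current)
        return segments
    raise ValueError(f"unknown mode: {mode!r}")


mapping = {"A": 0, "C": 1}               # label "B" has no internal index
sequence = ["A", "A", "B", "C", "A", "B"]

clipped = partial_transform_sketch(sequence, mapping, mode="clip")
filled = partial_transform_sketch(sequence, mapping, mode="fill")
print(clipped)
print(filled)
```

Under these assumptions "clip" returns several shorter segments while "fill" returns one sequence of the original length, which matches the return types listed above.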
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: | |
---|---|
Return type: | self |
transform(sequences, mode='clip')
Transform a list of sequences to internal indexing.
Recall that sequences can be arbitrary labels, whereas transmat_ and countsmat_ are indexed with integers between 0 and n_states_ - 1. This method maps a set of sequences from the labels onto this internal indexing.
Parameters: |
|
---|---|
Returns: | mapped_sequences – List of sequences in internal indexing |
Return type: | list |