msmbuilder.featurizer.GaussianSolventFeaturizer

class msmbuilder.featurizer.GaussianSolventFeaturizer(solute_indices, solvent_indices, sigma, periodic=False)

Featurizer based on Gaussian-weighted pairwise distances between solute and solvent atoms.

We apply a Gaussian kernel to each solute-solvent pairwise distance and sum the kernel values for each solute atom, resulting, for each frame, in a vector of length len(solute_indices).

The values can be physically interpreted as the degree of solvation of each solute atom.
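As a rough illustration of the computation described above, the following dependency-free sketch sums a Gaussian kernel over solute-solvent distances for a single frame. The function name gaussian_solvent_fingerprint, the coordinate layout, and the kernel form exp(-d^2 / (2 sigma^2)) are assumptions made for illustration; the library's exact kernel expression and its handling of periodic boundaries may differ.

```python
import math


def gaussian_solvent_fingerprint(coords, solute_indices, solvent_indices, sigma):
    """Hypothetical helper: one solvation value per solute atom for a single frame.

    coords is a list of (x, y, z) tuples; periodic images are ignored here.
    """
    fingerprint = []
    for i in solute_indices:
        total = 0.0
        for j in solvent_indices:
            d = math.dist(coords[i], coords[j])  # solute-solvent distance
            # Assumed kernel form: nearby solvent atoms contribute close to 1,
            # distant ones close to 0.
            total += math.exp(-d * d / (2.0 * sigma * sigma))
        fingerprint.append(total)
    return fingerprint


# Toy frame: two solute atoms (0, 1) and two solvent atoms (2, 3).
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
values = gaussian_solvent_fingerprint(coords, [0, 1], [2, 3], sigma=1.0)
```

In this toy frame the first solute atom sits next to both solvent atoms and receives a large value, while the second is far from both and receives a value near zero, which matches the "degree of solvation" reading.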

Parameters:
  • solute_indices (np.ndarray, shape=(n_solute,)) – Indices of solute atoms
  • solvent_indices (np.ndarray, shape=(n_solvent,)) – Indices of solvent atoms
  • sigma (float) – Sets the length scale for the Gaussian kernel
  • periodic (bool) – Whether to consider a periodic system in distance calculations

References

[1] Gu, Chen, et al. BMC Bioinformatics 14, no. Suppl 2 (January 21, 2013): S8. doi:10.1186/1471-2105-14-S2-S8.

__init__(solute_indices, solvent_indices, sigma, periodic=False)

Methods

__init__(solute_indices, solvent_indices, sigma[, periodic])
describe_features(traj) Generic method for describing features.
featurize(traj)
fit(traj_list[, y])
fit_transform(X[, y]) Fit to data, then transform it.
get_params([deep]) Get parameters for this estimator.
partial_transform(traj) Featurize an MD trajectory into a vector space via calculation of solvent fingerprints.
set_params(**params) Set the parameters of this estimator.
summarize() Return some diagnostic summary statistics about this Markov model.
transform(traj_list[, y]) Featurize several trajectories.

describe_features(traj)

Generic method for describing features.

Parameters: traj (mdtraj.Trajectory) – Trajectory to use
Returns: feature_descs – List of dictionaries, one per feature, each describing the atoms participating in that feature with the following keys:
  • resnames: unique names of residues
  • atominds: the four atom indices
  • resseqs: unique residue sequence ids (not necessarily 0-indexed)
  • resids: unique residue ids (0-indexed)
  • featurizer: Featurizer name
  • featuregroup: Other information
Return type: list of dict

Notes

This method falls back to returning N/A for every field if describe_features is not implemented in the subclass.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (numpy array of shape [n_samples, n_features]) – Training set.
  • y (numpy array of shape [n_samples]) – Target values.
Returns: X_new – Transformed array.
Return type: numpy array of shape [n_samples, n_features_new]

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

partial_transform(traj)

Featurize an MD trajectory into a vector space via calculation of solvent fingerprints.

Parameters: traj (mdtraj.Trajectory) – A molecular dynamics trajectory to featurize.
Returns: features – A featurized trajectory is a 2D array of shape (length_of_trajectory x n_features), where each features[i] vector is computed by applying the featurization function to the `i`th snapshot of the input trajectory.
Return type: np.ndarray, dtype=float, shape=(n_samples, n_features)

See also

transform()
simultaneously featurize a collection of MD trajectories

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
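To illustrate the <component>__<parameter> convention, the sketch below splits a nested parameter name at the first double underscore. The helper split_nested_param and the component name featurizer are hypothetical; this mirrors the naming convention only and is not the estimator's actual implementation.

```python
def split_nested_param(name):
    """Hypothetical helper: split 'component__parameter' at the first '__'."""
    component, delim, parameter = name.partition("__")
    if not delim:
        # No double underscore: the parameter belongs to the estimator itself.
        return None, name
    return component, parameter


top_level = split_nested_param("sigma")
nested = split_nested_param("featurizer__sigma")
```

Splitting only at the first delimiter leaves any remaining `__` in the parameter part, so deeper nesting can be resolved one level at a time.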

Return type: self

summarize()

Return some diagnostic summary statistics about this Markov model.

transform(traj_list, y=None)

Featurize several trajectories.

Parameters: traj_list (list(mdtraj.Trajectory)) – Trajectories to be featurized.
Returns: features – The featurized trajectories. features[i] is the featurized version of traj_list[i] and has shape (n_samples_i, n_features).
Return type: list(np.ndarray), length = len(traj_list)
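
The relationship between transform() and partial_transform() can be sketched with a toy class. ToyFeaturizer and its per-frame feature are hypothetical stand-ins, and plain lists replace mdtraj.Trajectory objects and NumPy arrays. The sketch assumes that transform() simply applies partial_transform() to each trajectory in turn.

```python
class ToyFeaturizer:
    """Hypothetical stand-in; a 'trajectory' is a list of frames (lists of floats)."""

    def partial_transform(self, traj):
        # One feature vector per frame: here, just the sum of the frame's values.
        return [[sum(frame)] for frame in traj]

    def transform(self, traj_list, y=None):
        # Assumed behaviour: featurize each trajectory independently.
        return [self.partial_transform(traj) for traj in traj_list]


traj_list = [
    [[1.0, 2.0], [3.0, 4.0]],  # trajectory with 2 frames
    [[5.0, 6.0]],              # trajectory with 1 frame
]
features = ToyFeaturizer().transform(traj_list)
```

The output keeps one entry per input trajectory, and each entry has as many rows as that trajectory has frames, so trajectories of different lengths can be featurized together.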