marsi.chemistry package

Submodules

marsi.chemistry.common module

marsi.chemistry.common.rmsd()

Root-mean-squared deviation of XYZ.

$$RMSD = \sqrt{\frac{1}{n} \sum_{i=1}^{n}{(v_i - w_i)^2}}$$

Parameters:
  • v (list) – First list of x, y, z coordinates.
  • w (list) – Second list of x, y, z coordinates.
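
A minimal sketch of the RMSD formula in pure Python, not the package's implementation. The name rmsd_sketch is hypothetical, and it is an assumption that v and w are equally long lists of (x, y, z) points and that n counts points rather than individual coordinates.

```python
import math


def rmsd_sketch(v, w):
    """RMSD between two equally long lists of (x, y, z) points."""
    if not v or len(v) != len(w):
        raise ValueError("v and w must be non-empty and of equal length")
    n = len(v)
    # Sum the squared coordinate differences over all paired points.
    squared = sum(
        (a - b) ** 2
        for point_v, point_w in zip(v, w)
        for a, b in zip(point_v, point_w)
    )
    return math.sqrt(squared / n)


# Every point in w is displaced by exactly 1 along z relative to v.
v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
w = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd_sketch(v, w))
```

Identical inputs give an RMSD of 0, and the value grows with the average displacement between paired points.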

marsi.chemistry.common.tanimoto_coefficient()

Calculate the Tanimoto coefficient for 2 fingerprints.

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto coefficient.

Return type:

float

marsi.chemistry.common.tanimoto_distance()

Calculate the Tanimoto distance for 2 fingerprints (1 - tanimoto coefficient).

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto distance.

Return type:

float
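
A minimal sketch of the two fingerprint measures above, assuming binary fingerprints given as sequences of 0/1 values (the documented functions take ndarrays). The function names, and the choice to return 0.0 when neither fingerprint has a set bit, are assumptions made for illustration.

```python
def tanimoto_coefficient_sketch(fingerprint1, fingerprint2):
    """Shared set bits divided by bits set in either fingerprint."""
    if len(fingerprint1) != len(fingerprint2):
        raise ValueError("fingerprints must have the same length")
    pairs = list(zip(fingerprint1, fingerprint2))
    both = sum(1 for a, b in pairs if a and b)
    either = sum(1 for a, b in pairs if a or b)
    if either == 0:
        # Assumed convention for two all-zero fingerprints.
        return 0.0
    return both / either


def tanimoto_distance_sketch(fingerprint1, fingerprint2):
    """Distance defined as 1 - Tanimoto coefficient."""
    return 1.0 - tanimoto_coefficient_sketch(fingerprint1, fingerprint2)


fp1 = [1, 1, 0, 1, 0]
fp2 = [1, 0, 0, 1, 1]
# Two bits are shared and four bits are set in at least one fingerprint.
print(tanimoto_coefficient_sketch(fp1, fp2))
print(tanimoto_distance_sketch(fp1, fp2))
```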

marsi.chemistry.common.monte_carlo_volume()

Adapted from:

Simple Monte Carlo estimation of VdW molecular volume (in A^3) by Geoffrey Hutchison <geoffh@pitt.edu>

https://github.com/ghutchis/hutchison-cluster

marsi.chemistry.common_ext module

marsi.chemistry.common_ext.monte_carlo_volume()

Adapted from:

Simple Monte Carlo estimation of VdW molecular volume (in A^3) by Geoffrey Hutchison <geoffh@pitt.edu>

https://github.com/ghutchis/hutchison-cluster

marsi.chemistry.common_ext.rmsd()

Root-mean-squared deviation of XYZ.

$$RMSD = \sqrt{\frac{1}{n} \sum_{i=1}^{n}{(v_i - w_i)^2}}$$

Parameters:
  • v (list) – First list of x, y, z coordinates.
  • w (list) – Second list of x, y, z coordinates.

marsi.chemistry.common_ext.tanimoto_coefficient()

Calculate the Tanimoto coefficient for 2 fingerprints.

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto coefficient.

Return type:

float

marsi.chemistry.common_ext.tanimoto_distance()

Calculate the Tanimoto distance for 2 fingerprints (1 - tanimoto coefficient).

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto distance.

Return type:

float

marsi.chemistry.molecule module

class marsi.chemistry.molecule.Molecule(ob_mol, rd_mol)

Bases: object

Object representing a molecule.

Attributes

inchi
inchi_key
num_atoms
num_bonds
num_rings

Methods

fingerprint([fpformat, bits])
from_inchi(inchi) Builds a molecule from an InChI string.
from_mol(path_or_str) Builds a molecule from a file or string.
from_sdf(path_or_str) Builds a molecule from a file or string.
classmethod from_sdf(path_or_str)

Builds a molecule from a file or string.

Parameters:path_or_str (str) – The input file name or a string with the molecule description in SDF format.
Returns:
Return type:marsi.chemistry.molecule.Molecule
classmethod from_inchi(inchi)

Builds a molecule from an InChI string.

Parameters:inchi (str) – A valid InChI string.
Returns:
Return type:marsi.chemistry.molecule.Molecule
classmethod from_mol(path_or_str)

Builds a molecule from a file or string.

Parameters:path_or_str (str) – The input file name or a string with the molecule description in MOL format.
Returns:
Return type:marsi.chemistry.molecule.Molecule
inchi
inchi_key
num_atoms
num_bonds
num_rings
fingerprint(fpformat='maccs', bits=None)

marsi.chemistry.openbabel module

marsi.chemistry.openbabel.has_radical(mol)

Checks whether a pybel.Molecule has radicals. Radicals are identified as atoms with an atomic number of 0.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:True if there are any radicals.
Return type:bool
marsi.chemistry.openbabel.mol_to_inchi(mol)

Makes an InChI from a pybel.Molecule.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:An InChI string.
Return type:str
marsi.chemistry.openbabel.mol_to_svg(mol)

Makes an SVG from a pybel.Molecule.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:An SVG string.
Return type:str
marsi.chemistry.openbabel.mol_to_inchi_key(mol)

Makes an InChI Key from a pybel.Molecule.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:An InChI key.
Return type:str
marsi.chemistry.openbabel.inchi_to_inchi_key(inchi)

Makes an InChI Key from an InChI string.

Parameters:inchi (str) – A valid InChI string.
Returns:An InChI key.
Return type:str
marsi.chemistry.openbabel.mol_drugbank_id(mol)

Returns the DrugBank ID from the molecule data.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:DrugBank ID
Return type:str
marsi.chemistry.openbabel.mol_pubchem_id(mol)

Returns the PubChem Compound ID from the molecule data.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:PubChem Compound ID
Return type:str
marsi.chemistry.openbabel.mol_chebi_id(mol)

Returns the ChEBI ID from the molecule data.

Parameters:mol (pybel.Molecule) – A molecule.
Returns:ChEBI ID
Return type:str
marsi.chemistry.openbabel.fingerprint(mol, fpformat='maccs')

Returns the Fingerprint of the molecule.

Parameters:
  • mol (pybel.Molecule) – A molecule.
  • fpformat (str) – A valid fingerprint format (see pybel.fps)
Returns:

A fingerprint

Return type:

pybel.Fingerprint

marsi.chemistry.openbabel.inchi_to_molecule(inchi)

Returns a molecule from an InChI string.

Parameters:inchi (str) – A valid InChI string.
Returns:mol – A molecule.
Return type:pybel.Molecule
marsi.chemistry.openbabel.mol_str_to_inchi(mol_str)

Returns the InChI for a molecule given as a MOL string.

Parameters:mol_str (str) – A valid MOL string.
Returns:An InChI string.
Return type:str
marsi.chemistry.openbabel.smiles_to_molecule(smiles)

Returns a pybel.Molecule built from a SMILES string.

Parameters:smiles (str) – A valid SMILES string.
Returns:A molecule.
Return type:pybel.Molecule
marsi.chemistry.openbabel.fingerprint_to_bits(fp, bits=1024)

Converts a pybel.Fingerprint into a binary array.

Parameters:
  • fp (pybel.Fingerprint) – A molecule fingerprint.
  • bits (int) – Number of bits (default is 1024)
Returns:

An array of 0’s and 1’s.

Return type:

bitarray
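
A minimal sketch of the conversion idea, not the package's code. It assumes the fingerprint exposes the positions of its set bits as 1-based integers and it returns a plain list rather than a bitarray; the name set_bits_to_array is hypothetical.

```python
def set_bits_to_array(set_bits, bits=1024):
    """Expand 1-based positions of set bits into a list of 0s and 1s."""
    array = [0] * bits
    for position in set_bits:
        if not 1 <= position <= bits:
            raise ValueError("bit position %d is outside 1..%d" % (position, bits))
        # Shift from the assumed 1-based position to a 0-based index.
        array[position - 1] = 1
    return array


# Illustrative input: bits 1, 3 and 8 are set in an 8-bit fingerprint.
bit_array = set_bits_to_array([1, 3, 8], bits=8)
print(bit_array)
```

A fixed-length 0/1 array like this is the form that bitwise comparisons such as the Tanimoto coefficient operate on.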

marsi.chemistry.openbabel.align_molecules(reference, molecule, include_h=True, symmetry=True)

Align molecule to a reference.

Parameters:
  • reference (pybel.Molecule) – A reference molecule.
  • molecule (pybel.Molecule) – Molecule to align.
  • include_h (bool) – Include implicit hydrogen atoms.
  • symmetry (bool) –
Returns:

Return type:

list

marsi.chemistry.openbabel.get_spectrophore_data(molecule)

A Spectrophore is calculated as a vector of 48 numbers (in the case of a non-stereospecific Spectrophore). The 48 doubles are organised into 4 sets of 12 doubles each:

  • numbers 01-12: Spectrophore values calculated from the atomic partial charges;
  • numbers 13-24: Spectrophore values calculated from the atomic lipophilicity properties;
  • numbers 25-36: Spectrophore values calculated from the atomic shape deviations;
  • numbers 37-48: Spectrophore values calculated from the atomic electrophilicity properties.
Parameters:molecule (pybel.Molecule) –
Returns:Float 1D-Array with the 48 features generated by OBSpectrophore.
Return type:ndarray
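
The layout described above can be sketched as a helper that splits the 48 values into their four named groups. The function name and the dictionary keys are hypothetical, and the input here is placeholder data rather than real Spectrophore output.

```python
def split_spectrophore(values):
    """Split a 48-value Spectrophore vector into four groups of 12."""
    if len(values) != 48:
        raise ValueError("expected 48 values, got %d" % len(values))
    names = (
        "partial_charges",
        "lipophilicity",
        "shape_deviations",
        "electrophilicity",
    )
    # Group k covers numbers 12*k + 1 through 12*(k + 1), counted from 1.
    return {
        name: list(values[12 * k:12 * (k + 1)])
        for k, name in enumerate(names)
    }


# Placeholder vector whose entries equal their 1-based position.
groups = split_spectrophore(list(range(1, 49)))
print(groups["lipophilicity"])
```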
marsi.chemistry.openbabel.solubility(molecule, log_value=True)

ESOL: Estimating Aqueous Solubility Directly from Molecular Structure [1]

$$\log(S_w) = 0.16 - 0.63 \, logP - 0.0062 \, MWT + 0.066 \, RB - 0.74 \, AP$$

  • MWT = Molecular Weight
  • RB = Rotatable Bonds
  • AP = Aromatic Proportion

Parameters:
  • molecule (pybel.Molecule) – A molecule.
  • log_value (bool) – Return log(Solubility) if true (default).
Returns:

log(S_w): log value of solubility

Return type:

float
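
A worked sketch of the ESOL equation, taking the four descriptors as plain numbers instead of computing them from a molecule. The function name and the example descriptor values are hypothetical, and a base-10 logarithm is assumed when converting back from log(S_w).

```python
def esol_solubility_sketch(logp, mwt, rb, ap, log_value=True):
    """Evaluate the ESOL linear model for given descriptor values."""
    log_sw = 0.16 - 0.63 * logp - 0.0062 * mwt + 0.066 * rb - 0.74 * ap
    if log_value:
        return log_sw
    # Assumption: log(S_w) is a base-10 logarithm.
    return 10 ** log_sw


# Illustrative descriptors, not taken from a real molecule.
log_sw = esol_solubility_sketch(logp=2.0, mwt=100.0, rb=2, ap=0.5)
# 0.16 - 1.26 - 0.62 + 0.132 - 0.37
print(log_sw)
```

Higher logP, molecular weight, and aromatic proportion all lower the predicted solubility, while rotatable bonds raise it slightly.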

marsi.chemistry.qsar module

marsi.chemistry.rdkit module

marsi.chemistry.rdkit.inchi_to_molecule(inchi)

Returns a molecule from an InChI string.

Parameters:inchi (str) – A valid InChI string.
Returns:A molecule.
Return type:rdkit.Chem.rdchem.Mol
marsi.chemistry.rdkit.mol_to_molecule(file_or_molecule_desc, from_file=True)

Returns a molecule from a MOL file or MOL string.

Parameters:
  • file_or_molecule_desc (str) – A valid MOL file path or a valid MOL string.
  • from_file (bool) – If True tries to read the molecule from a file.
Returns:

A molecule.

Return type:

rdkit.Chem.rdchem.Mol

marsi.chemistry.rdkit.sdf_to_molecule(file_or_molecule_desc, from_file=True)

Returns a molecule from an SDF file or SDF string.

Parameters:
  • file_or_molecule_desc (str) – A valid SDF file path or a valid SDF string.
  • from_file (bool) – If True tries to read the molecule from a file.
Returns:

A molecule.

Return type:

rdkit.Chem.rdchem.Mol

marsi.chemistry.rdkit.inchi_to_inchi_key(inchi)

Makes an InChI Key from an InChI string.

Parameters:inchi (str) – A valid InChI string.
Returns:An InChI key.
Return type:str
marsi.chemistry.rdkit.mol_to_inchi_key(mol)

Makes an InChI Key from a Molecule.

Parameters:mol (rdkit.Chem.rdchem.Mol) – A molecule.
Returns:An InChI key.
Return type:str
marsi.chemistry.rdkit.mol_to_inchi(mol)

Makes an InChI from a Molecule.

Parameters:mol (rdkit.Chem.rdchem.Mol) – A molecule.
Returns:An InChI string.
Return type:str
marsi.chemistry.rdkit.fingerprint(molecule, fpformat='maccs')

Returns the Fingerprint of the molecule.

Parameters:
  • molecule (rdkit.Chem.rdchem.Mol) – A molecule.
  • fpformat (str) – A valid fingerprint format.
Returns:

A fingerprint.

Return type:

rdkit.DataStructs.cDataStructs.ExplicitBitVect

marsi.chemistry.rdkit.fingerprint_to_bits(fp, bits=1024)

Converts an RDKit fingerprint into a binary array.

Parameters:
  • fp (rdkit.DataStructs.cDataStructs.ExplicitBitVect) – A molecule fingerprint.
  • bits (int) – Number of bits (default is 1024)
Returns:

Return type:

bitarray

marsi.chemistry.rdkit.maximum_common_substructure(reference, molecule, match_rings=True, match_fraction=0.6, timeout=None)

Returns the Maximum Common Substructure (MCS) between two molecules.

Parameters:
  • reference (rdkit.Chem.Mol) – A molecule.
  • molecule (rdkit.Chem.Mol) – Another molecule.
  • match_rings (bool) – Force ring structures to match.
  • match_fraction (float) – Fraction of the reference atoms that the match must cover (default: 0.6).
  • timeout (int) – Time out in seconds.
Returns:

Maximum Common Substructure result.

Return type:

rdkit.Chem.MCS.MCSResult

marsi.chemistry.rdkit.mcs_similarity(mcs_result, molecule, atoms_weight=0.5, bonds_weight=0.5)

Returns the similarity between a molecule and a Maximum Common Substructure (MCS) result.

$$ \text{atoms\_weight} \cdot \frac{\text{mcs\_res.similar\_atoms}}{\text{mol.num\_atoms}} + \text{bonds\_weight} \cdot \frac{\text{mcs\_res.similar\_bonds}}{\text{mol.num\_bonds}} $$

Parameters:
  • mcs_result (rdkit.Chem.MCS.MCSResult) – The result of a Maximum Common Substructure run.
  • molecule (rdkit.Chem.Mol) – A molecule.
  • atoms_weight (float) – How much similar atoms matter.
  • bonds_weight (float) – How much similar bonds matter.
Returns:

Similarity value

Return type:

float
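
A small sketch of the weighted score, using plain counts in place of the MCS result and molecule objects. The function name and argument names are hypothetical; the real function reads these counts from its inputs.

```python
def mcs_similarity_sketch(similar_atoms, similar_bonds, num_atoms, num_bonds,
                          atoms_weight=0.5, bonds_weight=0.5):
    """Weighted share of a molecule's atoms and bonds covered by the MCS."""
    if num_atoms <= 0 or num_bonds <= 0:
        raise ValueError("num_atoms and num_bonds must be positive")
    atom_share = similar_atoms / num_atoms
    bond_share = similar_bonds / num_bonds
    return atoms_weight * atom_share + bonds_weight * bond_share


# Illustrative counts: the MCS covers 6 of 8 atoms and 5 of 10 bonds.
score = mcs_similarity_sketch(6, 5, 8, 10)
# 0.5 * 0.75 + 0.5 * 0.5
print(score)
```

With weights that sum to 1, the score stays between 0 (nothing shared) and 1 (the MCS covers the whole molecule).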

marsi.chemistry.rdkit.structural_similarity(reference, molecule, atoms_weight=0.5, bonds_weight=0.5, match_rings=True, match_fraction=0.6, timeout=None)

Returns a structural similarity based on the Maximum Common Substructure (MCS) between two molecules.

$$ mcs_s(ref) * mcs_s(mol) $$

Parameters:
  • reference (rdkit.Chem.Mol) – A reference molecule.
  • molecule (rdkit.Chem.Mol) – A molecule.
  • atoms_weight (float) – How much similar atoms matter.
  • bonds_weight (float) – How much similar bonds matter.
  • match_rings (bool) – Force ring structures to match.
  • match_fraction (float) – Fraction of the reference atoms that the match must cover (default: 0.6).
  • timeout (int) – Time out in seconds.
Returns:

Similarity between reference and molecule.

Return type:

float

marsi.chemistry.rdkit.monte_carlo_volume(molecule, coordinates=None, tolerance=1, max_iterations=10000, step_size=1000, seed=1517933887.7638862, verbose=False, forcefield='mmff94', steps=100)

Adapted from:

Simple Monte Carlo estimation of VdW molecular volume (in A^3) by Geoffrey Hutchison <geoffh@pitt.edu>

https://github.com/ghutchis/hutchison-cluster

Parameters:
  • molecule (rdkit.Chem.rdchem.Mol) – A molecule from rdkit.
  • coordinates (list) – A list of preselected x, y, z coordinates. It must match the order of the atoms.
  • tolerance (float) – The convergence tolerance for the Monte Carlo estimate: iteration stops when abs(new_volume - volume) < tolerance.
  • max_iterations (int) – Maximum number of iterations before the algorithm stops.
  • step_size (int) – Number of points to add each step.
  • seed (object) – A valid seed for random.
  • verbose (bool) – Print debug information if True.
  • forcefield (str) – The force field to get a 3D molecule. (only if it is not 3D already)
  • steps (int) – The number of steps used for the force field to get a 3D molecule. (only if it is not 3D already)
Returns:

Molecule volume

Return type:

float
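
A pure-Python sketch of the general Monte Carlo idea behind these parameters, not the package's implementation. It assumes atoms are supplied directly as (x, y, z, radius) tuples, whereas the documented function takes an RDKit molecule; the function name, the fixed seed, and the example radius are illustrative.

```python
import math
import random


def monte_carlo_volume_sketch(atoms, tolerance=1.0, max_iterations=10000,
                              step_size=1000, seed=0):
    """Estimate the volume of a union of spheres given as (x, y, z, radius)."""
    rng = random.Random(seed)
    # Axis-aligned bounding box that encloses every sphere.
    low = [min(a[i] - a[3] for a in atoms) for i in range(3)]
    high = [max(a[i] + a[3] for a in atoms) for i in range(3)]
    box_volume = math.prod(h - l for l, h in zip(low, high))

    inside = 0
    total = 0
    volume = 0.0
    for _ in range(max_iterations):
        # Add step_size random points and count those inside any sphere.
        for _ in range(step_size):
            point = [rng.uniform(low[i], high[i]) for i in range(3)]
            if any(
                sum((point[i] - a[i]) ** 2 for i in range(3)) <= a[3] ** 2
                for a in atoms
            ):
                inside += 1
        total += step_size
        new_volume = box_volume * inside / total
        # Stop once successive estimates agree within the tolerance.
        if abs(new_volume - volume) < tolerance:
            return new_volume
        volume = new_volume
    return volume


# A single sphere, so the estimate can be compared with the exact volume.
estimate = monte_carlo_volume_sketch([(0.0, 0.0, 0.0, 1.7)])
exact = 4.0 / 3.0 * math.pi * 1.7 ** 3
print(estimate, exact)
```

A smaller tolerance or a larger step_size trades run time for a tighter estimate; overlapping spheres are handled because each point is counted at most once.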

marsi.chemistry.solubility module

Module contents

marsi.chemistry.convex_hull_volume(xyz)
marsi.chemistry.monte_carlo_volume()

Adapted from:

Simple Monte Carlo estimation of VdW molecular volume (in A^3) by Geoffrey Hutchison <geoffh@pitt.edu>

https://github.com/ghutchis/hutchison-cluster

marsi.chemistry.tanimoto_distance()

Calculate the Tanimoto distance for 2 fingerprints (1 - tanimoto coefficient).

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto distance.

Return type:

float

marsi.chemistry.tanimoto_coefficient()

Calculate the Tanimoto coefficient for 2 fingerprints.

Parameters:
  • fingerprint1 (ndarray) – First fingerprint.
  • fingerprint2 (ndarray) – Second fingerprint.
Returns:

The Tanimoto coefficient.

Return type:

float