OCDocker.Receptor module

class OCDocker.Receptor.Receptor(structure, name, mol2_path='', c_model='gasteiger', gravy_scale='KyteDoolitle', relative_asa_cutoff=0.7, from_json_descriptors='', overwrite=False, clean=False, canonicalize_pdb='auto', allow_missing_surface=False)

Bases: object

Represents a receptor (protein) molecule with computed descriptors.

This class loads receptor structures from PDB or mmCIF files and computes various molecular descriptors including amino acid composition, surface accessibility (SASA), dipole moment, isoelectric point, GRAVY score, aromaticity, and instability index.

Parameters:
  • structure (str | Bio.PDB.Structure.Structure) – Path to a PDB/mmCIF file or a BioPython Structure object.

  • name (str) – Name identifier for the receptor.

  • mol2_path (str, optional) – Path to an existing MOL2 file, by default “”.

  • c_model (str, optional) – Charge model for dipole moment calculation, by default “gasteiger”.

  • gravy_scale (str, optional) – GRAVY scale to use, by default “KyteDoolitle”.

  • relative_asa_cutoff (float, optional) – Relative accessible surface area cutoff for surface amino acids, by default 0.7.

  • from_json_descriptors (str, optional) – Path to JSON file containing pre-computed descriptors, by default “”.

  • overwrite (bool, optional) – Whether to overwrite existing files, by default False.

  • clean (bool, optional) – Whether to clean/renumber the PDB structure, by default False.

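  • canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”; by default “auto”.

  • allow_missing_surface (bool, optional) – If True, allows initialization to continue when DSSP/surface AA counts are unavailable, using zeroed surface counts, by default False.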

name

Name of the receptor.

Type:

str

structure

BioPython structure object.

Type:

Bio.PDB.Structure.Structure

path

Path to the structure file.

Type:

str

original_path

Original input path (e.g., .cif/.mmcif) when conversion occurs.

Type:

str

clean_source_path

Previous path before cleaning, when a cleaned file is generated.

Type:

str

SASA

Solvent accessible surface area.

Type:

float

DipoleMoment

Dipole moment of the receptor.

Type:

float

IsoelectricPoint

Isoelectric point (pI) of the receptor.

Type:

float

GRAVY

Grand average of hydropathy.

Type:

float

Aromaticity

Aromaticity index.

Type:

float

InstabilityIndex

Instability index.

Type:

float

countA, countR, countN, ..., countV

Count of each amino acid type.

Type:

int

TotalAALength

Total number of amino acids.

Type:

int

AvgAALength

Average chain length.

Type:

float

countChain

Number of chains.

Type:

int

descriptors_names = {'count': ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']}
single_descriptors = ['TotalAALength', 'AvgAALength', 'countChain', 'SASA', 'DipoleMoment', 'IsoelectricPoint', 'GRAVY', 'Aromaticity', 'InstabilityIndex']
allDescriptors = ['countA', 'countR', 'countN', 'countD', 'countC', 'countQ', 'countE', 'countG', 'countH', 'countI', 'countL', 'countK', 'countM', 'countF', 'countP', 'countS', 'countT', 'countW', 'countY', 'countV', 'TotalAALength', 'AvgAALength', 'countChain', 'SASA', 'DipoleMoment', 'IsoelectricPoint', 'GRAVY', 'Aromaticity', 'InstabilityIndex']
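
Judging from the three values listed above, allDescriptors appears to be the prefixed names from descriptors_names followed by single_descriptors. The sketch below reproduces that relationship in plain Python; the helper name build_all_descriptors is hypothetical and is not part of OCDocker.

```python
# Values copied from the class attributes listed above.
descriptors_names = {
    "count": ["A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
              "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V"],
}
single_descriptors = [
    "TotalAALength", "AvgAALength", "countChain", "SASA", "DipoleMoment",
    "IsoelectricPoint", "GRAVY", "Aromaticity", "InstabilityIndex",
]


def build_all_descriptors(prefixed, singles):
    """Expand {prefix: [suffixes]} into flat names, then append the single names.

    Hypothetical helper, written only to illustrate the naming pattern.
    """
    names = [prefix + suffix
             for prefix, suffixes in prefixed.items()
             for suffix in suffixes]
    return names + list(singles)


all_descriptors = build_all_descriptors(descriptors_names, single_descriptors)
print(len(all_descriptors))  # 20 amino acid counts + 9 single descriptors -> 29
```
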
__init__(structure, name, mol2_path='', c_model='gasteiger', gravy_scale='KyteDoolitle', relative_asa_cutoff=0.7, from_json_descriptors='', overwrite=False, clean=False, canonicalize_pdb='auto', allow_missing_surface=False)

Constructor of the class Receptor.

Parameters:
  • structure (str | Bio.PDB.Structure.Structure) – Path to the structure file OR Bio.PDB.Structure.Structure object.

  • name (str) – Name of the receptor.

  • mol2_path (str, optional) – Path to the mol2 file, by default “”.

  • c_model (str, optional) – Charge model to be used, by default “gasteiger”.

  • gravy_scale (str, optional) – Scale to be used to compute the GRAVY descriptor, by default “KyteDoolitle”.

  • relative_asa_cutoff (float, optional) – Relative cutoff to be used to compute the SASA descriptor, by default 0.7.

  • from_json_descriptors (str, optional) – Path to the json file containing the descriptors, by default “”.

  • overwrite (bool, optional) – Flag to denote if files will be overwritten, by default False.

  • clean (bool, optional) – Flag to denote if the pdb file will be cleaned, by default False.

  • canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”.

  • allow_missing_surface (bool, optional) – If True, allows initialization to continue when DSSP/surface AA counts are unavailable, using zeroed surface counts. Default is False.

Return type:

None

get_descriptors()

Return the descriptors for the Receptor object.

Parameters:

None

Returns:

The descriptors for the Receptor object.

Return type:

Dict[str, float | int]

is_valid()

Check if a Receptor object is valid.

Parameters:

None

Returns:

True if the Receptor object is valid, False otherwise.

Return type:

bool

print_attributes()

Print all attributes of the receptor to stdout.

Displays the receptor’s name, structure path, and all computed descriptors (SASA, dipole moment, isoelectric point, GRAVY, aromaticity, instability index, amino acid counts, etc.) in a formatted, aligned table.

Return type:

None

to_dict()

Return all the properties for the Receptor object.

Parameters:

None

Returns:

The properties for the Receptor object.

Return type:

Dict[str, float | int]

to_json(overwrite=False)

Stores the descriptors as JSON so that they do not have to be recomputed.

Parameters:

overwrite (bool, optional) – If True, the json file will be overwritten if it already exists. Default is False.

Returns:

The exit code of the command (based on the Error.py code table).

Return type:

int
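
The caching idea behind to_json and read_descriptors_from_json can be sketched with the standard library alone. This is an illustration under assumptions, not OCDocker's implementation: the helper names save_descriptors and load_descriptors are hypothetical, and the sketch returns a boolean where the real method returns an exit code from the Error.py table.

```python
import json
import os
import tempfile


def save_descriptors(descriptors, path, overwrite=False):
    """Write descriptors to a JSON file.

    Returns False without writing when the file exists and overwrite is False.
    (Hypothetical helper; OCDocker's to_json returns an integer exit code.)
    """
    if os.path.exists(path) and not overwrite:
        return False
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(descriptors, handle, indent=2)
    return True


def load_descriptors(path):
    """Read previously stored descriptors back into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "receptor_descriptors.json")
    first = save_descriptors({"countChain": 2, "GRAVY": -0.35}, cache)
    # The second call is refused because the file exists and overwrite is False.
    second = save_descriptors({"countChain": 99}, cache)
    restored = load_descriptors(cache)
```

With overwrite left at False, the first stored result is preserved, which matches the documented default of not replacing existing files.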

OCDocker.Receptor.compute_aromaticity(residues)

Compute the aromaticity according to Lobry, 1994.

Parameters:

residues (str) – The residues of the protein.

Returns:

The aromaticity of the protein.

Return type:

float
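
As a minimal sketch of the quantity involved, the Lobry (1994) aromaticity is the relative frequency of Phe, Trp and Tyr in the sequence. The function name aromaticity_sketch is hypothetical, and the handling of empty input (returning 0.0) is an assumption; OCDocker's own function may behave differently.

```python
def aromaticity_sketch(residues: str) -> float:
    """Relative frequency of Phe (F), Trp (W) and Tyr (Y) in a one-letter sequence."""
    if not residues:
        return 0.0  # assumption: an empty sequence has zero aromaticity
    sequence = residues.upper()
    aromatic = sum(sequence.count(aa) for aa in "FWY")
    return aromatic / len(sequence)


print(aromaticity_sketch("MFWYAAAA"))  # 3 aromatic residues out of 8 -> 0.375
```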

OCDocker.Receptor.compute_dipole_moment(structure, c_model='gasteiger')

Computes the receptor’s dipole moment.

Parameters:
  • structure (Bio.PDB.Structure.Structure | str) – The structure to be analysed or the path to the structure.

  • c_model (str, optional) – The charge model to be used, by default “gasteiger”.

Returns:

The dipole moment of the receptor.

Return type:

float

OCDocker.Receptor.compute_gravy(residues, scale='KyteDoolitle')

Computes the GRAVY (Grand Average of Hydropathy) according to Kyte and Doolittle, 1982.

Uses the given hydrophobicity scale; the default is the original scale proposed by Kyte and Doolittle (KyteDoolitle). Other options are: Aboderin, AbrahamLeo, Argos, BlackMould, BullBreese, Casari, Cid, Cowan3.4, Cowan7.5, Eisenberg, Engelman, Fasman, Fauchere, GoldSack, Guy, Jones, Juretic, Kidera, Miyazawa, Parker, Ponnuswamy, Rose, Roseman, Sweet, Tanford, Wilson and Zimmerman.

Parameters:
  • residues (str) – The residues of the protein.

  • scale (str, optional) – The hydrophobicity scale to be used, by default “KyteDoolitle”.

Returns:

The GRAVY of the protein.

Return type:

float
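
The GRAVY value is the mean hydropathy over all residues. The sketch below shows this for the Kyte-Doolittle scale only; the name gravy_sketch is hypothetical, and the choice to raise KeyError on non-standard letters is an assumption rather than documented OCDocker behaviour.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy_sketch(residues: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a one-letter sequence.

    Raises KeyError for letters outside the 20 standard amino acids and
    ValueError for an empty sequence (both choices are assumptions).
    """
    if not residues:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues.upper()) / len(residues)


# Hydrophobic residues give a positive score, charged residues a negative one.
print(gravy_sketch("AILV"))   # approximately 3.575
print(gravy_sketch("RKDE"))   # approximately -3.85
```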

OCDocker.Receptor.compute_instability_index(residues)

Calculate the instability index according to Guruprasad et al., 1990.

Implementation of the method of Guruprasad et al. (1990) to test a protein for stability. Any value above 40 means the protein is unstable (has a short half-life). See: Guruprasad K., Reddy B.V.B., Pandit M.W. Protein Engineering 4:155-161 (1990).

Parameters:

residues (str) – The residues of the protein.

Returns:

The instability index of the protein.

Return type:

float
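
The index is built from dipeptide weights: the weights of all consecutive residue pairs are summed and scaled by 10 / L, where L is the sequence length. The sketch below shows only that arithmetic. The published weight table has 400 entries and is not reproduced here, so the caller supplies it; the TOY_WEIGHTS values are made-up placeholders for illustration, not the Guruprasad values, and the function names are hypothetical.

```python
def instability_index_sketch(residues: str, dipeptide_weights: dict) -> float:
    """(10 / L) * sum of dipeptide weights over consecutive residue pairs.

    dipeptide_weights maps two-letter strings such as "AW" to a weight.
    A KeyError is raised if a pair is missing from the table.
    """
    length = len(residues)
    if length == 0:
        return 0.0  # assumption: an empty sequence scores zero
    total = sum(dipeptide_weights[residues[i:i + 2]] for i in range(length - 1))
    return 10.0 / length * total


def is_predicted_unstable(index: float) -> bool:
    """Apply the threshold stated above: values above 40 indicate instability."""
    return index > 40.0


# Placeholder weights for illustration only; NOT the published table.
TOY_WEIGHTS = {"AW": 2.0, "WA": 3.0, "AA": 1.0}

score = instability_index_sketch("AWA", TOY_WEIGHTS)  # 10 / 3 * (2.0 + 3.0)
print(is_predicted_unstable(score))  # False
```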

OCDocker.Receptor.compute_isoelectric_point(residues)

Computes the protein’s isoelectric point.

Parameters:

residues (str) – The residues of the protein.

Returns:

The isoelectric point of the protein.

Return type:

float
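
The isoelectric point is the pH at which the net charge of the protein is zero. One common way to estimate it, sketched below, is to model each ionizable group with the Henderson-Hasselbalch equation and bisect on pH. The pKa values are approximate textbook figures chosen for illustration, the sequence is treated as a single chain with one N-terminus and one C-terminus, and the function names are hypothetical; none of this is confirmed to match OCDocker's implementation.

```python
# Approximate textbook pKa values (assumption; published pKa sets differ).
POSITIVE_PK = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PK = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PK = 9.0
C_TERM_PK = 2.0


def net_charge(residues: str, ph: float) -> float:
    """Net charge at a given pH from Henderson-Hasselbalch fractions."""
    # Basic groups carry +1 when protonated.
    charge = 1.0 / (1.0 + 10.0 ** (ph - N_TERM_PK))
    for aa, pk in POSITIVE_PK.items():
        charge += residues.count(aa) / (1.0 + 10.0 ** (ph - pk))
    # Acidic groups carry -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10.0 ** (C_TERM_PK - ph))
    for aa, pk in NEGATIVE_PK.items():
        charge -= residues.count(aa) / (1.0 + 10.0 ** (pk - ph))
    return charge


def isoelectric_point_sketch(residues: str, tol: float = 1e-4) -> float:
    """Bisect on pH in [0, 14]; the net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(residues, mid) > 0.0:
            low = mid   # still positive: the zero crossing is at higher pH
        else:
            high = mid
    return (low + high) / 2.0


basic = isoelectric_point_sketch("KKKK")   # lysine-rich: pI above 7
acidic = isoelectric_point_sketch("DDDD")  # aspartate-rich: pI below 7
```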

OCDocker.Receptor.compute_sasa(model, n_points=1000)

Computes the solvent accessible surface area (SASA) of the molecule. Note: the SASA value is stored on the structure and can be read as model.sasa.

Parameters:
  • model (Bio.PDB.Structure.Structure) – The model to be analysed.

  • n_points (int, optional) – The number of points to be used in the calculation, by default 1000.

Return type:

None
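
The n_points parameter suggests a point-sampling method in the style of Shrake and Rupley; that is an assumption, since the text above does not name the algorithm. The sketch below shows the general idea in pure Python: each atom's sphere, expanded by a probe radius, is covered with test points, and the exposed fraction of points scales the sphere's area. The names sphere_points and shrake_rupley_sketch, the atom representation, and the 1.4 probe radius are illustrative choices.

```python
import math


def sphere_points(n_points):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n_points):
        z = 1.0 - (2.0 * k + 1.0) / n_points
        rho = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return points


def shrake_rupley_sketch(atoms, n_points=1000, probe=1.4):
    """Total accessible area for atoms given as [((x, y, z), radius), ...]."""
    unit = sphere_points(n_points)
    total = 0.0
    for i, (center, radius) in enumerate(atoms):
        r = radius + probe
        exposed = 0
        for ux, uy, uz in unit:
            point = (center[0] + r * ux, center[1] + r * uy, center[2] + r * uz)
            # A test point is buried if it lies inside any other expanded sphere.
            buried = any(
                math.dist(point, other) < other_radius + probe
                for j, (other, other_radius) in enumerate(atoms)
                if j != i
            )
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * r * r * exposed / n_points
    return total


# An isolated atom exposes its whole expanded sphere: 4 * pi * (1.6 + 1.4) ** 2.
isolated = shrake_rupley_sketch([((0.0, 0.0, 0.0), 1.6)])
# Two overlapping atoms bury part of each other's surface.
pair = shrake_rupley_sketch([((0.0, 0.0, 0.0), 1.6), ((1.0, 0.0, 0.0), 1.6)])
```

Larger n_points values give a finer estimate at a cost that grows with the number of points and with the number of atom pairs checked.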

OCDocker.Receptor.count_AAs_and_chains(structure)

Counts the total length (the sum of all AAs), the average chain length (the total number of AAs divided by the number of chains), and the number of chains in the protein.

Parameters:

structure (Bio.PDB.Structure.Structure) – The structure to be analysed.

Returns:

The total length, the average length and the number of chains. If the structure is not valid, returns None.

Return type:

Tuple[int, float, int] | None
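
The arithmetic described above can be sketched without Biopython by representing each chain as a plain sequence string. The function name and the dictionary input are hypothetical; the real function takes a Bio.PDB structure, and here an empty input stands in for an invalid structure.

```python
from typing import Dict, Optional, Tuple


def count_residues_and_chains(chains: Dict[str, str]) -> Optional[Tuple[int, float, int]]:
    """Return (total length, average chain length, number of chains).

    chains maps a chain ID to its one-letter sequence. Returns None when
    there are no chains, mirroring the documented None for invalid input.
    """
    n_chains = len(chains)
    if n_chains == 0:
        return None
    total = sum(len(sequence) for sequence in chains.values())
    return total, total / n_chains, n_chains


print(count_residues_and_chains({"A": "ACDEFG", "B": "KLMN"}))  # (10, 5.0, 2)
```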

OCDocker.Receptor.count_surface_AA(structure, structurePath, cutoff=0.7)

Counts how many of each of the 20 standard AAs have a relative ASA value above a given cutoff.

Parameters:
  • structure (Bio.PDB.Structure.Structure) – The structure to be loaded.

  • structurePath (str) – The path of the structure.

  • cleanStructurePath (str) – The path of the clean structure. (This parameter does not appear in the signature shown above.)

  • cutoff (float, optional) – The cutoff to consider an AA as surface. Default is 0.7.

Returns:

A dictionary with the count of each AA.

Return type:

Dict[str, int]
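
The counting step can be illustrated independently of how the relative ASA values are obtained. In the sketch below the input is a list of (one-letter code, relative ASA) pairs, which is a hypothetical representation; the function name is also hypothetical. A residue is counted only when its value is strictly above the cutoff, following the wording above.

```python
STANDARD_AAS = "ARNDCQEGHILKMFPSTWYV"


def count_surface_sketch(residues, cutoff=0.7):
    """Count residues per amino acid type whose relative ASA exceeds the cutoff.

    residues is an iterable of (one_letter_code, relative_asa) pairs.
    Non-standard letters are ignored (an assumption of this sketch).
    """
    counts = {aa: 0 for aa in STANDARD_AAS}
    for aa, relative_asa in residues:
        if aa in counts and relative_asa > cutoff:
            counts[aa] += 1
    return counts


example = [("K", 0.92), ("K", 0.40), ("L", 0.05), ("E", 0.71), ("X", 0.99)]
surface = count_surface_sketch(example)
print(surface["K"], surface["E"], surface["L"])  # 1 1 0
```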

OCDocker.Receptor.get_res(model)

Get the one-letter amino acid sequence for the receptor (chains are ignored).

Parameters:

model (Bio.PDB.Structure.Structure) – The model to be analysed.

Returns:

The one-letter amino acid sequence for the receptor.

Return type:

str
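
Building a single one-letter sequence while ignoring chain boundaries amounts to concatenating the residues of every chain in order. The sketch below uses lists of three-letter residue names as a stand-in for a Bio.PDB model; the function name is hypothetical, and skipping non-standard residue names is an assumption.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def one_letter_sequence(chains):
    """Concatenate all chains into one sequence of one-letter codes.

    chains is a list of chains, each a list of three-letter residue names.
    Names outside the 20 standard amino acids (for example water, "HOH")
    are skipped.
    """
    return "".join(
        THREE_TO_ONE[name]
        for chain in chains
        for name in chain
        if name in THREE_TO_ONE
    )


print(one_letter_sequence([["ALA", "GLY", "HOH"], ["TRP"]]))  # AGW
```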

OCDocker.Receptor.load_mol(structure, name='', compute_sasa=True, mol2_path='', overwrite=False, clean=True, canonicalize_pdb='auto')

Loads a structure from a pdb/cif/mmcif file if a path is provided, or uses the given Bio.PDB.Structure.Structure object directly. Returns the path and the structure as a tuple (path, structure).

Parameters:
  • structure (str | os.PathLike | Bio.PDB.Structure.Structure) – Path to the structure file or a Bio.PDB.Structure.Structure object.

  • name (str, optional) – The name of the structure, by default “”.

  • compute_sasa (bool, optional) – Whether to compute the SASA or not, by default True.

  • mol2_path (str, optional) – The path to the mol2 file, by default “”.

  • overwrite (bool, optional) – Whether to overwrite the mol2 file or not, by default False.

  • clean (bool, optional) – Whether to clean the protein file or not, by default True.

  • canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”.

Returns:

The path to the structure and the structure object. Will return a tuple of (“”, None) if the structure is not valid.

Return type:

Tuple[str, Bio.PDB.Structure.Structure]

OCDocker.Receptor.read_descriptors_from_json(path, returnData=False)

Read the descriptors from a json file.

Parameters:
  • path (str) – The path to the json file.

  • returnData (bool, optional) – If True, returns a dictionary with the descriptors. By default False.

Returns:

The descriptors dictionary or None if any error occurs.

Return type:

Dict[str, str | float | int] | Tuple[float | str | int | dict[str, int], …] | None

Raises:

KeyError

OCDocker.Receptor.renumber_pdb_residues(structure, outputPdb='')

Renumber the PDB residues using Biopython.

Parameters:
  • structure (Bio.PDB.Structure.Structure) – The structure to be renumbered.

  • outputPdb (str, optional) – The output pdb file. If not provided, the structure will be renumbered in place, by default “”.

Returns:

The renumbered structure.

Return type:

Bio.PDB.Structure.Structure