OCDocker.Receptor module
- class OCDocker.Receptor.Receptor(structure, name, mol2_path='', c_model='gasteiger', gravy_scale='KyteDoolitle', relative_asa_cutoff=0.7, from_json_descriptors='', overwrite=False, clean=False, canonicalize_pdb='auto', allow_missing_surface=False)
Bases: object
Represents a receptor (protein) molecule with computed descriptors.
This class loads receptor structures from PDB or mmCIF files and computes various molecular descriptors including amino acid composition, surface accessibility (SASA), dipole moment, isoelectric point, GRAVY score, aromaticity, and instability index.
- Parameters:
structure (str | Bio.PDB.Structure.Structure) – Path to a PDB/mmCIF file or a BioPython Structure object.
name (str) – Name identifier for the receptor.
mol2_path (str, optional) – Path to an existing MOL2 file, by default “”.
c_model (str, optional) – Charge model for dipole moment calculation, by default “gasteiger”.
gravy_scale (str, optional) – GRAVY scale to use, by default “KyteDoolitle”.
relative_asa_cutoff (float, optional) – Relative accessible surface area cutoff for surface amino acids, by default 0.7.
from_json_descriptors (str, optional) – Path to JSON file containing pre-computed descriptors, by default “”.
overwrite (bool, optional) – Whether to overwrite existing files, by default False.
clean (bool, optional) – Whether to clean/renumber the PDB structure, by default False.
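canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”, by default “auto”.
allow_missing_surface (bool, optional) – If True, initialization continues when DSSP/surface amino acid counts are unavailable, using zeroed surface counts, by default False.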
- name
Name of the receptor.
- Type:
str
- structure
BioPython structure object.
- Type:
Bio.PDB.Structure.Structure
- path
Path to the structure file.
- Type:
str
- original_path
Original input path (e.g., .cif/.mmcif) when conversion occurs.
- Type:
str
- clean_source_path
Previous path before cleaning, when a cleaned file is generated.
- Type:
str
- SASA
Solvent accessible surface area.
- Type:
float
- DipoleMoment
Dipole moment of the receptor.
- Type:
float
- IsoelectricPoint
Isoelectric point (pI) of the receptor.
- Type:
float
- GRAVY
Grand average of hydropathy.
- Type:
float
- Aromaticity
Aromaticity index.
- Type:
float
- InstabilityIndex
Instability index.
- Type:
float
- countA, countR, countN, ..., countV
Count of each amino acid type.
- Type:
int
- TotalAALength
Total number of amino acids.
- Type:
int
- AvgAALength
Average chain length.
- Type:
float
- countChain
Number of chains.
- Type:
int
- descriptors_names = {'count': ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']}
- single_descriptors = ['TotalAALength', 'AvgAALength', 'countChain', 'SASA', 'DipoleMoment', 'IsoelectricPoint', 'GRAVY', 'Aromaticity', 'InstabilityIndex']
- allDescriptors = ['countA', 'countR', 'countN', 'countD', 'countC', 'countQ', 'countE', 'countG', 'countH', 'countI', 'countL', 'countK', 'countM', 'countF', 'countP', 'countS', 'countT', 'countW', 'countY', 'countV', 'TotalAALength', 'AvgAALength', 'countChain', 'SASA', 'DipoleMoment', 'IsoelectricPoint', 'GRAVY', 'Aromaticity', 'InstabilityIndex']
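The three class attributes above appear to be related: allDescriptors reads like every descriptors_names entry expanded with its prefix, followed by single_descriptors. A minimal sketch of that relationship, assuming this is how the list is assembled (the helper name build_all_descriptors is hypothetical and not part of OCDocker):

```python
from typing import Dict, List

# Values copied from the class attributes documented above.
descriptors_names: Dict[str, List[str]] = {
    "count": ["A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
              "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V"],
}
single_descriptors: List[str] = [
    "TotalAALength", "AvgAALength", "countChain", "SASA", "DipoleMoment",
    "IsoelectricPoint", "GRAVY", "Aromaticity", "InstabilityIndex",
]


def build_all_descriptors(prefixed: Dict[str, List[str]],
                          singles: List[str]) -> List[str]:
    """Hypothetical helper: expand prefix + suffix pairs, then append singles."""
    expanded = [prefix + suffix
                for prefix, suffixes in prefixed.items()
                for suffix in suffixes]
    return expanded + list(singles)


all_descriptors = build_all_descriptors(descriptors_names, single_descriptors)
print(len(all_descriptors))  # 20 per-residue counts + 9 single descriptors = 29
```

The resulting list matches the documented allDescriptors value in both content and order.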
- __init__(structure, name, mol2_path='', c_model='gasteiger', gravy_scale='KyteDoolitle', relative_asa_cutoff=0.7, from_json_descriptors='', overwrite=False, clean=False, canonicalize_pdb='auto', allow_missing_surface=False)
Constructor of the class Receptor.
- Parameters:
structure (str | Bio.PDB.Structure.Structure) – Path to the structure file OR Bio.PDB.Structure.Structure object.
name (str) – Name of the receptor.
mol2_path (str, optional) – Path to the mol2 file, by default “”.
c_model (str, optional) – Charge model to be used, by default “gasteiger”.
gravy_scale (str, optional) – Scale to be used to compute the GRAVY descriptor, by default “KyteDoolitle”.
relative_asa_cutoff (float, optional) – Relative cutoff to be used to compute the SASA descriptor, by default 0.7.
from_json_descriptors (str, optional) – Path to the json file containing the descriptors, by default “”.
overwrite (bool, optional) – Flag to denote if files will be overwritten, by default False.
clean (bool, optional) – Flag to denote if the pdb file will be cleaned, by default False.
canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”.
allow_missing_surface (bool, optional) – If True, allows initialization to continue when DSSP/surface AA counts are unavailable, using zeroed surface counts. Default is False.
- Return type:
None
- get_descriptors()
Return the descriptors for the Receptor object.
- Returns:
The descriptors for the Receptor object.
- Return type:
Dict[str, float | int]
- is_valid()
Check if a Receptor object is valid.
- Returns:
True if the Receptor object is valid, False otherwise.
- Return type:
bool
- print_attributes()
Print all attributes of the receptor to stdout.
Displays the receptor’s name, structure path, and all computed descriptors (SASA, dipole moment, isoelectric point, GRAVY, aromaticity, instability index, amino acid counts, etc.) in a formatted, aligned table.
- Return type:
None
- to_dict()
Return all the properties for the Receptor object.
- Returns:
The properties for the Receptor object.
- Return type:
Dict[str, float | int]
- to_json(overwrite=False)
Stores the descriptors as JSON so that they do not have to be recomputed on later runs.
- Parameters:
overwrite (bool, optional) – If True, the json file will be overwritten if it already exists. Default is False.
- Returns:
The exit code of the command (based on the Error.py code table).
- Return type:
int
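A minimal sketch of this caching pattern, using only the standard library. The function names save_descriptors and load_descriptors and the 0/1 return codes are hypothetical; OCDocker's real codes come from its Error.py table, which is not reproduced here.

```python
import json
import os
import tempfile
from typing import Dict, Union

Descriptor = Union[int, float]


def save_descriptors(path: str, descriptors: Dict[str, Descriptor],
                     overwrite: bool = False) -> int:
    """Write descriptors to JSON; return 0 if written, 1 if skipped.

    The 0/1 codes are placeholders, not the values from OCDocker's Error.py.
    """
    if os.path.exists(path) and not overwrite:
        return 1  # an existing cache is left untouched
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(descriptors, handle, indent=2)
    return 0


def load_descriptors(path: str) -> Dict[str, Descriptor]:
    """Read descriptors back from the JSON cache."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "receptor_descriptors.json")
    first = save_descriptors(cache, {"countA": 12, "GRAVY": -0.35})
    second = save_descriptors(cache, {"countA": 99, "GRAVY": 1.0})
    restored = load_descriptors(cache)
```

With overwrite left at False, the second call does not replace the file, so the values read back are those from the first call.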
- OCDocker.Receptor.compute_aromaticity(residues)
Compute the aromaticity according to Lobry, 1994.
- Parameters:
residues (str) – The residues of the protein.
- Returns:
The aromaticity of the protein.
- Return type:
float
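Lobry's aromaticity is the relative frequency of the aromatic residues Phe, Trp and Tyr in the sequence. A minimal sketch of that definition follows; the function name aromaticity and the handling of an empty sequence are assumptions, not necessarily what OCDocker does.

```python
AROMATIC_RESIDUES = frozenset("FWY")


def aromaticity(residues: str) -> float:
    """Fraction of residues that are Phe, Trp or Tyr (one-letter codes)."""
    sequence = residues.upper()
    if not sequence:
        return 0.0  # assumption: an empty sequence yields 0.0 rather than an error
    aromatic = sum(1 for aa in sequence if aa in AROMATIC_RESIDUES)
    return aromatic / len(sequence)


print(aromaticity("FWYA"))  # 3 aromatic residues out of 4 -> 0.75
```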
- OCDocker.Receptor.compute_dipole_moment(structure, c_model='gasteiger')
Computes the receptor’s dipole moment.
- Parameters:
structure (Bio.PDB.Structure.Structure | str) – The structure to be analysed, or the path to the structure file.
c_model (str, optional) – The charge model to be used, by default “gasteiger”.
- Returns:
The dipole moment of the receptor.
- Return type:
float
- OCDocker.Receptor.compute_gravy(residues, scale='KyteDoolitle')
Computes the GRAVY (Grand Average of Hydropathy) according to Kyte and Doolittle, 1982.
Uses the given hydrophobicity scale; the default is the original scale proposed by Kyte and Doolittle (identifier: KyteDoolitle). Other options are: Aboderin, AbrahamLeo, Argos, BlackMould, BullBreese, Casari, Cid, Cowan3.4, Cowan7.5, Eisenberg, Engelman, Fasman, Fauchere, GoldSack, Guy, Jones, Juretic, Kidera, Miyazawa, Parker, Ponnuswamy, Rose, Roseman, Sweet, Tanford, Wilson and Zimmerman.
- Parameters:
residues (str) – The residues of the protein.
scale (str, optional) – The hydrophobicity scale to be used, by default “KyteDoolitle”.
- Returns:
The GRAVY of the protein.
- Return type:
float
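GRAVY is the sum of the per-residue hydropathy values divided by the sequence length. The sketch below illustrates this with the published Kyte-Doolittle scale; the function name gravy, the empty-sequence behaviour, and the choice to raise KeyError on non-standard residues are assumptions for illustration only.

```python
from typing import Mapping

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(residues: str, scale: Mapping[str, float] = KYTE_DOOLITTLE) -> float:
    """Mean hydropathy of a one-letter amino acid sequence."""
    sequence = residues.upper()
    if not sequence:
        return 0.0  # assumption: an empty sequence yields 0.0
    # Residues missing from the scale raise KeyError in this sketch.
    total = sum(scale[aa] for aa in sequence)
    return total / len(sequence)


hydrophobic = gravy("AILV")  # positive: mostly hydrophobic residues
charged = gravy("RKDE")      # negative: mostly charged residues
```

Passing a different mapping as scale swaps in another hydrophobicity scale without changing the averaging step.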
- OCDocker.Receptor.compute_instability_index(residues)
Calculates the instability index according to Guruprasad et al., 1990.
Implements the method of Guruprasad et al. (1990) for testing protein stability. A value above 40 indicates that the protein is predicted to be unstable (short half-life). See: Guruprasad K., Reddy B.V.B., Pandit M.W. Protein Engineering 4:155-161 (1990).
- Parameters:
residues (str) – The residues of the protein.
- Returns:
The instability index of the protein.
- Return type:
float
- OCDocker.Receptor.compute_isoelectric_point(residues)
Computes the protein’s isoelectric point.
- Parameters:
residues (str) – The residues of the protein.
- Returns:
The isoelectric point of the protein.
- Return type:
float
- OCDocker.Receptor.compute_sasa(model, n_points=1000)
Computes the solvent accessible surface area (SASA) of the molecule. Note: the SASA value is stored on the structure and can be read as model.sasa.
- Parameters:
model (Bio.PDB.Structure.Structure) – The model to be analysed.
n_points (int, optional) – The number of points to be used in the calculation, by default 1000.
- Return type:
None
- OCDocker.Receptor.count_AAs_and_chains(structure)
Counts the total length (the sum of all amino acids), the average length (the total number of amino acids divided by the number of chains), and the number of chains in the protein.
- Parameters:
structure (Bio.PDB.Structure.Structure) – The structure to be analysed.
- Returns:
The total length, the average length and the number of chains. If the structure is not valid, returns None.
- Return type:
Tuple[int, float, int] | None
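The three values are simple to derive once the per-chain sequences are known. A minimal sketch, assuming each chain is already available as a one-letter sequence string (the real function takes a Biopython structure; the name count_aas_and_chains and the plain-string input are hypothetical):

```python
from typing import Optional, Sequence, Tuple


def count_aas_and_chains(
    chains: Sequence[str],
) -> Optional[Tuple[int, float, int]]:
    """Return (total length, average chain length, chain count) or None."""
    if not chains:
        return None  # assumption: no chains is treated as an invalid structure
    n_chains = len(chains)
    total = sum(len(chain) for chain in chains)
    return total, total / n_chains, n_chains


result = count_aas_and_chains(["ACDEFG", "ACDE"])
print(result)  # (10, 5.0, 2)
```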
- OCDocker.Receptor.count_surface_AA(structure, structurePath, cutoff=0.7)
Counts, for each of the 20 standard amino acids, how many residues have a relative ASA value above the given cutoff.
- Parameters:
structure (Bio.PDB.Structure.Structure) – The structure to be loaded.
structurePath (str) – The path of the structure.
cleanStructurePath (str) – The path of the clean structure. Note that this parameter does not appear in the signature shown above.
cutoff (float, optional) – The cutoff to consider an AA as surface. Default is 0.7.
- Returns:
A dictionary with the count of each AA.
- Return type:
Dict[str, int]
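The counting step can be sketched independently of how the relative ASA values are obtained. The example below assumes the per-residue values are already available as (one-letter code, relative ASA) pairs; the description of allow_missing_surface suggests they come from DSSP. The function name count_surface and the strict "greater than" comparison are assumptions.

```python
from typing import Dict, Iterable, Tuple

STANDARD_AAS = "ARNDCQEGHILKMFPSTWYV"


def count_surface(residues: Iterable[Tuple[str, float]],
                  cutoff: float = 0.7) -> Dict[str, int]:
    """Count standard residues whose relative ASA is above the cutoff."""
    counts = {aa: 0 for aa in STANDARD_AAS}
    for aa, relative_asa in residues:
        # Non-standard residue codes are ignored in this sketch.
        if aa in counts and relative_asa > cutoff:
            counts[aa] += 1
    return counts


surface = count_surface(
    [("A", 0.9), ("A", 0.2), ("K", 0.71), ("W", 0.7), ("X", 1.0)]
)
print(surface["A"], surface["K"], surface["W"])  # 1 1 0
```

The returned dictionary always has all 20 keys, so residues that never reach the surface are reported with a count of zero.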
- OCDocker.Receptor.get_res(model)
Gets the one-letter amino acid sequence of the receptor, ignoring chain boundaries.
- Parameters:
model (Bio.PDB.Structure.Structure) – The model to be analysed.
- Returns:
The amino acid one letter sequence for the receptor.
- Return type:
str
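A minimal sketch of this conversion, assuming each chain is given as a list of three-letter residue names and that names outside the 20 standard amino acids (waters, ligands) are skipped. Both the input shape and the function name one_letter_sequence are hypothetical; the real function takes a Biopython object.

```python
from typing import Iterable, List

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def one_letter_sequence(chains: Iterable[List[str]]) -> str:
    """Concatenate all chains into a single one-letter sequence."""
    return "".join(
        THREE_TO_ONE[name]
        for chain in chains
        for name in chain
        if name in THREE_TO_ONE  # assumption: skip non-standard residues
    )


sequence = one_letter_sequence([["ALA", "GLY"], ["HOH", "TRP"]])
print(sequence)  # AGW
```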
- OCDocker.Receptor.load_mol(structure, name='', compute_sasa=True, mol2_path='', overwrite=False, clean=True, canonicalize_pdb='auto')
Loads a PDB/CIF/mmCIF structure when a path is provided, or uses the given Bio.PDB.Structure.Structure object directly. Returns the path and the structure as a tuple (path, structure).
- Parameters:
structure (str | os.PathLike | Bio.PDB.Structure.Structure) – Path to the structure file or a Bio.PDB.Structure.Structure object.
name (str, optional) – The name of the structure, by default “”.
compute_sasa (bool, optional) – Whether to compute the SASA or not, by default True.
mol2_path (str, optional) – The path to the mol2 file, by default “”.
overwrite (bool, optional) – Whether to overwrite the mol2 file or not, by default False.
clean (bool, optional) – Whether to clean the protein file or not, by default True.
canonicalize_pdb (bool | str, optional) – Whether to canonicalize CHARMM-style PDB names. Use True, False, or “auto”.
- Returns:
The path to the structure and the structure object. Will return a tuple of (“”, None) if the structure is not valid.
- Return type:
Tuple[str, Bio.PDB.Structure.Structure]
- OCDocker.Receptor.read_descriptors_from_json(path, returnData=False)
Read the descriptors from a json file.
- Parameters:
path (str) – The path to the json file.
returnData (bool, optional) – If True, returns a dictionary with the descriptors. By default False.
- Returns:
The descriptors dictionary or None if any error occurs.
- Return type:
Dict[str, str | float | int] | Tuple[float | str | int | dict[str, int], …] | None
- Raises:
KeyError –
- OCDocker.Receptor.renumber_pdb_residues(structure, outputPdb='')
Renumbers the PDB residues using Biopython.
- Parameters:
structure (Bio.PDB.Structure.Structure) – The structure to be renumbered.
outputPdb (str, optional) – The output PDB file, by default “”. If not provided, the structure is renumbered in place.
- Returns:
The renumbered structure.
- Return type:
Bio.PDB.Structure.Structure