phages2050.classifiers.proteins package
Submodules
phages2050.classifiers.proteins.structural_protein module
class phages2050.classifiers.proteins.structural_protein.BacteriophageStructuralProteinClassifier(model_path: str, label_encoder_path: str)
Bases: object
Classifier that loads and runs the pre-trained model and label encoder for phage structural protein prediction. The model supports 11 protein classes:
- HTJ
- basplate
- collar
- major_capsid
- major_tail
- minor_capsid
- minor_tail
- other
- portal
- tail_fiber
- tail_shaft
The model reaches 96.92% accuracy on the training set and 95.64% on the validation set with 10-fold cross-validation. It was trained on 11 000 samples.
FEATURE_SPACE = 1024
SUPPORTED_COLUMNS = ['predicted_index', 'predicted_class', 'accuracy']
predict(protein_vector: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame
Runs the classification model and returns the best prediction as a DataFrame with three columns:
- "predicted_index": index of the predicted protein class
- "predicted_class": name of the predicted protein class
- "accuracy": accuracy of the prediction (0-100%)
This method can be called repeatedly with different protein vectors.
protein_vector is a DataFrame holding the 1024 numeric values produced by a BERT embedding.
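As a rough illustration of the input that predict expects, the sketch below wraps a 1024-value embedding in a pandas DataFrame. The helper name to_protein_vector and the one-row, 1024-column layout are assumptions made for this example; the documentation only states that the DataFrame holds 1024 numeric values. The embedding itself is a placeholder, not real BERT output.

```python
import pandas as pd

FEATURE_SPACE = 1024  # same value as the documented FEATURE_SPACE constant


def to_protein_vector(embedding):
    """Hypothetical helper: wrap one embedding in a one-row DataFrame.

    The (1, 1024) layout is an assumption, not confirmed by the docs.
    """
    # Coerce every value to float; values that cannot be converted raise.
    values = [float(v) for v in embedding]
    if len(values) != FEATURE_SPACE:
        raise ValueError(f"expected {FEATURE_SPACE} values, got {len(values)}")
    return pd.DataFrame([values])


# Placeholder embedding; a real one would come from a BERT protein model.
protein_vector = to_protein_vector([0.0] * FEATURE_SPACE)
print(protein_vector.shape)

# With the package installed and the model files unpacked, the call would be:
# classifier = BacteriophageStructuralProteinClassifier(model_path, label_encoder_path)
# result = classifier.predict(protein_vector)
```

Checking the length before calling predict gives a clearer error than a shape mismatch raised from inside the model.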
class phages2050.classifiers.proteins.structural_protein.BacteriophageStructuralProteinManager(root_dir: str = 'bsp_model')
Bases: object
Manager class responsible for downloading and unzipping the pre-trained model and label encoder for Bacteriophage Structural Protein classification.
BSP_LABELS_URL = b'https://deeppetri.ai/static/phages2050/bsp_label_encoder_21.08.2020.zip'
BSP_MODEL_URL = b'https://deeppetri.ai/static/phages2050/bsp_model_21.08.2020.zip'
STATUS_CODE_200 = 200
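The manager's methods are not listed on this page, so the following is only a sketch of what a download-and-unzip step could look like using the Python standard library. The function names download_bytes, unzip_bytes and fetch_and_unpack are hypothetical and are not part of phages2050; only the status value 200, the bytes-typed URLs and the default directory name 'bsp_model' come from the documentation above.

```python
import io
import os
import urllib.request
import zipfile

STATUS_CODE_200 = 200  # same value as the documented constant


def download_bytes(url):
    """Fetch a URL and return its body, failing on any status other than 200."""
    with urllib.request.urlopen(url) as response:
        if response.status != STATUS_CODE_200:
            raise RuntimeError(f"unexpected HTTP status {response.status}")
        return response.read()


def unzip_bytes(archive, root_dir):
    """Extract an in-memory zip archive into root_dir and return member names."""
    os.makedirs(root_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        zf.extractall(root_dir)
        return zf.namelist()


def fetch_and_unpack(url, root_dir="bsp_model"):
    """Hypothetical helper: download one archive and unpack it under root_dir."""
    if isinstance(url, bytes):  # the documented URL constants are bytes
        url = url.decode("ascii")
    return unzip_bytes(download_bytes(url), root_dir)
```

Keeping the download and the extraction in separate functions lets the extraction step be exercised without network access.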