Evaluation
Evaluation module for compound profiling.
Evaluator
- evaluation.evaluator.evaluate_model(model_name: Literal['base_resnet', 'simclr', 'wsdino'], distance_measure: Literal['l1', 'l2', 'cosine'] = 'cosine', nsc_eval=True, tvn: bool = False) → Dict[str, float]
Evaluate MoA prediction with a 1-nearest-neighbor classifier, using the specified distance measure on pre-extracted features.
- Parameters:
model_name – Name of the model to use for loading pre-computed features.
distance_measure – Distance measure to use for 1NN (“l1”, “l2”, or “cosine”).
nsc_eval – If True, use not-same-compound (NSC) evaluation: all concentrations of the query compound are excluded from the nearest-neighbor candidates.
tvn – If True, apply Typical Variation Normalization to features.
- Returns:
Dictionary with per-compound accuracies and the total accuracy.
- Return type:
Dict[str, float]
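The evaluation described above can be illustrated with a minimal, standard-library sketch of 1-nearest-neighbor MoA prediction with not-same-compound exclusion. This is not the project's implementation: the helper names `distance` and `nsc_1nn_accuracy`, the tuple layout of the profiles, and the `"total"` key in the result are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def distance(a, b, measure="cosine"):
    """Distance between two feature vectors (hypothetical helper)."""
    if measure == "l1":
        return sum(abs(x - y) for x, y in zip(a, b))
    if measure == "l2":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if measure == "cosine":
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / (norm_a * norm_b)
    raise ValueError(f"unknown distance measure: {measure}")


def nsc_1nn_accuracy(profiles, measure="cosine"):
    """profiles: list of (compound, concentration, moa, features) tuples.

    Each profile is classified by its nearest neighbor among profiles of
    OTHER compounds (all concentrations of the same compound are excluded).
    """
    hits = defaultdict(list)
    for compound, _, moa, feat in profiles:
        candidates = [p for p in profiles if p[0] != compound]
        nearest = min(candidates, key=lambda p: distance(feat, p[3], measure))
        hits[compound].append(nearest[2] == moa)

    # Per-compound accuracy, plus an overall accuracy under an assumed key.
    result = {c: sum(h) / len(h) for c, h in hits.items()}
    all_hits = [h for hs in hits.values() for h in hs]
    result["total"] = sum(all_hits) / len(all_hits)
    return result


# Toy profiles with made-up 2-D features.
profiles = [
    ("taxol", 0.1, "Microtubule stabilizers", [1.0, 0.0]),
    ("docetaxel", 0.1, "Microtubule stabilizers", [0.9, 0.1]),
    ("nocodazole", 1.0, "Microtubule destabilizers", [0.0, 1.0]),
    ("colchicine", 1.0, "Microtubule destabilizers", [0.1, 0.9]),
]
accuracies = nsc_1nn_accuracy(profiles, measure="cosine")
```

Excluding the query compound prevents a profile from being matched to another concentration of itself, which would otherwise inflate the accuracy.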
Feature Extractor
- evaluation.extractor.extract_moa_features(model_name: Literal['base_resnet', 'simclr', 'wsdino'], device, batch_size=16, data_root: str = '/scratch/cv-course2025/group8', compounds: list[str] | None = None, tvn: bool = False) → None
Extract features for the BBBC021 dataset using a pretrained ResNet50 model.
- Parameters:
model_name – Name of the model to use; a ModelName value ('base_resnet', 'simclr', or 'wsdino').
device – Device to run the model on.
batch_size – Batch size for data loading.
data_root – Root directory where the BBBC021 dataset is stored.
compounds – List of compounds to process. If None, all compounds will be processed.
tvn – If True, apply Typical Variation Normalization to features before averaging and saving.
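As a rough illustration of what the tvn option refers to, the sketch below implements the whitening step commonly associated with Typical Variation Normalization: a PCA whitening transform is fitted on negative-control features and then applied to all features. It assumes NumPy is available; the function names `fit_tvn` and `apply_tvn` are hypothetical, the choice of controls is an assumption, and any further batch-alignment step that a full TVN implementation may include is omitted. The project's actual code may differ.

```python
import numpy as np


def fit_tvn(control_features, eps=1e-6):
    """Fit a whitening transform on control features of shape (n, d)."""
    mean = control_features.mean(axis=0)
    centered = control_features - mean
    cov = centered.T @ centered / (len(control_features) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals, 0.0, None)
    # Scale each principal axis (column) to unit variance on the controls.
    whitening = eigvecs / np.sqrt(eigvals + eps)
    return mean, whitening


def apply_tvn(features, mean, whitening):
    """Center with the control mean and project onto whitened axes."""
    return (features - mean) @ whitening


# Synthetic stand-in for control features (correlated, shifted).
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
controls = rng.normal(size=(500, 3)) @ mixing + 5.0

mean, whitening = fit_tvn(controls)
whitened_controls = apply_tvn(controls, mean, whitening)
```

After the transform, the control features have approximately zero mean and identity covariance, so treated profiles are expressed relative to the typical variation seen in controls.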
Visualization
Visualize BBBC021 compound embeddings using t-SNE and UMAP.
This script loads pre-extracted feature embeddings from .pkl files, applies t-SNE and UMAP dimensionality reduction, and visualizes the resulting 2D projections colored by Mechanism of Action (MoA).
- evaluation.visualization.visualize_embeddings.load_model_features(model_name, data_root='/scratch/cv-course2025/group8')
Load features for a single model.
- evaluation.visualization.visualize_embeddings.plot_tsne_comparison(model_names, data_root='/scratch/cv-course2025/group8', output_dir='/scratch/cv-course2025/group8/plots', figsize=None)
Create side-by-side t-SNE plots for multiple models.
- evaluation.visualization.visualize_embeddings.plot_umap_comparison(model_names, data_root='/scratch/cv-course2025/group8', output_dir='/scratch/cv-course2025/group8/plots', figsize=None)
Create side-by-side UMAP plots for multiple models.
- evaluation.visualization.visualize_embeddings.plot_accuracy_vs_image_count(model_names, data_root='/scratch/cv-course2025/group8', output_dir='/scratch/cv-course2025/group8/plots', distance_measure='cosine', sort_by='accuracy', max_image_count=None)
Create a dual-axis plot showing per-compound accuracies, with image counts shown for context.
- Parameters:
model_names (list) – List of model names to compare
data_root (str) – Root directory containing bbbc021_features and dataset
output_dir (str) – Directory to save plots
distance_measure (str) – Distance measure for evaluation (“cosine”, “l1”, “l2”)
sort_by (str) – Sort compounds by “accuracy”, “image_count”, or “compound_name”
max_image_count (int, optional) – Maximum value for the image count y-axis (right side). If None, uses automatic scaling.
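To make the sort_by options concrete, here is a small sketch of how compounds might be ordered before plotting. The helper `sort_compounds`, the shape of the `stats` mapping, the ascending order, and the numbers are all assumptions for illustration; the actual function may order compounds differently.

```python
def sort_compounds(stats, sort_by="accuracy"):
    """Return compound names ordered by the chosen key (ascending).

    stats: dict mapping compound name -> (accuracy, image_count).
    """
    key_funcs = {
        "accuracy": lambda item: item[1][0],
        "image_count": lambda item: item[1][1],
        "compound_name": lambda item: item[0],
    }
    if sort_by not in key_funcs:
        raise ValueError(f"unknown sort_by value: {sort_by}")
    return [name for name, _ in sorted(stats.items(), key=key_funcs[sort_by])]


# Made-up values, only to show the ordering.
stats = {
    "taxol": (0.75, 120),
    "nocodazole": (1.0, 48),
    "colchicine": (0.5, 96),
}
by_accuracy = sort_compounds(stats, sort_by="accuracy")
by_count = sort_compounds(stats, sort_by="image_count")
by_name = sort_compounds(stats, sort_by="compound_name")
```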