rnaglib.tasks.BenchmarkBindingSite
- class rnaglib.tasks.BenchmarkBindingSite(cutoff=6.0, **kwargs)
Version of RNA-Site implemented using the data and splits from the experiment by Su et al. (2021).
Hong Su, Zhenling Peng, and Jianyi Yang. Recognition of small molecule–RNA binding sites using RNA sequence and structure. Bioinformatics, 37(1):36–42, 2021. <https://doi.org/10.1093/bioinformatics/btaa1092>
Task type: binary classification
Task level: residue-level
- Parameters:
cutoff (float) – distance threshold in Angstroms: an RNA residue is considered part of a binding site if any of its atoms lies closer than this to any small molecule atom (default 6.0)
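The cutoff rule described above can be sketched in plain Python. This is only an illustration of the distance criterion, not the library's implementation: the function name `label_binding_residues`, the residue identifiers, and the toy coordinates are all hypothetical.

```python
import math

def label_binding_residues(residue_atoms, ligand_atoms, cutoff=6.0):
    """Label each residue 1 if any of its atoms lies closer than `cutoff`
    Angstroms to any ligand atom, else 0. (Hypothetical helper.)"""
    labels = {}
    for res_id, atoms in residue_atoms.items():
        # Smallest distance between any atom of this residue and any ligand atom.
        min_dist = min(
            (math.dist(a, b) for a in atoms for b in ligand_atoms),
            default=math.inf,
        )
        labels[res_id] = int(min_dist < cutoff)
    return labels

# Toy coordinates in Angstroms, invented for illustration.
residue_atoms = {
    "A.1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A.2": [(10.0, 0.0, 0.0)],
    "A.3": [(4.0, 3.0, 0.0)],
}
ligand_atoms = [(0.0, 5.0, 0.0), (0.0, 7.0, 0.0)]

labels = label_binding_residues(residue_atoms, ligand_atoms, cutoff=6.0)
strict_labels = label_binding_residues(residue_atoms, ligand_atoms, cutoff=4.0)
```

With these toy coordinates, lowering the cutoff from 6.0 to 4.0 removes every residue from the binding site, which shows how strongly the label distribution can depend on this parameter.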
Methods
- __init__([cutoff])
- add_feature(feature[, feature_level, is_input]) – Shortcut to RNADataset.add_feature.
- add_representation(representation)
- add_rna_to_building_list(all_rnas, rna)
- compute_distances()
- compute_metrics(all_preds, all_probs, all_labels)
- compute_one_metric(preds, probs, labels)
- create_dataset_from_list(rnas) – Builds an RNADataset object from the lists populated by add_rna_to_building_list.
- describe() – Get description of task dataset.
- dummy_inference()
- evaluate(model, loader)
- from_scratch(size_thresholds)
- from_zenodo() – Downloads the task dataset from Zenodo and loads it.
- get_split_datasets([recompute])
- get_split_loaders([recompute])
- get_task_vars() – Specifies the FeaturesComputer object of the task, which defines the features to be added to the RNAs (graphs) and nucleotides (graph nodes).
- init_metadata([additional_metadata]) – Initialize the self.metadata dictionary of key/value pairs.
- load() – Load dataset and splits from disk.
- post_process() – The most common post-processing steps to remove redundancy.
- process() – Creates the task-specific dataset.
- remove_redundancy()
- remove_representation(representation_name)
- set_datasets([recompute]) – Sets the train, val and test datasets. Call this each time you modify self.dataset.
- set_loaders([recompute]) – Sets the dataloader properties.
- split(dataset) – Calls the splitter and returns train, val, test splits.
- to_csv(path) – Write a single CSV with all task data.
- write() – Save task data and splits to root.
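The argument names of compute_one_metric (preds, probs, labels) suggest that hard predictions, probabilities, and ground-truth labels are scored together. As a rough sketch of what scoring a residue-level binary classification task can look like, the following computes a few standard scores from hard predictions and labels only. The function `binary_scores` and the choice of metrics are assumptions for illustration; the metrics this class actually reports are not stated here.

```python
def binary_scores(preds, labels):
    """Accuracy, precision, recall and F1 for 0/1 predictions.
    (Hypothetical helper, not part of rnaglib.)"""
    pairs = list(zip(preds, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# One entry per residue: 1 = predicted/true binding site, 0 = not.
preds = [1, 0, 0, 1, 0, 1]
labels = [1, 0, 1, 1, 0, 0]
scores = binary_scores(preds, labels)
```

Binding-site residues are typically a minority class, so precision, recall, and F1 tend to be more informative here than accuracy alone.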
Attributes
- default_metric
- default_splitter – Returns the splitting strategy to be used for this specific task.
- dummy_model
- input_var
- name
- target_var
- task_id – Hash of all RNA IDs and node IDs in the dataset.
- version
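The idea behind task_id, a hash over all RNA IDs and node IDs, can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the library's implementation: the function `task_hash`, the use of SHA-256, the sorting of IDs, and the example identifiers are all hypothetical.

```python
import hashlib

def task_hash(node_ids_by_rna):
    """Hash every RNA ID and its node IDs into one hex digest.
    (Hypothetical helper; hash function and ordering are assumptions.)"""
    digest = hashlib.sha256()
    # Sort so the result does not depend on insertion order.
    for rna_id in sorted(node_ids_by_rna):
        digest.update(rna_id.encode("utf-8"))
        digest.update(b"\x00")  # separator avoids ambiguous concatenations
        for node_id in sorted(node_ids_by_rna[rna_id]):
            digest.update(node_id.encode("utf-8"))
            digest.update(b"\x00")
    return digest.hexdigest()

# Invented identifiers, for illustration only.
dataset_a = {"rna1": ["rna1.A.1", "rna1.A.2"], "rna2": ["rna2.B.7"]}
dataset_b = {"rna2": ["rna2.B.7"], "rna1": ["rna1.A.2", "rna1.A.1"]}
dataset_c = {"rna1": ["rna1.A.1"], "rna2": ["rna2.B.7"]}

hash_a = task_hash(dataset_a)
hash_b = task_hash(dataset_b)
hash_c = task_hash(dataset_c)
```

A hash of this kind gives two copies of a dataset the same identifier exactly when they contain the same RNAs and nodes, which makes it a convenient check that two experiments ran on identical data.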