rnaglib.tasks.BindingSite

class rnaglib.tasks.BindingSite(cutoff=6.0, size_thresholds=(15, 500), **kwargs)

Predict which RNA residues are most likely to be part of a binding site for a small-molecule ligand.

Task type: binary classification
Task level: residue-level

Parameters:
  • cutoff (float) – distance (in Angstroms) between an RNA atom and any small-molecule atom below which the RNA residue is considered part of a binding site (default 6.0)

  • size_thresholds (tuple[int]) – range of RNA sizes to keep in the task dataset (default (15, 500))
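
The two parameters above can be illustrated with a minimal standalone sketch. It does not use rnaglib and is not the library's implementation: the helper names `label_binding_residues` and `within_size_thresholds`, the coordinate layout, the assumption that RNA size is counted in residues, and the inclusive bounds on the size range are all hypothetical choices made for illustration.

```python
import math

def label_binding_residues(residue_atoms, ligand_atoms, cutoff=6.0):
    """Hypothetical helper: label each residue 1 if any of its atoms lies
    closer than `cutoff` Angstroms to any ligand atom, else 0."""
    labels = {}
    for res_id, atoms in residue_atoms.items():
        near_ligand = any(
            math.dist(atom, lig) < cutoff
            for atom in atoms
            for lig in ligand_atoms
        )
        labels[res_id] = int(near_ligand)
    return labels

def within_size_thresholds(n_residues, size_thresholds=(15, 500)):
    """Hypothetical helper: keep an RNA if its residue count falls inside
    the range (bounds assumed inclusive here)."""
    lower, upper = size_thresholds
    return lower <= n_residues <= upper

# Toy coordinates (x, y, z) in Angstroms; residue IDs are made up.
residue_atoms = {
    "A.1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "A.2": [(10.0, 0.0, 0.0)],
    "A.3": [(20.0, 0.0, 0.0)],
}
ligand_atoms = [(5.0, 0.0, 0.0)]

labels = label_binding_residues(residue_atoms, ligand_atoms, cutoff=6.0)
print(labels)  # {'A.1': 1, 'A.2': 1, 'A.3': 0}
```

The labels are per-residue 0/1 values, which matches the binary, residue-level framing of the task.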

__init__(cutoff=6.0, size_thresholds=(15, 500), **kwargs)

Methods

__init__([cutoff, size_thresholds])

add_feature(feature[, feature_level, is_input])

Shortcut to RNADataset.add_feature

add_representation(representation)

add_rna_to_building_list(all_rnas, rna)

compute_distances()

compute_metrics(all_preds, all_probs, all_labels)

compute_one_metric(preds, probs, labels)

create_dataset_from_list(rnas)

Creates an RNADataset object from the lists populated by add_rna_to_building_list.

describe()

Get description of task dataset.

dummy_inference()

evaluate(model, loader)

from_scratch(size_thresholds)

from_zenodo()

Downloads the task dataset from Zenodo and loads it.

get_split_datasets([recompute])

get_split_loaders([recompute])

get_task_vars()

Specifies the task's FeaturesComputer object, which defines the features to be added to the RNAs (graphs) and nucleotides (graph nodes).

init_metadata([additional_metadata])

Initialize the self.metadata dictionary, which holds key/value pairs.

load()

Load dataset and splits from disk.

post_process()

The most common post_processing steps to remove redundancy.

process()

remove_redundancy()

remove_representation(representation_name)

set_datasets([recompute])

Sets the train, val and test datasets. Call this each time you modify self.dataset.

set_loaders([recompute])

Sets the dataloader properties.

split(dataset)

Calls the splitter and returns train, val, test splits.

to_csv(path)

Write a single CSV with all task data.

write()

Save task data and splits to root.

Attributes

default_metric

default_splitter

Returns the splitting strategy to be used for this specific task.

dummy_model

input_var

name

task_id

Task hash is a hash of all RNA IDs and node IDs in the dataset.

version
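
The task_id attribute is described as a hash over all RNA IDs and node IDs. One way such an identifier could be computed is sketched below. This is an assumption for illustration only: the function name `task_hash`, the choice of SHA-256, the sorting step, and the ID formats are hypothetical and may differ from what rnaglib actually does.

```python
import hashlib

def task_hash(node_ids_by_rna):
    """Hypothetical helper: hash every RNA ID and node ID so that the
    result does not depend on iteration order."""
    digest = hashlib.sha256()
    for rna_id in sorted(node_ids_by_rna):
        digest.update(rna_id.encode("utf-8"))
        digest.update(b"\x00")  # separator between IDs
        for node_id in sorted(node_ids_by_rna[rna_id]):
            digest.update(node_id.encode("utf-8"))
            digest.update(b"\x00")
        digest.update(b"\x01")  # separator between RNAs
    return digest.hexdigest()

# Made-up IDs: the same content in a different order gives the same hash.
hash_a = task_hash({"1abc": ["1abc.A.1", "1abc.A.2"], "2xyz": ["2xyz.B.7"]})
hash_b = task_hash({"2xyz": ["2xyz.B.7"], "1abc": ["1abc.A.2", "1abc.A.1"]})
# Dropping a node changes the hash.
hash_c = task_hash({"1abc": ["1abc.A.1"], "2xyz": ["2xyz.B.7"]})
print(hash_a == hash_b, hash_a == hash_c)  # True False
```

Under this scheme, two datasets share an identifier only when they contain the same RNAs and the same nodes, which is the property the attribute description suggests.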