rnaglib.tasks.RNAGo

class rnaglib.tasks.RNAGo(size_thresholds=(15, 500), **kwargs)

Predict the GO terms associated with the Rfam family of a given RNA chain. This task is solved by construction, since Rfam families are themselves defined using covariance models. It can nevertheless test whether a model captures characteristic structural features from 3D data.

Task type: multi-class classification
Task level: RNA-level

Parameters: size_thresholds (tuple[int]) – range of RNA sizes to keep in the task dataset (default: (15, 500))

__init__(size_thresholds=(15, 500), **kwargs)
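
To illustrate what a size range such as the default (15, 500) means in practice, here is a minimal sketch of length-based filtering. It is not rnaglib's implementation: the helper name `filter_by_size`, the chain names, and the choice to treat both bounds as inclusive are assumptions made for the example.

```python
from typing import Dict, Tuple


def filter_by_size(
    rna_lengths: Dict[str, int],
    size_thresholds: Tuple[int, int] = (15, 500),
) -> Dict[str, int]:
    """Keep only the RNAs whose length falls within size_thresholds.

    Assumption: both bounds are treated as inclusive in this sketch.
    """
    min_size, max_size = size_thresholds
    return {
        name: length
        for name, length in rna_lengths.items()
        if min_size <= length <= max_size
    }


# Hypothetical chain names and lengths, for illustration only.
example_lengths = {"chain_a": 12, "chain_b": 76, "chain_c": 500, "chain_d": 2900}
kept = filter_by_size(example_lengths)
print(sorted(kept))  # ['chain_b', 'chain_c']
```

Very short fragments and very large chains are dropped, while everything inside the range is kept unchanged.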

Methods

`__init__`([size_thresholds])
`add_feature`(feature[, feature_level, is_input])	Add a feature to the dataset.
`add_representation`(representation)	Add a representation transform to the dataset.
`add_rna_to_building_list`(all_rnas, rna)	Add an RNA to the building list.
`compute_distances`()	Compute similarity distances between RNAs in the dataset.
`compute_metrics`(all_preds, all_probs, all_labels)	Compute classification metrics aggregated across all predictions.
`compute_one_metric`(preds, probs, labels)	Compute classification metrics for a single set of predictions.
`create_dataset_from_list`(rnas)	Build an RNADataset object from the lists populated by add_rna_to_building_list.
`describe`()	Get description of task dataset.
`dummy_inference`()	Run dummy inference on the test dataset.
`evaluate`(model, loader)	Evaluate model performance on a dataset.
`from_scratch`(size_thresholds)	Create task dataset from scratch.
`from_zenodo`()	Download the task dataset from Zenodo and load it.
`get_split_datasets`([recompute])	Get train, validation, and test datasets.
`get_split_loaders`([recompute])	Get train, validation, and test dataloaders.
`get_task_vars`()	Specify the task's FeaturesComputer object, which defines the features to add to the RNAs (graphs) and nucleotides (graph nodes).
`init_metadata`([additional_metadata])	Initialize the self.metadata dictionary of key/value pairs.
`load`()	Load dataset and splits from disk.
`post_process`()	Compute sequence similarity between all pairs of RNAs using CD-HIT.
`process`()	Creates the task-specific dataset.
`remove_redundancy`()	Remove redundant RNAs from the dataset based on similarity.
`remove_representation`(representation_name)	Remove a representation transform from the dataset.
`set_datasets`([recompute])	Set the train, val and test datasets.
`set_loaders`([recompute])	Set the dataloader properties.
`split`(dataset)	Calls the splitter and returns train, val, test splits.
`to_csv`(path)	Write a single CSV with all task data.
`write`()	Save task data and splits to root.
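
The table lists both `compute_one_metric` (a single set of predictions) and `compute_metrics` (aggregated across all predictions). The sketch below shows one plausible reading of that split for a multi-class task. It is an assumption, not rnaglib's code: the `_sketch` function names are hypothetical, the probabilities argument is omitted, the metrics chosen (accuracy and macro-averaged F1) are illustrative, and aggregation is done by concatenating batches before scoring.

```python
from typing import Dict, List, Sequence


def compute_one_metric_sketch(
    preds: Sequence[int], labels: Sequence[int]
) -> Dict[str, float]:
    """Accuracy and macro-averaged F1 for one set of multi-class predictions."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and of equal length")
    pairs = list(zip(preds, labels))
    accuracy = sum(p == y for p, y in pairs) / len(pairs)
    f1_scores = []
    # Score every class that appears in either the predictions or the labels.
    for cls in sorted(set(labels) | set(preds)):
        tp = sum(p == cls and y == cls for p, y in pairs)
        fp = sum(p == cls and y != cls for p, y in pairs)
        fn = sum(p != cls and y == cls for p, y in pairs)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(f1_scores)}


def compute_metrics_sketch(
    all_preds: List[Sequence[int]], all_labels: List[Sequence[int]]
) -> Dict[str, float]:
    """Aggregate by concatenating all batches, then scoring once (an assumption)."""
    flat_preds = [p for batch in all_preds for p in batch]
    flat_labels = [y for batch in all_labels for y in batch]
    return compute_one_metric_sketch(flat_preds, flat_labels)


# Two hypothetical batches of class indices.
result = compute_metrics_sketch([[0, 1], [2, 2]], [[0, 1], [2, 1]])
print(result["accuracy"])  # 0.75
```

Concatenating before scoring avoids averaging per-batch scores, which would weight small batches more heavily than large ones.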

Attributes

`default_metric`
`default_splitter`	Returns the splitting strategy to be used for this specific task.
`dummy_model`	Get a dummy model for testing purposes.
`input_var`
`name`
`target_var`
`task_id`	Hash of all RNA IDs and node IDs in the dataset.
`version`
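
The `task_id` description above suggests an identifier derived from the dataset contents. A minimal sketch of such a content hash follows. The function name `task_hash_sketch`, the use of SHA-256, the sorting step, and the example IDs are all assumptions for illustration and may differ from what rnaglib actually does.

```python
import hashlib
from typing import Dict, Iterable


def task_hash_sketch(rna_to_nodes: Dict[str, Iterable[str]]) -> str:
    """Hash every RNA ID together with its node IDs, independent of ordering."""
    digest = hashlib.sha256()
    for rna_id in sorted(rna_to_nodes):
        digest.update(rna_id.encode("utf-8"))
        digest.update(b"\x00")  # separator so adjacent IDs cannot run together
        for node_id in sorted(rna_to_nodes[rna_id]):
            digest.update(node_id.encode("utf-8"))
            digest.update(b"\x00")
    return digest.hexdigest()


# Hypothetical RNA and node identifiers.
dataset_a = {"1abc": ["1abc.A.1", "1abc.A.2"], "2xyz": ["2xyz.B.5"]}
dataset_b = {"2xyz": ["2xyz.B.5"], "1abc": ["1abc.A.2", "1abc.A.1"]}
print(task_hash_sketch(dataset_a) == task_hash_sketch(dataset_b))  # True
```

Sorting before hashing makes the identifier depend only on which RNAs and nodes are present, so two builds of the same dataset compare equal regardless of iteration order.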