rnaglib.tasks.gRNAde

class rnaglib.tasks.gRNAde(size_thresholds=(15, 300), **kwargs)

A subclass of InverseFolding used to train a model on the gRNAde dataset.

Task type: multi-class classification

Task level: residue-level

Parameters:

size_thresholds (tuple[int]) – (minimum, maximum) range of RNA sizes to keep in the task dataset (default (15, 300))
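To make the effect of size_thresholds concrete, here is a minimal sketch of a size filter in plain Python. It does not use rnaglib: the helper keep_by_size and the example sequences are hypothetical, and treating both bounds as inclusive is an assumption, since the reference above does not say how the endpoints are handled.

```python
from typing import Dict, Tuple


def keep_by_size(
    rnas: Dict[str, str],
    size_thresholds: Tuple[int, int] = (15, 300),
) -> Dict[str, str]:
    """Keep RNAs whose length lies within the thresholds.

    Hypothetical helper for illustration only. Inclusive bounds are an
    assumption; the library may treat the endpoints differently.
    """
    low, high = size_thresholds
    return {name: seq for name, seq in rnas.items() if low <= len(seq) <= high}


# Made-up sequences of 8, 40 and 400 nucleotides.
rnas = {
    "short": "ACGU" * 2,
    "medium": "ACGU" * 10,
    "long": "ACGU" * 100,
}

kept = keep_by_size(rnas)
print(sorted(kept))  # ['medium']
```

With the signature default of (15, 300), only the 40-nucleotide sequence survives; widening the range to (1, 500) would keep all three.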

__init__(size_thresholds=(15, 300), **kwargs)

Methods

__init__([size_thresholds])

add_feature(feature[, feature_level, is_input])

Add a feature to the dataset.

add_representation(representation)

Add a representation transform to the dataset.

add_rna_to_building_list(all_rnas, rna)

Add an RNA to the building list.

compute_distances()

Compute similarity distances between RNAs in the dataset.

compute_metrics(all_preds, all_probs, all_labels)

Evaluate model performance on nucleotide prediction task.

compute_one_metric(preds, unfiltered_preds, ...)

Compute classification metrics for a single set of predictions.

create_dataset_from_list(rnas)

Build an RNADataset object from the lists populated by add_rna_to_building_list.

describe()

Get description of task dataset.

dummy_inference()

Run dummy inference on the test dataset.

evaluate(model, loader)

Evaluate model performance on a dataset.

from_scratch(size_thresholds)

Create task dataset from scratch.

from_zenodo()

Download the task dataset from Zenodo and load it.

get_split_datasets([recompute])

Get train, validation, and test datasets.

get_split_loaders([recompute])

Get train, validation, and test dataloaders.

get_task_vars()

Specify the FeaturesComputer object of the task, which defines the features to add to the RNAs (graphs) and nucleotides (graph nodes).

init_metadata([additional_metadata])

Initialize the self.metadata dictionary, which holds key/value pairs.

load()

Load dataset and splits from disk.

post_process()

Task-specific post-processing steps that remove redundancy and compute the distances used by the splitters.

process()

remove_redundancy()

Remove redundant RNAs from the dataset based on similarity.

remove_representation(representation_name)

Remove a representation transform from the dataset.

set_datasets([recompute])

Set the train, val and test datasets.

set_loaders([recompute])

Set the dataloader properties.

split(dataset)

Call the splitter and return the train, val, and test splits.

to_csv(path)

Write a single CSV with all task data.

write()

Save task data and splits to root.
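The compute_metrics and compute_one_metric entries above describe residue-level, multi-class evaluation of nucleotide predictions. The sketch below shows what such metrics can look like in plain Python. It is not the library's implementation: the functions sequence_recovery and per_class_recall, the NUCS tuple, and the example data are all hypothetical, and the set of metrics the task actually reports is not stated in this reference.

```python
from typing import Dict, Sequence

# The four RNA nucleotides, used here as the class labels (illustrative).
NUCS = ("A", "C", "G", "U")


def sequence_recovery(preds: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of positions where the predicted nucleotide matches the label."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        return 0.0
    matches = sum(p == t for p, t in zip(preds, labels))
    return matches / len(labels)


def per_class_recall(preds: Sequence[str], labels: Sequence[str]) -> Dict[str, float]:
    """Recall for each nucleotide class that occurs in the labels."""
    recall = {}
    for nuc in NUCS:
        positions = [i for i, t in enumerate(labels) if t == nuc]
        if positions:
            recall[nuc] = sum(preds[i] == nuc for i in positions) / len(positions)
    return recall


# Made-up example: both U positions are mispredicted.
labels = list("ACGUACGU")
preds = list("ACGAACGG")

print(sequence_recovery(preds, labels))  # 0.75
print(per_class_recall(preds, labels))
```

Reporting per-class values alongside the overall fraction makes it visible when a model does well on average but fails on one nucleotide, as in this example.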

Attributes

default_metric

default_splitter

Returns the splitting strategy to be used for this specific task.

dummy_model

Get a dummy model for testing purposes.

input_var

name

nucs

target_var

task_id

Hash of all RNA IDs and node IDs in the dataset.

version
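The task_id attribute above is described as a hash of all RNA IDs and node IDs in the dataset. The following is a minimal sketch of how such an identifier could be derived, so that the same dataset always yields the same value. The function task_hash, the choice of SHA-256, the sorting step, and the example IDs are all assumptions made for illustration; the reference does not specify the actual algorithm.

```python
import hashlib
from typing import Dict, List


def task_hash(node_ids_by_rna: Dict[str, List[str]]) -> str:
    """Hash RNA IDs and their node IDs in a deterministic order.

    Hypothetical sketch: sorting makes the result independent of insertion
    order, and a NUL separator keeps adjacent IDs from running together.
    """
    digest = hashlib.sha256()
    for rna_id in sorted(node_ids_by_rna):
        digest.update(rna_id.encode("utf-8"))
        digest.update(b"\x00")
        for node_id in sorted(node_ids_by_rna[rna_id]):
            digest.update(node_id.encode("utf-8"))
            digest.update(b"\x00")
    return digest.hexdigest()


# Made-up IDs for illustration.
dataset_a = {"rna1": ["rna1.A.1", "rna1.A.2"], "rna2": ["rna2.B.1"]}
dataset_b = {"rna2": ["rna2.B.1"], "rna1": ["rna1.A.2", "rna1.A.1"]}
dataset_c = {"rna1": ["rna1.A.1"], "rna2": ["rna2.B.1"]}

print(task_hash(dataset_a) == task_hash(dataset_b))  # True: same content
print(task_hash(dataset_a) == task_hash(dataset_c))  # False: one node removed
```

An identifier of this kind changes whenever an RNA or a node is added or removed, which is what makes it useful for checking that two runs used the same task data.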