rnaglib.tasks.ResidueClassificationTask

class rnaglib.tasks.ResidueClassificationTask(additional_metadata=None, **kwargs)

Classification task at the residue level.

Each residue (nucleotide) in the RNA is classified independently.
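To make "classified independently" concrete, here is a minimal sketch in plain Python. It is not rnaglib code: the function name classify_residues, the residue ID format, and the score layout (one list of per-class scores per residue) are assumptions made for illustration only.

```python
from typing import Dict, List


def classify_residues(scores: Dict[str, List[float]]) -> Dict[str, int]:
    """Pick the highest-scoring class for each residue.

    Each residue is handled on its own: the label chosen for one
    residue never depends on the label chosen for another.
    """
    return {
        residue_id: max(range(len(class_scores)), key=class_scores.__getitem__)
        for residue_id, class_scores in scores.items()
    }


# Hypothetical residue IDs and per-class scores for a three-residue RNA.
example_scores = {
    "1abc.A.1": [0.9, 0.1],
    "1abc.A.2": [0.2, 0.8],
    "1abc.A.3": [0.6, 0.4],
}
predictions = classify_residues(example_scores)
```

The result is one class index per residue, which is the shape of output a residue-level task compares against per-residue labels.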

__init__(additional_metadata=None, **kwargs)

Methods

__init__([additional_metadata])

add_feature(feature[, feature_level, is_input])

Add a feature to the dataset.

add_representation(representation)

Add a representation transform to the dataset.

add_rna_to_building_list(all_rnas, rna)

Add an RNA to the building list.

compute_distances()

Compute similarity distances between RNAs in the dataset.

compute_metrics(all_preds, all_probs, all_labels)

Compute classification metrics aggregated across all predictions.

compute_one_metric(preds, probs, labels)

Compute classification metrics for a single set of predictions.

create_dataset_from_list(rnas)

Build an RNADataset object from the lists populated by add_rna_to_building_list.

describe()

Get description of task dataset.

dummy_inference()

Run dummy inference on the test dataset.

evaluate(model, loader)

Evaluate model performance on a dataset.

from_scratch(size_thresholds)

Create task dataset from scratch.

from_zenodo()

Download the task dataset from Zenodo and load it.

get_split_datasets([recompute])

Get train, validation, and test datasets.

get_split_loaders([recompute])

Get train, validation, and test dataloaders.

get_task_vars()

Define a FeaturesComputer object to set which input and output variables will be used in the task.

init_metadata([additional_metadata])

Initialize the self.metadata dictionary that holds the task's key/value pairs.

load()

Load dataset and splits from disk.

post_process()

Apply post-processing steps to remove redundancy.

process()

Tasks must implement this method.

remove_redundancy()

Remove redundant RNAs from the dataset based on similarity.

remove_representation(representation_name)

Remove a representation transform from the dataset.

set_datasets([recompute])

Set the train, validation, and test datasets.

set_loaders([recompute])

Set the dataloader properties.

split(dataset)

Call the splitter and return the train, validation, and test splits.

to_csv(path)

Write a single CSV with all task data.

write()

Save task data and splits to root.
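The summaries for compute_one_metric and compute_metrics do not say which metrics are reported or how scores are aggregated. The sketch below only illustrates the distinction between scoring a single set of predictions and aggregating over many sets. It assumes accuracy as the metric, ignores the probs argument, and uses hypothetical helper names (accuracy, aggregate_accuracy) that are not part of rnaglib.

```python
from typing import Dict, List, Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Score one set of residue predictions (for example, one RNA)."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty set of residues")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def aggregate_accuracy(
    all_preds: List[Sequence[int]], all_labels: List[Sequence[int]]
) -> Dict[str, float]:
    """Aggregate over several sets in two common ways.

    per_rna_mean: score each set separately, then average the scores.
    pooled: concatenate every residue into one set, then score once.
    """
    per_rna = [accuracy(p, y) for p, y in zip(all_preds, all_labels)]
    pooled_preds = [p for preds in all_preds for p in preds]
    pooled_labels = [y for labels in all_labels for y in labels]
    return {
        "per_rna_mean": sum(per_rna) / len(per_rna),
        "pooled": accuracy(pooled_preds, pooled_labels),
    }


# Two hypothetical RNAs of different lengths.
all_preds = [[1, 0, 1, 1], [0, 0]]
all_labels = [[1, 0, 0, 1], [0, 1]]
metrics = aggregate_accuracy(all_preds, all_labels)
```

The two aggregates differ whenever the RNAs have different lengths, because pooling weights each residue equally while the per-RNA mean weights each RNA equally.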

Attributes

default_splitter

The splitter used if no other splitter is specified.

dummy_model

Get a dummy model for testing purposes.

task_id

Task hash is a hash of all RNA IDs and node IDs in the dataset.
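As an illustration of what a hash over RNA and node identifiers can look like, the sketch below derives a stable digest with hashlib. The choice of SHA-256, the sorting of identifiers, the separator byte, and the name sketch_task_id are all assumptions; the library's actual hashing scheme is not described here and may differ.

```python
import hashlib
from typing import Dict, Iterable


def sketch_task_id(rna_to_nodes: Dict[str, Iterable[str]]) -> str:
    """Hash every RNA ID and node ID into a single hex digest.

    Identifiers are sorted first so the digest does not depend on the
    order in which RNAs or nodes happen to be stored.
    """
    digest = hashlib.sha256()
    for rna_id in sorted(rna_to_nodes):
        digest.update(rna_id.encode("utf-8"))
        digest.update(b"\x00")  # separator so adjacent IDs cannot run together
        for node_id in sorted(rna_to_nodes[rna_id]):
            digest.update(node_id.encode("utf-8"))
            digest.update(b"\x00")
    return digest.hexdigest()


# Hypothetical RNA and node IDs.
a = sketch_task_id({"1abc": ["1abc.A.1", "1abc.A.2"], "2xyz": ["2xyz.B.7"]})
b = sketch_task_id({"2xyz": ["2xyz.B.7"], "1abc": ["1abc.A.2", "1abc.A.1"]})
c = sketch_task_id({"1abc": ["1abc.A.1"], "2xyz": ["2xyz.B.7"]})
```

Here a and b describe the same content in a different order and hash identically, while c drops one node and produces a different digest. That is the property that makes such an identifier useful for checking that two runs used the same dataset.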