rnaglib.dataset_transforms¶
Dataset transforms are transforms that process a whole dataset. They take a dataset as input and return the same dataset with features added or modified, or with elements added or removed.
Abstract classes¶
Subclass these to create your own dataset transforms.
Base class for transforms that process a whole RNADataset.
Objects that split an RNADataset into train, validation and test sets.
Dataset transform that adds to the dataset attributes a distance matrix encoding the pairwise distances between all RNAs in the dataset.
Dataset transform that removes redundancy by clustering the dataset, then keeping only the RNA with the highest resolution in each cluster.
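The pattern described above (a transform takes a whole dataset and returns a dataset) can be sketched in plain Python. This is an illustration only, not rnaglib's implementation: the class names `DatasetTransformSketch` and `KeepBestResolution`, the `name` and `resolution` keys, and the precomputed `cluster_of` mapping are all assumptions made for the example, and a list of dicts stands in for an RNADataset.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class DatasetTransformSketch(ABC):
    """Hypothetical base class: a transform maps a whole dataset to a dataset."""

    def __call__(self, dataset):
        return self.forward(dataset)

    @abstractmethod
    def forward(self, dataset):
        """Return the processed dataset."""


class KeepBestResolution(DatasetTransformSketch):
    """Hypothetical redundancy remover working on precomputed clusters."""

    def __init__(self, cluster_of):
        # cluster_of: dict mapping an RNA name to its cluster id (assumed given).
        self.cluster_of = cluster_of

    def forward(self, dataset):
        clusters = defaultdict(list)
        for rna in dataset:
            clusters[self.cluster_of[rna["name"]]].append(rna)
        # Resolution is in angstroms, so the smallest value is the highest resolution.
        return [
            min(members, key=lambda rna: rna["resolution"])
            for members in clusters.values()
        ]


# Toy data: a list of dicts stands in for an RNADataset.
toy_dataset = [
    {"name": "rna_a", "resolution": 2.8},
    {"name": "rna_b", "resolution": 1.9},
    {"name": "rna_c", "resolution": 3.2},
]
transform = KeepBestResolution({"rna_a": 0, "rna_b": 0, "rna_c": 1})
kept_names = [rna["name"] for rna in transform(toy_dataset)]
print(kept_names)
```

Subclassing a common base keeps every transform callable in the same way, so transforms can be chained one after another on the same dataset.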
Splitters¶
Ways to split your data. All of these are subclasses of the Splitter abstract class.
Abstract class for splitting by clustering with a similarity function.
Splits a dataset randomly.
Splits a dataset based on hard-coded lists of RNA names to be included in the train, val and test sets.
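To make the random and name-based strategies concrete, here is a minimal sketch in plain Python. The function names `random_split` and `name_split`, the default fractions and the seed are assumptions made for illustration. They are not rnaglib's API, and the library's splitters may behave differently.

```python
import random


def random_split(names, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle the names and cut them into train, val and test lists."""
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    held_out = shuffled[n_train + n_val:]  # whatever remains is the test set
    return train, val, held_out


def name_split(names, train_names, val_names, test_names):
    """Assign each name to a set according to hard-coded lists."""
    wanted = [set(train_names), set(val_names), set(test_names)]
    # Names that appear in none of the lists are left out of every set.
    return tuple([n for n in names if n in group] for group in wanted)


all_names = [f"rna_{i}" for i in range(20)]
train, val, held_out = random_split(all_names)
by_name = name_split(all_names, ["rna_0", "rna_1"], ["rna_2"], ["rna_3"])
```

A fixed seed makes the random split reproducible, while the name-based split is useful when a benchmark must reuse a published partition.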
Distance computers¶
Ways to add to the dataset a distance matrix giving the distances between its samples. All of these are subclasses of the DistanceComputer abstract class.
Distance computer that computes a structure-based pairwise distance between the RNAs of a dataset.
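As a rough picture of what a pairwise distance matrix over a dataset looks like, the sketch below fills a symmetric matrix from any distance function. The Jaccard distance over toy feature sets is only a stand-in: it is not the structure-based distance used by the library, and the names `pairwise_distances` and `jaccard_distance` are hypothetical.

```python
def pairwise_distances(items, distance):
    """Build a symmetric matrix of distances between all pairs of items."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is assumed symmetric, so each pair is computed once.
            matrix[i][j] = matrix[j][i] = distance(items[i], items[j])
    return matrix


def jaccard_distance(a, b):
    """Stand-in distance: 1 minus the Jaccard similarity of two sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy samples: each RNA is reduced to a set of made-up feature labels.
toy_samples = [{"A", "B", "C"}, {"A", "B"}, {"D"}]
distance_matrix = pairwise_distances(toy_samples, jaccard_distance)
```

Once such a matrix is stored alongside the dataset, clustering-based splitting and redundancy removal can reuse it instead of recomputing distances.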
Loading¶
Tools for loading the RNAs stored in an RNADataset batch-wise for deep learning models.
Wrapper for the collate function, so that different node similarities can be used.
Fetches a loader object for a given dataset.
Turns a graph dataloader or dataset into an edge data loader generator.
Dataloader that yields base pairs.
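The two ideas above, batching with a pluggable collate function and iterating over edges rather than graphs, can be illustrated with plain Python generators. This sketch does not use rnaglib or any deep learning framework. The names `iter_batches` and `iter_base_pairs`, and the `name` and `edges` keys of the toy graphs, are assumptions made for the example.

```python
def iter_batches(dataset, batch_size, collate=list):
    """Yield consecutive batches, each passed through a collate function."""
    for start in range(0, len(dataset), batch_size):
        yield collate(dataset[start:start + batch_size])


def iter_base_pairs(graphs):
    """Turn an iterable of graphs into a generator over their edges."""
    for graph in graphs:
        for u, v in graph["edges"]:
            yield graph["name"], u, v


# Toy graphs: each edge is a pair of residue indices standing in for a base pair.
toy_graphs = [
    {"name": "rna_a", "edges": [(1, 8), (2, 7)]},
    {"name": "rna_b", "edges": [(3, 5)]},
]
batches = list(iter_batches(list(range(5)), batch_size=2))
base_pairs = list(iter_base_pairs(toy_graphs))
```

Passing the collate function as an argument is what lets the same batching loop serve different batch layouts.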