rnaglib.transforms
Transforms are objects which modify RNA dictionaries in various ways. You can apply a transform to an individual RNA or to a collection (filters can only be applied to collections).
In this example, we add a graph-level field 'rfam' holding the Rfam ID of each RNA:
>>> from rnaglib.transforms import RfamTransform
>>> from rnaglib.dataset import RNADataset
>>> dataset = RNADataset(debug=True)
>>> t = RfamTransform()
>>> dataset = t(dataset)
>>> dataset[2]['rna'].graph['rfam']
'RF00005'
Note
You can often speed up a transform by passing parallel=True to its constructor, which applies the transform in parallel.
Generic transforms
This is the general form of a transform, from which the specific transforms described below are derived.
|
Transforms modify and add information to an RNA graph via the |
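As a rough illustration of the idea, and not rnaglib's actual base class, a transform can be pictured as a callable that edits one RNA dictionary and maps itself over a collection. Every name here (`SketchTransform`, `forward`, `AddLength`, `toy_rna`, and the `SimpleNamespace` stand-in for the RNA graph) is hypothetical and chosen only for this sketch:

```python
from types import SimpleNamespace


class SketchTransform:
    """Hypothetical base class for illustration: NOT rnaglib's implementation."""

    def forward(self, rna_dict):
        # Subclasses edit and return a single RNA dictionary.
        raise NotImplementedError

    def __call__(self, data):
        # A single RNA dictionary is transformed directly;
        # any other iterable is treated as a collection.
        if isinstance(data, dict):
            return self.forward(data)
        return [self.forward(rna_dict) for rna_dict in data]


class AddLength(SketchTransform):
    """Toy transform: store the number of residues as a graph attribute."""

    def forward(self, rna_dict):
        rna = rna_dict["rna"]
        rna.graph["length"] = len(rna.nodes)
        return rna_dict


def toy_rna(name, sequence):
    # Stand-in for an RNA graph: only `.graph` and `.nodes` are mimicked.
    nodes = {f"{name}.A.{i + 1}": {"nt": nt} for i, nt in enumerate(sequence)}
    return {"rna": SimpleNamespace(graph={"name": name}, nodes=nodes)}


t = AddLength()
single = t(toy_rna("toy1", "GGCA"))
collection = t([toy_rna("toy2", "AU"), toy_rna("toy3", "GCGCG")])
lengths = [d["rna"].graph["length"] for d in collection]
```

The same object handles both cases, which mirrors the `dataset = t(dataset)` usage in the example at the top of this page.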
Annotation Transforms
These transforms update the information stored in an RNA dictionary.
|
A transform that computes an annotation for the RNA. |
|
Obtain the Rfam classification of an RNA and store as a graph attribute. |
|
Use the RNA-FM model to compute residue-level embeddings. |
|
Assign the RNA name using its PDB ID. |
|
Set the rna.name field using the pdbid and chain ID. |
|
Compute secondary structure in dot-bracket notation for each chain in the RNA and store in a graph-level dictionary. |
|
Annotate RNAs with small molecule binding information. |
|
Add residue-level information about the protein content of each residue's environment. |
|
Generic annotator that adds node-level features to a dataset using only a dictionary mapping node names to the desired node features. |
|
Add a dummy attribute with value 1 to all nodes in an RNA graph. |
|
|
|
Annotation transform that adds to each node in the dataset a binary feature indicating whether the node is part of a binding site.
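To make the node-level case concrete, here is a minimal sketch of the "dummy attribute" annotation described in the table above. It is an assumption-laden illustration, not the library's code: `add_dummy_annotation`, the `SimpleNamespace` stand-in for the RNA graph, the node names, and the 'nt' key are all made up for this example.

```python
from types import SimpleNamespace


def add_dummy_annotation(rna_dict, key="dummy"):
    """Hypothetical helper: set `key` to 1 on every node of the RNA."""
    for attributes in rna_dict["rna"].nodes.values():
        attributes[key] = 1
    return rna_dict


# Stand-in for an RNA graph; node names and the 'nt' key are made up.
rna = SimpleNamespace(
    graph={"name": "toy"},
    nodes={"toy.A.1": {"nt": "G"}, "toy.A.2": {"nt": "C"}},
)
annotated = add_dummy_annotation({"rna": rna})
dummy_values = [attrs["dummy"] for attrs in annotated["rna"].nodes.values()]
```

Existing node attributes are left untouched; the annotation only adds a new key.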
Filters
These transforms filter out RNAs from a collection of RNAs based on various criteria.
|
Reject items from a dataset based on some conditions. |
|
Reject RNAs that are not in the given size bounds. |
|
Reject RNAs that lack a certain annotation at the whole RNA level. |
|
Reject RNAs that lack a certain annotation at the residue level. |
|
Filter RNAs based on their residues' names. |
|
Remove an RNA if it is ribosomal. |
|
Filter RNAs based on their names. |
|
Filter RNAs based on valid chain names for each structure. |
|
Filter RNAs based on their resolution.
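The size-bound filter listed above can be sketched as a generator over a collection. This is only an illustration under stated assumptions: `size_filter`, its `min_size`/`max_size` parameters, the inclusive bounds, and the `toy_rna` stand-in are hypothetical and do not reproduce rnaglib's signatures.

```python
from types import SimpleNamespace


def size_filter(rna_dicts, min_size=None, max_size=None):
    """Hypothetical filter: yield only RNAs whose residue count is in bounds.

    Bounds are treated as inclusive; None disables a bound.
    """
    for rna_dict in rna_dicts:
        n_residues = len(rna_dict["rna"].nodes)
        if min_size is not None and n_residues < min_size:
            continue
        if max_size is not None and n_residues > max_size:
            continue
        yield rna_dict


def toy_rna(name, n_residues):
    # Stand-in for an RNA graph: only `.graph` and `.nodes` are mimicked.
    nodes = {f"{name}.A.{i + 1}": {} for i in range(n_residues)}
    return {"rna": SimpleNamespace(graph={"name": name}, nodes=nodes)}


collection = [toy_rna("small", 3), toy_rna("medium", 10), toy_rna("large", 50)]
kept = [
    d["rna"].graph["name"]
    for d in size_filter(collection, min_size=5, max_size=20)
]
```

Because a filter decides which items survive, it only makes sense on a collection, which is why filters cannot be applied to an individual RNA.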
Partitions
These transforms take an RNA and return an iterator of RNAs. They are useful for splitting an RNA into substructures (e.g. by chain ID or binding pocket).
|
Break up whole RNAs into substructures. |
|
Split up an RNA by chain. |
|
Split up an RNA by connected components. |
|
Partitions an RNA according to a partition defined in a dictionary. |
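A chain-based partition might look like the sketch below, which yields one RNA dictionary per chain. It assumes node names of the form '<pdbid>.<chain>.<position>' and ignores edges entirely; `split_by_chain` and the `SimpleNamespace` stand-in are hypothetical names for illustration, not the library's implementation.

```python
from collections import defaultdict
from types import SimpleNamespace


def split_by_chain(rna_dict):
    """Hypothetical partition: yield one RNA dictionary per chain."""
    by_chain = defaultdict(dict)
    for node_name, attributes in rna_dict["rna"].nodes.items():
        # Assumed naming scheme: '<pdbid>.<chain>.<position>'.
        chain = node_name.split(".")[1]
        by_chain[chain][node_name] = attributes
    for chain, nodes in by_chain.items():
        # Copy graph-level attributes and record which chain this piece is.
        graph = dict(rna_dict["rna"].graph, chain=chain)
        yield {"rna": SimpleNamespace(graph=graph, nodes=nodes)}


rna = SimpleNamespace(
    graph={"name": "toy"},
    nodes={"toy.A.1": {}, "toy.A.2": {}, "toy.B.1": {}},
)
parts = list(split_by_chain({"rna": rna}))
chain_sizes = {p["rna"].graph["chain"]: len(p["rna"].nodes) for p in parts}
```

Returning an iterator rather than a list lets a large RNA be split lazily, one substructure at a time.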
Representations
These transforms convert a raw RNA into a geometric representation such as a graph, a voxel grid, or a point cloud.
|
Callable object that accepts a raw RNA networkx object along with features and target vector representations and returns a representation of it (e.g. graph, voxel, point cloud). |
|
Represents RNA as a linear sequence following the 5' to 3' order of backbone edges. |
|
Converts RNA into a Leontis-Westhof graph (2.5D) where nodes are residues and edges are either base pairs or backbones. |
|
Converts RNA into a point-cloud-based representation. |
|
Converts RNA into a voxel-based representation. |
|
Converts RNA into a ring-based representation.
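For the sequence representation, "following the 5' to 3' order of backbone edges" can be sketched as a walk along directed backbone links. The helper below is hypothetical (`backbone_sequence` is not an rnaglib function) and assumes a single unbranched chain whose edges are given as (upstream, downstream) pairs of node names:

```python
def backbone_sequence(nucleotides, backbone_edges):
    """Hypothetical helper: read nucleotides in 5' to 3' order.

    nucleotides: dict mapping node name -> nucleotide letter.
    backbone_edges: iterable of (upstream, downstream) node-name pairs,
    assumed to form a single unbranched chain.
    """
    next_node = dict(backbone_edges)
    downstream = set(next_node.values())
    # The 5' end is the only node that no backbone edge points to.
    starts = [name for name in nucleotides if name not in downstream]
    if len(starts) != 1:
        raise ValueError("expected exactly one 5' end")
    node, letters = starts[0], []
    while node is not None:
        letters.append(nucleotides[node])
        node = next_node.get(node)
    return "".join(letters)


# Nodes and edges are deliberately given out of order.
nucleotides = {"toy.A.3": "C", "toy.A.1": "G", "toy.A.2": "A"}
edges = [("toy.A.2", "toy.A.3"), ("toy.A.1", "toy.A.2")]
sequence = backbone_sequence(nucleotides, edges)
```

The walk relies on edge direction rather than on node numbering, so it still works when residue positions are non-contiguous.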
Featurizers
These transforms take an annotation in the RNA and cast it into a feature vector.
|
This class takes as input an RNA in the networkX form and computes the |
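As an illustration of casting an annotation into a feature vector, the sketch below one-hot encodes a nucleotide annotation per node. The names `NUCLEOTIDES`, `one_hot_nucleotide`, and `featurize`, as well as the 'nt' annotation key, are assumptions made for this example and are not taken from rnaglib:

```python
NUCLEOTIDES = ("A", "C", "G", "U")


def one_hot_nucleotide(nt):
    """Hypothetical encoder: map a nucleotide letter to a one-hot vector.

    Letters outside NUCLEOTIDES (e.g. modified residues) map to all zeros.
    """
    return [1.0 if nt == ref else 0.0 for ref in NUCLEOTIDES]


def featurize(nodes, key="nt"):
    # One feature vector per node, keyed by node name.
    return {
        name: one_hot_nucleotide(attrs.get(key))
        for name, attrs in nodes.items()
    }


nodes = {"toy.A.1": {"nt": "G"}, "toy.A.2": {"nt": "U"}, "toy.A.3": {"nt": "N"}}
features = featurize(nodes)
```

In practice the resulting vectors would typically be stacked into a tensor in node order before being passed to a model.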