rnaglib.transforms.FeaturesComputer

class rnaglib.transforms.FeaturesComputer(nt_features=None, nt_targets=None, rna_features=None, rna_targets=None, bp_features=None, bp_targets=None, extra_useful_keys=None, custom_encoders=None)

This class takes an RNA in NetworkX form as input and computes the features_dict, which maps node IDs to a tensor of features. The features_dict contains the keys 'nt_features' for node features and 'nt_targets' for node-level prediction targets. When the FeaturesComputer is used in an RNADataset, the FeaturesComputer.compute_features() method is called during the RNADataset __getitem__() call.

Parameters:
  • nt_features (Union[List, str, None]) – List of keys to use as node (nucleotide) features that are meant to be inputs of the ML task. Choose from the dataset[i]['rna'] node attributes dictionary.

  • nt_targets (Union[List, str, None]) – List of keys to use as node (nucleotide) features that are meant to be outputs of the ML task. Choose from the dataset[i]['rna'] node attributes dictionary.

  • rna_features (Union[List, str, None]) – List of keys to use as graph-level features of graphs representing the whole RNA, meant to be inputs of the ML task.

  • rna_targets (Union[List, str, None]) – List of keys to use as graph-level features of graphs representing the whole RNA, meant to be outputs of the ML task.

  • bp_features (Union[List, str, None]) – List of keys to use as graph-level features of graphs representing an RNA binding pocket, meant to be inputs of the ML task.

  • bp_targets (Union[List, str, None]) – List of keys to use as graph-level features of graphs representing an RNA binding pocket, meant to be outputs of the ML task.

  • extra_useful_keys (Union[List, str, None]) – List of keys that are not RNA, nucleotide or binding pocket features or targets but must be preserved when applying the FeaturesComputer to the dataset.

  • custom_encoders (dict) – Dictionary of the form {feature_name : encoder}
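
As an illustration of the {feature_name : encoder} form expected by custom_encoders, here is a minimal plain-Python sketch. The BoolEncoder class, its encode and encode_default methods, and the "is_modified" key are hypothetical names chosen for this example and are not taken from rnaglib; lists stand in for the tensors a real encoder would presumably return.

```python
class BoolEncoder:
    """Hypothetical encoder: maps a truthy/falsy attribute to a one-element vector."""

    def encode(self, value):
        return [1.0 if value else 0.0]

    def encode_default(self):
        # Fallback value for nodes that lack the attribute.
        return [0.0]


# Dictionary of the form {feature_name: encoder}; "is_modified" is an assumed key.
custom_encoders = {"is_modified": BoolEncoder()}

# Attributes of a single toy node.
node_attributes = {"is_modified": True}

# Encode each requested feature, falling back to the default when it is absent.
encoded = {
    name: enc.encode(node_attributes[name]) if name in node_attributes else enc.encode_default()
    for name, enc in custom_encoders.items()
}
```

Under these assumptions, such a dictionary would be passed as custom_encoders=custom_encoders when constructing the FeaturesComputer.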

__init__(nt_features=None, nt_targets=None, rna_features=None, rna_targets=None, bp_features=None, bp_targets=None, extra_useful_keys=None, custom_encoders=None)

Methods

__init__([nt_features, nt_targets, ...])

add_feature([feature_names, ...])

Update the input/output feature selector with either an extra available named feature or a custom encoder

build_edge_feature_parser([asked_features])

build_feature_parser([asked_features, ...])

This function will load the predefined feature maps available globally.

compute_dim(node_parser)

Based on the encoding scheme, compute the shapes of the input and output tensors.

encode_nodes(g, node_parser)

Apply the node encoding functions in node_parser to each node in the graph, then use torch.cat over the results to get one tensor per node.

encode_rna(g, parser)

Simply apply the rna encoding functions in parser for all features.

forward(rna_dict)

Add three dictionaries to the rna_dict, which map nucleotides, edges, and the whole graph to a feature vector each.

remove_feature([feature_name, input_feature])

Update the input/output feature selector by removing a named feature.

remove_useless_keys(rna_graph)

Copy the original graph to only retain keys relevant to this FeaturesComputer
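
The node-level steps described in this list (apply each encoder to every node, concatenate the results, derive the resulting dimension, and keep only the relevant keys) can be sketched in plain Python. This is an illustration under stated assumptions, not rnaglib's implementation: the graph is a dict of node attribute dicts rather than a NetworkX graph, lists and list concatenation stand in for tensors and torch.cat, and the functions below only borrow the method names above and are written from scratch. All node IDs and attribute names are made up.

```python
def one_hot(alphabet):
    """Return a hypothetical encoder function mapping a symbol to a one-hot list."""
    def encode(value):
        return [1.0 if value == symbol else 0.0 for symbol in alphabet]
    return encode


def encode_nodes(nodes, node_parser):
    """Apply every encoder in node_parser to each node and concatenate the results."""
    features = {}
    for node_id, attrs in nodes.items():
        vector = []
        for key, encode in node_parser.items():
            vector.extend(encode(attrs.get(key)))
        features[node_id] = vector
    return features


def compute_dim(node_parser):
    """Sum the output length of each encoder to get the feature dimension."""
    return sum(len(encode(None)) for encode in node_parser.values())


def remove_useless_keys(nodes, useful_keys):
    """Copy the node attributes, retaining only the keys listed in useful_keys."""
    return {
        node_id: {k: v for k, v in attrs.items() if k in useful_keys}
        for node_id, attrs in nodes.items()
    }


# Toy input; node IDs and attribute names are invented for this example.
nodes = {
    "1abc.A.1": {"nt": "A", "paired": True, "comment": "unused"},
    "1abc.A.2": {"nt": "G", "paired": False, "comment": "unused"},
}
# Parser of the form {attribute_key: encoder_function}.
node_parser = {
    "nt": one_hot("ACGU"),
    "paired": lambda value: [1.0 if value else 0.0],
}

pruned = remove_useless_keys(nodes, set(node_parser))
nt_features = encode_nodes(pruned, node_parser)
input_dim = compute_dim(node_parser)
```

In this sketch every node ends up with a vector of length input_dim, which is the kind of invariant the input_dim and output_dim attributes would be expected to describe.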

Attributes

input_dim

output_dim