rnaglib.transforms.RNAFMTransform

class rnaglib.transforms.RNAFMTransform(chunking_strategy='simple', chunk_size=512, cache_path=None, expand_mean=True, verbose=False, debug=False, **kwargs)[source]

Use the RNA-FM model to compute residue-level embeddings. Requires the rna-fm package, which can be installed with pip install rna-fm. The resulting embedding is stored as a NumPy array in the node attribute 'rnafm'. See the RNA-FM project repository for the model source code.

Parameters:
  • chunking_strategy (str) – How to process sequences longer than 1024 residues. 'simple' splits the sequence into non-overlapping segments.

  • chunk_size (int) – Size of chunks to use (default is 512)

  • cache_path – A directory containing pre-computed npz embeddings

  • expand_mean – If True, expand mean embeddings

Note

The basic RNA-FM model accepts sequences of at most 1024 residues. Longer sequences are processed according to chunking_strategy.
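
The non-overlapping splitting described for the 'simple' strategy can be illustrated with a minimal sketch. This is not rnaglib's implementation: the function name split_into_chunks and the constant MAX_LEN are hypothetical, and the sketch assumes that only sequences above the 1024 limit are split, with the last chunk allowed to be shorter than chunk_size.

```python
MAX_LEN = 1024  # model limit stated in the note above (assumed name)


def split_into_chunks(seq, chunk_size=512):
    """Split seq into consecutive, non-overlapping pieces of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the sequence in strides of chunk_size; the final
    # slice is simply shorter when len(seq) is not a multiple of it.
    return [seq[i:i + chunk_size] for i in range(0, len(seq), chunk_size)]


seq = "ACGU" * 300  # 1200 nucleotides, longer than MAX_LEN
# Assumption: sequences within the limit are passed through unsplit.
chunks = split_into_chunks(seq) if len(seq) > MAX_LEN else [seq]
chunk_lengths = [len(c) for c in chunks]
```

Because the chunks do not overlap, concatenating them restores the original sequence, so per-residue embeddings computed chunk by chunk can be placed back in sequence order.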

__init__(chunking_strategy='simple', chunk_size=512, cache_path=None, expand_mean=True, verbose=False, debug=False, **kwargs)[source]

Methods

__init__([chunking_strategy, chunk_size, ...])

basic_chunking(seq)

chunk(seq_data)

Apply a chunking strategy to sequences longer than 1024.

forward(rna_dict)

Attributes

encoder

name