rnaglib.dataset_transforms.CDHitComputer

class rnaglib.dataset_transforms.CDHitComputer(similarity_threshold=0.9, **kwargs)

Compute sequence similarity between RNAs using CD-HIT clustering.

Parameters:
  • similarity_threshold (float) – Sequence similarity threshold for clustering (default 0.9)

  • kwargs – Additional arguments passed to DistanceComputer
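To illustrate what a similarity threshold does during clustering, here is a minimal sketch of greedy incremental clustering in the style of CD-HIT: sequences are visited from longest to shortest, and each one joins the first representative it matches at or above the threshold, or else becomes a new representative. This is not rnaglib's implementation and does not call the CD-HIT binary. The helper names `identity` and `greedy_cluster` are hypothetical, and `difflib.SequenceMatcher` is used only as a stand-in for CD-HIT's alignment-based identity.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Stand-in similarity score in [0, 1] (assumption: not CD-HIT's own measure)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, similarity_threshold=0.9):
    """Map each sequence name to the name of its cluster representative."""
    # Visit longest sequences first, so a representative is never shorter
    # than the members that later join its cluster.
    order = sorted(seqs, key=lambda name: len(seqs[name]), reverse=True)
    representatives = []
    assignment = {}
    for name in order:
        for rep in representatives:
            if identity(seqs[name], seqs[rep]) >= similarity_threshold:
                assignment[name] = rep  # join the first matching cluster
                break
        else:
            # No representative was similar enough: start a new cluster.
            representatives.append(name)
            assignment[name] = name
    return assignment


# Toy sequences, made up for illustration only.
rnas = {
    "rna_a": "GGGAAACCCUUUGGGAAACC",
    "rna_b": "GGGAAACCCUUUGGGAAACG",  # differs from rna_a in the last base
    "rna_c": "AUAUAUAUAUAUAUAUAUAU",
}
clusters = greedy_cluster(rnas, similarity_threshold=0.9)
strict_clusters = greedy_cluster(rnas, similarity_threshold=0.99)
```

With the default-like threshold of 0.9, `rna_a` and `rna_b` share a cluster; raising the threshold to 0.99 splits them, which shows how a higher threshold yields more, tighter clusters.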

__init__(similarity_threshold=0.9, **kwargs)

Methods

  • __init__([similarity_threshold])

  • forward(dataset) – Computes sequence similarity between all pairs of RNAs.
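One plausible way to go from cluster assignments to a similarity value for every pair of RNAs is a binary matrix: 1 when two RNAs fall in the same cluster, 0 otherwise. The sketch below shows that idea in plain Python. Whether `forward` returns exactly this structure is an assumption; the function name `pairwise_from_clusters` and the toy cluster labels are hypothetical.

```python
def pairwise_from_clusters(assignment):
    """Build a binary similarity matrix from a name -> cluster-label mapping.

    Assumption: same cluster means similarity 1.0, different cluster means 0.0.
    """
    names = sorted(assignment)  # fixed row/column order
    matrix = [
        [1.0 if assignment[a] == assignment[b] else 0.0 for b in names]
        for a in names
    ]
    return names, matrix


# Toy cluster labels, made up for illustration only.
assignment = {"rna_a": 0, "rna_b": 0, "rna_c": 1}
names, sim = pairwise_from_clusters(assignment)

for name, row in zip(names, sim):
    print(name, row)
```

The resulting matrix is symmetric with ones on the diagonal, since every RNA shares a cluster with itself.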