evo_prot_grad.models

OneHotCNN

evo_prot_grad.models.downstream_cnn.OneHotCNN

Bases: nn.Module

A CNN that takes one-hot encoded sequences as input.

OneHotCNN applies a 1D convolution over the one-hot encoding dimension to embed each amino acid into a vector of size input_size, then applies length max-pooling (1D max-pooling along the sequence-length dimension) to reduce that dimension to 1. The pooled vector is passed through a linear layer to produce a single scalar output.

__init__(vocab_size: int, kernel_size: int, input_size: int, dropout: float = 0.0)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| vocab_size | int | the size of the vocabulary (e.g., 20) | required |
| kernel_size | int | the size of the convolutional kernel | required |
| input_size | int | the size of the input embedding | required |
| dropout | float | the dropout probability | 0.0 |

forward(x: torch.Tensor) -> torch.Tensor

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x | torch.Tensor | one-hot tensor of shape [parallel_chains, seq_len, vocab_size] | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| output | torch.Tensor | shape [parallel_chains] |
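
The architecture described for OneHotCNN can be sketched as follows. This is a minimal, hypothetical re-implementation for illustration only, not the library's source: the class name OneHotCNNSketch, the ReLU activation, the padding of kernel_size // 2, and the placement of dropout before the linear layer are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OneHotCNNSketch(nn.Module):
    """Illustrative stand-in for the described architecture (hypothetical, not library code)."""

    def __init__(self, vocab_size: int, kernel_size: int, input_size: int, dropout: float = 0.0):
        super().__init__()
        # The one-hot axis acts as the input channels; the kernel slides along the sequence.
        # Padding is an assumption that keeps the sequence length unchanged for odd kernels.
        self.conv = nn.Conv1d(vocab_size, input_size, kernel_size, padding=kernel_size // 2)
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(input_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [parallel_chains, seq_len, vocab_size] -> [parallel_chains, vocab_size, seq_len]
        h = torch.relu(self.conv(x.transpose(1, 2)))
        # Length max-pooling: reduce the sequence axis, leaving [parallel_chains, input_size].
        h = h.max(dim=-1).values
        # Linear head gives one scalar per chain: [parallel_chains].
        return self.linear(self.dropout(h)).squeeze(-1)


# Four random sequences of length 30 over a 20-letter alphabet, one-hot encoded.
tokens = torch.randint(0, 20, (4, 30))
x = F.one_hot(tokens, num_classes=20).float()

model = OneHotCNNSketch(vocab_size=20, kernel_size=5, input_size=32)
output = model(x)
print(output.shape)  # torch.Size([4])
```

With four chains in the batch, the result has shape [4], which matches the [parallel_chains] output documented for forward.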


EVCouplings Potts

evo_prot_grad.models.potts.EVCouplings

Bases: nn.Module

EVCouplings Potts model implemented in PyTorch.

Represents a Potts model with a single coupling matrix and a single bias vector for a specific region (i.e., subsequence) of the wild-type protein sequence under directed evolution.

forward(x: torch.Tensor) -> torch.Tensor

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x | torch.Tensor | one-hot tensor of shape [parallel_chains, seq_len, vocab_size] | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| hamiltonian | torch.Tensor | shape [parallel_chains] |
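
To make the input and output shapes concrete, here is a minimal sketch of how a Potts Hamiltonian can be evaluated on one-hot input. It is not the EVCouplings class itself: the function potts_hamiltonian, the random parameters h and J, the sign convention, and the choice to count each position pair once (i < j) are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

parallel_chains, seq_len, vocab_size = 3, 6, 20

# Hypothetical parameters; a fitted model would supply real values instead.
h = torch.randn(seq_len, vocab_size)                       # bias per position and residue
J = torch.randn(seq_len, seq_len, vocab_size, vocab_size)  # coupling per position pair and residue pair


def potts_hamiltonian(x: torch.Tensor, h: torch.Tensor, J: torch.Tensor) -> torch.Tensor:
    """Return one value per chain for one-hot x of shape [parallel_chains, seq_len, vocab_size]."""
    # Bias term: the one-hot mask selects h[i, a_i] at each position i.
    fields = torch.einsum("bia,ia->b", x, h)
    # Coupling term: the two masks select J[i, j, a_i, a_j] for every position pair.
    pair_energy = torch.einsum("bia,ijac,bjc->bij", x, J, x)
    # Keep only pairs with i < j so that each pair is counted once (an assumption).
    upper = torch.triu(torch.ones(x.shape[1], x.shape[1]), diagonal=1)
    couplings = (pair_energy * upper).sum(dim=(1, 2))
    return fields + couplings


tokens = torch.randint(0, vocab_size, (parallel_chains, seq_len))
x = F.one_hot(tokens, num_classes=vocab_size).float()

hamiltonian = potts_hamiltonian(x, h, J)
print(hamiltonian.shape)  # torch.Size([3])
```

Because the one-hot tensor only selects entries of h and J, the whole expression stays differentiable with respect to x.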