ample.util.sequence_util module¶

class Sequence(fasta=None, pdb=None, canonicalise=False)[source]¶

Bases: object

A class to handle a fasta file

add_pdb_data(pdbin)[source]¶

Add the resseq (residue sequence number) information from a PDB file to this object when the sequence information has already been read from a FASTA file, for example from an alignment. The sequence is assumed to contain gaps (-), and the letters may be upper or lower case. Currently this only supports adding data from single-chain PDB files.

canonicalise()[source]¶

Reformat the fasta file

Needed because Rosetta has problems reading FASTA files. For Rosetta to read the file, the sequence must contain no spaces, the name must be 4 characters followed by an underscore (ABCD_), everything must be uppercase, and the file must end with a line ending. That line ending must be Unix-style (LF): a FASTA file created on Windows has line endings that Rosetta does not recognise.
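The formatting rules above can be illustrated with a minimal sketch. This is not the code behind canonicalise(); the function name `canonicalise_fasta_text`, its `name` parameter, and the choice to write the sequence on a single line are assumptions made for illustration only.

```python
def canonicalise_fasta_text(text, name="ABCD"):
    """Return a single-sequence FASTA string following the rules above.

    Hypothetical helper: not part of ample.util.sequence_util.
    """
    if len(name) != 4:
        raise ValueError("name must be exactly 4 characters")
    # splitlines() copes with both Windows (CRLF) and Unix (LF) input
    lines = [line.strip() for line in text.splitlines()]
    # Keep only sequence rows: drop header lines and blank lines
    rows = [line for line in lines if line and not line.startswith(">")]
    # Remove any internal whitespace and force uppercase
    sequence = "".join("".join(row.split()) for row in rows).upper()
    # Header is the 4-character name plus an underscore; end with a Unix newline
    return ">{0}_\n{1}\n".format(name.upper(), sequence)


example = ">my protein\r\nmk tay\r\niakq r\r\n"
print(repr(canonicalise_fasta_text(example, name="1abc")))
# prints '>1ABC_\nMKTAYIAKQR\n'
```

When writing the result to disk on Windows, opening the file in binary mode (or with `newline="\n"` in Python 3) keeps the line endings from being converted back to CRLF.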

fasta_str(pdbname=False)[source]¶

from_fasta(fasta_file, canonicalise=True, resseq=True)[source]¶

from_pdb(pdbin)[source]¶

length(seq_no=0)[source]¶

mutate_residue(from_aa, res_seq, to_aa, seq_id=0)[source]¶

Change residue type from_aa at position res_seq to be of type to_aa

Note: res_seq positions start from 1 not zero!

Parameters:
from_aa (str) – Single-letter amino acid
res_seq (int) – Residue sequence number
to_aa (str) – Single-letter amino acid
seq_id (int) – The index of the sequence to operate on (counting from zero)
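The mix of indexing conventions (res_seq counts from 1, seq_id counts from 0) is easy to get wrong, so a small standalone sketch may help. It works on a plain list of sequence strings instead of a Sequence object, and both the range check and the check that the existing residue matches from_aa are assumptions about reasonable behaviour, not a description of the actual method.

```python
def mutate_residue(sequences, from_aa, res_seq, to_aa, seq_id=0):
    """Replace residue from_aa at 1-based position res_seq with to_aa.

    Hypothetical standalone version operating on a list of strings.
    """
    seq = sequences[seq_id]  # seq_id counts from zero
    if not 1 <= res_seq <= len(seq):
        raise IndexError("res_seq {0} is outside 1..{1}".format(res_seq, len(seq)))
    index = res_seq - 1  # convert the 1-based position to a 0-based index
    if seq[index] != from_aa:
        # Assumed safety check: refuse to mutate an unexpected residue
        raise ValueError(
            "expected {0} at position {1}, found {2}".format(from_aa, res_seq, seq[index])
        )
    sequences[seq_id] = seq[:index] + to_aa + seq[index + 1:]
    return sequences[seq_id]


seqs = ["MKTAY"]
print(mutate_residue(seqs, "K", 2, "R"))  # prints MRTAY
```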

numSequences()[source]¶

pirStr(seqNo=0)[source]¶

Return a canonical MAXWIDTH PIR representation of the file as a line-separated string

sequence(seq_no=0)[source]¶

toPir(input_fasta, output_pir=None)[source]¶

Take a fasta file and output the corresponding PIR file
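As a rough illustration of a FASTA-to-PIR conversion, the sketch below builds a PIR (NBRF) record from FASTA text: a `>P1;code` line, a description line, then the sequence terminated by `*`. It does not reproduce toPir() itself. The function name `fasta_to_pir`, the `MAXWIDTH` value of 80, the use of the first word of the FASTA header as the PIR code, and the single-sequence restriction are all assumptions.

```python
MAXWIDTH = 80  # assumed line width; the value used by the module is not documented here


def fasta_to_pir(fasta_text, width=MAXWIDTH):
    """Convert single-sequence FASTA text to a PIR-formatted string.

    Hypothetical helper: not part of ample.util.sequence_util.
    """
    lines = [line.strip() for line in fasta_text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    title = lines[0][1:].strip()
    # Use the first word of the header as the PIR code (an assumption)
    code = title.split()[0] if title else "UNKNOWN"
    # PIR sequences are terminated with an asterisk
    sequence = "".join(lines[1:]).upper() + "*"
    # Break the sequence into fixed-width rows
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">P1;" + code, title] + rows) + "\n"


print(fasta_to_pir(">1abc test protein\nmktay\niakqr\n", width=6))
```

With `width=6` the ten-residue example is split into the rows `MKTAYI` and `AKQR*`, preceded by `>P1;1abc` and the description line `1abc test protein`.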

write_fasta(fasta_file, pdbname=False)[source]¶

chain_data(chain)[source]¶

chain_sequence(chain)[source]¶

process_fasta(amoptd, canonicalise=False)[source]¶

sequence(pdbin)[source]¶

sequence_data(pdbin)[source]¶