Graphs

Types

PanGraph.Graphs.DelMap (Type)
DelMap = Dict{Int,Int}

A sparse array of deletion events relative to a consensus. The key is the locus (inclusive) of the deletion; the value is the length.
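A minimal sketch of how such a map could be applied to a consensus sequence. The helper apply_deletions is hypothetical and not part of PanGraph. It assumes 1-based loci and that a deletion of length n at locus k removes positions k through k+n-1.

```julia
# Alias copied from the definition above so the sketch is self-contained.
DelMap = Dict{Int,Int}

# Hypothetical helper (not a PanGraph function).
# Assumption: loci are 1-based and a deletion at locus k of length n
# removes consensus positions k, k+1, ..., k+n-1.
function apply_deletions(consensus::Vector{UInt8}, dels::Dict{Int,Int})
    keep = trues(length(consensus))          # mask of positions that survive
    for (locus, len) in dels
        stop = min(locus + len - 1, length(consensus))
        keep[locus:stop] .= false
    end
    return consensus[keep]                   # new vector; input is untouched
end

consensus = Vector{UInt8}("ACGTACGT")
dels = DelMap(2 => 3)                        # drop "CGT" starting at position 2
result = String(apply_deletions(consensus, dels))
```

Only loci that carry a deletion appear in the map, so a sequence identical to the consensus costs no storage.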

PanGraph.Graphs.Graph (Type)
struct Graph
    block    :: Dict{String, Block}
    sequence :: Dict{String, Path}
end

Representation of a multiple sequence alignment. Alignments of homologous sequences are stored as blocks. A genome is stored as a path, i.e. a list of blocks.
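The block/path layout can be illustrated with simplified stand-in types. ToyBlock, ToyPath, ToyGraph and genome below are hypothetical names for this sketch only. The real Block and Path types are not reproduced here, and per-sequence mutations are ignored.

```julia
# Simplified stand-ins for illustration; these are NOT PanGraph's types.
struct ToyBlock
    consensus::Vector{UInt8}
end

struct ToyPath
    blocks::Vector{String}    # block names in traversal order
    circular::Bool
end

struct ToyGraph
    block::Dict{String,ToyBlock}
    sequence::Dict{String,ToyPath}
end

# Rebuild a genome by concatenating the consensus of each block on its path.
# `init=UInt8[]` guarantees a fresh array, so String() never consumes a block.
function genome(G::ToyGraph, name::String)
    path = G.sequence[name]
    bytes = reduce(vcat, (G.block[b].consensus for b in path.blocks); init=UInt8[])
    return String(bytes)
end

G = ToyGraph(
    Dict("b1" => ToyBlock(Vector{UInt8}("ACGT")),
         "b2" => ToyBlock(Vector{UInt8}("TTGA"))),
    Dict("isolateA" => ToyPath(["b1", "b2"], false),
         "isolateB" => ToyPath(["b2"], false)),
)
genome_a = genome(G, "isolateA")
genome_b = genome(G, "isolateB")
```

Both genomes share block b2, which is stored once and referenced from two paths.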

PanGraph.Graphs.Graph (Method)
Graph(name::String, sequence::Array{UInt8}; circular=false)

Creates a singleton graph from sequence. name is assumed to be a unique identifier. If circular is unspecified, the sequence is assumed to be linear.

PanGraph.Graphs.InsMap (Type)
InsMap = Dict{Tuple{Int,Int},Array{UInt8,1}}

A sparse array of insertion sequences relative to a consensus. The key is the tuple (locus, offset), where locus is the consensus position after which the insertion occurs; the value is the inserted sequence.

PanGraph.Graphs.SNPMap (Type)
SNPMap = Dict{Int,UInt8}

A sparse array of single nucleotide polymorphisms relative to a consensus. The key is the locus of the mutation; the value is the modified nucleotide.
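A minimal sketch of applying such a map, assuming 1-based loci. The helper apply_snps is hypothetical and not part of PanGraph.

```julia
# Alias copied from the definition above so the sketch is self-contained.
SNPMap = Dict{Int,UInt8}

# Hypothetical helper (not a PanGraph function).
# Assumption: loci are 1-based positions on the consensus.
function apply_snps(consensus::Vector{UInt8}, snps::Dict{Int,UInt8})
    seq = copy(consensus)        # leave the shared consensus untouched
    for (locus, nuc) in snps
        seq[locus] = nuc
    end
    return seq
end

consensus = Vector{UInt8}("ACGT")
snps = SNPMap(3 => UInt8('T'))   # G -> T at position 3
mutated = String(apply_snps(consensus, snps))
```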


Functions

PanGraph.Graphs.consistency_check (Method)
consistency_check(G::Graph)

Performs final consistency checks on the graph. The checks currently implemented are:

  • a one-to-one correspondence between gaps and insertion positions in block alignments.

PanGraph.Graphs.detransitive! (Method)
detransitive!(G::Graph)

Find and remove all transitive edges within the given graph. A transitive chain of edges is defined to be unambiguous: all sequences must enter on one edge and leave on another. Thus, this will not perform paralog splitting.
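The "unambiguous" condition can be sketched on toy paths. The function mergeable_pairs is hypothetical and only identifies candidate block pairs; it does not merge them. It assumes linear, single-strand paths given as lists of block names, which is a simplification of what the real function must handle.

```julia
# Hypothetical sketch, not PanGraph's implementation.
# Assumptions: paths are linear, strand is ignored, "" marks a path end.
function mergeable_pairs(paths::Dict{String,Vector{String}})
    succ = Dict{String,Set{String}}()   # block => everything seen right after it
    pred = Dict{String,Set{String}}()   # block => everything seen right before it
    for path in values(paths)
        for (i, b) in enumerate(path)
            nxt = i < length(path) ? path[i+1] : ""
            prv = i > 1 ? path[i-1] : ""
            push!(get!(succ, b, Set{String}()), nxt)
            push!(get!(pred, b, Set{String}()), prv)
        end
    end
    found = Tuple{String,String}[]
    for (a, after) in succ
        length(after) == 1 || continue        # a must always leave the same way
        b = first(after)
        (b == "" || b == a) && continue
        # b must always be entered from a, in every sequence
        pred[b] == Set([a]) && push!(found, (a, b))
    end
    return sort(found)
end

paths = Dict("x" => ["A", "B", "C"], "y" => ["A", "B", "D"])
candidates = mergeable_pairs(paths)
```

Here A is always followed by B and B is always preceded by A, so the edge between them is transitive. B leaves to either C or D, so those edges are ambiguous and stay untouched.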

PanGraph.Graphs.finalize! (Method)
finalize!(G::Graph)

Compute the position of the breakpoints for each homologous alignment across all sequences within graph G. Intended to be run after multiple sequence alignment is complete.

PanGraph.Graphs.graphs (Method)
graphs(io::IO; circular=false)

Parse a fasta file from stream io and return an array of singleton graphs. If circular is unspecified, all genomes are assumed to be linear.

PanGraph.Graphs.keeponly! (Method)
keeponly!(G::Graph, names::String...)

Remove all sequences from graph G except those passed as the variadic parameter names. This marginalizes the graph, i.e. it returns the subgraph that contains only the isolates listed in names.
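A minimal sketch of marginalization on a toy sequence table, keeping only the named isolates. The function keep_only! and the use of plain lists of block names as paths are assumptions for illustration, not PanGraph's implementation.

```julia
# Hypothetical sketch; paths are modelled as lists of block names.
function keep_only!(sequence::Dict{String,Vector{String}}, names::String...)
    # collect the keys first so the Dict is not mutated while iterating it
    for name in collect(keys(sequence))
        name in names || delete!(sequence, name)
    end
    return sequence
end

paths = Dict(
    "isolateA" => ["b1", "b2"],
    "isolateB" => ["b2"],
    "isolateC" => ["b3"],
)
keep_only!(paths, "isolateA", "isolateC")
remaining = sort(collect(keys(paths)))
```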

PanGraph.Graphs.marshal_fasta (Method)
marshal_fasta(io::IO, G::Graph; opt=nothing)

Serialize graph G in fasta format to the output stream io. Importantly, this will only serialize the consensus sequence of each block and not the full multiple sequence alignment.

opt is currently ignored. It is kept so that the signature matches the other marshal functions.

PanGraph.Graphs.marshal_json (Method)
marshal_json(io::IO, G::Graph; opt=nothing)

Serialize graph G in json format to the output stream io. This is the main storage/export format for PanGraph. Currently it is the only format from which an in-memory pangraph can be reconstructed.

opt is currently ignored. It is kept so that the signature matches the other marshal functions.

PanGraph.Graphs.prune! (Method)
prune!(G::Graph)

Remove all blocks from graph G that are not currently used by any extant sequence. Internal function used during guide tree alignment.
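A minimal sketch of the idea on toy data. The function prune_unused! is hypothetical; blocks are modelled as name => consensus string and paths as lists of block names, both of which are assumptions.

```julia
# Hypothetical sketch, not PanGraph's implementation.
function prune_unused!(block::Dict{String,String},
                       sequence::Dict{String,Vector{String}})
    used = Set{String}()
    for path in values(sequence)
        union!(used, path)                    # every block visited by some path
    end
    for name in collect(keys(block))          # collect: safe to delete below
        name in used || delete!(block, name)
    end
    return block
end

blocks = Dict("b1" => "ACGT", "b2" => "TTGA", "b3" => "GGCC")
paths  = Dict("isolateA" => ["b1", "b2"])
prune_unused!(blocks, paths)
left = sort(collect(keys(blocks)))
```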

PanGraph.Graphs.purge! (Method)
purge!(G::Graph)

Remove all blocks from paths found in graph G that have zero length. Internal function used during guide tree alignment.
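A minimal sketch on toy data. The function purge_empty! is hypothetical, and "zero length" is taken here to mean an empty consensus string, which may differ from the criterion PanGraph actually uses.

```julia
# Hypothetical sketch, not PanGraph's implementation.
# Assumption: a block has "zero length" when its consensus string is empty.
function purge_empty!(block::Dict{String,String},
                      sequence::Dict{String,Vector{String}})
    for path in values(sequence)
        filter!(name -> !isempty(block[name]), path)   # edits each path in place
    end
    return sequence
end

blocks = Dict("b1" => "ACGT", "b2" => "", "b3" => "GG")
paths  = Dict("isolateA" => ["b1", "b2", "b3"], "isolateB" => ["b2"])
purge_empty!(blocks, paths)
```

The blocks themselves stay in the table; only the references from paths are removed.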

PanGraph.Graphs.realign! (Method)
realign!(G::Graph; accept)

Realign blocks contained within graph G. Usage of this function requires MAFFT to be on the system PATH. accept should be a function that returns true on blocks you wish to realign. By default, all blocks are realigned.
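The accept keyword follows a common predicate pattern, sketched below on toy data. The function blocks_to_realign is hypothetical and its predicate receives a consensus string, whereas the real accept is described as being called on blocks. No alignment is performed and MAFFT is not invoked.

```julia
# Hypothetical sketch of a predicate keyword with an accept-everything default.
function blocks_to_realign(block::Dict{String,String}; accept = seq -> true)
    return sort([name for (name, seq) in block if accept(seq)])
end

blocks = Dict("b1" => "ACGT", "b2" => "ACGTACGT")

everything = blocks_to_realign(blocks)                                   # default
long_only  = blocks_to_realign(blocks; accept = seq -> length(seq) >= 6) # filtered
```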

PanGraph.Graphs.test (Function)
test(path)

Align all sequences found in the fasta file at path into a pangraph. Verifies that, after the alignment is complete, all sequences are correctly reconstructed.

PanGraph.Graphs.unmarshal (Method)
unmarshal(io::IO)

Deserialize the json-formatted input stream io and return the resulting Graph.
