evidence module

class mavis.validate.evidence.GenomeEvidence(*pos, **kwargs)

Bases: mavis.validate.base.Evidence

compute_fragment_size(read, mate=None)

generate_window(breakpoint)

Given an input breakpoint, uses the current evidence settings to determine an appropriate window (range) in which to search for supporting reads.

Parameters:
	breakpoint (Breakpoint) – the breakpoint we are generating the evidence window for
	read_length (int) – the read length
	call_error (int) – a buffer added to the calculations; increase it if confidence in the breakpoint calls is low
Returns:	the range from which reads should be read from the BAM file when looking for evidence for this event
Return type:	Interval
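The windowing idea described above can be sketched as follows. This is an illustrative example only, not the MAVIS implementation: the function name `sketch_genome_window`, the stand-in `Interval` type, the `max_fragment_size` argument, the `'L'`/`'R'` orientation codes, and the default `call_error` value are all assumptions made for the sketch.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed, 1-based genomic interval (hypothetical stand-in type)."""
    start: int
    end: int


def sketch_genome_window(bp_start, bp_end, orient, read_length,
                         max_fragment_size, call_error=10):
    """Return the region to scan for reads supporting a breakpoint.

    orient is 'L' if the retained sequence lies to the left of the
    breakpoint, 'R' if it lies to the right, anything else if unknown.
    """
    # Default: allow a full fragment plus the call-error buffer on both sides.
    start = bp_start - max_fragment_size - call_error + 1
    end = bp_end + max_fragment_size + call_error - 1
    if orient == 'L':
        # Supporting pairs sit to the left; beyond the breakpoint only a
        # read that overlaps it can extend, so one read length is enough.
        end = bp_end + call_error + read_length - 1
    elif orient == 'R':
        start = bp_start - call_error - read_length + 1
    # Clamp both ends to the first base of the reference.
    return Interval(max(1, start), max(1, end))


window = sketch_genome_window(1000, 1000, 'L', read_length=100,
                              max_fragment_size=500)
print(window)  # Interval(start=491, end=1109)
```

Raising `call_error` widens the window symmetrically, which is the behaviour the parameter description suggests for low-confidence calls.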

class mavis.validate.evidence.TranscriptomeEvidence(annotations, *pos, **kwargs)

Bases: mavis.validate.base.Evidence

compute_fragment_size(read, mate)

distance(start, end, strand='?', chrom=None)

Given the current list of transcripts, computes the putative exonic/intergenic distance between two genomic positions. Intronic positions are ignored.

Intergenic calculations are performed only if the exon-only calculation fails.
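A minimal sketch of an intron-skipping distance, for illustration only. The helper names `exonic_distance` and `sketch_distance` are hypothetical, exons are modelled as plain `(first, last)` tuples rather than MAVIS annotation objects, and the fallback to plain genomic distance is an assumption standing in for the intergenic calculation.

```python
def exonic_distance(start, end, exons):
    """Count the steps from start to end, skipping intronic bases.

    exons: sorted, non-overlapping (first, last) pairs, 1-based inclusive.
    Returns None if either position is not inside an exon.
    """
    def in_exon(pos):
        return any(first <= pos <= last for first, last in exons)

    if start > end:
        start, end = end, start
    if not (in_exon(start) and in_exon(end)):
        return None
    # Number of exonic bases in the closed range [start, end].
    covered = sum(
        max(0, min(end, last) - max(start, first) + 1)
        for first, last in exons
    )
    # The distance is the number of steps: one less than the bases covered.
    return covered - 1


def sketch_distance(start, end, transcripts):
    """Collect the exonic distance per transcript; fall back if none apply."""
    found = {
        d for d in (exonic_distance(start, end, ex) for ex in transcripts)
        if d is not None
    }
    # Assumed fallback: plain genomic distance when no transcript applies.
    return sorted(found) if found else [abs(end - start)]


exons = [(100, 199), (300, 399)]
print(exonic_distance(150, 350, exons))  # 100 (the 100-base intron is skipped)
print(sketch_distance(10, 50, [exons]))  # [40]
```

Because different transcripts can skip different introns, the sketch returns one candidate distance per transcript rather than a single value.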

exon_boundary_shift_cigar(read)

Given an input read, converts deletions to N when the deletion matches the exon boundaries. Also shifts alignments to correspond to the exon boundaries where possible.
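The deletion-to-N conversion can be illustrated with a small stand-alone sketch. It is not the library's code: the function name `deletions_to_introns` and the intron set are hypothetical, the read is reduced to a start position plus a list of `(operation, length)` tuples using the numeric CIGAR codes from the SAM/BAM format, and the boundary-shifting step is not covered.

```python
# CIGAR operation codes as defined by the SAM/BAM format.
CIGAR_M, CIGAR_I, CIGAR_D, CIGAR_N, CIGAR_S = 0, 1, 2, 3, 4
CIGAR_EQ, CIGAR_X = 7, 8

# Operations that advance the position on the reference.
REF_CONSUMING = {CIGAR_M, CIGAR_D, CIGAR_N, CIGAR_EQ, CIGAR_X}


def deletions_to_introns(ref_start, cigar, introns):
    """Relabel deletions that exactly span an intron as skipped regions (N).

    ref_start: 0-based reference position of the first aligned base.
    cigar: list of (operation, length) tuples.
    introns: set of (start, end) pairs, 0-based and half-open.
    """
    pos = ref_start
    result = []
    for op, length in cigar:
        if op == CIGAR_D and (pos, pos + length) in introns:
            op = CIGAR_N
        result.append((op, length))
        if op in REF_CONSUMING:
            pos += length
    return result


cigar = [(CIGAR_M, 50), (CIGAR_D, 100), (CIGAR_M, 50)]
print(deletions_to_introns(150, cigar, {(200, 300)}))
# [(0, 50), (3, 100), (0, 50)]
```

A deletion that is off by even one base from the intron is left unchanged here, which is where shifting the alignment onto the exon boundary would come in.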

generate_window(breakpoint)

Given an input breakpoint, uses the current evidence settings to determine an appropriate window (range) in which to search for supporting reads.

Parameters:
	breakpoint (Breakpoint) – the breakpoint we are generating the evidence window for
	annotations (dict of str and list of Gene) – the set of reference annotations: genes, transcripts, etc.
	read_length (int) – the read length
	median_fragment_size (int) – the median insert size
	call_error (int) – a buffer added to the calculations; increase it if confidence in the breakpoint calls is low
	stdev_fragment_size – the standard deviation away from the median for regular (non STV) read pairs
Returns:	the range from which reads should be read from the BAM file when looking for evidence for this event
Return type:	Interval

min_cds_shift(pos, strand='?', chrom=None)

standardize_read(read)

traverse(start, distance, direction, strand='?', chrom=None)

Given a genomic position and a distance, uses the input transcripts to compute all possible genomic end positions at that distance when intronic positions are ignored.

Parameters:
	start (int) – the genomic start position
	distance (int) – the number of exonic/intergenic units to traverse
	direction (ORIENT) – the direction in which to traverse, relative to the positive/forward reference strand
	transcripts (`list` of `PreTranscript`) – list of transcripts to use
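The traversal can be sketched for a single transcript as below. This is an assumption-laden illustration rather than the MAVIS method: `traverse_exons` is a hypothetical helper, exons are plain `(first, last)` tuples, direction is encoded as `+1`/`-1` instead of an `ORIENT` value, and walking into intergenic sequence is not handled (the sketch returns `None` instead).

```python
def traverse_exons(start, distance, direction, exons):
    """Walk `distance` exonic steps from `start`, jumping over introns.

    direction: +1 towards higher coordinates, -1 towards lower ones.
    exons: sorted, non-overlapping (first, last) pairs, 1-based inclusive.
    Returns the end position, or None if `start` is not exonic or the
    walk runs past the last exon.
    """
    ordered = exons if direction > 0 else exons[::-1]
    pos, remaining = None, distance
    for first, last in ordered:
        if pos is None:
            if not first <= start <= last:
                continue  # have not reached the exon containing start yet
            pos = start
        else:
            # Stepping across the intron into the next exon costs one step.
            pos = first if direction > 0 else last
            remaining -= 1
        edge = last if direction > 0 else first
        room = abs(edge - pos)
        if remaining <= room:
            return pos + direction * remaining
        remaining -= room
        pos = edge
    return None


exons = [(100, 199), (300, 399)]
print(traverse_exons(150, 100, +1, exons))  # 350
print(traverse_exons(350, 100, -1, exons))  # 150
```

Running the same walk over every transcript and collecting the distinct results, for example `{traverse_exons(150, 100, +1, ex) for ex in transcripts}`, gives the set of possible end positions the description refers to.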