Configuring VCF tracks
 

Genome Browser VCF tracks may be configured in a variety of ways to highlight different aspects of the displayed information. For more information on creating VCF custom tracks, see the VCF custom track documentation.

If the VCF file contains genotype columns for at least two samples (four haplotypes), then a haplotype sorting display can be configured:

  • Enable Haplotype sorting display: If checked, then each sample's phased and/or homozygous genotypes are split into haplotypes, clustered by similarity around a central variant, and sorted for display by their position in the clustering tree. The tree (or as much of the tree as space allows) is drawn in the label area next to the track image. Leaf clusters, in which all haplotypes are identical (at least for the span of variants used in clustering), are colored purple. Each variant is drawn as a vertical column, using color to distinguish between reference alleles and alternate alleles of the horizontally running haplotypes. If unchecked, then the display is the same as for VCF without genotypes: a stacked bar graph of the top two alleles, showing the proportion of alleles if allele counts are available. This is checked (enabled) by default. The following options are applicable only when enabled.
    • Haplotype sorting order: Haplotypes are sorted using a distance function that uses a central variant; differences between haplotypes are penalized with weights that decrease for each successive variant away from the central variant. By default, the median variant in the window is used. Clicking on a variant in the display gives you the option to always use that variant as the central variant when it is in the current view.
    • Haplotype coloring scheme: There are three ways that reference and alternate alleles can be colored:
      • By default, the reference allele is invisible and the alternate allele is black. When multiple haplotypes must be combined into the same pixel row, grayscale is used to shade according to the proportions of reference and alternate alleles. The central variant has a thin purple outline. Extra pixel rows at the top and bottom show the locations of variants in case they are hard to see, which can happen when the invisible reference allele is the major allele. Variants used in clustering have purple marks in these rows; variants outside the clustered regions have black marks.
      • The reference allele is blue and the alternate allele is red. Purple indicates a mix of reference and alternate alleles. The central variant has a thick black outline.
      • Both alleles are colored using the same color scheme as when there are no genotypes: A is red, C is blue, G is green and T is magenta. Gray indicates a mix of reference and alternate alleles. The central variant has a thick black outline.
      In all coloring modes, if some alleles in a haplotype are undefined, a pale yellowish color is used for those alleles.
    • Haplotype clustering leaf shape: Leaf clusters are collections of identical haplotypes. By default, they are drawn as open triangles (<). They can also be displayed as open rectangles ([).
    • Haplotype sorting display height: Height in pixels of the haplotype sorting display. If this is less than the number of haplotypes (2 * the number of genotype columns), some horizontal pixel rows must represent multiple haplotypes; differing haplotypes' colors will be combined according to the selected coloring scheme.
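
The center-weighted distance described under "Haplotype sorting order" can be illustrated with a short Python sketch. This is not the Genome Browser's own code: the function names, the geometric decay factor of 0.5, and the 0/1 allele encoding are all assumptions chosen for illustration, and the sketch ignores undefined alleles. It shows only that a mismatch at the central variant costs more than a mismatch farther away.

```python
from typing import Optional, Sequence


def median_index(n_variants: int) -> int:
    """Index of the median variant in a window (hypothetical helper)."""
    return n_variants // 2


def haplotype_distance(
    hap_a: Sequence[int],
    hap_b: Sequence[int],
    center: Optional[int] = None,
    decay: float = 0.5,  # assumed weighting; the real weights are not documented here
) -> float:
    """Sum a penalty for each differing allele; the penalty shrinks with
    each successive variant away from the central variant."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same variants")
    if center is None:
        center = median_index(len(hap_a))  # default: median variant in the window
    return sum(
        (decay ** abs(i - center)
         for i, (a, b) in enumerate(zip(hap_a, hap_b))
         if a != b),
        0.0,
    )


# One allele per variant: 0 = reference, 1 = alternate.
h1 = [0, 0, 1, 0, 0]
h2 = [0, 0, 0, 0, 0]  # differs from h1 at the central variant (index 2)
h3 = [1, 0, 1, 0, 0]  # differs from h1 two variants away from the center

print(haplotype_distance(h1, h2))  # 1.0
print(haplotype_distance(h1, h3))  # 0.25
```

Passing a different center (for example center=0) mimics choosing a specific variant by clicking on it: the same pair of haplotypes can then move closer together or farther apart in the ordering.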

Variants can be filtered out of the display according to several properties:

  • Exclude items with QUAL score less than N: If the checkbox is checked, then all variants whose QUAL column has a non-numeric value (e.g. ".") or a value less than N are excluded from display. By default, the checkbox is not checked and N is 0.
  • Exclude items with these FILTER values: This option appears only if the VCF header defines at least one FILTER code. There is a checkbox for each code defined in the header. If checked, then all variants with that code in the FILTER column are excluded from display. By default, no checkboxes are checked, so all variants are displayed regardless of FILTER column values.
  • Minimum minor allele frequency (if INFO column includes AF or AC+AN): If a variant's INFO field includes AF (alternate allele frequency) or both AC and AN (alternate allele count and total number of alleles), then its minor allele frequency can be compared against this threshold. If the minor allele frequency is less than the threshold, the variant will not be displayed.
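
The three filters above can be sketched in Python as a single predicate over one VCF record. This is an illustrative sketch, not the browser's implementation: the function and parameter names are hypothetical, only the first alternate allele is considered for multi-allelic records, and a variant without AF or AC+AN is assumed to pass the frequency filter.

```python
from typing import Dict, Optional, Set


def parse_info(info_column: str) -> Dict[str, str]:
    """Split a VCF INFO column ("KEY=value;FLAG;...") into a dictionary."""
    fields = {}
    for item in info_column.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def minor_allele_frequency(info: Dict[str, str]) -> Optional[float]:
    """Minor allele frequency from AF, or from AC and AN; None if unavailable.
    Simplification: only the first alternate allele is used."""
    if "AF" in info:
        alt_freq = float(info["AF"].split(",")[0])
    elif "AC" in info and "AN" in info and float(info["AN"]) > 0:
        alt_freq = float(info["AC"].split(",")[0]) / float(info["AN"])
    else:
        return None
    return min(alt_freq, 1.0 - alt_freq)


def keep_variant(
    qual: str,
    filter_column: str,
    info: Dict[str, str],
    min_qual: Optional[float] = None,  # None means the QUAL checkbox is unchecked
    excluded_filters: Set[str] = frozenset(),
    min_maf: float = 0.0,
) -> bool:
    """Return True if the variant would still be displayed."""
    if min_qual is not None:
        try:
            qual_value = float(qual)
        except ValueError:
            return False  # non-numeric QUAL such as "."
        if qual_value < min_qual:
            return False
    # The FILTER column holds semicolon-separated codes.
    if set(filter_column.split(";")) & set(excluded_filters):
        return False
    maf = minor_allele_frequency(info)
    if maf is not None and maf < min_maf:
        return False
    return True


print(keep_variant("50", "PASS", parse_info("AF=0.3")))                     # True
print(keep_variant(".", "PASS", parse_info("AF=0.3"), min_qual=0))          # False
print(keep_variant("50", "q10;s50", {}, excluded_filters={"q10"}))          # False
print(keep_variant("50", "PASS", parse_info("AC=1;AN=100"), min_maf=0.05))  # False
```

With the defaults (no QUAL threshold, no excluded FILTER codes, a minimum frequency of 0), every variant is kept, which mirrors the default state in which nothing is filtered out.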

When you have finished making your configuration changes, click the Submit button to return to the annotation track display page.