Net Format
 

The net file format is used to describe the axtNet data that underlie the net alignment annotations in the Genome Browser. For a detailed description of the methods used to generate these data, refer to the Genome Browser description pages that accompany the downloadable net alignment tracks.

At the beginning of each target species chromosome, a “net” line appears with the format:

net chromName chromSize
Example:
net chr2L 23011544
Here chromName is the target species chromosome name and chromSize is the size of that chromosome. The net line is followed by the fill and gap lines for that chromosome. Each new target chromosome begins with a new net line.
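As a minimal sketch of reading this header line, the Python snippet below splits a net line into its two values. The function name parse_net_line is hypothetical and chosen only for illustration. It assumes the fields are separated by whitespace and that chromSize is a plain integer, as in the example above.

```python
def parse_net_line(line):
    """Split a 'net chromName chromSize' line into (name, size).

    Hypothetical helper: assumes whitespace-separated fields and an
    integer chromosome size.
    """
    tokens = line.split()
    if len(tokens) != 3 or tokens[0] != "net":
        raise ValueError(f"not a net line: {line!r}")
    return tokens[1], int(tokens[2])


chrom_name, chrom_size = parse_net_line("net chr2L 23011544")
print(chrom_name, chrom_size)  # prints: chr2L 23011544
```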

File indentation: Line indentation level represents the parent/child relationship between records and is a necessary part of the net file format. Child records are indented one space from the parent, as seen in the example net file below.

net chr2L 23011544
 fill 6004 3278 chrXR_group3a - 1396397 2164 id 25606 score 23114 ali 782 qDup 576 type top tN 0 qN 0 tR 36 qR 0 tTrf 0 qTrf 0
  gap 6065 2 chrXR_group3a - 1398498 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 6096 1485 chrXR_group3a - 1397572 897 tN 0 qN 0 tR 36 qR 0 tTrf 0 qTrf 0
   fill 6096 513 chrU - 5570675 533 id 48675 score 4435 ali 465 qDup 533 type nonSyn tN 0 qN 0 tR 0 qR 13 tTrf 0 qTrf 0
    gap 6116 8 chrU - 5571188 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6156 5 chrU - 5571156 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6184 3 chrU - 5571133 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6212 18 chrU - 5571106 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6244 9 chrU - 5571092 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6340 2 chrU - 5570996 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6515 3 chrU - 5570771 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 7623 1 chrXR_group3a - 1397530 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 7664 1007 chrXR_group3a - 1397008 482 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
   fill 7664 382 chrXL_group1e - 8262003 506 id 25608 score 10609 ali 364 qDup 506 type nonSyn tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7784 4 chrXL_group1e - 8262361 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7792 3 chrXL_group1e - 8262357 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7921 2 chrXL_group1e - 8262126 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7949 9 chrXL_group1e - 8262092 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 8693 1 chrXR_group3a - 1396985 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
 fill 9833 1251 chrU - 5562980 1239 id 48675 score 10720 ali 1124 qDup 1094 type top tN 0 qN 0 tR 22 qR 88 tTrf 0 qTrf 0
  gap 9966 7 chrU - 5564075 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10015 3 chrU - 5564030 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10088 2 chrU - 5563957 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10101 8 chrU - 5563946 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
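Because nesting is carried entirely by indentation, a reader of the file has to recover parent/child links from the leading spaces. The Python sketch below shows one way this might be done. The names nesting_depth and parent_indices are hypothetical. It assumes one space per level, no blank lines in the input, and uses lines shortened from the example above (the name/value pairs are dropped for brevity).

```python
def nesting_depth(line):
    """Return the number of leading spaces, used here as the nesting level."""
    return len(line) - len(line.lstrip(" "))


def parent_indices(lines):
    """For each line, return the index of its parent line (None for a net line).

    A record's parent is the most recent preceding line that is
    indented less than the record itself.
    """
    parents = []
    stack = []  # indices of the currently open ancestors, shallowest first
    for i, line in enumerate(lines):
        depth = nesting_depth(line)
        del stack[depth:]  # close records at this depth or deeper
        parents.append(stack[-1] if stack else None)
        stack.append(i)
    return parents


# Lines shortened from the example file; optional fields omitted.
sample = [
    "net chr2L 23011544",
    " fill 6004 3278 chrXR_group3a - 1396397 2164",
    "  gap 6065 2 chrXR_group3a - 1398498 0",
    "  gap 6096 1485 chrXR_group3a - 1397572 897",
    "   fill 6096 513 chrU - 5570675 533",
    "    gap 6116 8 chrU - 5571188 0",
    "  gap 7623 1 chrXR_group3a - 1397530 0",
]
print(parent_indices(sample))  # prints: [None, 0, 1, 1, 3, 4, 1]
```

In this output the nonSyn fill on chrU (index 4) has the second gap (index 3) as its parent, and the last gap returns to the top-level fill (index 1).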

Field definitions

The net file consists of 7 fixed fields and a set of optional name/value pair fields. In the descriptions below, target refers to the reference species and query refers to the aligning species.

Fixed fields

  • Class -- Either fill or gap. Fill refers to a portion of a chain.
  • Start in chromosome -- (target species)
  • Size -- (target species)
  • Chromosome name -- (query species)
  • Relative orientation -- between target and query species.
  • Start in chromosome -- (query species)
  • Size -- (query species)
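The seven fixed fields appear in the order listed, so they can be read positionally. The following Python sketch illustrates this. The NetRecord class, its attribute names, and parse_fixed_fields are hypothetical names introduced for illustration and are not part of the format. The sketch assumes whitespace-separated fields and integer starts and sizes.

```python
from dataclasses import dataclass


@dataclass
class NetRecord:
    """The seven fixed fields of a fill or gap line (hypothetical names)."""
    cls: str       # "fill" or "gap"
    t_start: int   # start in chromosome (target species)
    t_size: int    # size (target species)
    q_name: str    # chromosome name (query species)
    strand: str    # relative orientation between target and query
    q_start: int   # start in chromosome (query species)
    q_size: int    # size (query species)


def parse_fixed_fields(line):
    """Parse the first seven whitespace-separated tokens of a fill/gap line."""
    tokens = line.split()
    if len(tokens) < 7 or tokens[0] not in ("fill", "gap"):
        raise ValueError(f"not a fill or gap line: {line!r}")
    return NetRecord(
        tokens[0], int(tokens[1]), int(tokens[2]),
        tokens[3], tokens[4], int(tokens[5]), int(tokens[6]),
    )


record = parse_fixed_fields(
    "  gap 6065 2 chrXR_group3a - 1398498 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0"
)
print(record.q_name, record.t_start, record.t_size)  # prints: chrXR_group3a 6065 2
```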

Optional fields (Name/value pairs)

  • id -- ID of associated chain (gapped alignment), if any.
  • score -- Score of associated chain.
  • ali -- Number of bases in alignments in chain.
  • qFar -- For fill that is on the same chromosome as parent, how far fill is from position predicted by parent. This helps determine if a rearrangement is local or if a duplication is tandem.
  • qOver -- Number of bases overlapping with parent gap on query side. Generally, this will be near zero, except for inverts.
  • qDup -- Number of bases in query region that are used twice or more in net. This helps distinguish between a rearrangement and a duplication.
  • type -- One of the following values:
    • top -- Chain is top-level, not a gap filler.
    • syn -- Chain is on same chromosome and in same direction as parent.
    • inv -- Chain is on same chromosome and in opposite direction from parent.
    • nonSyn -- Chain is on a different chromosome from parent.
  • tN -- Number of unsequenced bases (Ns) on target side.
  • qN -- Number of unsequenced bases on query side.
  • tR -- Number of bases in RepeatMasker masked repeats on target.
  • qR -- Number of bases in RepeatMasker masked repeats on query.
  • tNewR -- Bases in lineage-specific repeats on target.
  • qNewR -- Bases in lineage-specific repeats on query.
  • tOldR -- Bases in repeats predating split on target.
  • qOldR -- Bases in repeats predating split on query.
  • tTrf -- Bases in trf (Tandem Repeat Finder) repeats on target.
  • qTrf -- Bases in trf repeats on query.
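Since the optional fields are name/value pairs that follow the seven fixed fields, they map naturally onto a dictionary. The Python sketch below shows one possible approach. The function name parse_optional_fields is hypothetical. It assumes that every value other than type can be read as an integer, and it keeps any value that cannot as a string rather than failing.

```python
def parse_optional_fields(line):
    """Return the name/value pairs after the seven fixed fields as a dict.

    Hypothetical helper: values are converted to int where possible;
    'type' and anything non-numeric stay as strings.
    """
    tokens = line.split()[7:]  # skip the seven fixed fields
    if len(tokens) % 2 != 0:
        raise ValueError(f"unpaired name/value token in: {line!r}")
    fields = {}
    for name, value in zip(tokens[0::2], tokens[1::2]):
        try:
            fields[name] = int(value)
        except ValueError:
            fields[name] = value
    return fields


fields = parse_optional_fields(
    "   fill 6096 513 chrU - 5570675 533 id 48675 score 4435 ali 465 "
    "qDup 533 type nonSyn tN 0 qN 0 tR 0 qR 13 tTrf 0 qTrf 0"
)
print(fields["type"], fields["score"], fields["qR"])  # prints: nonSyn 4435 13
```

Fields that are absent from a line, such as qFar or qOver in the example file, are simply missing from the dictionary, so callers may prefer fields.get(name) over direct indexing.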