The bigBed format stores annotation items that can be either simple or a
linked collection of exons, much as BED files do.
BigBed files are created initially from BED type files,
using the program bedToBigBed. The
resulting bigBed files are in an indexed binary format. The main advantage of
the bigBed files is that only the portions of the files needed to display a
particular region are transferred to UCSC, so for large data sets bigBed is
considerably faster than regular BED files. The bigBed file remains on
your web accessible server (http, https, or ftp), not on the UCSC server.
Only the portion that is needed
for the chromosomal position you are currently viewing is locally cached as a
"sparse file".
Additional indices can be created for the items in a bigBed file.
These indices can be used to support item search in track hubs. See Example 3 for an example of how to build an additional index.
See
this page for help in selecting the graphing track data format
that is most appropriate for the type of data you have.
Note that the bedToBigBed utility uses a substantial amount of
memory: on the order of one quarter more RAM than the size of the
uncompressed BED input file.
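As a rough planning aid, the rule of thumb above can be turned into a quick estimate. The sketch below assumes an uncompressed input named input.bed (a tiny placeholder file is created if none exists) and multiplies its size by 5/4. The result is an approximation only, not a guarantee of actual memory use.

```shell
# Hypothetical input file; a tiny placeholder is created if none exists.
bed=input.bed
[ -f "$bed" ] || printf 'chr21\t100\t200\n' > "$bed"

# File size in bytes (tr strips the padding some systems add to wc output).
bytes=$(wc -c < "$bed" | tr -d ' ')

# Rule of thumb from the text: about 1/4 more RAM than the uncompressed BED.
estimate=$(( bytes * 5 / 4 ))
echo "input: $bytes bytes, estimated RAM for bedToBigBed: about $estimate bytes"
```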
To create a bigBed track, follow these steps:
- Create a BED format file following the directions
here.
- When converting a BED file to a bigBed file, you are limited to
one track of data in your input file; you must create a separate BED file
for each data track.
- Your BED file must be sorted by chrom then chromStart. You can use
the UNIX sort command to do this:
sort -k1,1 -k2,2n unsorted.bed > input.bed
- This is the file that is referred to as
input.bed in step 5 below.
- Remove any existing 'track' or 'browser' lines from your BED file
so that it contains only data.
- Download the bedToBigBed program from the
directory
of binary utilities.
- Use the fetchChromSizes script from the same
directory
to create the chrom.sizes file for the UCSC database you are working with
(e.g. hg19). Note that this is the file that is referred to as
chrom.sizes in step 5 below.
- Create the bigBed file from your sorted BED file using the bedToBigBed
utility like so:
bedToBigBed input.bed chrom.sizes myBigBed.bb
- Move the newly created bigBed file (myBigBed.bb) to a http,
https, or ftp location.
- If the file URL ends with .bigBed or .bb, you can paste the URL directly
into the custom track management page, click submit
and view in the Genome Browser.
The track name will then be the name of the file. If you
want to configure the track name and descriptions, you will need to create a
track line, as shown below.
- Construct a custom track
using a single
track line.
Note that any of the track attributes listed
here are applicable
to tracks of type bigBed.
The most basic version of the "track" line will look something
like this:
track type=bigBed name="My Big Bed" description="A Graph of Data from My Lab" bigDataUrl=http://myorg.edu/mylab/myBigBed.bb
- Paste this custom track line into the text box in the
custom track management page.
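The file-preparation steps above (removing 'track' and 'browser' lines, then sorting) can be sketched in the shell as follows. The file names unsorted.bed, input.bed, chrom.sizes and myBigBed.bb follow the text; the sample rows are invented for illustration, and setting LC_ALL=C is an added assumption to keep the sort order byte-wise regardless of locale. The conversion itself is attempted only when bedToBigBed and a chrom.sizes file are actually present.

```shell
# Invented sample input: a browser line, a track line, and unsorted rows.
printf '%s\n' \
  'browser position chr21:1-1000' \
  'track name="demo" description="invented sample"' > unsorted.bed
printf 'chr21\t500\t600\titemB\n' >> unsorted.bed
printf 'chr21\t100\t200\titemA\n' >> unsorted.bed
printf 'chr7\t300\t400\titemC\n'  >> unsorted.bed

# Drop 'track' and 'browser' lines, then sort by chrom and chromStart.
grep -v -E '^(track|browser)([[:space:]]|$)' unsorted.bed \
  | LC_ALL=C sort -k1,1 -k2,2n > input.bed

# sort -c exits non-zero if the file is not in the requested order.
LC_ALL=C sort -c -k1,1 -k2,2n input.bed && echo "input.bed is sorted"

# Convert only if the tool and a chrom.sizes file are available.
if command -v bedToBigBed >/dev/null 2>&1 && [ -f chrom.sizes ]; then
  bedToBigBed input.bed chrom.sizes myBigBed.bb
fi
```

Running the sort check before the conversion gives a clearer message than waiting for bedToBigBed to reject an unsorted file.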
The bedToBigBed program can also be run with several additional options.
Some of them, like the -as and -type options, are used in examples
below. A full list of the available options can be seen by running
bedToBigBed by itself with no arguments to display the usage message.
Example One
In this example, you will use an existing bigBed file to create a bigBed
custom track. A bigBed file that contains data on chromosome 21 on the hg19
assembly has been placed on our http server.
You can create a custom track using this bigBed file by pasting the URL
http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb
into the custom track management page, clicking
submit and clicking the chr21 link in the custom track listing.
Alternatively, you can construct a
"track" line that references this file like so:
track type=bigBed name="bigBed Example One" description="A bigBed file" bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb
Include the following "browser" line to ensure that the custom
track opens at the correct position:
browser position chr21:33,031,597-33,041,570
Paste the "browser" line and "track" line into the
custom track management page for the
human assembly hg19 (Feb. 2009), then
press the submit button.
On the following page, press the chr21 link in the custom track
listing to view the bigBed track in the Genome Browser.
Example Two
In this example, you will create your own bigBed file from an existing
BED file.
- Save this BED file
to your machine
(this satisfies steps 1 and 2 above).
- Save this text file to your machine.
It contains the chrom.sizes for the human (hg19) assembly
(this satisfies step 4 above).
- Download the bedToBigBed utility (see step 3).
- Run the utility to create the bigBed output file
(see step 5):
bedToBigBed bedExample.txt hg19.chrom.sizes myBigBed.bb
- Place the bigBed file you just created (myBigBed.bb) on a
web-accessible server (see step 6).
- Paste the URL itself into the Custom Tracks entry form or construct a
"track" line that points to your bigBed file (see step 7).
- Create the custom track on the human assembly hg19 (Feb. 2009), and
view it in the genome browser (see step 8). Note that the original
BED file contains data only on chromosome 21.
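Before running the conversion in this example, it can help to confirm that every chromosome named in the BED file also appears in the chrom.sizes file. The following is a minimal sketch of such a check using awk. It is not part of the UCSC utilities, and the two small files it creates (demo.bed and demo.chrom.sizes) are invented stand-ins for bedExample.txt and hg19.chrom.sizes, with placeholder lengths rather than real hg19 values.

```shell
# Invented stand-ins for the real files (lengths are placeholders).
printf 'chr21\t1000000\nchr22\t900000\n' > demo.chrom.sizes
printf 'chr21\t100\t200\nchr21\t300\t400\nchrUn\t10\t20\n' > demo.bed

# First pass reads chrom.sizes; second pass reports BED chroms not listed there.
missing=$(awk 'NR == FNR { known[$1] = 1; next }
               !($1 in known) && !seen[$1]++ { print $1 }' \
          demo.chrom.sizes demo.bed)

if [ -n "$missing" ]; then
  echo "chromosomes missing from chrom.sizes: $missing"
else
  echo "all chromosomes found in chrom.sizes"
fi
```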
Example Three
In this example, you will create your own bigBed file from a fully-featured
existing BED file that contains the standard BED fields up to and including
the color field (field 9), plus two additional non-standard fields (two
alternate names for each item in the file).
BigBed files can store extra fields in addition to the
predefined BED fields.
If you add extra fields to your bigBed file, you must include
a .as (AutoSQL) format file describing the fields. See
this paper
for information on AutoSQL. There are several sample .as files
here.
This example also demonstrates how to create extra indices on the name field,
and the first of the extra fields to be used for track item search.
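To make the role of the .as file concrete, here is an illustrative sketch of what an AutoSQL description for a bed9+2 file might look like, written out from the shell. It is not the contents of the example .as file used in this walkthrough: the table name, the field comments, and the second extra field name (altName) are assumptions, while geneSymbol matches the field named in the -extraIndex option used in this example.

```shell
# Write an illustrative AutoSQL (.as) description for a bed9+2 file.
# Table name, comments, and the second extra field (altName) are assumptions.
cat > demo.as <<'EOF'
table demoBed9Plus2
"Illustrative bed9+2 table with two alternate names per item"
(
string  chrom;       "Reference sequence chromosome or scaffold"
uint    chromStart;  "Start position of feature on chromosome"
uint    chromEnd;    "End position of feature on chromosome"
string  name;        "Name of item"
uint    score;       "Score"
char[1] strand;      "+ or - for strand"
uint    thickStart;  "Coding region start"
uint    thickEnd;    "Coding region end"
uint    reserved;    "Item color (RGB)"
string  geneSymbol;  "First extra field: alternate name"
string  altName;     "Second extra field: another alternate name"
)
EOF

# Count the field definitions: 9 standard BED fields + 2 extra = 11.
fields=$(grep -c ';' demo.as)
echo "fields described: $fields"
```

The field count should agree with the -type option given to bedToBigBed (bed9+2 means nine standard fields plus two extra ones).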
- Save this BED file
to your machine
(this satisfies steps 1 and 2 above).
- Save this text file to your machine.
It contains the chrom.sizes for the human (hg18) assembly
(this satisfies step 4 above).
- Save this .as file to your machine.
The .as file contains a description of the fields in the BED file.
This is required when the BED file contains a field for color.
- Download the bedToBigBed utility (see step 3).
- Run the utility to create the bigBed output file with an index on the name field and the first extra field
(see step 5):
bedToBigBed -as=bedExample2.as -type=bed9+2 -extraIndex=name,geneSymbol bedExample2.bed hg18.chrom.sizes myBigBed2.bb
- Place the bigBed file you just created (myBigBed2.bb) on a
web-accessible server (see step 6).
- Paste the URL itself into the custom tracks entry form or construct a "track" line that points to your bigBed file
(see step 7). Because this bigBed file includes a field for color,
you must include the itemRgb
attribute in the "track" line. It will look somewhat similar to
this (note that you must insert the correct URL to your bigBed file):
track type=bigBed name="bigBed Example Three" description="A bigBed File with Color and two Extra Fields" itemRgb="On" bigDataUrl=http://yourWebAddress/myBigBed2.bb
- Create the custom track on the human assembly hg18 (Mar. 2006), and
view it in the genome browser (see step 8). Note that the original
BED file contains data only on chromosome 7.
- If you are using the bigBed file in a track hub, then you can use the
additional indices for track item searches. See the setting "searchIndex" in the
Track Database Definition Document
for more information. For example, if you ran bedToBigBed with the
option "-extraIndex=name", you will be able to search only on the name field.
To enable that search, add the line "searchIndex name" to the stanza for your
bigBed in the hub's trackDb.txt file.
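For a track hub, the stanza described above might be written as in the following sketch, which appends it to trackDb.txt from the shell. The track name and labels are placeholders chosen for illustration, and the URL is the same placeholder address used in the custom track line for this example. Only the "searchIndex name" line is taken directly from the text; the remaining settings are a plausible minimal stanza and should be checked against the Track Database Definition Document.

```shell
# Append an illustrative stanza to a hub's trackDb.txt.
# Track name and labels are placeholders; the URL is the placeholder
# address from the custom track line in this example.
cat >> trackDb.txt <<'EOF'

track bigBedExampleThree
bigDataUrl http://yourWebAddress/myBigBed2.bb
shortLabel Example Three
longLabel A bigBed File with Color and two Extra Fields
type bigBed 9 +
itemRgb on
searchIndex name
EOF

# Confirm the search setting is present in the stanza.
grep -n '^searchIndex' trackDb.txt
```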
Sharing Your Data with Others
If you would like to share your bigBed data track with a colleague, learn
how to create a URL by looking at Example 11 on
this page.
Extracting Data from the bigBed Format
Because the bigBed files are indexed binary files, they can be difficult to
extract data from. Consequently, we have developed the following three
programs, both of which are available from the
directory of binary
utilities.
- bigBedToBed — this program converts a bigBed file
to ASCII BED format.
- bigBedSummary — this program extracts summary information
from a bigBed file.
- bigBedInfo — this program prints out information about a
bigBed file.
These programs accept either file names or URLs to files as input.
As with all UCSC Genome Browser programs, simply type the program name at the
command line with no parameters to see the usage statement.
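As an illustration of passing a URL to these programs, the sketch below points bigBedInfo and bigBedToBed at the example file from Example One. The helper function run_if_present and the output name region.bed are hypothetical names introduced here; the helper only runs a program when it is installed. The region options shown for bigBedToBed should be confirmed against its usage statement.

```shell
# URL of the example bigBed file from Example One.
url=http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb

# Hypothetical helper: run a program only if it is installed.
run_if_present() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@" || echo "$1 failed"
  else
    echo "$1 not found; skipping"
  fi
}

# Print general information about the file.
run_if_present bigBedInfo "$url"

# Extract the Example One region to BED (start is 0-based, end is 1-based).
run_if_present bigBedToBed -chrom=chr21 -start=33031596 -end=33041570 \
  "$url" region.bed
```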
Troubleshooting
If you get an error when you run the bedToBigBed program,
it may be because your input BED file has data off the end of a chromosome.
In this case, use the bedClip program
here before the
bedToBigBed program. It will remove the row(s) in your input BED
file that are off the end of a chromosome.
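To see which rows are responsible before clipping, a check along the following lines can be used. This is a minimal awk sketch that illustrates the condition being tested and is not a replacement for bedClip. The files clip.bed and clip.chrom.sizes and their contents are invented, with a placeholder chromosome length of 1000. The bedClip call at the end runs only if the program is installed.

```shell
# Invented sample data: chr21 is given a placeholder length of 1000.
printf 'chr21\t1000\n' > clip.chrom.sizes
printf 'chr21\t100\t200\tok\nchr21\t900\t1200\tpastEnd\n' > clip.bed

# Report the names of rows whose chromEnd exceeds the chromosome length.
offEnd=$(awk 'NR == FNR { size[$1] = $2; next }
              ($1 in size) && ($3 + 0 > size[$1] + 0) { print $4 }' \
         clip.chrom.sizes clip.bed)
echo "rows off the end of a chromosome: $offEnd"

# If bedClip is installed, let it write the cleaned file.
if command -v bedClip >/dev/null 2>&1; then
  bedClip clip.bed clip.chrom.sizes clipped.bed
fi
```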