FAQ
General
- How many samples/reads can PyBamView visualize at once?
Theoretically, there is no limit to the number of samples or reads PyBamView can visualize. However, visualizing more than 5 or so samples can become cumbersome, both because the time to load each view increases and because it is hard to fit alignments for that many samples on one page.
Similarly, PyBamView might have trouble if you try to view too many reads at once. If you have extremely high coverage sequencing of targeted loci (e.g. more than 50-100x), you may want to consider downsampling your reads or removing PCR duplicates before visualizing to improve performance and visualization.
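As a rough illustration of the downsampling idea above, the sketch below thins out SAM-format text lines before they are converted back to BAM. It is not part of PyBamView; the function names, the hashing scheme, and the `seed` parameter are assumptions made for this example. Reads are kept or dropped according to a hash of the read name, so both mates of a pair receive the same decision, and records flagged as duplicates (SAM flag bit 0x400) can be skipped.

```python
import hashlib

DUPLICATE_FLAG = 0x400  # SAM flag bit: PCR or optical duplicate


def keep_read(read_name, fraction, seed="0"):
    """Decide deterministically whether to keep a read (and its mate)."""
    digest = hashlib.md5((seed + ":" + read_name).encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    return int(digest[:8], 16) / 0x100000000 < fraction


def downsample_sam_lines(lines, fraction, drop_duplicates=True, seed="0"):
    """Yield header lines unchanged and a subset of the alignment lines."""
    for line in lines:
        if line.startswith("@"):  # header lines are always kept
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if drop_duplicates and flag & DUPLICATE_FLAG:
            continue
        if keep_read(name, fraction, seed):
            yield line
```

For real data, a dedicated tool such as samtools or Picard is likely to be faster; the sketch only shows the logic.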
BAM files
- How does PyBamView determine which samples are present in BAM files?
PyBamView uses read groups to determine which samples to display. Read groups that share the same sample (SM) field are treated as coming from the same sample, even if that sample's reads span multiple BAM files. Note that not all aligners annotate read groups automatically. If yours does not, you can add read group information with Picard's AddOrReplaceReadGroups tool.
This page on the GATK website answers common questions about formatting BAM files.
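To check which samples a BAM header declares, the grouping described above can be sketched as follows. The function name is hypothetical and this is not PyBamView's own code; it assumes you already have the header as text (for example, as printed by a tool that dumps the SAM header) and simply collects read group IDs by their SM tag.

```python
from collections import defaultdict


def samples_from_header(header_text):
    """Map each sample name (SM) to the read group IDs (ID) that carry it."""
    samples = defaultdict(list)
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Each field after "@RG" has the form TAG:VALUE.
        tags = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        if "SM" in tags:
            samples[tags["SM"]].append(tags.get("ID", ""))
    return dict(samples)
```

An empty result suggests that the file has no read groups with an SM field, in which case read group information would need to be added first.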
- Are clipped bases displayed?
No, all clipped bases (soft and hard clipped) are hidden from display.
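In CIGAR terms, clipped bases are the ones covered by S (soft clip) and H (hard clip) operations. The sketch below is only an illustration of that definition, not PyBamView's implementation; the function name is made up for this example. It removes the clipping operations from a CIGAR string and reports how many bases they covered.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def split_clipping(cigar):
    """Return (CIGAR without S/H operations, number of clipped bases)."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    clipped = sum(length for length, op in ops if op in "SH")
    kept = "".join(f"{length}{op}" for length, op in ops if op not in "SH")
    return kept, clipped
```

For a read with CIGAR 5S90M5H, only the 90 aligned bases would remain after removing the clipped portions.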
- Why does my called INDEL not show up in PyBamView?
In some cases, an INDEL called by a genotyping algorithm will not show up in PyBamView. This can happen, for example, with GATK's HaplotypeCaller (HC). HC performs local re-assembly of reads but normally outputs only a VCF. It may therefore call an INDEL that the aligner did not pick up, so the INDEL does not appear in the original BAM.
One solution is to use HaplotypeCaller's --bamOutput option, and load the resulting BAM rather than the original. (Thanks to @roy_ronen for pointing this out).
- What size BAM files can PyBamView handle?
PyBamView can handle quite large BAM files (e.g. many gigabytes) since it only pulls out the reads it needs for the specific location you are viewing. However, for experiments with extremely high coverage at a small set of targeted regions, PyBamView can become very slow and inconvenient. PyBamView should handle most standard whole genome and whole exome experiments.
Reference genome
- I supplied PyBamView with a reference genome fasta file, but it says that the fasta is invalid.
PyBamView uses the PyFasta library to read fasta files efficiently. PyFasta generates two files (.flat and .gdx files) in the same directory as the fasta file. If this is the first time you're using the fasta file with PyBamView, it will attempt to create those files in the same directory as the original fasta. If you don't have permission to write to that directory, it won't be able to create the index, resulting in the "invalid fasta file" message.
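A quick way to rule out the permissions problem described above is to check whether the directory containing the fasta file is writable before starting PyBamView. The helper below is a minimal sketch with a hypothetical name; it does not call PyFasta and only tests the two preconditions mentioned in the answer.

```python
import os


def can_index_fasta(fasta_path):
    """Return True if the fasta exists and its directory is writable."""
    directory = os.path.dirname(os.path.abspath(fasta_path))
    return os.path.isfile(fasta_path) and os.access(directory, os.W_OK)
```

If this returns False for an existing file, copying the fasta to a directory you can write to is one possible workaround.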
- PyBamView is taking a while to load anything.
If this is the first time you are using PyBamView with a certain fasta reference file, it will need to index it first. This may take a short while (usually under a minute). The next time you use the fasta file, it will use the index files created earlier and should load much faster.
Web browser display
- How can I run PyBamView from one computer and view alignments in a web browser from another computer?
Run PyBamView normally, but specify "--ip 0.0.0.0 --no-browser". Next, determine the hostname or IP address of the computer running PyBamView (e.g. the command "hostname -f" prints the hostname). Finally, open a web browser on the other computer and navigate to "http://HOSTNAME:5000" (assuming you're using the default port 5000). Note that this requires the specified port to be open for incoming HTTP traffic. See this page for more details.
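The address to open on the other computer can also be put together in Python. This is a small sketch with a hypothetical helper name; it uses the standard library's socket.getfqdn(), which usually returns the same kind of name as "hostname -f", and assumes the default port 5000 mentioned above.

```python
import socket


def viewer_url(hostname=None, port=5000):
    """Build the URL to open in a browser on the other computer."""
    host = hostname or socket.getfqdn()
    return f"http://{host}:{port}"
```

Calling viewer_url() on the machine running PyBamView prints a URL based on that machine's own hostname. Whether the URL is reachable still depends on the port being open.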
Targets
- How can I load a set of target loci into PyBamView?
If you want to view alignments for a specific set of targeted loci, PyBamView can load a BED file with these targets through the --targets option. A dropdown box then appears on the alignments page, allowing you to jump between target regions. An example targets file containing Y-STR and CODIS loci is provided in examples/example_targets.bed. The file should have 4 tab-delimited columns: chrom, start, end, and name of the marker. The name is used in the dropdown box.
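The four-column layout described above can be generated with a few lines of Python. This is a minimal sketch; the function name and the locus in the usage note are made up for illustration and do not come from the example file.

```python
def format_targets(targets):
    """Format (chrom, start, end, name) tuples as tab-delimited BED text."""
    lines = []
    for chrom, start, end, name in targets:
        start, end = int(start), int(end)
        if start < 0 or end < start:
            raise ValueError(f"invalid interval for {name}: {start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}\t{name}")
    return "\n".join(lines) + "\n"
```

For example, format_targets([("chr1", 100, 200, "locus_A")]) produces a single line with the fields chr1, 100, 200 and locus_A separated by tabs. Writing the returned text to a file gives something that can be passed to --targets.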