Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates. 1
BWA requires different approaches depending on the type of input data.
See the BWA Manual Reference Pages for further details.
Up to date as of:
bwa version 0.6.1-r104
samtools version 0.1.18 (r982:295)
Common to all approaches is the creation of the BWA index. It is tidier to keep the index in its own folder:
mkdir ref-index
cd ref-index
ln -s /gnomes/ref-genome.fasta ref-genome.fasta
bwa index -p ref-genome ref-genome.fasta
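An interrupted indexing run can leave a partial index behind, and the later alignment steps then fail in confusing ways. A minimal sketch of a sanity check follows. The function name bwa_index_complete is hypothetical, and the list of suffixes is an assumption based on what bwa 0.6.x writes; other versions may produce a different set of files.

```shell
# Succeed only if every expected index file exists and is non-empty.
# ASSUMPTION: bwa 0.6.x writes <prefix>.amb .ann .bwt .pac .sa;
# adjust the suffix list if your version differs.
bwa_index_complete() {
    prefix=$1
    for suffix in amb ann bwt pac sa; do
        if [ ! -s "$prefix.$suffix" ]; then
            echo "missing or empty: $prefix.$suffix" >&2
            return 1
        fi
    done
    return 0
}
```

Typical use, from the directory above ref-index: bwa_index_complete ./ref-index/ref-genome || echo "index incomplete, re-run bwa index"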
Illumina single-end reads (-I = input qualities are in Illumina 1.3+ format, i.e. ASCII-64; -t 3 = use 3 threads):
bwa aln -I -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_trimmed.fastq > s_1.aln.sai
bwa samse ./ref-index/ref-genome s_1.aln.sai ../Trimmed_reads/s_1_trimmed.fastq | gzip > s_1.sam.gz
samtools view -uS s_1.sam.gz | samtools sort - s_1
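Whether -I is appropriate depends on how the qualities in the FASTQ file are encoded: Sanger-style files use an ASCII offset of 33, Illumina 1.3+ files an offset of 64. The sketch below guesses the encoding from the lowest quality character in the file. The function names are hypothetical, the file is assumed to be a plain 4-line-per-record FASTQ, and the result is only a heuristic: a Phred+33 file that has been trimmed so hard that no quality below 31 remains would be misreported as phred64.

```shell
# Print the lowest ASCII code found on the quality lines of a 4-line FASTQ.
min_quality_code() {
    awk '
        BEGIN {
            # lookup table: printable character -> ASCII code
            for (i = 33; i <= 126; i++) code[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {
            n = length($0)
            for (j = 1; j <= n; j++) {
                c = code[substr($0, j, 1)]
                if (c != "" && c < min) min = c
            }
        }
        END { print min }
    ' "$1"
}

# Heuristic guess at the encoding (thresholds are assumptions, see text).
guess_quality_encoding() {
    min=$(min_quality_code "$1")
    if [ "$min" -lt 59 ]; then
        echo "phred33"      # codes below 59 cannot occur with an offset of 64
    elif [ "$min" -ge 64 ]; then
        echo "phred64"      # consistent with Illumina 1.3+
    else
        echo "ambiguous"    # 59..63: possibly older Solexa-scaled qualities
    fi
}
```

For example, guess_quality_encoding ../Trimmed_reads/s_1_trimmed.fastq printing phred64 is consistent with using -I, while phred33 suggests dropping it.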
(longer reads such as 454: use BWA-SW)
bwa bwasw -t 3 ./ref-index/ref-genome ../Trimmed_reads/454_trimmed.fastq | gzip > 454.sam.gz
samtools calmd -uS 454.sam.gz ./ref-index/ref-genome.fasta | samtools sort - 454
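The choice between bwa aln and bwa bwasw follows the 200bp guideline given in the introduction. A rough sketch that applies it to the mean read length of a FASTQ file is shown here; the function name suggest_bwa_algorithm is hypothetical, the file is assumed to be a plain 4-line-per-record FASTQ, and using the mean (rather than, say, the median) is an arbitrary choice.

```shell
# Print "aln" or "bwasw" depending on the mean read length of a FASTQ file.
# ASSUMPTION: the 200bp cut-off from the BWA description is applied to the mean.
suggest_bwa_algorithm() {
    awk '
        NR % 4 == 2 { total += length($0); n++ }   # sequence lines only
        END {
            if (n == 0) { print "no reads"; exit }
            if (total / n < 200) print "aln"; else print "bwasw"
        }
    ' "$1"
}
```

For example, suggest_bwa_algorithm ../Trimmed_reads/454_trimmed.fastq would be expected to print bwasw for typical 454 read lengths, but check the result against your own data.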
Paired-end reads (align each end of the pair separately, then combine the two with sampe):
bwa aln -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_PE1.fastq > s_1_PE1.sai
bwa aln -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_PE2.fastq > s_1_PE2.sai
bwa sampe ./ref-index/ref-genome s_1_PE1.sai s_1_PE2.sai ../Trimmed_reads/s_1_PE1.fastq ../Trimmed_reads/s_1_PE2.fastq | gzip > s_1_PE12.sam.gz
samtools view -uS s_1_PE12.sam.gz | samtools sort - s_1_PE12
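bwa sampe expects the two FASTQ files to list the reads in the same order. Trimming tools that drop reads from one file only can break this, so it can be worth checking before aligning. A minimal sketch follows; the function name pairs_in_sync is hypothetical, and it assumes plain 4-line-per-record FASTQ files whose read names differ at most by a trailing /1 or /2 (anything after the first whitespace in the header is ignored).

```shell
# Succeed only if both FASTQ files contain the same read names in the same order.
pairs_in_sync() {
    awk -v mate2="$2" '
        # strip the comment part and a trailing /1 or /2 from a header line
        function read_name(line) {
            sub(/[ \t].*$/, "", line)
            sub(/\/[12]$/, "", line)
            return line
        }
        BEGIN { bad = 0 }
        NR % 4 == 1 {
            # fetch the matching header from the second file
            if ((getline other < mate2) <= 0) { bad = 1; exit }
            if (read_name($0) != read_name(other)) { bad = 1; exit }
            # skip sequence, separator and quality lines of that record
            for (k = 0; k < 3; k++) getline skip < mate2
        }
        END {
            # leftover records in the second file also mean a mismatch
            if (!bad && (getline other < mate2) > 0) bad = 1
            exit bad
        }
    ' "$1"
}
```

Typical use: pairs_in_sync ../Trimmed_reads/s_1_PE1.fastq ../Trimmed_reads/s_1_PE2.fastq || echo "pairs out of sync, fix before running sampe"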