Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates. 1

BWA requires different approaches depending on the type of input data.

See the BWA Manual Reference Pages for further details.

Up to date as of:

bwa version 0.6.1-r104
samtools version 0.1.18 (r982:295)

Common to all approaches is the creation of the BWA index. Keeping the index in its own folder keeps things organized:

mkdir ref-index  
cd ref-index  
ln -s /gnomes/ref-genome.fasta ref-genome.fasta  
bwa index -p ref-genome ref-genome.fasta
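
Indexing a large genome takes a while, so it can be worth checking whether an index is already in place before rebuilding it. The sketch below assumes that `bwa index` in the 0.6.x series writes five files next to the prefix (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`); other BWA versions may write a different set. The function name `bwa_index_complete` is made up for this example.

```shell
# Hypothetical helper: succeed only if every expected index file exists
# and is non-empty for the given prefix.
# ASSUMPTION: the five extensions below match what bwa 0.6.x produces.
bwa_index_complete() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        [ -s "$prefix.$ext" ] || return 1
    done
    return 0
}
```

From inside the index folder, something like `bwa_index_complete ref-genome || bwa index -p ref-genome ref-genome.fasta` would then skip the build when the files are already there.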


Single-end FASTQ with Illumina qualities:

(-I = input qualities are in Illumina 1.3+ format, -t 3 = use 3 threads)

bwa aln -I -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_trimmed.fastq > s_1.aln.sai  
bwa samse ./ref-index/ref-genome s_1.aln.sai ../Trimmed_reads/s_1_trimmed.fastq | gzip > s_1.sam.gz
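
The `-I` flag matters because the same quality character stands for different scores under the two common encodings. A small sketch of the arithmetic, assuming Illumina 1.3+ reads use an ASCII offset of 64 and Sanger-style reads an offset of 33; `phred_score` is a made-up helper name:

```shell
# Hypothetical helper: convert one FASTQ quality character to a Phred score.
# $1 = quality character, $2 = ASCII offset
# ASSUMPTION: offset 64 for Illumina 1.3+, offset 33 for Sanger-style FASTQ.
phred_score() {
    # printf with a leading single quote yields the character's numeric code
    code=$(printf '%d' "'$1")
    echo $((code - $2))
}

phred_score h 64   # 'h' is ASCII 104, so this prints 40
phred_score I 33   # 'I' is ASCII 73, so this also prints 40
```

If the qualities in a file only make sense with an offset of 33, `-I` should probably be left off for that file.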

Sort alignments and convert to BAM:

samtools view -uS s_1.sam.gz | samtools sort - s_1
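
Before moving on, it can be useful to check how many reads actually aligned. The sketch below counts mapped and unmapped records in SAM text by testing bit 0x4 of the FLAG column, which is set when a read is unmapped. `count_mapped` is a made-up name; `samtools flagstat` on the sorted BAM reports similar totals.

```shell
# Hypothetical helper: read SAM text on stdin and print mapped/unmapped counts.
count_mapped() {
    awk -F '\t' '
        /^@/ { next }                               # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }   # FLAG bit 0x4 is set
        { mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    '
}

# Usage (only if the compressed SAM file is present):
if [ -f s_1.sam.gz ]; then
    gzip -dc s_1.sam.gz | count_mapped
fi
```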

Single-end 454 (long) reads:

bwa bwasw -t 3 ./ref-index/ref-genome ../Trimmed_reads/454_trimmed.fastq | gzip > 454.sam.gz

Calculate the mismatch (MD) tag, sort, and convert to BAM:

samtools calmd -uS 454.sam.gz ./ref-index/ref-genome.fasta | samtools sort -
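
For reference, the MD tag that `calmd` writes records the reference bases at mismatched and deleted positions, so mismatches can be counted without looking at the reference again. A minimal sketch, assuming the usual MD layout (match counts, mismatched reference bases, and deletions introduced by `^`); `md_mismatches` is a made-up helper:

```shell
# Hypothetical helper: count mismatched bases in an MD tag value.
# $1 = the MD string without the "MD:Z:" prefix
md_mismatches() {
    # Drop deletion runs (^ followed by reference bases), then keep only the
    # remaining letters, each of which marks one mismatch.
    stripped=$(printf '%s' "$1" | sed 's/\^[A-Za-z]*//g' | tr -cd 'A-Za-z')
    echo "${#stripped}"
}

md_mismatches '10A5^AC6'   # one mismatch (A) plus a 2-base deletion: prints 1
md_mismatches '36'         # perfect match over 36 bases: prints 0
```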

Paired-end short reads:

(Align each end of the pair separately, then combine.)

bwa aln -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_PE1.fastq > s_1_PE1.sai  
bwa aln -t 3 ./ref-index/ref-genome ../Trimmed_reads/s_1_PE2.fastq > s_1_PE2.sai  
bwa sampe ./ref-index/ref-genome s_1_PE1.sai s_1_PE2.sai ../Trimmed_reads/s_1_PE1.fastq ../Trimmed_reads/s_1_PE2.fastq | gzip > s_1_PE12.sam.gz
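
`bwa sampe` matches mates by their order in the two files, so both FASTQ files should hold the same number of reads in the same order. A quick check that can be run before aligning, assuming plain four-line FASTQ records (no wrapped sequence lines) and uncompressed files; `fastq_read_count` and `check_pairs` are made-up names:

```shell
# Hypothetical helper: number of reads in an uncompressed four-line FASTQ file.
fastq_read_count() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Hypothetical helper: fail if the two mate files differ in read count.
check_pairs() {
    n1=$(fastq_read_count "$1")
    n2=$(fastq_read_count "$2")
    if [ "$n1" -ne "$n2" ]; then
        echo "read count mismatch: $1 has $n1, $2 has $n2" >&2
        return 1
    fi
    echo "$n1 read pairs"
}
```

For the files used here that would be `check_pairs ../Trimmed_reads/s_1_PE1.fastq ../Trimmed_reads/s_1_PE2.fastq`. Equal counts do not prove the order matches, but unequal counts usually mean one file was trimmed or filtered independently of the other.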

Sort alignments and convert to BAM:

samtools view -uS s_1_PE12.sam.gz | samtools sort - s_1_PE12