The program removes the sequences with a mean quality lower than the 'minq' threshold

qmean_filter(input, minq, q_format = NULL, check.encod = TRUE)

Arguments

input

ShortReadQ object

minq

Mean quality threshold; sequences with a mean quality below this value are removed

q_format

Quality format used for the file, as returned by check.encoding

check.encod

Whether to check the quality encoding of the sequences. This argument is incompatible with q_format

Value

Filtered ShortReadQ object

Examples

require(ShortRead)
set.seed(10)

# create 30 sequences of width 20
input <- random_seq(30, 20)

# create qualities of width 20
## high quality (15 sequences)
set.seed(10)
my_qual <- random_qual(c(30,40), slength = 15, swidth = 20, encod = 'Sanger')

## low quality (15 sequences)
set.seed(10)
my_qual_2 <- random_qual(c(5,30), slength = 15, swidth = 20, encod = 'Sanger')

# concatenate vectors
input_q <- c(my_qual, my_qual_2)

# create names
input_names <- seq_names(30)

# create ShortReadQ object
my_read <- ShortReadQ(sread = input, quality = input_q, id = input_names)

# inspect the average qualities
alphabetScore(my_read) / width(my_read)
#>  [1] 34.95 34.65 36.55 34.80 35.20 35.35 34.65 34.20 34.45 35.05 34.80 34.70
#> [13] 34.40 34.80 35.85 15.95 16.55 19.65 16.85 16.65 18.20 16.80 18.00 14.85
#> [25] 18.90 17.00 16.65 16.15 16.80 20.35

# apply the filter
filtered <- qmean_filter(my_read, minq = 30)

# the input object is unchanged: average qualities of my_read
alphabetScore(my_read) / width(my_read)
#>  [1] 34.95 34.65 36.55 34.80 35.20 35.35 34.65 34.20 34.45 35.05 34.80 34.70
#> [13] 34.40 34.80 35.85 15.95 16.55 19.65 16.85 16.65 18.20 16.80 18.00 14.85
#> [25] 18.90 17.00 16.65 16.15 16.80 20.35

# inspect the filtered sequences
sread(filtered)
#> A DNAStringSet instance of length 15
#>      width seq
#>  [1]    20 TGGTCCGGTGTTCTGGCGGA
#>  [2]    20 ATAGGTACAGTCCAGTAATT
#>  [3]    20 GCCTCCCGCAGACGCTGGGT
#>  [4]    20 CCGGAATGCCCTTTCTGAGC
#>  [5]    20 AGCTCCAGCCGTTTGACTTC
#>  ...   ... ...
#> [11]    20 CTTACTAAGATTTGCAATAC
#> [12]    20 CTAAGCGAAGTGACAGATAT
#> [13]    20 GTTCGTCATTCATCCAGGCA
#> [14]    20 AGTGCGCGGACATCAATTAC
#> [15]    20 CACACAATTAAATATGACTC
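To make the idea behind the filter concrete, here is a minimal base-R sketch of mean-quality filtering on plain quality strings. It is an illustration under stated assumptions, not the implementation of qmean_filter: the helper mean_phred, the toy reads, the fixed Sanger offset of 33, and the choice to keep reads whose mean equals the threshold are all hypothetical. On a ShortReadQ object the same per-read means come from alphabetScore(x) / width(x), as in the example above.

```r
# Hypothetical helper, not part of the package: mean Phred score of each
# quality string, assuming an ASCII offset of 33 (Sanger encoding).
mean_phred <- function(qual, offset = 33L) {
  vapply(qual, function(q) mean(utf8ToInt(q) - offset), numeric(1),
         USE.NAMES = FALSE)
}

# Toy reads made up for illustration: sequences and their quality strings
seqs <- c("ACGT", "TTGA", "GGCC")
qual <- c("IIII", "I5+&", "??@@")

# Per-read mean quality: 40, 18.75 and 30.5
means <- mean_phred(qual)

# Assumption: a read whose mean equals the threshold is kept
minq <- 30
keep <- means >= minq

# Only the first and third reads pass the threshold
kept_seqs <- seqs[keep]
print(kept_seqs)
```

The second read has one high-scoring base but a mean of 18.75, so it is dropped: the criterion is the average over the whole read, not any single position.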