qmean_filter.Rd
The program removes the sequences with a mean quality lower than the 'minq' threshold
qmean_filter(input, minq, q_format = NULL, check.encod = TRUE)
input | ShortReadQ object |
---|---|
minq | Quality threshold |
q_format | Quality format used for the file, as returned by check.encoding |
check.encod | Check the encoding of the sequence? This argument is incompatible with q_format |
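To make the role of 'minq' and of the quality encoding concrete, the following is a minimal base-R sketch of a mean-quality filter applied to plain quality strings. It is not the package's implementation: `mean_phred` is a hypothetical helper, the offset of 33 assumes Sanger encoding, and keeping reads whose mean equals the threshold is an assumption drawn from the description (only reads *lower* than 'minq' are removed).

```r
# Hypothetical helper (not part of the package): mean Phred score of one
# quality string, assuming a known ASCII offset (33 for Sanger encoding).
mean_phred <- function(qual, offset = 33L) {
  mean(utf8ToInt(qual) - offset)
}

# Toy quality strings in Sanger encoding: 'I' = Q40, '5' = Q20, '&' = Q5
quals <- c(read1 = "IIIIIIII", read2 = "5555&&&&", read3 = "IIII5555")

# Mean quality per read
means <- vapply(quals, mean_phred, numeric(1))

# Keep reads whose mean quality is not lower than the threshold
minq <- 30
keep <- means >= minq
names(quals)[keep]
```

With a wrong offset every score would be shifted by a constant, which is presumably why the function either takes the encoding through 'q_format' or checks it itself through 'check.encod'.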
Filtered ShortReadQ object
require(ShortRead)

set.seed(10)

# create 30 sequences of width 20
input <- random_seq(30, 20)

# create qualities of width 20
## high quality (15 sequences)
set.seed(10)
my_qual <- random_qual(c(30,40), slength = 15, swidth = 20, encod = 'Sanger')

## low quality (15 sequences)
set.seed(10)
my_qual_2 <- random_qual(c(5,30), slength = 15, swidth = 20, encod = 'Sanger')

# concatenate vectors
input_q <- c(my_qual, my_qual_2)

# create names
input_names <- seq_names(30)

# create ShortReadQ object
my_read <- ShortReadQ(sread = input, quality = input_q, id = input_names)

# watch the average qualities
alphabetScore(my_read) / width(my_read)
#>  [1] 34.95 34.65 36.55 34.80 35.20 35.35 34.65 34.20 34.45 35.05 34.80 34.70
#> [13] 34.40 34.80 35.85 15.95 16.55 19.65 16.85 16.65 18.20 16.80 18.00 14.85
#> [25] 18.90 17.00 16.65 16.15 16.80 20.35

# apply the filter
filtered <- qmean_filter(my_read, minq = 30)

# watch the average qualities
alphabetScore(my_read) / width(my_read)
#>  [1] 34.95 34.65 36.55 34.80 35.20 35.35 34.65 34.20 34.45 35.05 34.80 34.70
#> [13] 34.40 34.80 35.85 15.95 16.55 19.65 16.85 16.65 18.20 16.80 18.00 14.85
#> [25] 18.90 17.00 16.65 16.15 16.80 20.35

# watch the filtered sequences
sread(filtered)
#>   A DNAStringSet instance of length 15
#>      width seq
#>  [1]    20 TGGTCCGGTGTTCTGGCGGA
#>  [2]    20 ATAGGTACAGTCCAGTAATT
#>  [3]    20 GCCTCCCGCAGACGCTGGGT
#>  [4]    20 CCGGAATGCCCTTTCTGAGC
#>  [5]    20 AGCTCCAGCCGTTTGACTTC
#>  ...   ... ...
#> [11]    20 CTTACTAAGATTTGCAATAC
#> [12]    20 CTAAGCGAAGTGACAGATAT
#> [13]    20 GTTCGTCATTCATCCAGGCA
#> [14]    20 AGTGCGCGGACATCAATTAC
#> [15]    20 CACACAATTAAATATGACTC