plotfeaturetypeCounts.Rd
Function to visualize the distribution of reads across different feature types
for many alignment files in parallel. The plots are stacked bar plots
representing the raw or normalized read counts for the sense and antisense
strand of each feature. The graphics are generated with ggplot2. Typically,
the expected input is generated with the affiliated featuretypeCounts function.
plotfeaturetypeCounts(x, graphicsfile, graphicsformat = "pdf", scales = "fixed", anyreadlength = FALSE,
drop_N_total_aligned = TRUE, scale_count_val = 10^6, scale_length_val = NULL)
x: data.frame with feature counts generated by the featuretypeCounts function.
graphicsfile: Path to the file to which the output graphics are written. Note
that the function also returns the ggplot2 graphics instructions for
interactive plotting in R. However, due to the complexity of the graphics
generated here, the finished results are written directly to a file.
graphicsformat: Graphics file format. Currently supported formats are pdf, png, or jpeg. The argument accepts one of them as a character string.
scales: Scales setting passed on to the facet_wrap function of ggplot2. For
details see ggplot2::facet_wrap. The default fixed ensures a constant scale
across all bar plot panels, while free uses the optimum scale within each bar
plot panel. To evaluate plots in all their details, it may be necessary to
generate two graphics files, one for each scaling option.
anyreadlength: If set to TRUE, read-length-specific read counts are summed
into a single count value, so that the plots show read counts for any read
length. Otherwise, the bar plots show the counts for each read length value.
drop_N_total_aligned: If set to TRUE, the special feature count
N_total_aligned is not included as a separate feature in the plots. However,
the information is still used internally for scaling the read counts to a
fixed value if this option is requested under the scale_count_val argument.
scale_count_val: Scales (normalizes) the read counts to a fixed number of
aligned reads in each sample, such as counts per million aligned reads
(default is 10^6). This calculation uses the N_total_aligned values reported
in the input data.frame generated by the upstream featuretypeCounts function.
Assign NULL to turn off scaling by aligned reads.
scale_length_val: Adjusts the raw or scaled read counts to a constant length
interval (e.g. scale_length_val=10^3 in bp), taking into account the total
genomic length of the corresponding feature type. The required genomic length
information for each feature type is obtained from the Featuretypelength
column of the input data.frame generated by the featuretypeCounts function.
To turn off feature length adjustment, assign NULL (default).
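The count adjustments described above (summing over read lengths, scaling to a fixed number of aligned reads, and adjusting to a constant length interval) can be sketched with plain arithmetic. This is a minimal illustration in base R and not the package's internal code: the toy table, its read-length column names (len_21, len_22), the derived column names, and the order in which the steps are applied are assumptions made for the example. Only N_total_aligned and Featuretypelength are names taken from this page.

```r
# Toy single-sample table (hypothetical values and layout).
# N_total_aligned is the special row holding the number of aligned reads.
toy <- data.frame(
  Featuretype       = c("N_total_aligned", "exon", "intron"),
  Featuretypelength = c(NA, 5e7, 2.5e7),          # total genomic length in bp
  len_21            = c(1200000, 300000, 60000),  # hypothetical read-length columns
  len_22            = c(800000, 200000, 40000)
)

# anyreadlength = TRUE: sum the read-length-specific counts into one value
toy$any <- rowSums(toy[, c("len_21", "len_22")])

# scale_count_val: scale to a fixed number of aligned reads (here 10^6)
scale_count_val <- 10^6
n_aligned <- toy$any[toy$Featuretype == "N_total_aligned"]
toy$scaled <- toy$any / n_aligned * scale_count_val

# scale_length_val: adjust to a constant length interval (here 10^3 bp)
scale_length_val <- 10^3
toy$per_kb <- toy$scaled / toy$Featuretypelength * scale_length_val

# drop_N_total_aligned = TRUE: the special row is used for scaling but not plotted
plot_data <- toy[toy$Featuretype != "N_total_aligned", ]
plot_data
```

With these toy numbers, the exon row ends up with 250000 counts per million aligned reads and 5 scaled counts per 1000 bp; the intron row ends up with 50000 and 2.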
The function returns bar plot graphics for aligned read counts with read
length resolution if the input contains this information and the argument
anyreadlength is set to FALSE. If the input contains counts for any read
length and/or anyreadlength=TRUE, then there will be only one bar per feature
and sample. Due to the complexity of the plots, the results are written
directly to a file in the chosen graphics format. However, the function also
returns the plotting instructions generated by ggplot2, which can be used to
display the result components on R's plotting device.
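The behavior described above, writing the finished figure to a file while also returning the ggplot2 object, can be illustrated with a small stand-alone sketch. The helper plot_to_file() and the toy data are hypothetical, and the use of ggsave() is an assumption: this page does not show how plotfeaturetypeCounts writes its files internally.

```r
library(ggplot2)

# Toy counts (hypothetical values): two samples, two feature types, two strands
toy <- data.frame(
  SampleName  = rep(c("S1", "S2"), each = 4),
  Featuretype = rep(rep(c("exon", "intron"), each = 2), times = 2),
  Strand      = rep(c("Sense", "Antisense"), times = 4),
  Counts      = c(900, 100, 150, 50, 4000, 500, 800, 200)
)

# Hypothetical helper: builds stacked bars, saves them, and returns the plot
plot_to_file <- function(df, graphicsfile, graphicsformat = "pdf", scales = "fixed") {
  p <- ggplot(df, aes(x = Featuretype, y = Counts, fill = Strand)) +
    geom_col() +                                  # bars stack by Strand
    facet_wrap(~ SampleName, scales = scales)     # one panel per sample
  ggsave(graphicsfile, plot = p, device = graphicsformat, width = 6, height = 4)
  p
}

outfile <- tempfile(fileext = ".pdf")
p <- plot_to_file(toy, outfile, graphicsformat = "pdf", scales = "free")
file.exists(outfile)
```

Printing p at the console draws the same figure on the active graphics device. With scales = "free", each sample panel picks its own axis range, whereas "fixed" keeps one range across panels.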
featuretypeCounts, genFeatures
## Construct SYSargs2 object from param and targets files
targets <- system.file("extdata", "targets.txt", package="systemPipeR")
dir_path <- system.file("extdata/cwl", package="systemPipeR")
args <- loadWorkflow(targets=targets, wf_file="hisat2/hisat2-mapping-se.cwl",
input_file="hisat2/hisat2-mapping-se.yml", dir_path=dir_path)
args <- renderWF(args, inputvars=c(FileName="_FASTQ_PATH1_", SampleName="_SampleName_"))
args
#> Instance of 'SYSargs2':
#> Slot names/accessors:
#> targets: 18 (M1A...V12B), targetsheader: 4 (lines)
#> modules: 1
#> wf: 0, clt: 1, yamlinput: 7 (inputs)
#> input: 18, output: 18
#> cmdlist: 18
#> Sub Steps:
#> 1. hisat2-mapping-se (rendered: TRUE)
#>
#>
if (FALSE) {
## Run alignments
args <- runCommandline(args, dir = FALSE, make_bam = TRUE)
outpaths <- subsetWF(args, slot = "output", subset = 1, index = 1)
## Features from sample data of systemPipeRdata package
library(GenomicFeatures)
file <- system.file("extdata/annotation", "tair10.gff", package="systemPipeRdata")
txdb <- makeTxDbFromGFF(file=file, format="gff3", organism="Arabidopsis")
feat <- genFeatures(txdb, featuretype="all", reduce_ranges=TRUE, upstream=1000, downstream=0, verbose=TRUE)
## Generate and plot feature counts for specific read lengths
fc <- featuretypeCounts(bfl=BamFileList(outpaths, yieldSize=50000), grl=feat, singleEnd=TRUE, readlength=c(74:76,99:102), type="data.frame")
p <- plotfeaturetypeCounts(x=fc, graphicsfile="featureCounts.pdf", graphicsformat="pdf", scales="fixed", anyreadlength=FALSE)
## Generate and plot feature counts for any read length
fc2 <- featuretypeCounts(bfl=BamFileList(outpaths, yieldSize=50000), grl=feat, singleEnd=TRUE, readlength=NULL, type="data.frame")
p2 <- plotfeaturetypeCounts(x=fc2, graphicsfile="featureCounts2.pdf", graphicsformat="pdf", scales="fixed", anyreadlength=TRUE)
}