To keep data handling efficient and secure, all data flow from the sequencers to the users is automated through our on-site servers (2x Dell PowerEdge R430 with attached Dell PowerVault MD1400s, for a total of 96 cores, 500 GB of RAM and ~100 TB of RAID10 storage). Standard fastq files are generated from Illumina's internal bcl files using Illumina’s own bcl2fastq package with the recommended settings. Data is then returned via a password-protected HTTPS link.
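As a rough illustration of the conversion step, the sketch below assembles a basic bcl2fastq command line in Python. The run folder, output directory and sample sheet paths are hypothetical placeholders. Only the input and output options are shown, so it should not be read as our production configuration.

```python
import shlex


def build_bcl2fastq_command(run_folder, output_dir, sample_sheet):
    """Assemble a bcl2fastq call that otherwise relies on the tool's defaults."""
    return [
        "bcl2fastq",
        "--runfolder-dir", run_folder,    # Illumina run folder holding the bcl files
        "--output-dir", output_dir,       # where the fastq files are written
        "--sample-sheet", sample_sheet,   # maps indexes to sample names
    ]


# Hypothetical paths, for illustration only.
command = build_bcl2fastq_command(
    "/data/runs/example_run",
    "/data/fastq/example_run",
    "/data/runs/example_run/SampleSheet.csv",
)

# Print the command for inspection; pass the list to subprocess.run to execute it.
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.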
By default, we offer raw NGS data in the form of fastq files; bioinformatic analysis is also available as an additional service. We currently have validated, best-practice bioinformatic pipelines in place for the following:
- DNAseq variant calling: for WGS, WES or custom panel data (both amplicon- and capture-based), we provide alignment to the reference, variant calling, and annotation/prioritisation of the results following GATK’s best practices.
- RNAseq differential expression analysis: we provide alignment and read counts for all genes measured in the experiment, as well as differential expression analysis between the experimental groups using a STAR/readCounts/DESeq2/limma pipeline (experimental group structure required prior to analysis).
- RRBS (bisulphite-converted DNA) methylation analysis: Bismark-based pipeline providing methylation levels in the sequenced regions and differential methylation analyses (experimental group structure required prior to analysis).
- Custom analyses: any other analyses will be considered on a project-by-project basis.
Raw data (and processed data, if any) will be stored for 6 months by default. Longer-term storage is available, but may incur additional costs.
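For planning purposes, the end of the default storage period can be estimated with a short calculation. The sketch below assumes the 6 months are counted in calendar months from the date the data is delivered. That starting point and the function name are illustrative assumptions, not a statement of policy.

```python
import calendar
from datetime import date

DEFAULT_RETENTION_MONTHS = 6  # default storage period stated above


def retention_end(delivery_date, months=DEFAULT_RETENTION_MONTHS):
    """Return the date `months` calendar months after `delivery_date`."""
    month_index = delivery_date.month - 1 + months
    year = delivery_date.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day when the target month is shorter (e.g. 31 August -> end of February).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(delivery_date.day, last_day))


# Hypothetical delivery date, for illustration only.
print(retention_end(date(2024, 8, 31)))  # prints 2025-02-28
```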