Arteria, which has been developed in conjunction with the SNP&SEQ Technology Platform, was designed to address the automation challenges that arise in the processing of genomic data in the petabyte range. Built in Python on top of existing open source technologies, it uses a micro-service architecture that allows the system to be easily adapted to the specific needs of a next-generation sequencing center. Arteria has been deployed at three sequencing core facilities to date.
More information can be found at https://arteria-project.github.io/
Dahlberg J, Hermansson J, Sturlaugsson S, et al. (2019) Arteria: An automation system for a sequencing core facility. GigaScience 8(12)
Åslin M, Brandt M, Dahlberg J (2018) CheckQC: Quick quality control of Illumina sequencing runs. The Journal of Open Source Software 3(22)
The Illumina Infinium HumanMethylation450 BeadChip (450k) is widely used for the evaluation of DNA methylation levels in large-scale datasets, particularly in cancer. The 450k design allows copy number variant (CNV) calling using existing bioinformatics tools. However, in cancer samples, numerous large-scale aberrations shift the probe intensities and may thereby result in erroneous CNV calling.
To resolve this cancer-specific problem of erroneous copy number calling in data derived from the 450k array, we provide a freely available R package called CopyNumber450kCancer. It runs on all operating systems with R installed (version > 3.0) and provides novel functionality to correct the center (baseline) of segmentation data obtained from copy number calling tools such as CopyNumber450k and ChAMP.
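The idea of correcting the center of segmentation data can be illustrated with a minimal Python sketch. It is not the code of the CopyNumber450kCancer package and may differ from the method the package uses. It assumes, purely for illustration, that each segment is a hypothetical pair of (number of probes, mean log-ratio), that the baseline is the log-ratio level covered by the most probes, and that a bin width of 0.05 is suitable for grouping segment means.

```python
from collections import defaultdict


def estimate_baseline(segments, bin_width=0.05):
    """Estimate the baseline as the probe-weighted densest log-ratio level.

    segments: list of (num_probes, mean_log_ratio) pairs (hypothetical layout).
    bin_width: assumed width used to group similar segment means.
    """
    weight_per_bin = defaultdict(float)
    for num_probes, seg_mean in segments:
        weight_per_bin[round(seg_mean / bin_width)] += num_probes

    # The bin that covers the most probes is taken as the copy-neutral level.
    densest_bin = max(weight_per_bin, key=weight_per_bin.get)

    # Probe-weighted average of the segment means that fall in that bin.
    members = [
        (n, m) for n, m in segments if round(m / bin_width) == densest_bin
    ]
    total_probes = sum(n for n, _ in members)
    return sum(n * m for n, m in members) / total_probes


def correct_baseline(segments, bin_width=0.05):
    """Shift every segment mean so that the estimated baseline sits at zero."""
    baseline = estimate_baseline(segments, bin_width)
    return [(n, m - baseline) for n, m in segments]


# Made-up example: most probes sit near +0.2 instead of 0.
example_segments = [(500, 0.21), (300, 0.19), (200, -0.30), (100, 0.62)]
corrected_segments = correct_baseline(example_segments)
```

In this made-up example the two largest segments define the baseline, so after correction their means lie close to zero, while the remaining segments keep their offsets relative to that baseline and can be called as losses or gains.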
The reference article describing the CopyNumber450kCancer R package:
Marzouka N-a-d, Nordlund J, Bäcklin CL, Lönnerholm G, Syvänen A-C, Almlöf JC (2016) CopyNumber450kCancer: Baseline Correction for Accurate Copy Number Calling from the 450k Methylation Array. Bioinformatics 32(7), 1080-1082
ClusterA is a tool we created for calculating cluster statistics. The most important of these is the "Silhouette Score", which our group uses for genotype cluster validation (termed Overall Average Silhouette Width in the original article published by Rousseeuw in 1987).
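As an illustration of the statistic, the following minimal Python sketch computes silhouette widths following Rousseeuw's definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its own cluster and b(i) is the smallest mean distance from point i to the members of any other cluster. It is not the ClusterA source code. The function names, the use of Euclidean distance, and the example data points are assumptions made for this sketch.

```python
from math import dist  # Euclidean distance, Python 3.8+
from statistics import mean


def silhouette_widths(points, labels):
    """Return the silhouette width s(i) of every point.

    points: list of coordinate tuples (e.g. two signal values per sample).
    labels: cluster label of each point (e.g. the called genotype).
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Rousseeuw's convention: a single-member cluster gets s(i) = 0.
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        largest = max(a, b)
        widths.append((b - a) / largest if largest > 0 else 0.0)
    return widths


def overall_average_silhouette_width(points, labels):
    """Average s(i) over all points: one score for the whole clustering."""
    return mean(silhouette_widths(points, labels))


# Made-up example: two tight, well separated genotype clusters.
example_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
example_labels = ["AA", "AA", "BB", "BB"]
example_score = overall_average_silhouette_width(example_points, example_labels)
```

Scores close to 1 indicate compact, well separated clusters, whereas scores near 0 or below indicate overlapping clusters or points assigned to the wrong cluster.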
ClusterA is downloadable as an executable file and runs on a standard Windows PC: Download ClusterA
The source code for ClusterA, written in Visual Basic, is also available here. The folder ClusterA.NET contains the code upgraded to Visual Basic 2005, whereas the files under ClusterAnalysis are the original code: Download ClusterA original code.
The reference article describing the ClusterA program:
Lovmar L, Ahlford A, Jonsson M, Syvänen A-C (2005) Silhouette scores for assessment of SNP genotype clusters. BMC Genomics 6:35