Height, skin color, hair type, and eye color are among the many traits whose instructions are stored in DNA. All this information is organized in a DNA polymer built from nucleotide monomers. Every nucleotide shares the same basic chemical components: a 5-carbon sugar, a phosphate group, and a nitrogenous base. The only difference between nucleotides in a DNA sequence is the nitrogenous base, which can be adenine (A), guanine (G), cytosine (C), or thymine (T).
These bases differ in their chemical structure: A and G are classified as purines, while C and T are classified as pyrimidines. To work as an information storage system, the DNA bases follow a set of well-established pairing rules: adenine bonds with thymine, and cytosine bonds with guanine. Each DNA molecule is therefore formed by two complementary, antiparallel strands held together by 2 (A-T) or 3 (C-G) hydrogen bonds per base pair. The genetic characteristics of an organism ultimately rely on the arrangement of these base pairs.
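The pairing rules above are simple enough to express in a few lines of Python. The following is a minimal sketch for illustration only; the function names `reverse_complement` and `hydrogen_bond_count` are hypothetical and not part of any particular library.

```python
# Watson-Crick pairing rules from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hydrogen bonds per base pair: 2 for A-T, 3 for C-G.
HYDROGEN_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand.

    The two strands are antiparallel, so the complement is reversed
    to keep it written in the same 5'->3' direction as the input.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


sequence = "ATGCGT"
print(reverse_complement(sequence))   # ACGCAT
print(hydrogen_bond_count(sequence))  # 15
```

Taking the reverse complement twice returns the original strand, which is a quick sanity check that the two strands carry the same information.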
Interestingly, comparative studies of DNA sequences from distinct organisms have revealed many conserved DNA regions shared among species. Similar sequences can be observed even when comparing DNA from human and bacterial cells, and the similarity becomes more evident between closely related species. For example, humans share more than 98% of their DNA sequence with their closest primate relatives, such as chimpanzees. Much of the difference between these organisms therefore comes down to point alterations in the base pair arrangement.
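To make the idea of sequence similarity concrete, here is a small sketch that computes percent identity between two sequences and lists their point differences. It assumes the sequences are already aligned and of equal length; real comparative studies rely on dedicated alignment tools, and the toy sequences below are invented for illustration.

```python
from typing import List, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def point_differences(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """List (position, base_in_a, base_in_b) for every mismatching position."""
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Toy example: two 10-base sequences that differ at a single position.
species_1 = "ATGCGTACGA"
species_2 = "ATGCGTTCGA"
print(percent_identity(species_1, species_2))   # 90.0
print(point_differences(species_1, species_2))  # [(6, 'A', 'T')]
```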
In the past, comparative studies of DNA sequences were often costly and limited by the power, resolution, and coverage of the molecular biology assays available. Recent advances in high-throughput sequencing, however, have enabled many large-scale DNA sequence studies, providing insights into organism complexity and the biological mechanisms behind it.
Since then, many next-generation sequencing (NGS) methodologies have been developed for different applications. Compared with classic molecular biology assays, NGS methods usually achieve higher resolution and coverage and are more cost-effective overall. Such approaches include Whole-Genome Sequencing (WGS), Whole-Exome Sequencing (WES), Chromatin Immunoprecipitation Sequencing (ChIP-Seq), Assay for Transposase-Accessible Chromatin using Sequencing (ATAC-Seq), RNA Sequencing (RNA-Seq), and Single-Cell RNA-Seq (scRNA-Seq).
WGS and WES both detect alterations at the genomic level, but they cover distinct portions of the genome. WGS sequences the entire DNA of an organism, whereas WES focuses on the exonic regions, which correspond to a small portion of the genome. In humans, for example, exons comprise approximately 2% of the genome.
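A quick back-of-the-envelope calculation shows what this difference means in practice. The sketch below uses the approximate 2% figure from the text; the genome size of about 3.1 billion bases and the 30x average depth are assumptions added for illustration, not values from this article.

```python
# Assumption (not from the article): a human genome of roughly 3.1 billion bases.
HUMAN_GENOME_BASES = 3_100_000_000
# From the text: exons make up approximately 2% of the human genome.
EXOME_PERCENT = 2


def exome_size(genome_bases: int = HUMAN_GENOME_BASES,
               exome_percent: int = EXOME_PERCENT) -> int:
    """Approximate number of exonic bases (integer arithmetic)."""
    return genome_bases * exome_percent // 100


def bases_to_sequence(target_bases: int, depth: int) -> int:
    """Bases needed to cover a target region `depth` times on average."""
    return target_bases * depth


depth = 30  # hypothetical average sequencing depth
print(exome_size())                                  # 62000000
print(bases_to_sequence(HUMAN_GENOME_BASES, depth))  # 93000000000 (WGS)
print(bases_to_sequence(exome_size(), depth))        # 1860000000 (WES)
```

At the same depth, the exome requires about one fiftieth of the sequencing output needed for the whole genome under these assumptions.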
Furthermore, WGS allows the identification of large structural variants, while WES is mostly used to detect single nucleotide variants (SNVs) and small insertions and deletions. When the goal of the study is to identify epigenetic alterations, DNA-protein binding sites, histone modifications, or open chromatin regions, ChIP-Seq is generally the recommended method.
Alternatively, accessible chromatin regions can be identified using ATAC-Seq. Each method provides insights into a particular layer of DNA information, and data from all of the aforementioned techniques can be readily analyzed on the Basepair platform.
Despite the explosion of sequencing data, human resources remain a bottleneck in routine NGS data processing. This kind of data usually requires the intervention of bioinformaticians, who are frequently unavailable and can be a costly investment.
In addition, such analyses may require programming skills and constant familiarity with the latest bioinformatics tools and techniques. In fact, biologists and clinicians consider data analysis and interpretation to be the most valuable steps in the “omics age”, given that data generation now goes hand in hand with analysis and interpretation.
With this in mind, Basepair provides a suite of automated NGS analysis solutions, offering an easy-to-use data analysis platform that requires no coding skills. The platform allows research, clinical, and pharma teams to perform integrative analyses of DNA-Seq, ChIP-Seq, ATAC-Seq, and RNA-Seq data using popular NGS pipelines with interactive and intuitive visualization tools. For instance, RNA-Seq analysis on Basepair enables the study of differential gene expression, the identification of novel transcripts and alternative splicing events, and the quantification of non-coding transcripts.
Importantly, RNA-Seq only detects DNA sequences that are transcribed into RNA molecules. Unlike DNA, RNA is typically a single-stranded molecule, and its bases are adenine, guanine, cytosine, and uracil (U), which takes the place of thymine. Thus, RNA-Seq analysis on Basepair can be used to measure gene and allele expression. Applying this technique at the single-cell level, using scRNA-Seq, gives a higher-resolution view of the biological processes occurring within each individual cell, which helps in understanding many cellular mechanisms.
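The relationship between a DNA sequence and its RNA transcript can be illustrated with a very small sketch. It assumes the input is the coding (sense) strand, so the transcript has the same sequence with uracil in place of thymine; the function name `transcribe` is chosen here for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA sequence for a DNA coding strand.

    The transcript matches the coding strand base for base,
    except that uracil (U) replaces thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGCGT"))  # AUGCGU
```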
Data processing steps in RNA-Seq analysis on Basepair include quality control (QC), read trimming, short-read mapping, transcript quantification, and differential gene expression analysis. Another advantage of intuitive NGS platform solutions is the ability to obtain publication-ready reports, which give non-bioinformaticians accurate details about the analyses performed. With just one click, users can perform RNA-Seq analysis on Basepair.
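To give a feel for what some of these steps do, here is a deliberately simplified sketch of read trimming, count normalization, and a fold-change comparison. It is not Basepair's pipeline: the function names, the Phred+33 quality encoding, the quality threshold, the pseudocount, and the toy read counts are all assumptions made for illustration, and real analyses use dedicated tools with proper statistical models.

```python
import math
from typing import Dict


def trim_low_quality_tail(sequence: str, quality: str, min_phred: int = 20) -> str:
    """Remove bases from the 3' end while their Phred score is below min_phred.

    Assumes Phred+33 encoded quality characters (score = ord(char) - 33).
    """
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - 33 < min_phred:
        end -= 1
    return sequence[:end]


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw per-gene read counts by library size (counts per million)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}


def log2_fold_change(control: Dict[str, int], treated: Dict[str, int],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """Per-gene log2 ratio of normalized expression (treated vs control)."""
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2((cpm_treated[gene] + pseudocount)
                        / (cpm_control[gene] + pseudocount))
        for gene in control
    }


# Toy data: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
print(trim_low_quality_tail("ACGTAC", "IIII##"))  # ACGT

control_counts = {"geneA": 500, "geneB": 500}
treated_counts = {"geneA": 750, "geneB": 250}
for gene, lfc in log2_fold_change(control_counts, treated_counts).items():
    print(gene, round(lfc, 2))
```

In this toy comparison, geneA comes out with a positive log2 fold change (higher in the treated sample) and geneB with a negative one.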
Indeed, automated solutions are increasingly attractive because they facilitate reproducibility and transparency in data analysis. In addition, optimized workflows are fast and can deliver high-quality results within an hour. RNA-Seq analysis on Basepair is therefore a promising tool that enables non-bioinformaticians to conduct NGS data analysis independently, easily, and efficiently.