AQRNA-seq for Quantifying Small RNAs

Ruixi Chen; Daniel Yim; Peter C. Dedon

doi:10.3791/66335

Summary

Absolute quantification RNA sequencing (AQRNA-seq) is a technology developed to quantify the landscape of all small RNAs in biological mixtures. Here, both the library preparation and data processing steps of AQRNA-seq are demonstrated, quantifying changes in the transfer RNA (tRNA) pool in Mycobacterium bovis BCG during starvation-induced dormancy.

Abstract

AQRNA-seq provides a direct linear relationship between sequencing read counts and small RNA copy numbers in a biological sample, thus enabling accurate quantification of the pool of small RNAs. The AQRNA-seq library preparation procedure described here involves the use of custom-designed sequencing linkers and a step that reduces the RNA methylation modifications that block reverse transcription processivity, which results in an increased yield of full-length cDNAs. In addition, a detailed implementation of the accompanying bioinformatics pipeline is presented. This demonstration of AQRNA-seq was conducted through a quantitative analysis of the 45 tRNAs in Mycobacterium bovis BCG harvested on 5 selected days across a 20-day time course of nutrient deprivation and 6 days of resuscitation. Ongoing efforts to improve the efficiency and rigor of AQRNA-seq are also discussed. These include exploring methods to obviate gel purification for mitigating primer dimer issues after PCR amplification and to increase the proportion of full-length reads to enable more accurate read mapping. Future enhancements to AQRNA-seq will focus on facilitating automation and high-throughput implementation of this technology for quantifying all small RNA species in cell and tissue samples from diverse organisms.

Introduction

Next-generation sequencing (NGS), also known as massively parallel sequencing, is a DNA sequencing technology that involves DNA fragmentation, ligation of adaptor oligonucleotides, polymerase chain reaction (PCR)-based amplification, sequencing of the DNA, and reassembly of the fragment sequences into a genome. The adaptation of NGS to sequence RNA (RNA-seq) is a powerful approach to identify and quantify RNA transcripts and their variants¹. Innovative developments in RNA library preparation workflows and bioinformatic analysis pipelines, coupled with advancements in laboratory instrumentation, have expanded the repertoire of RNA-seq applications, progressing beyond exome sequencing into advanced functional omics such as non-coding RNA profiling², single cell analysis³, spatial transcriptomics⁴,⁵, and alternative splicing analysis⁶, among others. These advanced RNA-seq methods reveal complex RNA functions through quantitative analysis of the transcriptome in normal and diseased cells and tissue.

Despite these advances in RNA-seq, several key technical features limit the quantitative power of the method. While most RNA-seq methods allow precise and accurate quantification of changes in the levels of RNAs between experimental variables (i.e., biological samples and/or physiological states), they cannot provide quantitative comparisons of the levels of RNA molecules within a sample. For example, most RNA-seq methods cannot accurately quantify the relative number of copies of individual tRNA isoacceptor molecules in a cellular pool of expressed tRNAs. As highlighted in the companion publication⁷, this limitation to RNA-seq arises from several features of RNA structure and the biochemistry of library preparation. For example, the activity of the ligation enzymes used to attach the 3'- and 5'-end sequencing linkers to RNA molecules is strongly influenced by the identity of the terminal nucleotides of the RNA and the sequencing linkers. This leads to large variations in efficiencies of linker ligations and profound artifactual increases in sequencing reads⁸,⁹,¹⁰.

A second set of limitations arises from the inherent structural properties of RNA molecules. Specifically, RNA secondary structure formation and dynamic changes in the dozens of post-transcriptional RNA modifications of the epitranscriptome can cause polymerase fall-off or mutation during reverse transcription. These errors result in incomplete or truncated cDNA synthesis or altered RNA sequence. While both of these phenomena can be exploited to map secondary structures or some modifications, they degrade the quantitative accuracy of RNA-seq if subsequent library preparation steps fail to capture truncated cDNAs or if data processing throws out mutated sequences not matching a reference dataset¹¹,¹². Furthermore, the immense chemical, length, and structural diversity of RNA transcripts, as well as the lack of tools to uniformly fragment long RNAs, diminishes the applicability of most RNA-seq methods to all RNA species¹³.

The AQRNA-seq (absolute quantification RNA sequencing) method has been developed to remove several of these technical and biological constraints that limit quantitative accuracy⁷. By minimizing sequence-dependent biases in capture, ligation, and amplification during RNA sequencing library preparation, AQRNA-seq achieves superior linearity compared to other methods, accurately quantifying 75% of a reference library of 963 miRNAs within 2-fold accuracy. This linear correlation of sequencing read count and RNA abundance is also observed in an analysis of a variable length pool of RNA oligonucleotide standards and in reference to orthogonal methods like northern blotting. Establishing linearity between sequencing read count and RNA abundance enables AQRNA-seq to achieve accurate, absolute quantification of all RNA species within a sample.

Here, the protocol for the AQRNA-seq library preparation workflow and the accompanying downstream data analytics pipeline is described. The method was applied to elucidate the dynamics of tRNA abundance during starvation-induced dormancy and subsequent resuscitation in the Mycobacterium bovis bacilli de Calmette et Guérin (BCG) model of tuberculosis. Results are presented for the exploratory visualization of the sequencing data, along with subsequent clustering and differential expression analyses that unveiled discernible patterns in tRNA abundance associated with various phenotypes.

Protocol

NOTE: Figure 1 provides a graphical illustration of the procedures involved in AQRNA-seq library preparation. Detailed information regarding the reagents, chemicals, and columns/kits used in the procedure can be found in the Table of Materials. It is recommended to perform a comprehensive evaluation of purity, integrity, and quantity of the input RNA samples using (i) 3% agarose gel electrophoresis, (ii) automated electrophoresis tools for sample quality control of biomolecules (see Table of Materials), and (iii) UV-Visible spectrophotometry and/or fluorometric quantification. It is mandatory to keep all reactions and master mixes on ice, unless otherwise specified. Transport reagents (e.g., enzymes) in cool boxes to and from -20 °C storage to preserve their shelf lives and avoid multiple freeze-thaw cycles of library intermediates.

1. Dephosphorylation of RNAs

NOTE: Removal of the 5'-phosphate (P; donor) prevents self-ligation to the 3'-hydroxyl (OH; acceptor) of RNAs. Linkers will not self-ligate as their 3' end is modified to incorporate either a dideoxycytidine (Linker 1) or a spacer (Linker 2). Linkers can only be ligated by joining their 5'-P to the 3'-OH of the RNAs or cDNAs.

Prepare the dephosphorylation reaction in a sterile PCR tube (e.g., 200 µL or 500 µL tube) by adding up to 2 µL of the RNA sample, 0.5 µL of 40 U/µL RNase inhibitor, 1 µL of 0.5 µM internal standard (Table 1), 0.5 µL of 10x T4 RNA ligase reaction buffer, 1 µL of 1 U/µL Shrimp alkaline phosphatase, and sufficient RNase-free water to bring the overall volume to 5 µL.
NOTE: For typical samples, 75 ng of RNAs (or approximately 2 pmol for an 80 nt RNA) is considered sufficient for quantification of small RNAs using this protocol.
Incubate at 37 °C for 30 min to dephosphorylate RNAs and then at 65 °C for 5 min to inactivate the enzyme and denature the RNAs. Keep samples at 4 °C for at least 10 min to prevent renaturation.
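The volume of RNase-free water in the dephosphorylation reaction depends on how much RNA sample is added. The following is a minimal sketch of that bookkeeping, using the volumes listed in step 1; the function name and the dictionary layout are illustrative assumptions, not part of the published protocol.

```python
# Fixed components of the 5 µL dephosphorylation reaction (volumes in µL, from step 1).
FIXED_UL = {
    "RNase inhibitor (40 U/µL)": 0.5,
    "internal standard (0.5 µM)": 1.0,
    "10x T4 RNA ligase reaction buffer": 0.5,
    "shrimp alkaline phosphatase (1 U/µL)": 1.0,
}
TOTAL_UL = 5.0


def water_to_add(rna_ul: float) -> float:
    """Return the RNase-free water volume (µL) that tops the reaction up to 5 µL."""
    water = TOTAL_UL - sum(FIXED_UL.values()) - rna_ul
    # The fixed components occupy 3 µL, so at most 2 µL of RNA sample fits.
    if rna_ul <= 0 or water < -1e-9:
        raise ValueError("RNA volume must be greater than 0 and at most 2 µL")
    return max(round(water, 2), 0.0)


# Example: 1.2 µL of RNA sample leaves room for 0.8 µL of water.
print(water_to_add(1.2))
```

With the full 2 µL of RNA sample, no water is added, which matches the "up to 2 µL" limit in step 1.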

2. Ligation of Linker 1 to the 3' end of RNAs

Prepare the Linker 1 ligation reaction in a sterile PCR tube by adding 5 µL of the dephosphorylated RNAs (products from step 1), 0.5 µL of 40 U/µL RNase inhibitor, 1 µL of 100 µM Linker 1 (Table 1), 3 µL of 10 mM ATP, 2.5 µL of 10x T4 RNA ligase reaction buffer, 15 µL of PEG8000 (50% solution), 2 µL of 30 U/µL T4 RNA ligase 1, and 1 µL of RNase-free water.
NOTE: Reagents can be made into master mix to facilitate processing the samples. Do not include PEG8000 and T4 RNA ligase 1 in the master mix.
Incubate at 25 °C for 2 h and then at 16 °C for 16 h to ligate Linker 1 to the RNAs.
Column-purify the Linker-1-ligated RNAs. Use a kit for DNA/RNA recovery and clean-up (see Table of Materials).
NOTE: This column purification protocol applies to all subsequent column purification using the same kit. Kits that use gel-filtration technology to remove dye terminators from sequencing reactions (see Table of Materials) cannot be used here, as PEG8000 is incompatible with the gel filters of such kits. It is recommended to save an aliquot (1.2 µL) of each sample after purification. When necessary, these aliquots can be used for checking the ligation efficiency by running a commercial nucleic acid analyzer. Keep the remaining purified samples on ice or at -20 °C until further steps.
1. For sample volume < 50 µL, add RNase-free water to bring it to 50 µL. Add 2 volumes of the oligo binding buffer (provided in the kit) to 1 volume of the sample. Add 8 volumes of 100% ethanol to 1 volume of the sample.
2. Load the sample (up to 750 µL at a time) to a column placed inside a 2 mL collection tube (provided in the kit). To bind RNAs to the column, centrifuge at 10,000 x g for 30 s and discard the flow-through (i.e., the liquid in the collection tube). Place the column back into the collection tube.
3. To wash the impurities off the column, add 750 µL of the DNA wash buffer (provided in the kit) to the column. Centrifuge at 10,000 x g for 30 s and discard the flow-through. Place the column back into the collection tube.
4. Centrifuge at maximum speed (e.g., 16,000 x g for a benchtop microcentrifuge) for an additional 1 min to remove residual DNA wash buffer.
5. To elute the RNAs, carefully move the column to a sterile 1.5 mL tube, add RNase-free water to the column. Use a greater volume than needed when eluting the RNAs to account for potential volume loss during the elution process. For instance, if 15 µL is required for the following step, add 17 µL of RNase-free water for elution. Centrifuge at 10,000 x g for 30 s.
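When many samples are processed, the master mix for the Linker 1 ligation in step 2 can be scaled programmatically. The sketch below uses the per-sample volumes from step 2 and leaves out PEG8000 and T4 RNA ligase 1, as the NOTE instructs. The 10% pipetting overage and the function name are assumptions made for illustration.

```python
# Per-sample volumes (µL) of the components that may go into the master mix (step 2).
MASTER_MIX_UL = {
    "RNase inhibitor (40 U/µL)": 0.5,
    "Linker 1 (100 µM)": 1.0,
    "ATP (10 mM)": 3.0,
    "10x T4 RNA ligase reaction buffer": 2.5,
    "RNase-free water": 1.0,
}
# Components added to each tube individually, per the NOTE in step 2.
ADD_SEPARATELY_UL = {"PEG8000 (50%)": 15.0, "T4 RNA ligase 1 (30 U/µL)": 2.0}
RNA_UL = 5.0  # dephosphorylated RNAs from step 1


def scale_master_mix(n_samples: int, overage: float = 0.10) -> dict[str, float]:
    """Scale each master-mix component for n_samples plus a pipetting overage (assumed 10%)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in MASTER_MIX_UL.items()}


mix = scale_master_mix(15)
per_tube_mix_ul = sum(MASTER_MIX_UL.values())  # master mix dispensed per tube
total_reaction_ul = RNA_UL + per_tube_mix_ul + sum(ADD_SEPARATELY_UL.values())

for name, vol in mix.items():
    print(f"{name}: {vol} µL")
print(f"Dispense {per_tube_mix_ul} µL of master mix per tube; final reaction is {total_reaction_ul} µL")
```

Each tube then receives 8 µL of master mix, 5 µL of RNA, 15 µL of PEG8000, and 2 µL of ligase, for a 30 µL reaction.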

3. Removal of post-transcriptional methylations by AlkB demethylase

NOTE: AlkB is a bacterial enzyme that removes methyl groups from some, but not all, methylated nucleotides in DNA and RNA. Removal of several types of methylated ribonucleotides in RNA prevents fall-off of the reverse transcriptase to allow more full-length reads and identification of modification sites. This step needs to be controlled at a low pH to prevent unexpected degradation of RNAs.

Prepare a stock solution of 1 M 2-ketoglutarate by dissolving 1.4611 g of 2-ketoglutarate (146.11 g/mol) in 10 mL of RNase-free water. Filter sterilize the solution through a 0.2 µm syringe filter. Aliquot the stock solution into 2 mL sterile tubes and store at -20 °C.
Prepare a stock solution of 0.5 M L-ascorbic acid by dissolving 0.88 g of L-ascorbic acid (176.12 g/mol) in 10 mL of RNase-free water. Filter sterilize the solution through a 0.2 µm syringe filter. Aliquot the stock solution into 2 mL sterile tubes and store at -20 °C.
Prepare a stock solution of 0.25 M ammonium ferrous sulfate hexahydrate by dissolving 0.9804 g of ammonium ferrous sulfate hexahydrate (392.14 g/mol) in 10 mL of RNase-free water. Filter sterilize the solution through a 0.2 µm syringe filter. Aliquot the stock solution into 2 mL sterile tubes and store at -20 °C.
Prepare a stock solution of 1 M HEPES by dissolving 2.383 g of HEPES (238.30 g/mol) in 10 mL of RNase-free water. Adjust the pH of the solution to 8 using NaOH and filter sterilize the solution through a 0.2 µm syringe filter. Aliquot the stock solution into 2 mL sterile tubes and store at -20 °C.
Prepare a 2x AlkB reaction buffer. To make 10 mL of the buffer, combine 1.5 µL of 1 M 2-ketoglutarate (made in step 3.1), 80 µL of 0.5 M L-ascorbic acid (made in step 3.2), 6 µL of 0.25 M ammonium ferrous sulfate hexahydrate (made in step 3.3), 100 µL of 10 mg/mL BSA, 1000 µL of 1 M HEPES (made in step 3.4; add last), and 8812.5 µL of RNase-free water. Filter sterilize the buffer through a 0.2 µm syringe filter.
NOTE: The 2x AlkB reaction buffer must be prepared fresh immediately prior to each experiment due to the chemical lability of the components.
Prepare the AlkB digestion reaction in a sterile PCR tube by adding 20 µL of the Linker-1-ligated RNAs (products from step 2), 50 µL of the 2x AlkB reaction buffer (made in step 3.5), 2 µL of AlkB demethylase, 1 µL of RNase inhibitor, and 27 µL of RNase-free water.
Incubate at room temperature for 2 h to remove post-transcriptional methylations from the RNAs.
To remove AlkB from the reaction, follow the steps described below.
1. For a clean phase separation, add 50 µL of RNase-free water into the AlkB reaction, and then add 100 µL phenol: chloroform: isoamyl alcohol 25:24:1 (pH = 5.2).
2. Shake by hand for 10 s, and then centrifuge at 16,000 x g for 10 min. Make sure that the rotor of the benchtop centrifuge is compatible with the PCR tubes. Use adaptors if needed.
3. Transfer the RNAs (i.e., the aqueous layer on top; approximately 140 µL) into a sterile 1.5 mL tube. If chloroform (i.e., the bottom layer) is mixed into the aqueous layer, centrifuge again with the same settings.
4. Add 100 µL of chloroform to the extracted RNAs to remove residual phenol. Shake by hand for 10 s, and then centrifuge at 16,000 x g for 10 min.
5. Transfer the RNAs (i.e., the aqueous layer on top; approximately 120 µL) into a sterile 1.5 mL tube.
Column-purify the extracted RNAs. Use a kit for DNA/RNA recovery and clean-up (see Table of Materials). Follow the protocol detailed in step 2.3 to perform the column purification.
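The stock solutions in steps 3.1 to 3.4 and the 2x AlkB reaction buffer in step 3.5 follow from two standard relations: mass = molarity x volume x molar mass, and C1 x V1 = C2 x V2. The sketch below reproduces that arithmetic from the values given in the protocol; the function and variable names are illustrative assumptions.

```python
def grams_needed(molarity_m: float, volume_ml: float, molar_mass_g_per_mol: float) -> float:
    """Mass (g) of solute for a stock of the given molarity and volume."""
    return molarity_m * (volume_ml / 1000.0) * molar_mass_g_per_mol


def final_conc_mm(stock_m: float, added_ul: float, total_ul: float = 10_000.0) -> float:
    """Final concentration (mM) after diluting added_ul of stock into total_ul (C1*V1 = C2*V2)."""
    return stock_m * 1000.0 * added_ul / total_ul


# name: (stock molarity in M, molar mass in g/mol, µL of stock per 10 mL of 2x buffer)
COMPONENTS = {
    "2-ketoglutarate": (1.0, 146.11, 1.5),
    "L-ascorbic acid": (0.5, 176.12, 80.0),
    "ammonium ferrous sulfate hexahydrate": (0.25, 392.14, 6.0),
    "HEPES": (1.0, 238.30, 1000.0),
}

for name, (stock_m, mw, added_ul) in COMPONENTS.items():
    mass = grams_needed(stock_m, 10.0, mw)   # each stock is prepared in 10 mL
    conc = final_conc_mm(stock_m, added_ul)  # concentration in the 2x buffer
    print(f"{name}: weigh {mass:.4f} g for 10 mL of stock; {conc:g} mM in the 2x buffer")
```

Because the buffer is used at 1x in the 100 µL AlkB reaction (50 µL of 2x buffer), the concentrations in the reaction are half of those printed for the 2x buffer.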

4. Removal of excess Linker 1

NOTE: It is recommended to save an aliquot (1.2 µL) of each sample after purification. When necessary, these aliquots can be used for checking the efficiency of RecJf digestion by running a commercial nucleic acid analyzer. Proceed with the purified samples immediately to reverse transcription.

Prepare the deadenylation reaction in a sterile PCR tube by adding 15 µL of Linker-1-ligated RNAs (products from step 3), 1 µL of 40 U/µL RNase inhibitor, 2 µL of 10x kit buffer 2 (see Table of Materials), and 2 µL of 50 U/µL 5'-deadenylase.
Incubate at 30 °C for 1 h to remove the adenine at the 5' end of Linker 1. Add 2 µL of 30 U/μL RecJf into the deadenylation reaction.
Incubate at 37 °C for 30 min to digest excess Linker 1. Add another 2 µL of 30 U/μL RecJf into the reaction.
Incubate at 37 °C for 30 min to continue the digestion of excess Linker 1 and then at 65 °C for 20 min to denature the enzyme.
Column-purify the Linker-1-ligated RNAs. Use a kit that uses gel-filtration technology to remove dye terminators from sequencing reactions (see Table of Materials), as it is effective at removing short remnants (e.g., oligonucleotides with a length of 2 to 10 bp). Follow the steps described below for purification.
1. Prepare gel-filter columns according to the manufacturer's protocol. Place a column into a sterile 1.5 mL tube and load the column with 24 µL of the sample.
2. To purify the RNAs, centrifuge at 800 x g for 3 min and discard the column. The purified RNAs are in the eluent.

5. Reverse transcription (RT) reaction

NOTE: The following RT (see Table of Materials) reaction setup follows the manufacturer's protocol, with minor modifications to allow for AQRNA-seq compatibility.

Prepare the RT primer annealing reaction in a sterile PCR tube by adding 24 µL of template RNAs (products from step 4), 1 µL of 2 µM RT primer (Table 1), and 1 µL of dNTP (10 mM of each type of nucleotides).
Incubate at 80 °C for 2 min to anneal RT primers to the template RNAs, and then cool on ice immediately for 2 min.
Prepare the RT reaction by adding 6 µL of 5x RT reaction buffer, 1 µL of 40 U/µL RNase inhibitor, and 1 µL of reverse transcriptase into the annealing reaction tube.
Incubate at 50 °C for 2 h to reverse transcribe the RNA templates, and then at 70 °C for 15 min to inactivate the enzyme. The RT products (i.e., RNA-cDNA hybrids) can be stored at 4 °C or -20 °C overnight.

6. RNA hydrolysis

Add 1 µL of 5 M NaOH into the RNA-cDNA hybrid (products from step 5). Incubate at 93 °C for 3 min to hydrolyze the RNA strand of the RNA-cDNA hybrid.
Add 0.77 µL of 5 M HCl to neutralize the reaction. After adding HCl, flick to mix, and spin down the tube. Neutralization is instantaneous.
NOTE: It is recommended to test the precise amount of 5 M HCl that is needed for neutralizing 1 µL of NaOH (e.g., using pH strips) in buffered conditions.
Column-purify the single-stranded cDNAs. Use a kit for DNA/RNA recovery and clean-up (see Table of Materials). Follow the protocol detailed in step 2.3 to perform the column purification.
NOTE: Kits that use gel-filtration technology to remove dye terminators from sequencing reactions (see Table of Materials) cannot be used here due to the pH variation in previous steps.
Speed-vac the purified cDNAs to < 5 µL and then add RNase-free water to bring the volume back to 5 µL. Be careful not to speed-vac the cDNAs to complete dryness.
Transfer the purified cDNAs into a sterile PCR tube. The purified cDNAs can be stored at -20 °C for up to 1 week.

7. Ligation of Linker 2 to the 3' end of cDNAs

Prepare the Linker 2 ligation reaction in a sterile PCR tube by adding 5 µL of cDNAs (products from step 6), 1 µL of 50 µM Linker 2 (Table 1), 2 µL of 10x T4 DNA ligase reaction buffer, 1 µL of 10 mM ATP, 9 µL of PEG8000 (50% solution), and 2 µL of 400 U/µL T4 DNA ligase.
NOTE: Reagents can be made into master mix to facilitate the processing of the samples. Do not include PEG8000 and T4 DNA ligase in the master mix.
Incubate at 16 °C for 16 h to ligate Linker 2 to the cDNAs.
Column-purify the Linker-2-ligated cDNAs. Use a kit for DNA/RNA recovery and clean-up (see Table of Materials). Follow the protocol detailed in step 2.3 to perform the column purification.

8. Removal of excess Linker 2

Prepare the deadenylation reaction in a sterile PCR tube by adding 16 µL of Linker-2-ligated cDNAs, 2 µL of 10x kit buffer 2, and 2 µL of 50 U/µL 5'-deadenylase. Incubate at 30 °C for 1 h to remove the adenine at the 5' end of Linker 2.
Add 2 µL of 30 U/μL RecJf into the deadenylation reaction. Incubate at 37 °C for 30 min to digest excess Linker 2. Add another 2 µL of 30 U/μL RecJf into the reaction.
Incubate at 37 °C for 30 min to continue the digestion of excess Linker 2 and then at 65 °C for 20 min to denature the enzyme.

9. PCR amplification of the cDNAs with sequencing primers

Assign PCR primers to the samples. Each sample needs a unique combination of forward and reverse primers (Table 1) for effective multiplexing.
Add RNase-free water to bring the sample volume to 25 µL. Save 5 µL of each sample into a sterile PCR tube as a backup in case PCR needs to be repeated.
Prepare the PCR reaction (see Table of Materials for PCR kit) by adding 20 µL of cDNAs (products from step 8), 1 µL of 2.5 µM forward primer, 1 µL of 2.5 µM reverse primer, 25 µL of 2x DNA polymerase buffer, 2 µL of RNase-free water, and 1 µL of DNA polymerase.
NOTE: Reagents can be made into master mix to facilitate the processing of the samples. Do not add DNA polymerase to the master mix.
Perform PCR with an initial denaturation at 94 °C for 1 min, followed by 18 cycles of denaturation at 98 °C for 20 s - annealing at 58 °C for 20 s - extension at 68 °C for 1 min.
NOTE: Do not PCR amplify beyond the linear range. A total of 18 cycles will be optimal for most experiments, but this may be context dependent.
Speed-vac the PCR products to less than 25 µL and then add RNase-free water to bring the volume back to 25 µL. Transfer 5 µL of the PCR products to a sterile 0.5 mL tube for checking the size distribution (see step 9.6). Store the remaining 20 µL of the PCR products at -20 °C until further steps.
Check the size distribution of the PCR products as described below.
1. Prepare 3% agarose gel in TAE buffer.
  NOTE: In this protocol, ethidium bromide (EtBr) is used for post-electrophoresis gel staining (see step 9.6.5). Appropriate DNA stains can be added into the gel solution at this step or used for gel staining later.
2. Mix 1 µL of 6x loading dye into 5 µL of PCR products (from step 9.5) and load the gel with the mixture.
3. Load 5 µL of DNA ladders into the well before the first sample and the well after the last sample. Use 50 bp or 100 bp DNA ladders to allow for improved size discrimination of PCR products between 150 bp and 300 bp.
4. Run gel electrophoresis to locate the PCR products. The appropriate running condition may be context dependent. Here, run at 120 V, 400 mA for 75 min for a 17.78 cm (width) x 10.16 cm (height) x 1 cm (thickness) gel slab.
5. Place the gel in a box and fill the box with deionized (DI) water until the gel is completely immersed. Add 10 µL of EtBr into the DI water soaking the gel. Wrap the box with foil and place it on a shaker. Stain the gel for 30 min while shaking.
6. Discard the EtBr-containing waste into a waste bottle placed in a fume hood. Rinse the gel with DI water once and discard the EtBr-containing water into the waste bottle.
7. Fill the box with DI water until the gel is completely immersed. Wrap the box with foil and place it on a shaker. Wash the gel for 10 min while shaking.
8. Discard the EtBr-containing water into the waste bottle in the fume hood. Use a gel imager to visualize the bands. Acquire a high-resolution image of the gel.
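Step 9.1 requires that every sample receive a unique combination of forward and reverse PCR primers. One way to generate and check such an assignment is sketched below. The primer and sample names are hypothetical placeholders (the actual primer sequences are in Table 1), and the function name is an assumption.

```python
from itertools import product


def assign_primer_pairs(samples, forward_primers, reverse_primers):
    """Give each sample a distinct (forward, reverse) primer combination."""
    combos = list(product(forward_primers, reverse_primers))
    if len(samples) > len(combos):
        raise ValueError(
            f"{len(samples)} samples but only {len(combos)} unique primer combinations"
        )
    return dict(zip(samples, combos))


# Hypothetical labels: 5 time points x 3 biological replicates = 15 libraries.
samples = [f"{tp}_rep{r}" for tp in ("day0", "day4", "day10", "day20", "resus6") for r in (1, 2, 3)]
forward = ["F1", "F2", "F3", "F4"]  # placeholder names for forward primers
reverse = ["R1", "R2", "R3", "R4"]  # placeholder names for reverse primers

assignment = assign_primer_pairs(samples, forward, reverse)
for sample, (fwd, rev) in assignment.items():
    print(sample, fwd, rev)
```

Four forward and four reverse primers give 16 combinations, which is enough for the 15 libraries in this example; the function raises an error if the primer set is too small for the sample count.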

10. Gel purification

Prepare 3% agarose gel in TAE buffer. Make a 1 cm thickness gel with wide combs (1 mm thickness; 5 mm width; 15 mm depth), such that each well can contain at least 25 µL of sample-loading dye mixture.
Mix 4 µL of 6x loading dye with 20 µL of PCR products (from step 9.5) and load the gel with the mixture. Leave empty lanes in between samples to minimize cross-contamination during gel excision.
Load DNA ladders, run gel electrophoresis, stain and wash the gel, and take gel images as described in step 9.6.
Excise the gel blocks that contain PCR products within the target size range. To minimize contamination with primer dimers (175 bp linkers with no inserts), extract PCR products with size above 195 bp (175 bp linkers + 20 bp miRNAs).
Purify the PCR products using gel extraction. Use a gel extraction kit (see Table of Materials). The purification protocol is based on the manufacturer's protocol, with minor modifications for AQRNA-seq compatibility. All centrifugation steps should be conducted at 17,900 x g for 1 min using a benchtop centrifuge at room temperature, unless otherwise specified. Follow the steps described below.
1. Measure the weight of the gel blocks inside the tubes. Add 6 volumes of Buffer QG (provided in the kit) to 1 volume of gel block (1 mg gel is approximately 1 µL).
2. Incubate at 50 °C for 10 min or until complete dissolution of the gel blocks. Vortex tubes every 2 min to facilitate gel dissolution. After dissolving the gel, the mixture should resemble the color of the Buffer QG without the dissolved gel. If the color is orange or violet, add 10 µL of 3 M sodium acetate (pH = 5.0) and mix thoroughly.
3. Add 1 gel volume of isopropanol to the mixture and mix thoroughly. Place a spin column in a 2 mL collection tube (provided in the kit).
4. To bind DNA, apply the sample (up to 750 µL each time) to the column and centrifuge. Discard the flow-through and place the column back into the same collection tube. The maximum amount of gel per spin column is 400 mg.
5. Add 500 µL of Buffer QG to the column and centrifuge. Discard the flow-through and place the column back into the same collection tube.
6. To wash the impurities, add 750 µL of Buffer PE (provided in the kit) to the column, let the column stand for 5 min, and centrifuge. Discard the flow-through and place the column back into the same collection tube. Centrifuge once again to remove residual wash buffer.
7. Place the column into a sterile 1.5 mL tube. To elute DNA, add 30 µL of Buffer EB (provided in the kit) to the center of the column membrane, let the column stand for 4 min, and centrifuge.
8. Speed-vac and resuspend the gel-purified PCR products in 12 µL of Buffer EB.
Measure the concentration of the constructed libraries by UV-Visible spectrophotometry and/or fluorometric quantification.
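The gel extraction volumes in step 10.5 scale with the weight of each excised gel block. The sketch below applies the ratios stated above (6 volumes of Buffer QG, 1 gel volume of isopropanol, 1 mg of gel taken as 1 µL, at most 750 µL per column load, and at most 400 mg of gel per column). The function name and the returned field names are illustrative assumptions.

```python
import math

MAX_GEL_MG_PER_COLUMN = 400.0  # maximum amount of gel per spin column (step 10.5.4)
MAX_LOAD_UL = 750.0            # maximum volume applied to the column at a time


def gel_extraction_volumes(gel_mg: float) -> dict:
    """Buffer volumes and number of column loads for one excised gel block."""
    if gel_mg <= 0:
        raise ValueError("gel weight must be positive")
    gel_ul = gel_mg                    # 1 mg of gel is approximately 1 µL
    buffer_qg_ul = 6.0 * gel_ul        # 6 volumes of Buffer QG
    isopropanol_ul = 1.0 * gel_ul      # 1 gel volume of isopropanol
    total_ul = gel_ul + buffer_qg_ul + isopropanol_ul  # approximate mixture volume
    return {
        "buffer_qg_ul": buffer_qg_ul,
        "isopropanol_ul": isopropanol_ul,
        "total_ul": total_ul,
        "column_loads": math.ceil(total_ul / MAX_LOAD_UL),
        "columns_needed": math.ceil(gel_mg / MAX_GEL_MG_PER_COLUMN),
    }


print(gel_extraction_volumes(150.0))
```

For a 150 mg gel block, this gives 900 µL of Buffer QG and 150 µL of isopropanol, so the roughly 1200 µL mixture is applied to a single column in two loads.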

11. Library sequencing

Submit the constructed libraries to an external sequencing center for quality assessment and Illumina sequencing. To ensure sufficient sensitivity in quantitative mapping of small RNA landscapes, opt for paired-end sequencing with 75-bp reads from each direction (i.e., PE75), aiming for at least 1.5 million raw sequence reads in each direction for each sample. Use custom primers (Table 1) for sequencing on NextSeq; they are optional for sequencing on MiSeq.
NOTE: Sequencing can be performed using MiSeq or NextSeq500 platforms. The choice of platform may depend on the nature of samples and the total sample count.
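The per-sample read target can be turned into a minimum sequencing request for a whole experiment. The sketch below does this for an experiment with 5 time points and 3 biological replicates, as in the Results; the function names are assumptions, and the output is only the lower bound implied by the 1.5 million reads per direction stated in this section.

```python
MIN_READS_PER_DIRECTION = 1_500_000  # per sample, from the library sequencing step
READ_LENGTH_BP = 75                  # PE75


def min_read_pairs(n_samples: int) -> int:
    """Minimum number of read pairs to request for n_samples libraries."""
    return n_samples * MIN_READS_PER_DIRECTION


def min_bases(n_samples: int) -> int:
    """Minimum sequenced bases: two reads of 75 bp per read pair."""
    return min_read_pairs(n_samples) * 2 * READ_LENGTH_BP


n_libraries = 5 * 3  # 5 time points x 3 biological replicates
print(f"{min_read_pairs(n_libraries):,} read pairs, {min_bases(n_libraries) / 1e9:.3f} Gb")
```

For 15 libraries, this corresponds to at least 22.5 million read pairs in total, before allowing for any uneven distribution of reads across multiplexed samples.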

12. Data analytics pipeline

NOTE: Figure 2 provides a graphical illustration of simplified procedures involved in the data analytics pipeline, which takes raw sequence reads (in FASTQ format) as the input and generates an abundance matrix with rows representing members of small RNA species of interest and columns representing samples. For paired-end sequencing, each sample corresponds to two FASTQ files, one for the forward reads and the other for the reverse reads. The complete data analytics pipeline, with all the associated scripts and a manual with extensive annotations for each step, is available at GitHub (https://github.com/Chenrx9293/AQRNA-seq-JoVE.git).

Retrieve raw sequence reads from the external sequencing center and assess sequencing quality using open-source programs such as FastQC¹⁴ or fastp¹⁵.
Create a reference sequence library in FASTA format.
NOTE: The key to the pipeline's adaptability to diverse small RNA classes is an appropriate reference sequence library. To obtain accurate abundance estimates for members of a specific RNA class of interest (e.g., miRNA), users should carefully curate a reference sequence library for use with the pipeline. All other instructions for pipeline execution remain the same across small RNA classes.
Create a directory named AQRNA-seq for implementing the data analytics pipeline and place all scripts, quality-filtered sequence reads, and the reference sequence library into this directory. Follow the detailed instructions on GitHub (https://github.com/Chenrx9293/AQRNA-seq-JoVE.git) to prepare sub-directories and make essential modifications to the files for targeting different organisms and/or small RNA species, as well as compatibility of the operating system and/or job scheduler.
Implement the data analytics pipeline following the steps outlined in the manual on GitHub (https://github.com/Chenrx9293/AQRNA-seq-JoVE.git), which contains descriptions, input and output files, as well as command lines for each step. In summary, the pipeline encompasses (i) trimming linker sequences and random nucleotides from the reads, (ii) filtering reads based on their length, (iii) mapping the reads to the reference sequences, (iv) resolving ambiguous mappings, and (v) generating the abundance matrix.
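The five stages summarized above can be illustrated with a toy example. The sketch below is not the pipeline distributed on GitHub. It is a simplified stand-in that assumes a read layout of insert, then random nucleotides, then the 3' linker; it uses exact substring matching in place of a real aligner and splits ambiguous reads equally among their matches. All sequences, the linker, the number of random nucleotides, and the length cutoff are made up for illustration.

```python
from collections import Counter

LINKER_3P = "AGATCGG"  # hypothetical 3' linker sequence
N_RANDOM = 2           # hypothetical number of random nucleotides before the linker
MIN_LEN = 8            # hypothetical minimum insert length

# Hypothetical reference library (name -> sequence); a real one is read from FASTA.
REFERENCES = {"tRNA-A": "GGGCTATAGCTCAGC", "tRNA-B": "GCCGATGTAGCTCAG"}


def trim(read):
    """(i) Remove the 3' linker and the random nucleotides; None if no linker is found."""
    idx = read.find(LINKER_3P)
    if idx == -1:
        return None
    return read[: max(idx - N_RANDOM, 0)]


def map_insert(insert):
    """(iii) Return every reference that contains the insert as an exact substring."""
    return [name for name, seq in REFERENCES.items() if insert in seq]


def count_sample(reads):
    counts = Counter()
    for read in reads:
        insert = trim(read)
        if insert is None or len(insert) < MIN_LEN:  # (ii) length filter
            continue
        hits = map_insert(insert)
        for name in hits:  # (iv) assumed rule: split ambiguous reads equally
            counts[name] += 1.0 / len(hits)
    return counts


def abundance_matrix(samples):
    """(v) Rows are reference sequences, columns are samples."""
    per_sample = {s: count_sample(reads) for s, reads in samples.items()}
    return {ref: {s: per_sample[s].get(ref, 0.0) for s in samples} for ref in REFERENCES}


samples = {
    "s1": [
        "GGGCTATAGCTCAGCACAGATCGGTT",  # full-length tRNA-A
        "GGGCTATAGCTCAGCACAGATCGGTT",  # full-length tRNA-A
        "TAGCTCAGGTAGATCGG",           # fragment shared by both references
        "GGGCTAAAGATCGG",              # insert too short, filtered out
    ],
    "s2": [
        "GCCGATGTAGCTCAGTTAGATCGG",    # full-length tRNA-B
        "GCCGATGTAGCTCAG",             # no linker found, discarded
    ],
}
matrix = abundance_matrix(samples)
for ref, row in matrix.items():
    print(ref, row)
```

The shared fragment in sample s1 shows why truncated reads complicate quantification: it contributes half a count to each reference under the equal-split rule assumed here, whereas the full-length reads map unambiguously.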

Results

Mycobacterium bovis BCG (bacilli de Calmette et Guérin) strain 1173P2 undergoing exponential growth was subjected to a time series (0, 4, 10, and 20 days) of nutrient starvation, followed by a 6-day resuscitation in nutrient-rich medium as previously presented in Hu et al.⁷. Small RNAs were isolated from bacterial culture, with three biological replicates, at each of the five designated time points. Illumina libraries were constructed using the above-described AQRNA-seq library prepar...

Discussion

The AQRNA-seq library preparation workflow is designed to maximize the capture of RNAs within a sample and minimize polymerase fall-off during reverse transcription⁷. Through a two-step linker ligation, novel DNA oligos (Linker 1 and Linker 2) are ligated in excess to fully complement the RNA within the sample. Excess linkers can be efficiently removed with RecJf, a 5' to 3' exonuclease specific to single-stranded DNAs, leaving the ligated products intact. In addition, AlkB treatment reduc...

Disclosures

P.C.D. is an inventor on two patents (PCT/US2019/013714, US 2019/0284624 A1) relating to the published work.

Acknowledgements

The authors of the present work are grateful to the authors of the original paper describing the AQRNA-seq technology⁷. This work was supported by grants from the National Institutes of Health (ES002109, AG063341, ES031576, ES031529, ES026856) and the National Research Foundation of Singapore through the Singapore-MIT Alliance for Research and Technology Antimicrobial Resistance IRG.

Materials

Name	Company	Catalog Number	Comments
2-ketoglutarate	Sigma-Aldrich	75890	Prepare a working solution (1 M) and store it at -20 °C
2100 Bioanalyzer Instrument	Agilent	G2938C
5'-deadenylase (50 U/μL)	New England Biolabs	M0331S (component #: M0331SVIAL)	Store at -20 °C
Adenosine 5'-Triphosphate (ATP)	New England Biolabs	M0437M (component #: N0437AVIAL)	NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM); prepare a working solution (10 mM) and store it at -20 °C
AGAROSE GPG/LE	AmericanBio	AB00972-00500	Store at ambient temperature
Ammonium iron(II) sulfate hexahydrate	Sigma-Aldrich	F2262	Prepare a working solution (0.25 M) and store it at -20 °C
Bioanalyzer Small RNA Analysis	Agilent	5067-1548	The Small RNA Analysis is used for checking the quality of input RNAs and the efficiency of enzymatic reactions (e.g., Linker 1 ligation)
Bovine Serum Albumin (BSA; 10 mg/mL)	New England Biolabs	B9000	This product was discontinued on 12/15/2022 and is replaced with Recombinant Albumin, Molecular Biology Grade (NEB B9200).
Chloroform	Macron Fine Chemicals	4441-10
Demethylase	ArrayStar	AS-FS-004	Demethylase comes with the rtStar tRNA Pretreatment & First-Strand cDNA Synthesis Kit (AS-FS-004)
Deoxynucleotide (dNTP) Solution Mix	New England Biolabs	N0447L (component #: N0447LVIAL)	This dNTP Solution Mix contains equimolar concentrations of dATP, dCTP, dGTP and dTTP (10 mM each)
Digital Dual Heat Block	VWR Scientific Products	13259-052	Heating block is used with the QIAquick Gel Extraction Kit
DyeEx 2.0 Spin Kit	Qiagen	63204	Effective at removing short remnants (e.g., oligos less than 10 bp in length)
Electrophoresis Power Supply	Bio-Rad Laboratories	PowerPac 300
Eppendorf PCR Tubes (0.5 mL)	Eppendorf	0030124537
Eppendorf Safe-Lock Tubes (0.5 mL)	Eppendorf	022363611
Eppendorf Safe-Lock Tubes (1.5 mL)	Eppendorf	022363204
Eppendorf Safe-Lock Tubes (2 mL)	Eppendorf	022363352
Ethyl alcohol (Ethanol), Pure	Sigma-Aldrich	E7023	The pure ethanol is used with the Oligo Clean and Concentrator Kit from Zymo Research
Gel Imaging System	Alpha Innotech	FluorChem 8900
Gel Loading Dye, Purple (6X), no SDS	New England Biolabs	N0556S (component #: B7025SVIAL)	NEB N0556S contains Quick-Load Purple 50 bp DNA Ladder and Gel Loading Dye, Purple (6X), no SDS
GENESYS 180 UV-Vis Spectrophotometer	Thermo Fisher Scientific	840-309000	The spectrophotometer is used for measuring the oligo concentrations using Beer's law
HEPES	Sigma-Aldrich	H4034	Prepare a working solution (1 M; pH = 8 with NaOH) and store it at -20 °C
Hydrochloric acid (HCl)	VWR Scientific Products	BDH3028	Prepare a working solution (5 M) and store it at ambient temperature
Isopropyl Alcohol (Isopropanol), Pure	Macron Fine Chemicals	3032-16	Isopropanol is used with the QIAquick Gel Extraction Kit
L-Ascorbic acid	Sigma-Aldrich	A5960	Prepare a working solution (0.5 M) and store it at -20 °C
Microcentrifuge	Eppendorf	5415D
NanoDrop 2000 Spectrophotometer	Thermo Fisher Scientific	ND-2000
NEBuffer 2 (10X)	New England Biolabs	M0264L (component #: B7002SVIAL)	NEB M0264L contains RecJf (30 U/μL) and NEBuffer 2 (10X); store at -20 °C
Nuclease-Free Water (not DEPC-Treated)	Thermo Fisher Scientific	AM9938
Oligo Clean & Concentrator Kit	Zymo Research	D4061	Store at ambient temperature
PEG 8000 (50% solution)	New England Biolabs	M0437M (component #: B1004SVIAL)	NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM); prepare a working solution (10 mM) and store it at -20 °C
Peltier Thermal Cycler	MJ Research	PTC-200
Phenol:chloroform:isoamyl alcohol 25:24:1 pH = 5.2	Thermo Fisher Scientific	J62336
PrimeScript Buffer (5X)	TaKaRa	2680A
PrimeScript Reverse Transcriptase	TaKaRa	2680A
QIAquick Gel Extraction Kit	Qiagen	28704	This kit requires a heating block and isopropanol
Quick-Load Purple 100 bp DNA Ladder	New England Biolabs	N0551S (component #: N0551SVIAL)
Quick-Load Purple 50 bp DNA Ladder	New England Biolabs	N0556S (component #: N0556SVIAL)	NEB N0556S contains Quick-Load Purple 50 bp DNA Ladder and Gel Loading Dye, Purple (6X), no SDS
RecJf (30 U/μL)	New England Biolabs	M0264L (component #: M0264LVIAL)	NEB M0264L contains RecJf (30 U/μL) and NEBuffer 2 (10X); store at -20 °C
RNase Inhibitor (murine; 40 U/μL)	New England Biolabs	M0314L (component #: M0314LVIAL)	Store at -20 °C
SeqAMP DNA Polymerase	TaKaRa	638509	TaKaRa 638509 contains SeqAMP DNA Polymerase and SeqAMP PCR Buffer (2X)
SeqAMP PCR Buffer (2X)	TaKaRa	638509	TaKaRa 638509 contains SeqAMP DNA Polymerase and SeqAMP PCR Buffer (2X)
Shrimp Alkaline Phosphatase (1 U/μL)	New England Biolabs	M0371L (component #: M0371LVIAL)
Sodium hydroxide (NaOH)	Sigma-Aldrich	S5881	Prepare a working solution (5 M) and store it at ambient temperature
T4 DNA Ligase (400 U/μL)	New England Biolabs	M0202L (component #: M0202LVIAL)	NEB M0202L contains T4 DNA Ligase (400 U/μL) and T4 DNA Ligase Reaction Buffer (10X)
T4 DNA Ligase Reaction Buffer (10X)	New England Biolabs	M0202L (component #: B0202SVIAL)	NEB M0202L contains T4 DNA Ligase (400 U/μL) and T4 DNA Ligase Reaction Buffer (10X)
T4 RNA Ligase 1 (30 U/μL)	New England Biolabs	M0437M (component #: M0437MVIAL)	NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM)
T4 RNA Ligase Reaction Buffer (10X)	New England Biolabs	M0437M (component #: B0216SVIAL)	NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM)
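
The table above mentions measuring oligo concentrations by Beer's law and preparing working solutions from stocks (e.g., 10 mM ATP from the 100 mM stock). The arithmetic behind both can be sketched as follows. This is a minimal illustration, not part of the published protocol: the function names are hypothetical, and the absorbance and extinction coefficient in the example are placeholder values; the actual extinction coefficient should be taken from the oligo supplier's datasheet.

```python
def oligo_concentration_molar(absorbance, extinction_coeff, path_length_cm=1.0):
    """Beer's law: A = epsilon * c * l, so c = A / (epsilon * l).

    absorbance        -- measured absorbance (unitless), e.g., at 260 nm
    extinction_coeff  -- molar extinction coefficient in L/(mol*cm)
    path_length_cm    -- cuvette path length in cm (1 cm assumed by default)
    Returns the concentration in mol/L.
    """
    if extinction_coeff <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (extinction_coeff * path_length_cm)


def stock_volume_for_dilution(stock_conc, final_conc, final_volume):
    """Dilution relation C1 * V1 = C2 * V2, solved for V1.

    Concentrations must share one unit (e.g., mM); the returned volume
    is in the same unit as final_volume (e.g., microliters).
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    # Placeholder values (assumptions): A260 = 0.5, epsilon = 250,000 L/(mol*cm).
    conc = oligo_concentration_molar(0.5, 250_000)
    print(f"Oligo concentration: {conc * 1e6:.2f} uM")

    # 100 uL of 10 mM ATP working solution from the 100 mM stock.
    stock_ul = stock_volume_for_dilution(100, 10, 100)
    print(f"Stock: {stock_ul:.1f} uL, water: {100 - stock_ul:.1f} uL")
```

If the sample is diluted before the absorbance reading, the result must be multiplied by the dilution factor to recover the concentration of the original oligo stock.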

References

Byron, S. A., Van Keuren-Jensen, K. R., Engelthaler, D. M., Carpten, J. D., Craig, D. W. Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat Rev Genet. 17 (5), 257-271 (2016).
Grillone, K., et al. Non-coding RNAs in cancer: platforms and strategies for investigating the genomic "dark matter". J Exp Clin Cancer Res. 39 (1), 117 (2020).
Hwang, B., Lee, J. H., Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med. 50 (8), 1-14 (2018).
Goh, J. J. L., et al. Highly specific multiplexed RNA imaging in tissues with split-FISH. Nat Methods. 17 (7), 689-693 (2020).
Moses, L., Pachter, L. Museum of spatial transcriptomics. Nat Methods. 19 (5), 534-546 (2022).
Cummings, B. B., et al. Improving genetic diagnosis in Mendelian disease with transcriptome sequencing. Sci Transl Med. 9 (386), 5209 (2017).
Hu, J. F., et al. Quantitative mapping of the cellular small RNA landscape with AQRNA-seq. Nat Biotechnol. 39 (8), 978-988 (2021).
Alon, S., et al. Barcoding bias in high-throughput multiplex sequencing of miRNA. Genome Res. 21 (9), 1506-1511 (2011).
Fuchs, R. T., Sun, Z., Zhuang, F., Robb, G. B. Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure. PLoS One. 10 (5), e0126049 (2015).
Pang, Y. L. J., Abo, R., Levine, S. S., Dedon, P. C. Diverse cell stresses induce unique patterns of tRNA up- and down-regulation: tRNA-seq for quantifying changes in tRNA copy number. Nucleic Acids Res. 42 (22), e170 (2014).
Machnicka, M. A., Olchowik, A., Grosjean, H., Bujnicki, J. M. Distribution and frequencies of post-transcriptional modifications in tRNAs. RNA Biol. 11 (12), 1619-1629 (2014).
Li, F., et al. Regulatory impact of RNA secondary structure across the Arabidopsis transcriptome. Plant Cell. 24 (11), 4346-4359 (2012).
García-Nieto, P. E., Wang, B., Fraser, H. B. Transcriptome diversity is a systematic source of variation in RNA-sequencing data. PLOS Comput Biol. 18 (3), e1009939 (2022).
FASTQC: a quality control tool for high throughput sequence data. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (2010).
Chen, S., Zhou, Y., Chen, Y., Gu, J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 34 (17), i884-i890 (2018).
Love, M. I., Huber, W., Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15 (12), 550 (2014).
R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing. (2022).
Ougland, R., et al. AlkB restores the biological function of mRNA and tRNA inactivated by chemical methylation. Mol Cell. 16 (1), 107-116 (2004).
Bernhardt, H. S., Tate, W. P. Primordial soup or vinaigrette: did the RNA world evolve at acidic pH? Biol Direct. 7, 4 (2012).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J. Basic local alignment search tool. J Mol Biol. 215 (3), 403-410 (1990).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17 (1), 10-12 (2011).
