How to use ReCo to get gRNA read counts from fastq data

17 Apr 2023

ReCo! is a new tool that processes deep-sequencing FASTQ data and generates gRNA read counts. This post explains how to use it.

Pre-installation

Install Cutadapt

Cutadapt can be installed with sudo apt install cutadapt on Ubuntu. The ReCo README says it uses version 2.8, but version 3.5, which ships with Ubuntu 22, seems to work fine.

Install bowtie2

The Linux build of bowtie2 2.3.0 can be downloaded as an archive. After decompressing it, make sure to add its root folder to the PATH.

Install Python3
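Once Python 3 is available, a short script can confirm that the other two prerequisites are visible on the PATH. This is only a sketch of such a check; the helper name find_tools is my own and is not part of ReCo.

```python
import shutil
import sys


def find_tools(names):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, path in find_tools(["cutadapt", "bowtie2"]).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If bowtie2 is reported as not found, the folder from the previous step has probably not been added to the PATH of the current shell.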

Installation

Clone or download the source code from GitHub: https://github.com/MaWeffm/ReCo.
Create a virtual environment with python3 -m venv venv
Activate the virtual environment with source venv/bin/activate
Install the Python dependencies with pip install -r requirements.txt

Prepare data

First, we need the FASTQ data from deep sequencing. The files can be plain text or gzip-compressed (.gz).
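Before running anything, it can be useful to count the reads in each FASTQ file, for example to compare against the number the sequencing facility reported. The sketch below handles both plain and .gz files and assumes the standard four-lines-per-read FASTQ layout; the function names are my own.

```python
import gzip


def open_fastq(path):
    """Open a plain-text or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_reads(path):
    """Count reads, assuming each read occupies exactly four lines."""
    with open_fastq(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4
```

Calling count_reads on a file path returns the read count as an integer. Files with wrapped (multi-line) sequences would need a real FASTQ parser instead.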

Besides the FASTQ data, we need two other files.

A gRNA library file (.txt) in tab-separated (TSV) format, listing the unique name and the sequence of each gRNA. The file must not have a header row.


Zglp1-0	CCTTCTCAATACCCATTCG
Zglp1-1	CTTCTCAATACCCATTCGC
Zglp1-2	GACCTGGGTGGCCGGCGAA
Zglp1-3	ACCTGGGTGGCCGGCGAAG
Zglp1-4	CGCGCCACGCACCTGATAT
Mdn1-0	ATGCGAACCCAGTGCGCTA
…	…
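A malformed library file is easy to catch early. The following is a minimal sketch of a validator for the format shown above (two tab-separated columns, no header, unique names). The ACGT-only check is my own assumption about what a valid sequence looks like, and load_library is a hypothetical helper, not a ReCo function.

```python
import csv


def load_library(path):
    """Read a headerless two-column TSV into {name: sequence}.

    Raises ValueError on a wrong column count, a duplicate name,
    or a sequence with characters other than A, C, G, T.
    """
    guides = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_no, row in enumerate(reader, start=1):
            if not row:
                continue  # skip blank lines
            if len(row) != 2:
                raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
            name, seq = row[0].strip(), row[1].strip().upper()
            if name in guides:
                raise ValueError(f"line {line_no}: duplicate gRNA name {name!r}")
            if not seq or set(seq) - set("ACGT"):
                raise ValueError(f"line {line_no}: unexpected sequence {seq!r}")
            guides[name] = seq
    return guides
```

For example, len(load_library("primary_lib1.txt")) gives the number of gRNAs in the library.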

A sample sheet that contains the information for all the samples.

Sample name	Sample type	FastQ read 1	Lib 1	Expected reads
TG-Li-CRISPR-Control 1	single	TG-Li-CRISPR-Control_S1_R1_001.fastq.gz	primary_lib1.txt	11,834,515
TG-Li-CRISPR-Control 2	single	TG-Li-CRISPR-Control_S1_R2_001.fastq.gz	primary_lib1.txt	11,834,515
TG-Li-CRISPR-Enriched 1	single	TG-Li-CRISPR-Enrich_S2_R1_001.fastq.gz	primary_lib1.txt	11,000,108
TG-Li-CRISPR-Enriched 2	single	TG-Li-CRISPR-Enrich_S2_R2_001.fastq.gz	primary_lib1.txt	11,000,108
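A common source of trouble is a sample sheet that points to a file that does not exist. The sketch below checks the "FastQ read 1" and "Lib 1" columns against a data folder. It takes the rows as a list of dictionaries, so loading the sheet itself is left out, and it assumes that all files sit in one folder, which may not match how ReCo resolves paths. Both helper names are my own.

```python
from pathlib import Path


def parse_expected_reads(text):
    """Convert a value such as '11,834,515' to an integer."""
    return int(str(text).replace(",", ""))


def missing_files(rows, data_dir):
    """Return (sample name, column, file name) for every referenced file
    that is not present in data_dir."""
    data_dir = Path(data_dir)
    missing = []
    for row in rows:
        for column in ("FastQ read 1", "Lib 1"):
            name = row.get(column)
            if name and not (data_dir / name).exists():
                missing.append((row.get("Sample name"), column, name))
    return missing
```

An empty list from missing_files means every file named in the sheet was found.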

Run ReCo!

To run the program, use the following command:

PATH="/path/to/bowtie2-2.3.0:$PATH" /path/to/python /path/to/reco/ReCo/ReCo.py cli -s /path/to/NGS_all.xlsx --o /path/to/output/ -j 8 -r
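When the run is part of a larger script, the same call can be assembled in Python. This is a sketch only: the flags are copied as-is from the command above, the paths are placeholders, and build_reco_call and run_reco are hypothetical helpers.

```python
import os
import subprocess


def build_reco_call(bowtie2_dir, python_exe, reco_script,
                    sample_sheet, output_dir, jobs=8):
    """Return (argv, env) mirroring the shell command above."""
    env = dict(os.environ)
    # Put the bowtie2 folder in front of the existing PATH.
    env["PATH"] = f"{bowtie2_dir}{os.pathsep}{env.get('PATH', '')}"
    argv = [python_exe, reco_script, "cli",
            "-s", sample_sheet, "--o", output_dir,
            "-j", str(jobs), "-r"]
    return argv, env


def run_reco(argv, env):
    """Run ReCo and raise CalledProcessError if it exits with an error."""
    subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    argv, env = build_reco_call(
        "/path/to/bowtie2-2.3.0", "/path/to/python",
        "/path/to/reco/ReCo/ReCo.py",
        "/path/to/NGS_all.xlsx", "/path/to/output/")
    print(" ".join(argv))  # inspect the command; call run_reco(argv, env) to execute it
```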

The counts are written to a file whose name ends with final_guidecounts.csv.
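To locate the result files after a run, something like the following can be used. It searches the output folder recursively because I am not assuming where exactly ReCo places them, and the preview function prints raw rows rather than assuming a particular column layout. Both function names are my own.

```python
import csv
from pathlib import Path


def find_count_files(output_dir):
    """Return all files under output_dir whose name ends with final_guidecounts.csv."""
    return sorted(Path(output_dir).rglob("*final_guidecounts.csv"))


def preview(path, n=5):
    """Return the first n rows of a CSV file as lists of strings."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        return [row for _, row in zip(range(n), reader)]


if __name__ == "__main__":
    for path in find_count_files("/path/to/output/"):
        print(path)
        for row in preview(path):
            print("  ", row)
```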

How to use NMR to solve 3D molecule structures

15 Apr 2023

There are a few steps to get 3D molecule structures from NMR data:

Assign NMR peaks to atoms.
Use NOE and ROE data to get the distance between protons.
Use molecular dynamics to calculate the 3D structure based on the distance information.

nmrglue

09 Apr 2023

nmrglue is a Python module for working with NMR data. It supports multiple NMR formats, including Bruker, Varian, and Sparky. The complete list is in the documentation.

In this post, I will show how to use nmrglue to process and visualize 2D UCSF (Sparky) NMR data.

Visualize 2D NMR spectrum with Python

06 Apr 2023

To better understand the NMR files, we can use Python's Matplotlib to visualize them and compare the results with what the NMR software shows.

NMR software and file formats

05 Apr 2023

I recently helped my previous PI analyze NMR (Nuclear Magnetic Resonance) data for some molecules. I am used to analyzing NMR data with Sparky. In this post, I discuss some NMR-related software and file formats.

Asurin Software Developer