Asurin Software Developer

How to use ReCo to get gRNA read counts from fastq data

ReCo! is a new library to process deep sequencing fastq data and generates gRNA read counts. This post will talk about how to use the ReCo program.

Pre-installation

Install Cutadapt

Cutadapt could be installed with sudo apt install cutadapt on ubuntu. The README of ReCo said they were using version 2.8 but the 3.5 that came with Ubuntu 22 seems working fine.

Install bowtie2

Linux version of bowtie2 2.3.0 could be downloaded. After decompress the file, make sure add the root folder to the PATH.

Install Python3

Installation

  1. Git Clone or download the source code from github https://github.com/MaWeffm/ReCo.
  2. Create a virtual environment with pytohn3 -m venv venv
  3. Activate the virtual environment with source venv/bin/activate
  4. Install the python dependencies with pip install -r requirements.txt

Prepare data

First of all, we need the fastq data from deep sequencing. The fastq data could be in text format or gz format.

Other than the fastq data from deep sequencing, we need 2 other files.

  1. A txt gRNA library in tsv format with all the unique names and sequence for gRNA. The file should not have headers.

    Zglp1-0 CCTTCTCAATACCCATTCG
    Zglp1-1 CTTCTCAATACCCATTCGC
    Zglp1-2 GACCTGGGTGGCCGGCGAA
    Zglp1-3 ACCTGGGTGGCCGGCGAAG
    Zglp1-4 CGCGCCACGCACCTGATAT
    Mdn1-0 ATGCGAACCCAGTGCGCTA
  2. A sample sheet contains all the information for the samples.

Sample name Sample type Vector FastQ read 1 FastQ read 2 Lib 1 Lib 2 Expected reads
TG-Li-CRISPR-Control 1 single   TG-Li-CRISPR-Control_S1_R1_001.fastq.gz   primary_lib1.txt   11,834,515
TG-Li-CRISPR-Control 2 single   TG-Li-CRISPR-Control_S1_R2_001.fastq.gz   primary_lib1.txt   11,834,515
TG-Li-CRISPR-Enriched 1 single   TG-Li-CRISPR-Enrich_S2_R1_001.fastq.gz   primary_lib1.txt   11,000,108
TG-Li-CRISPR-Enriched 2 single   TG-Li-CRISPR-Enrich_S2_R2_001.fastq.gz   primary_lib1.txt   11,000,108

Run Reco!

To run the program use the following command:

PATH="/path/to/bowtie2-2.3.0:$PATH" /path/to/python /path/to/reco/ReCo/ReCo.py cli -s /path/to/NGS_all.xlsx --o /path/to/output/ -j 8 -r

The counts will be in file with suffix final_guidecounts.csv.

How to use NMR to solve 3D molecule structures

There are a few steps to get 3D molecule structures from NMR data:

  1. Assign NMR peaks to atoms.
  2. Use NOE and ROE data to get the distance between protons.
  3. Use Molecular dynamics to calculate the 3D structure based on the distance information. read more

nmrglue

nmrglue is a python module to work with nmr data. It supports multiple nmr formats including bruker, varian, sparky, etc. The complete list is in the document.

In this post, I will show how to use nmrglue to process and visualize 2D ucsf(sparky) NMR data. read more

Visualize 2D NMR spectrum with Python

To better understand the NMR files, we can use Python’s Matplotlib to visualize the NMR files and compare the results with the NMR software. read more

NMR software and file formats

Recently worked on helping my previous PI analyze some NMR (Nuclear Magnetic Resonance) data for some molecules. I am used to analyze NMR data with Sparky. I am going to discuss about some NMR related software and file formats in this post. read more