Program Details
Degree: Courses
Major: Data Analysis
Area of study: Information and Communication Technologies | Natural Science
Course Language: English
About Program

Program Overview


Introduction to the University Program

This program trains students in bioinformatics, with a focus on Next-Generation Sequencing (NGS) analysis and R programming.


Program Structure

The program is divided into several topics, including:


  • Next-Generation Sequencing Analysis Resources
  • Prerequisites, such as Intro to R and Introduction to Linux
  • NGS Sequencing Technology and File Formats, including FASTA, FASTQ, and SAM/BAM/CRAM formats
  • Alignment, Visualization, and Variant Calling
  • RNA-seq Analysis, including Aligning RNA-seq data and Introduction to R
  • HPC, including Resources for editing files on the HPC and SLURM
  • ChIP-seq analysis, including ChIP-seq considerations and deepTools2
  • De novo genome assembly, including Pre-processing and QC
  • Single cell RNA sequencing, including Prerequisites and Seurat
  • Metagenomics, including Quality Control and Shotgun Metagenomics
  • Deep Learning using Keras

R Programming

R is a powerful statistical programming language that allows scientists to perform statistical computing and visualization. It is based on the S programming language and is open source, making it accessible to the community at no charge.


What is R?

R is a well-developed programming language that contains all essential elements of a computer programming language, such as conditionals, loops, and user-defined functions.
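
As a minimal sketch of the three elements named above, the following combines a user-defined function, a conditional, and a loop. The function name classify_quality, the threshold of 30, and the score values are all hypothetical and chosen only for illustration.

```r
# Hypothetical helper: label a quality score as "pass" or "fail".
# The default threshold of 30 is an arbitrary illustrative value.
classify_quality <- function(score, threshold = 30) {
  if (score >= threshold) {
    "pass"
  } else {
    "fail"
  }
}

scores <- c(12, 35, 30, 28)            # made-up example values
labels <- character(length(scores))    # pre-allocate the result vector

# The for loop applies the function to each element in turn.
for (i in seq_along(scores)) {
  labels[i] <- classify_quality(scores[i])
}

print(labels)   # "fail" "pass" "pass" "fail"
```

In everyday R code, a vectorized form such as ifelse(scores >= 30, "pass", "fail") is often preferred over an explicit loop; the loop is shown here only to illustrate the language constructs.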


Main Websites to Learn More About R

The main websites to learn more about R are:


  • R-project: The main destination to find everything you want to know about R, including links to tutorials and learning about how you can contribute.
  • CRAN (Comprehensive R Archive Network): The place to download R and contributed packages.

Installing R

There are many tutorials on the web that will help you install R. For this course, we will also be using RStudio as the IDE to help us manage our R data and code.


Using R

R can be used in several ways, including:


  • Using R on the command line, which requires knowledge of command-line operations
  • Using the graphical interface bundled with R on Windows (Rgui), which is more convenient than the command line
  • Using RStudio, which is a much more powerful and user-friendly interface compared to Rgui

R Objects

R objects are containers for pieces of data or lines of code. They can be named so they can be accessed at any point.
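
A short sketch of naming objects, assuming made-up names and values (gene_count, sample_name, and square are illustrative only):

```r
# Assign values to named objects with the assignment operator <-
gene_count  <- 42L          # a piece of data (an integer)
sample_name <- "sample_A"   # another piece of data (a character string)

# Functions are objects too, so code can be stored under a name.
square <- function(x) x^2

# Once named, an object can be used at any later point.
result <- square(gene_count)
print(result)        # 1764
print(sample_name)   # "sample_A"
```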


Modes

The mode of an R object can be:


  • Numeric: numbers
  • Complex: complex numbers
  • Logical: TRUE/FALSE
  • Character: text strings
  • Raw: bytes
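
The modes listed above can be checked directly with mode(). The sketch below uses arbitrary example values and also calls typeof() to show where the two functions differ.

```r
# One example value per mode; the values themselves are arbitrary.
values <- list(3.14, 2L, 1 + 2i, TRUE, "ACGT", charToRaw("A"))

modes <- sapply(values, mode)
types <- sapply(values, typeof)

print(modes)  # "numeric" "numeric" "complex" "logical" "character" "raw"
print(types)  # "double"  "integer" "complex" "logical" "character" "raw"
```

Note that mode() reports both doubles and integers as "numeric", while typeof() distinguishes them.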

Classes

Common classes (data structures) of R objects include:


  • Vector
  • Matrix
  • Array
  • Factor
  • Data frame
  • List
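
As a rough illustration, one object of each kind listed above can be created and inspected with class(). The object names and contents are made up. The exact strings that class() returns differ from the headings above (for example, a plain numeric vector reports "numeric"), so the checks below use the is.*() predicates.

```r
v  <- c(1.5, 2.5, 3.5)                          # vector
m  <- matrix(1:6, nrow = 2)                     # matrix
a  <- array(1:24, dim = c(2, 3, 4))             # three-dimensional array
f  <- factor(c("control", "treated", "control"))  # factor
df <- data.frame(id = 1:2, group = c("a", "b"))   # data frame
l  <- list(v, m)                                # list

class(v)    # "numeric": a plain vector reports the class of its contents
class(f)    # "factor"
class(df)   # "data.frame"
class(l)    # "list"

# Predicates are a robust way to test the structure of an object.
checks <- c(is.vector(v), is.matrix(m), is.array(a),
            is.factor(f), is.data.frame(df), is.list(l))
print(checks)   # all TRUE
```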

Vectors

Vectors are the most basic data structure in R. They are sequences of data in which every element has the same mode, such as numbers, characters, or logical values.
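
A brief sketch of creating and indexing vectors with c(); the names and values are illustrative only.

```r
depth  <- c(10, 25, 40)           # numeric vector
bases  <- c("A", "C", "G", "T")   # character vector
passed <- depth > 20              # logical vector: FALSE TRUE TRUE

depth[2]        # 25: indexing starts at 1
depth[passed]   # 25 40: a logical vector can be used as an index
depth * 2       # 20 50 80: arithmetic is applied element by element

# Mixing modes in c() coerces every element to one common mode.
mixed <- c(1, "a", TRUE)
print(mixed)    # "1" "a" "TRUE"
```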


Matrices

Matrices are two-dimensional arrays. They can be created using the matrix() function or the array() function.
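
For example, the same 2 x 3 matrix can be built with either function. This is a minimal sketch with arbitrary values.

```r
# matrix() fills column by column unless byrow = TRUE is given.
m1 <- matrix(1:6, nrow = 2, ncol = 3)

# array() with a two-element dim produces the same layout.
m2 <- array(1:6, dim = c(2, 3))

print(m1)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

all(m1 == m2)   # TRUE: both hold the same values in the same positions
dim(m1)         # 2 3
m1[2, 3]        # 6: element in row 2, column 3
```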


Data Frames

Data frames are composed of vectors of the same length but can be of different modes. They are perfect for mixed-type biomedical data.
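
A small sketch of a mixed-type data frame, using invented sample records (the column names and values are hypothetical):

```r
# Each column is a vector of length 3, but the columns differ in mode.
samples <- data.frame(
  sample_id = c("S1", "S2", "S3"),   # character
  age       = c(34, 51, 47),         # numeric
  treated   = c(TRUE, FALSE, TRUE),  # logical
  stringsAsFactors = FALSE           # keep text as character in older R versions
)

col_modes <- sapply(samples, mode)
print(col_modes)   # sample_id "character", age "numeric", treated "logical"

mean(samples$age)  # 44: columns are accessed with $

# Rows can be filtered with a logical column.
treated_only <- samples[samples$treated, ]
print(treated_only$sample_id)   # "S1" "S3"
```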


Lists

Lists are collections of objects. They can contain vectors, matrices, and data frames of different lengths.
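
The following sketch bundles objects of different kinds and lengths into one list. The element names and contents are made up for illustration.

```r
experiment <- list(
  name     = "run_01",                   # a single character string
  depths   = c(10, 25, 40),              # a numeric vector of length 3
  counts   = matrix(1:4, nrow = 2),      # a 2 x 2 matrix
  metadata = data.frame(sample_id = c("S1", "S2"),
                        treated   = c(TRUE, FALSE))
)

length(experiment)            # 4 elements
names(experiment)             # "name" "depths" "counts" "metadata"

experiment$name               # "run_01": access by name with $
experiment[["depths"]][2]     # 25: [[ ]] extracts the element itself
```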


Some Useful Commands

Some useful commands in R include:


  • mode() and typeof(): provide mode and type of the object
  • attributes(): provide useful information such as dimensions and names
  • as(): can be used to coerce one object type to another
  • sample(): draws a random sample of elements from a vector
  • order(): returns the element positions (indices) that would sort a vector in ascending order
  • sort(): returns the values in ascending order
  • paste(): creates a character vector by concatenating vectors element by element
  • print(): prints content of an object to screen
  • range(): returns minimum and maximum value of a vector
  • t(): transpose a matrix or data frame
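
Several of the commands above can be tried on small made-up inputs, as in this sketch. The seed value is arbitrary and is set only so that sample() gives a repeatable result.

```r
x <- c(30, 10, 20)

order(x)        # 2 3 1: positions of the elements in ascending order
sort(x)         # 10 20 30
x[order(x)]     # 10 20 30: indexing by order() gives the same result as sort()
range(x)        # 10 30

paste("sample", 1:3, sep = "_")   # "sample_1" "sample_2" "sample_3"

# Coerce numbers to character strings; as() lives in the methods package.
x_chr <- methods::as(x, "character")
print(x_chr)    # "30" "10" "20"

set.seed(1)                   # arbitrary seed for reproducibility
picked <- sample(1:10, 3)     # three distinct values drawn from 1 to 10
print(picked)

m <- matrix(1:6, nrow = 2)
attributes(m)   # $dim: 2 3
dim(t(m))       # 3 2: transposing swaps rows and columns
```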

Conclusion

The university program provides a comprehensive education in various fields, including Next-Generation Sequencing Analysis, R Programming, and Bioinformatics. R is a powerful statistical programming language that allows scientists to perform statistical computing and visualization. Understanding R objects, including modes, classes, vectors, matrices, data frames, and lists, is essential for working with R. Additionally, knowing some useful commands in R can help with data analysis and manipulation.

