** Bachelor of Science (BS) in Biology \hfill UC Riverside
* Skills
+** Data Science
++ Reproducible, scalable analyses using *R*, *Perl*, and Python with
+ workflows on cloud- and cluster-based systems on terabyte-scale
+ datasets
++ Experimental design and correction to overcome multiple testing,
+ confounders, and batch effects using Bayesian and frequentist
+ methods
++ Design, development, and deployment of algorithms and data-driven
+ products, including APIs, reports, and interactive web applications
++ Statistical modeling (regression, inference, prediction/forecasting,
+ time series, and machine learning) in very large (> 1 TB) datasets
++ Data mining, cleaning, processing and quality assurance of data
+ sources and products using tidy data formalisms
++ Visualization using *R*, ggplot, Shiny, and custom-written routines
+** Software Development
++ Languages: Perl, R, C, C++, Python, Groovy, sh, make
++ Collaborative Development: git, Travis CI, continuous integration,
+ automated testing
++ Web, Mobile: Shiny, jQuery, JavaScript
++ Databases: PostgreSQL (PL/pgSQL), SQLite, MySQL, NoSQL
++ Office Software: Gnumeric, LibreOffice, \LaTeX, Word, Excel,
+ PowerPoint
+ ** Genomics and Epigenomics
+ + NGS and array-based genomics and epigenomics of complex human
+ diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
+ bead arrays, and Affymetrix microarrays from sample collection to
+ publication
+ + Reproducible, scalable bioinformatics analysis using make,
+ Nextflow, and CWL-based workflows on cloud- and cluster-based
+ systems on terabyte-scale datasets
+ + Alignment, annotation, and variant calling using existing and custom
+ software, including GATK, BWA, STAR, and kallisto
+ + Experimental design and correction to overcome multiple
+ testing, confounders, and batch effects using Bayesian and
+ frequentist methods
+ + Using evolutionary genomics to identify causal human variants
+ ** Statistics
+ + Statistical modeling (regression, inference, prediction, and
+ learning) in very large (> 1 TB) datasets
+ + Addressing confounders and batch effects
+ + Reproducible research
** Big Data
+ Parallel and Cloud Computing (Slurm, TORQUE, AWS, OpenStack, Azure)
+ Inter-process communication: MPI, OpenMP