+# Skills
+## Data Science
++ Reproducible, scalable analyses of terabyte-scale datasets using
+ *R*, *perl*, and python workflows on cloud- and cluster-based
+ systems
++ Experimental design and correction to overcome multiple testing,
+ confounders, and batch effects using Bayesian and frequentist
+ methods
++ Design, development, and deployment of algorithms and data-driven
+ products, including APIs, reports, and interactive web applications
++ Statistical modeling (regression, inference, prediction/forecasting,
+ time series, and machine learning) in very large (> 1 TB) datasets
++ Data mining, cleaning, processing, and quality assurance of data
+ sources and products using tidy data formalisms
++ Visualization using *R*, ggplot, Shiny, and custom-written routines
+## Software Development
++ Languages: perl, R, C, C++, python, groovy, sh, make
++ Collaborative Development: git, travis, continuous integration,
+ automated testing
++ Web, Mobile: Shiny, jQuery, JavaScript
++ Databases: PostgreSQL (PL/pgSQL), SQLite, MySQL, NoSQL
++ Office Software: Gnumeric, LibreOffice, \LaTeX, Word, Excel,
+ PowerPoint
+## Genomics and Epigenomics
++ NGS and array-based Genomics and Epigenomics of complex human
+ diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
+ bead arrays, and Affymetrix microarrays from sample collection to
+ publication.
++ Reproducible, scalable bioinformatics analysis using make,
+ Nextflow, and CWL-based workflows on cloud- and cluster-based
+ systems on terabyte-scale datasets
++ Alignment, annotation, and variant calling using existing and custom
+ software, including GATK, bwa, STAR, and kallisto.
++ Experimental design and correction to overcome multiple
+ testing, confounders, and batch effects using Bayesian and
+ frequentist methods
++ Using evolutionary genomics to identify causal human variants
+## Statistics
++ Statistical modeling (regression, inference, prediction, and
+ learning) in very large (> 1 TB) datasets
++ Addressing confounders and batch effects
++ Reproducible research
+## Big Data
++ Parallel and Cloud Computing (slurm, torque, AWS, OpenStack, Azure)
++ Inter-process communication: MPI, OpenMP
++ File storage: Gluster, CephFS, GPFS, Lustre
++ Linux system administration
+## Genomics and Epigenomics
++ Linkage and association-based mapping of complex phenotypes using
+ next-generation sequencing and arrays
++ Alignment, annotation, and variant calling using existing and custom
+ software
+## Mentoring and Leadership
++ Mentored graduate students as well as Outreachy and Google Summer
+ of Code interns
++ Former chair of Debian's Technical Committee
+## Communication
++ Strong written communication skills as evidenced by publication
+ record
++ Strong verbal and presentation skills as evidenced by presentation
+ and teaching record
+## Consortia Involvement
++ *H3ABioNet*: Generating workflows and cloud resources for H3Africa
++ *Psychiatric Genomics Consortium*: Identification of epigenetic
+ variants which are correlated with PTSD.
++ *SLEGEN*: Systemic lupus erythematosus genetics consortium.
+# Authored Software
++ *[Debbugs](http://bugs.debian.org)*: Bug tracking software for the Debian GNU/Linux
+ distribution. [https://bugs.debian.org]
++ *[CairoHacks](https://git.donarmstrong.com/r/CairoHacks.git)*: Bookmarks and Raster images for large PDF plots in R.
++ *[Function2Gene](http://rzlab.ucr.edu/function2gene/)*: Gene selection tool based on literature mining which
+ enables Bayesian approaches to significance testing.
++ *[Helical Wheel Projections](http://rzlab.ucr.edu/scripts/wheel/wheel.cgi?sequence=ABCDEFGHIJLKMNOP&submit=Submit)*: Web-based tool to draw helical wheel
+ protein projections. [http://rzlab.ucr.edu/scripts/wheel]
+# Publications and Presentations
++ 24 peer-reviewed publications cited over 1800 times:
+ https://dla2.us/pubs
++ h-index of 11
++ Numerous invited talks on EWAS of PTSD, genetics of SLE, and Open
+ Source: https://dla2.us/pres