From: Don Armstrong <don@donarmstrong.com>
Date: Thu, 22 Feb 2018 04:19:23 +0000 (-0800)
Subject: Merge branch 'master' into jobs/data_scientist
X-Git-Url: https://git.donarmstrong.com/?a=commitdiff_plain;h=b38c97c6bc73b0c05266f82de1652bfff0ed9bed;p=resume.git

Merge branch 'master' into jobs/data_scientist
---

b38c97c6bc73b0c05266f82de1652bfff0ed9bed
diff --cc don_armstrong_resume.org
index 5dcfff3,57c9f5e..f42d8d6
--- a/don_armstrong_resume.org
+++ b/don_armstrong_resume.org
@@@ -29,28 -64,25 +64,47 @@@
  ** Bachelor of Science (BS) in Biology \hfill UC Riverside
  
  * Skills
 +** Data Science
 ++ Reproducible, scalable analyses using *R*, *perl*, and python with
 +  workflows on cloud- and cluster-based systems on terabyte-scale
 +  datasets
 ++ Experimental design and correction to overcome multiple testing,
 +  confounders, and batch effects using Bayesian and frequentist
 +  methods
 ++ Design, development, and deployment of algorithms and data-driven
 +  products, including APIs, reports, and interactive web applications
 ++ Statistical modeling (regression, inference, prediction/forecasting,
 +  time series, and machine learning in very large (> 1TB) datasets)
 ++ Data mining, cleaning, processing and quality assurance of data
 +  sources and products using tidydata formalisms
 ++ Visualization using *R*, ggplot, Shiny, and custom written routines.
 +** Software Development
 ++ Languages: perl, R, C, C++, python, groovy, sh, make
 ++ Collaborative Development: git, travis, continuous integration,
 +  automated testing
 ++ Web, Mobile: Shiny, jQuery, JavaScript
 ++ Databases: PostgreSQL (PL/SQL), SQLite, MySQL, NoSQL
 ++ Office Software: Gnumeric, LibreOffice, \LaTeX, Word, Excel,
 +  PowerPoint
+ ** Genomics and Epigenomics
+ + NGS and array-based Genomics and Epigenomics of complex human
+   diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
+   bead arrays, and Affymetrix microarrays from sample collection to
+   publication.
+ + Reproducible, scalable bioinformatics analysis using make,
+   nextflow, and cwl based workflows on cloud- and cluster-based
+   systems on terabyte-scale datasets
+ + Alignment, annotation, and variant calling using existing and custom
+   software, including GATK, bwa, STAR, and kallisto.
+ + Experimental design and correction to overcome multiple
+   testing, confounders, and batch effects using Bayesian and
+   frequentist methods
+ + Using evolutionary genomics to identify causal human variants
+ ** Statistics
+ + Statistical modeling (regression, inference, prediction, and
+   learning in very large (> 1TB) datasets)
+ + Addressing confounders and batch effects
+ + Reproducible research
  ** Big Data
  + Parallel and Cloud Computing (slurm, torque, AWS, OpenStack, Azure)
  + Inter-process communication: MPI, OpenMP