#+OPTIONS: ^:nil
#+OPTIONS: toc:nil
+#+OPTIONS: auto-id:f
#+TITLE: Resume
#+AUTHOR: Don L. Armstrong
#+LATEX_CMD: xelatex
# #+LATEX_CLASS_OPTIONS: [10pt,breaklinks]
* Experience
-** Team Lead Data Engineering at Bayer Crop Science \hfill 2018--Present
+** Team Lead Data Engineering at Ginkgo Bioworks \hfill 2022--Present
++ Led and managed a team of data engineers, system administrators,
+ statisticians, bioinformaticians, and scientists at the PhD level
+ working within the AgBio unit of Ginkgo Bioworks.
++ Mentored and coached team members in data science, bioinformatics,
+ data engineering, and statistics.
++ Key leadership role in successful merger of AgBio unit with Ginkgo,
+ including all relevant R&D business applications and data-adjacent
+ systems.
+** Team Lead Data Engineering at Bayer Crop Science \hfill 2018--2022
+ Hired, managed, and developed team of 5+ Data Engineers, Systems
Administrators, and Business Analysts working within the Biologics
R&D unit of Bayer Crop Science enabling data capture, data
genomics, metabolomics, transcriptomics, spectroscopic, phenotypic
(/in vitro/ and /in planta/), and fermentation/formulation process
data for discovery and development
-+ Lead the development of multiple tools in python and R while
- coaching, mentoring, and developing multiple developers and
- engineers
++ Led the development of multiple systems while coaching, mentoring,
+ and developing developers and engineers
+ Served as a key collaborator on multiple cross-functional and
cross-divisional projects, including leading the architecture of a
life science collaboration using serverless architecture to provide
machine-learning estimates of critical parameters from
spectrographic measurements
++ Established and developed network of internal and external contacts
+ for technical implementation of Bayer program goals.
** Debian Developer \hfill 2004--Present
+ Maintained, managed configurations, and resolved issues in multiple
packages written in R, perl, python, scheme, C++, and C.
+ Provided vendor-level support for complex systems integration issues
on Debian GNU/Linux systems.
** Research Scientist at UIUC \hfill 2015--2017
-+ Primarily responsible for the planning, design, organization,
- execution, and analysis of multiple complex epidemiological studies
- involving epigenomics, transcriptomics, and genomics of diseases of
- pregnancy and post-traumatic stress disorder.
++ Planned, designed, organized, executed, and analyzed multiple
+ complex epidemiological studies involving epigenomics,
+ transcriptomics, and genomics of diseases of pregnancy and
+ post-traumatic stress disorder.
+ Published results in scientific publications and presented results
orally at major scientific conferences.
+ Wrote and completed grants, including budgeting, scientific
+ Wrote software and designed relational databases using R, perl, C,
SQL, make, and very large computational systems ([[https://bluewaters.ncsa.illinois.edu/][Blue Waters]])
** Postdoctoral Researcher at USC \hfill 2013--2015
-+ Primarily responsible for the design, execution, and analysis of an
- epidemiological study to identify genomic variants associated with
- systemic lupus erythematosus using targeted deep sequencing.
-+ Designed, budgeted, configured, maintained, and supported a secure
- linux analysis cluster (MPI/torque) with a shared filesystem (NFS
- over gluster) for statistical analyses.
++ Designed, executed, and analyzed an epidemiological study to
+ identify genomic variants associated with systemic lupus
+ erythematosus using targeted deep sequencing.
+ Wrote multiple pieces of software to reproducibly analyze and
archive large datasets resulting from genomic sequencing.
+ Coordinated with clinicians, molecular biologists, and biologists to
produce analyses and major reports.
** Postdoctoral Researcher at UCR \hfill 2010--2012
-+ Primarily responsible for the execution and analysis of an
- epidemiological study to identify genomic variants associated with
- systemic lupus erythematosus using prior information and array based
- approaches in a trio and cross sectional study of individuals from
- the Los Angeles and greater United States.
++ Executed and analyzed an epidemiological study to identify genomic
+ variants associated with systemic lupus erythematosus using prior
+ information and array-based approaches in a trio and cross-sectional
+ study of individuals from Los Angeles and the greater United States.
+ Wrote and maintained multiple software components to reproducibly
perform the analyses.
-** Independent Systems Administrator \hfill 2004--2018
-+ Researched, recommended, budgeted, designed, deployed, configured,
- operated, and monitored highly-available high-performance enterprise
- hardware and software for web applications, authentication, backup,
- email, and databases.
-+ Full life-cycle support of medium and small business networking
- infrastructure, including VPN, network security, wireless networks,
- routing, DNS, DHCP, and authentication.
* Education
** Doctor of Philosophy (PhD) in Cell, Molecular and Developmental Biology \hfill UC Riverside
** Bachelor of Science (BS) in Biology \hfill UC Riverside
* Skills
** Leadership and Mentoring
-+ Lead
++ Led teams of PhD and MD scientists in multiple scientific and
+ industrial programs
+ Mentored graduate students and Outreachy and Google Summer of Code
interns
+ Former chair of Debian's Technical Committee
+ Head developer behind https://bugs.debian.org
+** Bioinformatics, Genomics, and Epigenomics
++ NGS and array-based Genomics and Epigenomics of complex human
+ diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
+ bead arrays, and Affymetrix microarrays from sample collection to
+ publication
++ Reproducible, scalable bioinformatics analysis using make,
+ nextflow, and cwl-based workflows on cloud- and cluster-based
+ systems on terabyte-scale datasets
++ Alignment, annotation, and variant calling using existing and custom
+ software, including GATK, bwa, STAR, and kallisto
++ Using evolutionary genomics to identify causal human variants
+** Statistics
++ Statistical modeling (regression, inference, prediction, and machine
+ learning in very large (> 1TB) datasets) using R and python.
++ Experimental design and correction for multiple testing,
+ confounders, and batch effects (both Bayesian and frequentist)
++ Reproducible research
** Software Development
+ Languages: python, R, perl, C, C++, groovy, sh (bash, POSIX,
and zsh), make
-+ Collaborative Development: git, Jira, github actions, Aha!, travis,
- continuous integration, automated testing, continuous deployment
++ Collaborative Development: git, Jira, gitlab CI/CD, github actions,
+ Aha!, continuous integration & deployment, automated testing
+ Web, Mobile: Shiny, jQuery, JavaScript
+ Databases: PostgreSQL (PL/pgSQL), SQLite, MySQL, NoSQL
** Big Data
** Applications and Daemons
+ Web: apache, nginx, varnish (load balancing/caching), REST, SOAP,
Tomcat
-+ SQL servers: PostgreSQL, MySQL, SQLite, oracle
+ Build Tools: GNU make, cmake
-+ Continuous Integration/Deployment: codebuild, travis, jenkins,
- github actions
+ Virtualization: libvirt, KVM, qemu, VMware, docker
-+ Statistics: R, SAS
+ VCS: git, mercurial, subversion
+ Mail: postfix, exim, sendmail, spamassassin
+ Configuration Infrastructure: puppet, hiera, etckeeper, git
+ Office Software: Gnumeric, Libreoffice, \LaTeX, Word, Excel,
Powerpoint
** Networking
- + Hardware, Linux routing and firewall experience, ferm, DHCP,
- openvpn, bonding, NAT, DNHS, SNMP, IPv4, and IPv6.
++ Hardware, Linux routing and firewall experience, ferm, DHCP,
+ openvpn, bonding, NAT, DNS, SNMP, IPv4, and IPv6.
** Operating systems
- + GNU/Linux (Debian, Ubuntu, Red Hat)
- + Windows
- + MacOS
++ GNU/Linux (Debian, Ubuntu, Red Hat)
++ Windows
++ MacOS
** Communication
+ Strong written communication skills as evidenced by publication
record
-+ Strong verbal and presentation skills as evidenced by presentation
- and teaching record
-** Genomics and Epigenomics
-+ NGS and array-based Genomics and Epigenomics of complex human
- diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
- bead arrays, and Affymetrix microarrays from sample collection to
- publication.
-+ Reproducible, scalable bioinformatics analysis using make,
- nextflow, and cwl based workflows on cloud- and cluster-based
- systems on terabyte-scale datasets
-+ Alignment, annotation, and variant calling using existing and custom
- software, including GATK, bwa, STAR, and kallisto.
-+ Correcting for and experimental design to overcome multiple
- testing, confounders, and batch effects using Bayesian and
- frequentist methods approaches
-+ Using evolutionary genomics to identify causal human variants
-** Statistics
-+ Statistical modeling (regression, inference, prediction, and
- learning in very large (> 1TB) datasets) using R and SAS.
-+ Addressing confounders and batch effects
-+ Reproducible research
++ Strong verbal and presentation skills as evidenced by presentation,
+ leadership, and teaching record
* Authored Open Source Software
+ *[[http://bugs.debian.org][Debbugs]]*: Bug tracking software for the Debian GNU/Linux
distribution. [[https://bugs.debian.org]]
* Publications and Presentations
+ 24 peer-reviewed publications cited over 3000 times:
https://dla2.us/pubs
-+ Publication record in GWAS, expression analysis of microarrays, SLE,
- GBM, epigenetics, comparative evolution of mammals, and lipid
- membranes
++ Publication record in GWAS, transcriptomics, SLE, GBM, epigenetics,
+ comparative evolution of mammals, and lipid membranes
+ H index >= 20
+ Multiple presentations on EWAS of PTSD, genetics of SLE, and Open
Source: https://dla2.us/pres