[[!meta title="Resumé"]]

# Experience

## Team Lead Data Engineering at Ginkgo Bioworks 2022–Present
+ Led and managed a team of data engineers, system administrators, statisticians, bioinformaticians, and scientists at the PhD level working within the AgBio unit of Ginkgo Bioworks.
+ Mentored and coached team members in data science, bioinformatics, data engineering, and statistics.
+ Played a key leadership role in the successful merger of the AgBio unit with Ginkgo, including all relevant R&D business applications and data-adjacent systems.

## Team Lead Data Engineering at Bayer Crop Science 2018–2022
+ Hired, managed, and developed a team of 5+ Data Engineers, Systems Administrators, and Business Analysts working within the Biologics R&D unit of Bayer Crop Science, enabling data capture, data integration, and operationalization of data analysis pipelines.
+ Developed and supervised implementation of data capture, integration, and analysis strategies to increase the value of genomics, metabolomics, transcriptomics, spectroscopic, phenotypic (*in vitro* and *in planta*), and fermentation/formulation process data for discovery and development.
+ Led the development of multiple systems while coaching, mentoring, and developing developers and engineers.
+ Served as a key collaborator on multiple cross-functional and cross-divisional projects, including leading the architecture of a life science collaboration using serverless architecture to provide machine-learning estimates of critical parameters from spectrographic measurements.
+ Established and developed a network of internal and external contacts for technical implementation of Bayer program goals.

## Debian Developer 2004–Present
+ Maintained, managed configurations, and resolved issues in multiple packages written in R, perl, python, scheme, C++, and C.
+ Resolved technical conflicts, developed technical standards, and provided leadership as the elected chair of the Technical Committee.
+ Developer of [Debbugs](https://bugs.debian.org), a perl- and SQL-based issue tracker with ≥ 100 million entries and web, REST, and SOAP interfaces.
+ Provided vendor-level support for complex systems integration issues on Debian GNU/Linux systems.

## Research Scientist at UIUC 2015–2017
+ Planned, designed, organized, executed, and analyzed multiple complex epidemiological studies involving epigenomics, transcriptomics, and genomics of diseases of pregnancy and post-traumatic stress disorder.
+ Published results in scientific publications and presented results orally at major scientific conferences.
+ Wrote and completed grants, including budgeting, scientific direction, project management, and reporting.
+ Mentored graduate students and collaborated with internal and external scientists.
+ Performed literature reviews, pursued training, and applied new techniques to remain abreast of current scientific literature, principles of scientific research, and modern statistical methodology.
+ Wrote software and designed relational databases using R, perl, C, SQL, make, and very large computational systems ([Blue Waters](https://bluewaters.ncsa.illinois.edu/)).

## Postdoctoral Researcher at USC 2013–2015
+ Designed, executed, and analyzed an epidemiological study to identify genomic variants associated with systemic lupus erythematosus using targeted deep sequencing.
+ Wrote multiple pieces of software to reproducibly analyze and archive large datasets resulting from genomic sequencing.
+ Coordinated with clinicians, molecular biologists, and biologists to produce analyses and major reports.

## Postdoctoral Researcher at UCR 2010–2012
+ Executed and analyzed an epidemiological study to identify genomic variants associated with systemic lupus erythematosus using prior information and array-based approaches in a trio and cross-sectional study of individuals from Los Angeles and the greater United States.
+ Wrote and maintained multiple software components to reproducibly perform the analyses.

# Education
+ Doctor of Philosophy (PhD) in Cell, Molecular and Developmental Biology at UC Riverside
+ Bachelor of Science (BS) in Biology at UC Riverside

# Skills

## Leadership and Mentoring
+ Led teams of PhD and MD scientists in multiple scientific and industrial programs
+ Mentored graduate students and Outreachy and Google Summer of Code interns
+ Former chair of Debian's Technical Committee
+ Head developer behind https://bugs.debian.org

## Bioinformatics, Genomics, and Epigenomics
+ NGS and array-based Genomics and Epigenomics of complex human diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina bead arrays, and Affymetrix microarrays from sample collection to publication
+ Reproducible, scalable bioinformatics analysis using make, nextflow, and cwl-based workflows on cloud- and cluster-based systems on terabyte-scale datasets
+ Alignment, annotation, and variant calling using existing and custom software, including GATK, bwa, STAR, and kallisto
+ Using evolutionary genomics to identify causal human variants

## Statistics
+ Statistical modeling (regression, inference, prediction, and machine learning in very large (> 1TB) datasets) using R and python
+ Experimental design and statistical correction to overcome multiple testing, confounders, and batch effects (both Bayesian and frequentist)
+ Reproducible research

## Software Development
+ Languages: python, R, perl, C, C++, groovy, sh (bash, POSIX, and zsh), make
+ Collaborative Development: git, Jira, GitLab CI/CD, GitHub Actions, Aha!, continuous integration & deployment, automated testing
+ Web, Mobile: Shiny, jQuery, JavaScript
+ Databases: PostgreSQL (PL/SQL), SQLite, MySQL, NoSQL

## Big Data
+ Parallel and Cloud Computing (slurm, torque, AWS, OpenStack, Azure)
+ Inter-process communication: MPI, OpenMP
+ File storage: Gluster, CEFS, GPFS, Lustre
+ Linux system administration

## Applications and Daemons
+ Web: apache, nginx, varnish (load balancing/caching), REST, SOAP, Tomcat
+ Build Tools: GNU make, cmake
+ Virtualization: libvirt, KVM, qemu, VMware, docker
+ VCS: git, mercurial, subversion
+ Mail: postfix, exim, sendmail, spamassassin
+ Configuration Infrastructure: puppet, hiera, etckeeper, git
+ Documentation: \LaTeX, confluence, emacs, Markdown, MediaWiki, ikiwiki, trac
+ Monitoring: munin, nagios, icinga, prometheus
+ Issue Tracking: Debbugs, Request Tracker, Trac, JIRA
+ Office Software: Gnumeric, LibreOffice, \LaTeX, Word, Excel, Powerpoint

## Networking
+ Hardware, Linux routing and firewall experience, ferm, DHCP, openvpn, bonding, NAT, DNS, SNMP, IPv4, and IPv6.

## Operating systems
+ GNU/Linux (Debian, Ubuntu, Red Hat)
+ Windows
+ MacOS

## Communication
+ Strong written communication skills as evidenced by publication record
+ Strong verbal and presentation skills as evidenced by presentation, leadership, and teaching record

# Authored Open Source Software
+ *[Debbugs](http://bugs.debian.org)*: Bug tracking software for the Debian GNU/Linux distribution.
+ *[CairoHacks](http://git.donarmstrong.com/r/CairoHacks.git)*: Bookmarks and Raster images for large PDF plots in R.
# Publications and Presentations
+ 24 peer-reviewed publications cited over 3000 times: https://dla2.us/pubs
+ Publication record in GWAS, transcriptomics, SLE, GBM, epigenetics, comparative evolution of mammals, and lipid membranes
+ h-index ≥ 20
+ Multiple presentations on EWAS of PTSD, genetics of SLE, and Open Source: https://dla2.us/pres

# Funding and Awards

## Grants
+ 2017 R Consortium: *[Adding Linux Binary Builders to R-Hub](https://www.r-consortium.org/blog/2017/04/03/q1-2017-isc-grants)* Role: Co-PI
+ 2015 Blue Waters Allocation Grant: *Making ancestral trees using Bayesian inference to identify disease-causing genetic variants* Role: Principal Investigator
+ *Tracking placenta and uterine function using urinary extracellular vesicles* (R21 RFA-HD-16-037) Role: Key Personnel
+ *NIAMS* R01-AR045650-04 *Genetics of Childhood Onset SLE* to Chaim O. Jacob. Role: Key Personnel

## Scholarships and Fellowships
+ 2001–2003: University of California, Riverside Doctoral Fellowship
+ 1997–2001: Regents of the University of California Scholarship

# Academic Information

You can also read my [Curriculum Vitæ](curriculum_vitae) ([pdf](dla-cv.pdf)), [Research Statement](research_statement) ([pdf](research_statement.pdf)), and [Teaching Statement](teaching_statement) ([pdf](teaching_statement.pdf)).

For my contact information or additional references, please e-mail