From: Don Armstrong
Date: Mon, 12 May 2025 03:32:45 +0000 (-0700)
Subject: update resume
X-Git-Url: https://git.donarmstrong.com/?a=commitdiff_plain;h=762574da65221a2e1d95db198a535f9387949b47;p=don.git

update resume
---

diff --git a/resume.mdwn b/resume.mdwn
index 43e34c0..dd9a1fb 100644
--- a/resume.mdwn
+++ b/resume.mdwn
@@ -1,37 +1,70 @@
 [[!meta title="Resumé"]]
 
 # Experience
-## Team Lead Data Engineering at Ginkgo Bioworks 2022–Present
-+ Lead and manged team of data engineers, system administrators,
-  statisticians, bioinformaticians, and scientists at the PhD level
-  working within the AgBio unit of Ginkgo Bioworks.
-+ Mentored and coached team members in data science, bioinformatics,
-  data engineering, and statistics.
-+ Key leadership role in successful merger of AgBio unit with Ginkgo,
-  including all relevant R&D business applications and data-adjacent
-  systems.
-
-## Team Lead Data Engineering at Bayer Crop Science 2018–2022
+## Career Synopsis & Outlook
++ A proven, transformative leader of teams that enable businesses to
+  harness the value of scientific and business data to achieve
+  business goals in biotechnology and other biology-adjacent
+  industries.
++ Significant experience mentoring, coaching, managing, and leading
+  managers and individual contributors from the entry level to
+  principal level, enabling them to develop into their full potential
+  as leaders and contributors.
++ Extensive scientific, computational, analytical, and business
+  background coupled with a history of effective communication with
+  diverse audiences enables bridging the needs and requirements of
+  challenging stakeholders and earning their trust and buy-in even in
+  complex, highly regulated environments.
++ Seeking opportunities to grow, lead, and transform organizations
+  with a larger scope and greater impact.
+## Director of Data Science and Analytics at Ginkgo Bioworks 2022--Present
++ Directed a geographically distributed team of managers who lead
+  teams of data engineers, system administrators, statisticians,
+  bioinformaticians, and scientists at the PhD level working within
+  the Ag business unit of [Ginkgo Bioworks](https://www.ginkgobioworks.com)
++ Accountable for the data architecture, engineering, management, and
+  governance of all data within the Ag Business unit, including
+  complex modalities of research and development data from genomics to
+  complex phenotypic data, including chemistry, production, systems
+  biology, and business data.
++ Accountable for cost centers totalling $10 M annually, including
+  budgeting, procurement, vendor relationships, and policy compliance.
++ Hired and developed team members in data science, bioinformatics,
+  data engineering, software engineering, and statistics using
+  coaching, mentorship, and teaching approaches.
++ Accountable (and frequently responsible) for all R&D IT applications
+  in a business unit, including vendor selection, architectural
+  decisions, deployment, and development where appropriate.
++ Championed modern approaches to data governance and data stewardship
+  principles across multiple life-science and business functions.
++ Lead the development of multiple cloud-based serverless and
+  container-based applications in AWS and GCP with multiple API and UI
+  interfaces written in python and javascript to enable the management
+  of data, with dbt, airflow, postgresql, and snowflake handling data
+  storage and plumbing roles.
++ Key leadership role in multiple mergers and acquisitions,
+  specializing in R&D business applications and data-adjacent systems.
++ Extensive collaborations with scientific, business, and customer
+  leaders attest to my excellent communication and interpersonal
+  skills.
+## Team Lead Data Engineering at Bayer Crop Science 2018--2022
 + Hired, managed, and developed team of 5+ Data Engineers, Systems
   Administrators, and Business Analysts working within the Biologics
   R&D unit of Bayer Crop Science enabling data capture, data
-  integration, and operationalization of data analysis pipelines
+  integration, and operationalization of data analysis pipelines.
 + Developed and supervised implementation of data capture,
   integration, and analysis strategies to increase the value of
   genomics, metabolomics, transcriptomics, spectroscopic, phenotypic
-  (/in vitro/ and /in planta/), and fermentation/formulation process
-  data for discovery and development
+  (/in vitro/ and /in planta/), and fermentation/formulation process.
+  data for discovery and development using AWS, python, postgresql, R, and
 + Lead the development of multiple systems while coaching, mentoring,
-  and developing developers and engineers
+  and developing software and data engineers.
 + Served as a key collaborator on multiple cross-function and
   cross-divisional projects, including leading the architecture of a
   life science collaboration using serverless architecture to provide
   machine-learning estimates of critical parameters from
-  spectrographic measurements
-+ Established and developed network of internal and external contacts
-  for technical implementation of Bayer program goals.
-
-## Debian Developer 2004–Present
+  spectrographic measurements.
+## Debian Developer 2004--Present
 + Maintained, managed configurations, and resolved issues in multiple
   packages written in R, perl, python, scheme, C++, and C.
 + Resolved technical conflicts, developed technical standards, and
@@ -40,8 +73,12 @@
   million entries with web, REST, and SOAP interfaces.
 + Provided vendor-level support for complex systems integration issues
   on Debian GNU/Linux systems.
-
-## Research Scientist at UIUC 2015–2017
+## Research Scientist at UIUC 2015--2017
++ Architected and engineered systems to store, retrieve, and analyze
+  complex R&D data including behavioral healthcare data (PTSD),
+  genomic, epigenomic, and other phenotypic healthcare data
+  (pre-eclampsia), while maintaining compliance with data privacy
+  regulations including HIPAA and institutional review boards.
 + Planning, design, organization, execution, and analysis of multiple
   complex epidemiological studies involving epigenomics,
   transcriptomics, and genomics of diseases of pregnancy and
@@ -56,9 +93,8 @@
   maintain abreast of current scientific literature, principles of
   scientific research, and modern statistical methodology.
 + Wrote software and designed relational databases using R, perl, C,
-  SQL, make, and very large computational systems ([Blue Waters](https://bluewaters.ncsa.illinois.edu/))
-
-## Postdoctoral Researcher at USC 2013–2015
+  SQL, make, and very large computational systems ([[https://bluewaters.ncsa.illinois.edu/][Blue Waters]])
+## Postdoctoral Researcher at USC 2013--2015
 + Design, execution, and analysis of an epidemiological study to
   identify genomic variants associated with systemic lupus
   erythematosus using targeted deep sequencing.
@@ -74,20 +110,25 @@
   study of individuals from the Los Angeles and greater United States.
 + Wrote and maintained multiple software components to reproducibly
   perform the analyses.
-
 # Education
 + Doctor of Philosophy (PhD) in Cell, Molecular and Developmental
   Biology at UC Riverside
 + Batchelor of Science (BS) in Biology at UC Riverside
 # Skills
 ## Leadership and Mentoring
-+ Lead teams of PhD and MD scientists in multiple scientific and
-  industrial programs
-+ Mentored graduate students and Outreachy and Google Summer of Code
-  interns
-+ Former chair of Debian's Technical Committee
-+ Head developer behind https://bugs.debian.org
-
++ Lead managers and teams of PhD-level scientists in multiple
+  scientific and industrial programs.
++ Mentorship of multiple employees, graduate students, and
+  undergraduates throughout career, helping them to fully develop
+  their potential and thrive.
++ Chair or lead of multiple initiatives and committees, including
+  aligning highly cross-functional and diverse stakeholders.
+## Data Governance/Management/Engineering
++ Leadership and implementation of data governance and management
+  programs across multiple functions within Ginkgo and Bayer.
++ Establishment of Metadata and master data management standards and
+  frameworks in life science and business domains.
++ Snowflake, dbt, Airflow
 ## Bioinformatics, Genomics, and Epigenomics
 + NGS and array-based Genomics and Epigenomics of complex human
   diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
@@ -99,22 +140,22 @@
 + Alignment, annotation, and variant calling using existing and custom
   software, including GATK, bwa, STAR, and kallisto
 + Using evolutionary genomics to identify causal human variants
-
 ## Statistics
 + Statistical modeling (regression, inference, prediction, and machine
   learning in very large (> 1TB) datasets) using R and python.
 + Correcting & experimental design to overcome multiple testing,
   confounders, and batch effects (both Bayesian and frequentist)
 + Reproducible research
-
 ## Software Development
-+ Languages: python, R, perl, C, C++, python, groovy, sh (bash, POSIX,
++ Languages: python, R, perl, C, C++, groovy, sh (bash, POSIX,
   and zsh), make
 + Collaborative Development: git, Jira, gitlab CI/CD, github actions,
   Aha!, continuous integration & deployment, automated testing
 + Web, Mobile: Shiny, jQuery, JavaScript
-+ Databases: Postgresql (PL/SQL), SQLite, Mysql, NoSQL
-
++ Databases: Postgresql (PL/SQL), SQLite, Mysql, NoSQL, RDS
++ Cloud: AWS, Azure, GCP, OpenStack
++ Infrastructure as Code: AWS Cloudformation, Terraform, puppet,
+  etckeeper, hieara
 ## Big Data
 + Parallel and Cloud Computing (slurm, torque, AWS, OpenStack, Azure)
 + Inter-process communication: MPI, OpenMP
@@ -138,28 +179,31 @@
 ## Networking
 + Hardware, Linux routing and firewall experience, ferm, DHCP,
   openvpn, bonding, NAT, DNHS, SNMP, IPv4, and IPv6.
-
 ## Operating systems
 + GNU/Linux (Debian, Ubuntu, Red Hat)
 + Windows
 + MacOS
-
 ## Communication
 + Strong written communication skills as evidenced by publication
-  record
+  record.
++ Proven experience communicating with cross-functional and diverse
+  teams and stakeholders at all organizational levels.
 + Strong verbal and presentation skills as evidenced by presentation,
   leadership, and teaching record
-
-# Authored Open Source Software
+* Authored Open Source Software
 + *[Debbugs](http://bugs.debian.org)*: Bug tracking software for the Debian GNU/Linux
-  distribution.
+  distribution.
 + *[CairoHacks](http://git.donarmstrong.com/r/CairoHacks.git)*: Bookmarks and Raster images for large PDF plots in R.
++ *[Function2Gene](http://rzlab.ucr.edu/function2gene/)*: Gene selection tool based on literature mining which
+  enables Bayesian approaches to significance testing.
++ *[Helical Wheel Projections](http://rzlab.ucr.edu/scripts/wheel/wheel.cgi?sequence=ABCDEFGHIJLKMNOP&submit=Submit)*: Web-based tool to draw helical wheel
+  protein projections.
 * Publications and Presentations
-+ 24 peer-reviewed publications cited over 3000 times:
++ 24 peer-reviewed publications cited over 4000 times:
   https://dla2.us/pubs
 + Publication record in GWAS, transcriptomics, SLE, GBM, epigenetics,
   comparative evolution of mammals, and lipid membranes
-+ H index >= 20
++ H index >= 21
 + Multiple presentations on EWAS of PTSD, genetics of SLE, and Open
   Source: https://dla2.us/pres