[[!meta title="Resumé"]]
# Experience
-## Team Lead Data Engineering at Ginkgo Bioworks 2022–Present
-+ Lead and manged team of data engineers, system administrators,
- statisticians, bioinformaticians, and scientists at the PhD level
- working within the AgBio unit of Ginkgo Bioworks.
-+ Mentored and coached team members in data science, bioinformatics,
- data engineering, and statistics.
-+ Key leadership role in successful merger of AgBio unit with Ginkgo,
- including all relevant R&D business applications and data-adjacent
- systems.
-
-## Team Lead Data Engineering at Bayer Crop Science 2018–2022
+## Career Synopsis & Outlook
++ A proven, transformative leader of teams that enable businesses to
+ harness the value of scientific and business data to achieve
+ business goals in biotechnology and other biology-adjacent
+ industries.
++ Significant experience mentoring, coaching, managing, and leading
+ managers and individual contributors from the entry level to
+ principal level, enabling them to develop into their full potential
+ as leaders and contributors.
++ Extensive scientific, computational, analytical, and business
+ background coupled with a history of effective communication with
+ diverse audiences enables bridging the needs and requirements of
+ challenging stakeholders and earning their trust and buy-in even in
+ complex, highly regulated environments.
++ Seeking opportunities to grow, lead, and transform organizations
+ with a larger scope and greater impact.
+## Director of Data Science and Analytics at Ginkgo Bioworks 2022--Present
++ Directed a geographically distributed team of managers who lead
+ teams of data engineers, system administrators, statisticians,
+ bioinformaticians, and scientists at the PhD level working within
+ the Ag business unit of [Ginkgo Bioworks](https://www.ginkgobioworks.com).
++ Accountable for the data architecture, engineering, management, and
+ governance of all data within the Ag business unit, including
+ complex modalities of research and development data from genomics to
+ complex phenotypic data, including chemistry, production, systems
+ biology, and business data.
++ Accountable for cost centers totalling $10 M annually, including
+ budgeting, procurement, vendor relationships, and policy compliance.
++ Hired and developed team members in data science, bioinformatics,
+ data engineering, software engineering, and statistics using
+ coaching, mentorship, and teaching approaches.
++ Accountable (and frequently responsible) for all R&D IT applications
+ in a business unit, including vendor selection, architectural
+ decisions, deployment, and development where appropriate.
++ Championed modern approaches to data governance and data stewardship
+ principles across multiple life-science and business functions.
++ Led the development of multiple cloud-based serverless and
+ container-based applications in AWS and GCP with multiple API and UI
+ interfaces written in python and javascript to enable the management
+ of data, with dbt, airflow, postgresql, and snowflake handling data
+ storage and plumbing roles.
++ Key leadership role in multiple mergers and acquisitions,
+ specializing in R&D business applications and data-adjacent systems.
++ Extensive collaborations with scientific, business, and customer
+ leaders attest to my excellent communication and interpersonal
+ skills.
+## Team Lead Data Engineering at Bayer Crop Science 2018--2022
+ Hired, managed, and developed team of 5+ Data Engineers, Systems
Administrators, and Business Analysts working within the Biologics
R&D unit of Bayer Crop Science enabling data capture, data
- integration, and operationalization of data analysis pipelines
+ integration, and operationalization of data analysis pipelines.
+ Developed and supervised implementation of data capture,
integration, and analysis strategies to increase the value of
genomics, metabolomics, transcriptomics, spectroscopic, phenotypic
- (/in vitro/ and /in planta/), and fermentation/formulation process
- data for discovery and development
+ (/in vitro/ and /in planta/), and fermentation/formulation process
+ data for discovery and development using AWS, python, postgresql, R, and
+ Led the development of multiple systems while coaching, mentoring,
- and developing developers and engineers
+ and developing software and data engineers.
+ Served as a key collaborator on multiple cross-function and
cross-divisional projects, including leading the architecture of a
life science collaboration using serverless architecture to provide
machine-learning estimates of critical parameters from
- spectrographic measurements
-+ Established and developed network of internal and external contacts
- for technical implementation of Bayer program goals.
-
-## Debian Developer 2004–Present
+ spectrographic measurements.
+## Debian Developer 2004--Present
+ Maintained, managed configurations, and resolved issues in multiple
packages written in R, perl, python, scheme, C++, and C.
+ Resolved technical conflicts, developed technical standards, and
million entries with web, REST, and SOAP interfaces.
+ Provided vendor-level support for complex systems integration issues
on Debian GNU/Linux systems.
-
-## Research Scientist at UIUC 2015–2017
+## Research Scientist at UIUC 2015--2017
++ Architected and engineered systems to store, retrieve, and analyze
+ complex R&D data including behavioral healthcare data (PTSD),
+ genomic, epigenomic, and other phenotypic healthcare data
+ (pre-eclampsia), while maintaining compliance with data privacy
+ regulations including HIPAA and institutional review boards.
+ Planning, design, organization, execution, and analysis of multiple
complex epidemiological studies involving epigenomics,
transcriptomics, and genomics of diseases of pregnancy and
remain abreast of current scientific literature, principles of
scientific research, and modern statistical methodology.
+ Wrote software and designed relational databases using R, perl, C,
- SQL, make, and very large computational systems ([Blue Waters](https://bluewaters.ncsa.illinois.edu/))
-
-## Postdoctoral Researcher at USC 2013–2015
+ SQL, make, and very large computational systems ([[https://bluewaters.ncsa.illinois.edu/][Blue Waters]])
+## Postdoctoral Researcher at USC 2013--2015
+ Design, execution, and analysis of an epidemiological study to
identify genomic variants associated with systemic lupus
erythematosus using targeted deep sequencing.
study of individuals from the Los Angeles and greater United States.
+ Wrote and maintained multiple software components to reproducibly
perform the analyses.
-
# Education
+ Doctor of Philosophy (PhD) in Cell, Molecular and Developmental Biology at UC Riverside
+ Bachelor of Science (BS) in Biology at UC Riverside
# Skills
## Leadership and Mentoring
-+ Lead teams of PhD and MD scientists in multiple scientific and
- industrial programs
-+ Mentored graduate students and Outreachy and Google Summer of Code
- interns
-+ Former chair of Debian's Technical Committee
-+ Head developer behind https://bugs.debian.org
-
++ Led managers and teams of PhD-level scientists in multiple
+ scientific and industrial programs.
++ Mentorship of multiple employees, graduate students, and
+ undergraduates throughout career, helping them to fully develop
+ their potential and thrive.
++ Chair or lead of multiple initiatives and committees, including
+ aligning highly cross-functional and diverse stakeholders.
+## Data Governance/Management/Engineering
++ Leadership and implementation of data governance and management
+ programs across multiple functions within Ginkgo and Bayer.
++ Establishment of metadata and master data management standards and
+ frameworks in life science and business domains.
++ Snowflake, dbt, Airflow
## Bioinformatics, Genomics, and Epigenomics
+ NGS and array-based Genomics and Epigenomics of complex human
diseases using RNA-seq, targeted DNA sequencing, RRBS, Illumina
+ Alignment, annotation, and variant calling using existing and custom
software, including GATK, bwa, STAR, and kallisto
+ Using evolutionary genomics to identify causal human variants
-
## Statistics
+ Statistical modeling (regression, inference, prediction, and machine
learning in very large (> 1TB) datasets) using R and python.
+ Correction methods and experimental design to address multiple testing,
confounders, and batch effects (both Bayesian and frequentist)
+ Reproducible research
-
## Software Development
-+ Languages: python, R, perl, C, C++, python, groovy, sh (bash, POSIX,
++ Languages: python, R, perl, C, C++, groovy, sh (bash, POSIX,
and zsh), make
+ Collaborative Development: git, Jira, gitlab CI/CD, github actions,
Aha!, continuous integration & deployment, automated testing
+ Web, Mobile: Shiny, jQuery, JavaScript
-+ Databases: Postgresql (PL/SQL), SQLite, Mysql, NoSQL
-
++ Databases: Postgresql (PL/pgSQL), SQLite, Mysql, NoSQL, RDS
++ Cloud: AWS, Azure, GCP, OpenStack
++ Infrastructure as Code: AWS Cloudformation, Terraform, puppet,
+ etckeeper, hiera
## Big Data
+ Parallel and Cloud Computing (slurm, torque, AWS, OpenStack, Azure)
+ Inter-process communication: MPI, OpenMP
## Networking
+ Hardware, Linux routing and firewall experience, ferm, DHCP,
openvpn, bonding, NAT, DNS, SNMP, IPv4, and IPv6.
-
## Operating systems
+ GNU/Linux (Debian, Ubuntu, Red Hat)
+ Windows
+ MacOS
-
## Communication
+ Strong written communication skills as evidenced by publication
- record
+ record.
++ Proven experience communicating with cross-functional and diverse
+ teams and stakeholders at all organizational levels.
+ Strong verbal and presentation skills as evidenced by presentation,
leadership, and teaching record
-
-# Authored Open Source Software
+* Authored Open Source Software
+ *[Debbugs](http://bugs.debian.org)*: Bug tracking software for the Debian GNU/Linux
- distribution.
+ distribution.
+ *[CairoHacks](http://git.donarmstrong.com/r/CairoHacks.git)*: Bookmarks and Raster images for large PDF plots in R.
++ *[Function2Gene](http://rzlab.ucr.edu/function2gene/)*: Gene selection tool based on literature mining which
+ enables Bayesian approaches to significance testing.
++ *[Helical Wheel Projections](http://rzlab.ucr.edu/scripts/wheel/wheel.cgi?sequence=ABCDEFGHIJLKMNOP&submit=Submit)*: Web-based tool to draw helical wheel
+ protein projections.
* Publications and Presentations
-+ 24 peer-reviewed publications cited over 3000 times:
++ 24 peer-reviewed publications cited over 4000 times:
https://dla2.us/pubs
+ Publication record in GWAS, transcriptomics, SLE, GBM, epigenetics,
comparative evolution of mammals, and lipid membranes
-+ H index >= 20
++ H index >= 21
+ Multiple presentations on EWAS of PTSD, genetics of SLE, and Open
Source: https://dla2.us/pres