Thiago de Paula Oliveira

Statistician

About Me

I am Thiago de Paula Oliveira, a statistician at AbacusBio with 14+ years of experience turning experimental, genomic, and performance data into decisions. I specialise in advanced mixed-model and Bayesian analytics, the development of economic and sustainability selection indices, and the delivery of reproducible analytics products, from R/C++ codebases to Dockerised dashboards. Whether the brief is accelerating genetic gain, improving farm-system resilience, or supporting athlete health, I focus on rigour, transparency, and decision-ready outputs.
Interests
  • Statistics and biostatistics
  • Concordance analysis
  • Multilevel and forecast models
  • Generalized mixed-effects models
  • Longitudinal data
  • Quantitative genetics & breeding analytics
  • Agricultural decision-support dashboards
  • Reproducible analytics pipelines for agri-genomics
  • Economic and sustainability selection indices
Education
  • PhD in Statistics

    University of São Paulo

  • MSc in Statistics

    University of São Paulo

  • BSc in Agricultural Engineering

    University of São Paulo

📈 Expertise and Research
I am a statistician with 14+ years of experience turning noisy experimental, genomic, and performance data into decisions. After completing my PhD in Statistics at the University of São Paulo, I specialised in advanced mixed-model/Bayesian analytics and in the development of economic and sustainability selection indices that keep breeding programmes accountable.

As a Consultant Statistician at AbacusBio, I lead cross-functional teams that deliver genetic-evaluation pipelines, automated QC/ETL workflows, and decision dashboards for livestock, crop, and agri-tech partners. That work depends on production-grade code in R/C++/Bash, Docker-based reproducible environments, and early collaboration between domain scientists and data engineers.

Earlier, I held a Marie Skłodowska-Curie COFUND fellowship at the Roslin Institute (University of Edinburgh), built predictive health and sports-analytics products at the Insight Centre (NUI Galway) and Orreco, and lectured in statistics at USP. Along the way I have published across Nature-branded journals, advised national breeding programmes, and mentored teams on delivering transparent, auditable analyses.

Whether the brief is accelerating genetic gain, improving farm-system resilience, or supporting athlete health, my bias is toward rigour, reproducibility, and decision-ready outputs. Browse my recent publications and projects, and get in touch if you would like to collaborate or have a specific challenge in mind.

Recent Publications
(2025). Breeding for sustainability: Development of an index to reduce greenhouse gas in dairy cattle. Animal.
(2023). Developing best practices for genotyping-by-sequencing analysis in the construction of linkage maps. GigaScience.
(2023). Pedigree-based Animal Models Using Directed Acyclic Graphs. Under consideration in Livestock Science.
Cited by
Citations per year: 2018: 1 · 2019: 12 · 2020: 22 · 2021: 34 · 2022: 28 · 2023: 35 · 2024: 52 · 2025: 69
Recent Posts

Benchmarking Kendall's Tau in R and Rcpp

Implement and benchmark a fast Kendall’s tau-a in C++ via Rcpp against base R, discuss tie handling (tau-b), and when to move from R to C++.

Exploring polynomial, fractional polynomial, and spline models

The ability to accurately model and interpret complex data sets is paramount. This technical exploration covers three modelling techniques: polynomial models, fractional polynomials, and spline models. Each is a fundamental tool in the statistical toolkit for capturing and understanding the linear and non-linear relationships inherent in real-world data. The post is organised as: polynomial models; fractional polynomial models, including finding optimal power values; spline models, with an example, challenges, and the selection process; and citation details.

🎓 Connect with an expert statistician

I focus on advanced statistical modelling, economic and sustainability selection indices, interactive dashboards, and reproducible (Dockerised) pipelines that deliver decision-ready insights for agriculture, genetics, and sports performance.

Areas of impact

Agriculture. Design and analyse agronomic and farm-systems experiments, including multi-environment trials and spatial models, to optimise yield, resource use, and sustainability.

Genetics. Build genetic-evaluation pipelines and economic and sustainability selection indices that maximise genetic gain and inform breeding objectives.

Sports analytics. Develop tools and applications that enhance athlete performance through data-driven insights.

Explore my publications, projects, and recent work. If you are interested in collaborating or would like to learn more, please get in touch.

Stay connected and follow my work in statistical modelling and data analysis:

Google Scholar · GitHub
