
Guang Ouyang
Boost AI/ML Skills with a Former Google Data Scientist's Expertise
Studied at University of Connecticut
Works at Axiom
Please note that the recurring weekly time slots shown on my calendar are available on an every-other-week basis. If you're unsure about a specific time, feel free to message me before booking.
Guang's Offerings
Custom hourly · $90/hr
Get help with AI Fundamentals, LinkedIn Review, and more.
Guang’s AI & ML Engineering Qualifications
Experience level: Executive
Welcome to my profile! With a PhD in Statistics and Data Mining from the University of Connecticut, I bring a deep understanding of data-driven decision-making to my coaching. As the Founder & Principal of Axiom Data Lab, I have built scalable research systems for investment strategies, and my 8 years of experience as a Data Scientist at Google have honed my skills in ML development and process automation. I specialize in helping clients navigate the complexities of AI and ML engineering, drawing on my expertise in building end-to-end ML pipelines and optimizing systems for efficiency. Whether you're looking to enhance your skills in machine learning, data engineering, or scalable systems, I'm here to guide you. Let's connect and take your career to the next level!
Guang can help with:
AI Fundamentals
LinkedIn Review
ML Ops & Deployment
Model Development
Resume Review
Work Experience

Founder & Principal
Axiom
March 2025 - Present
Built a data-driven decision system that scans hundreds of U.S. equities (scaling toward full market coverage of 10,000+) to identify high-probability pre-breakout setups.
- Designed and implemented end-to-end ML/data pipelines from ingestion and validation to modeling, backtesting, and evaluation
- Built and maintained a proprietary market and fundamental data store (65+ years, full U.S. coverage) with automated daily (market) and weekly (fundamentals) updates
- Developed feature pipelines to detect structured price consolidation patterns
- Built predictive modeling framework to estimate expected returns
- Constructed backtesting and simulation system with realistic execution constraints (slippage, exit logic)
- Optimized system for efficiency under constrained compute environments and parallel scaling
Focus: scalable research system for long-horizon, low-frequency investment strategies
Machine Learning, SQL and +5 skills

Data Scientist
Google
March 2017 - March 2025
- YouTube Ads Revenue Optimization: Led the full-stack ML development cycle in C++ (from feature extraction to deployment) for user language preference models. This initiative resulted in an estimated $400M+ annual revenue gain while maintaining strict Ads targeting latency limits.
- Core ML Infrastructure: Expanded end-to-end model pipelines into a configurable framework, significantly expediting the development cycle for multiple internal Google projects and improving cross-team velocity.
- Process Automation: Architected a C++ validation infrastructure to automate weekly user profile refreshes. This system eliminated production errors and saved significant engineering hours by transitioning from manual to automated refreshes.
- Global Data Strategy: Resolved a major coverage issue in training data by launching personalized language readability surveys in 40+ countries, collecting ground truth data that drove measurable model quality improvements.
- Ads Attribution & Optimization: Analyzed ads incrementality metrics using multi-touch attribution models (Python/SQL). These insights improved ads conversion rates for partner publishers within the Google Display Ads network.
A/B Testing, Exploratory Data Analysis and +6 skills

Data Scientist
1:ID
October 2015 - March 2017
- Risk Modeling & Scoring: Developed and deployed production-grade credit and fraud scoring models for leading financial institutions. Leveraged Gradient Boosted Tree (GBT) algorithms to drive predictive accuracy in highly regulated financial environments.
- Distributed Systems Engineering: Architected a novel distributed clustering algorithm in Spark (Scala) capable of scaling to billions of nodes using only 600GB of memory. This innovation enabled high-efficiency user identity resolution at enterprise scale.
- Evaluation Frameworks & Monitoring: Established new statistical metrics for evaluating clustering quality in massive-scale networks. Developed an automated outlier detection system to monitor high-throughput data streams for integrity and drift.
Machine Learning, Data Engineering and +2 skills

Research Statistician
Pfizer
September 2011 - December 2014
- Performed statistical data analysis in SAS and R on real pharmaceutical industry data
- Conducted research on applying the proportional odds model to ordinal data
Education

University of Connecticut
PhD, Statistics/Data Mining
2010 - 2015
Grade: 4.06/4.3
My PhD research concentration is social network analysis. It covers the following topics:
1. Exploring the real power driving social connections: Transitivity (a friend's friends have a higher probability of being friends) and preferential attachment (the rich get richer) are two commonly known effects driving social connections. I created a novel model providing statistical inference on these effects.
2. Large-scale network clustering: Developed a novel clustering algorithm that splits a large network into multiple closely connected communities. It overcomes the resolution limit of widely used modularity-based clustering methods.
3. Influential node detection in social networks: Planned.

Miami Herbert Business School
Master of Science, Mathematics
2008 - 2010

Wuhan University
Bachelor of Science, Mathematics
2004 - 2008