Igor Muniz Soares

Brazil · United States · +55 62 999772996 · igor.muniz.ims@gmail.com
Senior Staff Machine Learning Engineer | Data Scientist | AI Researcher

Senior Staff–level Machine Learning Engineer with 9+ years of experience designing, building, and deploying production-grade ML and LLM systems across fintech, AI research, and enterprise platforms. Deep expertise in Large Language Models (LLMs), ML platform architecture, and end-to-end MLOps, with a strong hands-on mindset. Proven track record of owning complex ML systems end-to-end, leading technical direction without stepping away from code, and delivering scalable, reliable AI products used in real-world production. Experience spans LLM applications, recommender systems, NLP, computer vision, and AI research, including a publication in Nature Computational Science.


Experience

Senior Data Scientist / Machine Learning Engineer

Electrifai - Florida, USA (Remote)

Lead the design and delivery of LLM-powered AI products, including production chatbots and intelligent document processing systems.

Projects and responsibilities:

  • Designed and implemented RAG architectures using OpenAI, LangChain, and vector databases.
  • Partner closely with product and engineering teams to translate business requirements into scalable ML systems.
  • Act as technical authority on vector search adoption for recommendation system architectures, defining best practices and optimizing performance.

March 2024 - Present

Lead Machine Learning Engineer

Exponential Ventures - Austin, Texas, US (Remote)

Defined ML and data architecture for multiple AI-driven products, operating as a Staff-level Individual Contributor and technical leader.

Projects and responsibilities:

  • Led development of financial ML systems for a fintech operating under the ROSCA model.
  • Advanced state-of-the-art research in Genetic Engineering Attribution, achieving 10th place in Prediction Track and 2nd place in Innovation Track (out of 1,211 teams).
  • Designed, trained, and deployed production ML pipelines, including AutoML improvements.

August 2020 - December 2023

AI Researcher (PhD Program)

CEIA / Federal University of Goiás - Goiânia, GO, Brazil (Part-time)

Conducted advanced research in AI and machine learning as part of the PhD program at the Center of Excellence in Artificial Intelligence (CEIA).

March 2021 - March 2022

Data Scientist

CEIA / Federal University of Goiás - Goiânia, GO, Brazil (Part-time)

Worked as a machine learning engineering consultant, researching and developing natural language processing models.

Projects:

  • Built pipelines for training, deploying, and running automated inference with ML models in a chatbot platform.
  • Developed Q&A systems and document retrieval pipelines using LLMs.
  • Designed parsing models and embedding generators to normalize and correct address strings.

February 2018 - August 2020

Data Scientist

Indra Company - Goiânia, GO, Brazil

Projects:

  • Researched, trained, and improved OCR models in Portuguese.
  • Built ML models for cloud classification and segmentation using satellite imagery.
  • Developed a Portuguese QA system based on BERT and Google QA Net, deployed in a production chatbot.
  • Created models for natural language to SQL conversion, enabling user-driven database queries.
  • Designed intent classification systems for chatbot flow control.
  • Built risk-based inspection recommendation systems using historical accident data.
  • Conducted big data analysis for Telefônica Brasil to identify customer connectivity issues.
  • Developed ML models for fraud detection in government financial databases.

February 2018 - August 2020

Computer Vision Engineer

Federal University of Goiás - Goiânia, GO, Brazil

  • Conducted research on person detection, pose estimation, and activity recognition.
  • Built a single end-to-end deep learning model for person detection, pose estimation, and activity classification.
  • Developed neural networks for arm movement classification using intramuscular signals for a myoelectric prosthesis.

February 2017 - February 2018

IoT Intern

Federal University of Goiás - Goiânia, GO, Brazil

  • Worked on embedded systems, firmware, and signal processing.

January 2016 - December 2016

Intern

MPT Engenharia - Goiânia, GO, Brazil

  • Developed software in C/C++ and MATLAB.
  • Prototyped PCBs using ATmega microcontrollers.

2015 - 2016

Education

Federal University of Goiás

Master of Computer Engineering
Focus: Intelligent Systems & Applied/Theoretical AI

Thesis: A complete bottom-up approach to recognizing human activities in images through estimated pose using convolutional networks

September 2019

Federal University of Goiás

Bachelor of Computer Engineering

December 2016

Skills

Programming Languages & Tools

  • Python
  • SQL
  • TypeScript
  • C/C++
  • R
  • PyTorch
  • TensorFlow
  • Keras
  • Pandas
  • NumPy
  • Scikit-Learn
  • Spark / PySpark
  • OpenAI API
  • Hugging Face
  • LangChain
  • LlamaIndex
  • Pinecone
  • Milvus
  • FAISS
  • FastAPI
  • Flask
  • Django
  • Docker
  • Kubernetes
  • MLflow
  • BentoML
  • AWS
  • GCP

Knowledge Areas

  • Machine Learning / Deep Learning
  • Large Language Models (LLMs)
  • Natural Language Processing
  • Computer Vision
  • Recommendation Systems & Ranking
  • MLOps & ML Platform Architecture
  • RAG & Vector Search
  • Prompt Engineering
  • AI Agents & Chatbots
  • Software Engineering
  • Math/Statistics
  • Research & Experimentation
  • Cloud Computing
  • Linux/Unix

Non-Technical Skills

  • Business acumen
  • Teamwork
  • Leadership
  • Problem solving

Projects

Selected projects I worked on, grouped by the areas in which they gave me experience:

Natural Language Processing

Question Answering System

Adaptation and training of a deep learning model for a question answering system in Portuguese, based on BERT and Google QA Net. The system is used in a chatbot capable of answering questions based on unstructured texts.

Intent Classification System

Training of models that classify the intents expressed in user phrases, used for chatbot flow control.

SeqTag: Sequence Tagging for Portuguese datasets

Development of a deep learning model for token classification and sequence tagging in Portuguese texts.

Product categorization from title classification

Model developed for the Mercado Libre Data Challenge that classifies product titles written in Portuguese and Spanish.

Natural Language to SQL

Development of a model that converts natural language into SQL, allowing users to run database queries.

Computer Vision

Activity Recognition based on pose estimation

This work proposes a single end-to-end model that detects people, estimates their poses, and recognizes each person's activity from the estimated pose. The experiments show that the model reached the state of the art in person detection and pose estimation on the MSCOCO 2017 dataset, and can recognize walking, running, sitting, and standing activities with an F1 score of 0.7344.

Optical Character Recognition (OCR)

Training and improvement of existing OCR models for Portuguese.

Understanding Clouds from Satellite Images

Classification and segmentation of different cloud types from satellite images. 22nd-place solution in the Kaggle challenge.

AI and Computer Vision for Medicine

Development of deep learning models to identify pulmonary diseases and intracranial hemorrhage in X-rays.

Tabular Data

Identification of internet disconnection causes

Big data analysis cross-checking information from Telefônica Brasil to find patterns among customers with internet peer disconnection issues.

Fraud Identification

Data analysis and development of a machine learning model to classify possible fraud cases in a government advance database.

Classification of intramuscular signals

Development of an artificial neural network to classify arm movements from collected intramuscular signals. This work is part of building a myoelectric prosthesis for people with amputated arms.


Publications

Improving lab-of-origin prediction of genetically engineered plasmids via deep metric learning
Soares, I.M., Camargo, F.H.F., Marques, A. et al.
arXiv
Nature Computational Science - 2022

Deep metric learning improves lab of origin prediction of genetically engineered plasmids
IM Soares, FHF Camargo, A Marques, OM Crook
arXiv
Nature Computational Science - 2022 (Preprint)

Ranking labs-of-origin for genetically engineered DNA using Metric Learning
I. Muniz, F. H. F. Camargo, A. Marques
arXiv - 2021

An end-to-end approach for recognizing human activity in images using pose estimation
I. Muniz, C. Vinhal, G. da Cruz Jr.
arXiv - 2019

Awards

  • 1st Place in the 2nd Workshop on Artificial Intelligence - Federal University of Goiás 2019
  • 2nd Place in the "Porto Seguro Data Challenge" - Kaggle 2021
  • 2nd Place in the "Genetic Engineering Attribution Challenge Innovation Track" - DrivenData 2021
  • 2nd Place in the Hackathon "Dev for a Change" - Indra Brasil 2019
  • 9th Place in the Latin America Mercado Libre Data Challenge - https://ml-challenge.mercadolibre.com/ 2019
  • 10th Place in the "Genetic Engineering Attribution Challenge Prediction Track" - DrivenData 2021
  • 22nd Place (Silver Medal) against 1538 teams in the Understanding Clouds from Satellite Images Challenge - Kaggle 2019
  • 29th Place (Silver Medal) against 1305 teams in the SIIM-FISABIO-RSNA COVID-19 Detection - Kaggle 2021
  • 30th Place (Silver Medal) against 1620 teams in the Santa's Workshop Tour 2019 - Kaggle 2019/2020
  • 77th Place (Bronze Medal) against 1233 teams in the TensorFlow 2.0 Question Answering - Kaggle 2020
  • 78th Place (Bronze Medal) against 1275 teams in the VinBigData Chest X-ray Abnormalities Detection - Kaggle 2021