Igor Muniz Soares

Goiânia, Goiás, Brazil · +55 62 999772996 · igor.muniz.ims@gmail.com
Master of Computer Engineering · Data Scientist · Kaggle Competitions Expert

I am a data scientist working to draw insights from all types of data. With strong skills in mathematics, statistics, and coding, a specialization in machine learning engineering, and a passion for puzzles, I build intelligent, scalable systems for large amounts of data. I have been working with deep learning for over four years.


Experience

Senior Data Scientist

Exponential Ventures

Exponential Ventures is an American startup that solves today's biggest problems with exponential technologies such as AI, HMI, robotics, and quantum computing. As a Data Scientist, I am responsible for analyzing large amounts of data in complex environments, building machine learning models, and helping to improve our AutoML product.

August 2020 - Present

NLP Researcher

CeIA - Artificial Intelligence Center of Excellence (Goiás)

CeIA is a scientific laboratory financed by private companies operating in several segments, including energy, retail, delivery, and health. I work there part-time as a machine learning engineering consultant, researching and developing natural language processing models.

March 2021 - Present

Deep Learning Professor

FASAM - South American College

Responsible for deep learning classes in the Big Data and Machine Learning postgraduate course.

January 2021 - Present

Data Scientist

Indra Company

Responsible for analyzing data, discovering patterns, and proposing solutions to business problems in many different areas, creating custom reports for clients, and automating processes for data prediction or classification. Developed deep learning algorithms for natural language processing (English and Brazilian Portuguese) and computer vision, working on the models from creation through deployment.

February 2018 - August 2020

Deep Learning Researcher

Federal University of Goiás

Research on artificial intelligence techniques for person detection, pose estimation, and activity classification in images. During this time, I developed the work "A complete bottom-up approach to recognizing human activities in images through the estimated pose with convolutional networks".

February 2017 - July 2019

IoT Intern

MPT Engenharia

Software development in C/C++ and MATLAB, and PCB prototyping with ATmega microcontrollers.

March 2015 - July 2016

Education

Federal University of Goiás

Master of Science
Computer Engineering

Intelligent Systems

March 2017 - September 2019

Federal University of Goiás

Bachelor of Science

Computer Engineering

March 2011 - December 2016

Skills

Programming Languages & Tools

  • Python
  • C/C++
  • SQL
  • Pandas
  • Spark
  • OpenCV
  • spaCy
  • NLTK
  • Scikit-Learn
  • Keras
  • TensorFlow
  • PyTorch
  • Hadoop Platform
  • Flask/FastAPI
  • Docker

Knowledge Areas

  • Software Engineering
  • DevOps/MLOps
  • Math/Statistics
  • Research
  • Machine Learning / Deep Learning
  • Computer Vision
  • Natural Language Processing
  • Cloud Computing
  • Data Visualization/Presentation
  • Linux/Unix

Non-Technical Skills

  • Business acumen
  • Teamwork
  • Leadership
  • Problem solving

Projects

Some of the projects I have worked on, which gave me experience in the following topics:

Natural Language Processing

Question Answering System

Adaptation and training of a deep learning model for a question answering system in Portuguese. The model is based on BERT and Google's QANet. The system is used in a chatbot capable of answering questions based on unstructured texts.

Intention Classification System

Training of a model that classifies the intents expressed in user phrases, used for chatbot flow control.

SeqTag: Sequence Tagging for Portuguese datasets

Development of a deep learning model for token classification and sequence tagging in Portuguese texts.

Product categorization from title classification

Project developed for the Mercado Libre Data Challenge: a model that classifies product titles written in Portuguese and Spanish.

Natural Language to SQL

Development of a model that converts natural language into SQL, allowing users to run database queries.

Computer Vision

Activity Recognition based on pose estimation

This work proposes a single end-to-end model able to detect people, estimate their poses, and recognize each person's activity from the estimated pose. The experiments show that the model reached the state of the art in person detection and pose estimation on the MS COCO 2017 dataset, and can recognize walking, running, sitting, and standing activities with an F1 score of 0.7344.

Optical Character Recognition (OCR)

Training and improvement of existing OCR models for Portuguese.

Understanding Clouds from Satellite Images

Classification and segmentation of different cloud types in satellite images. 22nd place solution.

AI and Computer Vision for Medicine

Development of deep learning models for identification of pulmonary diseases and intracranial hemorrhage in X-rays

Tabular Data

Internet disconnection cause identification

Big data analysis cross-checking information from Telefônica Brasil to find patterns among customers with internet peer disconnection issues.

Fraud Identification

Data analysis and development of a machine learning model to classify possible frauds in a government advance database.

Classification of intramuscular signals

Development of an artificial neural network to classify arm movements from the collected intramuscular signals. This feature is part of building a myoelectric prosthesis for people with amputated arms.


Publications

Improving lab-of-origin prediction of genetically engineered plasmids via deep metric learning
Soares, I.M., Camargo, F.H.F., Marques, A. et al.
arXiv
Nature Computational Science - 2022

Deep metric learning improves lab of origin prediction of genetically engineered plasmids
IM Soares, FHF Camargo, A Marques, OM Crook
arXiv
Nature Computational Science - 2022 (Preprint)

Ranking labs-of-origin for genetically engineered DNA using Metric Learning
I. Muniz, F. H. F. Camargo, A. Marques
arXiv - 2021

Awards

  • 1st Place in the 2nd Workshop on Artificial Intelligence - Federal University of Goiás 2019
  • 2nd Place in the "Porto Seguro Data Challenge" - Kaggle 2021
  • 2nd Place in the "Genetic Engineering Attribution Challenge Innovation Track" - DrivenData 2021
  • 2nd Place in the Hackathon "Dev for a Change" - Indra Brasil 2019
  • 9th Place in the Latin America Mercado Libre Data Challenge - https://ml-challenge.mercadolibre.com/ 2019
  • 10th Place in the "Genetic Engineering Attribution Challenge Prediction Track" - DrivenData 2021
  • 22nd Place (Silver Medal) against 1538 teams in the Understanding Clouds from Satellite Images Challenge - Kaggle 2019
  • 29th Place (Silver Medal) against 1305 teams in the SIIM-FISABIO-RSNA COVID-19 Detection - Kaggle 2021
  • 30th Place (Silver Medal) against 1620 teams in the Santa's Workshop Tour 2019 - Kaggle 2019/2020
  • 77th Place (Bronze Medal) against 1233 teams in the TensorFlow 2.0 Question Answering - Kaggle 2020
  • 78th Place (Bronze Medal) against 1275 teams in the VinBigData Chest X-ray Abnormalities Detection - Kaggle 2021