Mohammed Samiul Saeef

Film buff, rock music enthusiast, amateur photographer and not-so-regular backpacker. I take a special interest in philosophy, geopolitics and modern history. I love playing with raw data.


Experience

Data Scientist Intern

Walmart Labs

In collaboration with the Voice Commerce team, built a knowledge base for Walmart's catalog, focusing on grocery items, to support meaningful training data generation and question answering. Also implemented a fuzzy search technique for the team's recipe-to-ingredient use case.

June 2019 - August 2019

Graduate Research Assistant

University of Texas Arlington

Working in the Innovative Database and Information Systems Research Laboratory (IDIR) under Dr. Chengkai Li.

August 2018 - Present

Graduate Teaching Assistant

University of Texas Arlington

Graded assignments, projects, and tests; proctored exams; and advised students for the following courses:

CSE 2320 Algorithms & Data Structures
CSE 5311 Design & Analysis of Algorithms
CSE 4334 Data Mining
CSE 5324 Software Engineering: Analysis, Design, and Testing

August 2016 - July 2018

Software Engineer

QI Analysis Inc.

Worked in a team developing CRM software that uses NLP, AI and visualization engines to analyze communications between parties, automate data capture, build an opportunity pipeline, and highlight intelligence to improve deal management.

October 2015 - December 2015

Education

University of Texas Arlington

Ph.D. Candidate
Computer Science

GPA: 3.89

August 2016 - Present

Bangladesh University of Engineering & Technology

Bachelor of Science
Computer Science & Engineering

May 2010 - August 2015

Skills

Programming Languages
  • C
  • C++
  • Python
  • Java
  • MATLAB
  • Scala
  • R
  • Shell
  • LaTeX
  • Assembly
  • Prolog

Projects

Orion

Innovative Database and Information Systems Research Laboratory

Orion is a visual interface for querying ultra-heterogeneous graphs. It iteratively assists users in constructing query graphs by making suggestions based on data mining methods. In its active mode, Orion automatically suggests the top-k edges to add to a query graph. In its passive mode, the user adds a new edge manually, and Orion suggests a ranked list of labels for that edge. Orion’s edge ranking algorithm, Random Decision Paths (RDP), uses co-occurring edge sets to rank candidate edges by how likely they are to match the user’s query intent.

Extensive user studies using Freebase showed that Orion users had a 70% success rate in constructing complex query graphs, a significant improvement over the 58% success rate of users of a baseline system that resembles existing visual query builders. In addition, using active mode only, RDP was compared with several methods adapted from other data mining algorithms: random forests, naïve Bayes classifiers, class association rules, and collaborative filtering based on singular value decomposition. On average, RDP required 40 suggestions to correctly reach a target query graph, while the other methods required 1.5–4 times as many.

August 2016 - Present

Awards & Certifications

  • 3rd Place - Texas Intercollegiate 9-ball Billiards Tournament (C-class) 2017
  • 13th Place - ACM ICPC Bangladesh National Programming Contest 2013
  • 24th Place - ACM ICPC Regional Contest, Dhaka Site 2012
  • 4th Place - ACM ICPC Regional Contest Preliminary Round, Dhaka Site 2012
  • 16th Place - BUET Inter-University Programming Contest 2011

Photography