Sheallika Singh

I recently completed my master's at the Data Science Institute at Columbia University. I am currently on the job market, looking for exciting and impactful work in Machine Learning and its applications. I have a strong background and interest in Machine Learning and Deep Learning, and in their applications to Computer Vision, Natural Language Processing, and Statistics. I want to harness the power of technology to positively impact a larger section of society.

I graduated with a major in Mathematics and Scientific Computing and a minor in Industrial and Management from the Indian Institute of Technology Kanpur in 2016. As part of my undergraduate thesis, I worked with Prof. Harish Karnick and Prof. Amit Mitra on a font-free optical character recognition system for the Devanagari script. During the summer of 2015, I worked as a data science intern at Fuzzy Logix under the mentorship of Mr. Partha Sen (CEO and CVO), where I developed ensemble models for web-based in-database analytics.

During the summer of 2017, I was fortunate to work as a Machine Learning Researcher at Netflix (Cloud Media Systems team), where I led two independent projects on Computer Vision and Language.

Email  /  Resume  /  Github  /  LinkedIn

Awards and Achievements
  • Recipient of the Anita Borg Student Scholarship in 2015
  • Recipient of the Innovation in Science Pursuit for Inspired Research Scholarship, awarded by the Government of India during 2012-16
  • Senior Executive, Professional Affairs, Techkriti'14, the annual technical and entrepreneurial festival of IIT Kanpur in 2015
  • Teaching Assistant for two courses: Applied Machine Learning and Advanced Analytics during Spring 2017
  • Awarded a Certificate of Excellence by the Education Department, Chandigarh, India for academic excellence
Internships

Perceived Emotion Recognition

Worked on classifying the overall perceived emotion of a video from changes in the facial expressions of the people in it over time.

Dense Captioning Events in the Video

Worked on generating captions for events taking place in a video, using the context of what happens in neighboring events.

Extracting Relations From Large Plain-text Collections
Mentored by Mohamed AlTantawy (CTO) and Prem Ganesh Kumar (NLP Engineer)
Agolo, an NYC-based news summarization startup, January'17 - Present

Working on extracting lesser-known relationships from financial news articles, starting from a set of well-known relationships. The main idea is to build a large knowledge graph covering different relationships.

Ensemble Models for web based in-database analytic services
Mentored by Partha Sen (CEO and CVO)
Fuzzy Logix (a Charlotte-based predictive and in-database analytics startup), May'15 - July'15

Analyzed the effect of ensembling on the predictive power of Decision Tree, Logistic Regression, Neural Network, and Naive Bayes models. Observed significant improvements in accuracy (5-15%) for ensembles of decision trees (Random Forest) and of neural networks, but no significant improvement for ensembles of logistic regression models.

Developed stored procedures for in-database analytic solutions on Netezza and SQL Server.

Was offered a full-time employment opportunity.

Statistical Methods in Market Research
Research Project under Prof. Amit Mitra
Indian Institute of Technology Kanpur, May'14 - July'14

Modeled complex customer relationships with Markov chain models in Statistical Analysis System (SAS) to predict each customer's revenue band with 94% accuracy.

Undergraduate Thesis

Font-Free Optical Character Recognition for Devanagari
Mentored by Prof. Harish Karnick and Prof. Amit Mitra
Indian Institute of Technology Kanpur
Report / PPT

Devised algorithms to identify conjunct characters (samyuktakshars) and separate them into individual characters. The proposed algorithm lowers Word Error Rate by 10-15% compared with current state-of-the-art techniques.

The work has been submitted to ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).

Course Projects

Reducing Manufacturing Failures - A Kaggle Challenge
Big Data Analytics Course Project under Prof. Ching-Yung Lin, Columbia University
Report / Code

Used Extreme Gradient Boosting for feature extraction from 7 GB of training data comprising about 4,200 features. With the chosen subset of features, Random Forest (MCC score: 0.40) and Gradient Boosting (MCC score: 0.41) emerged as the best models for predicting whether a product on the production line will be defective.
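For readers unfamiliar with the MCC (Matthews correlation coefficient) scores quoted above, the following is a minimal sketch of how the metric is computed for binary labels. The function name and the tiny label lists are illustrative assumptions, not taken from the project code.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any row or column of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical labels: 1 = defective, 0 = not defective.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(matthews_corrcoef(y_true, y_pred))
```

MCC ranges from -1 to 1 and stays informative when defective products are rare, which is one reason it suits heavily imbalanced data better than plain accuracy.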


A Neural Algorithm of Artistic Style
Deep Learning and Neural Networks Course Project under Prof. Zoran Kostic, Columbia University
Report / Code

Fused the artistic style of an artwork with the content of an image by maximizing the correlation between the generated image and the feature maps of the VGG-19 layers used to extract style and content. Provided a modified implementation of the Neural Algorithm of Artistic Style proposed by Leon Gatys in the paper.

Also used Markov Random Fields for image generation (local image texturing), which gave a better visualization of the fusion.


Surveillance System for Vehicular Classification and Pedestrian Detection
Machine Learning Techniques Course Project under Prof. Harish Karnick, IIT Kanpur
Report

Implemented and compared different object detection, feature extraction (HOG, AlexNet), and classification methods (Random Forest, Kernel SVM, Logistic Regression) on a large amount of unlabeled video surveillance data collected at IIT Kanpur.

Entity recognition with a deep region-based convolutional neural network, feature extraction from the penultimate layer of AlexNet, and classification with Random Forest yielded the best performance (91.75% accuracy) along with real-time processing.


Capital Bikeshare Demand Prediction - A Kaggle Challenge
Data Mining Course Project under Prof. Amit Mitra, IIT Kanpur
Report / Code

Combined historical usage patterns with weather data to forecast bike rental demand in the Capital Bikeshare program in Washington, D.C. Built an ensemble of random forests and exponential gradient boosting to predict bike rental demand with 88% accuracy.


Non Linear Classification using Kernel Methods
Convex Optimization Course Project under Prof. Ketan Rajawat, IIT Kanpur
Report

Studied Support Vector Machines and Nearest Neighbors as classification methods that can be kernelized. Also examined different formulations of SVMs, namely C-SVM and nu-SVM. Demonstrated the wide applicability and ease of use of Kernel SVMs on real-world problems such as face detection, handwritten character detection, and spam/non-spam classification.
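To illustrate what it means for Nearest Neighbors to be kernelized, here is a minimal sketch that is not taken from the project report. It relies on the standard identity that the squared distance in feature space can be written with kernel evaluations only: d²(x, y) = k(x, x) - 2 k(x, y) + k(y, y). The RBF kernel, the `gamma` value, and the toy points are assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def linear_kernel(x, y):
    """Plain dot product; recovers ordinary Euclidean distance."""
    return sum(a * b for a, b in zip(x, y))

def kernel_distance_sq(x, y, kernel):
    """Squared distance in the kernel's feature space, using only kernel calls."""
    return kernel(x, x) - 2 * kernel(x, y) + kernel(y, y)

def kernel_nearest_neighbor(query, points, labels, kernel=rbf_kernel):
    """Return the label of the training point closest to `query` in feature space."""
    best = min(range(len(points)),
               key=lambda i: kernel_distance_sq(query, points[i], kernel))
    return labels[best]

# Toy data (hypothetical): two labelled points in the plane.
points = [(0.0, 0.0), (5.0, 5.0)]
labels = ["a", "b"]
print(kernel_nearest_neighbor((1.0, 1.0), points, labels))
```

Swapping `rbf_kernel` for `linear_kernel` gives the usual Euclidean nearest neighbor, which makes the kernel the only part that changes between the linear and non-linear versions.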


Dimensionality Reduction - A Comparative Study
Learning with Kernels Course Project under Prof. Harish Karnick, IIT Kanpur
Report / PPT

Systematically studied the existing techniques and literature on dimensionality reduction, and compared the techniques on different types of datasets.


Recurrence Patterns for Breast Cancer and Time of Likely Recurrence
Regression Analysis Course Project under Prof. Sharmishtha Mitra, IIT Kanpur

Applied binomial logistic regression to classify patients as having recurrent or non-recurrent breast cancer with 72% accuracy. Predicted the time after which the tumour recurs in recurring cases using a multiple linear regression model.


Foreign Exchange Rate Forecasting
Time Series Analysis Course Project under Prof. Amit Mitra, IIT Kanpur

Predicted the daily foreign exchange rate of the Indian Rupee against the US dollar, euro, yen, and British pound using a linear model (ARIMA) and compared the results with a random walk model. Concluded that no ARIMA model could beat the random walk model's predictions of daily foreign exchange rates.
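As a rough illustration of the random walk baseline mentioned above, the sketch below forecasts each day's rate as the previous day's rate and scores it with mean absolute error. The series values and function names are made up for illustration and are not the project's data or code; a candidate model's one-step forecasts can be scored with the same error function for comparison.

```python
def random_walk_forecast(rates):
    """One-step-ahead forecasts: each day's prediction is the previous day's rate."""
    return rates[:-1]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily exchange rates, for illustration only.
rates = [64.0, 64.5, 64.2, 64.8, 65.0]

actual = rates[1:]  # days that have a previous day to forecast from
baseline_mae = mean_absolute_error(actual, random_walk_forecast(rates))
print(f"Random walk MAE: {baseline_mae:.3f}")
```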

Teaching

Teaching Assistant: Machine Learning, COMS4771 - Fall 2017 [Columbia University] with Prof. Nakul Verma

Teaching Assistant: Big Data Analytics, EECS E6893 - Fall 2017 [Columbia University] with Prof. Ching-Yung Lin

Teaching Assistant: Applied Machine Learning, COMS4995 - Spring 2017 [Columbia University] with Prof. Andreas Mueller

Teaching Assistant: Advanced Analytic/Quantitative Techniques, G4018 - Spring 2017 [Columbia University] with Prof. Gregory M. Eirich

