Sheallika Singh
I recently completed my master's degree at the Data Science Institute at Columbia University. I am currently on the job market, looking for exciting and impactful work in Machine Learning and its applications. I have a strong background and interest in Machine Learning and Deep Learning, and in their applications to Computer Vision, Natural Language Processing, and Statistics. I want to harness the power of technology to positively impact a larger section of society.
I graduated with a major in Mathematics and Scientific Computing and a minor in Industrial and Management Engineering from the Indian Institute of Technology Kanpur in 2016. As part of my undergraduate thesis, I worked with Prof. Harish Karnick and Prof. Amit Mitra on a font-free optical character recognition system for the Devanagari script. During the summer of 2015, I worked as a data science intern at Fuzzy Logix under the mentorship of Mr. Partha Sen (CEO and CVO), where I developed ensemble models for web-based in-database analytics.
During the summer of 2017, I was fortunate to work as a Machine Learning Researcher at Netflix (Cloud Media Systems team), where I led two independent projects on Computer Vision and Language.
Email / Resume / GitHub / LinkedIn
Awards and Achievements
- Recipient of the Anita Borg Student Scholarship in 2015
- Recipient of the Innovation in Science Pursuit for Inspired Research Scholarship, awarded by the Government of India, 2012-16
- Senior Executive, Professional Affairs, Techkriti'14, the annual technical and entrepreneurial festival of IIT Kanpur in 2015
- Teaching Assistant for two courses: Applied Machine Learning and Advanced Analytics during Spring 2017
- Awarded Certificate of Excellence by Education Department, Chandigarh, India for academic excellence
Perceived Emotion Recognition
Worked on classifying the overall perceived emotion of a video from changes in the facial expressions of the people in it over time.
Dense Captioning of Events in Video
Worked on generating captions for events taking place in a video, using the context of what happens in neighboring events.
Extracting Relations From Large Plain-text Collections
Mentored by Mohamed AlTantawy (CTO) and Prem Ganesh Kumar (NLP Engineer)
Agolo, an NYC-based news summarization startup, January'17 - Present
Working on extracting lesser-known relationships from financial news articles, starting from a set of well-known relationships. The main idea is to build a large knowledge graph covering different relationships.
Ensemble Models for Web-Based In-Database Analytic Services
Mentored by Partha Sen (CEO and CVO)
Fuzzy Logix (a Charlotte-based predictive and in-database analytics startup), May'15 - July'15
Analyzed the effect of ensembling on the predictive power of decision tree, logistic regression, neural network, and Naive Bayes models. Observed significant accuracy improvements (5-15%) for ensembles comprising decision trees (Random Forest) and neural networks, but no significant improvement for ensembles of logistic regression models.
Developed stored procedures for in-database analytic solutions on Netezza and SQL Server.
Was offered a full-time employment opportunity.
Statistical Methods in Market Research
Research Project under Prof. Amit Mitra
Indian Institute of Technology Kanpur, May'14 - July'14
Modeled complex customer relationships with Markov chain models in SAS (Statistical Analysis System) to predict each customer's revenue band with 94% accuracy.
Font-Free Optical Character Recognition for Devanagari
Mentored by Prof. Harish Karnick and Prof. Amit Mitra
Indian Institute of Technology Kanpur
Report / PPT
Devised algorithms to identify conjunct characters (samyuktakshars) and separate them into individual characters. The proposed algorithm lowers the word error rate by 10-15% compared with current state-of-the-art techniques.
The work has been submitted to ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
Reducing Manufacturing Failures - A Kaggle Challenge
Big Data Analytics Course Project under Prof. Ching-Yung Lin, Columbia University
Report / Code
Used Extreme Gradient Boosting for feature extraction from 7 GB of training data comprising about 4,200 features. On the chosen subset of features, Random Forest (MCC score: 0.40) and Gradient Boosting (MCC score: 0.41) emerged as the best models for predicting whether a product on the production line will be defective.
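For readers unfamiliar with the metric, the MCC (Matthews correlation coefficient) quoted above can be computed from binary labels roughly as follows. This is a minimal sketch, not the project's code: the function name `mcc` and the convention of returning 0.0 when the denominator is zero are assumptions made for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = defective, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    print(mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

MCC ranges from -1 to 1 and stays informative on heavily imbalanced data, which is why it suits a task where defective parts are rare.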
A Neural Algorithm of Artistic Style
Deep Learning and Neural Networks Course Project under Prof. Zoran Kostic, Columbia University
Report / Code
Fused the artistic style of an artwork with the content of an image by maximizing the correlation between the generated image and the feature maps of the VGG-19 layers used to extract style and content. Provided a modified implementation of the neural algorithm of artistic style proposed by Leon Gatys et al. in the paper.
Also used Markov Random Fields for image generation (local image texturing), which gave a visually better fusion.
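In the Gatys et al. formulation, style is captured by the correlations between feature-map channels, i.e. their Gram matrix. The sketch below illustrates that idea in plain Python on tiny hand-made feature maps. It is not the project's implementation: the function names and the normalization constants are assumptions, and a real pipeline would take the features from VGG-19 layers.

```python
def gram_matrix(features):
    """Channel-by-channel correlations.

    `features` is a list of C channels, each a flat list of N activations.
    Dividing by N is an assumed normalization, not the paper's exact constant.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g, s = gram_matrix(generated), gram_matrix(style)
    c = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)


if __name__ == "__main__":
    style_feats = [[1.0, 2.0], [3.0, 4.0]]      # hypothetical 2-channel feature map
    generated_feats = [[1.0, 2.0], [3.0, 5.0]]  # hypothetical generated-image features
    print(gram_matrix(style_feats))
    print(style_loss(generated_feats, style_feats))
```

Minimizing this loss with respect to the generated image pulls its channel correlations toward those of the artwork.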
Surveillance System for Vehicular Classification and Pedestrian Detection
Machine Learning Techniques Course Project under Prof. Harish Karnick, IIT Kanpur
Report
Implemented and compared different object detection, feature extraction (HOG, AlexNet), and classification methods (Random Forest, kernel SVM, logistic regression) on a large amount of unlabeled video surveillance data collected at IIT Kanpur.
The combination of entity recognition using a deep region-based convolutional neural network, feature extraction from the penultimate layer of AlexNet, and classification by Random Forest yielded the best performance (91.75% accuracy) with real-time processing.
Nonlinear Classification using Kernel Methods
Convex Optimization Course Project under Prof. Ketan Rajawat, IIT Kanpur
Report
Studied Support Vector Machines and Nearest Neighbors as classification methods that can be kernelized, along with different SVM formulations, namely C-SVM and nu-SVM. Demonstrated the wide applicability and ease of use of kernel SVMs on real-world problems such as face detection, handwritten character detection, and spam/non-spam classification.
Dimensionality Reduction - A Comparative Study
Learning with Kernels Course Project under Prof. Harish Karnick, IIT Kanpur
Report / PPT
Systematically studied the existing techniques and literature on dimensionality reduction, and compared the techniques on different types of datasets.
Recurrence Patterns for Breast Cancer and Time of Likely Recurrence
Regression Analysis Course Project under Prof. Sharmishtha Mitra, IIT Kanpur
Applied binomial logistic regression to classify patients as having recurrent or non-recurrent breast cancer (72%). Predicted the time after which the tumour recurs in recurring cases using a multiple linear regression model.
Foreign Exchange Rate Forecasting
Time Series Analysis Course Project under Prof. Amit Mitra, IIT Kanpur
Predicted the daily exchange rate of the Indian rupee against the US dollar, euro, yen, and British pound using linear (ARIMA) models and compared the results with a random walk model. Concluded that no other ARIMA model's predictions beat the random walk model's predictions of daily foreign exchange rates.
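The random walk baseline simply predicts that tomorrow's rate equals today's. A minimal sketch of how such a baseline can be scored against another model's one-step-ahead forecasts is shown below. The function names, the use of RMSE as the error measure, and the sample numbers are all assumptions for illustration, not the project's data or code.

```python
import math


def random_walk_forecast(series):
    """One-step-ahead random walk forecasts: the forecast for day t is the value on day t-1."""
    return series[:-1]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def compare_to_random_walk(series, model_forecasts):
    """Return (random-walk RMSE, model RMSE) over days 1..n-1 of `series`.

    `model_forecasts` holds some other model's one-step-ahead forecasts
    (e.g. from a fitted ARIMA model) for the same days.
    """
    actual = series[1:]
    return rmse(actual, random_walk_forecast(series)), rmse(actual, model_forecasts)


if __name__ == "__main__":
    rates = [60.0, 61.0, 60.5, 62.0]       # made-up daily rates
    other_model = [60.2, 60.9, 60.6]       # made-up forecasts from another model
    print(compare_to_random_walk(rates, other_model))
```

A model is only worth keeping if its error is clearly below the baseline's on held-out days.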