Computational Neuroscience Master's Student @ University of Tübingen
Hi! I’m a first-year master’s student of Computational Neuroscience at the University of Tübingen. I’m exploring the fundamentals of neuroscience and am deeply passionate about working on problems related to intelligence.
I completed my B.Tech. in Computer Engineering at Jamia Millia Islamia University, New Delhi, in June 2024. I was supervised by Prof. Tanvir Ahmad on my Bachelor’s thesis, “Music Generation from Brain Scans”, which was my first major exploration of a neuroscience problem: reconstructing the music that subjects listened to purely from their fMRI scans. This project made me fall in love with neuroscience.
I’ve also worked as a researcher at the HCTL Lab, TU Munich, led by Prof. Dr. Enkelejda Kasneci, where I was supervised by Yao Rong on the topic of self-supervised learning for scanpaths. I was a full-time summer research intern at TUM in 2023, funded by the DAAD-WISE Scholarship. Before that, I was a summer research intern at IIIT Allahabad under Prof. Anupam Agarwal, where I worked on ASD diagnosis based on visual attention. I was mentored by Hitesh Hinduja on my first research paper, on wild scene text localization in images.
I’ve had the honor of leading my team sCUDA_Divers to the Grand Finale of the Smart India Hackathon in two consecutive years, 2022 and 2023. Our hackathon project on the Super-Resolution of Digital Elevation Models culminated in a research paper published in IEEE IGARSS 2023.
Recently, as part of my Bachelor’s Minor Thesis, I explored GNNs, LLMs, and RAG techniques for the research competition on Multimodal Emotion-Cause Pair Extraction in Conversations, a SemEval 2024 task. (Paper accepted at the workshop!)
01/10/2024
Began Master’s in Computational Neuroscience at Tübingen!
23/05/2024
Graduated with Honours in B.Tech. Computer Engineering!
19/03/2024
Paper on LLMs Accepted in SemEval 2024 Workshop!
06/03/2024
2nd Rank in Third Year of Computer Engineering!
31/01/2024
Team JMI Secured 4th Position at SemEval Task-3!
19/12/2023
Grand Finalist of SIH 2023 at KIT, Kolhapur!
01/08/2023
Completed Summer Research Internship at TU Munich!
16/07/2023
Master GAN Paper Published in IGARSS 2023!
10/04/2023
3rd Rank in Second Year of Computer Engineering!
17/02/2023
Text Localization Paper Published in IOSR Journal!
16/01/2023
Achieved DAAD-WISE Scholarship!
25/08/2022
Grand Finalist of SIH 2022 at GTU, Ahmedabad!
15/07/2022
Completed Summer Research Internship at IIIT Allahabad!
JMI at SemEval 2024 Task 3: Two-step approach for multimodal ECAC using in-context learning with GPT and instruction-tuned Llama models
Arefa, M. A. Ansari*, C. Saxena, T. Ahmad
SemEval Workshop, 2024 (ArXiv Preprint)
[PDF] [Code]
Master GAN: Multiple Attention is all you Need: A Multiple Attention Guided Super Resolution Network for Dems
A. Mohammed, M. Kashif, M. H. Zama, M. A. Ansari* and S. Ali
IEEE IGARSS, 2023
[PDF] [Code]
Revisiting TextFuseNet: Text Context Enhanced Attention Networks For Scene Text Localization
H. Hinduja, M. A. Ansari*
IOSR Journal of Computer Engineering, 2023
[PDF] [Code]
In order of recency:
Music Generation from Brain Scans
Bachelor’s Major Thesis, 2024
[Thesis]
[Slides]
Tackled the reconstruction of music listened to by a subject from their fMRI scans, using Nakai et al.’s dataset of fMRI scans from 5 subjects listening to 540 music pieces. Modified Meta’s MusicGen model for music generation conditioned on fMRI scans using the Map Method. Experimented with EnCodec, Chromagram Tokenizer, and T5 encoders, achieving the best performance using the T5 encoder with total averaging (FAD: 8.41, KL: 2.42, MCD: 4.87). Identified the temporal lobe as crucial for music reconstruction, highlighting the importance of auditory processing, language comprehension, and multimodal integration in the neural representation of music.
Multimodal Emotion-Cause Analysis in Conversations using in-context learning and instruction-tuned LLMs
SemEval 2024 Workshop Task 3 Competition
[Paper] [Code]
Developed an efficient video captioning technique for conversational videos using GPT-4-Vision. Used demonstration learning through retrieved examples for emotion recognition and cause prediction with GPT-3.5 for SemEval Task 3. Also implemented an instruction-tuned Llama-2 model using the QLoRA technique. Our approach ranked 4th in the competition.
(Paper Accepted!)
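The retrieved-demonstration idea mentioned above can be illustrated with a small sketch. This is not the code from the paper: the embedding vectors, the example pool, the prompt wording, and the function names `retrieve_demonstrations` and `build_prompt` are all hypothetical, and cosine similarity is assumed as the retrieval score.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def retrieve_demonstrations(query_vec, pool, k=2):
    """Return the k pool items most similar to the query.

    Each pool item is a (vector, utterance, label) tuple; the vectors are
    assumed to come from some sentence encoder that is not shown here.
    """
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return ranked[:k]


def build_prompt(query_utterance, demonstrations):
    """Assemble a few-shot prompt from retrieved demonstrations (hypothetical wording)."""
    parts = ["Label the emotion of the final utterance."]
    for _, utterance, label in demonstrations:
        parts.append(f"Utterance: {utterance}\nEmotion: {label}")
    parts.append(f"Utterance: {query_utterance}\nEmotion:")
    return "\n\n".join(parts)


# Toy pool with made-up 2-D "embeddings".
pool = [
    ([1.0, 0.0], "I got the job!", "joy"),
    ([0.0, 1.0], "I lost my keys.", "sadness"),
    ([0.9, 0.1], "We won!", "joy"),
]
demos = retrieve_demonstrations([1.0, 0.0], pool, k=2)
prompt = build_prompt("They said yes!", demos)
```

The resulting `prompt` string would then be sent to the language model; that call is omitted because it depends on the API client in use.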
Multimodal Emotion-Cause Pair Extraction using Graph Neural Networks
Bachelor’s Minor Thesis, 2023
[Thesis] [Slides]
Developed a graph neural network for emotion-cause pair extraction from multimodal conversational data. Utilized CLIP, BERT, and HTS-AT audio encoder for diverse modality features. Explored multimodal fusion in transformers. Modeled conversational structure with graph attention networks.
Real-time Indoor Video Dehazing using Knowledge Distillation
Smart India Hackathon Grand Finale, 2023
[Slides] [Solution Proposal]
We proposed modifying MAPNet, a U-Net-based dehazing network for outdoor environments, by replacing some of its blocks with TAM-Net, a 2D convolutional variant for videos. We experimented with distillation by creating a smaller student network for dehazing. During the hackathon, we experimented with the Dark Channel Prior and Boundary Constraint Regularization approaches for benchmarking.
Self-Supervised Learning for Free-Viewing Scanpaths
DAAD-WISE Research Project at TU Munich, 2023
[Slides] [Code]
Eye movements can serve as a proxy for extracting the neurological states of a subject. Our goal was to improve the classification of a subject’s cognitive characteristics based on their free-viewing scanpaths on images. We experimented with a non-contrastive self-supervised learning technique based on Barlow Twins, for which we implemented novel scanpath distortion techniques to create multiple views of scanpaths. A combined dataset was created from multiple publicly available free-viewing scanpath datasets. Experiments demonstrated improvements in the downstream task of autism detection.
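For readers unfamiliar with the Barlow Twins objective, here is a minimal pure-Python sketch of the loss on two batches of embeddings. It is illustrative only and not the project code: the two views are assumed to come from an encoder and distortion pipeline that are not shown, and the function names and the off-diagonal weight of `5e-3` are assumptions for this example.

```python
import math


def _standardize(batch):
    """Normalize each feature to zero mean and unit variance across the batch."""
    n, dims = len(batch), len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [row[d] for row in batch]
        mean = sum(column) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        if std == 0.0:
            raise ValueError("feature has zero variance across the batch")
        for i in range(n):
            out[i][d] = (column[i] - mean) / std
    return out


def barlow_twins_loss(z_a, z_b, off_diag_weight=5e-3):
    """Barlow Twins loss for two views' embeddings, each a list of N rows of D floats.

    Pushes the cross-correlation matrix of the two views toward the identity:
    diagonal entries toward 1 (invariance), off-diagonal toward 0 (redundancy reduction).
    """
    if len(z_a) != len(z_b):
        raise ValueError("both views must have the same batch size")
    n = len(z_a)
    a, b = _standardize(z_a), _standardize(z_b)
    dims = len(a[0])
    loss = 0.0
    for i in range(dims):
        for j in range(dims):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2
            else:
                loss += off_diag_weight * c_ij ** 2
    return loss


# Toy embeddings with two uncorrelated features.
view = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
flipped = [[-x for x in row] for row in view]
loss_same = barlow_twins_loss(view, view)
loss_flipped = barlow_twins_loss(view, flipped)
```

Identical views with decorrelated features give a loss of zero, while a sign-flipped view is penalized on the diagonal terms. In practice this would be computed on tensors inside a deep learning framework rather than on nested lists.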
Super-Resolution of Digital Elevation Models (DEM)
Smart India Hackathon Grand Finale, 2022 & IEEE IGARSS, 2023
[Slides] [Code] [Paper]
Led a team in developing a U-Net-based convolutional network with attention for DEM super-resolution in ISRO’s Smart India Hackathon. DEMs collected from USGS LiDAR and SRTM, NASA ASTER, and ISRO CartoSAT were used to curate the training set. Our team proposed the MASTER GAN architecture, achieving state-of-the-art results (PSNR 31.024, SSIM 0.908), which was published at the IEEE IGARSS 2023 conference.
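As context for the PSNR figure quoted above, the metric can be sketched as follows. This is a minimal pure-Python illustration and not the evaluation code from the paper; the `data_range` argument and the toy grids are assumptions, and how the paper normalizes elevation values is not stated here.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two equally sized 2D grids."""
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("grids must have the same shape")
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical grids
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy 2x2 "elevation" grids where every cell is off by exactly 1 unit.
reference = [[0.0, 100.0], [200.0, 255.0]]
estimate = [[1.0, 99.0], [201.0, 254.0]]
score = psnr(reference, estimate, data_range=255.0)
```

Higher values mean the super-resolved grid is closer to the reference; the score depends directly on the chosen `data_range`, so values are only comparable under the same convention.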
Improved Visual Attention Classification for Autism Spectrum Disorder through Time-Dependent Representations
Research Internship Project at IIIT Allahabad, 2022
[Slides] [Code]
Trained a deep learning network combining ResNet-50 and an LSTM on the Saliency4ASD dataset using novel time-dependent representations. Encoded duration into the embeddings via new techniques such as time-masking and joint embedding. Demonstrated improvements from incorporating duration as a feature for ASD classification.
Text Localization using Efficient Attention
Research Project under Mentorship of Mr Hitesh Hinduja, 2021-23
[Code] [Paper]
Modified the Mask R-CNN architecture from the Detectron2 library with efficient attention for improved text localization accuracy on the SynthText dataset.
Robust Face Recognition Security System
HackJMI Hackathon Project, 2021
[Code]
Developed a robust face recognition security system using MTCNN, VGGFace, and an Inception-ResNet Siamese network, capable of detecting spoofed faces. A website with a Flask backend was created as an MVP, which won the runner-up position at the hackathon.
Novel Bible Verse Generator
First Deep Learning Project, 2021
[Notebook] [Example]
Trained a character-level neural language model on the Bible (KJV) using an LSTM. Built a loop to generate 1000 characters from a seed text, which was the fake Bible verse quoted in Pulp Fiction (1994).
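The generation loop described above can be sketched roughly as below. This is not the original notebook code: `next_char_probs` stands in for the trained LSTM, and `toy_model` is a hypothetical placeholder that only exists to make the example runnable.

```python
import random


def generate_text(next_char_probs, seed_text, num_chars=1000, rng=None):
    """Extend seed_text one character at a time by sampling from the model.

    next_char_probs(context) must return a dict mapping each candidate
    character to its probability given the text generated so far.
    """
    rng = rng or random.Random()
    text = seed_text
    for _ in range(num_chars):
        probs = next_char_probs(text)
        chars = list(probs.keys())
        weights = list(probs.values())
        text += rng.choices(chars, weights=weights, k=1)[0]
    return text


def toy_model(context):
    """Hypothetical stand-in for the LSTM: alternates between 'a' and 'b'."""
    return {"b": 1.0} if context[-1] == "a" else {"a": 1.0}


sample = generate_text(toy_model, "b", num_chars=4)
long_sample = generate_text(toy_model, "x", num_chars=1000)
```

With a real model, `next_char_probs` would run the network on the most recent characters and return the softmax over the vocabulary; passing a seeded `random.Random` makes the sampling reproducible.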