Projects
These are my machine learning projects that I have been hacking on outside of work. They showcase my interests from self-supervised learning, large language models and AI safety.
Analyzing LLMs' Preference for Incorrect Yet Appealing Answers
May 8th 2023 | Solo Project
Taking inspiration from Anthropic's research, this work examines the evaluation bias in large language models (LLMs) by assessing their tendency to choose pleasant-sounding but incorrect answers. The analysis involved a range of OpenAI models at different scales and revealed that all models had around a 50% chance of selecting such answers, regardless of model size or reinforcement learning from human feedback (RLHF). I am planning on working to refine the test set to further explore this phenomenon.
Image prompt: "Questions that sound nice but are incorrect, digital art"
AGISF Project: Is BabyAGI a fire alarm?
April 8th 2023 | Solo Project
I am currently in the middle of this project which is part of the AGI Safety Fundamentals course I am part of this year. The aim of this project is to understand if the new auto-prompting and self-prompting frameworks which are popular (such as BabyAGI and AutoGPT) are a cause for concern in terms of AI safety.
Visit the repo | Twitter thread
Image prompt: "Robot baby surrounded by fire, digital art"
Whisper Interpetability
November 13th 2022 | Team of 4
During our Alignment Jam Interpretability Hackathon project, I explored whether the concept of "logit lens" applies to the encoder and decoder layers in Whisper, an end-to-end speech recognition model. I found that removing layers from the decoder quickly degraded the output, while removing layers from the encoder gradually degraded the output. Others in the team delved deeper into attention patterns for audio examples that showed hallucinations.
Read the report | Visit the repo
Image prompt: "Whisper interpretability hallucinate, digital art"
Meaning Error Rate
October 25th 2022 | Solo Project
Meaning Error Rate (MER) is an alternative for evaluating speech recognition systems that considers changes in meaning. It is automated by using GPT-3, few shot learning, and chain of thought. It is based on NER (not to be confused for named entity recognition) and this is the first novel solution to automate it. This is exciting research for media broadcast firms (who are bound by NER scores in government regulation) as they can avoid expensive human labelling of the severity of errors.
Read the blog | Visit the repo | Watch talk @ Voice 22
Image prompt: "Brain made out of chains with a colourful background complex machine learning parts, digial art"
Emotion Recognition and CPC
September 2nd 2020 | Solo Project
Emotion detection in audio utilising self-supervised representations trained with Contrastive Predictive Coding (CPC). Results have improved from a baseline of 71% to 80% accuracy when using CPC which is a significant relative reduction in error of 30%.
Read the blog | Visit the repo
Image prompt: "Emotion recognition, cartoon"
Masters Thesis: Automatic Lecture Captioning
June 5th 2019 | Cambridge University Engineering Department
This project developed speaker-specific automatic speech recognition systems to transcribe lectures from the Cambridge University Engineering Department. The systems used language model refinement and acoustic model adaptation to correctly decode keywords chosen from lecture handouts. The quality of the transcription was primarily assessed using keyword occurrences and corresponding recall rates for all lecturers.
Read my thesis | See the slides | Watch the demo
Image prompt: "Futuristic full lecture theatre on machine learning, digital art"