Research
Improving Bounds for Randomly Sampling Graph Colorings
Undergraduate Senior Thesis (in progress).
Targeted Edge Perturbations on GNNs: Exploring Greedy, Heuristic, and Gradient-Driven Approaches
Spring
Here is an abstract for the final project:
Graph neural networks (GNNs) have grown in application and research over the past decade, in tandem with investigation into adversarial attacks targeting them. Most adversarial attacks proposed thus far fail to be realistic in practice because they prioritize budget-based impact over inconspicuousness. Guided by the Minimum Edge Set Perturbation problem, we propose a set of edge perturbation methods that guarantee no change in accuracy while achieving a budgeted perturbation relative to the initial topological structure of the input graph. Alongside these methods, we present the Greedy-Primed Metattack, which achieves a reduction in accuracy similar to that of the vanilla Metattack while staying covert for more than 50% of the specified budget. The Greedy-Primed Metattack sparks inquiry into optimal heuristics for maximally furtive adversarial attacks while fueling further analysis of architectural robustness.
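To make the idea of an accuracy-preserving, budgeted perturbation concrete, here is a minimal sketch of a greedy loop that only keeps an added edge if the classifier's predictions are unchanged. This is an illustration under assumptions, not the project's actual implementation: `predict` is a hypothetical stand-in for a trained GNN returning node labels, and the candidate ordering and budget handling are simplified.

```python
# Illustrative sketch (assumptions, not the project's code): greedily add
# candidate edges, keeping only those that leave every prediction unchanged.
import networkx as nx

def greedy_accuracy_preserving_perturbation(G, predict, budget):
    """Add up to `budget` edges to G without changing any predicted label."""
    baseline = predict(G)                     # predictions on the clean graph
    perturbed = G.copy()
    added = []
    for u, v in nx.non_edges(G):              # candidate edges absent from G
        if len(added) >= budget:
            break
        perturbed.add_edge(u, v)
        if predict(perturbed) == baseline:    # covert: no prediction changes
            added.append((u, v))
        else:
            perturbed.remove_edge(u, v)       # revert edges that alter output
    return perturbed, added
```

In practice the candidate set would be ordered by a heuristic rather than enumerated exhaustively, since re-running inference for every non-edge is expensive on large graphs.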
Here is the final poster:
Winter
Please see the research log for updates.
We have pivoted to the Minimum Edge Set Perturbation problem. Here are the guiding questions:
What is the maximum number of edges that can be added to a graph without changing the classification accuracy of a graph neural network? Are there certain metrics (homophily, degree, nearest neighbors) that provide an algorithmic way to add these edges? How does this compare to standard adversarial attacks?
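One way to turn the metrics mentioned above into an algorithm is to score candidate edges and try the most "innocuous" ones first. The scoring below is an illustrative assumption, preferring same-label (homophilous) pairs with low-degree endpoints; it is a sketch of the kind of heuristic under study, not a result from the project.

```python
# Illustrative heuristic (an assumption): rank candidate edges so that
# homophilous, low-degree pairs are attempted first.
import networkx as nx

def rank_candidate_edges(G, labels):
    """Score non-edges; same-label, lightly connected pairs score highest."""
    scored = []
    for u, v in nx.non_edges(G):
        same_label = 1 if labels[u] == labels[v] else 0
        degree_penalty = G.degree(u) + G.degree(v)
        scored.append(((u, v), same_label, -degree_penalty))
    # Sort by homophily first, then by how lightly connected the endpoints are.
    scored.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return [edge for edge, _, _ in scored]
```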
Fall
Research Proposal:
Research Proposal Presentation:
GraphEval36K: Benchmarking Coding and Reasoning Capabilities of Large Language Models on Graph Datasets
This project was also completed in Professor Singh’s DYNAMO Lab.
Abstract
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP), demonstrating significant capabilities in processing and understanding text data. However, recent studies have identified limitations in LLMs’ ability to manipulate, program, and reason about structured data, especially graphs. We introduce GraphEval36K, the first comprehensive graph dataset, comprising 40 graph coding problems and 36,900 test cases to evaluate the ability of LLMs on graph problem-solving. Our dataset is categorized into eight primary and four sub-categories to ensure a thorough evaluation across different types of graphs. We benchmark ten LLMs, finding that private models outperform open-source ones, though the gap is narrowing. We also analyze the performance of LLMs across directed vs. undirected graphs, different kinds of graph concepts, and network models. Furthermore, to improve the usability of our evaluation framework, we propose Structured Symbolic Decomposition (SSD), an instruction-based method designed to enhance LLM performance on complex graph tasks. Results show that SSD improves the average passing rate of GPT-4, GPT-4o, Gemini-Pro and Claude-3-Sonnet by 8.38%, 6.78%, 29.28% and 25.28%, respectively.
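As a rough illustration of what an instruction-based decomposition looks like, the snippet below builds a staged prompt for a graph coding problem. The stage names and template are assumptions in the spirit of SSD, not the actual prompt used in the paper.

```python
# Illustrative sketch (assumed template, not the paper's SSD prompt):
# wrap a graph coding problem in explicit, ordered reasoning stages.
def build_decomposed_prompt(problem_statement: str) -> str:
    stages = [
        "1. Restate the graph problem and identify the input representation.",
        "2. Choose an appropriate graph algorithm and justify the choice.",
        "3. Write the solution step by step, handling directed and undirected cases.",
        "4. Verify the solution against the provided example test cases.",
    ]
    return (
        "Solve the following graph coding problem by working through each stage in order.\n\n"
        f"Problem:\n{problem_statement}\n\n"
        "Stages:\n" + "\n".join(stages)
    )
```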
Paper
Here is a link to the paper: click here.