publications
published works
Davies, A., Nguyen, E., Simeone, M., Johnston, E., & Gubri, M. (2025). Social Science Is Necessary for Operationalizing Socially Responsible Foundation Models. In ICLR 2025 Workshop on Human-AI Coevolution.
paper poster
Lamb, T., Davies, A., Paren, A., Torr, P., & Pinto, F. (2025). Focus On This, Not That! Steering LLMs With Adaptive Feature Specification. In ICLR 2025 Workshop on Foundation Models in the Wild.
paper
Mannekote, A., Davies, A., Kang, J., & Boyer, K. E. (2025). Can LLMs Reliably Simulate Human Learner Actions? A Simulation Authoring Framework for Open-Ended Learning Environments. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 39, No. 28, pp. 29044-29052).
paper code
Hemmat, A., Davies, A., Lamb, T., Yuan, J., Torr, P., Khakzar, A., & Pinto, F. (2024). Hidden in Plain Sight: Evaluating Abstract Shape Recognition in Vision-Language Models. Advances in Neural Information Processing Systems, 37, 88527-88556.
webpage paper poster dataset code
Canby, M.*, Davies, A.*, Rastogi, C., & Hockenmaier, J. (2024). Measuring the Reliability of Causal Probing Methods: Tradeoffs, Limitations, and the Plight of Nullifying Interventions. In NeurIPS 2024 Workshop on Interpretable AI (Oral).
preprint (updated 2025) IAI 2024 paper IAI 2024 slides IAI 2024 poster
(* denotes equal contribution.)
Davies, A., Jiang, J., & Zhai, C. (2024). Competence-Based Analysis of Language Models. In NeurIPS 2024 Workshop on Interpretable AI.
paper poster
Mannekote, A., Davies, A., Pinto, J. D., Zhang, S., Olds, D., Schroeder, N. L., … & Zhai, C. (2024). Large Language Models for Whole-Learner Support: Opportunities and Challenges. Frontiers in Artificial Intelligence, 7, 1460364.
paper
Yuan, J.*, Pinto, F.*, Davies, A.*, & Torr, P. (2024). Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators. Proceedings of the 41st International Conference on Machine Learning, 235, 57924-57952.
paper webpage poster video code
(* denotes equal contribution.)
Zhang, Y.*, Davies, A.*, & Zhai, C. (2024). Understanding the social construction of juvenile delinquency: insights from semantic analysis of big-data historical newspaper collections. Journal of Computational Social Science, 1-43.
paper
(* denotes equal contribution.)
Satheesan, S. P., Bhavya, Davies, A., Craig, A. B., Zhang, Y., & Zhai, C. (2022). Toward a Big Data Analysis System for Historical Newspaper Collections Research. In Proceedings of the Platform for Advanced Scientific Computing Conference (pp. 1-11).
paper slides video
Davies, A. (2021). Definitional Templating: A Novel Approach to Modeling the Compositional Semantics of Noun Compounds (Publication No. UUCS 21-013).
senior thesis
preprints
Lee, S., Davies, A., Canby, M., & Hockenmaier, J. (2025). Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality. arXiv preprint arXiv:2503.24277.
preprint
Davies, A., & Khakzar, A. (2024). The Cognitive Revolution in Interpretability: From Explaining Behavior to Interpreting Representations and Algorithms. arXiv preprint arXiv:2408.05859.
preprint