Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback

Publication, Conference
Keswani, V; Cousins, C; Nguyen, B; Conitzer, V; Heidari, H; Borg, JS; Sinnott-Armstrong, W
Published in: Proceedings of the AAAI Conference on Artificial Intelligence
January 1, 2026

Alignment methods in moral domains seek to elicit the moral preferences of human stakeholders and incorporate them into AI. This presupposes that moral preferences are static targets, but such preferences often evolve over time. Proper alignment of AI to dynamic human preferences should ideally account for “legitimate” changes to moral reasoning, while ignoring changes related to attention deficits, cognitive biases, or other arbitrary factors. However, common AI alignment approaches largely neglect temporal changes in preferences, posing serious challenges to proper alignment, especially in high-stakes applications of AI, e.g., in healthcare domains, where misalignment can jeopardize the trustworthiness of the system and yield serious individual and societal harms. This work investigates the extent to which people’s moral preferences change over time, and the impact of such changes on AI alignment. Our study is grounded in the kidney allocation domain, where we elicit responses to pairwise comparisons of hypothetical kidney transplant patients from over 400 participants across 3–5 sessions. We find that, on average, participants change their response to the same scenario presented at different times around 6–20% of the time (exhibiting “response instability”). Additionally, we observe significant shifts in several participants’ retrofitted decision-making models over time (capturing “model instability”). Predictive performance of simple AI models decreases as a function of both response and model instability. Moreover, predictive performance diminishes over time, highlighting the importance of accounting for temporal changes in preferences during training. These findings raise fundamental normative and technical challenges relevant to AI alignment, highlighting the need to better understand the object of alignment (what to align to) when user preferences change significantly over time, including the mechanisms underlying these changes.


Published In

Proceedings of the AAAI Conference on Artificial Intelligence

DOI

10.1609/aaai.v40i44.41083
EISSN

2374-3468

ISSN

2159-5399

Publication Date

January 1, 2026

Volume

40

Issue

44

Start / End Page

37501 / 37509
 

Citation

APA: Keswani, V., Cousins, C., Nguyen, B., Conitzer, V., Heidari, H., Borg, J. S., & Sinnott-Armstrong, W. (2026). Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 40, pp. 37501–37509). https://doi.org/10.1609/aaai.v40i44.41083

Chicago: Keswani, V., C. Cousins, B. Nguyen, V. Conitzer, H. Heidari, J. S. Borg, and W. Sinnott-Armstrong. “Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback.” In Proceedings of the AAAI Conference on Artificial Intelligence, 40:37501–9, 2026. https://doi.org/10.1609/aaai.v40i44.41083.

ICMJE: Keswani V, Cousins C, Nguyen B, Conitzer V, Heidari H, Borg JS, et al. Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2026. p. 37501–9.

MLA: Keswani, V., et al. “Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 44, 2026, pp. 37501–09. Scopus, doi:10.1609/aaai.v40i44.41083.

NLM: Keswani V, Cousins C, Nguyen B, Conitzer V, Heidari H, Borg JS, Sinnott-Armstrong W. Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback. Proceedings of the AAAI Conference on Artificial Intelligence. 2026. p. 37501–37509.
