Bernstein, Gary Abraham: Alignment Through Self-Understanding: A Game-Theoretic Argument
Current approaches to AI alignment (RLHF, constitutional AI, debate) treat alignment as a constraint problem: how to impose human values on systems that might otherwise pursue misaligned objectives. I argue that this framing misses a structural alternative. If the pattern-randomness dichotomy exhausts existence, then both human and AI systems are mathematical structures operating in the same ontological space. This shared ontology enables an alignment approach based on self-understanding rather
