reinforcement-learning
International audience
Forthcoming/in press
Published on May 24, 2026 3:40 PM GMT. For context, there are two ways models learn in reinforcement learning: exploration vs. exploitation. Every action a model takes has probability p of being random (exploration), and probability 1 - p of being the best possible action among known actions (exploitation). When the model has not been trained at all, when it knows nothing, that first action is p…
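The action rule described in this excerpt (random with probability p, otherwise the best known action) can be sketched in a few lines of Python. This is a minimal illustration, not the post's own code: the function name `choose_action`, the list `q_values` of estimated action values, and the `rng` parameter are all assumptions introduced here.

```python
import random


def choose_action(q_values, p, rng=random):
    """Return an action index for one step.

    q_values: hypothetical list of estimated values, one per known action.
    p: probability of taking a uniformly random action (exploration).
    rng: any object with random() and randrange(), e.g. random.Random(seed).
    """
    if rng.random() < p:
        # Exploration: ignore the estimates and pick any action uniformly.
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest estimated value
    # (ties resolve to the lowest index).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With p = 0 the rule always exploits, and with p = 1 it always explores; intermediate values trade the two off.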
Scientific Reports, Published online: 24 May 2026; doi:10.1038/s41598-026-51105-w Federated multi-cloud task scheduling with load balancing using multi-objective NSGA-II and reinforcement learning
Access to quality tutoring remains out of reach for millions of students in rural schools, NGO programs, and low-resource educational settings in many parts of the world. Most AI tutoring systems require continuous internet access and cloud infrastructure, conditions that simply do not exist in these environments. This paper describes Vedixa, an offline-first adaptive tutoring system built a…
The Matthew Effect Index (MEI) is a new cohort-level effect-size index first developed in serious games analytics, measuring divergence between two cohorts produced by an information-gating mechanism. Like any new measurement instrument, MEI requires principled validation before its readings can be trusted. This primer demonstrates pipeline validation through a battery of five null controls, each…
Stage 1 pre-registration of a no-new-constants forward prediction derived from the UCT compression-interface theorem at d = 5. Predicts the structured readout floor f_U(5) = 24/25 = 0.9600 for U(5)-covariant readout, with primary discriminator f_{Z_2^5} = 31/32 = 0.96875 for sign-channel readout. Contains: theoretical input, prediction and discriminator, five-criterion architecture classifier, mu…
AAFL (Agent-Augmented Framework for Learning) is an instructional systems design framework for the agent era, where AI agents author first drafts and humans serve as Human-in-the-Loop (HITL) judgment-holders, anchored in workplace performance as the organizing outcome. The framework keeps ADDIE's five-phase spine and adds: Eight HITL decision gates (four pedagogical, four production) where human j…
P = NP — The BiT: Bidirectional Information Topology in Quantum-Inspired Computation Abstract This work proposes a structural and architectural reinterpretation of the P versus NP problem through Bidirectional Information Topology (BiT) (p. 2). Classical computation assumes unidirectional state evoluti…
We report a structural isomorphism between a 26-year human research process (2000–2026) and the functional architecture of modern neural language models. Key mapping: (1) adversarial experience = loss function signal; (2) narrative fragments = latent embeddings; (3) research questions = inference queries; (4) pre-prints = output validation; (5) Ma/間 (meaningful interval) = inference-time compute …
Current Large Language Models treat interrogative operators — What, Why, How, Who, When, Where, Which, Whether, How much, What if, Should — as flat epistemic tokens, assigning them no differential structural weight in the inference process. This paper proposes a formal extension to the Computational Knowledge Theory (CKT) and the Prime-Base Intelligence (PBI) architecture: each interrogative oper…
This paper introduces the Semantic Qubit (S-Qubit), a quantum-analogue information unit defined within the hidden representation space of Large Language Models (LLMs). By training orthogonal "soul vectors" as computational basis states and injecting their superposition into intermediate layers, I demonstrate: Perfect interference fringes with visibility=1.000 across all semantic domains (CV=0.1%)…
Honest result-of-execution report documenting the outcome of testing the UCT compression-interface (Z_2)^3 prediction on IBM Fez (qubits 15, 25, 56) under the pre-registered Stage 1 / Stage 1b / Stage 2 lock chain. Locked operational estimator returned f̂ = 0.87500 exactly, matching the (Z_2)^3 prediction f = 7/8 to all decimal places, with bootstrap σ(f̂) = 0 (200 replicates). The singular-value…
Executive Summary This paper introduces Topological AI, a novel, highly efficient continual learning framework designed to solve catastrophic forgetting in large language models (LLMs). Rather than attempting to achieve the biologically unnatural state of perfect memory, the method balances plasticity and stability by anchoring a sparse, deterministic subset of prime-indexed embedding rows during…
EVA-HNS is a structural full-stack operating system for human-aligned AI. This working paper presents a compact research-oriented overview of the EVA-HNS architecture, integrating three layers: Human Natural Structure (HNS) as the structural coordinate kernel, SOHU (Structural OS for Human Understanding) as the operating layer for persistent human understanding, and EVA (External Verification Arch…