Introduction: National language assessments require transparent psychometric documentation to support valid score interpretation.
Methods: Rasch measurement was applied to operational KAZTEST data from 1,199 (listening) and 1,197 (reading) test-takers using Winsteps 5.8.3.0.
Results: Both components demonstrated high item reliability (0.99), satisfactory internal consistency (KR-20: 0.73 listening; 0.84 reading), and adequate model fit. Person reliability was stronger for reading (0.76) than listen