Parameterized safe reinforcement learning for operational flexibility quantification of active distribution networks with low-observability