Arab music improvisation corpus for research (AMICOR): development and machine translation experiments

Under-resourced languages (and musics) pose a challenge to machine translation (MT). The challenge grows when the collected dataset is a varied sample drawn from a population that is itself even more diverse and dynamic. Arab music vocal improvisation (mawwal) presents exactly this challenge. Here, we present the development of AMICOR, a parallel dataset consisting of vocal improvisatory phrases and their corresponding instrumental responses (or tarjamat in Arabic, which literally mea