Tonality recognition specifics in a speech flow

Mathematical modeling, numerical technique and program complexes


Authors

Balakirev N. E.*, Nguyen H. D.**

Moscow Aviation Institute (National Research University), 4, Volokolamskoe shosse, Moscow, A-80, GSP-3, 125993, Russia

*e-mail: balakirev1949@yandex.ru
**e-mail: nguyenhoangzuy@gmail.com

Abstract

The article discusses one possible approach to recognizing the aspects of speech related to tonality, which plays an important role in human communication. A speaker's state of tension is reflected above all not in the content of the words but in the way they are pronounced, and the pronunciation sometimes conveys a meaning different from the literal content. A solution to this problem can be applied in aviation technology, in particular for automatic recognition of the emotional state on board, for highlighting emotional speech segments in flight recorder data, and for analyzing the speech of passengers on local and international flights. Tonality is of key importance in the tonal languages of Southeast Asia, but it also matters in European languages, where it reflects the character of the spoken phrase and adds meaning to the words and sentences being recognized. In all cases tonality manifests itself in the same way, but its relation to the information content of a phoneme or a word has its own specifics. These specifics concern, first of all, the object that carries the information about the tone. In contrast to word sequence recognition, where the set of frequencies serves as a guide, tonality recognition cannot rely on the generally accepted mathematical methods of wave processing and recognition. Tonality recognition is, as a rule, not widely discussed within these methods, and the range of proposed algorithmic solutions to this problem is quite limited. The article therefore focuses, by way of examples, on the tonal component of the phoneme, which can be obtained by special methods that differ from the conventional ones.
The authors propose methods based on establishing relations between characteristic points and representing the configuration of these relations in the form of a matrix model.

In effect, such a model is a qualitative characteristic of tonality that does not depend on amplitude values, which makes it possible to compare manifestations of tonality pronounced at different loudness. The comparison itself presupposes a qualitative measure that reflects the degree of difference between the phonemes under consideration in the speech flow.
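The idea of an amplitude-independent matrix of relations can be illustrated with a minimal sketch. The abstract does not specify how the characteristic points are chosen, what relations are recorded, or how matrices are compared, so everything below is an assumption made for illustration only: characteristic points are taken to be the local extrema of a sampled contour, the relation between two points is the sign of their difference, and the measure of difference is the share of mismatching matrix entries. The function names are hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def characteristic_points(signal: Sequence[float]) -> List[float]:
    """Assumption: characteristic points are the endpoints plus local extrema."""
    points = [signal[0]]
    for prev, cur, nxt in zip(signal, signal[1:], signal[2:]):
        # A sign change of the slope marks a local maximum or minimum.
        if (cur - prev) * (nxt - cur) < 0:
            points.append(cur)
    points.append(signal[-1])
    return points


def structural_matrix(points: Sequence[float]) -> List[List[int]]:
    """Assumption: entry [i][j] is the sign of points[j] - points[i].

    Only the ordering of the points is stored, so multiplying the whole
    signal by a positive factor (louder speech) leaves the matrix unchanged.
    """
    def sign(x: float) -> int:
        return (x > 0) - (x < 0)

    return [[sign(b - a) for b in points] for a in points]


def matrix_distance(m1: List[List[int]], m2: List[List[int]]) -> float:
    """Assumption: difference = share of entries in which the matrices disagree."""
    if len(m1) != len(m2):
        raise ValueError("matrices must be built from the same number of points")
    total = len(m1) * len(m1)
    differing = sum(
        1 for row1, row2 in zip(m1, m2) for a, b in zip(row1, row2) if a != b
    )
    return differing / total


# Hypothetical contours: the same shape at two loudness levels, and its mirror.
quiet = [0, 1, 3, 2, 1, 4, 2]
loud = [3 * x for x in quiet]
mirrored = list(reversed(quiet))

m_quiet = structural_matrix(characteristic_points(quiet))
m_loud = structural_matrix(characteristic_points(loud))
m_mirrored = structural_matrix(characteristic_points(mirrored))
```

In this sketch the quiet and loud contours produce identical matrices, so their distance is zero, while the mirrored contour produces a different matrix. A real implementation would also need a rule for comparing contours with different numbers of characteristic points, which this sketch simply rejects.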

Keywords:

speech recognition, tonality, tone, stress, intonation, phoneme, structural matrix


