Professor Kazuya Takeda, Ph.D.
bio| research topics| recent publications| links

Short bio

Kazuya Takeda received his B.E.E., M.E.E., and Doctor of Engineering degrees from Nagoya University, Nagoya, Japan, in 1983, 1985, and 1994, respectively. From 1986 to 1989, he was a researcher at the Advanced Telecommunication Research laboratories (ATR), Osaka, Japan, where his main research interest was corpus-based speech synthesis. He stayed at MIT, Cambridge, USA, as a Visiting Scientist from November 1987 to April 1988. From 1989 to 1995, he was a researcher and research supervisor at KDD Research and Development Laboratories, Kamifukuoka, Japan, where he led a research project on the voice-activated telephone extension (VOATEX) system. He received the Technical Achievement Award and the Kiyoshi Awaya Award from the Acoustical Society of Japan for the development of the VOATEX system. From 1995 to 2003, he was an associate professor in the Faculty of Engineering. He also led an information transformation research team at the Center for Integrated Acoustic Information Research (CIAIR). Since 2003, he has been a professor at the Graduate School of Information Science. His current research interest is media signal processing and its applications, including spatial audio, robust speech recognition, behavior modeling, and interfaces.
He is the author of more than 80 journal papers, six books, and more than 100 conference papers.
He is a member of the Acoustical Society of Japan (ASJ), the Institute of Electronics, Information and Communication Engineers (IEICE) of Japan, the Information Processing Society of Japan (IPSJ), and the IEEE. He is an executive board member of the ASJ and an associate editor of the IEICE Transactions on Information and Systems. He was the chair of the Spoken Language Processing technical group of the IPSJ, and is a vice chair of the ITU-T FG Distraction.


Research Topics

Audio Signal Processing

Spatial impression is one of the most important types of information in audio signals, and not only transmitting but also generating 3D sound fields has been a fundamental problem in audio signal processing. The goal of our group's Selective Listening Point (SLP) audio system project is to realize an audio system that can generate the sound field at a given location using sounds captured through distant microphones. The current approach, i.e., combining blind separation of acoustic signals under real conditions with control of Head-Related Transfer Functions (HRTFs), can generate natural directional sounds in anechoic space, whereas perceptible degradation is found in highly reverberant rooms.
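As a minimal illustration of the HRTF-control step only (not the SLP system itself), the sketch below renders a separated mono source binaurally by convolving it with a left/right pair of head-related impulse responses. The impulse responses here are synthetic placeholders standing in for interaural time and level differences; a real system would use measured HRIRs for the target direction.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a separated mono source with left/right impulse responses."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

# Placeholder HRIRs (assumed, not measured): the near ear receives the
# direct signal, the far ear a delayed and attenuated copy.
fs = 16000
hrir_l = np.zeros(32)
hrir_l[0] = 1.0            # near ear: direct path
hrir_r = np.zeros(32)
hrir_r[10] = 0.6           # far ear: interaural delay and attenuation

source = np.random.default_rng(0).standard_normal(fs)  # 1 s of a separated source
left, right = render_binaural(source, hrir_l, hrir_r)
```

In a full system, the HRIR pair would be selected (or interpolated) according to the direction of each separated source relative to the chosen listening point.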

Speech Signal Processing

As speech is the most natural way for humans to communicate, spoken language interfaces are considered to be the best modality for a wide range of information systems. Particularly in a vehicular environment, where hands and/or eyes are not free to operate information devices, the use of speech recognition technology is appropriate. Since current speech recognition technology relies heavily on statistical approaches, such as hidden Markov models of speech and N-gram models of word sequences, a mismatch between training and testing conditions causes serious degradation of system performance. Therefore, noise reduction technology is a very important issue for speech recognition to succeed in real environments. Our group is studying speech enhancement technologies based on the multiple regression of spatially distributed microphones for in-car applications. We have found that the method is very effective in low-SNR (e.g., -10 dB) and highly non-stationary environments. By extending the statistical modeling of the noise contamination process, we also found an effective noise reduction method for moderate SNR conditions. The effectiveness of these methods has been confirmed through speech recognition experiments using a standard corpus. In addition to the signal processing aspect of speech processing, we are investigating language modeling, field tests of spoken dialogue systems, and discrimination between speech and singing.
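The multiple-regression idea above can be sketched as follows: learn a linear map from the log power spectra of several distant microphones to those of a close-talking reference, then apply it to unseen frames. This is a toy sketch on synthetic data, not the actual in-car method; the dimensions, noise level, and parallel training data are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mics, n_bins, n_frames = 4, 64, 500

# Synthetic "parallel corpus": close-talking reference log spectra, and
# the same frames observed at several distant microphones with noise.
clean = rng.random((n_frames, n_bins))
mics = np.stack(
    [clean + 0.3 * rng.standard_normal((n_frames, n_bins)) for _ in range(n_mics)],
    axis=1,
)

# Stack all microphone features per frame and fit a linear regression
# (with a bias term) to the reference by least squares.
X = mics.reshape(n_frames, n_mics * n_bins)
X = np.hstack([X, np.ones((n_frames, 1))])
W, *_ = np.linalg.lstsq(X, clean, rcond=None)

estimate = X @ W
mse_before = np.mean((mics[:, 0] - clean) ** 2)  # single distant microphone
mse_after = np.mean((estimate - clean) ** 2)     # regression estimate
```

Because the regression pools information across microphones, the estimate is closer to the reference than any single distant channel, which is the effect exploited for in-car enhancement.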

Human Behavior Signal Processing

Recent progress in sensor and communication technologies is making long-term human sensing with wearable devices possible. The target area of signal processing now covers a broad range of human-observation signals, highlighting the growing importance of human-behavior signal processing (HBSP).
As a pioneer group in the field of HBSP, we have been working on modeling driving behavior. We found that the statistical phase space of driving, i.e., the joint probability distribution of the headway distance to the preceding vehicle and the vehicle's speed, generally represents driving characteristics, and that a driver's individuality can be extracted through a Gaussian Mixture Model (GMM) of this phase space. We also found that cepstrum analysis is an effective method for deconvolving driving actions into human dynamics and a command sequence. By showing that approximately 80% of drivers can be correctly identified using cepstral features with dynamic parameters, we confirmed the effectiveness of signal modeling of human behavior. Currently, we are applying HBSP to the generation of driving behavior.
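A toy version of the phase-space modeling can be sketched as below: each driver is modeled by a Gaussian fit to their own (headway distance, speed) data, the single-component special case of the GMM used in the actual work, and a test segment is assigned to the driver whose model gives the highest log-likelihood. All data here are synthetic; the means and covariances are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_gaussian(x):
    """Fit a full-covariance Gaussian to one driver's phase-space samples."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

def log_likelihood(x, mu, cov):
    """Total Gaussian log-likelihood of a segment of samples."""
    d = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.einsum('ij,jk,ik->i', d, inv, d)
    return np.sum(-0.5 * (quad + logdet + 2 * np.log(2 * np.pi)))

# Synthetic "drivers" with different typical headway (m) and speed (km/h).
drivers = {
    'A': rng.multivariate_normal([20, 50], [[9, 2], [2, 25]], 300),
    'B': rng.multivariate_normal([40, 70], [[16, -3], [-3, 36]], 300),
}
models = {name: fit_gaussian(x) for name, x in drivers.items()}

# Identify an unlabeled segment (drawn here from driver B's distribution)
# by maximum likelihood over the per-driver models.
test_segment = rng.multivariate_normal([40, 70], [[16, -3], [-3, 36]], 50)
scores = {name: log_likelihood(test_segment, mu, cov)
          for name, (mu, cov) in models.items()}
predicted = max(scores, key=scores.get)
```

Replacing the single Gaussian with a mixture of several components captures multi-modal driving styles (e.g., different regimes for city and highway driving), which is where the GMM formulation earns its keep.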


Recent Publications

(Google Scholar Home)

(See Full List)


Links

