I have been working with microprocessors for more than twenty years, and I wondered whether voice recognition could be implemented on one. The R5F562N8 is a very powerful MCU, and the board Renesas provided to us has a microphone on it. I started with a voice button and spent a lot of time trying, testing and debugging. In the end I could recognize my own voice with better than 90% accuracy, but the result changed whenever the background sound changed. In some cases the noise was even larger than the signal itself, so it was very hard to get it working stably. An industrial control device must be reliable. For that reason I gave up on the voice button and started on something else: a sound button.
Here, a sound button means using a specific sound as the input source for a button. The sound must hold a fixed frequency for a short period of time. Most musical instruments produce this kind of sound; guitar, piano and flute are some examples.
Sound Button implementation
In this section, I describe the signal flow chart, the theory of the implementation, and the software.
Signal flow chart
Sound from around the board is picked up by the microphone (U16) on the board and amplified by U15, which is enabled or disabled by AMP_SHDN (PORT5.5, PIN39 on the CPU). The output of U15 feeds AN5 (PIN89 on the CPU). The CPU samples the signal at 8000 Hz. Refer to Figure 1.
Figure 1: Signal flow chart
The implementation theory
The Discrete Fourier Transform (DFT) is the main tool I used to recognize the sound.
Examples are always easier to understand than the theory itself, so I will explain it by examples.
Suppose the source signal, 500 samples long, comes from this sine wave:
XX(j) = 200 * Sin(2 * PI * k * j / (sample_number - 1)) (Equation 1)
The imaginary part in the frequency domain for this signal is accumulated over all samples j:
IMX(k) = IMX(k) - XX(j) * Sin(2 * PI * k * j / (sample_number - 1)) (Equation 2)
The real part in the frequency domain for this signal is accumulated the same way:
REX(k) = REX(k) + 0.005 * XX(j) * Cos(2 * PI * k * j / (sample_number - 1)) (Equation 3)
To calculate the energy of the signal at each frequency k, we use Equation 4.
MAG(k) = 0.000001755 * (REX(k) ^ 2 + IMX(k) ^ 2) (Equation 4)
In Equations 1, 2, 3 and 4, 0 ≤ j ≤ sample_number - 1, 0 ≤ k ≤ sample_number / 2, sample_number = 500 and PI = 3.1415926. In Equation 1, k selects the frequency of the test signal.
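The accumulation in Equations 2 to 4 can be sketched in a few lines of Python. This is only an illustration, not the firmware: the function name `dft_energy` is hypothetical, and the scaling constants (0.005 and 0.000001755) are left out on the assumption that only the position of the peak matters here, so the absolute values will not match the MAG figures quoted in this article.

```python
import math

def dft_energy(xx):
    """Return (rex, imx, mag) for k = 0 .. len(xx)//2.

    Follows the shape of Equations 2-4, but without the scaling
    constants (an assumption of this sketch).
    """
    n = len(xx)
    half = n // 2
    rex = [0.0] * (half + 1)
    imx = [0.0] * (half + 1)
    for k in range(half + 1):
        for j in range(n):
            angle = 2 * math.pi * k * j / (n - 1)
            rex[k] += xx[j] * math.cos(angle)   # real part, Equation 3
            imx[k] -= xx[j] * math.sin(angle)   # imaginary part, Equation 2
    mag = [r * r + i * i for r, i in zip(rex, imx)]  # energy, Equation 4
    return rex, imx, mag

# Test signal from Equation 1 with a signal frequency of 10.
sample_number = 500
k0 = 10
xx = [200 * math.sin(2 * math.pi * k0 * j / (sample_number - 1))
      for j in range(sample_number)]

rex, imx, mag = dft_energy(xx)
peak_k = max(range(len(mag)), key=lambda k: mag[k])
print(peak_k)  # prints 10
```

The direct form costs about sample_number * sample_number / 2 multiply-accumulate steps, which is fine on a PC but slow on an MCU.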
For k=2, XX(j) and MAG(k) look like this:
The maximum of MAG(k)=1.12385083518973 when k=2
For k=10, XX(j) and MAG(k) look like this:
The maximum of MAG(k)=28.0250339687725 when k=10
For k=20, XX(j) and MAG(k) look like this:
The maximum of MAG(k)=111.210331108156 when k=20
For a pure sine wave, it is very easy to recognize the signal's frequency: just find where the maximum value of MAG(k) occurs.
For a signal with more than one frequency, like this:
XX(j) = 200 * Sin(2 * PI * 17 * j / (sample_number - 1)) + 200 * Sin(2 * PI * 40 * j / (sample_number - 1))
where 0 ≤ j ≤ sample_number - 1, 0 ≤ k ≤ sample_number / 2, sample_number = 500
It has two frequencies: k=17 and k=40.
XX(j) and MAG(k) look like this:
The maximum of MAG(k)=107.286101237055 when k=40
The second maximum of MAG(k)=20.2292238985797 when k=17
We can still recognize the signal's frequencies: just find where the largest values of MAG(k) occur.
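As a rough illustration of picking out both frequencies, the sketch below builds the two-tone signal and lists the two strongest local peaks of the energy spectrum. The helper names `energy_spectrum` and `top_peaks` are hypothetical, and the scaling constants are omitted again, so only the peak positions (not their heights) are comparable with the values quoted here.

```python
import math

def energy_spectrum(xx):
    """Unscaled energy per bin k = 0 .. len(xx)//2 (sketch, no scaling)."""
    n = len(xx)
    mag = []
    for k in range(n // 2 + 1):
        rex = sum(xx[j] * math.cos(2 * math.pi * k * j / (n - 1)) for j in range(n))
        imx = -sum(xx[j] * math.sin(2 * math.pi * k * j / (n - 1)) for j in range(n))
        mag.append(rex * rex + imx * imx)
    return mag

def top_peaks(mag, count=2):
    """Return the bins of the `count` strongest local maxima, sorted by bin."""
    candidates = [k for k in range(1, len(mag) - 1)
                  if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    strongest = sorted(candidates, key=lambda k: mag[k], reverse=True)[:count]
    return sorted(strongest)

sample_number = 500
xx = [200 * math.sin(2 * math.pi * 17 * j / (sample_number - 1))
      + 200 * math.sin(2 * math.pi * 40 * j / (sample_number - 1))
      for j in range(sample_number)]

peaks = top_peaks(energy_spectrum(xx))
print(peaks)  # prints [17, 40]
```

Restricting the search to local maxima keeps a strong bin and its immediate neighbour from being reported as two separate frequencies, which matters more for a real microphone signal than for this clean test signal.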
Now we come to the real signal that the CPU samples from AN5, the output of U15.9 (PIN89 on CPU).
As mentioned earlier, a guitar, a piano or a flute can be used to produce the sounds for SW1, SW2 and SW3 on the board. I take the flute as an example. You can buy this flute from Dollar mart.
Any three notes on the flute can be used for the three switches. When I played note 1, note 2 and note 3, I got XX(j) and MAG(k) like these:
Note1: MAG(k) is at its maximum when k=62
Note2: MAG(k) is at its maximum when k=66
Note3: MAG(k) is at its maximum when k=78
By checking k, we can recognize which note was played.
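The check on k might look like the sketch below. It assumes, purely for illustration, that note 1, note 2 and note 3 are assigned to SW1, SW2 and SW3 in that order, and that a tolerance of one bin either side is acceptable; neither the `classify` function nor the tolerance value comes from the firmware.

```python
# Peak bins measured for the three flute notes (from the text above).
# The note-to-switch assignment is an assumption of this sketch.
NOTE_BINS = {"SW1": 62, "SW2": 66, "SW3": 78}

def classify(peak_k, note_bins=NOTE_BINS, tolerance=1):
    """Return the switch whose stored bin is within `tolerance` of peak_k,
    or None if the peak does not match any stored note."""
    for switch, k in note_bins.items():
        if abs(peak_k - k) <= tolerance:
            return switch
    return None

print(classify(62))  # SW1
print(classify(67))  # SW2 (one bin off, still inside the tolerance)
print(classify(70))  # None
```

Notes 1 and 2 are only four bins apart, so the tolerance has to stay below two bins to keep the two windows from overlapping.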
You can redefine SW1, SW2 and SW3 to correspond to different notes on a musical instrument. You can also use a guitar to produce the sound for these three switches.
For example, if you want to redefine SW1, you only need to enter the sound study menu, play a sound on a musical instrument and save it as SW1.
To speed up the calculation, I used an FFT in the firmware to calculate Equation 4.
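One way to picture the study step is as an update to a small table of peak bins. The sketch below is a guess at the idea rather than the actual menu code: `study_sound`, `peak_bin` and the `min_energy` threshold are hypothetical names, and the spectrum is a made-up list standing in for a real MAG(k) result.

```python
def peak_bin(mag, min_energy=0.0):
    """Return the bin with the largest energy, or None if it is too weak."""
    k = max(range(len(mag)), key=lambda i: mag[i])
    return k if mag[k] > min_energy else None

def study_sound(table, switch, mag, min_energy=0.0):
    """Store the peak bin of `mag` as the reference for `switch`.

    Returns True if a usable peak was found and saved.
    """
    k = peak_bin(mag, min_energy)
    if k is None:
        return False
    table[switch] = k
    return True

# Stored references before studying (values taken from the flute example).
table = {"SW1": 62, "SW2": 66, "SW3": 78}

# Made-up spectrum with a single strong peak at bin 70.
fake_mag = [0.0] * 251
fake_mag[70] = 5.0

saved = study_sound(table, "SW1", fake_mag, min_energy=1.0)
print(saved, table["SW1"])  # True 70
```

The energy threshold is there so that studying in a silent room does not overwrite a good reference with noise.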
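To show why an FFT gives the same information faster, here is a minimal recursive radix-2 FFT in Python. It is not the firmware routine: it assumes a power-of-two buffer (512 samples instead of 500) and the standard DFT angle 2 * PI * k * j / N, and the names `fft` and `energy` are chosen for this sketch only.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

n = 512
xx = [200 * math.sin(2 * math.pi * 20 * j / n) for j in range(n)]

spectrum = fft(xx)
# Real part plays the role of REX(k), imaginary part the role of IMX(k).
energy = [abs(c) ** 2 for c in spectrum[: n // 2 + 1]]
peak_k = max(range(len(energy)), key=lambda k: energy[k])
print(peak_k)  # prints 20
```

The FFT needs on the order of N * log2(N) operations instead of N * N, which is the difference that matters when the spectrum has to be recomputed continuously on the MCU.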
For only $10, you can get the sound recognition source code and professional support. Contact: Frank_Tu@yahoo.ca, 1-647-557-6677