Hi,

 

It’s summer, and that means sitting at home behind the computer is not always the first priority. ;-)
It has been a month since the last post, so here are some updates on different projects:

 

Concerning the D-STAR voice-announcement package: Ken, KE2N, discovered an issue with the text messages sent along with the voice announcements.
If the text message, which is located in the slow-data part of the voice announcement, is repeated too quickly, certain D-STAR radios will not show the complete message on their screen. After some experiments with different settings by Ken, the voice-announcement code was changed so that the D-STAR text message is repeated every 2 seconds multiplied by the number of characters in the message.

The updated source-code is available on github: https://github.com/on1arf/voice-ann
Thanks to Ken for troubleshooting this issue.

 

Bobby, KF4GTA, has activated an internet stream of the D-STAR repeater W4VLD of the Valdosta Amateur Radio Club, using the DSTK (D-STAR Switching matrix Toolkit). The system is currently still being tested: click here to listen to it.
The current system uses the icecast streaming server and the Ogg Vorbis based streaming source client ices, but we would like to experiment with mp3 streaming too.
If somebody has any information on how to configure ezstream to stream/mp3-encode audio from standard-in, please feel free to drop us a note!

 

I have also taken the opportunity to do some cleanup of the DSTK code: “cap2rpc” and “rpc2amb” have been updated and moved to a new folder: DSTK_v2.
Also, a new application has been added: “udpbounce”. This program can be used to move DSTK streams between different servers. The goal is to allow the server doing the AMBE decoding (i.e. the server with the DVdongle) to be located in a different place than the PC used by the repeater.
The code for the DSTK can be found on github: https://github.com/on1arf/DSTK

 

The code of the GMSK modem has been changed so it can be optimised for different kinds of processors, especially concerning the “multiply-and-add” instructions used in the DSP code. By selecting an option in the Makefile, the source code can now be compiled to use 64-bit integer, 32-bit integer or floating-point math. This code is also available on github: https://github.com/on1arf/gmsk/tree/master/gmskmodem_codec2

 

 

The changes to the gmskmodem are interesting because they are actually the result of a completely different project: some experiments to port the “WSPR” application (Weak Signal Propagation Reporter) to a Raspberry Pi computer.
See the report of Bob G3WKW on this: http://homepage.ntlworld.com/re.thornton/G3WKW/files/WSPR_RasPi_Setup2.html

 

Looking into this issue, the key turned out to be the “multiply-and-add” instruction that is used very often in DSP code. Actually, the low-pass filter in the GMSK demodulation executes this instruction about 5 million times per second!

Most ARM processors do have assembler instructions to do this kind of operation in hardware. The ARM920T, as found on the friendlyarm mini2440, is able to execute a 16-bit by 32-bit multiply-and-add (giving a 64-bit result) in just 2 clock cycles. So, in theory, the 5 million operations per second should use only 10 million clock cycles, which is only 2.5 % of the 400 MHz clock rate the CPU is running at.
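
The cycle budget above is simple arithmetic, and a small helper makes it easy to redo for another CPU. This is only a sketch: the function name and its parameters are made up for illustration and are not part of the gmskmodem code.

```c
#include <stdint.h>

/* Percentage of the CPU clock used by a stream of identical operations.
 * All three parameters are illustrative inputs, not measured values. */
double cpu_load_percent(uint64_t ops_per_second,
                        uint32_t cycles_per_op,
                        uint64_t clock_hz)
{
    uint64_t cycles_per_second = ops_per_second * cycles_per_op;
    return 100.0 * (double)cycles_per_second / (double)clock_hz;
}
```

For the ARM920T figures mentioned above, the call would be cpu_load_percent(5000000, 2, 400000000).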

However, it turns out that the actual implementation in C is quite different.

  1. Simply writing the C equivalent of this instruction (“a += b1 * b2”, where a is a 64-bit “long long int” and b1 and b2 are 32-bit “int”) does not work. The product of b1 and b2 is calculated as a 32-bit value before it is added to “a”, so the result is limited to just 32 bits, even though “a” is defined as a 64-bit integer.
    The only way to get this code executed correctly is to first “cast” one of the “b” variables to a 64-bit long long int: “a += (long long int) b1 * b2”.
  2. When looking at the assembler code generated by gcc for the C code above, it was found that this requires no less than 3 multiplication instructions. So the 5 million multiplications per second become 15 million!
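
The difference between the two forms in point 1 can be sketched as below. The function names are hypothetical. Note that overflowing a signed “int” multiplication is undefined behaviour in C, so the first function uses unsigned operands to show the 32-bit truncation in a well-defined way; it illustrates the effect and is not the code from the gmskmodem.

```c
#include <stdint.h>

/* Multiply-and-add where the product is formed in 32 bits.
 * Unsigned operands keep the wrap-around well defined; with signed
 * "int" operands the overflow would be undefined behaviour. */
uint64_t mac_32bit_product(uint64_t a, uint32_t b1, uint32_t b2)
{
    uint32_t product = b1 * b2;   /* upper 32 bits of the product are lost */
    return a + product;
}

/* Multiply-and-add with one operand widened first, as in
 * "a += (long long int) b1 * b2": the product is formed in 64 bits. */
int64_t mac_64bit_product(int64_t a, int32_t b1, int32_t b2)
{
    return a + (int64_t)b1 * b2;
}
```

With b1 = b2 = 100000, the true product (10000000000) needs more than 32 bits, so only the second function returns it intact.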

Together with all the other overhead, this explains why the GMSK demodulator actually uses some 20 to 25 % of the CPU of a 400 MHz ARM9.

 

So, how to fix this? There are two options and both deal with removing the need to do a cast of one of the integers from 32 bit to 64 bits.

Option one is for systems whose processor has an FPU (Floating Point Unit). In that case, the answer is simple: do all math in floating point instead of integers. The dedicated hardware of the FPU is able to reduce the number-crunching time to half of that needed by the integer-math core.
This approach has been used on a Raspberry Pi, where the ARM1176 core actually runs the gmskmodem some 30 % faster when using floating-point math than when using 64-bit integer math.
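
As a rough sketch of what the floating-point variant looks like, here is one output sample of a FIR filter computed with float multiply-and-add. The function name, the “history” buffer layout and the coefficient array are assumptions made for this example; the real gmskmodem code may organise its buffers differently.

```c
#include <stddef.h>
#include <stdint.h>

/* One output sample of a FIR filter using floating-point
 * multiply-and-add. "history" holds the ntaps most recent
 * 16-bit audio samples, "coef" the filter coefficients. */
float fir_float(const int16_t *history, const float *coef, size_t ntaps)
{
    float acc = 0.0f;
    for (size_t i = 0; i < ntaps; i++) {
        acc += (float)history[i] * coef[i];  /* no 64-bit cast needed */
    }
    return acc;
}
```

Because the accumulator is a float, no widening cast is involved and the multiply-and-add can be handled by the FPU.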

 

The 2nd approach, which can be used on processors without an FPU, is to reduce the size of the two integer values so that their product always fits in 32 bits:
In the 64-bit math version of the gmskmodem, the multiplication is actually a 16-bit value (signed 16-bit audio) multiplied by a 32-bit value (the coefficient of the FIR filter), resulting in a 48-bit product. Considering that some hardware GMSK modems (like the DV-RPTR) only have 10-bit audio anyway, two versions of the “32-bit integer” code were created: one using 10-bit audio (and hence 22-bit FIR filter coefficients) and one using 12-bit audio (and 20-bit FIR filter coefficients).
Using the 32-bit integer version of the gmskmodem on a friendlyarm mini2440 with an ARM920T processor makes the gmskmodem two to three times faster!
However, further tests are still needed to see if reducing the resolution of the audio and the FIR-filter has any negative effect on the quality of the GMSK stream or its ability to deal with noise.
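
The 10-bit audio variant can be sketched roughly as follows. The macro and function names are invented for this example, and the code assumes that every coefficient fits in 22 signed bits and that the sum of the absolute coefficient values does not exceed 2^21, so the 32-bit accumulator cannot overflow either. The actual scaling used in the gmskmodem source may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define AUDIO_BITS 10                         /* 12 for the other variant */
#define AUDIO_DROP (1 << (16 - AUDIO_BITS))   /* 64 for 10-bit audio */

/* One FIR output sample using only 32-bit multiply-and-add.
 * Assumes 22-bit signed coefficients whose absolute values sum to at
 * most 2^21, so neither a product nor the accumulator exceeds 32 bits. */
int32_t fir_int32(const int16_t *history, const int32_t *coef, size_t ntaps)
{
    int32_t acc = 0;
    for (size_t i = 0; i < ntaps; i++) {
        int32_t sample = history[i] / AUDIO_DROP;  /* 16-bit -> 10-bit range */
        acc += sample * coef[i];                   /* product fits in 32 bits */
    }
    return acc;
}
```

Dividing by 64 throws away the 6 least significant bits of every audio sample, which is exactly the loss of resolution that still needs to be tested.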

 

73
Kristoff – ON1ARF