It may sound hard to believe but, these days, writing a GMSK receiver is in principle not that hard. All you need to do are these four steps:

  • Capture audio from the radio via the audio-device of the computer
  • GMSK demodulate the audio and turn it into a train of bits
  • Extract the raw D-STAR stream from that unordered stream of bits
  • Process some parts of this D-STAR stream for error-correction, descrambling, deinterleaving

The end result: a .dvtool file containing a D-STAR stream.

Easy no? :-)

The reason I wrote "these days" in the sentence above has to do with an evolution in the radio-amateur world that has changed it quite dramatically over the last couple of years: open source software. Up to -say- one or two years ago, if you were interested in writing this kind of software (for an end-user application or just to learn more about it), you had to write everything yourself. You had to understand all the different aspects, including some quite complicated things like low-level DSP code for GMSK modulation and demodulation, and forward error correction algorithms. These days, that is not the case anymore.

The importance of open source software

There are currently at least two different open-source software projects that implement at least part of a GMSK modem: the PC-based pcrepeatercontroller project of Jonathan, G4KLX, and the DVPRTR project around Jan, DO1FJN, Torsten, DG1HT and DH2YBE, which targets a "controller-board" based solution. The interesting part about open source software is its license: the so-called "GPL v2" license. That software license is specifically designed not only to allow people to reuse programming code written by somebody else in their own projects, but also to learn from it and even to modify it for one's own use.

What does this mean? Well, it means that you do not have to know everything about everything anymore. You can just use existing code, as written by somebody else, and start from there.

Open source is much more than "free" software. By using open source, software and hardware designs become a way to learn. You can take a project that already works, look at how it works, learn from it, think about it, modify it to try out some ideas and even merge your code back into the original project.

Of course, it requires a bit of programming skill, but more than half of the work has already been done. This code provides a very solid base to get started, to experiment with and to work out your own ideas. That is what this project is about: allowing people to learn what a GMSK modem actually does and allowing them to experiment with it.

The Basics of GMSK and D-STAR streams

Now, before starting to program anything, the first step is to understand it. For D-STAR, this turns out to be one of the more difficult hurdles to take. Information about D-STAR is scattered over multiple documents, from different sources, written by different people. It is very easy for somebody new to D-STAR to get overwhelmed by all these texts, especially as most of the documents about D-STAR on the internet are not even relevant to building a GMSK receiver.

In this mountain of information, there is one single "D-STAR specification" (the "shogen" document, a mere 11 pages) that describes the radio-level interface of D-STAR. The problem is that it is written for people already experienced with this kind of technology. For somebody who is new to the matter, it looks -on the one hand- quite overwhelming and -on the other hand- not all that useful. We will try to provide this information in a more structured way.

Now, there are multiple ways one can describe a D-STAR radio stream: as pure dry specifications, as maths, or as what it looks like as a radio-frequency signal. The approach taken in this article is to ask a simple question: "what does a D-STAR stream look like for a computer program that needs to decode it?". The goal of this exercise is to give some insight into what a GMSK audio stream actually is, how it is structured and -based on that- how one goes about writing software to decode it. It does this by reducing certain "magical" parts of the programming code (especially the DSP code that does GMSK demodulation, error correction and scrambling) to a "magical box". We will gracefully skip over this code in this article.

The programming code for this GMSK receiver application can be found here.

In the end, believe it or not, it actually turns out that understanding a D-STAR stream isn't so difficult at all.

10 basic facts

If you search the internet for specifications about D-STAR, you will come across some words and terms that -for people who know digital communication- are common knowledge. These words provide certain basic information that one should be aware of. So, if you bear with me, here are the "10 basic facts about GMSK and D-STAR you should know and understand":

  1. Digital: D-STAR is a digital communication system. This means that speech is not transmitted as a varying analog signal (like FM, AM or SSB) but converted to and transmitted as a series of 0s and 1s ("bits").
  2. GMSK: Although we want to transmit speech as a series of digital 0s and 1s, we still need to do this using analog radio equipment. For that reason, the bits are converted into "tones" so they can be transmitted over an analog radio. This is the reason why, if you tune an analog FM radio to the frequency of a D-STAR repeater, a D-STAR signal will sound like "buzzing noises". There exist a number of such "modulation" systems: FSK, PSK, ASK. The system used by D-STAR is GMSK, a system related to PSK.
  3. FM modulated: as all HAMs should know, transporting an analog signal with a radio can be done using a number of different modulation technologies: AM, SSB or FM. Just like certain other digital technologies (notably packet radio and APRS), GMSK uses FM modulation. This explains why it is possible to use a normal FM transceiver to send and receive GMSK signals.
  4. 9k6 data port: One of the things where a GMSK modem differs from normal FM voice communication (and from packet radio and APRS) is that it uses the "9k6" data port of the radio, instead of the normal "microphone/speaker" ports. That port differs from the other ports on an FM transceiver in that it bypasses the "emphasis/de-emphasis" circuits of the radio. These circuits modify the audio in a way that would change the GMSK "tones" and therefore also change the bits they carry. Hence, to avoid this, a GMSK modem is connected to the 9k6 port of an FM transceiver.
  5. 4800: that is the number of bits per second sent by a D-STAR radio. These bits can be divided into 3 groups: 2400 bits/second of "voice" information, 1200 bits/second of "forward error correction" bits, used to protect the voice information from transmission errors and correct them if possible, and 1200 bits/second of "slow data" (these numbers are summarised in the small sketch after this list).
  6. structured: although when listening to a GMSK stream one wouldn't say it, the bits in a GMSK stream do have a fixed, predefined structure. In short, it consists of a header frame of 328 bits that contains information such as the callsign of the sender, and any number of data frames of 12 octets each that contain digital voice and slow data.
  7. Synchronisation: in addition to the structure mentioned above, a D-STAR GMSK stream also contains some special bit sequences that are needed for synchronisation. These are needed so that the receiver listens for bits at exactly the time the transmitter sent them. Synchronisation bits are used to mark the beginning of the stream (hence located in front of the D-STAR configuration frame), to mark that a stream has ended, and also every 21 frames (i.e. every 420 milliseconds) to keep the receiver "in tempo" with the sender.
  8. 20ms: A voice communication -which can last several minutes- cannot be processed in one go by a digital voice system. It is cut into pieces of 20 milliseconds each. This allows the voice to be processed by the "vocoder" (see the next point) and converted into 9 octets (6 octets of voice data + 3 octets of error correction). These 9 octets, plus 3 octets of "slow data", make up the 12 octets that go into every data frame (see the previous point).
  9. AMBE: 20 ms of unprocessed voice information normally takes up 160 octets. To fit it into the 6 octets of voice data that D-STAR can carry, it needs to be converted by a so-called "codec" (coder/decoder) or "vocoder" (short for "voice coder"). The vocoder used by D-STAR is called "AMBE".
  10. DVdongle: The GMSK receiver software does not deal with the AMBE vocoder. It just dumps the received stream into a local file and does not process the voice stream in any way. Software that DOES want to encode or decode audio uses an external device for this (called the "DVdongle"), as the software that does the conversion from or to the AMBE format is not public and can only be bought in the form of an IC.
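
To keep these numbers in one place, here is a small summary in C. This is purely illustrative: the constant names are my own and are not taken from the gmsk receiver source code.

/* Illustrative constants only; names are not from the gmsk receiver source */
#define BITRATE            4800                          /* bits per second on the air      */
#define FRAME_MS             20                          /* one voice frame every 20 ms     */
#define BITS_PER_FRAME     (BITRATE * FRAME_MS / 1000)   /* = 96 bits (12 octets)           */
#define VOICE_BITS           72                          /* 9 octets of AMBE voice + FEC    */
#define SLOWDATA_BITS        24                          /* 3 octets of slow data           */
#define HEADER_BITS         328                          /* header before FEC/interleaving  */
#define HEADER_BITS_ONAIR   660                          /* header as actually transmitted  */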

Hardware setup

Software is all nice but, in the end, hardware is always needed to get the received audio into the computer running the GMSK modem. In this example, the setup is very limited:


These are the hardware components that are used:

  • An FM transceiver with a 9k6 data port (in this case, a Yaesu FT-857D)
  • Any kind of computing device, either a "PC" class computer or a development board (in this photo, I used the pandaboard, but this can also run on something less powerful, like the mini2440 also visible in the photo)
  • Any audio input device, either the native audio-in port of the computer / development board, or an external USB fob. In this case, this device was used as these are known to work well. They have been tested quite thoroughly by the people of the freestar D-STAR network in Canada.
  • A cable connecting the audio input device to the radio. In some cases, a capacitor is needed between the transceiver and the dongle to block the DC component. However, on the Yaesu FT-857D, there already is a capacitor on the 9k6 port of the radio, so there is no capacitor in the connection cable.

Note that in this setup, the connection between the radio and the USB audio fob is just two wires that are soldered together and hang freely in the air. This would -of course- not be something you would do in an operational environment, as it would pick up all kinds of RF interference from all kinds of sources (like the two development boards and a laptop located close by).

From a software-development point of view, as the programs have to be able to deal with interference and bit errors, this "bad" interface actually helps to make the code more robust. A more complete interface circuit can be found here, as provided by Ramesh, VA3UV.

Step 1: capturing and analysing an audio-stream

As mentioned above, this article looks at a GMSK D-STAR stream as seen by a computer application that needs to decode it.

Sample 1:

So, the first step is ... to do the same thing and listen to what the GMSK decoder receives from the radio. Click below to listen to audio sample 1 (6 seconds):

So what do we hear? ... Nothing? ... Not really. We DO hear something: noise!

This is a first interesting fact. When receiving audio from the 9k6 audio receive port of a radio, there is no "squelch" function. The reason for that is very simple: the 9k6 port of a transceiver sits directly at the FM discriminator, BEFORE the rest of the audio reception circuit, while the audio "squelch" function is located completely at the end of that circuit.

Sample 2:

Next, audio sample 2 (16 seconds). Click below:

As one can hear, this audio sample contains two GMSK bursts, surrounded by noise. As the saying goes, "a picture says more than a thousand words", so here is the actual visual dump of this audio file, as generated by the audio editor Audacity.


So, again, we see the same things:

  • Two "bursts" of GMSK signal
  • Surrounded by noise when nothing is being received.

Another interesting fact is that the audio level of the received noise, when no FM signal is being received, is actually higher than when there is a valid signal entering the FM receiver circuit.

A close-up of the beginning of the stream

Let's make a close-up of one of the GMSK bursts in the stream. As shown on the timescale, this is the beginning of the 2nd GMSK burst. See the image below:

Visually, it is very easy to distinguish three different parts.

  1. Up to marker 7.985, no signal is being received, so the radio is picking up noise.
  2. Between 7.990 and 8.105, we see a very regular pattern.
  3. From 8.195 onwards, this pattern becomes much more varied.

The explanation for this can be found in the 6th and 7th "basic facts" mentioned above. A D-STAR stream is not just a random stream of bits; it has a fixed structure. In particular, in front of the D-STAR stream itself, there is a fixed "synchronisation pattern" which marks the start of the stream. This pattern is not only fixed, it is also very regular.

Another nice element in this audio sample is the small "hiccup" just at the beginning of the synchronisation pattern (around timestamp 7.990). The reason for it is currently not completely clear; discussions in our local radio club centre around either the behaviour of the VFO PLL circuit of the FM receiver or the effect of capacitors in the audio ports of the receiver and the USB audio fob.

Demodulated signal of the beginning of a stream

Now, as a good journalist, one needs at least two independent sources to trust a story. So what does it actually look like when we GMSK demodulate this audio stream? As the "gmsk receiver" has an option "-dd" (dump more), it is possible to see what the program actually sees coming in.

The process that creates this bit train goes like this:

  1. The application captures an audio sample every 1/48000th of a second.
  2. The value of that sample is sent to the "gmsk demodulate function" (one of the "magical" pieces of code), which turns this into a "0" or a "1".
  3. As we are sampling 48,000 samples per second but the GMSK signal only runs at 4,800 bits per second (one tenth of the sampling rate), some additional code is needed to take care of this. Again, we will consider this code a "magic box" that we accept "as is" (see the sketch below).
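
As a side note, this capture-and-decimate process can be sketched in a few lines of C. This is only a sketch of the idea: the helper functions read_audio_sample(), gmsk_demodulate() and process_bit() are hypothetical names, and the real "magic box" does considerably more work (filtering, clock recovery, ...).

#include <stdint.h>

#define SAMPLE_RATE 48000                        /* audio samples per second          */
#define BIT_RATE     4800                        /* GMSK bits per second              */
#define SAMPLES_PER_BIT (SAMPLE_RATE/BIT_RATE)   /* = 10 samples for every bit        */

/* Hypothetical helpers standing in for the "magic boxes" of the real receiver: */
extern int16_t read_audio_sample(void);          /* one sample from the audio device  */
extern int  gmsk_demodulate(int16_t sample);     /* -1 = no bit yet, otherwise 0 or 1 */
extern void process_bit(int bit);                /* the bit-level state machine       */

void receive_loop(void)
{
    for (;;) {
        int16_t sample = read_audio_sample();    /* 48,000 samples per second in...       */
        int bit = gmsk_demodulate(sample);       /* ...but only 4,800 bits per second out */
        if (bit >= 0)
            process_bit(bit);
    }
}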

So, back to the audio sample. Let's GMSK demodulate it and see what we get. First up, demodulation of the noise. This is the result:

00111000 10011100 00000010 00001000 00110111 11011111
00000000 10000001 00000000 00101101 10011111 10111111
11000001 11010010 01111010 01100001 00101000 00110100
00000000 01110100 10011010 00110000 00011000 10011011
00011101 10000111 10000111 11100111 10100011 11110101
11100000 11110000 11100110 11000001 01111101 10100001
(...)

 

(Note that the grouping of the bits into groups of 8 is just to make things a bit more readable; it has no special meaning.)

Conclusion: as noise is in fact nothing else than a bunch of random voltage levels, turning it into a bitstream of 0s and 1s just produces a random string of 0s and 1s. OK, so far, nothing world-shocking.

But then, a bit further, things do get more interesting.

(...)
10000110 00010000 00000001 00010111 11010110 10010100
00111111 10100111 00100011 00111101 00111100 10111111
11101000 10000001 11111101 11101010 11101101 10111111
00010000 11101011 11110000 00000000 11111110 10000000
00001000 11000001 01111100 11111000 10111101 11111111
11111100 00011111 11111111 11111111 11111111 11111111
11111111 11111111 11111111 11111111 11111111 11111111
11111111 11011101 01111101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 01010101
01010101 01010101 01010101 01010101 01010101 010
... 11101 10010100 00

 

As we can see, at a certain moment, a fixed pattern starts to emerge: first all "1"s and then repetitions of "1010". Let's look in the specification to see what this means. See page 3 of the "shogen" document:

2.1 Wireless Communication Packet
2.1.1 Frame structure of a packet: The explanation of the data frame structure the Radio Header follows.
(1) Bit Syn. (Bit synchronization): Repeated standard 64-bit synchronization pattern (for GMSK 1010). (...)

 

So, what we are seeing here is the beginning of a D-STAR stream: at first, the receiver and the GMSK demodulator needed some time to synchronise themselves, but after about 20 ms, the fixed "1010" pattern appears.

This is not only exactly what the specification says it should be, it also corresponds nicely with the audio sample. As every line in the bit dump above corresponds to 20 ms, the whole process of synchronisation takes about 120 ms (6 lines), which is also visible in the dump of the audio file. Additional proof found!

So, what’s next? Well, let’s again have a look at the specification:

2.1 Wireless Communication Packet
2.1.1 Frame structure of a packet: The explanation of the data frame structure the Radio Header follows.
(...)
(2) Frame Syn. (Frame synchronization) : 15bit pattern (111011001010000).
(...)

 

Looking again at the bit dump, this 15-bit "frame sync" pattern is exactly what we see at the end!!!

This "frame sync" bit pattern is nothing else than a marker in the frame structure saying "here ends the bit synchronisation and here starts the next part: the D-STAR header frame".
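
Detecting such a fixed bit pattern in software is typically done by shifting every received bit into a register and comparing it against the expected value. The sketch below shows the idea for the 15-bit frame-sync pattern quoted above; it is not the actual code of the gmsk receiver, which among other things also has to tolerate bit errors.

#include <stdint.h>

/* 15-bit frame-sync pattern from the specification: 111011001010000 */
#define FRAMESYNC_PATTERN 0x7650U   /* 0b111011001010000             */
#define FRAMESYNC_MASK    0x7FFFU   /* keep only the last 15 bits    */

static uint32_t shiftreg;           /* most recent bits, newest bit in bit 0 */

/* Feed every demodulated bit into this function; it returns 1 the
   moment the last 15 received bits equal the frame-sync pattern.   */
int found_framesync(int bit)
{
    shiftreg = ((shiftreg << 1) | (bit & 1)) & FRAMESYNC_MASK;
    return shiftreg == FRAMESYNC_PATTERN;
}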

The D-STAR header

As the specification says, the part following the frame sync pattern is the D-STAR header frame. Let's have a look at the output of the GMSK demodulator. It returns this information:

HEADER DUMP:
FLAG1: 00 - FLAG2: 00 - FLAG3: 00
RPT 2: DIRECT
RPT 1: DIRECT
YOUR : CQCQCQ
MY : ON1ARF /KRIS
Check sum = E441 (OK)

 

As one can see, this is just the same information that one has to fill in on a D-STAR transceiver, plus 3 "flag" octets and a 2-octet checksum.

Based on this, one would expect the header to have a length of 328 bits (41 octets * 8); however, that is not the case. The header is no less than 660 bits. To understand the reason for this, we will have a look at the (simplified) source code of the gmsk decoder.

In the programming code processing the 660 bits received after the frame-sync sequence, we find this:

scramble(radioheaderbuffer_in,radioheaderbuffer_temp1);           /* undo the pseudo-random scrambling          */
deinterleave(radioheaderbuffer_temp1,radioheaderbuffer_temp2);    /* put the bits back in their original order  */
length=FECdecoder(radioheaderbuffer_temp2,radioheaderbuffer_out); /* 660 bits in, 330 bits out                  */

 

The received header bits are processed by 3 different functions: scrambling, deinterleaving and FEC decoding. Explaining the reason for this is, however, easier if we look at what happens on the opposite side, at the transmitter. Over there, we see the same processes, but in reverse order: first FEC encoding, then interleaving and finally scrambling.

As explained below, these three processes are there to make the transmission of the header more robust, i.e. to make it better able to deal with transmission errors.

Step 0: adding two bits

The very first step in this process is a bit strange. Before processing, two bits (containing 0) are added at the end of the 328 bits of the header. These bits do not actually have any particular use (hence "step 0"), but they do increase the header size from 328 to 330 bits.

 

Step 1: FEC

The first real step in processing the header on the sender side is "FEC encoding". FEC means "Forward Error Correction" and it does just what its name implies. It makes a stream better able to deal with errors, even BEFORE transmitting it.

The idea is very simple. By adding extra bits to the stream being sent, when bits get corrupted during the transmission, the receiver (actually the FEC decoding algorithm in the receiver software) can use these additional "error correction" bits to correct the faulty ones. How FEC really works internally is something we will gracefully pass over in this article.

What is interesting to know is that the particular system used in the D-STAR header creates one additional bit for every data bit, hence doubling the size of the header from 330 to 660 bits.
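
For the curious, a rate-1/2 convolutional encoder (the type of FEC that produces one extra bit per data bit) can be sketched as follows. Take this as an assumption-laden illustration: the constraint length of 3 and the generator polynomials 111 and 101 are the values commonly quoted for the D-STAR header, but check the specification or the gmsk receiver source for the exact parameters; the matching decoder (typically a Viterbi decoder) is far more involved.

#include <stdint.h>

/* Sketch of a rate-1/2 convolutional encoder: every input bit produces
   two output bits, so 330 bits in gives 660 bits out. The generator
   polynomials (111 and 101 binary) are an assumption; verify them
   against the specification.                                          */
void fec_encode(const uint8_t *in, uint8_t *out, int nbits)
{
    uint8_t d1 = 0, d2 = 0;          /* two delay elements       */
    int o = 0;

    for (int i = 0; i < nbits; i++) {
        uint8_t b = in[i] & 1;
        out[o++] = b ^ d1 ^ d2;      /* generator polynomial 111 */
        out[o++] = b ^ d2;           /* generator polynomial 101 */
        d2 = d1;                     /* shift the delay line     */
        d1 = b;
    }
}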

Step 2: interleaving

The second step is a process called "interleaving", which is just a fancy word for "juggling bits around". This means that you completely change the order in which the bits are transmitted. If the header were only 4 bits, instead of sending the bits as "1 - 2 - 3 - 4", they would be sent as "1 - 3 - 2 - 4". This may look like a very odd thing to do, but there actually is a very good reason for it. It is related to the Forward Error Correction mechanism described above.

As mentioned above, the FEC algorithm tries to make a stream more robust by adding error-correction bits to it. However, these error-correction bits are located right next to the data bits they protect. In our 4-bit header example, bits 1 and 2 would form one FEC pair and so would bits 3 and 4.

Now, all these 4 bits are sent sequentially. Imagine that, during transmission, there is an interference pulse that not only wipes out bit 1 but also bit 2: both bits of the FEC pair will be corrupted and the FEC system will not be able to correct this. This is where the "interleaving" system comes in. By spreading out the 660 bits of the header, the stream becomes less vulnerable to this kind of "impulse noise" interference, as the bits of a FEC pair end up located much further apart.

Programming-wise, interleaving is very easy to implement with just a plain lookup table, as the sketch below shows.
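
As an illustration, here is the 4-bit example from above implemented with such a lookup table. The real D-STAR table obviously has 660 entries; this tiny table only shows the mechanism.

#include <stdint.h>

/* "1 - 2 - 3 - 4" is sent as "1 - 3 - 2 - 4": the table lists which
   original bit goes into each transmitted position. Deinterleaving in
   the receiver simply uses the inverse table.                        */
static const int order[4] = { 0, 2, 1, 3 };

void interleave4(const uint8_t in[4], uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = in[order[i]];
}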

Step 3: scrambling

Although the name might indicate otherwise, scrambling has nothing to do with encryption. This process -also known as randomising- is related to how data communication systems keep themselves synchronised.

A GMSK demodulator uses the timing of the transitions from 0 to 1 and from 1 to 0 to keep the receiving GMSK demodulation process in sync with the sender. These transitions are used as markers to avoid the GMSK receiver losing sync. For that reason, long sequences of all-0s or all-1s are something that needs to be avoided.

What the scrambling process does is convert a bit sequence into another (predetermined) sequence, making these all-0 and all-1 runs less likely. Luckily, there are open source software modules that implement scrambling, so we do not need to worry too much about it.

Note that the scrambling process on the sender side is exactly the same as in the receiver. There is no separate "unscrambling" function.
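
A minimal sketch of such a scrambler is shown below. It XORs the bit stream with a fixed pseudo-random sequence produced by a small shift register; because XOR-ing twice with the same sequence gives back the original bits, the exact same function works on both the sender and the receiver side. The polynomial (x^7 + x^4 + 1) and the all-ones start state used here are the values commonly quoted for D-STAR, but verify them against the open source modules mentioned above.

#include <stdint.h>

/* Additive scrambler sketch: XOR the data with a pseudo-random bit
   sequence from a 7-bit LFSR. Polynomial and start state are an
   assumption (see the text above).                                  */
void scramble_bits(const uint8_t *in, uint8_t *out, int nbits)
{
    uint8_t state = 0x7F;                        /* 7-bit register, all ones     */

    for (int i = 0; i < nbits; i++) {
        uint8_t prbit = (state >> 6) & 1;        /* next pseudo-random bit       */
        out[i] = (in[i] & 1) ^ prbit;            /* scramble = XOR with that bit */

        uint8_t fb = ((state >> 6) ^ (state >> 3)) & 1;   /* taps for x^7 + x^4 + 1 */
        state = ((state << 1) | fb) & 0x7F;
    }
}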

So, to summarize:

  1. The D-STAR header frame contains information, like callsigns, that is needed for routing a D-STAR stream.
  2. To make it less prone to transmission errors, three processes are applied to it, resulting in a structure that is twice its original size.
  3. Luckily, there exists open source software for these functions, so we can just consider them a "magical black box".
  4. At 4800 bits per second, transmitting the header takes 137.5 ms.

Finally, voice

Let's look at the specification to see what comes next. Page 5 provides the answer: a sequence of voice and slow-data information.

The structure of this part of the stream is very simple. Every 20 ms, 96 bits (12 octets) are sent:

  • 72 bits (9 octets) of AMBE encoded voice
  • 24 bits (3 octets) of slow-data

Looking at the bitstream dump, there is not that much to see at first sight:

11011111 11010000 00011000 00011001 11100110 10010100
10110111 11111111 10011101 10101010 10110100 01101000

01111101 01010000 01100000 00011111 10100100 11011100
10001001 01100010 10100111 00001100 00100000 10000011

01011111 10010001 01010000 00000011 10100100 11111100
10011111 11100000 10100101 10011100 00111000 11100011

(...)

 

Every frame of 12 octets (shown as two lines in the dump above) represents 20 ms. The first 9 octets contain the digital voice + FEC. The last 3 octets contain slow data.

Concerning slow data, two things need to be noted:

  • The 3 octets of slow data in the first frame and in every 21st frame after that do not contain slow data. Instead, they contain a fixed pattern (10101010 10110100 01101000), as seen in the example above. This pattern is an additional synchronisation pattern that is used to keep the receiver in sync with the sender (see the sketch after this list). This will be explained in more detail in a later article.
  • The slow data found in the other frames is processed by the same scrambling process as used on the header frame. Again, the goal is to avoid long all-1 and all-0 sequences that would risk upsetting the synchronisation of the GMSK demodulator.
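
Recognising these synchronisation frames in software comes down to a simple comparison against the fixed pattern quoted above. A minimal sketch (with a hypothetical function name, not taken from the gmsk receiver source):

#include <stdint.h>
#include <string.h>

/* 10101010 10110100 01101000: the fixed slow-data sync pattern */
static const uint8_t SLOWDATA_SYNC[3] = { 0xAA, 0xB4, 0x68 };

/* Returns 1 if the 3 slow-data octets of this frame carry the sync pattern. */
int is_sync_frame(const uint8_t slowdata[3])
{
    return memcmp(slowdata, SLOWDATA_SYNC, 3) == 0;
}
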
The end

After the header of the voice stream and the actual voice and slow-data frames, we arrive at the last part of the stream: the end.

One of the advantages of a digital telecommunication stream is that you can actually mark the end of a stream INSIDE the stream. You do not have to depend on some carrier dropping away. By saying "the stream stops here", the stream can be terminated much more cleanly.

Again, let's look at the specification and see what it says. On page 6 of the document, we read this:

"The last data frame, which requires a means of terminating the transmission, is a unique synchronizing signal (32 bit + 15bit “000100110101111” + “0”, making 48 bits) as defined by the modulation type.".

 

Hum!  Interesting wording! :-(

Looking at the simplified source code, we see that this "unique synchronisation signal" is equivalent to the following bit pattern:

  • The last 3 octets of a frame contain the pattern "10101010 10101010 10101010"
  • The first 3 octets of the next frame contain the pattern "10101010 00010011 01011110"

Again, we look at the output of the gmsk receiver program, and we see this:

11011001 10110000 00101101 00001101 11000011 11000110
00111111 01100110 01010110 10000100 00110000 01000011

00111011 10010011 00000101 10001000 01101011 01111111
11110001 11001100 10010111 11111110 10101010 10101010

10101010 00010011 01011110 END.

 

Now, if you look at this carefully, you see that the pattern does not completely match. The 10th octet of the last full frame should contain "10101010" but it contains something else. However, that pattern is still correctly recognised as a valid "END" marker by the functions that have to detect the end of a stream.

The reason is that these functions are coded in such a way that they can deal with small errors. In this case, up to 3 bit errors are allowed. If this were not done, a single bit error in the transmission of the "end" marker would risk the receiver completely missing the end of the stream (see the sketch below).
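
The idea behind such an error-tolerant comparison is simply to count how many bits differ between what was received and what was expected, and to accept the marker as long as that count stays small. A minimal sketch (names and layout are my own, not the actual receiver code):

/* Compare nbits received bits against the expected end-of-stream pattern
   and accept the match if at most maxerr bits differ (the text above
   mentions a tolerance of 3 bit errors).                                */
int matches_end_pattern(const unsigned char *received,
                        const unsigned char *expected,
                        int nbits, int maxerr)
{
    int errors = 0;

    for (int i = 0; i < nbits; i++)
        if ((received[i] & 1) != (expected[i] & 1))
            errors++;

    return errors <= maxerr;
}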

The picture below shows this pattern, as it was received by the board:

The end-sequence pattern can be found between timestamps 11.844 and 11.850 (with the 16 repetitions of "01" nicely showing up). And it even shows why the beginning of this pattern was incorrectly decoded: there seems to be some drift on the signal.

This shows one of the main difficulties in programming a GMSK receiver: making it work correctly, even if the stream contains bit errors. But that is a subject that will be discussed in a later article.

 

This finishes this article on the structure of a D-STAR stream. I hope it has been a little bit instructive.

If you have any questions or remarks, discussion about this subject is best done in the gmsk_dv_modem group on Yahoo. Feel free to drop by!

73

Kristoff – ON1ARF