String parsing and manipulating?

30 Apr 2012

Hi. I'm quite new to not only the mbed but also C programming itself, and I think I need some help!

I have a string from my PC being read by my mbed and it looks something like this:

$GPRMC,121212,A,4807.038,N,01131.000,E,022.4,084.4,030611,003.1,W*6A

This string of information is from a GPS simulator running on my PC. The line below shows what each part means.

RMC,Time,status,latitude,longitude,speed,angle,date,magnetic variation,check-sum

I want to parse the string into its separate parts, then manipulate some of the data and output it to an LCD. I'm okay with outputting to the LCD and I think I've parsed the string! :S But I'm not sure how to manipulate it.

E.g. I want to take the date from the string, so I've got 030611, and I want to output it on the LCD as 03-June-2011.

How do I go about doing this?

Here's how far I've got...

#include "mbed.h"
#include "NewTextLCD.h"

Serial pc(USBTX, USBRX);

TextLCD lcd(p10,p11,p12,p13,p14,p15,TextLCD::LCD20x4);

char GPS [100];
char RMC, Time, Status, Lat, Long, Speed, Angle, Date, MagV, Check;


int main() {

    pc.scanf(data, "%7c%7c%2c%11c%12c%6c%6c%7c%7c%3c", &RMC, &Time, &Status, &Lat, &Long, &Speed, &Angle, &Date, &MagV, &Check);

Thank you very much. I hope you understand my problem =]

Also, if what I have already is ineffective, I would appreciate someone correcting me. =]

30 Apr 2012

Hi Aaron,

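At a quick glance at your code, I don't think it will do what you are expecting it to do. The call as you have written it matches sscanf, which parses a string (the 1st parameter) based on a format specifier (the 2nd parameter) and places the parsed data into whatever variables/buffers are pointed to by the later parameters. pc.scanf, by contrast, reads straight from the serial port and takes the format specifier as its 1st parameter, and data is never declared (your buffer is called GPS). Setting that aside, the format specifier is where we encounter the first problem: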

Quote:

pc.scanf(data, "%7c%7c%2c%11c%12c%6c%6c%7c%7c%3c", &RMC, &Time, &Status, &Lat, &Long, &Speed, &Angle, &Date, &MagV, &Check);

This code takes the first 7 chars and places them into RMC, the next 7 into Time, the next 2 into Status, etc. However...

Quote:

char RMC, Time, Status, Lat, Long, Speed, Angle, Date, MagV, Check;

This code only allocates enough memory for one char for each field...when scanf tries to put multiple chars in, this will probably lead to memory corruption.

You would need to allocate an array of chars for each field, something along the lines of :

char RMC[7];
char Time[7];
char Status[2];
//etc

But this leads to another potential problem. Using scanf with the %c format specifier parses data as a bunch of chars...but it does not add a NULL character on the end to make it a valid string. This means you will probably encounter problems if you try to use any other string parsing functions. scanf does take a %s format specifier, however it uses whitespace to determine when it has reached the end of the string (and looking at your data string, you don't have any whitespace chars)..so this probably won't work.

I personally never use scanf as I find it too unpredictable as to what it will do if the input data doesn't match the format specifier exactly.
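
For completeness, here is a minimal sketch of how the scanf family could be made to produce proper NULL terminated fields, using sscanf with a scanset ("%[^,]" means "read everything up to the next comma"). The helper name ParseFirstThree and the buffer sizes are my own assumptions for illustration. Note that a scanset must match at least one character, so this stops early on an empty field, which is one more reason I prefer the approach further down.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of any library): copies the first three
   comma separated fields of a sentence into caller supplied buffers.
   Returns the number of fields that were successfully read (0 to 3). */
int ParseFirstThree(const char* sentence, char id[8], char timeStr[11], char status[2])
{
    assert(sentence != NULL);

    /* Each width is one less than the buffer size, because sscanf appends
       the terminating '\0' itself when a scanset is used. */
    return sscanf(sentence, "%7[^,],%10[^,],%1[^,]", id, timeStr, status);
}
```

Called on the sentence from the first post, this should fill id with "$GPRMC", timeStr with "121212" and status with "A".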

Here is an alternative approach that uses the strtok function instead. strtok has the ability to break a string into proper NULL terminated substrings (or tokens - hence the name) based on a user supplied set of delimiter characters (that are used to determine when the end of a token has been reached).

#include <stdio.h>
#include <string.h>
#include <stdint.h>


char testStringBuffer[] = "$GPRMC,121212,A,4807.038,N,01131.000,E,022.4,084.4,030611,003.1,W*6A";

#define NUM_FIELDS	(12) //number of comma separated values in the data...TODO does this remain constant?
char* pFields[NUM_FIELDS];






// This function separates the single input string into numFields substrings
void ParseFields(char* inputBuffer, char** pFields, uint32_t numFields, const char* delimiterChars)
{
	char* pString = inputBuffer;
	char* pField;
	
	for(uint32_t i=0; i<numFields; i++)
	{
		pField = strtok(pString, delimiterChars);

		if(pField != NULL)
		{
			pFields[i] = pField;
		}
		else
		{
			pFields[i] = "";
		}

		pString = NULL; //to make strtok continue parsing the next field rather than start again on the original string (see strtok documentation for more details)
	}
}







int main(int argc, char* argv[])
{
	ParseFields(testStringBuffer, pFields, NUM_FIELDS, ",");


	return 0;
}


The above code splits the input string at each comma and stores pointers to the start of each token in the pFields array. One important thing to note is that the strtok function modifies the contents of the original string buffer: it overwrites each delimiter char (in this case the commas) with a NULL char to terminate each of the tokens.
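
One caveat worth knowing: strtok treats a run of consecutive delimiters as a single delimiter, so empty fields (two commas in a row) are skipped and every later field shifts down by one index. If your GPS source can send empty fields, a splitter along the following lines keeps them. This is only a sketch; the name SplitKeepEmpty and its signature are my own, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alternative to ParseFields: splits input in place on a single
   delimiter character and keeps empty fields. Returns the number of fields
   stored in fields[] (at most maxFields). Like strtok, it overwrites each
   delimiter it consumes with '\0'. */
size_t SplitKeepEmpty(char* input, char delimiter, char** fields, size_t maxFields)
{
    size_t count = 0;
    char* start = input;

    assert(input != NULL && fields != NULL);

    while (count < maxFields) {
        char* end = strchr(start, delimiter);

        fields[count++] = start;   /* may point at an empty string */

        if (end == NULL) {
            break;                 /* last field reached */
        }

        *end = '\0';               /* terminate this field */
        start = end + 1;           /* next field begins after the delimiter */
    }

    return count;
}
```

For an input such as "$GPRMC,,V,," this yields five fields (the second, fourth and fifth being empty), whereas strtok would return only "$GPRMC" and "V".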

Give it a go and let me know if it works for you.

Regards,

Steven

30 Apr 2012

Hi Aaron,

To convert the date, a little bit more string parsing is needed:

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

//this function takes in a 6 character date string in the form "ddmmyy" and converts it to a fuller string format of the form "dd-MMM-yyyy"
void ParseDate(char* pEncodedDataString, char* pOutputBuffer, uint32_t outputBufferLength)
{
	char dayStr[2+1];
	char monthStr[2+1];
	char yearStr[2+1];

	uint32_t dayInt;
	uint32_t monthInt;
	uint32_t yearInt;

	struct tm date = {0}; //zero the fields we don't set so strftime never sees indeterminate values


	//Step 1 - Separate the day, month and year into their own NULL terminated string buffers...
	dayStr[0] = pEncodedDataString[0];
	dayStr[1] = pEncodedDataString[1];
	dayStr[2] = '\0';

	monthStr[0] = pEncodedDataString[2];
	monthStr[1] = pEncodedDataString[3];
	monthStr[2] = '\0';

	yearStr[0] = pEncodedDataString[4];
	yearStr[1] = pEncodedDataString[5];
	yearStr[2] = '\0';


	//Step 2 - Run atoi() to parse the strings into integer values...
	dayInt = atoi(dayStr);
	monthInt = atoi(monthStr);
	yearInt = atoi(yearStr) + 2000;


	//Step 3 - Build up our date data structure...this will be used by the next step...
	date.tm_mday = dayInt;
	date.tm_mon = monthInt - 1; //num months since January - so Jan=0, Feb=1, Mar=2, etc
	date.tm_year = yearInt - 1900; //num years since 1900


	//Step 4 - Build the formatted string ("%b" gives the abbreviated month, e.g. "Jun"; use "%B" for the full name, e.g. "June")...
	strftime(pOutputBuffer, outputBufferLength, "%d-%b-%Y", &date);
}

30 Apr 2012

Putting it all together:

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

char testStringBuffer[] = "$GPRMC,121212,A,4807.038,N,01131.000,E,022.4,084.4,030611,003.1,W*6A";

#define NUM_FIELDS	(12) //number of comma separated values in the data...TODO does this remain constant?
char* pFields[NUM_FIELDS];


char dateStringBuffer[64+1];



// This function separates the single input string into numFields substrings
void ParseFields(char* inputBuffer, char** pFields, uint32_t numFields, const char* delimiterChars)
{
	char* pString = inputBuffer;
	char* pField;
	
	for(uint32_t i=0; i<numFields; i++)
	{
		pField = strtok(pString, delimiterChars);

		if(pField != NULL)
		{
			pFields[i] = pField;
		}
		else
		{
			pFields[i] = "";
		}

		pString = NULL; //to make strtok continue parsing the next field rather than start again on the original string (see strtok documentation for more details)
	}
}





//this function takes in a 6 character date string in the form "ddmmyy" and converts it to a fuller string format of the form "dd-MMM-yyyy"
void ParseDate(char* pEncodedDataString, char* pOutputBuffer, uint32_t outputBufferLength)
{
	char dayStr[2+1];
	char monthStr[2+1];
	char yearStr[2+1];

	uint32_t dayInt;
	uint32_t monthInt;
	uint32_t yearInt;

	struct tm date = {0}; //zero the fields we don't set so strftime never sees indeterminate values


	//Step 1 - Separate the day, month and year into their own NULL terminated string buffers...
	dayStr[0] = pEncodedDataString[0];
	dayStr[1] = pEncodedDataString[1];
	dayStr[2] = '\0';

	monthStr[0] = pEncodedDataString[2];
	monthStr[1] = pEncodedDataString[3];
	monthStr[2] = '\0';

	yearStr[0] = pEncodedDataString[4];
	yearStr[1] = pEncodedDataString[5];
	yearStr[2] = '\0';


	//Step 2 - Run atoi() to parse the strings into integer values...
	dayInt = atoi(dayStr);
	monthInt = atoi(monthStr);
	yearInt = atoi(yearStr) + 2000;


	//Step 3 - Build up our date data structure...this will be used by the next step...
	date.tm_mday = dayInt;
	date.tm_mon = monthInt - 1; //num months since January - so Jan=0, Feb=1, Mar=2, etc
	date.tm_year = yearInt - 1900; //num years since 1900


	//Step 4 - Build the formatted string ("%b" gives the abbreviated month, e.g. "Jun"; use "%B" for the full name, e.g. "June")...
	strftime(pOutputBuffer, outputBufferLength, "%d-%b-%Y", &date);
}





int main(int argc, char* argv[])
{
	ParseFields(testStringBuffer, pFields, NUM_FIELDS, ",");

	ParseDate(pFields[9], dateStringBuffer, 64);


	printf("%s\n", dateStringBuffer);

	return 0;
}


All you should need to do now is:

  1. read the data from your serial port into a buffer (make sure you allocate enough space!)
  2. call ParseFields to split the compound string into separate strings
  3. call ParseDate to convert the 6-char encoded date string into a "nicer" formatted string
  4. write that out to your LCD
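
For step 1, something along these lines could collect the incoming characters into a buffer one at a time until the end of the line, without ever writing past the end of the buffer. This is only a sketch under my own assumptions: the names LineBuffer and LineBufferPush are made up for illustration, the 100 byte size simply mirrors the GPS buffer in the original question, and I am assuming each sentence ends with a newline.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BUFFER_SIZE 100 /* assumed size, same as the GPS[100] buffer in the question */

/* Hypothetical line accumulator (not an mbed type). */
typedef struct {
    char   data[LINE_BUFFER_SIZE]; /* holds the NULL terminated line once complete */
    size_t length;                 /* number of chars collected so far */
} LineBuffer;

/* Feed one received character. Returns 1 when a complete line is available
   in lb->data (NULL terminated, without the line ending), otherwise 0. */
int LineBufferPush(LineBuffer* lb, char c)
{
    assert(lb != NULL);

    if (c == '\r') {
        return 0;                      /* ignore carriage returns */
    }

    if (c == '\n') {
        lb->data[lb->length] = '\0';   /* terminate the collected line */
        lb->length = 0;                /* the next call starts a new line */
        return 1;
    }

    if (lb->length < LINE_BUFFER_SIZE - 1) {
        lb->data[lb->length++] = c;    /* always leave room for the '\0' */
    }
    /* chars beyond the buffer size are dropped rather than overflowing */

    return 0;
}
```

On the mbed you would start with a zeroed LineBuffer, feed it each char returned by pc.getc(), and whenever it returns 1 pass lb.data on to ParseFields and ParseDate as in the program above.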

Hope this helps!

Regards,

Steven

30 Apr 2012

Wow! That's really helpful. I'm gonna try this. If I have any more problems I'll get back to you.

Thanks a lot,

Aaron

11 Apr 2016

Hi, I don't understand how to use this function to get other information, such as latitude and longitude.

// This function separates the single input string into numFields substrings
void ParseFields(char* inputBuffer, char** pFields, uint32_t numFields, const char* delimiterChars)
{
    char* pString = inputBuffer;
    char* pField;
    
    for(uint32_t i=0; i<numFields; i++)
    {
        pField = strtok(pString, delimiterChars);
 
        if(pField != NULL)
        {
            pFields[i] = pField;
        }
        else
        {
            pFields[i] = "";
        }
 
        pString = NULL; //to make strtok continue parsing the next field rather than start again on the original string (see strtok documentation for more details)
    }
}

11 Apr 2016

@Koon Chung Fong, that function just parses the data based on a delimiter.

So if your string contains something like:

char from_gps[] = "bla,4.0971,51.0321"; // must be a writable array, because strtok modifies the string in place

You can parse this into three separate fields using:

char* parsed[3];
ParseFields(from_gps, parsed, 3, ",");
// now parsed contains
// [0] = bla
// [1] = 4.0971
// [2] = 51.0321

However, you probably want the lat/lon to be a float rather than a char*. For that, look at e.g. the atof or strtod functions (or std::stof if you are writing C++ and your toolchain supports C++11).
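
As a rough sketch of that conversion: in a real NMEA sentence like the $GPRMC example at the top of this thread, latitude is packed as ddmm.mmm and longitude as dddmm.mmm (degrees followed by minutes), so a plain string-to-float conversion is not quite enough. The helper below is hypothetical (the name NmeaToDegrees is mine) and returns 0.0 for an unparsable field purely to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: converts an NMEA "ddmm.mmm" / "dddmm.mmm" field plus
   its hemisphere letter ('N', 'S', 'E' or 'W') into signed decimal degrees.
   Returns 0.0 if the field does not start with a number. */
double NmeaToDegrees(const char* field, char hemisphere)
{
    char*  end;
    double raw;
    double degrees;
    double minutes;
    double result;

    assert(field != NULL);

    raw = strtod(field, &end);
    if (end == field) {
        return 0.0;                           /* nothing could be parsed */
    }

    degrees = (double)(long)(raw / 100.0);    /* whole degrees: 4807.038 -> 48 */
    minutes = raw - degrees * 100.0;          /* remaining minutes: 7.038 */
    result  = degrees + minutes / 60.0;       /* 60 minutes per degree */

    if (hemisphere == 'S' || hemisphere == 'W') {
        result = -result;                     /* south and west are negative */
    }

    return result;
}
```

With the fields from the sentence in the first post this would be called as NmeaToDegrees(pFields[3], pFields[4][0]) for latitude and NmeaToDegrees(pFields[5], pFields[6][0]) for longitude; "4807.038" with 'N' works out to roughly 48.1173 degrees.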

17 Apr 2016

/media/uploads/sbrown603/code.jpg

Hi there,

I tried implementing the above code to read data from the serial port and output it onto my LCD. However, when I read data from the serial port it will not output to the LCD. Am I missing something very simple?

Thanks in advance guys!!

19 Apr 2016

@Conan, can you go to the component page and hit 'Ask a question' there? More people will see it, rather than bumping an old thread.