html - How to find all occurrences of a substring in C


I am trying to write a parsing program in C that will take certain segments of text from an HTML document. To do this, I need to find every instance of the substring "name": in the document; however, the C function strstr only finds the first instance of a substring. I cannot find a function that finds anything beyond the first instance, and I have considered deleting each substring after I find it so that strstr will return the next one. I cannot get either of these approaches to work.

By the way, I know the while loop limits this to six iterations, but I was just testing to see if the function would work in the first place.

    while(entry_count < 6)
    {
        printf("test");
        if((ptr = strstr(buffer, "\"name\":")) != NULL)
        {
            ptr += 8;
            int i = 0;
            while(*ptr != '\"')
            {
                company_name[i] = *ptr;
                ptr++;
                i++;
            }
            company_name[i] = '\n';
            int j;
            for(j = 0; company_name[j] != '\n'; j++)
                printf("%c", company_name[j]);
            printf("\n");
            strtok(buffer, "\"name\":");
            entry_count++;
        }
    }

Just pass the returned pointer, plus one, to strstr() to find the next match:

    char *ptr = strstr(buffer, target);
    while (ptr) {
        /* ... do something with ptr ... */
        ptr = strstr(ptr+1, target);
    }
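To make the loop above concrete, here is a minimal sketch of how it might be applied to the question's "name": search. The helper names `count_occurrences` and `next_name` are hypothetical, not standard functions. The sketch assumes each value is a double-quoted string directly after the key (optionally preceded by spaces) and that values contain no escaped quotes. Note that advancing by one character also finds overlapping matches; advancing by `strlen(target)` instead would skip them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count every occurrence of needle in haystack, including overlapping ones. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t count = 0;

    if (needle[0] == '\0')
        return 0; /* an empty needle would match at every position */

    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + 1, needle))
        count++;

    return count;
}

/*
 * Hypothetical helper: find the next "name":"value" pair at or after
 * *cursor and copy the value into out (truncated to out_size - 1 chars).
 * On success, advance *cursor past the closing quote and return 1.
 * Return 0, leaving *cursor unchanged, when no further pair is found.
 */
int next_name(const char **cursor, char *out, size_t out_size)
{
    static const char key[] = "\"name\":";
    const char *p = *cursor;

    assert(out_size > 0);

    while ((p = strstr(p, key)) != NULL) {
        const char *start = p + strlen(key);

        while (*start == ' ') /* assumption: optional spaces after the colon */
            start++;

        if (*start == '"') {
            const char *end;
            size_t len;

            start++; /* step over the opening quote */
            end = strchr(start, '"');
            if (end == NULL)
                return 0; /* unterminated value: nothing more to extract */

            len = (size_t)(end - start);
            if (len >= out_size)
                len = out_size - 1; /* truncate rather than overrun out */
            memcpy(out, start, len);
            out[len] = '\0';

            *cursor = end + 1;
            return 1;
        }

        p++; /* key not followed by a quoted string; keep searching */
    }

    return 0;
}
```

A caller would point a `const char *` cursor at the start of the buffer and call `next_name` in a loop until it returns 0, printing each value as it goes. Unlike the strtok approach in the question, this never modifies the buffer, and the explicit length check keeps a long value from overflowing the output array.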

P.S. While you can do this, I'd suggest that you may wish to consider more suitable tools for the job:

  • C is a very low-level language, and trying to write string parsing code in it is laborious (especially if you insist on coding everything from scratch, instead of using existing parsing libraries or parser generators) and prone to bugs (some of which, like buffer overruns, can create security holes). There are plenty of higher-level scripting languages (like Perl, Ruby, Python or JavaScript) that are much better suited for tasks like this.

  • When parsing HTML, you really should use a proper HTML parser (preferably combined with a DOM builder and query tool). This will allow you to locate the data you want based on the structure of the document, instead of just matching substrings in the raw HTML source code. A real HTML parser will also transparently take care of issues like character set conversion and decoding of character entities. (Yes, there are HTML parsers for C, such as Gumbo and Hubbub, so you can and should use one even if you insist on sticking to C.)

