Search for text in a file

Is there any way to search for text in a file (not part of the program) and then read the following text until it sees a certain line of text? For example:
Hello World. How are you today? I am fine.
is the text file, and I want the program to search for ‘World.’ then read the text (and use it, say as an output) and stop when it sees ‘I am’, without knowing what is between. Thank you.

There are two options. You can use the C++ standard library and read in a file with ifstream, or use a FILE* pointer and fopen() and read in the file the old-fashioned way with cstdio. If you need an explanation on how to open and read a file, feel free to post here again or PM me. Once you’ve opened the file, you have two options.

  1. You can read the entire text file all at once into a memory buffer and then search for your text using strstr(), which returns a pointer to the start of the sub-string you searched for.

  2. Alternatively, you can read in the file one line at a time (using fgets() with FILE* pointers/fopen() and ifstream::getline() for ifstream) and search each line you load in with strstr().

If you’re using FILE* pointers, don’t forget to call fclose() on the file when you’re done.
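
In case it helps to see option 2 spelled out, here is a minimal sketch of the line-at-a-time search. It is not the only way to do it: it swaps in std::getline() and std::string::find() in place of ifstream::getline() and strstr(), so no fixed-size line buffer is needed. The function name findLineContaining is made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (the name is an assumption for this sketch).
// Reads the stream one line at a time and returns the 1-based number of the
// first line that contains key, or -1 if no line contains it.
int findLineContaining(std::istream& in, const std::string& key)
{
    std::string line;
    int lineNumber = 0;

    // std::getline() returns the stream, which tests false once reading fails
    // (normally because the end of the input was reached).
    while (std::getline(in, line))
    {
        ++lineNumber;

        // find() returns std::string::npos when the key isn't in this line.
        if (line.find(key) != std::string::npos)
            return lineNumber;
    }

    return -1;
}
```

To use it on a real file you would pass in a std::ifstream that you have opened, since ifstream is a kind of std::istream. One limitation to keep in mind: a key phrase that is split across two lines won't be found this way, which is one reason to prefer option 1 for this problem.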

As I said, feel free to let me know if you need any more help.

Can you show me an example program please?

Gladly. Before I do, one question though:

So are you asking to read everything between “World” and “I am” ? I’m confused by what text you want read or outputted.

Karatekid: it almost sounds like you’re just looking for a generic tool to search for text, as opposed to something as part of your code. If that’s the case, then I’d recommend grep. For Windows, you can get it at http://gnuwin32.sourceforge.net/packages/grep.htm

Grep is pretty easy to use for basic things. Go to a command line and type

grep "text to search for" filename

There are a lot more advanced ways to use grep of course…

I want it to find ‘World’ then output the text until it finds ‘I am’

Got it. You’ll have a demo program later today.

Ok. Thank you very much.

Share and enjoy.

#define _CRT_SECURE_NO_WARNINGS

//For all our file reading functions and printf()
#include <cstdio>
//For searching through the strings
#include <cstring>
//For dynamic memory allocation.
#include <cstdlib>
//system() is also declared in <cstdlib>, so no extra header is needed for it.
//Be aware that system("pause") is a Windows command
//which causes the "Press any key to continue..." message to show up.
//On other systems (Linux, Macs, etc.), pause is not a valid command.
//Another way would be to print "Press enter to continue..." and then call getchar(),
//which waits for the user to hit enter.

int main(void)
{
	//Strings that hold our start and end key phrases.
	//I'm making them constant, but if you want to implement a way to change them, go ahead.
	const char* startKey = "World.";
	const char* endKey = "I am";


	//Create our file pointer.  I'm using this system instead of ifstream because
	//I learned C before C++ so I've never gotten around to using the C++ file system.
	//Feel free to use whichever you want.  Both work identically.
	FILE* file;
	
	//Call fopen(), which takes two arguments:  the file name and a mode.  We're using "r"
	//for "read" mode.  "w" is for write, and you can look up other codes online by Googling fopen.
	//Also, if you were reading or writing binary data instead of text, you'd use "wb" or "rb"
	//Two backslashes are used after "C:" because \ is reserved for special characters, such as
	// \n, which jumps to the next line.  "\\" is the code for \.

	//fopen() also won't work if the file doesn't exist yet.  Make sure this file exists and is in the
	//correct location.
	file = fopen("C:\\TestText.txt", "r");

	//if fopen() fails, it sets the file pointer to NULL.  We'll check for that and quit the program
	//if this happens.
	if(file == NULL)
	{
		printf("File couldn't be opened properly.\n");
		system("pause");
		return 1;
	}

	//I'm going to get the length of the file by calling fseek() to go the end of the file,
	//then ftell() to see how far into the file we are.
	//The arguments for fseek() are the file pointer, the offset (how far to go from your base point),
	//and base point (beginning, current position, or end)
	fseek(file, 0, SEEK_END);

	//The return value of ftell() is actually how many bytes into the file we are.
	//However, since ASCII characters (such as ones stored in char strings) are one byte long,
	//this is also a representation of how many characters there are in the file.
	//Note that this includes special characters like \n to go to the next line
	int fileLength = ftell(file);

	//We're going to create a buffer in the memory to read our string into.
	//If you're not familiar with dynamic memory allocation, feel free
	//to contact me with more questions.
	//A great article to explain the concept is found here:
	//http://irc.essex.ac.uk/www.iota-six.co.uk/c/f7_dynamic_memory_allocation.asp

	//I'm making the buffer 1 longer than the file length because the last byte of a string
	//needs to be 0 to tell the computer the string has ended.  This zero byte is called the
	//"Null terminator", and is automatically added to strings when you declare them.
	//malloc(), calloc(), and realloc() return a void* pointer, which you can use for whatever.
	//We're casting it to a char* pointer so the compiler knows we're using it for a
	//character string.
	char* fileBuffer = (char*)calloc(fileLength + 1, 1);

	//Use fseek() to go back to the beginning (you can't read a file backwards).
	fseek(file, 0, SEEK_SET);

	//Now we're going to read the file into the buffer.  We call fread() to do this.
	//The first argument is a pointer to your buffer, the second is the size of each element
	//you're reading, the third is how many elements there are,
	//and the final argument is a file pointer.
	fread(fileBuffer, 1, fileLength, file);

	//We're going to search for the start key. strstr() searches for a string inside another string
	//and returns a pointer to the first instance it finds.  We're going to add the length of the start
	//key to the pointer it returns so that we know where what we're looking for starts, not the start key.
	//strlen() returns the length of a string in bytes.
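	char* start = strstr(fileBuffer, startKey);

	//strstr() returns NULL if it can't find the key, so check before using the pointer.
	if(start == NULL)
	{
		printf("Start key not found.\n");
		free(fileBuffer);
		fclose(file);
		system("pause");
		return 1;
	}
	start += strlen(startKey);

	//Search for the end key starting from the end of the start key, so an end key
	//that appears earlier in the file can't give us a negative length.
	char* end = strstr(start, endKey);
	if(end == NULL)
	{
		printf("End key not found after the start key.\n");
		free(fileBuffer);
		fclose(file);
		system("pause");
		return 1;
	}

	//Calculate the length of our output string by finding the distance between the end of the start key
	//and the start of the end key.
	int outputLength = (int)(end - start);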

	//create a new buffer to hold our output string and copy it in using memcpy().
	//memcpy() takes the destination, the source, and how many bytes you want to copy.
	//Keep in mind that your output is everything between the start and end key, including spaces.
	char* output = (char*)calloc(outputLength + 1, 1);
	memcpy(output, start, outputLength);

	//Now that we've copied our output string to its own buffer, we can free the file buffer.
	//Always, always free any memory you've allocated when you're done using it.
	//Free memory you've allocated with malloc(), calloc(), or realloc() with free(),
	//And any memory you've allocated with "new" with "delete".
	//Generally, "new" and "delete" are better for use with C++ classes since they
	//can be automatically constructed and deconstructed when you call them,
	//but for simple buffers for strings and binary data, malloc() and free() work well.
	//"new" and "delete" work for arrays too, so it's a preference thing.
	//However, you can't resize a buffer using realloc() that you made with "new".

	//Basically if you don't free your memory when you're done with it, the computer
	//assumes that you're still using it and won't let any other program or process use it.
	//This is called a "memory leak," and it can be very bad.
	free(fileBuffer);

	//I'm just printing it as an example, but you can now do whatever you want with your output string.
	printf("My output string is:\n%s\n\n", output);
	system("pause");

	//Make sure to free your output buffer when you're done with it.
	free(output);

	//Always close the file when you're done with it so other programs can access it
	//and so you don't leave the resources the program used for the file lying around.
	fclose(file);
}

From the checking I did, this will work. However, it seems to be the C way to do it rather than the usual C++ way. If anyone knows the C++ way to do this, it would be helpful to see it.

As I said, this is the “C way” to do it, but all the C libraries work perfectly well in C++. This won’t put you at a disadvantage in any way. If you want to do it the “C++ way,” just read your data into the buffer with ifstream. That’s all you have to change.
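
For anyone who wants to see a more C++-flavoured take, here is a rough sketch, not a definitive version. It goes a bit further than just swapping fopen() for ifstream: it assumes you are happy to hold the file in a std::string instead of a calloc() buffer, and it uses std::string::find() in place of strstr(). The helper names readWholeFile and extractBetween are made up for this example.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: reads the whole file at path into contents.
// Returns false if the file can't be opened.
bool readWholeFile(const std::string& path, std::string& contents)
{
    std::ifstream file(path.c_str());
    if (!file)
        return false;

    // Streaming rdbuf() into a string stream copies everything in the file.
    std::ostringstream buffer;
    buffer << file.rdbuf();
    contents = buffer.str();
    return true;
}

// Hypothetical helper: copies the text between the first startKey and the
// first endKey that follows it into output. Returns false if either key
// is missing, in which case output is left untouched.
bool extractBetween(const std::string& text, const std::string& startKey,
                    const std::string& endKey, std::string& output)
{
    std::string::size_type start = text.find(startKey);
    if (start == std::string::npos)
        return false;

    // Skip past the start key itself so it isn't part of the output.
    start += startKey.size();

    // Only look for the end key after the start key.
    std::string::size_type end = text.find(endKey, start);
    if (end == std::string::npos)
        return false;

    output = text.substr(start, end - start);
    return true;
}
```

You would call readWholeFile() with your file name, then pass the contents to extractBetween() with "World." and "I am" as the keys. For the sample sentence from the first post, the output would be " How are you today? " with the spaces on both ends included. The std::string and ifstream objects clean up after themselves, so there is no free() or fclose() to remember.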