Strange File Buffer Behavior, Windows 7
So, I’m writing (or trying to write) a program that, among other things, monitors changes to a particular log file. I have QFileSystemWatcher in place, and it successfully notices changes to text documents I create, modify, and save. So, I know the implementation is correct, as far as it goes.
The issue lies in what I can only guess is extremely peculiar behavior of the log-generating program. It seems to never flush its buffer unless either (1) the program is closed, or (2) (and this makes no sense to me) Windows Explorer refreshes or navigates to the directory. I’m not sure if that’s a ‘feature’ of Windows 7 or poor design on the application developers’ part, but it’s extremely annoying.
So, I’m trying to figure out a way to either force that buffer to write to disk, or simulate an Explorer refresh in my application. It’s less than ideal, but I’m okay with using QTimer to periodically execute some task to accomplish it. At the moment, I’ve tried getting a list of the files in the directory on a timer, but that does not do the trick. Is there something lower level I can use, or am I forced to use the Windows API?
The application you are trying to monitor probably uses a buffered file writer, which is very common (and makes a lot of sense for performance reasons). For example, the fwrite() and fprintf() functions from the C Standard Library use an internal buffer. The data does not actually get written to the file unless the buffer is full, fflush() is called explicitly, or the file is closed.
I doubt you can “force” an application to flush its buffer “from outside” with some Win32 API function. The buffer is inside the application (e.g. in the C Standard Library) and, from the operating system’s point of view, the Write operation has not happened until the buffer is flushed! For example, I would assume that the C Standard Library does not call the Win32 API function WriteFile() until its internal buffer is flushed.
There is also a Win32 function FlushFileBuffers(), but this won’t help if the data is actually buffered on the application level (e.g. in the C Standard Library). Moreover, FlushFileBuffers() requires a handle to the file, so only the application itself can call this function on its own file handle. Sure, your application could open a handle to the very same file that the “other” application is writing to. But then FlushFileBuffers() on your handle would only flush the data that Win32 has buffered for your application. As the “other” application is using its own handle (and thus its own buffer), its data doesn’t get flushed if your app calls FlushFileBuffers() on your handle.
I can only think of one way: You need to call fflush() and/or FlushFileBuffers() from the context of the “other” application. This could be achieved by “injecting” a DLL into that other process and then calling fflush() or FlushFileBuffers(). Still, you would need to somehow locate the desired handle within the process’ memory. That is a rather “hackish” and glitchy solution! One might even argue that this is using “malware”-like techniques…
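For what it’s worth, the application-level buffering described above is easy to reproduce with plain C stdio. The following is only a minimal sketch: the names visibleSize(), BufferingResult and demoBuffering() are made up for this illustration, and it says nothing about how the actual log-generating program writes its file. It writes one short line through a fully buffered FILE* and checks the file size through a second, independent handle before and after fflush().

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Size of `path` in bytes as seen through a fresh, independent read handle,
// or -1 if the file cannot be opened. (Hypothetical helper.)
long visibleSize(const std::string& path)
{
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f)
        return -1;
    std::fseek(f, 0, SEEK_END);
    const long size = std::ftell(f);
    std::fclose(f);
    return size;
}

struct BufferingResult {
    long beforeFlush; // size seen by another handle before fflush()
    long afterFlush;  // size seen by another handle after fflush()
};

// Writes one 9-byte line through a fully buffered stream and records what a
// second handle can see before and after the explicit flush.
BufferingResult demoBuffering(const std::string& path)
{
    std::FILE* out = std::fopen(path.c_str(), "wb");
    if (!out)
        return {-1, -1};

    // Ask for full buffering with a buffer much larger than the message.
    // setvbuf() has to be called before any other I/O on the stream.
    const int rc = std::setvbuf(out, nullptr, _IOFBF, 4096);
    assert(rc == 0);
    (void)rc;

    std::fputs("log line\n", out);           // stays in the stdio buffer
    const long before = visibleSize(path);   // nothing has reached the file yet

    std::fflush(out);                        // hand the bytes to the OS
    const long after = visibleSize(path);

    std::fclose(out);
    std::remove(path.c_str());
    return {before, after};
}
```

On a typical implementation beforeFlush should be 0 and afterFlush should be 9: until fflush() runs, the operating system has simply never been asked to write anything, so no outside process has anything to flush.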
Thanks for the reply! I figured forcing the other application to flush its buffer was impossible, but I’m confused why refreshing Windows Explorer causes it to happen, and that causes me to think there’s some way to do it.
That, and I know it’s been done on the same program before, I just don’t know how it was implemented (I cannot access the source).
Now I’m even more confused. Suddenly (and, I mean out of nowhere), QFileSystemWatcher stopped notifying me of changes even when I refresh Explorer. Just to make sure it wasn’t something I changed in the code, I ran both my program and the other program from the same source on my laptop, and it works like a charm. Both machines have the same OS, as well.
Well, I think the QFileSystemWatcher cannot recognize that the file has changed until the Write operation has actually been performed on the “file system” level, i.e. until the effect of the Write operation actually becomes “visible” in the file. As said before, Write operations are usually buffered. And buffering may happen on the application level, on the OS level or on both of them (actually another layer of buffering may happen on the hardware level, but that shouldn’t matter here). If the application explicitly triggers a flush after the Write operation, then the data will actually be written to the file ASAP. But otherwise, when the flush is not triggered explicitly (and the file is still kept open), the behavior is “undefined”. All you know, when a flush is not triggered explicitly, is that the data will be written eventually. But the exact moment may depend on various factors. The flush may happen because the buffer ran full and thus the system needed to flush it. It may also happen because the system flushed the buffer after a certain timeout. It may even depend on the system load or access pattern…
(…or on the phase of the moon)
Let me explain this in more detail. I do appreciate what you’re trying to explain, but I really don’t think we’re seeing eye to eye.
I launch the program that generates the log file I’m trying to watch. I then navigate to the folder where the log is generated using Windows Explorer. I can click the refresh button periodically and the “file size” attribute increases consistently and as expected – almost as if the program was writing its buffer regularly.
Earlier today, if I had the monitoring program running while I did these refreshes, it would register a change in the file. If I did not refresh the explorer window, it would not register anything. This was the case 100% of the time, and I did it a good hundred times. So, at first I thought maybe it was a limitation of QFileSystemWatcher, so I created a new text file, opened it in Notepad, closed explorer completely, and watched the new text file. Whenever I changed the text and saved it, QFileSystemWatcher reported the change immediately, independent of Windows Explorer.
It was this that led me to think the other application must not flush its buffer unless Windows Explorer registers some kind of event to request a refresh. I am trying to figure out what event Explorer registers, and if I can register the same event on a timer so I can get the buffer to write to file regularly.
Even more strangely, at the time of my last post, QFileSystemWatcher stopped notifying me of changes to the file, even though explorer continued to show file size increases exactly the same as before. When I executed the exact same process, using the same source, on a separate machine running the same OS, it worked as it did earlier. I am currently scouring my source trying to figure out what is going on.
If you understood what I meant all along, and I’m somehow being dense, I apologize. I just felt I wasn’t being clear enough.
If you want to see the source, and you have a lot of free time (lol), it’s all available on GitHub. [github.com]
If it helps, the applicable code should be in main.cpp and filemon.h
Well, if we assume that the “writing” application does not explicitly flush its file handle after each Write operation (and usually you don’t do that, for performance reasons), then the exact moment when the effect of the Write operation will become “visible” in the file system is undefined. And until the flush has actually happened, QFileSystemWatcher (and thus the “monitoring” application) has no chance to recognize the change.
If we further assume that the buffering actually happens on the operating system level (and not on the “application” or “runtime” level), then only Windows itself decides when the data will be flushed. It may delay the flush until it thinks the right time has arrived. I’m not exactly sure why refreshing the view in Windows Explorer apparently triggers the flush. I can only speculate here. Probably Windows Explorer will “re-scan” the directory when you do a refresh. In that case, Windows Explorer (which is just an application using the Win32 API) will probably query the info of all files in that directory. And, maybe, when the info of some file is queried, Windows will flush all pending Write operations for that file before it returns the requested info. That would explain why refreshing the view in Windows Explorer makes the pending changes “visible” to the QFileSystemWatcher.
Still, there is absolutely no guarantee that querying the info of a file flushes all of its pending writes. Even if it can be observed on your system, I don’t think it’s documented anywhere. And thus it may be subject to change at any time. Or it may only happen under certain conditions, which would explain why the behavior has changed…
Also note that the QFileSystemWatcher probably does not continuously query the file info in order to detect changes (would be far too inefficient!), but instead uses some “notification” API of the underlying OS.
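If the notifications never arrive, one fallback is to poll the file yourself, e.g. from a QTimer slot. The sketch below is plain C++ so it can be tried without Qt; the class name SizePoller and its methods are made up for this example. It opens the file itself (rather than listing the directory) and compares the length with the previous poll. Whether opening the file is enough to make Windows publish pending writes is purely an assumption here and, as said above, not documented behavior.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical helper: remembers the last observed size of one file and
// reports whether it differs on the next poll.
class SizePoller {
public:
    explicit SizePoller(std::string path) : path_(std::move(path))
    {
        assert(!path_.empty()); // debug-only sanity check
    }

    // Opens the file read-only, positions at the end and reads the offset.
    // Returns true if the size differs from the previous poll. The very first
    // successful poll also returns true, because no size was known before.
    // Returns false if the file cannot be opened.
    bool poll()
    {
        std::ifstream in(path_, std::ios::binary | std::ios::ate);
        if (!in)
            return false;
        const long long size = static_cast<long long>(in.tellg());
        const bool changed = (size != lastSize_);
        lastSize_ = size;
        return changed;
    }

    // Size seen by the most recent successful poll, or -1 if none yet.
    long long lastSize() const { return lastSize_; }

private:
    std::string path_;
    long long lastSize_ = -1;
};
```

In a Qt program you would call poll() from a slot connected to QTimer::timeout and re-read the log whenever it returns true. The obvious cost is that you trade the efficiency of a notification API for a fixed polling interval.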
Well, the good news is I figured out what was causing the inconsistency I mentioned in later posts. Not how to fix it yet, but I know what the culprit is. I’ll figure it out eventually.
And yeah, I know QFileSystemWatcher uses API calls to do its task, I’m just not sure which. I may end up using the API natively to get the control I need. I’d prefer not to; it’s been a long time since I’ve done anything Windows API related.
I’ll get back with any success.
I think the only correct and reliable solution here would be changing the “writing” application to properly trigger a flush after each Write operation – at least if that application needs to ensure that the result of the Write becomes visible to other applications as soon as possible. I don’t know which kind of information you are passing between the two applications. But if you need to pass data from one application to another one in “real time”, you may get much better results by using a QSharedMemory for the data transfer instead of a file. The two applications could be synchronized with a QSystemSemaphore. I have implemented inter-process communication this way…
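To make the first suggestion concrete, here is a minimal sketch of a writer that flushes after every line, assuming you could change the writing application at all. FlushingLog is a made-up name, not something from either program. Note that the flush only moves the data out of the application’s own buffer and hands it to the OS; any caching below that level is a separate layer.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical log writer that pushes every line out of the stream buffer
// immediately instead of waiting for the buffer to fill up.
class FlushingLog {
public:
    explicit FlushingLog(const std::string& path)
        : out_(path, std::ios::app) {}

    bool isOpen() const { return out_.is_open(); }

    // Appends one line and flushes it right away.
    void writeLine(const std::string& line)
    {
        assert(out_.is_open()); // debug-only check that the file was opened
        out_ << line << '\n' << std::flush;
    }

private:
    std::ofstream out_;
};
```

With this, a second stream opened on the same file can read a line right after writeLine() returns, even while the log is still open. The price is one write call per line, which is exactly the overhead that buffering normally avoids.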
Okay, so, after several days of trying to figure out the source of the problem, I conclusively determined that the problem lies in the Windows Write Cache buffer, NOT the application buffer. Sadly, I cannot turn off the Write Cache Buffer on my disk (virtual, RAID 10E, not supported by driver, evidently), so I had to find a way to flush the buffer. Calling FlushFileBuffers() didn’t work, because that requires GENERIC_WRITE access to the file, which I cannot get, since the application generating the file doesn’t open it with the FILE_SHARE_WRITE share mode.
So, I scoured the internet, and it was far harder than it had any right to be, but here is the solution, if anyone ever has a similar issue. It’s in C#, but the answer is valid for anything. The exact specifications of the problem are well spelled out, and his findings with Explorer agree with mine. Link [stackoverflow.com]
Once I get the completed code, I’ll post it here.