
Major JFFS2 bug (?)



Hi David/all,

Power fail testing strikes again (I think :)

I apologize in advance. This is going to be a long mail.

After being convinced that the JFFS2 file system (metadata etc.) was
stable and that the fs was not walking over other static files on the system,
I "enhanced" my checkfs program to check the reliability of data in the
*file being written to* when power fails.

To this end, I made sure that the entire write was done with a single
"write()" system call, and checked that the number of bytes returned
was the number of bytes requested (I never had a problem with this).
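
For reference, the write side of my test does something like this (a
simplified sketch, not the actual checkfs source; I use zlib's crc32()
here just for illustration):

    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <zlib.h>

    /* Simplified sketch of the checkfs write side (illustrative only).
     * The whole payload, trailing CRC included, goes down in one
     * write(), and the return value is checked against the length. */
    int write_test_file(const char *path, size_t payload_len)
    {
        size_t total = payload_len + sizeof(uint32_t);
        unsigned char *buf = malloc(total);
        uint32_t crc;
        ssize_t n;
        size_t i;
        int fd;

        if (!buf)
            return -1;
        for (i = 0; i < payload_len; i++)
            buf[i] = rand() & 0xff;                   /* random data */
        crc = crc32(0L, buf, payload_len);            /* CRC over the data */
        memcpy(buf + payload_len, &crc, sizeof(crc)); /* CRC trails the data */

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            free(buf);
            return -1;
        }
        n = write(fd, buf, total);   /* the one and only write() call */
        close(fd);
        free(buf);
        return (n == (ssize_t)total) ? 0 : -1;  /* never saw a short write */
    }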

With the assumption that if the "write()" system call succeeded the new
data should be available, and if it did not (i.e. power failed during the
write() call) the old data should still be available, I made my "pass"
criterion that ZERO files could have a bad CRC on power-up!
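
The power-up check is along these lines (again just a sketch): every
file must carry either its complete old contents or its complete new
contents, so a single bad CRC anywhere fails the run.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>
    #include <zlib.h>

    /* Returns 0 if the trailing CRC32 matches the file's data, -1
     * otherwise.  The pass criterion is that this returns 0 for every
     * file on power-up. */
    int check_test_file(const char *path)
    {
        uint32_t stored, computed;
        unsigned char *buf;
        long len;
        FILE *f = fopen(path, "rb");

        if (!f)
            return -1;
        fseek(f, 0, SEEK_END);
        len = ftell(f);
        rewind(f);
        if (len < (long)sizeof(uint32_t)) {  /* too short to hold a CRC */
            fclose(f);
            return -1;
        }
        buf = malloc(len);
        if (!buf || fread(buf, 1, len, f) != (size_t)len) {
            fclose(f);
            free(buf);
            return -1;
        }
        fclose(f);
        memcpy(&stored, buf + len - sizeof(uint32_t), sizeof(stored));
        computed = crc32(0L, buf, len - sizeof(uint32_t));
        free(buf);
        return (stored == computed) ? 0 : -1;
    }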

I started with a maximum file size of 20 KBytes (the files contain random
data and have random sizes).

I had a failure within 3 power cycles (or so).

I tried a couple more times. Same result: the last file being written to
when power failed would be corrupt.

On further investigation, I noticed that the "new" file size on JFFS2 was
4096 bytes. Aha! A page size. I am fairly sure JFFS2 is not handling writes
larger than a page size properly. This is bug #1.
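
My mental model of what is happening (a hand-drawn sketch, NOT the real
kernel or JFFS2 code; write_one_node() is a made-up stand-in for "flash
one self-contained node with header, data, CRC and a new version") is
that a large write() gets chopped into page-sized pieces, each of which
is committed independently:

    #include <stddef.h>
    #include <sys/types.h>

    #define PAGE_SIZE 4096

    /* Stand-in for "write one complete, valid JFFS2 node to flash". */
    extern void write_one_node(int fd, const char *data, size_t len);

    ssize_t paged_write(int fd, const char *buf, size_t count)
    {
        size_t done = 0;

        while (done < count) {
            size_t chunk = count - done;
            if (chunk > PAGE_SIZE)
                chunk = PAGE_SIZE;

            /* Each chunk lands on flash as a complete node with its
             * own CRC and version, so a remount will happily accept it
             * even if the rest of the write() never made it. */
            write_one_node(fd, buf + done, chunk);
            done += chunk;
            /* power failure here => file is part new, part old */
        }
        return done;
    }

If that model is right, a failure between iterations leaves exactly the
half-committed state I am seeing.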

OK, I reduced the maximum file size created by the program to 4000
bytes. The system ran for approximately 55 power cycles before hitting a bad
CRC again. Again, it was a file that was being written to when power failed.
This time the size was 2710 bytes (huh?). The CRC was bad. So obviously
neither the older data (with its CRC) was preserved, nor did the new data
(and its CRC) take.

This is bug #2.

The reason I have split these into two separate bugs is that, IMHO, bug #1
is a feature deficiency. The system does (or seems to do) page-sized writes
to JFFS2, *and commits them* by giving them valid CRCs, version IDs, etc.
IMHO this is wrong. Until all the pages (i.e. the data sent down in a single
write) are written to JFFS2, they must not be "committed" *logically* to
the fs.

This may not seem like a big deal, but it is for a power-fail-resilient
system (sorry, I'm not preaching to you, but to some hypothetical person
who may argue with this position :) Consider the case of a config file or
other data structure. Having part of the older struct and part of the new
struct is quite a disaster. We need to retain the older contents, in toto,
until the new one is committed fully. A successful write() to JFFS/2 has
to be a "logically" atomic operation.

Obviously, if it cannot be "physically" an atomic operation, then we need to
maintain a page "n" of "m" count in the node headers along with the version
ID for the node. On a remount (after power fail), a higher-ID node is
accepted *only if all m pages are present with valid CRCs* (or something
similar in logic).
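
In code terms, something like this (just a sketch of the idea; these are
made-up names, not JFFS2's actual node format):

    #include <stdint.h>

    /* Sketch only: field names are invented for illustration. */
    struct node_hdr {
        uint32_t version;   /* version ID, bumped once per write() */
        uint16_t page_no;   /* this node carries page n (0-based)... */
        uint16_t page_cnt;  /* ...of the m pages in that write() */
        uint32_t data_crc;  /* CRC over this node's payload */
    };

    /* Remount rule: while scanning, count how many pages of the newest
     * version passed their CRC check.  Fall back to the older, complete
     * version unless every one of the m pages is accounted for. */
    int newer_version_complete(unsigned valid_pages, const struct node_hdr *h)
    {
        return valid_pages == h->page_cnt;
    }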

Bug #2 is possibly a real bug in the system: there seems to be a minute
window during which corruption can occur even when we are sending down
less than one page of data.

Sorry for what is possibly the longest post in the history of this thread ;)

Vipin
