
JFFS1 async power down problem fix patch.



Hi Folks,

The attached patch has successfully completed 456 async power-down cycles
(and is still going strong). After each power restore, 100 data files
containing random binary data (CRC protected) were checked for data
validity. The log from every power-down was also checked manually for any
"problem printk's".

Additionally, the jffs f/s was being mounted as root, so data corruption in
quite a lot of other files would prevent the f/s from being mounted and the
"checkfs" test from being restarted (at which point the testing would
automatically halt).

This represents at least an order of magnitude improvement over the code
that this patch applies to (in the CVS). With that original code, I got
problems within 10 power cycles. (Plus, the test of the patched code is
still going on.)

The track that I've taken is as follows:

1.  Minimal code changes; that's why I tried to avoid the two pass scan
algorithm. That would have meant quite a lot of changes, and I wanted to
see if easier stuff would work.

2.  It is still 2 pass, in the sense that I scan for "bit flipping" sectors
over the entire flash even before getting into reading inodes/free space
etc. Any "bit flipping" sectors are then erased before the actual scanning
of the data on the flash is started.

3. There is still a check for these bit flipping sectors during the actual
scan of data on the flash. This test simply checks for dirty space being
allocated for 0 bytes on the flash (a classic symptom of "bit flipping").
I *have* encountered 1 power-down case in 456 cycles where the original
pre-scan (reading the flash 4 times) missed a bit flipping sector. If this
is encountered, rather than figure out where we are in the scan process and
try to recover, I just erase the offending sector and return with -EAGAIN,
at which point jffs_build_fs() frees all allocated memory and retries
jffs_scan_flash() one more time.

4. During the scan, I only accept at most 2 free spaces, and each MUST be
at least 1 erase sector long. This fixes the problem of the head of the log
starting on an unaligned offset (after a small run of random/dirty 0xff's
was accepted as free in the original code).
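
For anyone who wants the idea behind points 3 and 4 without opening the
patch, here is a rough user-space sketch. It is NOT the code in the patch.
Every sketch_* name and the 16-byte sector size are made up for
illustration; only jffs_build_fs(), jffs_scan_flash() and -EAGAIN come from
the description above. What the patch does with a third free area is not
stated here, so the sketch simply assumes it is counted as dirty.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 16u /* hypothetical erase sector size */
#define SKETCH_MAX_FREE    2   /* point 4: accept at most 2 free spaces */

/*
 * Point 4: walk the flash image and classify every run of 0xff bytes.
 * A run is accepted as free only if it is at least one erase sector long
 * and fewer than SKETCH_MAX_FREE runs have been accepted so far. Any other
 * run is added to *dirty_bytes (assumption, see the note above).
 * Returns the number of accepted free areas.
 */
int sketch_count_free_areas(const uint8_t *flash, size_t len,
                            size_t *dirty_bytes)
{
    size_t i = 0;
    int free_areas = 0;

    *dirty_bytes = 0;
    while (i < len) {
        size_t start;
        size_t run;

        if (flash[i] != 0xff) {
            i++;
            continue;
        }
        start = i;
        while (i < len && flash[i] == 0xff)
            i++;
        run = i - start;

        if (run >= SKETCH_SECTOR_SIZE && free_areas < SKETCH_MAX_FREE)
            free_areas++;
        else
            *dirty_bytes += run;
    }
    return free_areas;
}

/* Hypothetical stand-in for jffs_scan_flash(): 0 on success, -EAGAIN when
 * a bit flipping sector was found (and erased) in the middle of the scan. */
typedef int (*sketch_scan_fn)(void *ctx);

/*
 * Point 3: stand-in for the retry in jffs_build_fs(). On -EAGAIN the real
 * code frees everything the failed scan allocated; the sketch just calls
 * the scan one more time.
 */
int sketch_build_fs(sketch_scan_fn scan, void *ctx)
{
    int ret = scan(ctx);

    if (ret == -EAGAIN)
        ret = scan(ctx); /* retry exactly once, from a clean state */
    return ret;
}

/* Demo scan for trying the sketch out: the first call pretends to hit a
 * bit flipping sector, later calls succeed. ctx points to a call counter. */
int sketch_demo_scan(void *ctx)
{
    int *calls = ctx;

    return (*calls)++ == 0 ? -EAGAIN : 0;
}
```

Restarting the whole scan costs an extra pass over the flash, but it means
there is no half-built in-memory state to untangle after the erase.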

Please review the patch and fire those comments away!

Thanks

Vipin


David Woodhouse wrote:

> On Wed, 7 Mar 2001, Alan Cox wrote:
>
> > If its a problem then lose 8 bytes from the start of each erase area
> > to write a 'Yes I finished erasing this' signature ?
>
> For JFFS2 that's almost trivial, so I'll do it as soon as I've fixed any
> showstopper bugs which may have turned up overnight.
>
> For JFFS1 it's less so, because it would break up the free area. JFFS1
> doesn't currently handle erase blocks individually, or have code to write
> a node which runs to the end of one block, and then write another node
> with the remainder of the data to be written in the next block, etc.
>
> We could fix that, I suppose, without too much intrusion - it could
> probably be contained almost entirely in jffs_write_node(). Not sure about
> the accounting/garbage_collection issues that may result.
>
> I'm reluctant to go _too_ far in making such changes to JFFS1. That's what
> JFFS2 is supposed to be for - taking the design of JFFS1 and fixing the
> few things which were wrong with it.
>
> --
> dwmw2

