Ramblings on NAND flash support.
p2@xxxxxxx.be said:
> I have the impression the current code assumes it can all free space
> as 1 contiguous block (possibly by forcing garbage collection). If the
> NAND flash develops a bad block, this is obviously not true.
JFFS2 doesn't assume that, JFFS1 does. In JFFS2, we deal with erase blocks
individually, and you can just stick the offending block on the bad_list
and never be troubled by it again.
That's the easy part.
What's more difficult is dealing with the write-cycle restrictions on NAND
flash. That means firstly that you can't go back and mark nodes obsolete,
so you have to make sure your garbage-collection doesn't delete, say, a
deletion dirent before the real dirent which it obsoletes has been erased.
Because we don't go through the flash linearly, it's possible that we come
to garbage collect the deletion dirent _before_ the earlier node that it
obsoletes.
The good news on that front is that I already dealt with the _really_ hairy
case which required fundamental changes to the core code (and lots of
banging of head against wall) - the case where you truncate a file, later
extend it leaving a hole, and don't want the old data to 'show through' the
holes. The code to deal with that has been in there from the start, even
though it wasn't strictly necessary on NOR flash. Fixing it required fairly
fundamental changes to how we do truncation, holes, etc.
You need to fix the deletion dirent case - that's not hard. Currently we
have this comment in gc.c::jffs2_garbage_collect_deletion_dirent():
    /* FIXME: When we run on NAND flash, we need to work out whether
       this deletion dirent is still needed to actively delete a
       'real' dirent with the same name that's still somewhere else
       on the flash. For now, we know that we've actually obliterated
       all the older dirents when they became obsolete, so we didn't
       really need to write the deletion to flash in the first place.
    */
In fact, there's a simple but naïve method for this. Look through the
physical nodes belonging to the parent directory. For each one that's
marked (in the jffs2_raw_node_ref in memory) obsolete, read it and check
whether it is obsoleted by the one you're trying to GC (i.e. check if the
name matches). Repeat. If you find one that _is_ still obsoleted by the
deletion dirent you're trying to GC, you need to write that deletion dirent
out to the flash again. If you don't find one, you can just let it get
erased as we currently do. 20-odd lines of code.
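Something like the following is what I have in mind, as a rough sketch only.
The struct and function names are made up for illustration and are not the
real JFFS2 types; in particular, the is_dirent and name fields stand in for
what you'd learn by actually reading the obsolete node back from the flash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an in-memory node reference.
 * Not the real jffs2_raw_node_ref. */
struct sketch_ref {
    bool obsolete;            /* marked obsolete in memory only */
    bool is_dirent;           /* found out by reading the node from flash */
    const char *name;         /* name stored in that on-flash dirent */
    struct sketch_ref *next;  /* next physical node of the same directory */
};

/* Returns true if some obsolete dirent with the same name is still
 * physically on the flash, i.e. the deletion dirent must be written
 * out again rather than dropped. */
static bool deletion_dirent_still_needed(const struct sketch_ref *nodes,
                                         const char *name)
{
    assert(name != NULL);
    for (const struct sketch_ref *r = nodes; r != NULL; r = r->next) {
        if (!r->obsolete)
            continue;   /* live nodes are handled by the normal paths */
        if (!r->is_dirent)
            continue;   /* only dirents can be obsoleted by a deletion dirent */
        if (strcmp(r->name, name) == 0)
            return true;
    }
    return false;
}
```

A real version would also have to skip the deletion dirent itself and
probably compare versions; I've left that out to keep the sketch short.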
You also have to deal with the restriction on write cycles during normal
operation - you can't just write lots and lots of tiny nodes consecutively,
in individual write cycles, and expect it to work. If you just write lots
of dirents consecutively, you could quite feasibly fit more than ten in a
512-byte page. And some NAND flash chips allow even fewer writes per page
than that.
I think the best approach here is to keep a write cache the size of one
NAND page (512 bytes), and to _actually_ write to the flash only when that
becomes full. This way, you can use the helpful hardware ECC built in to
devices like the DiskOnChip and some SmartMedia adaptors, which is
block-based.
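A minimal sketch of such a write cache, assuming a 512-byte page and a
caller-supplied routine that programs one whole page. All of the names here
(sketch_wbuf, sketch_page_writer and so on) are invented for the example;
none of this exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 512u   /* assumed NAND page size */

/* Hypothetical callback that programs exactly one page; 0 on success. */
typedef int (*sketch_page_writer)(void *ctx, uint32_t page_ofs,
                                  const uint8_t *page);

struct sketch_wbuf {
    uint8_t buf[SKETCH_PAGE_SIZE];
    size_t len;                 /* bytes currently held in buf */
    uint32_t page_ofs;          /* flash offset the buffered page will land on */
    sketch_page_writer write_page;
    void *ctx;
};

/* Append node data to the cache; the flash is touched only when a whole
 * page has accumulated, so each page sees a single write cycle. */
static int sketch_wbuf_write(struct sketch_wbuf *w, const uint8_t *data,
                             size_t n)
{
    assert(w != NULL && w->write_page != NULL);
    while (n > 0) {
        size_t room = SKETCH_PAGE_SIZE - w->len;
        size_t chunk = n < room ? n : room;

        memcpy(w->buf + w->len, data, chunk);
        w->len += chunk;
        data += chunk;
        n -= chunk;

        if (w->len == SKETCH_PAGE_SIZE) {
            int err = w->write_page(w->ctx, w->page_ofs, w->buf);
            if (err)
                return err;     /* caller has to recover the cached data */
            w->page_ofs += SKETCH_PAGE_SIZE;
            w->len = 0;
        }
    }
    return 0;
}

/* Demo writer for trying the sketch out: just counts pages written. */
static int sketch_count_writer(void *ctx, uint32_t page_ofs,
                               const uint8_t *page)
{
    (void)page_ofs;
    (void)page;
    ++*(int *)ctx;
    return 0;
}
```

With this, ten 48-byte nodes written back to back cost no write cycles at
all until the eleventh one fills the page.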
This approach gives you a couple of interesting problems to solve, but
they're not that bad. First, you have to ensure that you don't actually
erase an older node which is obsoleted by a node that's still in the write
cache. You have to make sure the new node is entirely on the medium before
erasing the old node which it obsoletes. That's not too hard to achieve. If
the new node that finally obsoletes the last live node in an eraseblock is
still in the write cache, you don't stick that eraseblock on the
erase_pending_list immediately. Instead you stick it on a new
erase_pending_wbuf list, and the code that flushes the write buffer moves it
to the erase_pending_list when it's done.
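Roughly, and with the lists reduced to a per-block state field so the sketch
stays short (the type and function names are hypothetical, and the real code
would use proper list heads):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which list an eraseblock is conceptually sitting on. */
enum sketch_list {
    SK_DIRTY,               /* still has live nodes */
    SK_ERASE_PENDING_WBUF,  /* erasable once the write cache is flushed */
    SK_ERASE_PENDING        /* safe to erase now */
};

struct sketch_eraseblock {
    enum sketch_list on_list;
};

/* Called when the last live node in jeb has just been obsoleted. */
static void sketch_block_fully_obsolete(struct sketch_eraseblock *jeb,
                                        bool obsoleting_node_in_wbuf)
{
    assert(jeb != NULL);
    jeb->on_list = obsoleting_node_in_wbuf ? SK_ERASE_PENDING_WBUF
                                           : SK_ERASE_PENDING;
}

/* Called after the write cache has reached the medium successfully:
 * every block that was waiting on it may now be erased. */
static void sketch_wbuf_flushed(struct sketch_eraseblock *blocks,
                                size_t nblocks)
{
    for (size_t i = 0; i < nblocks; i++) {
        if (blocks[i].on_list == SK_ERASE_PENDING_WBUF)
            blocks[i].on_list = SK_ERASE_PENDING;
    }
}
```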
Secondly, you need to deal with write errors when flushing the write cache.
You haven't lost any data - either your node is entirely in the write cache
and can just be written out elsewhere, or only the end of it is in the write
cache and you know that the beginning of it is on the flash OK - in which
case you can read it back and write the whole thing out elsewhere.
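To illustrate the two cases, here's a sketch that reassembles a node after a
failed flush. It models the flash as a plain byte array and assumes the node
ends inside the cached page; the function name and parameters are invented
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 512u   /* assumed NAND page size */

/* Rebuild a node of node_len bytes starting at flash offset node_ofs,
 * given that the page at wbuf_ofs never made it to the medium and is
 * still held in wbuf. Returns how many bytes came from the flash. */
static size_t sketch_recover_node(const uint8_t *flash, const uint8_t *wbuf,
                                  size_t wbuf_ofs, size_t node_ofs,
                                  size_t node_len, uint8_t *out)
{
    size_t head = 0;

    /* The sketch only handles nodes that end within the cached page. */
    assert(node_ofs + node_len <= wbuf_ofs + SKETCH_PAGE_SIZE);

    if (node_ofs < wbuf_ofs) {
        /* The beginning is already on the flash OK: read it back. */
        head = wbuf_ofs - node_ofs;
        if (head > node_len)
            head = node_len;
        memcpy(out, flash + node_ofs, head);
    }
    if (head < node_len) {
        /* The rest (or all of it) is still in the write cache. */
        memcpy(out + head, wbuf + (node_ofs + head - wbuf_ofs),
               node_len - head);
    }
    return head;
}
```

The reassembled node in out can then be written out elsewhere as a whole.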
Oh, and you'll need to implement fsync() I suppose. That won't be hard
either - you can just check whether any nodes belonging to the inode
being synced are currently in the write cache. When you need to flush the
write cache to honour fsync() I think it's probably best just to trigger
garbage collection to _fill_ the write cache and make it get flushed
naturally, rather than padding the block.
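The check itself could be as simple as this sketch, which assumes the write
cache keeps a small table of the inode numbers whose nodes it currently
holds. That table and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CACHED 16    /* arbitrary bound for the sketch */

/* Inode numbers of the nodes currently sitting in the write cache. */
struct sketch_wbuf_inodes {
    uint32_t ino[SKETCH_MAX_CACHED];
    size_t count;
};

/* fsync() only has to force a flush if the inode being synced has
 * at least one node that hasn't reached the medium yet. */
static bool sketch_fsync_needs_flush(const struct sketch_wbuf_inodes *w,
                                     uint32_t ino)
{
    assert(w != NULL && w->count <= SKETCH_MAX_CACHED);
    for (size_t i = 0; i < w->count; i++) {
        if (w->ino[i] == ino)
            return true;
    }
    return false;
}
```

If it returns true, that's the point at which you'd kick garbage collection
to fill the rest of the page.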
You'll need to move the CLEANMARKER node into the 'spare' area of your
flash chip, which is trivial to do.
Altogether, it doesn't seem quite as complicated as I once thought it
was... Is that because I've missed out something obvious?
--
dwmw2