[Yaffs] A possible bug when handle nand write chunk error.

Author: xun sun
Date:  
To: yaffs
Subject: [Yaffs] A possible bug when handle nand write chunk error.
Hi Charles, hi all,
I use Micron NAND with the yaffs2 module running on top of the Linux MTD
driver. After several months of testing, we see file corruption and many
bad blocks. Based on our debug messages, we think there is a bug in how
yaffs2 handles chunk write errors reported by the lower layer. Here are
the details:

1. yaffs version: patched up to commit 8a3135f
2. The code that may contain the bug:

file: yaffs_guts.c
 488 static int yaffs_write_new_chunk(struct yaffs_dev *dev,
 489 +--  2 lines: const u8 *data, ... (lines folded in the editor)
 491 {
 492   int attempts = 0;
 493   int write_ok = 0;
 494   int chunk;
 495
 496   yaffs2_checkpt_invalidate(dev);
 497
 498   do {
 499     struct yaffs_block_info *bi = 0;
 500     int erased_ok = 0;
 501
......
 560     if (write_ok != YAFFS_OK) {
 561       /* Clean up aborted write, skip to next block and
 562        * try another chunk */
 563       yaffs_handle_chunk_wr_error(dev, chunk, erased_ok);
 564       continue;
 565     }


and:
 232 static void yaffs_handle_chunk_wr_error(struct yaffs_dev *dev, int nand_chunk,
 233           int erased_ok)
 234 {
 235   int flash_block = nand_chunk / dev->param.chunks_per_block;
 236   struct yaffs_block_info *bi = yaffs_get_block_info(dev, flash_block);
 237
 238   yaffs_handle_chunk_error(dev, bi);
 239
 240   if (erased_ok) {
 241     /* Was an actual write failure,
 242      * so mark the block for retirement.*/
 243     bi->needs_retiring = 1;
 244     yaffs_trace(YAFFS_TRACE_ERROR | YAFFS_TRACE_BAD_BLOCKS,
 245       "**>> Block %d needs retiring", flash_block);
 246   }


3. Why we think it is a bug:
  In the code above, yaffs2 first verifies that the block being written
has been erased (it checks only the first chunk and skips the check for
the other chunks of the same block).
  It then tries to write the chunk to NAND. The logic should be that any
chunk write error leads to the block being retired.
  Currently, because of line 500 (int erased_ok = 0;), a write failure on
any of the remaining chunks (i.e. chunks 1..63) does not retire the block.
  So I think the default value of erased_ok should be 1, and a chunk write
failure caused by power loss should be correctly detected when scanning
the whole NAND or when mounting yaffs2.
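
  To make the reasoning above concrete, here is a minimal stand-alone
sketch of the control flow as I understand it. It is not the yaffs code:
struct block_info, handle_chunk_wr_error() and
failed_write_retires_block() are hypothetical, cut-down stand-ins, the
64 chunks per block comes from the "1..63" above, and the rule that only
the first chunk of a block gets the erased check is an assumption taken
from the description, since the real lines 502..559 are not quoted here.

```c
#include <assert.h>

#define CHUNKS_PER_BLOCK 64  /* assumed, matches "chunks 1..63" above */

/* Hypothetical stand-in for the parts of struct yaffs_block_info used here. */
struct block_info {
    int needs_retiring;     /* set when the block should be retired */
    int skip_erased_check;  /* assumed: non-zero for every chunk but the first */
};

/* Mirrors the quoted yaffs_handle_chunk_wr_error(): retire only if erased_ok. */
static void handle_chunk_wr_error(struct block_info *bi, int erased_ok)
{
    if (erased_ok)
        bi->needs_retiring = 1;
}

/*
 * Model one FAILED write to nand_chunk.
 * erased_ok_default is the initial value of erased_ok
 * (0 in the quoted code, 1 in the proposed change).
 * Returns the block's needs_retiring flag after error handling.
 */
static int failed_write_retires_block(int nand_chunk, int erased_ok_default)
{
    struct block_info bi = { 0, 0 };
    int erased_ok = erased_ok_default;
    int chunk_in_block;

    assert(nand_chunk >= 0);
    chunk_in_block = nand_chunk % CHUNKS_PER_BLOCK;

    /* Assumption: only the first chunk of a block is checked for erasure. */
    bi.skip_erased_check = (chunk_in_block != 0);
    if (!bi.skip_erased_check)
        erased_ok = 1;  /* pretend the erased check ran and passed */

    /* Pretend the low-level write failed, then run the error path. */
    handle_chunk_wr_error(&bi, erased_ok);
    return bi.needs_retiring;
}
```

  Under these assumptions, failed_write_retires_block(5, 0) returns 0
(the block is not marked for retirement), while
failed_write_retires_block(5, 1) returns 1. A failure on chunk 0 returns
1 in both cases, because the erased check has set erased_ok before the
write.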


If I have made any mistake, please point it out. Thanks,

Best Regards,

SAM