Author: xun sun
Date:  
To: yaffs
Subject: [Yaffs] please help check if this is a potential BUG about retire blocks when writes chunk error(is it right that yaffs_handle_chunk_wr_error default value of erased_ok should be 1 ?)
Hi Charles, hi all,
I use Micron NAND with the yaffs2 module, which runs on top of the Linux
MTD driver.
The NAND shows file corruption and many bad blocks several months after
installation.
Based on our debug messages, we think there is a bug in how yaffs2
handles write errors from the lower layer. Here are the details:

1. yaffs version: patched up to commit 8a3135f
2. The code that may contain the bug:

file: yaffs_guts.c
 488 static int yaffs_write_new_chunk(struct yaffs_dev *dev,
 489 +--  2 lines: const u8 *data,------
 491 {
 492   int attempts = 0;
 493   int write_ok = 0;
 494   int chunk;
 495
 496   yaffs2_checkpt_invalidate(dev);
 497
 498   do {
 499     struct yaffs_block_info *bi = 0;
 500     int erased_ok = 0;
 501
......
 560     if (write_ok != YAFFS_OK) {
 561       /* Clean up aborted write, skip to next block and
 562        * try another chunk */
 563       yaffs_handle_chunk_wr_error(dev, chunk, erased_ok);
 564       continue;
 565     }


and:
 232 static void yaffs_handle_chunk_wr_error(struct yaffs_dev *dev, int nand_chunk,
 233           int erased_ok)
 234 {
 235   int flash_block = nand_chunk / dev->param.chunks_per_block;
 236   struct yaffs_block_info *bi = yaffs_get_block_info(dev, flash_block);
 237
 238   yaffs_handle_chunk_error(dev, bi);
 239
 240   if (erased_ok) {
 241     /* Was an actual write failure,
 242      * so mark the block for retirement.*/
 243     bi->needs_retiring = 1;
 244     yaffs_trace(YAFFS_TRACE_ERROR | YAFFS_TRACE_BAD_BLOCKS,
 245       "**>> Block %d needs retiring", flash_block);
 246   }


3. Why we think it is a bug:
  In the code above, yaffs2 first verifies that the block being written
has been erased (it checks only the first chunk and skips the check for
the other chunks of the same block).
  It then tries to write the chunk to NAND. The logic should be that any
chunk write error leads to the block being retired.
  Currently, because of line 500 (int erased_ok = 0;), a write failure on
any of the remaining chunks (i.e. chunks 1..63) will not retire the block.
  So I think the default value of erased_ok should be 1, and a chunk
write failure caused by power loss should instead be detected when
scanning the whole NAND or when mounting yaffs2.
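
To make the reasoning above concrete, here is a minimal standalone model of
the suspected path. It is only a sketch, not the real yaffs code: the struct
and the model_* names are hypothetical, and it assumes that the erased check
runs (and passes) only for the first chunk written in a block, and that the
low-level write then fails.

```c
#include <stdio.h>

#define MODEL_OK 1

/* Hypothetical, simplified stand-in for struct yaffs_block_info. */
struct model_block {
    int skip_erased_check; /* assumed set after the first chunk passed the check */
    int needs_retiring;
};

/* Mirrors only the quoted branch of yaffs_handle_chunk_wr_error(). */
static void model_handle_wr_error(struct model_block *bi, int erased_ok)
{
    if (erased_ok)
        bi->needs_retiring = 1;
}

/*
 * Models one failed write attempt.
 * chunk_in_block:    0 for the first chunk of the block, >0 for later chunks.
 * erased_ok_default: the initial value given to erased_ok (line 500).
 * Returns 1 if the block ends up marked for retirement, 0 otherwise.
 */
static int model_failed_write(int chunk_in_block, int erased_ok_default)
{
    struct model_block bi = { 0, 0 };
    int erased_ok = erased_ok_default;

    /* Assumption: only the first chunk of a block gets the erased check. */
    bi.skip_erased_check = (chunk_in_block > 0);

    if (!bi.skip_erased_check)
        erased_ok = MODEL_OK; /* assume the check ran and passed */

    /* Assumption: the write itself fails, so the error handler is called. */
    model_handle_wr_error(&bi, erased_ok);

    return bi.needs_retiring;
}
```

In this model, model_failed_write(0, 0) returns 1, model_failed_write(5, 0)
returns 0 (the case we are worried about), and model_failed_write(5, 1)
returns 1 (the proposed default).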


If I have made any mistake, please point it out. Thanks,

Best Regards,

SAM