Re: [Yaffs] bad block management

Author: bpqw
Date:  
To: Charles Manning, bpqw
CC: yaffs@lists.aleph1.co.uk
Subject: Re: [Yaffs] bad block management
Hi Charles,
That is clear to me.
Currently there is no YAFFS_ECC_RESULT_REFRESH, so we need to add this case. The retire threshold, as you recommend, should be (refresh_threshold + ecc_strength + 1) / 2.
I plan to create a patch for this.
Do you have any further suggestions?

Br
White Ding
____________________________
EBU APAC Application Engineering
Tel:86-21-38997078
Mobile: 86-13761729112
Address: No 601 Fasai Rd, Waigaoqiao Free Trade Zone Pudong, Shanghai, China

-----Original Message-----
From: Charles Manning [mailto:cdhmanning@gmail.com]
Sent: Thursday, August 07, 2014 5:44 AM
To: bpqw
Cc:
Subject: Re: [Yaffs] bad block management

On Wed, Aug 6, 2014 at 7:26 PM, bpqw <> wrote:
> Hi Charles,
> We recommend that if the bitflip count exceeds the threshold, we just
> refresh the block rather than retire it.
> So is it reasonable to judge a block as bad just because its bitflips
> have exceeded mtd->bitflip_threshold three times?
>


Hello White Ding

I certainly understand where you are coming from here.

The concern this raises is that we then lose the safety net of retiring blocks before they go bad.

What we really need is two thresholds in mtd: one for refresh only, and one for refresh plus the retiring logic.

I suppose we could use bitflip_threshold and ecc_strength for those, or maybe something slightly different to give more margin.
Unfortunately the threshold and strength are often set to be the same.

So I would like to use a derived value called, say, retire_limit, where
retire_limit = (refresh_threshold + ecc_strength + 1) / 2
(refresh_threshold here being mtd's bitflip_threshold).
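A minimal sketch of that calculation in C, assuming unsigned integer division as in the formula above. The function name retire_limit() is hypothetical and is not an existing Yaffs or mtd function:

```c
/*
 * Hypothetical helper, not part of Yaffs or mtd.
 * Returns the midpoint between the refresh threshold and the ECC
 * strength, rounded up when the sum is odd.
 */
unsigned int retire_limit(unsigned int refresh_threshold,
                          unsigned int ecc_strength)
{
    return (refresh_threshold + ecc_strength + 1) / 2;
}
```

When the threshold and strength are set to the same value, this returns that same value, so the refresh-only band is empty in that case.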


For that to work Yaffs needs another level of ECC error.

How about this:
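enum yaffs_ecc_result {
    YAFFS_ECC_RESULT_UNKNOWN,
    YAFFS_ECC_RESULT_NO_ERROR,
    YAFFS_ECC_RESULT_REFRESH,   /* new: corrected, refresh only */
    YAFFS_ECC_RESULT_FIXED,     /* corrected, refresh and apply retiring logic */
    YAFFS_ECC_RESULT_UNFIXED    /* could not be corrected */
};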

Then we can have something like, say:

if errors < bitflip_threshold --> NO_ERROR
else if errors < retire_limit --> REFRESH
else if not corrupted --> FIXED
else --> UNFIXED

So for a concrete example, let us say we have ecc_strength = 8 and bitflip_threshold = 4.
Then we would have retire_limit = (4 + 8 + 1) / 2 = 6 (integer division).

So we would get the following behaviour:

* 0 to 3 errors --> NO_ERROR
* 4 to 5 errors --> REFRESH (refresh only)
* 6 to 8 errors --> FIXED (refresh and increment chunk_error_strikes)
* more than 8 errors (not correctable) --> UNFIXED.
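The banding above could be sketched in C roughly as follows. This is only an illustration: classify_ecc_result() and its uncorrectable parameter are hypothetical names, the enum is the proposed one from this thread (REFRESH does not exist in Yaffs today), and checking the uncorrectable case first is a choice made here because the error count is not meaningful when correction fails.

```c
#include <stdbool.h>

/* Proposed enum from this thread, repeated so the sketch is self-contained. */
enum yaffs_ecc_result {
    YAFFS_ECC_RESULT_UNKNOWN,
    YAFFS_ECC_RESULT_NO_ERROR,
    YAFFS_ECC_RESULT_REFRESH,
    YAFFS_ECC_RESULT_FIXED,
    YAFFS_ECC_RESULT_UNFIXED
};

/*
 * Hypothetical helper: map a read result onto the proposed levels.
 *   errors            - number of bitflips the ECC corrected
 *   uncorrectable     - true if the ECC could not correct the data
 *   bitflip_threshold - below this, nothing needs doing
 *   retire_limit      - at or above this, retiring logic also applies
 */
enum yaffs_ecc_result classify_ecc_result(unsigned int errors,
                                          bool uncorrectable,
                                          unsigned int bitflip_threshold,
                                          unsigned int retire_limit)
{
    if (uncorrectable)
        return YAFFS_ECC_RESULT_UNFIXED;
    if (errors < bitflip_threshold)
        return YAFFS_ECC_RESULT_NO_ERROR;
    if (errors < retire_limit)
        return YAFFS_ECC_RESULT_REFRESH;   /* refresh only */
    return YAFFS_ECC_RESULT_FIXED;         /* refresh and count a strike */
}
```

With bitflip_threshold = 4 and retire_limit = 6, this gives NO_ERROR for 0 to 3 errors, REFRESH for 4 and 5, and FIXED for 6 to 8, which matches the list above.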

Does that make sense?

Charles