On Friday 01 July 2005 10:41, Charles Manning wrote:
> On Thursday 30 June 2005 17:43, Beat Morf wrote:
> > Hi
> >
> > I am using YAFFS direct implementation and actually testing some
> > basically requirements.
> >
> > When I write hundreds of files into the root of my YAFFS-FS (let's say
> > 600) and dump them afterwards ('yaffs_opendir()', 'yaffs_readdir()'), I
> > saw, that the 'yaffs_readdir(...)' functions needs much more time for
> > the last files than for the first one.
> >
> > Each call to the 'yaffs_readdir(...)' function will start at the first
> > object within the 'yaffs_DIR'. For knowing which object was allready
> > returned, a separate list of allready showed objects is applied (the
> > objects are identifiable with the ObjectID); Means long times for a lot
> > of files!
> >
> > What is exactly the reason for such a method if I don't need to give
> > them out in an special ordered way?
> >
> > Actually I am using following 'yaffs_xxxxdir(...)' functions with a good
> > performance ('dsc->list' is used for pointing to the next
> > yaffsfs_ObjectListEntry entry):
>
> Thanx Beat
>
> I have had this comment once before from someone using huge directories.
> I believe they got around the problem by caching names in their
> application.
>
> There are some interesting issues associated with directory reading. For
> example, what happens if another thread is writing new files to the
> directory or deleting files from the directory?
>
> The easiest way to do a directory traversal is to keep a pointer to the
> next sibling in the list as we search, but that can go wrong if that object
> gets deleted before you read it:
>
> eg.
>
> 1. Directory has files xx,yy zz.
> 2. Read dir, get xx. Pointer points to yy
> 3. Delete yy
> 4. Now pointer points to a non-existant object!
>
> Another way that does not work is to try remember the number where you are:
> eg.
> 1. Directory has files xx,yy zz.
> 2. Read dir, get xx. Remember we're pointing to item 1  (starting at zero)
> 3. Delete xx
> 4. Now item 1 is zz and we skipped past yy.
>
>
> The code, as is, covers for cuveballs like this but it isn't very
> efficient. This is something that I should look at improving.
>
>
> I have not looked at your code suggestion in detail yet, but I shall.
>
> -- Charles

Having looked at Beat's code I think it would fail under one of the delete 
case I mention.

I spent some time this weekend trying to implement some code using a balanced 
AVL tree instead of a linked list. This would possibly improve the speed but 
things would still degrade with large directories.

I'm now thinking of a scheme that is like Beat'ssuggestion, but is delete 
safe.

Essentially, this will use a pointer into the directory (same as Beat's), but 
it would also add a check to any file deletion from a directory to check 
whether the file is currently being pointed to, and if so move it on to the 
next item in the directory.

Ideas anybody?

-- Charles