E2fsprogs Workout

From WorkOutWiki2008

Revision as of 17:39, 27 November 2008 by DhavalGiani (Talk | contribs)

Proposer

Christoph Hellwig

Aneesh Kumar KV

Andreas Dilger

Pre-requisites

Languages, tools to be known

Getting and compiling the code

Links to overall design/architecture

Tasks

simple fix to mke2fs to allow specifying the journal location in the filesystem via "-J offset={blocknumber}"

port "libdisk" from xfsprogs to e2fsprogs so that mke2fs can query the RAID geometry of the underlying device and store it in the superblock when the "-E stride" and "-E stripe_width" options are not given.
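To make the geometry task concrete, here is a minimal sketch of the arithmetic mke2fs would need once it knows the RAID layout. It assumes the usual meaning of the two options: stride is the RAID chunk size expressed in filesystem blocks, and stripe_width is typically stride times the number of data-bearing disks. The function names and the 4096-byte default block size are illustrative assumptions, not part of e2fsprogs or libdisk.

```python
def raid_to_ext_geometry(chunk_bytes, data_disks, block_bytes=4096):
    """Return (stride, stripe_width), both in filesystem blocks.

    chunk_bytes: RAID chunk size in bytes (as a geometry probe might report it).
    data_disks:  number of data-bearing disks (parity disks excluded).
    block_bytes: filesystem block size; 4096 is only an assumed default.
    """
    if chunk_bytes <= 0 or data_disks <= 0 or block_bytes <= 0:
        raise ValueError("sizes and disk count must be positive")
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size is not a multiple of the block size")
    stride = chunk_bytes // block_bytes   # blocks written to one disk per chunk
    stripe_width = stride * data_disks    # blocks in one full stripe
    return stride, stripe_width


def extended_options(chunk_bytes, data_disks, block_bytes=4096):
    """Format the result the way it would be passed to "-E" by hand."""
    stride, width = raid_to_ext_geometry(chunk_bytes, data_disks, block_bytes)
    return f"stride={stride},stripe_width={width}"
```

For example, a 4-disk RAID5 array has 3 data disks; with 64 KiB chunks and 4 KiB blocks, extended_options(64 * 1024, 3) yields "stride=16,stripe_width=48".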

update online resizing to work with the META_BG feature, so that a filesystem can be resized to arbitrary limits. There was an old patch; it could be used as a reference, or updated to the latest kernel.

implement a new bitmap format for e2fsprogs. Ted had previously suggested an rbtree for the bitmaps. Co-ordinate with Val on this one.

per-block checksums for the journal transactions. The goal is to handle a bad transaction checksum by replaying only the good blocks, and marking the filesystem in error only if the bad block(s) are not overwritten by a later copy in the journal. A very good algorithm for this would be the "Tiger tree hash" (see Wikipedia), which lets the per-block checksums be aggregated into a single large per-transaction checksum. Ask me for details.
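The idea can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for the Tiger hash (which Python's hashlib does not provide), the 0x00/0x01 leaf and node prefixes follow the common tree-hash convention, and the transaction layout (a list of (fs_block, data, stored_checksum) tuples) is hypothetical rather than the jbd2 on-disk format.

```python
import hashlib


def leaf_hash(block: bytes) -> bytes:
    """Per-block checksum; the 0x00 prefix marks a leaf."""
    return hashlib.sha256(b"\x00" + block).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Internal node; the 0x01 prefix keeps it distinct from a leaf."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def tree_root(leaves):
    """Aggregate per-block checksums into one per-transaction checksum."""
    if not leaves:
        raise ValueError("a transaction needs at least one block")
    level = list(leaves)
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired node is promoted unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]


def plan_replay(transactions):
    """Decide what to replay from a journal, oldest transaction first.

    Each transaction is a list of (fs_block, data, stored_checksum).
    Returns (to_write, unrecoverable): the newest good copy of each block,
    and the blocks whose newest journal copy is corrupt.
    """
    latest = {}
    for txn in transactions:
        for fs_block, data, stored in txn:
            good = leaf_hash(data) == stored
            # A later copy always supersedes an earlier one, good or bad.
            latest[fs_block] = data if good else None
    to_write = {b: d for b, d in latest.items() if d is not None}
    unrecoverable = sorted(b for b, d in latest.items() if d is None)
    return to_write, unrecoverable
```

In this model the filesystem would be marked in error only when unrecoverable is non-empty: a corrupt copy that is overwritten by a good copy in a later transaction never reaches the disk.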

finish inode table readahead for e2fsck. We have patches for this, but they need to be integrated.

properly implement the "htree hash->inode table" range mapping at inode allocation that Coly Li and I were working on. His later patches diverged from my original idea, which was to speed up htree performance for large dirs. Ask me for details.

implement checksums for ext4 by using the commit callback added for OCFS.

allow ext4 to work with a large (maybe SSD) journal by writing all of the data/metadata to the journal and dropping the buffers from memory. This can be done if the filesystem disks are not running and/or to improve performance for small file writes. Data-journaling could optionally be chosen based on the size of the write (for sync writes), so that small sync writes go straight to the journal and are flushed to disk later.

Currently the journal buffers would consume all memory (e.g. with a 40GB journal). With this change, when the journal needs to be flushed, the buffers are read back from the journal. This allows the disk(s) in the filesystem to spin down for lower power, or allows slow disks to be combined with a single fast SSD device for the journal. It would essentially allow ext4 to run like a log-structured filesystem, without the need to garbage collect.
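The size-based choice of data-journaling for sync writes could look roughly like the policy below. This is a sketch only: the names and the 64 KiB threshold are hypothetical, and real ext4 would make this decision inside the kernel write path rather than in a standalone function.

```python
from enum import Enum


class WritePath(Enum):
    JOURNAL_DATA = "journal"   # data and metadata both go to the journal
    IN_PLACE = "in-place"      # data goes to its final location on disk


def choose_write_path(size_bytes, is_sync, threshold_bytes=64 * 1024):
    """Pick a write path for one write request.

    Small synchronous writes go straight to the (fast) journal and are
    flushed to the filesystem disks later; everything else is written
    in place. threshold_bytes is an assumed tunable.
    """
    if size_bytes < 0:
        raise ValueError("write size cannot be negative")
    if is_sync and size_bytes <= threshold_bytes:
        return WritePath.JOURNAL_DATA
    return WritePath.IN_PLACE
```

With the assumed threshold, a 4 KiB fsync-style write would be journaled, while a 1 MiB sync write or any asynchronous write would go in place.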



Existing work

Getting in touch

Links to ML, IRC, people to contact

Participants

Aneesh Kumar KV

Christoph Hellwig
