Improve Your old Storage with bcachefs
Marian Marinov <mm@yuhu.biz>
Who Am I?
I want to share a story...
The story of some hard drives...
➢ 2 years ago I upgraded my home storage
➢ from all HDD to all SSD drives
You all can speculate as to why I did that
Let's take a look at how HDDs work
Platters
The SLOWEST task of a rotational drive
Optimization
➢ File systems tried to optimize for this inefficiency
  ➢ write contents of the files within the same cylinder
  ➢ utilizing the space under the heads on all platters
  ➢ continue the writes on to next sector (try not to skip any, reducing fragmentation)
  ➢ continue directly on to the next track
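The slides do not put a number on why skipping sectors or tracks hurts. As a rough sketch, the average rotational latency of a drive is the time for half a revolution. The helper name `avg_rotational_latency_ms` and the 7200 RPM figure are assumptions for illustration, not values from the talk.

```shell
# avg_rotational_latency_ms: hypothetical helper, not from the talk.
# $1 = spindle speed in RPM. One revolution takes 60000/RPM milliseconds;
# on average the wanted sector is half a revolution away.
avg_rotational_latency_ms() {
    awk -v rpm="$1" 'BEGIN { printf "%.2f\n", 60000 / rpm / 2 }'
}

# Assumed spindle speed of 7200 RPM.
avg_rotational_latency_ms 7200   # prints 4.17
```

Every write that does not continue on the next sector pays a delay of this order before the heads even start transferring data, which is why the layout tricks above aim for sequential placement.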
Optimization
➢ File systems tried to optimize for this inefficiency
➢ The Linux kernel tried to optimize the reads and writes
  ➢ read-ahead
  ➢ combine and order the writes
➢ Storage controllers
  ➢ write-behind with BBU
Now comes bcache
Oh NO, this is not bcachefs
➢ It's 2011 and StorPool are just starting
➢ A guy named Kent Overstreet
➢ SSDs are booming
➢ Storage vendors are starting to offer things like
  ➢ CacheCade
  ➢ CacheVault
  ➢ etc.
Bcache is born
➢ Bcache is a simple block-level cache
➢ In 2013 it is included in kernel 3.10
➢ It has some bugs :)
Bcache
➢ Bcache is the groundwork for bcachefs
➢ It only knows about the blocks, but not their use
➢ It orders the writes so the backing rotational disk will do more sequential writes
Finally bcachefs
bcachefs was born in 2014-2015
➢ Current features
  ➢ Copy on write (COW) - like ZFS or Btrfs
  ➢ Full data checksumming
  ➢ Caching
  ➢ Compression (LZ4, ZSTD)
  ➢ Replication (RAID1/10)
The juicy details
                AVG   MIN   MAX
sw raid1         86    80    93   MB/s
bcachefs (1)     93    90    97   MB/s
bcachefs (2)    123   118   128   MB/s
                 36    38    25
1. bcachefs with only rotational HDDs
2. bcachefs with SSD backed device
dd if=/dev/zero of=test$i oflag=direct count=200 bs=5M
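The `$i` in the `dd` command suggests it was run several times in a loop. A minimal sketch of how AVG/MIN/MAX columns like the ones above could be derived from per-run throughput figures follows. The helper name `summarize_runs` is hypothetical, and the sample values are illustrative, not measurements from the talk.

```shell
# summarize_runs: hypothetical helper, not part of the talk.
# Reads one throughput value (MB/s) per line on stdin and prints
# "avg min max", with the average rounded to the nearest integer.
summarize_runs() {
    awk '
        { v = $1 + 0 }
        NR == 1 { min = v; max = v }
        { sum += v; if (v < min) min = v; if (v > max) max = v }
        END { if (NR > 0) printf "%.0f %d %d\n", sum / NR, min, max }
    '
}

# The runs themselves might look like this (loop count is an assumption):
#   for i in 1 2 3; do
#       dd if=/dev/zero of=test$i oflag=direct count=200 bs=5M
#   done

# Illustrative values only.
printf '%s\n' 100 110 120 | summarize_runs   # prints 110 100 120
```

`oflag=direct` bypasses the page cache, so the figures reflect the storage stack rather than RAM.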
Fun facts
root@firefly:~# mount /dev/sdc1 /bcache/
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 639G 1.3M 629G 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdd1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1 1.3T 1.3M 1.3T 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sde1
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdf1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1:/dev/dev-2:/dev/dev-3 2.6T 1.5M 2.5T 1% /bcache
Installation
➢ clone the bcachefs kernel tree and build it
https://evilpiepirate.org/git/bcachefs.git
➢ clone bcachefs-tools and build them
https://evilpiepirate.org/git/bcachefs-tools.git
First steps
➢ Prepare your drives (as partitions or whole)
➢ Format them together so they become one filesystem
bcachefs format /dev/sdc1 /dev/sdd1 /dev/sde /dev/sdf
➢ mount
mount -t bcachefs /dev/sdc1:/dev/sdd1:/dev/sde:/dev/sdf /bcache
Make it faster
➢ Let's assume sdb is our SSD drive
# bcachefs format /dev/sdb
# bcachefs device add --group=hdd2 /bcache /dev/sdb
➢ We still haven't done anything...
# cd /sys/fs/bcachefs/UUID/options
# echo hdd2 > promote_target
# echo hdd2 > foreground_target
# echo hdd2 > metadata_target
Make it faster
➢ foreground_target
➢ writes are first cached here
➢ promote_target
➢ reads are cached on this device
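The targets are set by writing a group name into files under the filesystem's `options` directory. A minimal sketch of a wrapper that refuses to write to an option file that does not exist is shown here. The function name `set_fs_option` is hypothetical, and the directory is passed in as a parameter so the demonstration can run against a scratch directory instead of `/sys/fs/bcachefs/UUID/options`.

```shell
# set_fs_option: hypothetical helper, not part of bcachefs-tools.
# $1 = options directory (on a real system, assumed to be
#      /sys/fs/bcachefs/UUID/options as on the slide)
# $2 = option name, $3 = value to write
set_fs_option() {
    dir=$1 name=$2 value=$3
    if [ ! -e "$dir/$name" ]; then
        echo "no such option: $name" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$dir/$name"
}

# Demonstration against a scratch directory standing in for sysfs.
demo=$(mktemp -d)
: > "$demo/promote_target"
set_fs_option "$demo" promote_target hdd2
cat "$demo/promote_target"   # prints hdd2
rm -rf "$demo"
```

Checking that the file exists first catches a mistyped option name, which a plain `echo` redirect would silently turn into a new file outside sysfs.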
Make it faster
➢ You have multiple options for compression
# cd /sys/fs/bcachefs/UUID/options
# cat background_compression
[none] lz4 gzip zstd
➢ Documentation states to avoid zstd for now
➢ also keep in mind block alignment
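In the `cat` output above, the brackets mark the currently active choice. A small sketch for pulling that value out in a script follows; the helper name `active_choice` is hypothetical.

```shell
# active_choice: hypothetical helper, not part of bcachefs-tools.
# Prints the bracketed entry from a line such as "[none] lz4 gzip zstd".
active_choice() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

active_choice "[none] lz4 gzip zstd"   # prints none

# On a real system the input would come from the sysfs file, e.g.
#   active_choice "$(cat background_compression)"
# Selecting an algorithm by writing its name to the same file, in the
# same way as the *_target options, is an assumption made here by analogy.
```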
Make it redundant
➢ Make two copies of the data
# cd /sys/fs/bcachefs/UUID/options
# echo 2 > data_replicas
# echo 2 > metadata_replicas
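Keeping two copies of everything roughly halves the usable space. A back-of-the-envelope sketch, assuming four devices of about 639G each (the size `df` reported for the first device earlier) and ignoring metadata and reserved space; the helper name `usable_gib` is hypothetical.

```shell
# usable_gib: hypothetical helper for a rough estimate only.
# $1 = raw capacity in GiB, $2 = number of replicas.
# Ignores metadata overhead and any space the filesystem reserves.
usable_gib() {
    raw_gib=$1 replicas=$2
    echo $(( raw_gib / replicas ))
}

# Four devices of roughly 639G each, two replicas.
usable_gib $(( 4 * 639 )) 2   # prints 1278
```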
You can do it all at once
# bcachefs format \
    --group=ssd /dev/sdb \
    --group=hdd /dev/sdc /dev/sdd /dev/sde /dev/sdf \
    --data_replicas=2 --metadata_replicas=2 \
    --foreground_target=ssd \
    --background_target=hdd \
    --promote_target=ssd
# mount -t bcachefs \
    /dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf /mnt
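The colon-separated device list passed to `mount` is easy to mistype when written by hand. A small sketch that builds it from separate arguments; the helper name `join_devices` is hypothetical.

```shell
# join_devices: hypothetical helper, not part of bcachefs-tools.
# Joins its arguments with ':' to form the device list that
# `mount -t bcachefs` takes in the command above.
join_devices() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local.
    ( IFS=:; printf '%s\n' "$*" )
}

join_devices /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
# prints /dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf

# Possible use (not executed here):
#   mount -t bcachefs "$(join_devices /dev/sdb /dev/sdc)" /mnt
```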
Some management stuff
➢ Remove a device
# bcachefs device evacuate /dev/sdf1
# bcachefs device remove /dev/sdf1
# bcachefs device offline /dev/sdf1
➢ Bring back a device
# bcachefs device online /dev/sdf1
# bcachefs device add /bcache /dev/sdf1
➢ Make sure all of your data is replicated
# bcachefs data rereplicate /bcache/
QUESTIONS?
Marian Marinov <mm@yuhu.biz>
Thank You
