16. Optimization
➢ File systems tried to optimize for this inefficiency
➢ write the contents of a file within the same cylinder
➢ utilizing the space under the heads on all platters
➢ continue the writes on to the next sector (try not to skip any, reducing fragmentation)
➢ continue directly on to the next track
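The layout described above matches the classic CHS-to-LBA mapping, where consecutive block numbers first fill a track, then move to the next head in the same cylinder, and only then to the next cylinder. A minimal sketch, assuming a hypothetical geometry of 16 heads and 63 sectors per track (these numbers are illustrative, not from the talk):

```shell
#!/bin/sh
# chs_to_lba CYLINDER HEAD SECTOR HEADS_PER_CYL SECTORS_PER_TRACK
# Standard mapping: LBA = (C * HPC + H) * SPT + (S - 1); sectors count from 1.
chs_to_lba() {
    c=$1 h=$2 s=$3 hpc=$4 spt=$5
    echo $(( (c * hpc + h) * spt + (s - 1) ))
}

# Hypothetical geometry: 16 heads, 63 sectors per track.
chs_to_lba 0 0 1 16 63    # first sector of the disk
chs_to_lba 0 1 1 16 63    # same cylinder, next head: no seek needed
chs_to_lba 1 0 1 16 63    # next cylinder: the heads have to move
```

Writing blocks in increasing LBA order therefore keeps the heads on one cylinder for as long as possible.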
17. Optimization
➢ File systems tried to optimize for this inefficiency
➢ The Linux kernel tried to optimize the reads and writes
➢ read-ahead
➢ combine and order the writes
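To illustrate what "combine and order the writes" means, here is a small sketch that sorts pending write requests by start sector and merges requests that touch or overlap. It is only a toy model of the idea, not the kernel's actual I/O scheduler; the function name merge_writes and the "start length" input format are assumptions made for this example.

```shell
#!/bin/sh
# merge_writes: read "start_sector length" pairs on stdin,
# print the sorted, merged ranges in the same format.
merge_writes() {
    sort -n -k1,1 | awk '
        NR == 1   { start = $1; end = $1 + $2; next }
        # request begins inside or right after the current range: extend it
        $1 <= end { if ($1 + $2 > end) end = $1 + $2; next }
        # gap found: emit the finished range and start a new one
                  { print start, end - start; start = $1; end = $1 + $2 }
        END       { if (NR > 0) print start, end - start }
    '
}

# Four scattered requests become two sequential ones.
printf '100 8\n0 8\n8 8\n108 4\n' | merge_writes
```

Fewer, larger, ordered requests mean fewer seeks on a rotational disk.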
18. Optimization
➢ File systems tried to optimize for this inefficiency
➢ The Linux kernel tried to optimize the reads and writes
➢ Storage controllers
➢ write-behind caching with a battery backup unit (BBU)
22. ➢ It's 2011 and StorPool is just starting
➢ A guy named Kent Overstreet
➢ SSDs are booming
➢ Storage vendors are starting to offer things like
➢ CacheCade
➢ CacheVault
➢ etc.
23. Bcache is born
➢ Bcache is a simple block-level cache
➢ In 2013 it is included in kernel 3.10
➢ It has some bugs :)
24. Bcache
➢ Bcache is the groundwork for bcachefs
➢ It only knows about the blocks, but not their use
➢ It orders the writes so the backing rotational disk will do more sequential writes
25. Finally bcachefs
bcachefs was born in 2014-2015
➢ Current features
➢ Copy on write (COW) - like ZFS or Btrfs
➢ Full data checksumming
➢ Caching
➢ Compression (LZ4, ZSTD)
➢ Replication (RAID1/10)
26. The juicy details
                AVG   MIN   MAX
sw raid1         86    80    93   MB/s
bcachefs (1)     93    90    97   MB/s
bcachefs (2)    123   118   128   MB/s
                 36    38    25
(1) bcachefs with only rotational HDDs
(2) bcachefs with an SSD-backed device
dd if=/dev/zero of=test$i oflag=direct count=200 bs=5M
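To put the averages from the table in relative terms, the following sketch computes integer percentage gains. The helper name pct_gain is made up for this example; the input numbers are the AVG column above.

```shell
#!/bin/sh
# pct_gain BASE NEW: integer percentage by which NEW exceeds BASE (rounded down)
pct_gain() {
    echo $(( ($2 - $1) * 100 / $1 ))
}

pct_gain 86 93     # sw raid1 -> bcachefs (1), prints 8
pct_gain 86 123    # sw raid1 -> bcachefs (2), prints 43
pct_gain 93 123    # bcachefs (1) -> bcachefs (2), prints 32
```

So on average the SSD-backed setup writes roughly 43% faster than software RAID1 in this test.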
27. Fun facts
root@firefly:~# mount /dev/sdc1 /bcache/
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 639G 1.3M 629G 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdd1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1 1.3T 1.3M 1.3T 1% /bcache
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sde1
root@firefly:~# bcachefs device add --group=hdd1 /bcache /dev/sdf1
root@firefly:~# df -h /bcache
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1:/dev/dev-1:/dev/dev-2:/dev/dev-3 2.6T 1.5M 2.5T 1% /bcache
29. Installation
➢ clone the bcachefs kernel tree and build it
https://evilpiepirate.org/git/bcachefs.git
➢ clone bcachefs-tools and build them
https://evilpiepirate.org/git/bcachefs-tools.git
30. First steps
➢ Prepare your drives (as partitions or whole)
➢ bcachefs format /dev/sdc1
➢ bcachefs format /dev/sdd1
➢ bcachefs format /dev/sde
➢ bcachefs format /dev/sdf
➢ mount
mount -t bcachefs /dev/sdc1:/dev/sdd1:/dev/sde:/dev/sdf /bcache
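The mount source for a multi-device filesystem is a colon-separated device list, and a missing colon is an easy mistake to make. A small sketch that builds the list from arguments; join_devices is a hypothetical helper, not part of bcachefs-tools:

```shell
#!/bin/sh
# join_devices DEV...: print the devices joined with ':' as mount expects
join_devices() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local.
    ( IFS=:; printf '%s\n' "$*" )
}

join_devices /dev/sdc1 /dev/sdd1 /dev/sde /dev/sdf

# Usage (needs root and a formatted filesystem, so left commented out):
# mount -t bcachefs "$(join_devices /dev/sdc1 /dev/sdd1 /dev/sde /dev/sdf)" /bcache
```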
31. Make it faster
➢ Let's assume sdb is our SSD drive
# bcachefs format /dev/sdb
# bcachefs device add --group=hdd2 /bcache /dev/sdb
➢ We still haven't done anything...
# cd /sys/fs/bcachefs/UUID/options
# echo hdd2 > promote_target
# echo hdd2 > foreground_target
# echo hdd2 > metadata_target
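The three echo commands can be wrapped in a function that checks each option file before writing to it. This is a sketch only: set_targets is a made-up name, and the options directory is passed in as an argument so the function can be tried against a scratch directory before pointing it at /sys/fs/bcachefs/UUID/options.

```shell
#!/bin/sh
# set_targets OPTIONS_DIR GROUP
# Writes GROUP into promote_target, foreground_target and metadata_target.
set_targets() {
    opts_dir=$1 group=$2
    for opt in promote_target foreground_target metadata_target; do
        if [ ! -e "$opts_dir/$opt" ]; then
            echo "missing option file: $opts_dir/$opt" >&2
            return 1
        fi
        echo "$group" > "$opts_dir/$opt" || return 1
    done
}

# Usage (UUID stays a placeholder for your filesystem's UUID):
# set_targets /sys/fs/bcachefs/UUID/options hdd2
```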
33. Make it faster
➢ foreground_target
➢ writes are first cached here
➢ promote_target
➢ reads are cached on this device
34. Make it faster
➢ You have multiple options for compression
# cd /sys/fs/bcachefs/UUID/options
# cat background_compression
[none] lz4 gzip zstd
➢ The documentation says to avoid zstd for now
➢ Also keep block alignment in mind
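In the output above the active choice appears in square brackets. Assuming that convention holds, a script can extract the current value like this (current_choice is a hypothetical helper name):

```shell
#!/bin/sh
# current_choice: read an option line such as "[none] lz4 gzip zstd" on stdin
# and print the bracketed (currently selected) value.
current_choice() {
    tr ' ' '\n' | sed -n 's/^\[\(.*\)\]$/\1/p'
}

echo "[none] lz4 gzip zstd" | current_choice    # prints none

# Usage against the real file (UUID is a placeholder):
# current_choice < /sys/fs/bcachefs/UUID/options/background_compression
```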
35. Make it redundant
➢ Make two copies of the data
# cd /sys/fs/bcachefs/UUID/options
# echo 2 > data_replicas
# echo 2 > metadata_replicas
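Keeping two copies of everything roughly halves the usable space. A back-of-the-envelope sketch, assuming four equally sized devices of 639G each (the size df reported for /dev/sdc1 earlier) and ignoring metadata overhead, reserved space and compression; usable_gib is a made-up helper:

```shell
#!/bin/sh
# usable_gib DEVICES SIZE_EACH REPLICAS: rough usable capacity, in the units of SIZE_EACH
usable_gib() {
    echo $(( $1 * $2 / $3 ))
}

usable_gib 4 639 1    # no replication, prints 2556
usable_gib 4 639 2    # data_replicas=2, prints 1278
```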
36. You can do it all at once
# bcachefs format \
    --group=ssd /dev/sdb \
    --group=hdd /dev/sdc /dev/sdd /dev/sde /dev/sdf \
    --data_replicas=2 --metadata_replicas=2 \
    --foreground_target=ssd \
    --background_target=hdd \
    --promote_target=ssd
# mount -t bcachefs /dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf /mnt
37. Some management stuff
➢ Remove a device
# bcachefs device evacuate /dev/sdf1
# bcachefs device remove /dev/sdf1
# bcachefs device offline /dev/sdf1
➢ Bring back a device
# bcachefs device online /dev/sdf1
# bcachefs device add /dev/sdf1
➢ Make sure all of your data is replicated
# bcachefs data rereplicate /bcache/