What are the differences between B trees and B+ trees?

Question

In a b-tree you can store both keys and data in the internal and leaf nodes, but in a b+ tree you have to store the data in the leaf nodes only.

Is there any advantage of doing the above in a b+ tree?

Why not use b-trees instead of b+ trees everywhere, as intuitively they seem much faster?

I mean, why do you need to replicate the key (data) in a b+ tree?

I think what they're saying is "B-Tree" vs. B+-Tree. They mean a hyphen, not a minus sign. — stu, Commented Nov 20, 2015 at 21:34

Rose Perrone · Accepted Answer · 2012-08-17 23:42:46Z

496

The image below helps show the differences between B+ trees and B trees.

Advantages of B+ trees:

Because B+ trees don't have data associated with interior nodes, more keys can fit on a page of memory. Therefore, it will require fewer cache misses in order to access data that is on a leaf node.
The leaf nodes of B+ trees are linked, so doing a full scan of all objects in a tree requires just one linear pass through all the leaf nodes. A B tree, on the other hand, would require a traversal of every level in the tree. This full-tree traversal will likely involve more cache misses than the linear traversal of B+ leaves.

Advantage of B trees:

Because B trees contain data with each key, frequently accessed nodes can lie closer to the root, and therefore can be accessed more quickly.

[Image: B and B+ tree]
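The linked-leaf point above can be sketched in a few lines. This is only an illustration: the `Leaf` class, its `next_leaf` field, and `scan_leaves` are hypothetical names, not taken from any real database engine, and the leaves are wired together by hand rather than built by insertions.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Leaf:
    """Hypothetical B+ tree leaf: sorted keys, their values, and a sibling link."""
    keys: List[int]
    values: List[str]
    next_leaf: Optional["Leaf"] = None


def scan_leaves(first: Leaf) -> Iterator[Tuple[int, str]]:
    """Yield every (key, value) pair by following the leaf chain left to right."""
    leaf: Optional[Leaf] = first
    while leaf is not None:
        yield from zip(leaf.keys, leaf.values)
        leaf = leaf.next_leaf


# Three leaves chained by hand, as they might sit at the bottom of a small tree.
leaf_c = Leaf([50, 60], ["e", "f"])
leaf_b = Leaf([30, 40], ["c", "d"], next_leaf=leaf_c)
leaf_a = Leaf([10, 20], ["a", "b"], next_leaf=leaf_b)

all_pairs = list(scan_leaves(leaf_a))
```

The scan never touches an interior node: once the leftmost leaf is known, every other entry is reached by following `next_leaf`.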

answered Aug 17, 2012 at 23:42

Rose Perrone

6

Is there any constraint on the number of entries in a leaf node?
– TLE
Commented Feb 21, 2014 at 1:02
49

@TLE Good question! Yes. A hard drive accesses a minimum of a page of memory at a time, so we want to fit all the pointers in a single page of memory. We want to require only one disk read per leaf access, so we don't want to assign more than a page size of pointers to a leaf. If we fill a leaf with a page size of pointers, and then we want to add another pointer to this leaf, we create two children of this node, and give half of the leaf's pointers to each new child. Of course, there may be some reshuffling to ensure the tree's height is kept to a minimum. Does this help?
– Rose Perrone
Commented Feb 21, 2014 at 5:28
the last pointer of each leaf node of B-tree should point to the next leaf node, right?
– camino
Commented Jun 15, 2014 at 21:55
9

So sorry for bumping such an old thread, but @Babyburger's comment about how camino's comment was correct is not actually true; a B-Tree does not, in fact, have connected leaf nodes. A B+, sure.
– Jason
Commented Jun 11, 2017 at 19:17
1

@Siddhartha From DbSystemConcepts 6 (457): Large objects are often represented using B+-tree file organizations. B+-tree file organizations permit us to >read an entire object<, or specified byte ranges in the object, as well as to insert and delete parts of the object. The B+-tree file organization is one extension of this data structure. I think this could be one of the use cases related to your question.
– ssukienn
Commented Jun 24, 2018 at 12:09

Community · Accepted Answer · 2016-04-03 22:51:17Z

132

The principal advantage of B+ trees over B trees is that they allow you to pack in more pointers to other nodes by removing pointers to data, thus increasing the fanout and potentially decreasing the depth of the tree.

The disadvantage is that there are no early outs when you might have found a match in an internal node. But since both data structures have huge fanouts, the vast majority of your matches will be on leaf nodes anyway, making the B+ tree more efficient on average.
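To make the fanout/depth relationship concrete, here is a rough back-of-the-envelope sketch. The function name `levels_needed` and the fanout figures (100 and 400) are made-up assumptions for illustration, and the model treats every node as completely full, which real trees are not.

```python
def levels_needed(num_keys: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= num_keys.

    Simplified model: every node is full and has exactly `fanout` branches.
    """
    levels = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# Assumed figures: one billion keys, with a lower and a higher fanout.
levels_low_fanout = levels_needed(1_000_000_000, 100)   # 5 levels
levels_high_fanout = levels_needed(1_000_000_000, 400)  # 4 levels
```

Under these assumptions, quadrupling the fanout removes one level, which means one fewer block read per lookup when each level costs a read.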

edited Apr 3, 2016 at 22:51

CommunityBot

answered May 15, 2009 at 19:05

Vic E

2

I prefer Jeff's answer, because it emphasizes the difference in efficiency when doing a full scan.
– Rose Perrone
Commented Aug 17, 2012 at 23:06
1

I am really confused because traversing a b-tree using an in-order traversal will read all of the values in sorted order in O(n) time. If each tree node is optimally sized for the physical page size, it seems like things don't get any more optimal. Conversely, the cost to get to the first (smallest) value in a b+tree is O(log n) and then to walk through every leaf is O(n) so the total cost is O(log n + n). This is more work and more disk reads which makes sense because the tree has all this extra data in it. I don't get it.
– Eric Rini
Commented Sep 2, 2015 at 2:45
What would be another word for 'fanout' in the above sentence?
– j--
Commented Jul 3, 2016 at 2:15
4

@JorgeBucaran fanout = number of edges coming out of a node
– bantmen
Commented Jul 28, 2016 at 5:02

saxbophone · Accepted Answer · 2019-11-23 08:21:22Z

In a B tree, search keys and data are stored in internal or leaf nodes. But in a B+ tree, data is stored only in leaf nodes.
A full scan of a B+ tree is very easy because all data is found in leaf nodes. A full scan of a B tree requires a full traversal.
In a B tree, data may be found in leaf nodes or internal nodes. Deletion from internal nodes is very complicated. In a B+ tree, data is found only in leaf nodes. Deletion from leaf nodes is easy.
Insertion into a B tree is more complicated than into a B+ tree.
B+ trees store redundant search keys, but a B tree has no redundant values.
In a B+ tree, leaf node data is ordered as a sequential linked list, but in a B tree the leaf nodes cannot be stored using a linked list. Many database systems' implementations prefer the structural simplicity of a B+ tree.
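The "data only in leaves" and "redundant search keys" points can be illustrated with a minimal lookup sketch. `InnerNode`, `LeafNode`, and `search` are hypothetical names, and the convention that a key equal to a separator goes to the right child is an assumption; implementations differ on that detail.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class LeafNode:
    keys: List[int]
    values: List[str]          # the data lives here and only here


@dataclass
class InnerNode:
    keys: List[int]            # separator keys only, no data
    children: List[Union["InnerNode", LeafNode]]  # len(keys) + 1 children


def search(node: Union[InnerNode, LeafNode], key: int) -> Optional[str]:
    """Descend to a leaf, then look the key up there. No early exit is possible."""
    while isinstance(node, InnerNode):
        # Assumed convention: keys >= separator are found in the right child.
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node.values[i]
    return None


# Separators 30 and 50 appear in the root AND in the leaves (the redundancy).
root = InnerNode(
    keys=[30, 50],
    children=[
        LeafNode([10, 20], ["a", "b"]),
        LeafNode([30, 40], ["c", "d"]),
        LeafNode([50, 60], ["e", "f"]),
    ],
)

found = search(root, 30)
missing = search(root, 35)
```

Even though 30 is present in the root, the lookup still walks down to the leaf to fetch the value, which is the trade-off described above.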

Jeff Mc · Accepted Answer · 2009-05-15 20:33:12Z

45

A full scan, that is, looking at every piece of data the tree indexes, is much easier and faster with a B+ tree, since the terminal nodes form a linked list. To do a full scan with a B-tree you need to do a full tree traversal to find all the data.

B-trees, on the other hand, can be faster when you do a seek (looking for a specific piece of data by key), especially when the tree resides in RAM or other non-block storage. Since you can elevate commonly used nodes in the tree, fewer comparisons are required to get to the data.
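For contrast, here is a sketch of what a full scan of a plain B-tree looks like. `BTreeNode` and `in_order` are hypothetical names and the tree is built by hand. The point is only that the traversal has to alternate between descending into children and emitting the keys stored in the interior node.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class BTreeNode:
    keys: List[int]
    values: List[str]                      # data sits next to each key
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def in_order(node: BTreeNode) -> Iterator[Tuple[int, str]]:
    """Visit every node, interleaving child subtrees with the node's own keys."""
    if not node.children:
        yield from zip(node.keys, node.values)
        return
    for i, pair in enumerate(zip(node.keys, node.values)):
        yield from in_order(node.children[i])   # everything smaller than keys[i]
        yield pair                               # then the key held in this node
    yield from in_order(node.children[-1])       # finally the rightmost subtree


btree_root = BTreeNode(
    keys=[30, 60],
    values=["c", "f"],
    children=[
        BTreeNode([10, 20], ["a", "b"]),
        BTreeNode([40, 50], ["d", "e"]),
        BTreeNode([70, 80], ["g", "h"]),
    ],
)

scanned_keys = [k for k, _ in in_order(btree_root)]
```

The traversal is still linear in the number of entries, but it keeps returning to interior nodes between leaves instead of walking one flat chain.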

answered May 15, 2009 at 20:33

Jeff Mc

2

Would you agree, then, that a B+ tree suits situations where there may be a sequential read across all of the data, since it can walk across the leaves, whereas a B tree would be ideal for random-access situations?
– JDPeckham
Commented Aug 19, 2017 at 21:41
1

@JDPeckham very curious about your question as well
– Siddhartha
Commented Nov 18, 2020 at 3:29

camino · Accepted Answer · 2014-06-17 13:19:21Z

21

Example from Database System Concepts, 5th edition:

[Image: B+-tree]

[Image: corresponding B-tree]

answered Jun 17, 2014 at 13:19

camino

9

I don't think a B-Tree has links to the node's children. For instance, from the Clearview bucket to the Mianus bucket. It wouldn't make much sense to do that anyway, because in between the two you have the Downtown bucket, which must be searched in the event you want to do an index scan in a B-tree (requires backtracking). Where did you get this?
– Evan Carroll
Commented Apr 20, 2018 at 20:13
2

@EvanCarroll Database system concepts 5th, maybe you need to confirm with the author :)
– camino
Commented Apr 20, 2018 at 20:19

Saket · Accepted Answer · 2013-07-12 15:50:36Z

Adegoke A, Amit

I guess one crucial point people here are missing is the difference between data and pointers, as explained in this section.

Pointer: a pointer to other nodes.

Data: in the context of database indexes, data is just another pointer to the real data (the row), which resides somewhere else.

Hence, in a B tree each node holds three kinds of information: keys, pointers to the data associated with the keys, and pointers to child nodes.

In a B+ tree, internal nodes keep keys and pointers to child nodes, while leaf nodes keep keys and pointers to the associated data. This allows more keys for a given node size. The node size is determined mainly by the block size.

The advantage of having more keys per node is explained well above, so I will save my typing effort.
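A quick calculation shows how much the extra data pointer costs per node. The function names and the sizes used (4096-byte block, 8-byte keys, 8-byte pointers) are assumptions chosen for illustration, and per-node headers are ignored.

```python
def btree_keys_per_node(block_size: int, key_size: int, ptr_size: int) -> int:
    """B-tree node: n keys, n data pointers and n + 1 child pointers.

    Solves n * (key_size + 2 * ptr_size) + ptr_size <= block_size for n.
    """
    return (block_size - ptr_size) // (key_size + 2 * ptr_size)


def bplus_inner_keys_per_node(block_size: int, key_size: int, ptr_size: int) -> int:
    """B+ tree internal node: n keys and n + 1 child pointers, no data pointers.

    Solves n * (key_size + ptr_size) + ptr_size <= block_size for n.
    """
    return (block_size - ptr_size) // (key_size + ptr_size)


# Assumed sizes: 4 KiB block, 8-byte keys, 8-byte pointers.
btree_keys = btree_keys_per_node(4096, 8, 8)        # 170
bplus_keys = bplus_inner_keys_per_node(4096, 8, 8)  # 255
```

With these numbers the B+ tree internal node holds half again as many keys, and the gap widens if the B-tree stores the rows themselves instead of pointers to them.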

Javier · Accepted Answer · 2009-05-15 19:52:02Z

13

B+ trees are especially good in block-based storage (e.g. hard disk). With this in mind, you get several advantages, for example (off the top of my head):

High fanout / low depth: you have to fetch fewer blocks to get to the data. With data intermingled with the pointers, each read gets fewer pointers, so you need more seeks to get to the data.
Simple and consistent block storage: an inner node has N pointers, nothing else; a leaf node has data, nothing else. That makes it easy to parse, debug and even reconstruct.
High key density means the top nodes are almost certainly in cache; in many cases all inner nodes get cached quickly, so only the data access has to go to disk.
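The caching point can be checked with rough arithmetic. In this sketch, `level_sizes` is a hypothetical helper, and the figures (100 million keys, 100 entries per leaf, fanout 256, 4 KiB pages) are assumptions. It also pretends every node is full, so real trees would have somewhat more nodes.

```python
from typing import List


def level_sizes(num_keys: int, leaf_capacity: int, fanout: int) -> List[int]:
    """Node count per level, leaves first, assuming every node is full."""
    nodes = -(-num_keys // leaf_capacity)   # ceiling division
    sizes = [nodes]
    while nodes > 1:
        nodes = -(-nodes // fanout)
        sizes.append(nodes)
    return sizes


PAGE_SIZE = 4096  # assumed page size in bytes

sizes = level_sizes(100_000_000, leaf_capacity=100, fanout=256)
inner_nodes = sum(sizes[1:])            # every level above the leaves
inner_bytes = inner_nodes * PAGE_SIZE   # about 15 MiB
leaf_bytes = sizes[0] * PAGE_SIZE       # about 3.8 GiB
```

Under these assumptions all inner nodes together take roughly 15 MiB while the leaves take several gigabytes, which is why the inner levels tend to stay in cache.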

answered May 15, 2009 at 19:52

Javier

2

mostly for in-memory trees; but there are other popular options, like red-black trees, skip lists, and such.
– Javier
Commented May 15, 2009 at 21:31
B-trees are also designed for efficient block-based storage, limiting the asymptotic number of node accesses. Otherwise, if using a memory-like storage medium with random access, one can use a self-balancing binary tree such as a red-black tree to achieve better results.
– dionyziz
Commented Feb 1, 2012 at 9:44
shouldn't your first point say "less seeks" rather than "more seeks". Smaller depth -> less seeks
– Jesse
Commented May 9, 2012 at 16:30
1

@Jesse: high fanout=> low depth => less seeks, but mixing data and pointers means less pointers => low fanout => more depth => more seeks
– Javier
Commented May 10, 2012 at 21:19
2

@AdegokeA: a B+tree has two kinds of nodes: inner nodes with only keys and pointers, no data; and leaf nodes, with data and no pointers. that allows for maximum number of keys on each inner node. if you store data on an inner node, then you can fit less pointers and your tree gets taller.
– Javier
Commented Apr 18, 2013 at 13:38

Charlie Martin · Accepted Answer · 2009-05-15 18:45:07Z

11

Define "much faster". Asymptotically they're about the same. The differences lie in how they make use of secondary storage. The Wikipedia articles on B-trees and B+trees look pretty trustworthy.

answered May 15, 2009 at 18:45

Charlie Martin

2

I agree with Charlie. Since one node of a B-Tree represents one secondary memory page or block, the passage from one node to another requires a time consuming page-change.
– user78706
Commented Jan 2, 2013 at 8:49

VS7 · Accepted Answer · 2009-12-28 04:29:18Z

7

In a B+ tree, since only keys and pointers (no data) are stored in the internal nodes, their size becomes significantly smaller than that of the internal nodes of a B tree (which store both data and keys). Hence, the index part of the B+ tree can be fetched from external storage in a single disk read and then processed to find the location of the target. If it had been a B tree, a disk read would be required for each and every decision. Hope I made my point clear! :)

answered Dec 28, 2009 at 4:29

VS7

Kapil Kumar · Accepted Answer · 2013-03-13 08:59:51Z

6

"The major drawback of the B-Tree is the difficulty of traversing the keys sequentially. The B+ Tree retains the rapid random access property of the B-Tree while also allowing rapid sequential access."

Ref: Data Structures Using C, by Aaron M. Tenenbaum

http://books.google.co.in/books?id=X0Cd1Pr2W0gC&pg=PA456&lpg=PA456&dq=drawback+of+B-Tree+is+the+difficulty+of+Traversing+the+keys+sequentially&source=bl&ots=pGcPQSEJMS&sig=F9MY7zEXYAMVKl_Sg4W-0LTRor8&hl=en&sa=X&ei=nD5AUbeeH4zwrQe12oCYAQ&ved=0CDsQ6AEwAg#v=onepage&q=drawback%20of%20B-Tree%20is%20the%20difficulty%20of%20Traversing%20the%20keys%20sequentially&f=false

answered Mar 13, 2013 at 8:59

Kapil Kumar

2

This should have been the correct answer. In short: Locality of reference.
– Theodore Zographos
Commented Dec 27, 2018 at 18:52

khr055 · Accepted Answer · 2012-11-12 15:59:52Z

3

The primary distinction between a B-tree and a B+ tree is that the B-tree eliminates the redundant storage of search key values. Since search keys are not repeated in the B-tree, we may be able to store the index using fewer tree nodes than in the corresponding B+ tree index. However, since search keys that appear in non-leaf nodes appear nowhere else in the B-tree, we are forced to include an additional pointer field for each search key in a non-leaf node. There are space advantages to the B-tree, since repetition does not occur, and it can be used for large indices.

edited Nov 12, 2012 at 15:59

khr055

answered Nov 12, 2012 at 15:39

Mary

1

Interesting, the thoughts about repetition are unique among the replies here and make more sense than in-order traversal of b+tree being more efficient than in-order traversal of a b-tree. As far as I can tell, that's either not quite right, or not the whole story as in order traversal of a b-tree is O(n) and finding the smallest node in a b+tree is O(log n) and then traversing each leaf is O(n) in addition to that. However, if you were indexing something with a small range of values, like a boolean field, the b+tree makes a lot more sense than a b-tree because of its duplicate handling.
– Eric Rini
Commented Sep 2, 2015 at 3:03

Amit · Accepted Answer · 2012-07-23 23:58:13Z

2

Take one example: you have a table with a huge amount of data per row. That means every instance of the object is big.

If you use a B tree here, most of the time is spent scanning pages filled with data, which is of no use. That is why databases use B+ trees: to avoid scanning object data.

B+ trees separate keys from data.

But if your data is small, you can store it with the key, which is what a B tree does.

answered Jul 23, 2012 at 23:58

Amit

2

"If you use B tree here then most of the time is spent scanning the pages with data" - not necessarily. B-tree nodes can keep only "pointers" to data on disk, not the data itself.
– TT_ stands with Russia
Commented Dec 23, 2013 at 23:24

Vivek Rakholiya · Accepted Answer · 2011-05-03 12:24:25Z

1

A B+ tree is a balanced tree in which every path from the root of the tree to a leaf is of the same length, and each non-leaf node of the tree has between ⌈n/2⌉ and n children, where n is fixed for a particular tree. It contains index pages and data pages. Binary trees have only two children per parent node, whereas B+ trees can have a variable number of children for each parent node.
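The occupancy rule above is easy to state as code. This is a small sketch with hypothetical helper names. The relaxed lower bound of two children for a non-leaf root is an assumption following the usual textbook convention; it is not stated in the answer itself.

```python
from typing import Tuple


def child_bounds(n: int) -> Tuple[int, int]:
    """Return (minimum, maximum) children for a non-leaf node of order n."""
    return (n + 1) // 2, n   # (n + 1) // 2 is the integer form of ceil(n / 2)


def is_valid_child_count(count: int, n: int, is_root: bool = False) -> bool:
    """Check one node's child count against the occupancy rule."""
    low, high = child_bounds(n)
    if is_root:
        low = 2  # assumption: a non-leaf root may hold as few as two children
    return low <= count <= high


bounds_for_5 = child_bounds(5)   # (3, 5)
bounds_for_4 = child_bounds(4)   # (2, 4)
```

Keeping every node at least half full is what bounds the height of the tree, and therefore the number of block reads per lookup.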

answered May 3, 2011 at 12:24

Vivek Rakholiya

1

Just for clarity, B trees are not binary trees. In fact, B trees and B+trees are closer to each other in construction and usage than binary trees. The wiki articles can help in clearing the definitions - B+Tree, B Tree and Binary Tree
– uutsav
Commented Sep 3, 2015 at 9:56

bearbear123 · Accepted Answer · 2018-11-11 22:38:28Z

1

B+ trees are suitable for situations where the tree grows so large that it does not fit into available memory. Thus, you'd generally expect to be doing multiple I/Os.
It does often happen that a B+ tree is used even when it in fact fits into memory, and then your cache manager might keep it there permanently. But this is a special case, not the general one, and caching policy is separate from B+ tree maintenance as such.

Also, in a B+ tree, the leaf pages are linked together in a linked list (or doubly-linked list), which optimizes traversals (for range searches, sorting, etc.). So the number of pointers is a function of the specific algorithm that is used.

edited Nov 11, 2018 at 22:38

bearbear123

answered May 15, 2009 at 18:51

Stack Programmer

This answers the question of why we should not use B-trees instead of B+ trees everywhere :)
– Stack Programmer
Commented May 15, 2009 at 19:20
3

But you only described one side, as far as we know, with your answer b-trees could function exactly the same way. The OP asked to explain the differences and you only talked about one and not the other. You can't have a venn diagram with one circle!
– Malfist
Commented May 15, 2009 at 20:00
