C++ Raw Pointer vs. Smart Pointer Performance with 1 Million Linked Lists
In my research work, I need to maintain about 1,000,000 linked lists, with each growing to about 100 nodes and frequently deleting the head. These lists initially populate by reading in data from a file in a matter of seconds, and I’m aiming for <1 millisecond performance to add one new node to all the lists. Each node occupies 29 bytes of storage, including the pointer to the next node. That’s 2.7 GB of data that needs to be processed and managed with minimal overhead. At this scale and with the particular target application, every nanosecond counts.
The conventional wisdom for this problem would be to use smart pointers to manage the node references, especially since the nodes themselves must contain the pointers and the nodes are deleted by a function from a different class. unique_ptrs are the fastest, with the CPP docs saying they offer “little to no overhead over built-in pointers (depending on the deleter used).” So I took their word for it and used unique_ptr references in my nodes with the default deleter. But the performance was awful. Truly bad.
I set up the following test to verify that the use of smart pointers was the culprit, making sure to allocate and populate the memory realistically, as well as test deleting and reinitializing data. Here’s the code to test raw pointer performance:
And here’s the code to test unique_ptr performance:
Here were my results from running the programs in a terminal:
The raw pointers performed about 6% better than the unique_ptrs on my machine. This may not seem like a huge difference, but as I said above it really adds up noticeably over the program lifecyle. Things get particularly bad, however, when I’m tyring to debug the code.
Debugging in Visual Studio Code I achieve only a minor slowdown with the raw pointers over their straight execution:
Meanwhile the use of unique_ptrs caused a 24X slowdown when debugging:
With my approach to development, such a slowdown is intolerable. At this stage in my project, I often need to debug and step through large chunks of my code in the middle of execution (after several GB of data have been processed). So I’ll gladly take the higher risk of using raw pointers over the severe debugging bottleneck of smart pointers, in this case.