I spent a while talking with Greg Price about the Haskell Network.HTTP issue previously featured here, and he was unconvinced I had the whole story. He spent a while reading some source and thinking about folds, and pointed out today that my analysis was incomplete, and flat-out wrong for the cabal case referenced in the commit.
One detail I glossed over in the last post (deliberately for simplicity, but over-zealously!) was the bufOps variable. bufOps is an instance of BufferOps, which provides an abstraction layer that allows the same code to work on a variety of byte-buffer or string types.
For simplicity, I chose to ignore the indirection and did the analysis assuming a strict buffer backed by a flat memory array, such as Haskell’s Data.ByteString. And my analysis was correct for a caller using Data.ByteString.
However, Greg pointed out, it is much more common in modern idiomatic Haskell to use Data.ByteString.Lazy, and indeed cabal does so. And for Data.ByteString.Lazy’s different buffer implementation, my analysis is completely wrong!
For Data.ByteString.Lazy, the issue hinges on the representation of the bytestring, and the implementation of append. ByteString.Lazy stores buffers as immutable cons-cell style lists of immutable chunks of string data – the string “Hello, World” might be stored as
("Hell") -> ("o, Wor") -> ("ld") -> null
In order to append two of these structures, we must copy the list structure of the left-hand side, replacing the final null with a pointer to the right-hand-side. e.g.
append(("Hello, ") -> ("World") -> null,
("Good Night, ") -> ("Moon") -> null)
= ("Hello, ") -> ("World") -> ("Good Night, ") -> ("Moon") -> null
This copies the left hand side, but reuses the right-hand side directly and thus has performance O(n) where n is the length of the left-hand side, and the length of the right-hand side is irrelevant.
Armed with that knowledge, we can consider the effects of
foldr (flip append) empty strs
foldr will repeatedly apply flip append (which is just append with the arguments reversed: (flip append) a b = append b a) to pairs of strings, starting with the empty string, and working from the end of the list backwards. If we suppose strs = ["a", "b", "c", "d"], this will result in the following series of calls:
(flip append) "d" "" = append "" "d" = "d"
(flip append) "c" "d" = append "d" "c" = "dc"
(flip append) "b" "dc" = append "dc" "b" = "dcb"
(flip append) "a" "dcb" = append "dcb" "a" = "dcba"
Note that the ever-growing result always ends up on the left-hand side of append. Since append is linear in its left-hand-side, this repeated copying and accumulation yields quadratic behavior.
But consider instead foldr append $ reverse strs (which is just foldr append (reverse strs) – here you can think of $ as an open-left-paren that is closed by end-of-line). That expression results in this sequence of calls:
append "a" "" = "a"
append "b" "a" = "ba"
append "c" "ba" = "cba"
append "d" "cba" = "dcba"
Now, the left-hand-side consists always of individual strings, giving a total runtime of O(Σn) = O(N) where N is the total length.
For Data.ByteString.Lazy, then, the whole improvement comes from moving from flip foldr to reverse; Switching from an explicit fold to concat makes the code more concise, but doesn’t affect the asymptotics. But for Data.ByteString, as we saw last time, it is the switch to concat that nets the improvement to O(n).
This code was, in essence, O(n²) for two different reasons, depending on which bytestring implementation the caller asked for, and the patch fixed both of them at once! Now that is the level of subtlety I expect to see packed into a few characters of Haskell code :)
I also think this example makes for a great example of the points I made in my accidentally quadratic manifest.
Whoever wrote the original code was a skilled Haskell programmer, and almost certainly more than familiar with big-O notation and basic algorithmic analysis. This quadratic pathology arose not by incompetence or lack of knowledge by the author, but because of the subtle interaction of two different individually-linear layers (foldr and append). And as we’ve seen, the layering really is key: the author could not have fully analyzed this code without peeking below the abstraction boundary and considering various concrete buffer implementations.
Additionally, this example shows off the interplay between constant factors and algorithmics. If your HTTP connections are slow, which are you going to assume: That the network or the server is slow, or that your HTTP client library has quadratic behavior hidden inside a simple operation? I’m speculating here, but it seems likely to me that that was a factor in allowing this bug to lurk longer than it might have.