Commit c60e520f authored by Tomas Vondra

Use memcpy instead of a byte loop in pglz_decompress

The byte loop used in pglz_decompress() because of possible overlap may
be quite inefficient, so this commit replaces it with memcpy. The gains
do depend on the data (compressibility) and hardware, but seem to be
quite significant.

Author: Andrey Borodin
Reviewed-by: Michael Paquier, Konstantin Knizhnik, Tels
Discussion: https://postgr.es/m/469C9ED9-348C-4FE7-A7A7-B0FA671BEE4C@yandex-team.ru
parent 6d61c3f1
@@ -714,11 +714,13 @@ pglz_decompress(const char *source, int32 slen, char *dest,
             if (ctrl & 1)
             {
                 /*
-                 * Otherwise it contains the match length minus 3 and the
-                 * upper 4 bits of the offset. The next following byte
-                 * contains the lower 8 bits of the offset. If the length is
-                 * coded as 18, another extension tag byte tells how much
-                 * longer the match really was (0-255).
+                 * Set control bit means we must read a match tag. The match
+                 * is coded with two bytes. First byte uses lower nibble to
+                 * code length - 3. Higher nibble contains upper 4 bits of the
+                 * offset. The next following byte contains the lower 8 bits
+                 * of the offset. If the length is coded as 18, another
+                 * extension tag byte tells how much longer the match really
+                 * was (0-255).
                  */
                 int32 len;
                 int32 off;
@@ -731,16 +733,44 @@ pglz_decompress(const char *source, int32 slen, char *dest,

                 /*
                  * Now we copy the bytes specified by the tag from OUTPUT to
-                 * OUTPUT. It is dangerous and platform dependent to use
-                 * memcpy() here, because the copied areas could overlap
-                 * extremely!
+                 * OUTPUT (copy len bytes from dp - off to dp). The copied
+                 * areas could overlap, to preven possible uncertainty, we
+                 * copy only non-overlapping regions.
                  */
                 len = Min(len, destend - dp);
-                while (len--)
+                while (off < len)
                 {
-                    *dp = dp[-off];
-                    dp++;
+                    /*---------
+                     * When offset is smaller than length - source and
+                     * destination regions overlap. memmove() is resolving
+                     * this overlap in an incompatible way with pglz. Thus we
+                     * resort to memcpy()-ing non-overlapping regions.
+                     *
+                     * Consider input: 112341234123412341234
+                     * At byte 5       here ^ we have match with length 16 and
+                     * offset 4.       11234M(len=16, off=4)
+                     * We are decoding first period of match and rewrite match
+                     *                 112341234M(len=12, off=8)
+                     *
+                     * The same match is now at position 9, it points to the
+                     * same start byte of output, but from another position:
+                     * the offset is doubled.
+                     *
+                     * We iterate through this offset growth until we can
+                     * proceed to usual memcpy(). If we would try to decode
+                     * the match at byte 5 (len=16, off=4) by memmove() we
+                     * would issue memmove(5, 1, 16) which would produce
+                     * 112341234XXXXXXXXXXXX, where series of X is 12
+                     * undefined bytes, that were at bytes [5:17].
+                     *---------
+                     */
+                    memcpy(dp, dp - off, off);
+                    len -= off;
+                    dp += off;
+                    off += off;
                 }
+                memcpy(dp, dp - off, len);
+                dp += len;
             }
             else
             {
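
The overlap handling in the second hunk can be tried out in isolation. The following is a minimal sketch, not the PostgreSQL code itself: the helper names copy_match_bytewise() and copy_match_chunked() are hypothetical, the first mirrors the byte loop removed by this commit and the second mirrors the memcpy() loop that replaces it. Bounds checking against the end of the destination buffer (the Min(len, destend - dp) step) is assumed to have been done by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: the byte-by-byte copy that the commit removes.
 * Copies len bytes from dp - off to dp and returns the advanced pointer.
 */
static char *
copy_match_bytewise(char *dp, int off, int len)
{
    assert(off > 0 && len >= 0);
    while (len--)
    {
        *dp = dp[-off];
        dp++;
    }
    return dp;
}

/*
 * Hypothetical helper: the chunked copy that the commit introduces.
 * Every memcpy() call works on regions that do not overlap.
 */
static char *
copy_match_chunked(char *dp, int off, int len)
{
    assert(off > 0 && len >= 0);
    while (off < len)
    {
        /* [dp - off, dp) and [dp, dp + off) are adjacent, not overlapping */
        memcpy(dp, dp - off, (size_t) off);
        len -= off;
        dp += off;
        /* dp - off still names the same start byte; the usable run doubled */
        off += off;
    }
    /* off >= len here, so source and destination cannot overlap */
    memcpy(dp, dp - off, (size_t) len);
    return dp + len;
}
```

For the example in the code comment (output so far "11234", then a match with len=16 and off=4), both helpers produce 112341234123412341234. The chunked version gets there with three memcpy() calls of 4, 8 and 4 bytes instead of 16 single-byte assignments.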