54 Commits

Author SHA1 Message Date
Rich Felker
c5b8f19305 add support for LC_TIME and LC_MESSAGES translations
for LC_MESSAGES, translation of strerror and similar literal message
functions is supported. for messages in other places (particularly the
dynamic linker) that use format strings, translation is not yet
supported. in order to make it possible and safe, such messages will
need to be refactored to separate the textual content from the format.

for LC_TIME, the day and month names and strftime-style format strings
provided by nl_langinfo are supported for translation. however there
may be limitations, as some of the original C-locale nl_langinfo
strings are non-unique and thus perhaps non-suitable as keys.

overall, the locale support activated by this commit should not be
seen as complete and polished but as a basis for beginning to test
locale functionality and implement locales.
2014-07-26 05:36:25 -04:00
Rich Felker
7424ac58b1 consolidate str[n]casecmp_l into str[n]casecmp source files
this is mainly done for consistency with the ctype functions and to
declutter the src/locale directory.
2014-07-02 21:38:54 -04:00
Rich Felker
cef0f289f6 fix incorrect comparison loop condition in memmem
the logic for this loop was copied from null-terminated-string logic
in strstr without properly adapting it to work with explicit lengths.

presumably this error could result in false negatives (wrongly
comparing past the end of the needle/haystack), false positives
(stopping comparison early when the needle contains null bytes), and
crashes (from runaway reads past the end of mapped memory).
2014-06-19 00:42:28 -04:00
Rich Felker
476cd1d965 fix false negatives with periodic needles in strstr, wcsstr, and memmem
in cases where the memorized match range from the right factor
exceeded the length of the left factor, it was wrongly treated as a
mismatch rather than a match.

issue reported by Yves Bastide.
2014-04-18 17:38:35 -04:00
Timo Teräs
6fbdeff0e5 fix search past the end of haystack in memmem
to optimize the search, memchr is used to find the first occurrence of
the first character of the needle in the haystack before switching to
a search for the full needle. however, the number of characters
skipped by this first step were not subtracted from the haystack
length, causing memmem to search past the end of the haystack.
2014-04-09 21:06:17 -04:00
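The length adjustment described in this commit can be sketched as follows. This is a naive search for illustration only, not the two-way algorithm the library uses, and the name `sketch_memmem` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: naive memmem that uses memchr as a first step. */
static void *sketch_memmem(const void *h0, size_t k, const void *n0, size_t l)
{
    const unsigned char *h = h0, *n = n0;

    if (!l) return (void *)h;   /* empty needle matches at the start */
    if (k < l) return 0;        /* needle longer than haystack */

    for (;;) {
        /* Jump to the next occurrence of the needle's first byte. */
        const unsigned char *p = memchr(h, *n, k);
        if (!p) return 0;
        /* Subtract the skipped bytes from the remaining length.
         * Leaving this out lets the search run past the haystack. */
        k -= (size_t)(p - h);
        h = p;
        if (k < l) return 0;
        if (!memcmp(h, n, l)) return (void *)h;
        h++;
        k--;
    }
}
```

With the haystack `"abcabd"` truncated to 5 bytes, the needle `"abd"` is not found, because the remaining length drops to 2 after the second memchr step.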
Szabolcs Nagy
571744447c include cleanups: remove unused headers and add feature test macros 2013-12-12 05:09:18 +00:00
Michael Forney
b300d5b7bd strcmp: Remove unnecessary check for *r
If *l == *r && *l, then by transitivity, *r.
2013-11-23 16:17:38 -05:00
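A minimal sketch of a strcmp loop without the redundant check, assuming a plain byte-at-a-time implementation; the name `sketch_strcmp` is hypothetical.

```c
/* Hypothetical byte-at-a-time strcmp. */
static int sketch_strcmp(const char *l, const char *r)
{
    /* If *l == *r and *l is nonzero, then *r is nonzero too,
     * so the loop condition does not need to test *r. */
    for (; *l == *r && *l; l++, r++);
    /* Compare as unsigned char, as the C standard requires. */
    return *(const unsigned char *)l - *(const unsigned char *)r;
}
```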
Rich Felker
90edf1cc15 optimized C memcpy
unlike the old C memcpy, this version handles word-at-a-time reads and
writes even for misaligned copies. it does not require that the cpu
support misaligned accesses; instead, it performs bit shifts to
realign the bytes for the destination.

essentially, this is the C version of the ARM assembly language
memcpy. the ideas are all the same, and it should perform well on any
arch with a decent number of general-purpose registers that has a
barrel shift operation. since the barrel shifter is an optional cpu
feature on microblaze, it may be desirable to provide an alternate asm
implementation on microblaze, but otherwise the C code provides a
competitive implementation for "generic risc-y" cpu archs that should
alleviate the urgent need for arch-specific memcpy asm.
2013-08-28 03:34:57 -04:00
Rich Felker
a543369e3b optimized C memset
this version of memset is optimized both for small and large values of
n, and makes no misaligned writes, so it is usable (and near-optimal)
on all archs. it is capable of filling up to 52 or 56 bytes without
entering a loop and with at most 7 branches, all of which can be fully
predicted if memset is called multiple times with the same size.

it also uses the attribute extension to inform the compiler that it is
violating the aliasing rules, unlike the previous code which simply
assumed it was safe to violate the aliasing rules since translation
unit boundaries hide the violations from the compiler. for non-GNUC
compilers, 100% portable fallback code in the form of a naive loop is
provided. I intend to eventually apply this approach to all of the
string/memory functions which are doing word-at-a-time accesses.
2013-08-27 18:08:29 -04:00
Rich Felker
cccc1844be add arm-optimized memcpy implementation from bionic libc
the approach of this implementation was heavily investigated prior to
adopting it. attempts to obtain similar performance with pure C code
were capping out at about 75% of the performance of the asm, with
considerably larger code size, and were fragile in that the compiler
would sometimes compile part of memcpy into a call to itself.
therefore, just using the asm seems to be the best option.

this commit is the first to make use of the new subarch-specific asm
framework. the new armel directory is the location for arm asm that
should not be used for all arm subarchs, only the default one. armhf
is the name of the little-endian hardfloat-ABI subarch, which can use
the exact same asm. in both cases, the build system finds the asm by
following a memcpy.sub file.

the other two subarchs, armeb and armebhf, would need a big-endian
variant of this code. it would not be hard to adapt the code to big
endian, but I will hold off on doing so until there is demand for it.
2013-08-14 03:06:21 -04:00
Rich Felker
926272ddff optimized memset asm for i386 and x86_64
the concept of both versions is the same; they differ only in details.
for long runs, they use "rep stosl" or "rep stosq", and for small
runs, they use a trick, writing from both ends towards the middle,
that reduces the number of branches needed. in addition, if memset is
called multiple times with the same length, all branches will be
predicted; there are no loops.

for larger runs, there are likely faster approaches than "rep", at
least on some cpu models. for 32-bit, it's unlikely that there is any
faster approach that does not require non-baseline instructions; doing
anything fancier would require inspecting cpu capabilities. for
64-bit, there may very well be faster versions that work on all
models; further optimization could be explored in the future.

with these changes, memset is anywhere between 50% faster and 6 times
faster, depending on the cpu model and the length and alignment of the
destination buffer.
2013-08-01 21:44:43 -04:00
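The trick of writing from both ends towards the middle can be sketched in C at byte granularity. This is an illustration of the idea only, not the asm from this commit (which uses wider stores), and the name `sketch_small_fill` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: fill n bytes, handling n <= 8 with no loop. */
static void *sketch_small_fill(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    size_t i;

    if (!n) return dest;
    /* Each pair of stores covers one byte from each end. For small n
     * the stores overlap, which is harmless and avoids extra branches. */
    s[0] = s[n-1] = c;
    if (n <= 2) return dest;
    s[1] = s[n-2] = c;
    s[2] = s[n-3] = c;
    if (n <= 6) return dest;
    s[3] = s[n-4] = c;
    if (n <= 8) return dest;

    /* Longer runs: fill the remaining middle with a plain loop. */
    for (i = 4; i < n - 4; i++) s[i] = c;
    return dest;
}
```

Because the branch pattern depends only on n, repeated calls with the same length take the same path every time.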
Rich Felker
c713d87978 fix a couple misleading/wrong signal descriptions in strsignal
there are still several more that are misleading, but SIGFPE (integer
division error misdescribed as floating point) and SIGCHLD
(possibly non-exit status change events described as exiting) were the
worst offenders.
2013-07-09 02:30:21 -04:00
Rich Felker
c90fa2ace7 add realtime signals to strsignal
the name format RTnn/RTnnn was chosen to minimize bloat while
uniquely identifying the signal.
2013-07-09 02:23:16 -04:00
Rich Felker
8599822ee1 fix off-by-one array bound in strsignal 2013-07-09 02:11:52 -04:00
Isaac Dunham
14f0272ea1 Add ABI compatibility aliases.
GNU used several extensions that were incompatible with C99 and POSIX,
so they used alternate names for the standard functions.

The result is that we need these to run standards-conformant programs
that were linked with glibc.
2013-04-05 23:20:28 -07:00
Rich Felker
5afc74fbaa fix integer type issue in strverscmp
lenl-lenr is not a valid expression for a signed int return value from
strverscmp, since after implicit conversion from size_t to int this
difference could have the wrong sign or might even be zero. using the
difference for char values works since they're bounded well within the
range of differences representable by int, but it does not work for
size_t values.
2013-02-26 01:42:11 -05:00
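One way to compare two size_t lengths safely is to return only the sign of the comparison. The sketch below is an assumption about how such a fix could look, not the code from this commit; `sketch_cmp_size` is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical helper: three-way comparison of two lengths.
 * Returns -1, 0, or 1 and never depends on a size_t-to-int conversion. */
static int sketch_cmp_size(size_t a, size_t b)
{
    return (a > b) - (a < b);
}
```

For contrast, with a = 0 and b = (size_t)-1 the unsigned difference a - b wraps around to 1, so returning it as int would report "greater" for the shorter length.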
Rich Felker
4853c1f7f7 implement non-stub strverscmp
patch by Isaac Dunham.
2013-02-26 01:36:47 -05:00
Rich Felker
e864ddc368 replace stub with working strcasestr 2013-02-21 23:54:25 -05:00
Rich Felker
330fd96213 fix wrong return value from wmemmove on forward copies 2013-02-21 23:19:18 -05:00
Rich Felker
820fccdefe fix alignment logic in strlcpy 2012-12-26 23:48:02 -05:00
Rich Felker
838951c97e simplify logic in stpcpy; avoid copying first aligned byte twice
gcc seems to be generating identical or near-identical code for both
versions, but the newer code is more expressive of what it's doing.
2012-10-22 15:17:09 -04:00
Rich Felker
c86f2974e2 add memmem function (gnu extension)
based on strstr. passes gnulib tests and a few quick checks of my own.
2012-10-15 23:02:57 -04:00
Rich Felker
68dbd05039 optimize strchrnul/strcspn not to scan string twice on no-match
when strchr fails, an important piece of information already
computed, the string length, is thrown away. have strchrnul (with
namespace protection) be the underlying function so this information
can be kept, and let strchr be a wrapper for it. this also allows
strcspn to be considerably faster in the case where the match set has
a single element that's not matched.
2012-09-27 17:19:09 -04:00
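The layering described in this commit can be sketched as follows, assuming simple byte-at-a-time loops. All names prefixed with `sketch_` are hypothetical, and the namespace-protected internal name mentioned in the message is not reproduced here.

```c
#include <stddef.h>

/* Underlying function: returns a pointer to the first match,
 * or to the terminating null byte if there is no match. */
static char *sketch_strchrnul(const char *s, int c)
{
    c = (unsigned char)c;
    for (; *s && *(const unsigned char *)s != c; s++);
    return (char *)s;
}

/* strchr as a thin wrapper: a result that does not point at c
 * means the scan stopped at the terminator without a match. */
static char *sketch_strchr(const char *s, int c)
{
    char *r = sketch_strchrnul(s, c);
    return *(unsigned char *)r == (unsigned char)c ? r : 0;
}

/* strcspn for a reject set with a single element: one scan suffices,
 * whether or not the byte is found. */
static size_t sketch_strcspn1(const char *s, char reject)
{
    return (size_t)(sketch_strchrnul(s, reject) - s);
}
```

On a failed search the end-of-string position is kept rather than thrown away, so the caller never needs a second pass to find the length.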
Rich Felker
3f9ff1514e slightly cleaner strlen, also seems to compile to better code
testing with gcc 4.6.3 on x86, -Os, the old version does a duplicate
null byte check after the first loop. this is purely the compiler
being stupid, but the old code was also stupid and unintuitive in how
it expressed the check.
2012-09-27 16:56:33 -04:00
Rich Felker
2bf469310d asm for memmove on i386 and x86_64
for the sake of simplicity, I've only used rep movsb rather than
breaking up the copy for using rep movsd/q. on all modern cpus, this
seems to be fine, but if there are performance problems, there might
be a need to go back and add support for rep movsd/q.
2012-09-10 19:04:24 -04:00
Rich Felker
1701e4f3d4 reenable word-at-a-time copying in memmove
before restrict was added, memmove called memcpy for forward copies and
used a byte-at-a-time loop for reverse copies. this was changed to
avoid invoking UB now that memcpy has an undefined copying order,
making memmove considerably slower.

performance is still rather bad, so I'll be adding asm soon.
2012-09-10 18:16:11 -04:00
Rich Felker
400c5e5c83 use restrict everywhere it's required by c99 and/or posix 2008
to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.
2012-09-06 22:44:55 -04:00
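A sketch of how a header might select a restrict spelling for different compilers. The macro `SKETCH_RESTRICT` and the function `sketch_copy` are hypothetical; the commit itself defines `__restrict`, which user code should not redefine.

```c
#include <stddef.h>

/* Pick a spelling the current compiler accepts. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define SKETCH_RESTRICT restrict      /* C99 or later */
#elif defined(__GNUC__)
#define SKETCH_RESTRICT __restrict    /* GNU extension in older modes */
#else
#define SKETCH_RESTRICT               /* unknown compiler: drop it */
#endif

/* The qualifier is written in the *restrict form, not [restrict]. */
static void *sketch_copy(void *SKETCH_RESTRICT dest,
                         const void *SKETCH_RESTRICT src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
    return dest;
}
```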
Rich Felker
bac03cdde1 remove dependency of wmemmove on wmemcpy direction
unlike the memmove commit, this one should be fine to leave in place.
wmemmove is not performance-critical, and even if it were, it's
already copying whole 32-bit words at a time instead of bytes.
2012-09-06 20:28:42 -04:00
Rich Felker
594318fd3d remove dependency of memmove on memcpy direction
this commit introduces a performance regression in many uses of
memmove, which will need to be addressed before the next release. i'm
making it as a temporary measure so that the restrict patch can be
committed without invoking undefined behavior when memmove calls
memcpy with overlapping regions.
2012-09-06 20:25:48 -04:00
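A memmove that does not rely on memcpy's copy order can be sketched with two byte loops. This is the slow, simple form the message alludes to, written under the assumption that pointers can be compared after conversion to uintptr_t; `sketch_memmove` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-at-a-time memmove. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if (d == s) return dest;
    if ((uintptr_t)d < (uintptr_t)s) {
        /* dest is below src: copy forward so that source bytes
         * are read before they can be overwritten. */
        while (n--) *d++ = *s++;
    } else {
        /* dest is above src: copy backward for the same reason. */
        while (n--) d[n] = s[n];
    }
    return dest;
}
```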
Rich Felker
aaa9eb5101 memcpy asm for i386 and x86_64 2012-08-11 21:33:13 -04:00
Rich Felker
f997e224fc remove unused but buggy code from strstr.c 2012-08-11 18:40:33 -04:00
Rich Felker
35c16933f0 remove buggy short-string wcsstr implementation; always use twoway
since this interface is rarely used, it's probably best to lean
towards keeping code size down anyway. one-character needles will
still be found immediately by the initial wcschr call anyway.
2012-08-11 18:39:12 -04:00
Rich Felker
970ef6a124 optimize mempcpy to minimize need for data saved across the call 2012-07-31 21:18:17 -04:00
Rich Felker
f313a16224 make strerror_r behave nicer on failure
if the buffer is too short, at least return a partial string. this is
helpful if the caller is lazy and does not check for failure. care is
taken to avoid writing anything if the buffer length is zero, and to
always null-terminate when the buffer length is non-zero.
2012-06-20 12:07:18 -04:00
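The truncation behavior described here can be sketched as a small copy routine. `sketch_copy_msg` is a hypothetical helper that takes the message text as a parameter; looking up the message for an error number is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy msg into buf, truncating if needed. */
static int sketch_copy_msg(char *buf, size_t buflen, const char *msg)
{
    size_t l = strlen(msg);

    if (l >= buflen) {
        /* Too short: write nothing at all for a zero-length buffer,
         * otherwise a null-terminated partial string. */
        if (buflen) {
            memcpy(buf, msg, buflen - 1);
            buf[buflen - 1] = 0;
        }
        return ERANGE;
    }
    memcpy(buf, msg, l + 1);
    return 0;
}
```

A caller that ignores the return value still ends up with a terminated string whenever the buffer has room for at least one byte.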
Rich Felker
054ba18599 fix overrun (n essentially ignored) in wcsncmp
bug report and solution by Richard Pennington
2012-05-26 18:04:17 -04:00
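A wcsncmp that honors n can be sketched as below. This is an illustration of a bounded comparison loop, not necessarily the exact code from the fix; `sketch_wcsncmp` is a hypothetical name.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical bounded wide-string comparison. */
static int sketch_wcsncmp(const wchar_t *l, const wchar_t *r, size_t n)
{
    /* Stop when n is exhausted, at the first difference,
     * or at the end of the strings. */
    for (; n && *l == *r && *l; n--, l++, r++);
    /* If n ran out, everything compared so far was equal. */
    return n ? (*l < *r ? -1 : *l > *r) : 0;
}
```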
Rich Felker
aefd0f69bd fix failure of strrchr(str, 0)
bug report and solution by Richard Pennington
2012-05-26 18:01:34 -04:00
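The message does not say how the call failed, so the following is only one way to get strrchr(str, 0) right: include the terminating null byte in the backward scan. `sketch_strrchr` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strrchr built on a backward scan. */
static char *sketch_strrchr(const char *s, int c)
{
    const unsigned char *p = (const unsigned char *)s;
    /* strlen(s) + 1 makes the terminator part of the searched range,
     * so searching for 0 finds the end of the string. */
    size_t n = strlen(s) + 1;

    c = (unsigned char)c;
    while (n--) if (p[n] == c) return (char *)(p + n);
    return 0;
}
```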
Rich Felker
e0614f7cd4 add all missing wchar functions except floating point parsers
these are mostly untested and adapted directly from corresponding byte
string functions and similar.
2012-03-01 23:24:45 -05:00
Rich Felker
a6540174be add dummied strverscmp (obnoxious GNU function)
programs that use this tend to horribly botch international text
support, so it's questionable whether we want to support it even in
the long term... for now, it's just a dummy that calls strcmp.
2011-09-11 22:45:56 -04:00
Rich Felker
73d2fde119 fix wrong type for wcsrchr argument 2 2011-06-13 14:06:04 -04:00
Rich Felker
86339bc4ba fix strncat and wcsncat (double null termination)
also modify wcsncpy to use the same loop logic
2011-05-22 21:58:43 -04:00
Rich Felker
e98136207a fix wcsncpy writing past end of buffer 2011-05-22 21:54:42 -04:00
Rich Felker
b5b41212a6 function signature fix: add const qualifier to mempcpy src arg 2011-04-26 12:28:41 -04:00
Rich Felker
6597f9ac13 implement memrchr (nonstandard) and optimize strrchr in terms of it 2011-04-13 08:36:29 -04:00
Rich Felker
cb8dff2149 fix misplaced *'s in string functions (harmless) 2011-04-07 16:19:30 -04:00
Rich Felker
1fee6186fe fix prototype for strsep 2011-04-06 14:28:29 -04:00
Rich Felker
16675df793 fix misaligned read on early string termination in strchr
this could actually cause rare crashes in the case where a short
string is located at the end of a page and the following page is not
readable, and in fact this was seen in gcc compiling certain files.
2011-04-05 09:27:41 -04:00
Rich Felker
c68b26369e fix serious bug in strchr - char signedness
search for bytes with high bit set was giving (potentially dangerous)
wrong results. i've tested, cleaned up, and hopefully sped up this
function now.
2011-04-03 18:16:11 -04:00
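A common form of this class of bug is comparing a plain char, which may be signed, against an int argument. The sketch below shows a search that converts both sides to unsigned char first; it is an assumption about the general fix, not the code from this commit, and `sketch_find_byte` is a hypothetical name.

```c
/* Hypothetical strchr-like search that is safe for bytes >= 0x80. */
static char *sketch_find_byte(const char *s, int c)
{
    /* Where char is signed, the byte 0xff reads back as -1, while the
     * caller may pass 255. Converting both sides makes them agree. */
    unsigned char uc = (unsigned char)c;

    for (; *s; s++)
        if ((unsigned char)*s == uc) return (char *)s;
    /* Searching for 0 matches the terminator itself. */
    return uc ? 0 : (char *)s;
}
```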
Rich Felker
9ae8d5fc71 fix all implicit conversion between signed/unsigned pointers
sadly the C language does not specify any such implicit conversion, so
this is not a matter of just fixing warnings (as gcc treats it) but
actual errors. i would like to revisit a number of these changes and
possibly revise the types used to reduce the number of casts required.
2011-03-25 16:34:03 -04:00
Rich Felker
a012aa879f fix broken wmemchr (unbounded search) 2011-03-17 22:38:45 -04:00
Rich Felker
2a195dd31c fix missing prototype for strsignal 2011-02-26 23:50:26 -05:00