Dev.SN ♥ developers

posted by LaminatorX on Thursday March 20 2014, @09:33AM
from the ilibc-ulibc-we-all-C-for-libc dept.

dalias writes

"The musl libc project has released version 1.0, the result of three years of development and testing. Musl is a lightweight, fast, simple, MIT-licensed, correctness-oriented alternative to the GNU C library (glibc), uClibc, or Android's Bionic. At this point musl provides all mandatory C99 and POSIX interfaces (plus a lot of widely-used extensions), and well over 5000 packages are known to build successfully against musl.

Several options are available for trying musl. Compiler toolchains are available from the musl-cross project, and several new musl-based Linux distributions are already available (Sabotage and Snowflake, among others). Some well-established distributions including OpenWRT and Gentoo are in the process of adding musl-based variants, and others (Aboriginal, Alpine, Bedrock, Dragora) are adopting musl as their default libc."

 
This discussion has been archived. No new comments can be posted.
  • (Score: 5, Interesting) by ArghBlarg on Thursday March 20 2014, @12:46PM

    by ArghBlarg (1449) on Thursday March 20 2014, @12:46PM (#18944)

    Length-counted strings (à la Pascal) need to be introduced from the ground up (i.e., from the OS itself upwards).

    Null-terminated strings were a hack from back in the days when it was considered an extravagance to use 2 or 4 bytes for a dope vector preceding strings to represent the length. But such a scheme is a) more secure and b) more efficient (no strlen() scanning for a null).

    Eliminating null-delimited strings would mean a necessary break from all libc compatibility, but there's no getting around the ugly fact that in the long term, the wrong decision was made about how to represent strings, and this opened up a whole world of vulnerabilities that just shouldn't be possible or tolerated.

    I myself would love to see a fork of Linux + core tools that uses NO null-terminated strings anywhere. It would be a Herculean task, but result in a more secure OS and dev universe.
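
    A minimal sketch of what a length-counted string could look like in C, to make the two claims above concrete: the length is read rather than scanned for, and an append can be rejected before it overruns. The names `pstr`, `PSTR_CAP` and `pstr_append` are hypothetical and are not part of any libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PSTR_CAP 64   /* hypothetical fixed capacity, for illustration only */

/* Hypothetical length-counted string: the length sits in front of the data,
   so finding the end of the string never requires scanning for a NUL. */
struct pstr {
    size_t len;              /* bytes currently used in data */
    char   data[PSTR_CAP];   /* not NUL-terminated */
};

/* Append n bytes from src. Returns 0 on success, or -1 (leaving s untouched)
   when the bytes would not fit. */
static int pstr_append(struct pstr *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL && s->len <= PSTR_CAP);
    if (n > PSTR_CAP - s->len)
        return -1;                       /* would overflow: refuse */
    memcpy(s->data + s->len, src, n);    /* no strlen() on the destination */
    s->len += n;
    return 0;
}
```

    The bounds check lives in one place instead of at every call site, which is the point being argued; whether that justifies breaking libc compatibility is a separate question.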

  • (Score: 1, Insightful) by Anonymous Coward on Thursday March 20 2014, @12:51PM

    by Anonymous Coward on Thursday March 20 2014, @12:51PM (#18949)

    I myself would love to see a fork of Linux + core tools that uses NO null-terminated strings anywhere. It would be a Herculean task, but result in a more secure OS and dev universe.

    Then get to coding. Stop bitching and complaining and fix it yourself, bum.

    • (Score: 1) by ArghBlarg on Thursday March 20 2014, @01:46PM

      by ArghBlarg (1449) on Thursday March 20 2014, @01:46PM (#18978)

      I should have taken bets on how long it would take for someone to say exactly this :).

      A break like this is even bigger than changing from COFF to ELF or switching ABIs. It would take a whole team and a lot of time. First a core string lib, then adapting the whole kernel to use it, then all of userland would have to be rewritten...

      I'm not that young.. I'd be willing to contribute, but frankly I don't have the life-cycles to spare doing it as just a hobby unless there's some proof it would ever be finished. Point me to a team committed to taking it from start to finish, and I might take you up on it.

      • (Score: 0) by Anonymous Coward on Thursday March 20 2014, @01:56PM

        by Anonymous Coward on Thursday March 20 2014, @01:56PM (#18982)

        Why should anyone do the work for you which is of dubious benefit? Put up or shut up, bum.

        • (Score: 1) by ArghBlarg on Thursday March 20 2014, @02:21PM

          by ArghBlarg (1449) on Thursday March 20 2014, @02:21PM (#18996)

          Not that I care, but it might be less anti-social of you to

          a) post under a username, if you insist on insulting someone; and

          b) consider that suggesting a solution, while not committing to implement said solution, does nothing to invalidate the original point.

          So from one 'bum' to an anonymous coward, try to be a little more civil please. If you wanted to troll, you might have more fun on that "other site" where it's rampant.

          • (Score: 0, Flamebait) by Desler on Thursday March 20 2014, @02:37PM

            by Desler (880) on Thursday March 20 2014, @02:37PM (#19010)

            a) post under a username, if you insist on insulting someone;

            Nope, I'll insult you all I want however I want.

            b) consider that suggesting a solution, while not committing to implement said solution, does nothing to invalidate the original point.

            Sure it does. You're whining and complaining that other people won't implement your "great" idea. The solution to your problem is to stop being a lazy bum.

            • (Score: 2, Interesting) by ArghBlarg on Thursday March 20 2014, @03:34PM

              by ArghBlarg (1449) on Thursday March 20 2014, @03:34PM (#19039)

              Ah, thank you for logging in. I appreciate the small effort. Now could you take the further effort of trying to actually be civil when people are discussing improving the life of working programmers?

              I assure you, as a programmer I am *not* lazy about things such as string bounds-checking and so forth. I simply get annoyed when, yet again, I have to write or maintain code that's trying to programmatically build strings with snprintf(), strncpy(), and especially strncat(): keeping track of how much buffer space is left, going over my code (and other team members', or worse, 3rd-party code I can't change), day in and day out, code review after code review, looking for subtle errors in bounds-checking arithmetic when it should be pushed into string lib routines.

              Null-terminated strings appear to make pushing that logic into the string libraries more difficult than it ought to be. If it's so easy, then why hasn't the standard lib pushed all of this drudge-work fully into the library so that no one has to do it any more, since it's so error-prone?

              Ironically, if this particular "programmer", as you so contemptibly put it, was exclusively working in higher-level languages, he wouldn't be so annoyed about this aspect of the standard libs after so many years of working with them.
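
              A rough sketch of the bookkeeping described above, pushed into one helper so callers never do the "how much room is left" arithmetic themselves. The `strbuf` type and `strbuf_appendf` are hypothetical names, not a standard interface; only `vsnprintf` is from the C library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical builder: a caller-owned buffer plus the bookkeeping that is
   otherwise repeated by hand at every snprintf()/strncat() call. */
struct strbuf {
    char  *buf;   /* storage, always kept NUL-terminated */
    size_t cap;   /* total size of buf, including room for the NUL */
    size_t len;   /* bytes used, not counting the NUL */
};

/* Formatted append. Returns 0 on success, -1 if the result would not fit;
   on failure the buffer keeps its previous contents. */
static int strbuf_appendf(struct strbuf *sb, const char *fmt, ...)
{
    assert(sb != NULL && sb->buf != NULL && sb->len < sb->cap);
    size_t room = sb->cap - sb->len;          /* includes the NUL slot */
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(sb->buf + sb->len, room, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= room) {
        sb->buf[sb->len] = '\0';              /* discard the partial write */
        return -1;
    }
    sb->len += (size_t)n;
    return 0;
}
```

              Nothing here needs a different string representation; the complaint is that every project ends up writing its own variant of this.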

            • (Score: 1) by ArghBlarg on Thursday March 20 2014, @04:11PM

              by ArghBlarg (1449) on Thursday March 20 2014, @04:11PM (#19049)

              All right, I should resist, but I'd like to make one final point in the spirit of *constructive* discussion.

              1. Programmers should be lazy, in the respect that underlying causes of common errors are solved, rather than manually fixing the same things over and over again. So on that point, I'll take your 'lazy bum' label and wrap it around myself proudly. Perhaps in the meantime I could be diligent in future projects and write my own wrappers around the (still not-good-enough) snprintf(), strncpy(), strncat() libs and try to apply them everywhere I can.

              But they won't be standard, and they won't be in the OSes I use.

              2. You still have not added constructively to this discussion by stating what better solution there is, or might be. If you truly believe the status quo is the best there is or will ever be, then we'll just have to agree to disagree. I, for one, would like to keep thinking about the possibilities for a better solution.

              So: What's your better solution, that will allow all programmers henceforth to be able to use strings without worrying as much about buffer overflows and fencepost errors?

              • (Score: 3, Insightful) by gringer on Thursday March 20 2014, @04:19PM

                by gringer (962) on Thursday March 20 2014, @04:19PM (#19051)

                So: What's your better solution, that will allow all programmers henceforth to be able to use strings without worrying as much about buffer overflows and fencepost errors?

                A higher level language.

                • (Score: 1) by ArghBlarg on Thursday March 20 2014, @06:03PM

                  by ArghBlarg (1449) on Thursday March 20 2014, @06:03PM (#19082)

                  Fair enough, that's a valid solution for userspace development. Still doesn't address the fact that all current mainstream OSes are effectively locked-in to the use of C-style strings. Maybe that's not a problem for anyone else but me.

      • (Score: 0) by Anonymous Coward on Friday March 21 2014, @08:31AM

        by Anonymous Coward on Friday March 21 2014, @08:31AM (#19259)

        A break like this is even bigger than changing from COFF to ELF or switching ABIs. It would take a whole team and a lot of time.

        SO, the sooner you start.... :)

  • (Score: 4, Informative) by threedigits on Thursday March 20 2014, @01:45PM

    by threedigits (607) on Thursday March 20 2014, @01:45PM (#18976)

    Here goes my karma, but hey, what the hell.

    Length-prefixed strings are, to paraphrase Linus Torvalds, insane.

    To be competitive in terms of efficiency you need to use the native word size in the native endianness. This kills their use as an interchange format. And years and years of benchmarks have shown that they don't have any real advantage, because most strings are short, and in those cases the scan time is mostly negligible (and any sane code does it only once).

    Also, they are NOT more secure. Just try to use a 16-bit prefixed string as a 32-bit one and have a lot of fun.

    So, null terminated strings are universal, as safe and as fast as the alternatives for most uses. Don't expect Linux changing any time soon.
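
    To illustrate the prefix-width hazard mentioned above, here is a small sketch that stores a string with a 16-bit length prefix and then reads the same bytes as if the prefix were 32 bits wide. The function names are made up for the example, and host byte order is assumed throughout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write "abc" with a 16-bit host-order length prefix; out needs >= 5 bytes. */
static void make_pstr16(unsigned char *out)
{
    assert(out != NULL);
    uint16_t n = 3;
    memcpy(out, &n, sizeof n);
    memcpy(out + sizeof n, "abc", 3);
}

/* Read the prefix the way the writer intended. */
static uint32_t read_len16(const unsigned char *p)
{
    assert(p != NULL);
    uint16_t n;
    memcpy(&n, p, sizeof n);
    return n;
}

/* Read the prefix as 32 bits: the first two data bytes ('a', 'b') are
   swallowed into the length, on little- and big-endian hosts alike. */
static uint32_t read_len32(const unsigned char *p)
{
    assert(p != NULL);
    uint32_t n;
    memcpy(&n, p, sizeof n);
    return n;
}
```

    The 16-bit reader gets 3; the 32-bit reader gets a large bogus length, and any copy driven by it runs far past the real data.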

    • (Score: 1) by Desler on Thursday March 20 2014, @02:01PM

      by Desler (880) on Thursday March 20 2014, @02:01PM (#18983)

      Why should you think you'll burn karma? You're absolutely write. Scanning strings of even 100 characters or fewer takes a fraction of a second, and that's on ancient Pentium 4s. It is highly doubtful that the overhead of strlen is a hotspot in the vast majority of programs, and if it is, then it's usually due to some idiot calling it repeatedly for the same string rather than caching the value.
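
      A short sketch of the "calling it repeatedly" pattern versus caching the value. Both functions are hypothetical examples; a compiler may hoist the `strlen` call in the first one, but that is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strlen() sits in the loop condition, so the string may be rescanned on
   every iteration: quadratic work for a linear job. */
static size_t count_char_rescan(const char *s, char c)
{
    assert(s != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == c)
            hits++;
    return hits;
}

/* Same result, but the length is computed once and reused. */
static size_t count_char_cached(const char *s, char c)
{
    assert(s != NULL);
    size_t hits = 0;
    size_t n = strlen(s);   /* single scan */
    for (size_t i = 0; i < n; i++)
        if (s[i] == c)
            hits++;
    return hits;
}
```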

      • (Score: 0) by Anonymous Coward on Thursday March 20 2014, @02:03PM

        by Anonymous Coward on Thursday March 20 2014, @02:03PM (#18987)

        Absolutely right of course. Facepalm at myself.

    • (Score: 0) by Anonymous Coward on Thursday March 20 2014, @02:22PM

      by Anonymous Coward on Thursday March 20 2014, @02:22PM (#18999)

      I agree. NULL vs Pascal gains you nothing other than a library headache.

      Instead of attacking the end of the string what stops me from memory underrun and attacking the size? Not only that I know exactly where to attack instead of having to search for it.

      The only security it would give you is for the short while it took for hackers to figure out how to attack it.

      • (Score: 1) by ArghBlarg on Thursday March 20 2014, @02:26PM

        by ArghBlarg (1449) on Thursday March 20 2014, @02:26PM (#19002)

        It may not make code more resistant to attack (ie., intentional overflows), but it would help prevent accidental overflows (ie., programmer error).

        And if you're concerned about 16-vs 32-bit lengths, standardize on one then. Memory's cheap.

        I know I'd like to never have to think about terminating strings again.. it's a stupidly menial task and no matter how careful people try to be, someone somewhere forgets a memset() or a fixup on an snprintf() or strncat() somewhere... and before anyone says "so write a wrapper once that does it right and forget about it".. easy to say for your own code, but you probably use lots of other people's code and if it's third-party libs you do NOT want to go through all of that when the changes won't get pushed upstream.
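
        For readers who have not been bitten by it, a sketch of the kind of fixup meant here: `strncpy` leaves the destination unterminated when the source fills it, so the terminator has to be written by hand. `copy_terminated` is a hypothetical wrapper name, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (cap bytes), always NUL-terminating the result.
   Returns 0 if everything fit, -1 if the copy was truncated. */
static int copy_terminated(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    strncpy(dst, src, cap - 1);   /* may leave dst without a terminator */
    dst[cap - 1] = '\0';          /* the fixup that is easy to forget */
    return strlen(src) < cap ? 0 : -1;
}
```

        This is exactly the "write a wrapper once" answer; it does nothing for third-party code that calls `strncpy` directly.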

        • (Score: 0) by Anonymous Coward on Friday March 21 2014, @11:58AM

          by Anonymous Coward on Friday March 21 2014, @11:58AM (#19351)

          but it would help prevent accidental overflows

          How? That would *only* work if you made sure to use the libraries for everything. If you are doing that who cares how it is terminated. And then I could just overflow anyway by a stupid cast somewhere (which are trivial to do and usually done at function boundaries), and that is just 1 example. Stupidity is not created from the language. It comes from poor knowledge and bad mistakes.

          Memory's cheap.

          Yes, if you are building 1 of something it is. If you are building 20k of something not so much.

          You are fighting for something that does not exist for C. There is basically no standard 'p-string' type in C. C is a 'buffer' orientated language. I can turn an int64 into a buffer with 1 cast. char * is no different. You can do pstring things in other languages as it is 'built in'.

          The language does not do it. Libraries can help you do it though. The C language is fairly simple. It has no concept of 'printf'. The library that goes along with it? That's a beast.

          Feel free though to create a 'pstring' crt. I am sure many would use it.
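
          A small sketch of the "everything is a buffer" point: any object can be walked byte by byte through an `unsigned char *`, and nothing in the language distinguishes a string from an integer at that level. `sum_bytes` is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add up the bytes of any object. The cast is all it takes to treat an
   int64_t, a struct, or a char array as the same kind of raw buffer. */
static unsigned sum_bytes(const void *obj, size_t n)
{
    assert(obj != NULL);
    const unsigned char *p = (const unsigned char *)obj;
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

          Called with the address of an `int64_t` and `sizeof` that variable, it happily sums the integer's eight bytes; the language offers no hook where a length check could be enforced.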

      • (Score: 0) by Anonymous Coward on Thursday March 20 2014, @02:28PM

        by Anonymous Coward on Thursday March 20 2014, @02:28PM (#19004)

        The GGP is probably some "programmer" who only writes in VM-languages with little real-world experience not having his hand held. That's usually where most of these inane suggestions come from.

        • (Score: 1) by ArghBlarg on Thursday March 20 2014, @02:46PM

          by ArghBlarg (1449) on Thursday March 20 2014, @02:46PM (#19018)

          Wow. Just wow. Why don't you go home with your "air quote" ad-hominem attacks already?

          OK Mr. Coward, let's hear your better ideas. And "rewriting libc over-and-over again and patching holes after we find them, with no attempt to fix root causes" isn't an idea. I may not have any ultimate answers, but at least I'm trying to think about them.

          Fine, mentioning strlen() was probably a red herring -- that's not really an efficiency issue. But from a computational standpoint, it's still unnecessary (and yes I know compilers may optimize invariants like that in certain situations, but why should that even be necessary?)

          I never said changing how strings were represented would be easy. It's easy to say in hindsight that radical things that turned out well were obviously good.. except it wasn't obvious unless someone did it. (See any successful wheel that's been rewritten 'just cuz' someone wanted to).

          BTW my experience is primarily with RTOS and embedded development, so I'm not totally clueless as suggested.

  • (Score: 2) by maxwell demon on Thursday March 20 2014, @06:24PM

    by maxwell demon (1608) on Thursday March 20 2014, @06:24PM (#19089)

    Nobody forces you to use the standard string library. Apart from the file operations and command line arguments (and those can easily be wrapped), I don't see anything that forces you to use zero-terminated strings. Indeed, with C99, you even have a portable way to implement your length-prefixed strings:

    typedef struct MyString
    {
        int length;
        char data[];
    } *pMyString;

    Of course with that definition, you get to implement all string functions yourself. But then, the string functions C provides are mostly quite basic anyway.
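
    As a rough idea of what "implement it yourself" involves, here is a sketch of a constructor for a struct shaped like the one above. It uses `size_t` for the length and the hypothetical names `counted` and `counted_new`; the flexible array member's storage comes from over-allocating the struct.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the struct above, with size_t for the length. */
struct counted {
    size_t length;
    char   data[];   /* C99 flexible array member */
};

/* Allocate a counted string holding a copy of n bytes from src.
   Returns NULL on allocation failure; the caller frees with free(). */
static struct counted *counted_new(const char *src, size_t n)
{
    assert(src != NULL || n == 0);
    if (n > SIZE_MAX - sizeof(struct counted))
        return NULL;                            /* size would overflow */
    struct counted *s = malloc(sizeof *s + n);  /* header plus n data bytes */
    if (s == NULL)
        return NULL;
    s->length = n;
    if (n > 0)
        memcpy(s->data, src, n);
    return s;
}
```

    Note that construction allocates and can fail, so every caller has one more error path to handle.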

    --
    The Tao of math: The numbers you can count are not the real numbers.
    • (Score: 1) by Subsentient on Thursday March 20 2014, @07:50PM

      by Subsentient (1111) on Thursday March 20 2014, @07:50PM (#19122) Homepage

      char data[] is an incomplete type.

      • (Score: 1) by Subsentient on Thursday March 20 2014, @08:25PM

        by Subsentient (1111) on Thursday March 20 2014, @08:25PM (#19124) Homepage

        Wait, guess it isn't in C99. I'm tired.

        • (Score: 2) by maxwell demon on Saturday March 22 2014, @08:06AM

          by maxwell demon (1608) on Saturday March 22 2014, @08:06AM (#19670)

          In C99, it is specifically allowed at the end of a struct. It allows you to allocate extra memory after the struct and use that as members of the array. It's called a flexible array member.

          --
          The Tao of math: The numbers you can count are not the real numbers.
  • (Score: 2, Informative) by dalias on Thursday March 20 2014, @10:21PM

    by dalias (3909) on Thursday March 20 2014, @10:21PM (#19134)

    I'm the main author/maintainer of musl and this topic, C vs Pascal strings, is actually something I've addressed before, e.g. in this answer on Stack Overflow:

    http://stackoverflow.com/questions/4418708/whats-the-rationale-for-null-terminated-strings/4419243#4419243 [stackoverflow.com]

    The other answers on that question are also very informative.

    In short, Pascal strings force you to allocate storage and make copies of strings in many places where you could otherwise use them in place, which in turn creates failure cases, which people forget or don't think they need to check for, and therefore more bugs.

    It would be nice if more interfaces took a (pointer,length) pair as an argument rather than requiring null termination, but storing the length at a fixed location relative to the string data is a bad design.
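
    A minimal sketch of the (pointer,length) style, assuming a hypothetical `slice` type and `next_field` helper: fields are carved out of an existing buffer in place, with no allocation, no copy, and therefore no failure case to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical (pointer,length) pair: the length travels with the pointer
   instead of living at a fixed offset from the data. */
struct slice {
    const char *ptr;
    size_t      len;
};

/* Split off the next delim-separated field of *rest, in place.
   The returned slice points into the original buffer; *rest is advanced
   past the field and its delimiter. */
static struct slice next_field(struct slice *rest, char delim)
{
    assert(rest != NULL && rest->ptr != NULL);
    struct slice field = { rest->ptr, 0 };
    const char *hit = memchr(rest->ptr, delim, rest->len);
    if (hit == NULL) {                 /* last field: take everything left */
        field.len = rest->len;
        rest->ptr += rest->len;
        rest->len = 0;
    } else {
        field.len = (size_t)(hit - rest->ptr);
        rest->ptr = hit + 1;           /* skip the delimiter */
        rest->len -= field.len + 1;
    }
    return field;
}
```

    With a length stored in front of the data, each of those fields would instead have to be copied into freshly allocated storage before it could be passed along.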