Fresh Hacker News | System.LongBool

▲System.LongBool(docwiki.embarcadero.com)

38 points by surprisetalk 5 days ago | 11 comments

▲MarkSweep 7 hours ago

I assume this type is for compatibility with the 32-bit BOOL type on Windows. This is a common bugaboo when doing interoperability, as I think languages tend to define bool as an 8-bit value.

https://learn.microsoft.com/en-us/windows/win32/winprog/wind...

This must be a pretty slow news day for this to make the front page of Hacker News.

▲kevincox 8 hours ago

> Note: The ByteBool, WordBool, and LongBool types exist to provide compatibility with other languages and operating system libraries.

Which makes sense. So these are really only intended to be used for FFI, not internal Delphi code. If you are bridging from C where bools are a byte you want to determine how you handle the other values.

I think the one thing missing is specifying what the True and False constants map to. It is implied that False maps to 0 by "A WordBool value is considered False when its ordinality is 0" but it doesn't suggest that True has a predictable value which would be important for FFI use cases.

▲Sharlin 7 hours ago

And bools are definitely a byte only on "new" C standard versions (as in since C99), before that there was no boolean type and ints were often used. Thus, LongBool.

▲jjmarr 3 hours ago

I think bool is implementation-defined, no?

And a std::vector<bool> uses 1 bit per bool.

▲maxlybbert 3 hours ago

Looking at the C 2011 standard, it looks like "_Bool" is implementation defined (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf, pg. 57 of the PDF; the page labeled 39, section 6.2.5 paragraph 2): "An object declared as type _Bool is large enough to store the values 0 and 1."

I believe everybody uses a single byte for that -- a single byte can store the values 0 and 1 -- but it looks like they aren't required to.

I believe C++ specifies "bool" is one byte; it's definitely never larger than a "char".

As for std::vector<bool>, the fact that each value is defined as taking up a single bit inside the std::vector<bool> doesn't really say anything about how large each bool would be outside of the std::vector<bool>. std::vector<bool> was arguably a case of the Committee being too clever (https://isocpp.org/blog/2012/11/on-vectorbool).

▲spc476 29 minutes ago

And you can take the address of a _Bool, so it must be addressable, so the least it could be is sizeof(bool) == 1.
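
A small C sketch of the two points in this subthread, assuming a C11 or later compiler: a bool object occupies at least one addressable unit, and converting any non-zero value to bool yields exactly 1. The helper names `to_bool` and `read_through_pointer` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* A bool is an object type, so it takes at least one char of storage. */
static_assert(sizeof(bool) >= 1, "bool must be addressable");

/* Hypothetical helper: conversion to bool maps every non-zero value to 1. */
bool to_bool(int value)
{
    return value;   /* implicit conversion: 0 stays 0, anything else becomes 1 */
}

/* Hypothetical helper: the address of a bool can be taken and dereferenced. */
bool read_through_pointer(const bool *p)
{
    return *p;
}
```

Note that this only says something about values converted by the compiler. A bool whose storage was overwritten through FFI with some other bit pattern is a different matter.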

▲ronsor 1 hour ago

Isn't the usual definition of true "not 0" (or rather "not false")?

▲stefs 8 hours ago

I'm not sure I get you - "not 0" is a more predictable value than a certain number, isn't it?

▲kevincox 7 hours ago

I'm talking about output. For sure, if I am reading this bool from FFI I want to have "not 0" be truthy. However, if I am writing a bool to an FFI interface I want the value to be predictable (for example 1) rather than "some non-zero value".

Although this does open interesting cases where, if you read a bool from one FFI interface and write it to another, it may have an unexpected value (e.g. 2). But I still think it is useful for the in-language conversions (for example Boolean to LongBool) and the True constant to have predictable values.
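
One way to picture the read/write asymmetry described here is a C sketch, not Delphi's actual behaviour. The type `bool32` and the three helper functions are hypothetical names standing in for a 32-bit boolean such as Win32 BOOL or LongBool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit boolean, similar in spirit to Win32 BOOL or LongBool. */
typedef int32_t bool32;

static_assert(sizeof(bool32) == 4, "bool32 is meant to be 32 bits wide");

/* Reading from a foreign interface: any non-zero value counts as true. */
bool bool32_read(bool32 raw)
{
    return raw != 0;
}

/* Writing to a foreign interface: always emit exactly 0 or 1. */
bool32 bool32_write(bool value)
{
    return value ? 1 : 0;
}

/* Passing a foreign value on to another interface: normalize it first,
 * so an incoming 2 or -1 goes out as 1. */
bool32 bool32_forward(bool32 raw)
{
    return bool32_write(bool32_read(raw));
}
```

With these helpers, `bool32_forward(2)` yields 1, which avoids the "read 2 from one interface, write 2 to another" case, at the cost of assuming the receiving side wants 1 rather than all bits set.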

▲saurik 7 hours ago

I presume this FFI goes in both directions; some APIs really want the value of a boolean to be 1 while others really want it to be "all 1s"/0xfff.../-1 because, internally, someone decided to do something silly and compare == or switch on TRUE.
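
A rough C illustration of why comparing against a TRUE constant is fragile. `bool32` and `BOOL32_TRUE` are hypothetical stand-ins, not names from any real header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit boolean and TRUE constant, for illustration only. */
typedef int32_t bool32;
#define BOOL32_TRUE 1

static_assert(sizeof(bool32) == 4, "bool32 is meant to be 32 bits wide");

/* Fragile: recognizes only the single bit pattern chosen for TRUE. */
int is_true_fragile(bool32 b)
{
    return b == BOOL32_TRUE;
}

/* Robust: treats every non-zero value as true. */
int is_true_robust(bool32 b)
{
    return b != 0;
}
```

A caller that passes -1 (all bits set) is seen as true by `is_true_robust` but as false by `is_true_fragile`. An API written in the fragile style is the kind that forces callers to know which exact value of true it expects.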

▲masfuerte 6 hours ago

The .Net runtime generates code that relies on bools being either 0 or 1. It's quite easy using interop to inadvertently set a bool to something else and this leads to very odd bugs where simple boolean expressions appear to be giving the wrong result.

(Tangentially, VB traditionally used -1 for true. VB.NET uses the same 0 or 1 internal representation for a bool as C# but if you convert a bool to a number in VB.NET it comes out as -1.)

▲da_chicken 5 hours ago

-1 isn't even a bad choice, since that's basically using 0xFF for true and 0x00 for false. The weirdness is the fact that you're converting a binary value to a signed integer.

▲int_19h 2 hours ago

This goes all the way to early BASIC, and it's signed because the language didn't have any unsigned numbers to begin with.

The main reason for this particular arrangement is that, so long as you can rely on truth being represented as -1 - i.e. all bits set - bitwise operators double as logical ones. Thus BASIC would have NOT, AND, OR, XOR, IMP, EQV all operating bitwise but mostly used for Booleans in practice (it misses short-circuiting, but languages of that era rarely defaulted to it).
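
The bitwise-doubles-as-logical point can be sketched in C, assuming two's complement 16-bit integers. The `basic_*` names are invented for this example and only mimic the BASIC operators.

```c
#include <assert.h>
#include <stdint.h>

/* BASIC-style truth values: false is 0, true is -1 (all bits set). */
enum { BASIC_FALSE = 0, BASIC_TRUE = -1 };

static_assert(BASIC_TRUE == -1, "true is all bits set in this scheme");

/* Bitwise NOT: flips -1 to 0 and 0 to -1, so it also works as logical NOT. */
int16_t basic_not(int16_t x)
{
    return (int16_t)~x;
}

/* Bitwise AND: behaves as logical AND when both inputs are 0 or -1. */
int16_t basic_and(int16_t a, int16_t b)
{
    return (int16_t)(a & b);
}

/* Comparison producing a BASIC-style result: -1 when equal, 0 otherwise. */
int16_t basic_eq(int a, int b)
{
    return (int16_t)-(a == b);
}
```

The scheme breaks down with 1 as true: `basic_not(1)` is -2, which is still non-zero and therefore still truthy.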

▲tux3 4 hours ago

There was a proposal for a unit type for C++; it would be a zero-sized type just like void, except that you could actually use it to declare a variable and pass it around as a value, like other regular types.

The less serious proposal: have it be `long void`

▲cryptonector 37 minutes ago

This would be useful for decoding ASN.1 CHOICEs and open types that contain a NULL option. These should compile to a sum type where the components have values, except that the NULL cases should not.

▲kragen 3 hours ago

Empty structs are supported in clang; I think it's a GCC extension to C. This program:

  #include <stdio.h>
  
  typedef struct {} unit;
  
  static unit f(unit *p)
  {
      return (unit){};
  }
  
  static void g(unit u)
  {
  }
  
  int main()
  {
      unit a = {}, b = {};
      a = b;
      f(&a);
      g(b);
      printf("a is at %lx, b is at %lx, "
              "sizeof unit is %d\n",
              (unsigned long)&a, (unsigned long)&b,
              (int)sizeof(unit));
      return 0;
  }

compiles without complaints on my cellphone and produces the output:

a is at 7ffb39805b, b is at 7ffb39805a, sizeof unit is 0

So you can declare empty structs as variables, return them, assign them, pass them as parameters, take their addresses, create them in struct literals, and dereference pointers to them. clang is assigning different addresses to different empty-struct local variables, but presumably in an array all of them would have the same address, unlike in C++.

I wouldn't be confident that you could malloc them, and I wouldn't be surprised if passing them by value uncovered compiler divergences in the interpretation of the ABI. (I spent most of last night tracking down a bug due to LuaJIT/GCC ABI differences in the implementation of parameter passing.)

▲int_19h 2 hours ago

Why couldn't you malloc them? Malloc doesn't care about types, it just wants the number of bytes, and 0 is a perfectly valid value to pass to malloc. Now, it's unspecified whether malloc will give you back a pointer distinct from all other non-freed allocated blocks or NULL, but if you don't rely on address as identity that is irrelevant.
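
A minimal sketch of coping with both permitted outcomes of a zero-size allocation. `alloc_bytes` is a hypothetical wrapper, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: allocates `size` bytes and reports success.
 * For size == 0, malloc may return NULL or a unique pointer. Both are
 * treated as success here, and both are safe to pass to free().
 */
bool alloc_bytes(size_t size, void **out)
{
    assert(out != NULL);
    void *p = malloc(size);
    if (p == NULL && size != 0) {
        return false;       /* genuine allocation failure */
    }
    *out = p;
    return true;
}
```

The caller can then `free(*out)` unconditionally, as long as it never uses the returned address as an identity for a zero-sized object.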

▲kragen 2 hours ago

You're right, in the standard it's implementation-defined. I mistakenly thought malloc(0) was undefined. That said, it's probably not the best-tested code path in the system library.

▲magicalhippo 4 hours ago

In Delphi, and possibly other Pascal variants, you can do this with a type that's just an empty record (struct). It'll have zero size, but you can declare variables of that type, pass it as parameters and such.

IIRC things get a bit funky using this in certain situations, so I'm not sure the compiler devs actually considered this or if it's just a happy accident of sorts.

▲int_19h 2 hours ago

The problem is that it's still a nominal type system, so each empty struct is still a different type. A proper unit type has a single value of that type, and it's the same throughout the entire program.

▲magicalhippo 2 hours ago

> The problem is that it's still a nominal type system, so each empty struct is still a different type.

Fair point, though for me that was a feature when I used it in Delphi. Allowed me to differentiate them when using them as type parameters (generics).

▲noir_lord 7 hours ago

Amazes me that Embarcadero/Delphi is still going - it's been 25ish years since I wrote a line of Object Pascal, it was a very nice language in its day with an even nicer IDE.

▲1313ed01 8 hours ago

Also in Free Pascal?

https://wiki.freepascal.org/Data_type#Boolean_types

▲knodi123 8 hours ago

I chuckled, but presumably it's useful for applications where you want data types to take up the same amount of space, like for matrices or database columns? Or maybe where you coerce different data types into boolean? The language offers WordBool and ByteBool too, so they're pretty consistent. And AFAIK, there aren't any languages where you can specifically allocate only a single bit for a single boolean.

▲nneonneo 8 hours ago

You kind of can in C with bit size specifications on struct members, but you’ll still face the problem that C’s minimum alignment is one byte - so a struct containing a single 1-bit field will still occupy a byte in memory. However, it does let you “allocate” different bits within a byte for different member fields.
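
For illustration, a C sketch of bit-fields as described here. The struct and function names are made up, and the overall struct size is left to the implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Three flags packed into adjacent bits of one storage unit. */
struct flags {
    unsigned int ready   : 1;
    unsigned int dirty   : 1;
    unsigned int visible : 1;
};

/* Same idea with a bool bit-field, for comparison. */
struct bool_flags {
    bool ready : 1;
};

/* Even a single 1-bit field cannot make the struct smaller than one char. */
static_assert(sizeof(struct bool_flags) >= 1, "structs occupy at least one char");

/* Storing into an unsigned 1-bit field keeps only the lowest bit. */
unsigned int store_unsigned_bit(unsigned int value)
{
    struct flags f = {0};
    f.ready = value;        /* reduced modulo 2: 2 becomes 0, 3 becomes 1 */
    return f.ready;
}

/* Storing into a bool 1-bit field converts to bool first: non-zero becomes 1. */
unsigned int store_bool_bit(unsigned int value)
{
    struct bool_flags f = {0};
    f.ready = value;
    return f.ready;
}
```

The difference matters for the "true is any non-zero value" convention: a foreign value of 2 stored in the unsigned 1-bit field reads back as 0, while the bool bit-field reads back as 1.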

C++ has vector<bool>, which is supposed to be an efficient bit-packed vector of Booleans, but due to C++ constraints it doesn’t quite behave like a container (unlike other vector<T>s). Of course, if you make a vector<bool> of a single bit, that’s still going to occupy much more than one bit in memory.

There are plenty of hardware specification languages where it’s trivial to “allocate” one bit, but those aren’t allocating from the heap in a traditional sense. (Simulators for these languages will often efficiently pack the bits in memory, but that’s more of an implementation detail than a language guarantee).

▲mastax 7 hours ago

Don’t they use int-size bools in Solaris? I think I remember seeing those in ZFS.

▲azhenley 8 hours ago

I need some context… why?

▲dale_glass 8 hours ago

It says it right there:

"Note: The ByteBool, WordBool, and LongBool types exist to provide compatibility with other languages and operating system libraries."

▲beyondCritics 7 hours ago

Turns out that old versions of Visual C++ used their own typedef of bool as int: https://stackoverflow.com/questions/4897844/is-sizeofbool-de...

▲bobmcnamara 7 hours ago

There were RISC platforms with int-sized bool, usually where one-byte math wasn't in the instruction set.

▲bobmcnamara 7 hours ago

In C, sizeof(bool) is implementation specific. Typical values are sizeof(char) and sizeof(int).

▲keketi 6 hours ago

In case you need a lot more than two boolean values some day.

▲spyrja 6 hours ago

Next up: 64-bit booleans for the win!

▲ray_v 5 hours ago

I'd like to allocate memory for one mega-bool please.

▲nine_k 7 hours ago

This reply was wrong, removed.

Please downvote it to oblivion.

▲p_l 7 hours ago

Every major CPU ISA still in use can, in fact, address individual bytes (C and C++ standards even demand atomic access).

It's just inefficient, but sometimes needed (MMIO, inter-cpu visible byte changes, etc)

▲int_19h 2 hours ago

C and C++ standards demand atomic access to individual chars.

It is not a given that a C/C++ char is a "byte" in the conventional modern understanding of that word, though. sizeof(char)==sizeof(bool)==sizeof(int)==1 is a perfectly valid arrangement for an architecture that is only capable of addressing machine words, and there have been such architectures historically, although I'm not sure any are still around today.

▲piperswe 18 minutes ago

Don't modern C and C++ standards mandate an 8-bit char now though?

EDIT: never mind, guess I misremembered! It's just mandated to be at least 8 bits.
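
The guarantees mentioned in this subthread can be written down as compile-time checks in C. `storage_bits` is a hypothetical helper added only to make the relationship between sizeof and CHAR_BIT concrete.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* sizeof counts in units of char, so sizeof(char) is 1 by definition. */
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");

/* A char has at least 8 bits, but may have more. */
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");

/* Hypothetical helper: storage width in bits of an object of the given size. */
size_t storage_bits(size_t size_in_chars)
{
    return size_in_chars * CHAR_BIT;
}
```

On a word-addressed machine with, say, CHAR_BIT equal to 16 or 32, sizeof(int) == 1 would be consistent with these checks, because int only needs at least 16 bits of storage.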

▲jonathrg 6 hours ago

You can just make stuff up on this website