Origins of why "char" is signed?

Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Origins of why "char" is signed?

Post by Baker »

A char is a single byte ranging from 0 to 127 (128 values) on the positive side and from -128 to -1 (128 values) on the negative side.

Why was this datatype signed? Low-memory era? The early days of the IBM EBCDIC character set that only went from 0-127? (pre-ASCII, pre-ANSI, pre-UTF-8)

Clearly from the name the char datatype had to have been meant for characters.

It just seems like a weird oddity, and I can't think of many situations where you would actually want a signed char (as in plain "char") instead of unsigned char ... unless you are in 1979 writing Asteroids or something, but I bet in those days they probably used machine language.

Then again, the farther back one goes, the more important ASM and the like become, so maybe this has to do with registers.

/May self-research this ... probably will.
The night is young. How else can I annoy the world before sunrise? 8) Inquisitive minds want to know! And if they don't -- well, like that has ever stopped me before ..
Baker
Posts: 3666
Joined: Tue Mar 14, 2006 5:15 am

Re: Origins of why "char" is signed?

Post by Baker »

http://coding.derkeiler.com/Archive/C_C ... 00751.html
C requires that "plain" char have the same range and representation as
either signed char or unsigned char, but it is implementation-defined
as to which.
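
For anyone who wants to see which choice their own compiler made, here is a minimal sketch. The helper names plain_char_is_signed and check_char_range are made up for illustration; the macros come from <limits.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when plain char is signed on this implementation.
   CHAR_MIN is 0 when char is unsigned and equals SCHAR_MIN otherwise. */
static bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: checks that plain char has the same range as
   exactly one of the other two character types. */
static void check_char_range(void)
{
    if (plain_char_is_signed()) {
        assert(CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX);
    } else {
        assert(CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX);
    }
}
```

Calling check_char_range() from main should pass on any conforming compiler, while plain_char_is_signed() may return either answer depending on the platform and compiler flags.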

There are historical reasons for this. In the early days of C, long
before the standardization by ANSI and ISO, there was just plain char.
As C compilers were implemented on different platforms with different
types of processors, the implementers tended to use whatever was most
efficient on that particular processor. Some made char signed, some
made it unsigned.

As the language evolved it became useful to have both signed and
unsigned character types. Signed chars could hold small numeric
values between -127 and 127 and save space. Unsigned chars have the
value of being C's raw data type: any memory accessible to a program
can be examined as an array of unsigned chars.
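
The "raw data type" point can be sketched roughly like this. The function name format_bytes is hypothetical, and the two-hex-digits-per-byte layout assumes 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the bytes of any object into out as
   space-separated two-digit hex values, in memory order.
   out needs room for 3 * size characters, terminator included.
   Assumes 8-bit bytes so every byte fits in two hex digits. */
static void format_bytes(const void *obj, size_t size, char *out, size_t out_size)
{
    const unsigned char *p = obj;  /* any object may be read through this type */

    assert(size > 0 && out_size >= 3 * size);
    for (size_t i = 0; i < size; i++) {
        /* writes two hex digits plus a terminator at out[3*i .. 3*i+2] */
        snprintf(out + 3 * i, 3, "%02x", (unsigned)p[i]);
        /* replace that terminator with a separator, except after the last byte */
        out[3 * i + 2] = (i + 1 < size) ? ' ' : '\0';
    }
}
```

Passing the address of an int, a double, or a struct along with its sizeof shows the object's in-memory byte order, which will differ between little-endian and big-endian machines.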

When it came time to standardize the language, the committee had a
mandate to avoid as much as possible making changes that would cause
existing working code to fail. If the standard said that plain char
always had to be signed or unsigned, it would break some code on one
type of implementation or the other.

So the solution was to have three types of char, even though on every
implementation plain char has the same representation and properties
as one of the other two.
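
One way to see that the three really are distinct types is a C11 _Generic selection, which would not compile if two of its branches named the same type. This is only a sketch; the enum and the CHAR_KIND macro are made-up names.

```c
#include <assert.h>

/* All three character types occupy exactly one byte. */
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char),
              "character types are one byte each");

enum char_kind { PLAIN_CHAR, SIGNED_CHAR, UNSIGNED_CHAR, NOT_A_CHAR };

/* Hypothetical macro: selects a tag based on the static type of x.
   Listing char, signed char and unsigned char side by side is only
   legal because the language treats them as three different types. */
#define CHAR_KIND(x) _Generic((x),   \
    char:          PLAIN_CHAR,       \
    signed char:   SIGNED_CHAR,      \
    unsigned char: UNSIGNED_CHAR,    \
    default:       NOT_A_CHAR)
```

A variable declared as char selects PLAIN_CHAR whether or not the compiler makes char signed. A character constant such as 'a' selects NOT_A_CHAR, because character constants have type int in C.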

Use signed char when you need to hold small numbers that might have
negative values, if they will always be in the range -127 to 127. Use
unsigned char when you want access to the raw bits in memory, or when
you want to hold small numbers that will never be negative, in the
range 0 to 255.

And use plain char when you are dealing with ordinary text and
strings. All C library functions that accept pointers to strings
require a pointer to char, not a pointer to signed or unsigned char.
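
A small sketch of text handling with plain char follows. The helper count_upper is hypothetical. The cast is there because the <ctype.h> functions expect a value representable as unsigned char (or EOF), which matters on implementations where plain char is signed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts uppercase letters in a NUL-terminated string. */
static size_t count_upper(const char *text)
{
    size_t n = 0;

    assert(text != NULL);
    /* strlen takes a pointer to plain char, like the other string functions */
    for (size_t i = 0, len = strlen(text); i < len; i++) {
        /* convert to unsigned char first so a negative char value
           is never passed to isupper */
        if (isupper((unsigned char)text[i])) {
            n++;
        }
    }
    return n;
}
```

The string itself stays a plain char array, so it works directly with strlen, printf and friends. Only the single value handed to isupper is converted.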