107: Data Types: Chars And Bytes.
Take Up Code - A podcast by Take Up Code: build your own computer games, apps, and robotics with podcasts and live classes
Chars and bytes form some of the most basic data types available. But what are they really? And what can you do with them? This will depend on what language you are using and maybe even what platform you’re building your application to run on. Some languages such as C# may have both types while others like C++ may have just one but with an idea of what the other is. And some languages may have just one or possibly neither. In general though, these types are used to hold small numeric values and character codes.

The episode also describes both signed and unsigned types and a common way of representing negative values called two’s complement. To convert between positive and negative values using two’s complement, all you have to do is first flip all the bits and then add one. This works both ways. An easy way to tell if a signed value is negative or not is to look at the most significant bit. If it’s set to one, then the value is negative. And if it’s a zero, then the value is zero or positive.

Listen to the full episode or you can also read the full transcript below.

Transcript

This episode gets us back to some of the basics and I’ll explain many data types in their own episode. You might already have an idea of what a char is after listening to the other episodes but there are some things that we can discuss in this dedicated episode that would have taken other episodes off topic. This might seem simple but there are some complexities.

The byte type is normally 8 bits long but it can be different. I’ve always known bytes to be 8 bits and that’s how I think of them. The technical term though for a type that’s always exactly 8 bits is an octet. You’ll probably come across this term sometimes, especially in networking. If you think of a byte as the smallest unit of addressable memory, then there have been many different sizes of bytes, both smaller and larger than 8 bits, throughout history. And some embedded systems even today can have a different number of bits in a byte.
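As a rough illustration of the points so far, here is a minimal C++ sketch. It reads the number of bits in a byte from the standard `CHAR_BIT` macro and applies the flip-all-the-bits-then-add-one rule to an 8-bit pattern. The names `bits_per_byte`, `negate8`, `is_negative8`, and `run_examples` are made up for this example, and working on an 8-bit pattern stored in an `unsigned` value is just one way to show the idea.

```cpp
#include <cassert>  // assert
#include <climits>  // CHAR_BIT

// Bits in one byte (one char) on this platform. C++ guarantees at least 8.
constexpr int bits_per_byte() { return CHAR_BIT; }

// Two's complement negation of an 8-bit pattern: flip all the bits, add one.
// The pattern is carried in an unsigned int and masked back down to 8 bits.
constexpr unsigned negate8(unsigned pattern) {
    return (~pattern + 1u) & 0xFFu;
}

// Read as a signed 8-bit value, a pattern is negative when its top bit is 1.
constexpr bool is_negative8(unsigned pattern) {
    return (pattern & 0x80u) != 0u;
}

// Hypothetical helper that exercises the functions above.
inline void run_examples() {
    assert(bits_per_byte() >= 8);
    assert(negate8(0x05u) == 0xFBu);  // 0000 0101 (5) becomes 1111 1011 (-5)
    assert(negate8(0xFBu) == 0x05u);  // the same steps convert it back
    assert(is_negative8(0xFBu));      // top bit set
    assert(!is_negative8(0x05u));     // top bit clear
}
```

Calling `run_examples()` from `main` should finish silently if every check holds. Two patterns map to themselves under this rule: zero, and `0x80`, which stands for -128 and has no positive counterpart in 8 bits.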
The final answer then is that a byte normally has 8 bits but maybe not. If you really want to be specific and refer to exactly 8 bits, then you can call it an octet. The exception is C#, where a byte is defined to be 8 bits and is unsigned. I’ll explain signed and unsigned in just a moment. C# also gives you a signed byte, or just sbyte as it’s written.

In C++, there is no byte data type but the size of a byte is defined to be at least 8 bits. The data type that you use for a byte in C++ is the char. Besides the char, you also have an explicit signed char and an unsigned char. This is where it gets strange. All three types, char, signed char, and unsigned char, are distinct types in C++. So is char signed or unsigned then? Well, that depends on your compiler and sometimes even on the platform that you’re compiling for.

Alright, how does C# represent chars then? Hopefully your head’s not spinning too much yet. You might want to sit down though so you don’t get dizzy. In C#, a char is 2 bytes long and C# really goes out of its way to make sure that chars are used just for 16-bit Unicode characters. So while it has the same representation as an unsigned 16-bit numeric type, you really shouldn’t think of a char in C# as a number.

This is different in C++ where a char is defined to be one byte. Here, you can easily store simple ASCII characters as well as numbers. You just need to make sure that the number value will fit. For that, you need to know about signed vs. unsigned values. I’ll explain those right after this message from our sponsor.

( Message from Sponsor )

There are many ways that you could represent negative numbers and probably the most common is called two’s complement. Let’s start with some simple binary counting though. We’ll just count with 2 bits because anything more gets confusing with just audio. Okay, with 2 bits, we can count in binary up to the